Vocabulary Flashcards: Essentials of Statistics for the Behavioral Sciences (Ch. 1-3)

1.1 Statistics, Science, and Observations

  • Statistics = a set of mathematical procedures for organizing, summarizing, and interpreting information.
  • Two general purposes of statistics:
    • Organize and summarize information so researchers can see what happened and communicate results.
    • Use sample data to answer questions about population parameters and justify general conclusions.
  • Population vs. Sample:
    • Population: the entire set of individuals of interest for a research question.
    • Sample: a smaller, more manageable group drawn from the population, intended to be representative.
    • Population size can be very large; samples are used because examining everyone is usually impractical.
  • Data, Datum, and Data Sets:
    • Datum (singular) = a single measurement/observation (score).
    • Data (plural) = measurements/observations (a data set is a collection of scores).
  • Descriptive vs. Inferential Statistics:
    • Descriptive statistics: organize, summarize, and present data (e.g., tables, graphs, means).
    • Inferential statistics: use sample data to draw conclusions about populations and generalize beyond the data.
    • Inferential statistics must address sampling error (the discrepancy between a sample statistic and the population parameter).
  • Key terms:
    • Parameter: a value that describes a population.
    • Statistic: a value that describes a sample.
    • Each population parameter has a corresponding sample statistic; most research uses statistics to infer parameters.
  • Margin of Error (Box 1.1): sampling error is the naturally occurring discrepancy between sample statistics and population parameters; polls report margins of error (e.g., +/− 4 percentage points).
  • Diagrammatic idea (conceptual): different samples from the same population yield different statistics, illustrating sampling error.
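The idea that different samples from one population yield different statistics can be sketched with a small simulation. This is only an illustration: the population below is generated artificially (1,000 hypothetical scores), and the helper name `sampling_error` and the sample size of 25 are assumptions, not values from the text.

```python
import random
from statistics import mean

# Hypothetical population: 1,000 scores generated for illustration only.
rng = random.Random(42)  # fixed seed so the run is repeatable
population = [rng.gauss(100, 15) for _ in range(1000)]
mu = mean(population)  # parameter: describes the whole population


def sampling_error(sample, parameter):
    """Sample statistic (here, the mean) minus the population parameter."""
    return mean(sample) - parameter


# Draw two samples of n = 25 from the same population.
sample_a = rng.sample(population, 25)
sample_b = rng.sample(population, 25)

for name, sample in (("A", sample_a), ("B", sample_b)):
    print(f"Sample {name}: M = {mean(sample):.2f}, "
          f"sampling error = {sampling_error(sample, mu):+.2f}")
```

Each sample mean is a statistic; its distance from `mu` is that sample's sampling error, and the two samples will generally not agree with each other or with the parameter.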

1.2 Populations and Samples

  • Research typically begins with a question about a population (the entire group of interest).

  • Sample should be representative of its population; results from the sample are generalized to the population.

  • Constructs, variables, and measurement:

    • Variable: a characteristic that changes or has different values for different individuals (e.g., height, mood, temperature).
    • Datum/score: the measurement obtained for each individual.
    • Data set: collection of scores.
  • Parameter vs. Statistic (definitions repeated for clarity):

    • Parameter: a characteristic of a population.
    • Statistic: a characteristic of a sample.
  • Parameter–Statistic relationship:

    • Every population parameter has a corresponding sample statistic; most research uses sample statistics to infer population parameters.
  • Two data structures (used to classify research methods and statistical procedures):

    • The data structure of a study helps determine which descriptive and inferential statistics are appropriate.
    • Data Structure I: Measuring two variables for each individual (the correlational method).
    • Data Structure II: Comparing two (or more) groups of scores (experimental vs. nonexperimental methods).
  • Variables, measurement, and data types:

    • A variable can be a characteristic that changes (e.g., wake-up time, academic performance).
    • A datum is a single measurement; the set of all scores is the data set.
  • SECTION 1.2 KEY CONNECTIONS:

    • Population parameter ↔ Sample statistic (and their sampling error).
    • Samples provide the practical basis for inferring about populations.
  • Visual/graphic idea:

    • Figure 1.1 shows the relationship: Population ⟶ Sample ⟶ Generalize to Population.
    • Figure 1.2 illustrates sampling error via two samples from the same population with different statistics.
  • The two major data structures in research:

    • Data Structure I (Correlational): two variables measured for each individual; example scatter plot showing relationship between wake-up time and academic performance; correlation describes the relation but does not imply causation.
    • Data Structure II (Group Comparisons): two or more groups defined by a variable; compares scores across groups; can be experimental (manipulation + control) or nonexperimental (no random assignment).
  • Key terms related to data, sampling, and statistics:

    • Datum and data set; X and Y denote variables; in this text, N denotes population size and n denotes sample size (notation varies elsewhere).
    • Sampling error: discrepancy between a sample statistic and the corresponding population parameter, due to the randomness of sampling.
  • Important connections to research method:

    • Correlational studies assess relationships between two variables but cannot establish causation.
    • Experimental method manipulates one variable (independent variable) to observe causal effects on another (dependent variable) while controlling extraneous variables.

1.3 Data Structures, Research Methods, and Statistics

  • Data Structure I: Correlational method
    • Measure two variables for each individual (e.g., wake-up time and academic performance).
    • Data presentation: table of two scores per individual and a scatter plot where x = wake-up time and y = academic performance.
    • Limitation: cannot establish causation; relationship observed does not imply one variable causes changes in the other.
    • Example: wake-up time vs. academic performance; as wake-up time increases, performance tends to decrease (illustrative pattern).
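A correlational data set of this kind can be sketched as two lists with one pair of scores per individual. The five pairs below are hypothetical (they are not data from the text), and the Pearson correlation is used here as one common way to summarize the relationship; the notes themselves do not name a specific statistic at this point.

```python
from math import sqrt

# Hypothetical scores for five students (not data from the text).
wake_up_time = [6.0, 7.0, 8.0, 9.0, 10.0]  # X: hour of waking
performance = [88, 85, 80, 74, 70]          # Y: academic score


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return sp / sqrt(ss_x * ss_y)


r = pearson_r(wake_up_time, performance)
print(f"r = {r:.3f}")  # negative: later wake-up times pair with lower scores
```

A negative value describes the pattern in these made-up scores; it says nothing about whether waking later causes lower performance.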
  • Data Structure II: Experimental and nonexperimental methods (group comparisons)
    • Compare two or more groups defined by a variable; then measure the second variable to obtain scores for each group.
    • Experimental method: manipulation of an independent variable to create treatment conditions, then observe a dependent variable; aim is to demonstrate causation with control of extraneous variables.
    • Example: TV violence study (hypothetical); children are shown either a violent or a nonviolent TV show, then aggression is measured on the playground.
  • The Experimental Method (two defining features)
    1) Manipulation: the researcher changes the value of an independent variable across conditions.
    2) Control: the researcher controls extraneous variables to prevent them from influencing the relationship.
  • Example: money-counting experiment (Zhou & Vohs, 2009)
    • Independent variable: material participants handle (money vs. blank paper).
    • Dependent variable: pain rating after hands are placed in hot water.
    • Finding: counting money reduces pain perception relative to counting paper.
  • Participant variables and environmental variables (potential confounds):
    • Participant variables: age, gender, intelligence, etc. (could confound results if groups differ on these factors).
    • Environmental variables: time of day, lighting, weather, etc.
  • Techniques to control extraneous variables in experiments
    • Random assignment: each participant has an equal chance of being assigned to each treatment condition, which tends to distribute participant characteristics evenly across groups; the same procedure can be used to spread environmental variables across conditions.
    • Matching: create equivalent groups based on key characteristics (e.g., gender proportions).
    • Holding variables constant: study uses a single age group, for example.
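Random assignment, the first technique above, can be sketched as a shuffle followed by dealing participants into conditions. The function name `randomly_assign`, the participant IDs, and the group size of 20 are all hypothetical; the condition labels borrow the money vs. paper example only for illustration.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 hypothetical IDs
groups = randomly_assign(participants, ["money", "paper"], seed=1)
for condition, members in groups.items():
    print(condition, len(members), members)
```

Because the order is shuffled before dealing, no participant characteristic systematically decides who lands in which condition, and the groups come out equal in size.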
  • Terminology in experimental research
    • Independent variable (IV): the variable manipulated by the experimenter (e.g., money vs. paper).
    • Dependent variable (DV): the variable measured to assess the effect of the IV (e.g., pain rating).
    • Control group: does not receive the experimental treatment (baseline).
    • Experimental group: receives the treatment.
    • Confounded: when more than one factor varies with the treatment, making it difficult to attribute effects to a single cause.
  • Nonexperimental methods (comparative studies without true manipulation)
    • Nonequivalent groups design: groups defined by a preexisting characteristic (e.g., gender) without random assignment.
    • Pre–post design: same participants measured before and after a treatment, but no control over the passage of time.
    • In all nonexperimental designs, causal conclusions are weaker due to potential confounds.
  • Terminology for nonexperimental studies
    • Quasi-independent variable: the variable used to create groups in a nonexperimental study (not truly manipulated).
    • Dependent variable remains the measured outcome.
  • Recap: Data structures in practice
    • I. Correlational: two variables per individual; analyze relationship (correlation) but not causation.
    • II. Group comparisons: two or more groups; can be experimental (causal inference) or nonexperimental (no random assignment; weaker causal claims).

1.4 Constructs and Operational Definitions

  • Constructs and measurement
    • Some variables are directly observable (e.g., height, weight), others are internal constructs (e.g., intelligence, anxiety, hunger) that require indirect measurement.
    • Constructs (hypothetical constructs) are internal attributes useful for describing/explaining behavior.
    • Operational definitions define how a construct will be measured or observed; they specify the measurement procedure and use resulting measurements as the definition of the construct.
  • Examples of operational definitions
    • Intelligence measured via IQ test scores; IQ test results serve as an operational definition of the construct intelligence.
    • Hunger measured by number of hours since last eating; this defines hunger operationally.
  • Important terminology
    • Discrete vs. Continuous variables
    • Discrete: separate, indivisible categories (e.g., number of children, race, gender, occupation).
    • Continuous: infinite number of possible values within an interval (e.g., time, height, weight). Continuous variables can be subdivided into fractional parts; real limits define measurement boundaries.
  • Scales of measurement (start with simple to complex)
    • Nominal scale: categories have names with no quantitative order (e.g., major, race, gender). Differences between categories are not meaningful in magnitude or direction.
    • Ordinal scale: categories have a meaningful order (e.g., ranks, class standing, shirt sizes) but not equal intervals; you can say which is bigger but not by how much.
    • Interval scale: ordered categories with equal intervals between adjacent values; no true zero point (arbitrary zero) (e.g., Fahrenheit temperature, calendar years, some test scores).
    • Ratio scale: interval scale with an absolute zero (nonarbitrary zero) that allows meaningful ratio comparisons (e.g., height, weight, reaction time, number of errors).
  • Real limits (for continuous variables)
    • For a continuous variable, each score represents an interval on the scale, and the boundaries of that interval are its real limits (e.g., weight measured to the nearest pound gives real limits of 149.5 and 150.5 for a score of 150).
    • Real limits define intervals; each observed score corresponds to an interval on the measurement scale.
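The real limits of a score sit half a measurement unit below and above it. A minimal sketch, assuming a hypothetical helper `real_limits` whose `unit` argument is the precision of measurement:

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return score - half, score + half


print(real_limits(150))       # nearest pound: (149.5, 150.5)
print(real_limits(150, 0.1))  # nearest tenth of a pound
```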
  • Practical implications of scales
    • Numerical calculations (means, standard deviations) are appropriate for interval/ratio scales.
    • For nominal/ordinal data, use methods that do not assume equal intervals (e.g., mode or median, Spearman correlation, chi-square tests).
  • Examples from Section 1.4
    • Interval vs. ratio distinction: height in inches is a ratio scale (a true zero means no height at all); a converted scale centered on the average (e.g., deviations from the mean) is an interval scale because its zero point is arbitrary.
    • A common classroom exercise: transforming a measurement (e.g., height) to a different scale while preserving the information about differences but changing the zero point; ratio comparisons become invalid under the transformed scale.
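The effect of moving the zero point can be checked with a few numbers. The three heights below are hypothetical and chosen only to keep the arithmetic simple.

```python
# Hypothetical heights in inches (a ratio scale: 0 means no height).
heights = [60, 66, 72]
mean_height = sum(heights) / len(heights)  # 66.0

# Re-express each height as a deviation from the mean (zero is now arbitrary).
deviations = [h - mean_height for h in heights]  # [-6.0, 0.0, 6.0]

# Differences between individuals survive the change of zero point...
print(heights[2] - heights[0], deviations[2] - deviations[0])  # 12 12.0

# ...but ratios do not: 72/60 is 1.2, while 6/-6 is -1.0.
print(heights[2] / heights[0], deviations[2] / deviations[0])  # 1.2 -1.0
```

The tallest person is 1.2 times as tall as the shortest on the original scale, but that statement has no counterpart on the deviation scale.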
  • Practical questions (Learning checks) based on scales
    • Identify scales for given variables (income, number of dependents, SSN, grades, preferences, number of children, etc.).
    • Determine whether variables are discrete or continuous, and identify real limits for specific measurement precisions.

1.5 Scores and Summation Notation

  • Basic notation
    • X, Y: scores for variables X and Y.
    • N: population size (uppercase for population); n: sample size (lowercase for sample).
    • Σ (sigma) denotes summation. The expression ΣX is the sum of X values.
  • Examples of summation notation
    • Given scores: 4, 3, 7, 1
    • ΣX = 15
    • ΣX^2 = 75
    • (ΣX)^2 = 225
    • Σ(X − 1) = 11
    • Σ(X − 1)^2 = 49
    • Example table for pairwise data (X, Y) with XY products:
    • ΣX = 15, ΣY = 14, ΣXY = 54
  • Important steps and order of operations
    • Order of mathematical operations (as applied to statistical computations):
    1. Parentheses first.
    2. Exponents (squaring, etc.).
    3. Multiplication and/or division (left to right).
    4. Summation (Σ) next.
    5. Addition and/or subtraction last.
    • Some computations involve sequences of steps that lead to multiple intermediate results (e.g., computing ΣX, ΣX^2, (ΣX)^2, Σ(X − 1), Σ(X − 1)^2).
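The order of operations can be checked against the scores 4, 3, 7, 1 from the examples above. This is a sketch in plain Python; the variable names are illustrative, and the last line (ΣX − 1) is an extra case added here to show that summation comes before subtraction when there are no parentheses.

```python
X = [4, 3, 7, 1]

sum_x = sum(X)                                    # ΣX
sum_x_squared = sum(x ** 2 for x in X)            # ΣX^2: square each score, then add
squared_sum = sum(X) ** 2                         # (ΣX)^2: add first, then square
sum_minus_one = sum(x - 1 for x in X)             # Σ(X − 1): subtract inside parentheses first
sum_minus_one_sq = sum((x - 1) ** 2 for x in X)   # Σ(X − 1)^2
sum_then_subtract = sum(X) - 1                    # ΣX − 1: summation before subtraction

print(sum_x, sum_x_squared, squared_sum,
      sum_minus_one, sum_minus_one_sq, sum_then_subtract)
# 15 75 225 11 49 14
```

Note how ΣX^2 (75) and (ΣX)^2 (225) differ only in whether squaring or summing happens first.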
  • Worked examples (from the text)
    • Example: for X = {4, 3, 7, 1}, ΣX = 4 + 3 + 7 + 1 = 15.
    • ΣX^2 = 16 + 9 + 49 + 1 = 75.
    • (ΣX)^2 = (15)^2 = 225.
    • Σ(X − 1) = (4 − 1) + (3 − 1) + (7 − 1) + (1 − 1) = 3 + 2 + 6 + 0 = 11.
    • Σ(X − 1)^2 = 9 + 4 + 36 + 0 = 49.
  • Additional computational examples
    • Example 1.6 shows: for pairs (X, Y) with XY products, ΣX = 15, ΣY = 14, ΣXY = 54.
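For paired scores, ΣXY means multiplying within each pair and then adding the products. The pairs below are illustrative values chosen to be consistent with the totals quoted above; the original table is not reproduced in these notes, so they should be treated as an assumption.

```python
# Illustrative (X, Y) pairs chosen to match the totals quoted above;
# the original table is not reproduced in these notes.
pairs = [(3, 5), (1, 3), (7, 4), (4, 2)]

sum_x = sum(x for x, _ in pairs)        # ΣX
sum_y = sum(y for _, y in pairs)        # ΣY
sum_xy = sum(x * y for x, y in pairs)   # ΣXY: multiply each pair, then add

print(sum_x, sum_y, sum_xy)  # 15 14 54
print(sum_x * sum_y)         # 210, so ΣXY is not the same as (ΣX)(ΣY)
```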
  • Practical advice for using Σ notation
    • Two key points:
    • The Σ sign is always followed by the symbol/expression identifying which values to add.
    • The summation operation is combined with other operations (multiplication, squaring) and must follow the proper order of operations.
  • Summary of implications for statistics students
    • Summation is a core operation in many statistics formulas (means, variances, covariances, etc.).
    • Mastery of Σ notation and order of operations is essential for correct calculations.

1.6 Chapter-wide connections and learning tools (brief overview)

  • Real-world relevance: statistics provide a structured, objective approach to gathering, organizing, and interpreting data.
  • Practical study notes from the Preface (briefly): the book emphasizes conceptual understanding, problem solving, and real-world examples to aid learning.
  • End-of-chapter materials (not detailed here) include problems, demonstrations, and learning checks to reinforce concepts.

SUMMARY OF CHAPTER 1: INTRODUCTION TO STATISTICS

  • Statistics defined as procedures for organizing, summarizing, and interpreting data; used to describe samples and infer properties of populations.
  • Two major functions:
    • Descriptive statistics: organize/summarize data.
    • Inferential statistics: use sample data to draw conclusions about populations, accounting for sampling error.
  • Populations vs. samples; parameters vs. statistics; the inevitability of sampling error when generalizing from sample to population.
  • Data structures in behavioral research:
    • Correlational method (two variables measured for each individual) – cannot establish causation.
    • Experimental/nonexperimental methods (group comparisons) – manipulation of an independent variable and control of extraneous variables; causal conclusions depend on design strength.
  • Constructs and measurement:
    • Constructs are internal attributes (e.g., intelligence, hunger) defined by operational definitions based on observable behavior.
    • Operational definitions specify how a construct is measured and the resulting scores.
  • Scales of measurement (nominal, ordinal, interval, ratio) and the implications for statistical techniques.
  • Discrete vs. continuous variables and the concept of real limits for continuous measurements.
  • Notation: X, Y for scores; N vs. n; Σ as summation; and the importance of order of operations in statistical calculations.
  • The overarching goal: help students move from memorization to conceptual understanding and principled application of statistics in research.