Vocabulary Flashcards: Essentials of Statistics for the Behavioral Sciences (Ch. 1-3)

1.1 Statistics, Science, and Observations

Statistics = a set of mathematical procedures for organizing, summarizing, and interpreting information.
Two general purposes of statistics:
- Organize and summarize information so researchers can see what happened and communicate results.
- Use sample data to answer questions about population parameters and justify general conclusions.
Population vs. Sample:
- Population: the entire set of individuals of interest for a research question.
- Sample: a smaller, more manageable group drawn from the population, intended to be representative.
- Population size can be very large; samples are used because examining everyone is usually impractical.
Data, Datum, and Data Sets:
- Datum (singular) = a single measurement/observation (score).
- Data (plural) = measurements/observations (a data set is a collection of scores).
Descriptive vs. Inferential Statistics:
- Descriptive statistics: organize, summarize, and present data (e.g., tables, graphs, means).
- Inferential statistics: use sample data to draw conclusions about populations and generalize beyond the data.
- Inferential statistics must address sampling error (the discrepancy between a sample statistic and the population parameter).
Key terms:
- Parameter: a value that describes a population.
- Statistic: a value that describes a sample.
- Each population parameter has a corresponding sample statistic; most research uses statistics to infer parameters.
Margin of Error (Box 1.1): sampling error is the naturally occurring discrepancy between sample statistics and population parameters; polls report it as a margin of error (e.g., ±4 percentage points).
Diagrammatic idea (conceptual): different samples from the same population yield different statistics, illustrating sampling error.
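As a rough illustration of sampling error, the sketch below draws two random samples from one small population and compares each sample mean with the population mean. The population values (1 through 100), the sample size of 10, and the seed are all assumptions chosen for the demonstration, not data from the text.

```python
import random
from statistics import fmean


def sampling_error(sample, population):
    """Sample statistic (mean) minus the population parameter (mean)."""
    return fmean(sample) - fmean(population)


# Hypothetical population: the scores 1 through 100 (an assumption for the demo).
population = list(range(1, 101))
rng = random.Random(0)  # fixed seed so the run is repeatable

# Two samples of n = 10 drawn from the same population.
sample_a = rng.sample(population, 10)
sample_b = rng.sample(population, 10)

print("population mean:", fmean(population))
print("sample A mean:", fmean(sample_a), "error:", sampling_error(sample_a, population))
print("sample B mean:", fmean(sample_b), "error:", sampling_error(sample_b, population))
```

The two samples will usually give different means, and neither is likely to match the population mean exactly; that gap is the sampling error.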

1.2 Populations and Samples

Research typically begins with a question about a population (the entire group of interest).
Sample should be representative of its population; results from the sample are generalized to the population.
Constructs, variables, and measurement:
- Variable: a characteristic that changes or has different values for different individuals (e.g., height, mood, temperature).
- Datum/score: the measurement obtained for each individual.
- Data set: collection of scores.
Parameter vs. Statistic (definitions repeated for clarity):
- Parameter: a characteristic of a population.
- Statistic: a characteristic of a sample.
Parameter–Statistic relationship:
- Every population parameter has a corresponding sample statistic; most research uses sample statistics to infer population parameters.
Two data structures (used to classify research methods and statistical procedures):
- Descriptive and Inferential statistics are connected to the data structures below.
- Data Structure I: Measuring two variables for each individual (the correlational method).
- Data Structure II: Comparing two (or more) groups of scores (experimental vs. nonexperimental methods).
Variables, measurement, and data types:
- A variable can be a characteristic that changes (e.g., wake-up time, academic performance).
- A datum is a single measurement; the set of all scores is the data set.
SECTION 1.2 KEY CONNECTIONS:
- Population parameter ↔ Sample statistic (and their sampling error).
- Samples provide the practical basis for inferring about populations.
Visual/graphic idea:
- Figure 1.1 shows the relationship: Population ⟶ Sample ⟶ Generalize to Population.
- Figure 1.2 illustrates sampling error via two samples from the same population with different statistics.
The two major data structures in research:
- Data Structure I (Correlational): two variables measured for each individual; example scatter plot showing relationship between wake-up time and academic performance; correlation describes the relation but does not imply causation.
- Data Structure II (Group Comparisons): two or more groups defined by a variable; compares scores across groups; can be experimental (manipulation + control) or nonexperimental (no random assignment).
Key terms related to data, sampling, and statistics:
- Datum and data set; X and Y as notation for variables; N for population size and n for sample size (the convention used in this text; notation varies elsewhere).
- Sampling error: discrepancy between a sample statistic and the corresponding population parameter, due to the randomness of sampling.
Important connections to research method:
- Correlational studies assess relationships between two variables but cannot establish causation.
- Experimental method manipulates one variable (independent variable) to observe causal effects on another (dependent variable) while controlling extraneous variables.

1.3 Data Structures, Research Methods, and Statistics

Data Structure I: Correlational method
- Measure two variables for each individual (e.g., wake-up time and academic performance).
- Data presentation: table of two scores per individual and a scatter plot where x = wake-up time and y = academic performance.
- Limitation: cannot establish causation; relationship observed does not imply one variable causes changes in the other.
- Example: wake-up time vs. academic performance; as wake-up time increases, performance tends to decrease (illustrative pattern).
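One common way to put a number on such a relationship is the Pearson correlation. The sketch below computes it by hand for five invented students; the scores are hypothetical and only mimic the pattern described above (later wake-up time, lower performance). A strong correlation here would still say nothing about cause.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return sp / sqrt(ss_x * ss_y)


# Hypothetical scores for five students (invented for illustration).
wake_up_hour = [6, 7, 8, 9, 10]          # X: wake-up time (hour of the morning)
performance = [3.8, 3.5, 3.4, 2.9, 2.6]  # Y: academic performance (GPA-like score)

r = pearson_r(wake_up_hour, performance)
print(round(r, 2))  # negative: later wake-up goes with lower performance in this data
```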
Data Structure II: Experimental and nonexperimental methods (group comparisons)
- Compare two or more groups defined by a variable; then measure the second variable to obtain scores for each group.
- Experimental method: manipulation of an independent variable to create treatment conditions, then observe a dependent variable; aim is to demonstrate causation with control of extraneous variables.
- Example: TV violence study (hypothetical); children watch either a violent or a nonviolent TV show, then their aggression is measured on the playground.
The Experimental Method (two defining features)
1) Manipulation: the researcher changes the value of an independent variable across conditions.
2) Control: the researcher controls extraneous variables to prevent them from influencing the relationship.
Example: money-counting experiment (Zhou & Vohs, 2009)
- Independent variable: material participants handle (money vs. blank paper).
- Dependent variable: pain rating after hands are placed in hot water.
- Finding: counting money reduces pain perception relative to counting paper.
Participant variables and environmental variables (potential confounds):
- Participant variables: age, gender, intelligence, etc. (could confound results if groups differ on these factors).
- Environmental variables: time of day, lighting, weather, etc.
Techniques to control extraneous variables in experiments
- Random assignment: equal chance of being assigned to each treatment condition; helps distribute participant characteristics evenly and controls environmental variables.
- Matching: create equivalent groups based on key characteristics (e.g., gender proportions).
- Holding variables constant: study uses a single age group, for example.
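Random assignment can be sketched in a few lines: shuffle the participant list, then deal participants out to the conditions in turn so the groups end up the same size. The participant IDs, the helper name `randomly_assign`, and the seed are hypothetical; the condition labels borrow from the money-counting example above.

```python
import random


def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them out to conditions in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups


# Hypothetical participant IDs (P01 ... P12).
participants = [f"P{i:02d}" for i in range(1, 13)]
groups = randomly_assign(participants, ["money", "paper"], seed=42)
for condition, members in groups.items():
    print(condition, members)
```

Because every ordering of the shuffled list is equally likely, each participant has the same chance of landing in either condition, which is what keeps participant variables from lining up systematically with the treatment.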
Terminology in experimental research
- Independent variable (IV): the variable manipulated by the experimenter (e.g., money vs. paper).
- Dependent variable (DV): the variable measured to assess the effect of the IV (e.g., pain rating).
- Control group: does not receive the experimental treatment (baseline).
- Experimental group: receives the treatment.
- Confounded: when more than one factor varies with the treatment, making it difficult to attribute effects to a single cause.
Nonexperimental methods (comparative studies without true manipulation)
- Nonequivalent groups design: groups defined by a preexisting characteristic (e.g., gender) without random assignment.
- Pre–post design: same participants measured before and after a treatment, but no control over the passage of time.
- In all nonexperimental designs, causal conclusions are weaker due to potential confounds.
Terminology for nonexperimental studies
- Quasi-independent variable: the variable used to create groups in a nonexperimental study (not truly manipulated).
- Dependent variable remains the measured outcome.
Recap: Data structures in practice
- I. Correlational: two variables per individual; analyze relationship (correlation) but not causation.
- II. Group comparisons: two or more groups; can be experimental (causal inference) or nonexperimental (no random assignment; weaker causal claims).

1.4 Constructs and Operational Definitions

Constructs and measurement
- Some variables are directly observable (e.g., height, weight), others are internal constructs (e.g., intelligence, anxiety, hunger) that require indirect measurement.
- Constructs (hypothetical constructs) are internal attributes useful for describing/explaining behavior.
- Operational definitions define how a construct will be measured or observed; they specify the measurement procedure and use resulting measurements as the definition of the construct.
Examples of operational definitions
- Intelligence measured via IQ test scores; IQ test results serve as an operational definition of the construct intelligence.
- Hunger measured by number of hours since last eating; this defines hunger operationally.
Important terminology: discrete vs. continuous variables
- Discrete: separate, indivisible categories (e.g., number of children, race, gender, occupation).
- Continuous: infinite number of possible values within an interval (e.g., time, height, weight). Continuous variables can be subdivided into fractional parts; real limits define measurement boundaries.
Scales of measurement (from simplest to most complex)
- Nominal scale: categories have names with no quantitative order (e.g., major, race, gender). Differences between categories are not meaningful in magnitude or direction.
- Ordinal scale: categories have a meaningful order (e.g., ranks, class standing, shirt sizes) but not equal intervals; you can say which is bigger but not by how much.
- Interval scale: ordered categories with equal intervals between adjacent values; no true zero point (arbitrary zero) (e.g., Fahrenheit temperature, calendar years, some test scores).
- Ratio scale: interval scale with an absolute zero (nonarbitrary zero) that allows meaningful ratio comparisons (e.g., height, weight, reaction time, number of errors).
Real limits (for continuous variables)
- Measurements are continuous and boundaries between scores are real limits (e.g., weight measured to the nearest pound yields real limits 149.5 to 150.5 around a score of 150).
- Real limits define intervals; each observed score corresponds to an interval on the measurement scale.
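The real limits of a score sit half a measurement unit below and above it. A minimal sketch, where the function name `real_limits` and the second example (measuring to the nearest 10 pounds) are assumptions added for illustration:

```python
def real_limits(score, unit=1.0):
    """Return (lower, upper) real limits for a score measured to the nearest `unit`."""
    half = unit / 2
    return (score - half, score + half)


print(real_limits(150))      # weight to the nearest pound -> (149.5, 150.5)
print(real_limits(150, 10))  # hypothetical: nearest 10 pounds -> (145.0, 155.0)
```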
Practical implications of scales
- Numerical calculations (means, standard deviations) are appropriate for interval/ratio scales.
- For nominal data, use the mode and chi-square tests; for ordinal data, use the median and rank-based methods such as the Spearman correlation.
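One way to picture how the scale limits the summary you can compute is a small dispatcher that picks a measure of central tendency by scale. This is only a sketch: the function name, the scale labels as strings, and all the example data are assumptions.

```python
from statistics import mean, median_low, mode


def central_tendency(values, scale):
    """Pick a summary that the measurement scale supports."""
    if scale == "nominal":
        return mode(values)        # most frequent category
    if scale == "ordinal":
        return median_low(values)  # middle rank; always an observed value
    if scale in ("interval", "ratio"):
        return mean(values)        # arithmetic on the scores is meaningful
    raise ValueError(f"unknown scale: {scale}")


# Hypothetical data for each kind of scale.
print(central_tendency(["psych", "bio", "psych", "math"], "nominal"))  # psych
print(central_tendency([1, 2, 2, 3, 5], "ordinal"))                    # 2
print(central_tendency([250, 300, 350], "ratio"))                      # 300
```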
Examples from Section 1.4
- Interval vs. ratio distinction: height in inches is a ratio scale (a true zero means no height at all); converting it to deviations from the average produces an interval scale, because the zero point is now arbitrary.
- A common classroom exercise: transforming a measurement (e.g., height) to a different scale while preserving the information about differences but changing the zero point; ratio comparisons become invalid under the transformed scale.
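The effect of moving the zero point can be checked with two invented heights. In this sketch the heights (60 and 72 inches) and the reference average (66 inches) are assumptions picked to keep the arithmetic simple.

```python
# Hypothetical heights in inches (ratio scale: zero means no height at all).
heights = {"A": 60, "B": 72}
average = 66  # assumed reference point for the converted scale

# Converted scale: deviation from the average (zero is now arbitrary).
deviations = {name: h - average for name, h in heights.items()}

# Differences survive the change of zero point...
diff_ratio_scale = heights["B"] - heights["A"]           # 12
diff_interval_scale = deviations["B"] - deviations["A"]  # 12

# ...but ratios do not.
ratio_on_ratio_scale = heights["B"] / heights["A"]           # 1.2
ratio_on_interval_scale = deviations["B"] / deviations["A"]  # -1.0

print(diff_ratio_scale, diff_interval_scale)
print(ratio_on_ratio_scale, ratio_on_interval_scale)
```

B is 12 inches taller than A on both scales, but "B is 1.2 times as tall as A" only makes sense on the original scale; on the deviation scale the same two people give a ratio of -1.0, which has no sensible interpretation.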
Practical questions (Learning checks) based on scales
- Identify scales for given variables (income, number of dependents, SSN, grades, preferences, number of children, etc.).
- Determine whether variables are discrete or continuous, and identify real limits for specific measurement precisions.

1.5 Scores and Summation Notation

Basic notation
- X, Y: scores for variables X and Y.
- N: population size (uppercase for population); n: sample size (lowercase for sample).
- Σ (sigma) denotes summation. The expression ΣX is the sum of X values.
Examples of summation notation
- Given scores: 4, 3, 7, 1
- ΣX = 15
- ΣX^2 = 75
- (ΣX)^2 = 225
- Σ(X − 1) = 11
- Σ(X − 1)^2 = 49
- Example table for pairwise data (X, Y) with XY products:
- ΣX = 15, ΣY = 14, ΣXY = 54
Important steps and order of operations
- Order of mathematical operations (as applied to statistical computations):
1. Parentheses first.
2. Exponents (squaring, etc.).
3. Multiplication and/or division (left to right).
4. Summation (Σ) next.
5. Addition and/or subtraction last.
- Some computations involve sequences of steps that lead to multiple intermediate results (e.g., computing ΣX, ΣX^2, (ΣX)^2, Σ(X − 1), Σ(X − 1)^2).
Worked examples (from the text)
- Example: ΣX for X = {4, 3, 7, 1}: 4 + 3 + 7 + 1 = 15.
- ΣX^2 = 16 + 9 + 49 + 1 = 75.
- (ΣX)^2 = (15)^2 = 225.
- Σ(X − 1) = (4 − 1) + (3 − 1) + (7 − 1) + (1 − 1) = 3 + 2 + 6 + 0 = 11.
- Σ(X − 1)^2 = 9 + 4 + 36 + 0 = 49.
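The same quantities can be checked in a few lines of Python using the scores 4, 3, 7, 1. The variable names are illustrative; the comments spell out which operation happens first in each expression. The last line, ΣX − 1, is an extra case added here to contrast with Σ(X − 1).

```python
X = [4, 3, 7, 1]

sum_x = sum(X)                                   # ΣX: add the scores
sum_x_squared = sum(x ** 2 for x in X)           # ΣX^2: square each score, then add
squared_sum_x = sum(X) ** 2                      # (ΣX)^2: add first, then square the total
sum_x_minus_1 = sum(x - 1 for x in X)            # Σ(X − 1): subtract inside parentheses, then add
sum_x_minus_1_sq = sum((x - 1) ** 2 for x in X)  # Σ(X − 1)^2: subtract, square, then add
sum_x_then_minus_1 = sum(X) - 1                  # ΣX − 1: add first, subtract last

print(sum_x, sum_x_squared, squared_sum_x)  # 15 75 225
print(sum_x_minus_1, sum_x_minus_1_sq)      # 11 49
print(sum_x_then_minus_1)                   # 14
```

Note that ΣX^2 (75) and (ΣX)^2 (225) differ, as do Σ(X − 1) (11) and ΣX − 1 (14); the only thing that changes between each pair is the order of operations.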
Additional computational examples
- Example 1.6 shows: for pairs (X, Y) with XY products, ΣX = 15, ΣY = 14, ΣXY = 54.
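These notes do not reproduce the table behind those totals, so the sketch below uses assumed (X, Y) pairs that were picked only because they reproduce ΣX = 15, ΣY = 14, and ΣXY = 54. It also shows that ΣXY is not the same as ΣX times ΣY.

```python
# Assumed (X, Y) pairs: picked to match the totals above, not copied from the text's table.
pairs = [(3, 5), (1, 3), (7, 4), (4, 2)]

sum_x = sum(x for x, _ in pairs)
sum_y = sum(y for _, y in pairs)
sum_xy = sum(x * y for x, y in pairs)  # ΣXY: multiply within each pair, then add
product_of_sums = sum_x * sum_y        # ΣX · ΣY: add each column, then multiply

print(sum_x, sum_y, sum_xy)  # 15 14 54
print(product_of_sums)       # 210
```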
Practical advice for using Σ notation
- Two key points:
- The Σ sign is always followed by the symbol/expression identifying which values to add.
- The summation operation is combined with other operations (multiplication, squaring) and must follow the proper order of operations.
Summary of implications for statistics students
- Summation is a core operation in many statistics formulas (means, variances, covariances, etc.).
- Mastery of Σ notation and order of operations is essential for correct calculations.

1.6 Chapter-wide connections and learning tools (brief overview)

Real-world relevance: statistics provide a structured, objective approach to gathering, organizing, and interpreting data.
Practical study notes from the Preface (briefly): the book emphasizes conceptual understanding, problem solving, and real-world examples to aid learning.
End-of-chapter materials (not detailed here) include problems, demonstrations, and learning checks to reinforce concepts.

SUMMARY OF CHAPTER 1: INTRODUCTION TO STATISTICS

Statistics defined as procedures for organizing, summarizing, and interpreting data; used to describe samples and infer properties of populations.
Two major functions:
- Descriptive statistics: organize/summarize data.
- Inferential statistics: use sample data to draw conclusions about populations, accounting for sampling error.
Populations vs. samples; parameters vs. statistics; the inevitability of sampling error when generalizing from sample to population.
Data structures in behavioral research:
- Correlational method (two variables measured for each individual) – cannot establish causation.
- Experimental/nonexperimental methods (group comparisons) – manipulation of an independent variable and control of extraneous variables; causal conclusions depend on design strength.
Constructs and measurement:
- Constructs are internal attributes (e.g., intelligence, hunger) defined by operational definitions based on observable behavior.
- Operational definitions specify how a construct is measured and the resulting scores.
Scales of measurement (nominal, ordinal, interval, ratio) and the implications for statistical techniques.
Discrete vs. continuous variables and the concept of real limits for continuous measurements.
Notation: X, Y for scores; N vs. n; Σ as summation; and the importance of order of operations in statistical calculations.
The overarching goal: help students move from memorization to conceptual understanding and principled application of statistics in research.