Chapter 1: Introduction to Statistics & Research

Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data.
Statistics is used to make sense of data in four ways:
1. Organize individual scores to examine patterns
2. Summarize data to understand general characteristics
3. Communicate results of a study
4. Interpret what the data indicate

The goal of behavioral research is to understand the “laws of nature” that apply to the behaviors of living organisms.

Population: the entire group to which a law of nature applies.
Sample: a relatively small subset of a population intended to represent the population.
The individuals who are measured in a sample are called participants.
Example mappings:
- Company -> population (when the research question concerns that company's employees)
- Workforce -> population
- U.S. Workforce -> population
- Individual -> participant
- Employee -> participant

Scenario 1: Researchers examine how dark chocolate might affect heart health in U.S. women. They select 120 female students from Suffolk University for a chocolate study.
- Population: all U.S. women
- Sample: 120 Suffolk University female students
Scenario 2: Researchers are interested in how U.S. college graduates fare in the job market 2 years post-graduation. A survey is sent through the Texas State alumni email newsletter, and responses come from 350 recent TXST graduates.
- Population: all U.S. college graduates 2 years post-graduation
- Sample: 350 TXST (Texas State University) graduates who responded

We use scores (participant responses) in a sample to infer (estimate) the scores we would expect to find in the population.
This assumes a sample is representative of the population.
Example: In a warehouse with a population of 1,000 people, split 500 males and 500 females, a smaller group of 100 males and 100 females would be a representative sample of the larger group, because it preserves the 50/50 split.
If a sample is NOT representative, it inaccurately reflects the population and may give misleading results.
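
The warehouse example can be sketched in code. This is a minimal illustration, assuming a hypothetical population list and a hypothetical helper named `stratified_sample`; drawing the same number of people at random from each group is one way (not the only way) to keep the sample's split equal to the population's.

```python
import random

# Hypothetical population mirroring the warehouse example: 500 males, 500 females.
population = ["male"] * 500 + ["female"] * 500

def stratified_sample(people, per_group, seed=0):
    """Draw the same number of individuals at random from each group."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = []
    for group in sorted(set(people)):
        members = [p for p in people if p == group]
        sample.extend(rng.sample(members, per_group))
    return sample

sample = stratified_sample(population, per_group=100)

# The sample keeps the population's 50/50 split.
print(sample.count("male"), sample.count("female"))  # 100 100
```

A sample drawn from only one group (for example, 200 males) would not reflect the population's split and could give misleading results.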

Quantitative variables (may be continuous or discrete):
- Scores indicate the amount of a variable.
- Examples: age and weight/BMI (continuous); number of pets and number of siblings (discrete counts).
Qualitative variables (categorical):
- Classify or categorize an individual based on a characteristic.
- Examples: Gender, ethnicity, college classification, etc.

In a relationship, as the scores on one variable change, the scores on the other variable change in a consistent manner.

Pattern 1 (Positive): As X increases, Y increases.
- Examples: more class attendance → better exam performance; more alcohol → more intoxication.
Pattern 2 (Negative/Inverse): As X increases, Y decreases.
- Examples: higher altitude when climbing → lower temperature; more money spent on morale → fewer employees leave.

Perfect consistency: as the score on one variable changes, the score on the other variable changes in a perfectly one-to-one fashion.
In practice, perfect consistency is not required; some degree of consistency is sufficient to indicate a relationship.
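
The direction and consistency of a relationship can be put into a single number. The sketch below assumes the Pearson correlation coefficient $r$ as that number (the notes do not name a measure at this point) and uses invented scores: a positive $r$ corresponds to Pattern 1, a negative $r$ to Pattern 2, and a magnitude of 1 to perfect consistency.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Hypothetical scores, invented for illustration only.
classes_attended = [2, 4, 6, 8, 10]
exam_perfect = [60, 70, 80, 90, 100]  # changes in perfect step with attendance
exam_typical = [65, 62, 80, 85, 98]   # mostly, but not perfectly, consistent
altitude_km = [0, 1, 2, 3, 4]
temperature_c = [20, 14, 7, 1, -6]    # falls as altitude rises

r_perfect = pearson_r(classes_attended, exam_perfect)
r_typical = pearson_r(classes_attended, exam_typical)
r_negative = pearson_r(altitude_km, temperature_c)

print(f"perfect positive: r = {r_perfect:.2f}")
print(f"typical positive: r = {r_typical:.2f}")
print(f"negative:         r = {r_negative:.2f}")
```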

Descriptive statistics: procedures for organizing and summarizing sample data.
Inferential statistics: procedures for drawing inferences about the scores and relationships that would be found in the population.

Statistic: a number describing an aspect of the scores in a sample.
- Represented using English letters (e.g., A, B, C).
Parameter: a number describing an aspect of the scores in the population.
- Represented using Greek letters such as $\alpha, \beta, \eta, \mu, \sigma$.
Note: A common example is the mean: the mean of a sample is a statistic, while the mean of the population (written $\mu$) is a parameter.
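
The distinction can be illustrated with a small simulation. This is a sketch under stated assumptions: the population of 10,000 scores is randomly generated (hypothetical, not real data), and the names `mu` and `x_bar` are chosen here for the population mean and sample mean.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population of 10,000 scores (generated, not real data).
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Parameter: describes the whole population. It is usually unknown in practice.
mu = mean(population)

# Statistic: describes a sample and is used to estimate the parameter.
sample = rng.sample(population, 50)
x_bar = mean(sample)

print(f"parameter (population mean): {mu:.2f}")
print(f"statistic (sample mean):     {x_bar:.2f}")
```

The two values are usually close but not identical, which is why inferential procedures are needed to judge how well a statistic estimates its parameter.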

Research design is the framework of research methods and techniques chosen to conduct a study.
Different designs require different descriptive and inferential statistical procedures; we must learn when to use each procedure.
Two major types of research designs:
- Experimental studies
- Correlational studies

In an experiment, the researcher actively changes or manipulates Variable A (the Independent Variable), then measures participants’ scores on Variable B (the Dependent Variable) to see if a relationship is produced by the change in Variable A.
Example: Rubber Plant height when placed inside vs outside during summer.
- Variable A – Location (inside/outside)
- Variable B – Plant Growth (height)

The dependent variable measures a behavior or attribute of a participant that may be influenced by the IV.
Example: Plant Growth – Lots of growth? Little to no growth?

Identify the independent variable, its conditions, and the dependent variable for the study: effect of caffeine on typing speed.
- IV: Caffeine consumption
- Conditions: No caffeine, 50 mg caffeine, 100 mg caffeine
- DV: Number of characters typed per second
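
The structure of this experiment can be sketched as data: each condition of the IV holds the DV scores of the participants tested under it. The typing speeds below are invented for illustration, and comparing condition means is only one simple descriptive summary, not a full analysis.

```python
from statistics import mean

# Keys are the conditions of the IV (caffeine consumption).
# Values are DV scores (characters typed per second); hypothetical data.
scores_by_condition = {
    "no caffeine": [4.1, 3.8, 4.4, 4.0],
    "50 mg caffeine": [4.6, 4.9, 4.3, 4.7],
    "100 mg caffeine": [5.2, 4.8, 5.5, 5.0],
}

# Summarize the DV within each condition of the IV.
condition_means = {
    condition: mean(scores) for condition, scores in scores_by_condition.items()
}

for condition, m in condition_means.items():
    print(f"{condition}: mean typing speed = {m:.2f} characters/second")
```

If the mean DV score changes consistently as the IV condition changes, the data show a relationship between the two variables.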

In a correlational study, the researcher measures participants’ scores on two variables and then determines whether a relationship exists.

Per capita consumption of chicken correlates with total US crude oil imports.
Correlation: $r = 0.899899$ (r ≈ 0.90; very strong positive relationship).
The example data are time series of chicken consumption and crude oil imports (illustrative data: 2000–2009).
Note: Correlation does not imply causation: $\text{Correlation} \neq \text{Causation}$.
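
One way to see why a large $r$ cannot establish causation is that any two series that both rise over time will correlate strongly, and that $r$ is the same whichever variable is treated as X. The sketch below uses two invented yearly series (`series_a` and `series_b` are hypothetical and are not the chicken or crude oil figures) and a hand-written Pearson formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return co_deviation / sqrt(ss_x * ss_y)

# Two hypothetical yearly series that both happen to rise over ten years.
series_a = [30.1, 30.9, 32.2, 32.8, 34.1, 35.0, 35.7, 37.2, 37.9, 39.0]
series_b = [510, 525, 533, 551, 560, 578, 584, 601, 610, 627]

r_ab = pearson_r(series_a, series_b)
r_ba = pearson_r(series_b, series_a)

# r is large because both series trend upward, and it is identical in both
# directions, so it says nothing about which variable (if either) is the cause.
print(f"r(a, b) = {r_ab:.3f}")
print(f"r(b, a) = {r_ba:.3f}")
```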

The kind of information scores convey depends on the scale of measurement used:
- Nominal scale: categorizes or classifies individuals; does not measure amount.
- Ordinal scale: indicates rank order; there is no true zero, and equal spacing between adjacent scores is not guaranteed.
- Interval scale: indicates an actual quantity with equal spacing between adjacent scores; there is no true zero.
- Ratio scale: indicates an actual quantity with equal spacing; zero means none of the variable is present.
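
The practical difference between interval and ratio scales is whether ratios of scores are meaningful. The sketch below uses temperature as an illustration, which is an assumption added here rather than an example from the notes: Celsius is an interval scale (0 °C does not mean "no temperature"), while Kelvin is a ratio scale (0 K is a true zero).

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

cool_c, warm_c = 10.0, 20.0
cool_k, warm_k = celsius_to_kelvin(cool_c), celsius_to_kelvin(warm_c)

# Differences are meaningful on both scales: the gap is 10 degrees either way.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios are meaningful only on the ratio scale.
ratio_c = warm_c / cool_c  # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cool_k  # about 1.035, the physically meaningful ratio

print(f"difference: {diff_c:.1f} °C vs {diff_k:.1f} K")
print(f"ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```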

A variable may be continuous (measured in fractional amounts; decimals allowed) or discrete (measured only in fixed amounts that cannot be broken into smaller amounts, such as whole-number counts or categories).
