Chapter 1: Introduction to Statistics & Research
What is Statistics?
- Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data.
- Statistics is used to make sense of data in four ways:
- Organize individual scores to examine patterns
- Summarize data to understand general characteristics
- Communicate results of a study
- Interpret what the data indicate
Behavioral Research
- The goal of behavioral research is to understand the “laws of nature” that apply to the behaviors of living organisms.
Samples vs. Populations
- Population: the entire group to which a law of nature applies.
- Sample: a relatively small subset of a population intended to represent the population.
- The individuals who are measured in a sample are called participants.
- Example mappings:
- Company -> population (depending on the research question)
- Workforce -> population
- U.S. Workforce -> population
- Individual -> participant
- Employee -> participant
Practice: Sample vs. Population
- Scenario 1: Researchers examine how dark chocolate might affect heart health in U.S. women. They select 120 female students from Suffolk University for a chocolate study.
- Population: all U.S. women
- Sample: 120 Suffolk University female students
- Scenario 2: Researchers are interested in how U.S. college graduates fare in the job market 2 years post-graduation. A survey is sent through the Texas State alumni email newsletter, and responses come from 350 recent TXST graduates.
- Population: all U.S. college graduates (the group the researchers want to draw conclusions about)
- Sample: 350 TXST graduates who responded
Scores & Representation
- We use scores (participant responses) in a sample to infer (estimate) the scores we would expect to find in the population.
- This assumes a sample is representative of the population.
- Example: In a warehouse with 1,000 workers (500 males and 500 females), a smaller group of 100 males and 100 females would be a representative sample of the larger group, because it preserves the 50/50 split.
- If a sample is NOT representative, it inaccurately reflects the population and may give misleading results.
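The warehouse example can be sketched in Python. This is only an illustration, not a prescribed sampling procedure: the population list, the `draw_representative_sample` and `proportion_male` helpers, and the fixed random seed are all assumptions made for the example.

```python
import random

def draw_representative_sample(population, per_group, seed=0):
    """Draw the same number of people from each gender group (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sample = []
    for group in ("male", "female"):
        members = [p for p in population if p["gender"] == group]
        sample.extend(rng.sample(members, per_group))
    return sample

def proportion_male(group):
    """Share of people in the group who are male."""
    return sum(p["gender"] == "male" for p in group) / len(group)

# Hypothetical population: 500 males and 500 females, as in the warehouse example.
population = (
    [{"id": i, "gender": "male"} for i in range(500)]
    + [{"id": i, "gender": "female"} for i in range(500, 1000)]
)

sample = draw_representative_sample(population, per_group=100)

print(len(sample))                  # 200
print(proportion_male(population))  # 0.5
print(proportion_male(sample))      # 0.5
```

The sample is representative on gender because its male/female split matches the population's. It says nothing about other characteristics, such as age or job role.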
Understanding Variables
- A variable is anything that can produce two or more different scores.
- Common variables in behavioral research include:
- Age
- Race
- Gender
- Personality type
- Physical attributes
Types of Variables
- Quantitative variables (continuous or discrete):
- Scores indicate the amount of a variable.
- Examples: Age, weight/BMI, number of pets, number of siblings, etc.
- Qualitative variables (discrete / categorical):
- Classify or categorize an individual based on a characteristic.
- Examples: Gender, ethnicity, college classification, etc.
Relationships
- In a relationship, as the scores on one variable change, the scores on the other variable change in a consistent manner.
Types of Relationships in Statistics
- Pattern 1 (Positive): As X increases, Y increases.
- Examples: more class attendance → better exam performance; more alcohol consumed → greater intoxication.
- Pattern 2 (Negative/Inverse): As X increases, Y decreases.
- Examples: higher altitude while climbing → lower temperature; more money spent on morale → fewer employees leave.
Relationship Consistency
- Perfect consistency: as the score on one variable changes, the score on the other variable changes in a perfectly one-to-one fashion.
- In practice, perfect consistency is not required; some degree of consistency is sufficient to indicate a relationship.
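One common way to put a number on the direction and consistency of a relationship is Pearson's correlation coefficient r. The sketch below assumes that measure. The `pearson_r` helper and all the scores are made up for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and the two sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)

x = [1, 2, 3, 4, 5]              # hypothetical X scores
perfect_pos = [2, 4, 6, 8, 10]   # Y rises by exactly 2 for every step in X
perfect_neg = [10, 8, 6, 4, 2]   # Y falls by exactly 2 for every step in X
imperfect = [2, 5, 4, 8, 7]      # Y tends to rise with X, but not every time

r_pos = pearson_r(x, perfect_pos)
r_neg = pearson_r(x, perfect_neg)
r_imperfect = pearson_r(x, imperfect)

print(round(r_pos, 2))        # 1.0  (perfectly consistent, positive)
print(round(r_neg, 2))        # -1.0 (perfectly consistent, negative)
print(round(r_imperfect, 2))  # 0.86 (less than perfect, still a clear relationship)
```

The sign of r gives the pattern (positive or negative), and its distance from zero reflects how consistently the scores change together.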
Applying Statistics
- Descriptive statistics: procedures for organizing and summarizing sample data.
- Inferential statistics: procedures for drawing inferences about the scores and relationships that would be found in the population.
Statistics vs Parameters
- Statistic: a number describing an aspect of the scores in a sample.
- Represented using English letters (e.g., A, B, C).
- Parameter: a number describing an aspect of the scores in the population.
- Represented using Greek letters such as α, β, η, μ, σ.
- Example: the mean of a sample is a statistic, whereas the mean of the population (μ) is a parameter.
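The difference between a statistic and a parameter can be made concrete with a small sketch. Everything here is assumed for illustration: the population is simply the scores 1 through 100, the sample size of 20 is arbitrary, and the variable names `mu`, `sigma`, `x_bar`, and `s` follow common textbook notation rather than anything specified in these notes.

```python
import random
import statistics

# Hypothetical population: scores 1 through 100.
# In real research the whole population usually cannot be measured.
population = list(range(1, 101))

# Parameters describe the population.
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics describe a sample drawn from that population.
rng = random.Random(42)                 # fixed seed so the sketch is repeatable
sample = rng.sample(population, 20)
x_bar = statistics.mean(sample)         # sample mean
s = statistics.stdev(sample)            # sample standard deviation

print(f"Parameters: mu = {mu}, sigma = {sigma:.2f}")
print(f"Statistics: x_bar = {x_bar}, s = {s:.2f}")
```

The sample values will usually be close to, but not exactly equal to, the population values. Inferential statistics deals with that gap.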
Research Designs
- Research design is the framework of research methods and techniques chosen to conduct a study.
- Different designs require different descriptive and inferential statistical procedures; we must learn when to use each procedure.
- Two major types of research designs:
- Experimental studies
- Correlational studies
Experimental Study Design
- In an experiment, the researcher actively changes or manipulates Variable A (the Independent Variable), then measures participants’ scores on Variable B (the Dependent Variable) to see if a relationship is produced by the change in Variable A.
- Example: Rubber Plant height when placed inside vs outside during summer.
- Variable A – Location (inside/outside)
- Variable B – Plant Growth (height)
The Independent Variable (IV)
- The independent variable is changed or manipulated by the experimenter.
- An IV must have at least 2 levels/conditions.
- Example: Location with levels 1) Outside and 2) Inside.
The Dependent Variable (DV)
- The dependent variable measures a behavior or attribute of a participant that may be influenced by the IV.
- Example: Plant Growth – Lots of growth? Little to no growth?
Practice: Research Design
- Identify the independent variable, its conditions, and the dependent variable for a study of the effect of caffeine on typing speed.
- IV: Caffeine consumption
- Conditions: No caffeine, 50 mg caffeine, 100 mg caffeine
- DV: Number of characters typed per second
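The structure of this experiment can be sketched as data: one group of DV scores per IV condition, summarized with a descriptive statistic. The typing-speed numbers below are invented purely for illustration and are not results from any study.

```python
from statistics import mean

# IV: caffeine consumption, with three conditions (levels).
# DV: characters typed per second. All scores are hypothetical.
scores_by_condition = {
    "No caffeine": [4.0, 3.8, 4.4, 4.2],
    "50 mg caffeine": [4.6, 4.9, 4.3, 4.6],
    "100 mg caffeine": [5.2, 4.8, 5.4, 5.0],
}

# Descriptive step: summarize the DV separately for each level of the IV.
condition_means = {
    condition: round(mean(scores), 2)
    for condition, scores in scores_by_condition.items()
}

for condition, avg in condition_means.items():
    print(f"{condition}: mean = {avg} characters/second")
```

If the mean DV score changes consistently across the levels of the IV, the sample shows a relationship. Whether that relationship holds in the population is a question for inferential statistics.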
Correlational Studies
- In a correlational study, the researcher measures participants’ scores on two variables and then determines whether a relationship exists.
Example: Correlation Illustration
- Per capita consumption of chicken correlates with total US crude oil imports.
- Correlation: r = 0.899899 (about 0.90; a very strong positive relationship).
- The data are a time series of chicken consumption vs. crude oil imports (illustrative data: 2000–2009).
- Note: Correlation does not imply causation (correlation ≠ causation).
Measurement Scales
- The kind of information scores convey depends on the scale of measurement used:
- Nominal scale: categorizes or classifies individuals; does not measure amount.
- Ordinal scale: indicates rank order; there is no true zero, and equal spacing between adjacent scores is not guaranteed.
- Interval scale: indicates an actual quantity with equal spacing between adjacent scores; there is no true zero.
- Ratio scale: indicates an actual quantity with equal spacing; zero means none of the variable is present.
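The practical difference between interval and ratio scales shows up when scores are compared. The sketch below uses temperature, which is not an example from these notes: Celsius is an interval scale (0 °C does not mean "no temperature"), while Kelvin is a ratio scale with a true zero. The `celsius_to_kelvin` helper and the two temperatures are chosen only for illustration.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius temperature to Kelvin."""
    return c + 273.15

warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences are meaningful on both scales: the gap is 10 degrees either way.
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios are meaningful only on the ratio scale.
ratio_c = warm_c / cool_c   # 2.0, but 20 °C is not "twice as hot" as 10 °C
ratio_k = warm_k / cool_k   # about 1.035, the physically meaningful ratio

print(f"Difference: {diff_c:.1f} (Celsius) vs {diff_k:.1f} (Kelvin)")
print(f"Ratio: {ratio_c:.3f} (Celsius) vs {ratio_k:.3f} (Kelvin)")
```

Equal spacing makes differences comparable on an interval scale, but statements such as "twice as much" require the true zero of a ratio scale.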
Continuous vs Discrete / Categorical
- A variable may be continuous (measured in fractional amounts; decimals allowed) or discrete/categorical (fixed amounts; cannot be broken into smaller amounts).
Practice: Continuous vs. Discrete
- For each variable, indicate (1) the measurement scale and (2) whether it is continuous or discrete:
- The number of tickets sold to an event: ratio scale; discrete
- Your favorite soft drink: nominal scale; discrete
- Weight: ratio scale; continuous
- IQ: interval scale; continuous