Chapter 1: Introduction to Statistics & Research

What is Statistics?

  • Statistics is the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data.
  • Statistics is used to make sense of data in four ways:
    1. Organize individual scores to examine patterns
    2. Summarize data to understand general characteristics
    3. Communicate results of a study
    4. Interpret what the data indicate

Behavioral Research

  • The goal of behavioral research is to understand the “laws of nature” that apply to the behaviors of living organisms.

Samples vs. Populations

  • Population: the entire group to which a law of nature applies.
  • Sample: a relatively small subset of a population intended to represent the population.
  • The individuals who are measured in a sample are called participants.
  • Example mappings (workplace illustration):
    • Company, workforce, or U.S. workforce -> population
    • Individual or employee -> participant

Practice: Sample vs. Population

  • Scenario 1: Researchers examine how dark chocolate might affect heart health in U.S. women. They select 120 female students from Suffolk University for a chocolate study.
    • Population: all U.S. women
    • Sample: 120 Suffolk University female students
  • Scenario 2: Researchers are interested in how U.S. college graduates fare in the job market 2 years post-graduation. A survey is sent through the Texas State alumni email newsletter, and responses come from 350 recent TXST graduates.
    • Population: all U.S. college graduates (2 years post-graduation)
    • Sample: 350 TXST graduates who responded

Scores & Representation

  • We use scores (participant responses) in a sample to infer (estimate) the scores we would expect to find in the population.
  • This assumes a sample is representative of the population.
  • Example: In a warehouse with a population of 1,000 workers split evenly into 500 males and 500 females, a smaller group of 100 males and 100 females keeps the same 50/50 split and could serve as a representative sample of the larger group.
  • If a sample is NOT representative, it inaccurately reflects the population and may give misleading results.
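
The warehouse example can be sketched in code. This is a minimal illustration, assuming the sample is drawn at random from each gender group separately (often called stratified sampling, a technique the notes do not name). The worker records and the helper names `stratified_sample` and `proportion_male` are hypothetical.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 1,000 warehouse workers, 500 male and 500 female.
population = [("M", i) for i in range(500)] + [("F", i) for i in range(500)]

def stratified_sample(people, per_group):
    """Draw the same number of people at random from each gender group."""
    sample = []
    for group in ("M", "F"):
        members = [p for p in people if p[0] == group]
        sample.extend(random.sample(members, per_group))
    return sample

def proportion_male(people):
    """Fraction of the given group that is male."""
    return sum(1 for p in people if p[0] == "M") / len(people)

sample = stratified_sample(population, per_group=100)

print(len(sample))                  # 200
print(proportion_male(population))  # 0.5
print(proportion_male(sample))      # 0.5
```

The sample matches the population on gender, which is the sense of "representative" used in the example. It says nothing about whether the sample matches the population on other characteristics.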

Understanding Variables

  • A variable is anything that can produce two or more different scores.
  • Common variables in behavioral research include:
    • Age
    • Race
    • Gender
    • Personality type
    • Physical attributes

Types of Variables

  • Quantitative variables (may be continuous or discrete):
    • Scores indicate the amount of a variable.
    • Examples: Age, weight/BMI, number of pets, number of siblings, etc.
  • Qualitative variables (discrete / categorical):
    • Classify or categorize an individual based on a characteristic.
    • Examples: Gender, ethnicity, college classification, etc.

Relationships

  • In a relationship, as the scores on one variable change, the scores on the other variable change in a consistent manner.

Types of Relationships in Statistics

  • Pattern 1 (Positive): As X increases, Y increases.
    • Examples: higher class attendance → better exam performance; more alcohol consumed → more intoxication.
  • Pattern 2 (Negative/Inverse): As X increases, Y decreases.
    • Examples: higher altitude when climbing → lower temperature; more money spent on morale → fewer employees leave.

Relationship Consistency

  • Perfect consistency: as the score on one variable changes, the score on the other variable changes in a perfectly one-to-one fashion.
  • In practice, perfect consistency is not required; some degree of consistency is sufficient to indicate a relationship.
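
One common way to put a number on consistency is Pearson's correlation coefficient r, which ranges from -1 to +1. The sketch below is illustrative only: the notes do not say at this point which measure the course uses, and the study-hours and exam-score data are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

hours = [1, 2, 3, 4, 5]            # hypothetical X scores
perfect = [10, 20, 30, 40, 50]     # Y rises by exactly 10 for each extra hour
imperfect = [12, 18, 33, 37, 50]   # Y still rises with X, but less consistently
negative = [50, 40, 30, 20, 10]    # Y falls as X rises

print(round(pearson_r(hours, perfect), 2))    # 1.0
print(round(pearson_r(hours, imperfect), 2))  # 0.99
print(round(pearson_r(hours, negative), 2))   # -1.0
```

The sign of r matches the positive and negative patterns described earlier, and values closer to +1 or -1 indicate a more consistent relationship.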

Applying Statistics

  • Descriptive statistics: procedures for organizing and summarizing sample data.
  • Inferential statistics: procedures for drawing inferences about the scores and relationships that would be found in the population.
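
As a small sketch of descriptive statistics, the snippet below organizes and summarizes one sample using Python's standard `statistics` module. The exam scores are hypothetical, and the notes do not say which summaries (mean, median, standard deviation) the course covers first.

```python
import statistics

# Hypothetical exam scores from a sample of 8 participants.
scores = [72, 85, 90, 66, 85, 78, 94, 70]

# Organize: sort the individual scores to examine patterns.
print(sorted(scores))             # [66, 70, 72, 78, 85, 85, 90, 94]

# Summarize: describe the general characteristics of the sample.
print(statistics.mean(scores))    # 80
print(statistics.median(scores))  # 81.5
print(statistics.stdev(scores))   # sample standard deviation
```

These numbers describe only the eight scores in the sample. Saying anything about the population they came from is the job of inferential statistics.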

Statistics vs Parameters

  • Statistic: a number describing an aspect of the scores in a sample.
    • Represented using English (Roman) letters, e.g., s for a sample standard deviation.
  • Parameter: a number describing an aspect of the scores in the population.
    • Represented using Greek letters such as α, β, η, μ, σ.
  • Note: A common pairing is the sample mean (a statistic) and the population mean μ (a parameter).
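
The distinction can be illustrated with a simulated population. This is a sketch under stated assumptions: the population is 10,000 randomly generated scores (not real data), and the variable names `mu`, `sigma`, `x_bar`, and `s` follow common convention rather than anything specified in the notes.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 simulated scores (not real data).
population = [random.gauss(100, 15) for _ in range(10_000)]

# Parameters describe the whole population (Greek letters).
mu = statistics.mean(population)
sigma = statistics.pstdev(population)

# Statistics describe a sample drawn from that population (English letters).
sample = random.sample(population, 50)
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)

print(f"population mean mu = {mu:.2f}, sample mean = {x_bar:.2f}")
print(f"population SD sigma = {sigma:.2f}, sample SD s = {s:.2f}")
```

In real research the parameters are usually unknown because the whole population cannot be measured, so the sample statistics serve as estimates of them.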

Research Designs

  • Research design is the framework of research methods and techniques chosen to conduct a study.
  • Different designs require different descriptive and inferential statistical procedures; we must learn when to use each procedure.
  • Two major types of research designs:
    • Experimental studies
    • Correlational studies

Experimental Study Design

  • In an experiment, the researcher actively changes or manipulates Variable A (the Independent Variable), then measures participants’ scores on Variable B (the Dependent Variable) to see if a relationship is produced by the change in Variable A.
  • Example: Rubber Plant height when placed inside vs outside during summer.
    • Variable A – Location (inside/outside)
    • Variable B – Plant Growth (height)

The Independent Variable (IV)

  • The independent variable is changed or manipulated by the experimenter.
  • An IV must have at least 2 levels/conditions.
  • Example: Location with levels 1) Outside and 2) Inside.

The Dependent Variable (DV)

  • The dependent variable measures a behavior or attribute of a participant that may be influenced by the IV.
  • Example: Plant Growth – Lots of growth? Little to no growth?

Practice: Research Design

  • Identify the independent variable, its conditions, and the dependent variable for the study: effect of caffeine on typing speed.
    • IV: Caffeine consumption
    • Conditions: No caffeine, 50 mg caffeine, 100 mg caffeine
    • DV: Number of characters typed per second
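
The caffeine study can be laid out as data: each condition of the IV gets its own group of DV scores, and comparing the groups shows whether the DV changes as the IV changes. The typing speeds below are hypothetical values made up for illustration, and comparing condition means is only one possible summary.

```python
import statistics

# Hypothetical DV scores (characters typed per second) for each IV condition.
typing_speed = {
    "no caffeine":     [4.0, 3.8, 4.4, 4.2],
    "50 mg caffeine":  [4.6, 4.4, 4.8, 4.6],
    "100 mg caffeine": [5.0, 5.4, 4.8, 5.2],
}

# Summarize the DV separately for each condition of the IV.
condition_means = {
    condition: statistics.mean(scores)
    for condition, scores in typing_speed.items()
}

for condition, mean_speed in condition_means.items():
    print(f"{condition}: mean typing speed = {mean_speed:.2f}")
```

If the mean DV score shifts consistently from one condition to the next, as it does in these made-up numbers, the sample shows a relationship between the IV and the DV.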

Correlational Studies

  • In a correlational study, the researcher measures participants’ scores on two variables and then determines whether a relationship exists.

Example: Correlation Illustration

  • Per capita consumption of chicken correlates with total US crude oil imports.
  • Correlation: r = 0.899899 (r ≈ 0.90; very strong positive relationship).
  • The illustrative data are yearly values of chicken consumption and crude oil imports for 2000–2009.
  • Note: Correlation does not imply causation: Correlation ≠ Causation.

Measurement Scales

  • The kind of information scores convey depends on the scale of measurement used:
    • Nominal scale: categorizes or classifies individuals; does not measure amount.
    • Ordinal scale: indicates rank order; there is no true zero, and equal spacing between adjacent scores is not guaranteed.
    • Interval scale: indicates an actual quantity with equal spacing between adjacent scores; there is no true zero.
    • Ratio scale: indicates an actual quantity with equal spacing; zero means none of the variable is present.
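
The practical difference between interval and ratio scales is whether ratios of scores are meaningful. The sketch below uses temperature as an example that does not come from the notes: Celsius is an interval scale (0 °C does not mean "no temperature"), while Kelvin is a ratio scale with a true zero. The helper name `celsius_to_kelvin` is hypothetical.

```python
def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to Kelvin."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0

# On the interval scale the ratio looks like "twice as hot"...
celsius_ratio = warm_c / cool_c
# ...but on the ratio scale (true zero) the same temperatures differ by about 3.5%.
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

print(round(celsius_ratio, 3))  # 2.0
print(round(kelvin_ratio, 3))   # 1.035
```

Differences between scores are meaningful on both scales, but statements like "twice as much" only make sense on a ratio scale.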

Continuous vs Discrete / Categorical

  • A variable may be continuous (measured in fractional amounts; decimals allowed) or discrete/categorical (fixed amounts; cannot be broken into smaller amounts).

Practice: Continuous vs. Discrete

  • For each variable, indicate (1) the measurement scale and (2) whether it is continuous or discrete:
    • The number of tickets sold to an event
      • Scale: ratio
      • Type: discrete
    • Your favorite soft drink
      • Scale: nominal
      • Type: discrete
    • Weight
      • Scale: ratio
      • Type: continuous
    • IQ
      • Scale: interval
      • Type: continuous
