PSYCHOLOGICAL STATISTICS

Introduction to Psychological Statistics

Overview of Psychological Statistics

  • Psychological statistics involves defining the language of data, distribution, and inference.
  • Key concepts include probability and error. In statistics, "error" does not mean a mistake; it refers to the uncertainty inherent in data.
  • The goal is to calculate probabilities about inferences, which is crucial for statistical analysis.

Key Concepts in Statistics

Population vs. Sample

  • Population: The totality of units or individuals with which we are concerned in research.
  • Parameter: A value that describes a population.
  • Sample: Any portion of the population selected for analysis.
  • Statistic: A value that describes a sample.
  • Note: Psychology consistently relies on samples due to practical constraints.

Sampling

Random Sampling

  • Random Sampling: A method of taking samples in such a way that every unit in the population has an equal chance of being included.
  • This method aims to minimize bias and ensure that the sample accurately represents the population.
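
As a minimal sketch of random sampling, the snippet below draws a simple random sample without replacement. The population of 1,000 participant IDs, the sample size of 50, and the seed are all hypothetical values chosen for illustration.

```python
import random

# Hypothetical population: participant IDs 1..1000 (illustrative only).
population = list(range(1, 1001))

# A fixed seed makes the draw reproducible; in practice the seed is arbitrary.
rng = random.Random(42)

# sample() draws without replacement, so every unit has the same
# chance of being selected and no unit appears twice.
sample = rng.sample(population, k=50)

print(len(sample), "units drawn from", len(population))
```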

Types of Statistics

Descriptive Statistics

  • Definition: Organizes, summarizes, and simplifies data for easier understanding.
  • Uses: Presentation of data and describing data to make predictions.

Inferential Statistics

  • Definition: Generalizes findings from samples to populations, including hypothesis testing and studying relationships among variables.

Levels (or Scales) of Measurement

  • Variables may be measured on one of four scales, which determines the type of statistics and conclusions that can be drawn:
      1. Nominal
      2. Ordinal
      3. Interval
      4. Ratio
  • Importance of understanding scales: They determine the appropriate statistical techniques and inferential methods.

Scales of Measurement for Qualitative Data

Nominal Scale

  • Definition: Consists of non-ordered categorical responses without a specific continuum.
  • Examples: Mood, major, gender.

Ordinal Scale

  • Definition: Comprises ordered categorical responses that exist on a continuum ranging from low to high, but the intervals are not necessarily equal.
  • Examples: Anxiety ratings, rank order descriptions.

Scales of Measurement for Quantitative Data

Interval Scale

  • Definition: Involves numerical responses that are equally spaced, but do not have a true zero point.
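  • Examples: Temperature in Celsius or Fahrenheit, Likert scale ratings (1-5; 1-7). Note: Likert ratings are strictly ordinal but are commonly treated as interval data.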

Ratio Scale

  • Definition: Like the interval scale but has a true zero point, allowing for meaningful ratio comparisons.
  • Examples: Reaction time, accuracy, height, weight.

Frequency Distributions

  • Definition: Describes the number of subjects falling into particular categories, condensing raw data into a count for each category.

Cross-Tabulation

  • Definition: A method of categorizing data based on two or more variables.
  • Example table for political sub-groups:
      - Democrats: 24
      - Republicans: 1
      - Total: 25
  • Frequency percentage can be calculated as:
       $$\text{Frequency Percentage} = \frac{\text{Sub-group total}}{\text{Overall total}} \times 100$$
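
The frequency percentage formula can be checked against the example table with a short sketch. The dictionary name `counts` is an illustrative choice; the numbers come from the table above.

```python
# Sub-group counts from the example table.
counts = {"Democrats": 24, "Republicans": 1}

# Overall total across all sub-groups (25 in the example).
total = sum(counts.values())

# Frequency percentage = sub-group total / overall total * 100.
percentages = {group: count * 100 / total for group, count in counts.items()}

for group, pct in percentages.items():
    print(f"{group}: {pct:.1f}%")
# Democrats: 96.0%
# Republicans: 4.0%
```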

Data Visualization

Bar Graphs

  • Rule: Appropriate for nominal scales of measurement.
  • Show frequency distributions across different categories (e.g., college major).

Histograms

  • Rule: Used for quantitative data (interval or ratio scales).
  • Visually represent distribution of numerical data using bars, emphasizing the frequency of different ranges.

Polygons (Line Graphs)

  • Used to illustrate frequency distributions over a range of values.

The Normal Distribution

  • Characteristic: A symmetrical, bell-shaped distribution where approximately 68% of data falls within one standard deviation of the mean, and approximately 95% falls within two.
  • Commonly used for variables such as body temperature, IQ scores, and height.
  • Not every distribution is normal; other shapes include J-shaped, rectangular, and bimodal distributions.
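
The 68% and 95% figures can be verified numerically. This sketch uses the standard normal distribution (mean 0, standard deviation 1) from Python's `statistics` module; the variable names are illustrative.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
z = NormalDist()

# Proportion of the distribution within 1 and 2 standard deviations of the mean.
within_one_sd = z.cdf(1) - z.cdf(-1)
within_two_sd = z.cdf(2) - z.cdf(-2)

print(f"Within 1 SD: {within_one_sd:.4f}")  # about 0.6827
print(f"Within 2 SD: {within_two_sd:.4f}")  # about 0.9545
```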

Measures of Central Tendency

Definitions

  • Mode: The most frequently occurring value in a data set.
  • Median: The middle value when data is ordered, providing a central point dividing the data set into two equal halves.
  • Mean: The arithmetic average, calculated as:
       $$\text{Mean}\ (\bar{x}) = \frac{\text{Sum of all observations}}{\text{Total number of observations}}$$

Notes on Usage

  • For quantitative data, all three measures are applicable.
  • For qualitative data, the mode is always appropriate; the mean is not valid.
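
A brief sketch of the three measures, using Python's `statistics` module. The scores and the list of majors are made-up illustrative data.

```python
from statistics import mean, median, mode

# Hypothetical quantitative data: all three measures apply.
scores = [2, 3, 3, 5, 7, 10]

score_mean = mean(scores)      # (2 + 3 + 3 + 5 + 7 + 10) / 6 = 5
score_median = median(scores)  # average of the two middle values, 3 and 5 = 4.0
score_mode = mode(scores)      # 3 occurs most often

# Hypothetical qualitative (nominal) data: only the mode is meaningful.
majors = ["psychology", "biology", "psychology", "history"]
major_mode = mode(majors)      # "psychology"

print(score_mean, score_median, score_mode, major_mode)
```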

Variability Measures

1. Range

  • Defined as the distance from the lowest value to the highest value.
  • Note: Considers only two data points.

2. Variance ($s^2$)

  • Definition: The average of the squared deviations from the mean.
  • Calculation involves utilizing all data points:
      $$s^2 = \frac{\sum (x - \bar{x})^2}{N}$$

3. Standard Deviation (SD)

  • The square root of the variance, representing roughly the typical distance of data points from the mean, expressed in the original units of measurement.
  • Calculated as:
    $$s = \sqrt{s^2}$$

4. Standard Error of the Mean (SEM)

  • Definition: Standard deviation divided by the square root of the sample size:
       $$\text{SEM} = \frac{s}{\sqrt{n}}$$
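
The four variability measures can be worked through on one small data set. This is a sketch with made-up numbers; it divides by N as in the variance formula in these notes, whereas many textbooks and software packages divide by n - 1 when estimating from a sample.

```python
import math

# Hypothetical data set (illustrative only).
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)

# Range: highest value minus lowest value.
value_range = max(data) - min(data)          # 9 - 2 = 7

# Mean, needed for the deviations.
x_bar = sum(data) / n                        # 40 / 8 = 5.0

# Variance: average of squared deviations from the mean (dividing by N).
variance = sum((x - x_bar) ** 2 for x in data) / n   # 32 / 8 = 4.0

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)                     # 2.0

# Standard error of the mean: SD divided by the square root of n.
sem = sd / math.sqrt(n)                      # about 0.707

print(value_range, variance, sd, round(sem, 3))
```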

Sampling Error

  • Definition: Refers to variability among samples that occurs by chance, not reflecting true population characteristics.
  • Key Questions to Consider:
      1. Are the observed differences real or merely due to sampling error?
      2. Are our inferences about the population valid?
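
Sampling error can be illustrated with a small simulation: several samples drawn from the same population give different means purely by chance. The population (10,000 simulated scores with mean 100 and SD 15), the sample size of 25, and the seed are assumptions made for the example.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed for reproducibility

# Hypothetical population of 10,000 scores (mean 100, SD 15).
population = [rng.gauss(100, 15) for _ in range(10_000)]

# Draw five independent random samples of 25 and record each sample mean.
sample_means = [mean(rng.sample(population, 25)) for _ in range(5)]

# The sample means differ from one another even though the population is the same.
spread = max(sample_means) - min(sample_means)

print([round(m, 1) for m in sample_means])
print("Spread of sample means:", round(spread, 1))
```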

Hypothesis Testing

Null Hypothesis (H₀)

  • Definition: Predicts no differences between means; notation: H₀: m₁ = m₂.
  • Core principle: The null hypothesis is the hypothesis that is directly tested; the outcome is either to reject it or to fail to reject it.

Alternative Hypothesis (H₁)

  • Definition: Predicts that differences between groups exist; notation: H₁: m₁ ≠ m₂.

Error Types in Hypothesis Testing

Type I Error (Alpha)

  • Definition: Occurs when the null hypothesis is rejected when it is actually true.
  • Significance level commonly set at 0.05.

Type II Error (Beta)

  • Definition: Occurs when the null hypothesis is not rejected even though it is false, meaning real differences exist but are overlooked.

Power Analysis

  • Definition: The probability of correctly rejecting the null hypothesis when it is false; the ability to detect an effect if one exists.
  • Strategies to increase power:
      1. Increase sample size (n).
      2. Decrease variability among sample measurements.
      3. Use more precise instruments for measurement.
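
To see how sample size and variability affect power, here is a sketch for one simple case: a two-sided one-sample z-test with a known population standard deviation. The choice of test, the function name `z_test_power`, and the effect and SD values are assumptions made for illustration; power calculations for t-tests and ANOVA use different distributions.

```python
from statistics import NormalDist

def z_test_power(effect, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test (known sigma).

    effect: true difference between the population mean and the null value
    sigma:  population standard deviation
    n:      sample size
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)  # critical value, about 1.96 for alpha = .05
    shift = effect * n ** 0.5 / sigma           # true effect in standard-error units
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

baseline = z_test_power(effect=5, sigma=15, n=20)
larger_n = z_test_power(effect=5, sigma=15, n=80)         # strategy 1: more participants
less_variability = z_test_power(effect=5, sigma=10, n=20) # strategies 2 and 3: less noise

print(round(baseline, 2), round(larger_n, 2), round(less_variability, 2))
```

With no true effect (`effect=0`), the function returns alpha, the Type I error rate.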

Effect Size

  • Definition: A measure of the magnitude of differences attributed to the treatment, distinguishing practical significance from statistical significance.
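
These notes do not name a specific effect size measure. One widely used option for two groups is Cohen's d, the difference between group means divided by the pooled standard deviation. The sketch below assumes that measure; the function name and the data are illustrative.

```python
import math
from statistics import mean, variance

def cohens_d(group_a, group_b):
    """Cohen's d: mean difference in pooled-standard-deviation units."""
    n_a, n_b = len(group_a), len(group_b)
    # statistics.variance divides by n - 1 (sample variance).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)

# Hypothetical scores for a treatment group and a control group.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

d = cohens_d(treatment, control)
print(round(d, 2))  # about 1.26
```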

Tools for Testing Mean Differences

T-Test

  • Definition: Used when comparing only two groups; types include:
      - Independent T-Test: Between different subjects.
      - Correlated T-Test: Within subjects or matched.
  • Metric used: t-statistic (critical values based on degrees of freedom and alpha).
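
As a sketch of how the t-statistic is obtained for an independent t-test, the code below computes it by hand using the pooled-variance (Student's) form, which assumes roughly equal variances in the two groups. The data and function name are illustrative.

```python
import math
from statistics import mean, variance

def independent_t(group_a, group_b):
    """Return (t, degrees of freedom) for a pooled-variance independent t-test."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    # Pooled sample variance (statistics.variance divides by n - 1).
    pooled_var = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df
    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / se_diff, df

# Hypothetical scores from two different groups of participants.
treatment = [6, 7, 8, 9, 10]
control = [4, 5, 6, 7, 8]

t_stat, df = independent_t(treatment, control)
print(round(t_stat, 2), df)  # 2.0 8
```

The obtained t is then compared with the critical value for the given degrees of freedom and alpha; for df = 8 and a two-tailed alpha of .05 the tabled critical value is 2.306, so t = 2.0 would not lead to rejecting the null hypothesis.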

Analysis of Variance (ANOVA)

  • Definition: Used for comparing more than two groups; types include:
      - Between Subjects: Different participants in each group.
      - Within Subjects: Same participants across conditions (repeated measures).
      - One-way ANOVA: One independent variable.
      - Factorial ANOVA: Multiple independent variables.
  • Metric used: F-statistic (the ratio of between-group variance to within-group variance).
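
A minimal sketch of a one-way, between-subjects ANOVA computed by hand, to show where the F-statistic comes from. The three groups of scores and the function name are made up for illustration.

```python
from statistics import mean

def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way between-subjects ANOVA."""
    all_scores = [x for group in groups for x in group]
    grand_mean = mean(all_scores)

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    # F is the ratio of the two mean squares.
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical scores for three different groups of participants.
f_stat, df_b, df_w = one_way_anova_f([4, 5, 6], [6, 7, 8], [8, 9, 10])
print(f_stat, df_b, df_w)  # 12.0 2 6
```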

Meta-Analysis

  • Definition: Involves statistical averaging of results from multiple independent studies evaluating the same phenomenon.
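
One simple way to average results across studies is to weight each study's effect size by its sample size, so larger studies count for more. This is a sketch under that assumption, with hypothetical effect sizes and sample sizes; published meta-analyses more commonly weight by the inverse of each study's variance.

```python
# Hypothetical studies: (effect size, sample size). Not real data.
studies = [
    (0.2, 50),
    (0.5, 100),
    (0.8, 50),
]

# Total number of participants across all studies.
total_n = sum(n for _, n in studies)

# Sample-size-weighted mean effect size.
weighted_effect = sum(effect * n for effect, n in studies) / total_n

print(round(weighted_effect, 2))  # 0.5
```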