statistics

Research Variables

  • By: Claudine T. Villa

  • Inspired by: Mathspace

Learning Outcomes

  • Differentiate between research variables and measurement.

  • Identify the correct sample size using Slovin's formula.

  • Determine the appropriate research design and sampling technique to use.

Variables and Measurement

  • Definition of Variables:

    • Factors that can be manipulated and measured.

    • Characteristics or attributes of persons or objects that assume different values across different objects under consideration.

Classification of Variables

Discrete and Continuous Variables
  • Discrete Variable:

    • Takes a finite or countably infinite number of values.

    • Usually measured by counting or enumeration.

    • Example: Number of students, professors, psychologists, counselors, or hospitals.

  • Continuous Variable:

    • Cannot be counted because their values have no distinct divisions.

    • Variables that can assume any value within an interval on the number line.

    • Example: Intelligence, beauty, effectiveness, cleanliness, weight, height, temperature.

Qualitative and Quantitative Variables
  • Qualitative Variable:

    • Provides categorical responses.

    • Example: Occupation, gender, civil status, religious affiliation, political parties.

  • Quantitative Variable:

    • Numerical values representing an amount or quantity.

    • Example: Height, salary, number of children, weight, time.

Dependent and Independent Variables
  • Independent Variable:

    • The variable that the researcher controls or manipulates according to the purpose of the investigation.

  • Dependent Variable:

    • The variable that is measured to observe the effect of the independent variable.

    • Example: To determine the predictive validity of entrance requirements for freshman students, the independent variables include the national achievement test, entrance examination, and school grades, while the dependent variable is the performance in first-year college.

Cause and Effect

  • Matching Exercise:

    • Causes:

    • Continuous rainy season.

    • Increased price of goods.

    • Effects:

    • Decrease in the number of grocery items you can buy.

    • Increase in umbrella sales.

Independent Variable

  • Changes to this variable will affect the other variable.

Dependent Variable

  • A variable whose value is affected by another variable.

  • Example Structure:

    | Time (in hours) spent studying | Exam Score |
    | ------------------------------ | ---------- |
    | 4 | 84 |
    | 3 | 80 |
    | 6 | 95 |
    | 2 | 76 |

    • Identification:

    • Which variable is dependent?

    • Explanation of dependency.
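
One way to explore the identification question is to sort the study-time and exam-score pairs from the example above by hours studied and see how the score responds. The sketch below is only illustrative: the variable names are assumptions, and four observations cannot prove a causal link.

```python
# (hours studied, exam score) pairs copied from the example above
records = [(4, 84), (3, 80), (6, 95), (2, 76)]

# Sort by the independent variable (hours) to see how the dependent one responds
by_hours = sorted(records)

for hours, score in by_hours:
    print(f"{hours} h studied -> exam score {score}")

# True when every increase in study time comes with a higher score
scores_in_order = [score for _, score in by_hours]
score_rises_with_hours = all(
    earlier < later for earlier, later in zip(scores_in_order, scores_in_order[1:])
)
print("Score rises with hours studied:", score_rises_with_hours)
```

In this data the exam score moves with study time, which is consistent with treating the exam score as the dependent variable and study time as the independent variable.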

Think-Pair-Share Activity

  • Discussion prompt: How do independent and dependent variables relate to cause and effect?

    • Use the example below for discussion:

    • Price of goods vs. number of grocery items you can buy.

Univariable, Bivariable, and Multivariable Distribution
  • Univariable Distribution:

    • Involves only one variable.

    • Example: Age of Grade 7 pupils, Temperature, Sales.

  • Bivariable Distribution:

    • Data classified based on two variables.

    • Example: Ice cream shop monitoring ice cream sales versus temperature of the day.

    • Data Structure:
      | Temperature | Sales (Php) |
      | ----------- | ----------- |
      | 14.2 | 215 |
      | 16.4 | 325 |
      | 11.9 | 185 |
      | … | … |

  • Bivariate Data: Numerical data consisting of two variables organized into pairs of values.

    • Examples: Hours studied vs. score on the exam, Favorite ice cream flavor vs. number of students.

Multivariable Distribution
  • Involves three or more variables.

    • Example: Tracking enrollment in college based on program, year level, and gender.

    • Data Structure Example:
      | Grand Total | Program | Year Level | M | F |
      | ----------- | ------- | ----------- | - | - |
      | 1,095 | Psycholo| 1st Year | 115 | 178 |
      | …. | …. | …. | … | … |

Levels of Measurement

Nominal Scale

  • Classification without numerical value.

  • Also called categorical scales or categorical data.

  • Examples: Sex, employment status, marital status.

Ordinal Scale

  • Classifies and ranks subjects based on degree of possession of a characteristic.

  • Example: Classroom performance rankings (5 - outstanding to 1 - poor).

Interval Scale

  • Combines characteristics of nominal and ordinal scales with predetermined equal intervals.

  • Examples: Temperature in Celsius or Fahrenheit, IQ test scores, calendar years.

  • Note: Lacks a true zero point (e.g., an IQ score of 0 would not mean a complete absence of intelligence).

Ratio Scale

  • Represents the highest, most precise level of measurement.

  • Contains a meaningful zero point (a value of zero means the quantity being measured is absent).

  • Examples: Height, weight, time, distance, and speed.
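
The four scales build on one another, so a variable's level can be worked out by asking three yes/no questions in sequence. The following is a minimal sketch of that decision rule; the function name and the three flags are assumptions made for illustration, and the example variables are taken from these notes.

```python
def level_of_measurement(has_order: bool, equal_intervals: bool, true_zero: bool) -> str:
    """Return the highest scale whose requirements are all met."""
    if not has_order:
        return "nominal"    # categories only, no ranking
    if not equal_intervals:
        return "ordinal"    # ranked, but gaps between ranks are not equal
    if not true_zero:
        return "interval"   # equal gaps, but zero is arbitrary
    return "ratio"          # equal gaps and a meaningful zero


# (has_order, equal_intervals, true_zero) for variables mentioned in these notes
examples = {
    "marital status": (False, False, False),
    "classroom performance ranking": (True, False, False),
    "temperature in Celsius": (True, True, False),
    "height": (True, True, True),
}

for name, properties in examples.items():
    print(f"{name} -> {level_of_measurement(*properties)}")
```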

Identifying Qualitative and Quantitative Variables

  • Task: Classify the following:

    1. Type of school - Qualitative

    2. Number of words correctly spelled - Quantitative, Discrete

    3. House ownership - Qualitative, Nominal

    4. Civil status - Qualitative, Nominal

    5. Educational attainment of respondents - Qualitative, Ordinal

    6. Job satisfaction of employees - Qualitative, Ordinal

    7. Favorite color - Qualitative, Nominal

    8. Number of siblings - Quantitative, Discrete

    9. Study habits - Qualitative, Ordinal

    10. Faculty evaluation - Qualitative, Ordinal

Measurement Levels Categorization

  • Examples with Reasons:

    1. Ranking of college team - Ordinal (has order but unequal differences between ranks).

    2. Student number - Nominal (identifier only).

    3. Temperature in Celsius - Interval (equal intervals but no true zero).

    4. House number - Nominal (labels/identifiers).

    5. Brands of soft drinks - Nominal (no order).

    6. Socio-Economic Status - Ordinal (ordered categories).

    7. Number of vehicles registered - Ratio (true zero).

    8. Zip Code number - Nominal (codes used as labels).

    9. Annual income - Ratio (true zero).

    10. Amount of time spent on online games - Ratio (true zero).

Population and Sample

Population

  • Total or entire group of individuals, events, objects, observations, reactions with unique patterns and characteristics from which information is sought. This is referred to as the universe in statistical investigation.

Sample

  • Portion or subset of the population used to gather information. Represents the unique qualities or characteristics of the population.

Essential Steps in Determining Sample Sizes

  1. Determine the population from which the data is needed.

  2. Identify the target group to generalize the study's results.

  3. Determine the kind of sample to be drawn.

  4. Establish desired sample size using Slovin's formula: $n = \frac{N}{1 + Ne^2}$

    • Where:

      • $n$ = Sample size

      • $N$ = Population

      • $e$ = Estimated margin of error (acceptable error; maximum = 5% or 0.05).
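
Slovin's formula can be sketched in a few lines of Python. The function name, the input checks, and the choice to round up to the next whole respondent are assumptions made here, not part of the formula itself; the population of 1,000 is a hypothetical value used only for illustration.

```python
import math


def slovin_sample_size(population: int, margin_of_error: float) -> int:
    """Sample size from Slovin's formula n = N / (1 + N * e^2).

    The result is rounded up (an assumption of this sketch) so the
    sample is never smaller than the formula requires.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 < margin_of_error < 1:
        raise ValueError("margin_of_error must be a proportion, e.g. 0.05 for 5%")
    n = population / (1 + population * margin_of_error ** 2)
    return math.ceil(n)


# Hypothetical population of 1,000 with a 5% margin of error:
# 1000 / (1 + 1000 * 0.0025) = 1000 / 3.5 = 285.71..., rounded up
print(slovin_sample_size(1000, 0.05))
```

Note that the margin of error is entered as a proportion (0.05), not as a percentage (5).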

Parameter and Statistics

  • Parameter (e.g., μ, the population mean): A measure or numerical characteristic of the population.

  • Statistic: A numerical value that describes the sample; often used synonymously with estimate.

Probability Sampling Method

Definition

  • A sampling process where each unit in the population has a known non-zero probability of being included in the sample.

  • Generally the least biased approach, but often difficult to execute.

Types of Probability Sampling
  1. Simple Random Sampling:

    • Each member has an equal chance of being selected.

    • Can be performed via fishbowl technique, lottery, or random number tables.

    • Advantages: Easy to understand and apply.

    • Disadvantages: May be difficult for large populations; best used for geographically close populations.

  2. Stratified Random Sampling:

    • Samples randomly selected from different groups or sections.

    • Population divided into sub-populations (strata) based on factors like age, gender.

    • Each stratum is subjected to simple random sampling.

    • Advantages: More accurate, tailored sampling designs.

    • Disadvantages: Stratum variable values may be hard to access.

  3. Systematic Random Sampling:

    • Every kth name on a list is selected, useful in arrangements like alphabetical listings.

    • $k = \frac{N}{n}$, where $N$ is the population size and $n$ is the sample size.

    • Advantages: Easy sampling process.

    • Disadvantages: Can lead to bias with periodicity in the population.

  4. Cluster Sampling:

    • Identifies naturally occurring group units for sample selection.

    • Clusters should ideally be heterogeneous.

    • Advantages: Efficient and cost-effective.

    • Disadvantages: Can be misleading if the selected clusters are not representative of the population.

    • When to use: When population can be grouped into clusters.

  5. Multi-Stage Sampling:

    • Used for large geographical areas with respondents spread out.

    • Involves multiple sampling stages.

    • Advantages: Avoids random sampling issues in large populations.

    • Disadvantages: Subjectivity arises during group selections.
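
Four of the techniques above can be sketched with Python's standard `random` module. This is a minimal illustration under stated assumptions: the population of 500 numbered students, the year-level strata, the section clusters, and all function names are hypothetical, and the seed is fixed only so the draws can be repeated.

```python
import random


def simple_random_sample(population, n, rng):
    """Every member has the same chance of selection (lottery-style draw)."""
    return rng.sample(population, n)


def systematic_sample(population, n, rng):
    """Pick a random start, then take every k-th member, with k = N // n."""
    k = len(population) // n
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]


def stratified_sample(strata, fraction, rng):
    """Run a simple random sample inside each stratum, using the same fraction."""
    sample = []
    for members in strata.values():
        size = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, size))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and keep every member of each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


rng = random.Random(42)               # fixed seed for repeatable draws
students = list(range(1, 501))        # hypothetical student numbers 1..500

srs = simple_random_sample(students, 50, rng)
systematic = systematic_sample(students, 50, rng)

# Hypothetical strata by year level (sizes 200, 150, 150)
strata = {
    "1st Year": students[:200],
    "2nd Year": students[200:350],
    "3rd Year": students[350:],
}
stratified = stratified_sample(strata, 0.10, rng)

# Hypothetical clusters: 10 sections of 50 students each
clusters = {f"Section {i + 1}": students[i * 50:(i + 1) * 50] for i in range(10)}
clustered = cluster_sample(clusters, 2, rng)

print(len(srs), len(systematic), len(stratified), len(clustered))
```

The stratified draw takes the same fraction from every stratum, so each year level appears in the sample in proportion to its size; the cluster draw keeps every member of the chosen sections instead of sampling within them.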

Non-Probability Sampling

Definition

  • Sampling methods where selection probabilities aren't specified for individual units in the population. Used when generalization isn't necessary.

Types of Non-Probability Sampling
  1. Purposive Sampling:

    • Selects respondents based on judgment of who can provide the best information.

  2. Convenience Sampling:

    • Selection based on availability of respondents during data collection.

  3. Quota Sampling:

    • Researcher sets a quota and selects participants accordingly.

  4. Snowball Sampling:

    • Utilized when subjects are difficult to identify; recruits through referrals from known participants.

Research Design

Action Research

  • Used for investigating localized problem-solving.

Descriptive Research

  • Aims to understand the characteristics and aspects of a situation.

Explanatory Research

  • Seeks to explain relationships between two or more variables.

Exploratory Research

  • Investigates phenomena not well understood.

Correlational Research

  • Examines the significance of relationships between characteristics or factors.

Evaluation Research

  • Assesses the impacts or outcomes of actions, policies, or programs.

Policy Research

  • Generates information relevant for policy development and assessment of impacts.

Ex-Post Facto or Causal-Comparative Research

  • Observes existing conditions and explores causal factors retrospectively.

Historical Research

  • Addresses problems arising from historical contexts using past data.

Ethnographic Research

  • Seeks holistic descriptions of phenomena through multiple data collection techniques.

Phenomenological Research

  • Begins with shared experiences and investigates effects through respondents' narratives.

Assessment Questions

  1. Temperature reading:

    • B. Interval data

  2. Comparing number of girls to boys:

    • D. Nominal data

  3. Grade ranking of senior class:

    • A. Ordinal data

  4. Eye color of students:

    • D. Nominal data

  5. Grade percentage in Science test:

    • C. Ratio data

  6. Top 50 movies:

    • A. Ordinal data

  7. Jersey numbers of players:

    • D. Nominal data

  8. Most expensive cars:

    • C. Ratio data

  9. Weight of children:

    • C. Ratio data

  10. Waist measurement of contestants:

    • C. Ratio data

Sample Size Estimation Questions:

  1. Estimate the sample size from 5000 students using 5% error.

  2. Sample size needed for 600 target population of Psychometricians with 1% error.

Identifying Sampling Techniques:

  1. Interviewing 20 friends for initial insights - Convenience Sampling.

  2. Selecting 10 specialized neurosurgeons - Purposive Sampling.

  3. Studying a rare disease with referrals - Snowball Sampling.

  4. Surveying 50 out of 500 with random number generation - Simple Random Sampling.

  5. Surveying student satisfaction proportionally across years - Stratified Random Sampling.

  6. Surveying households in randomly selected city blocks - Cluster Sampling.