Round-up: lecture notes
Measurement Basics
What is Measurement?
The systematic observation and recording of something in the world as a number for analysis.
Involves standardizing how observations are made and understanding the implications of what is being measured.
How are Things Measured?
Quizzes
Studies
Quantitative Measures: Numerically observable things, primarily used for data analysis.
Qualitative Measurement: Measured as written observations that are then categorized to achieve goals for what is being measured.
Example: Observing rural communities and then creating characteristic variables from these observations.
The Basic Measurement Model
Represented as:
This model is an operation of real-world measurement.
Errors in Measurement
Sources of Error:
Sample size might be insufficient.
Questions might be misinterpreted by participants.
There is always some level of error present in measurement.
The goal is to make the error as small as possible.
Constructs and Conceptualization
Construct
To "operationalize the construct" means to give it life and explicitly define what your concept means.
Example: What does poverty mean? Is it based on income? Does a certain level of income determine poverty?
Conceptualization
The process of explicitly defining what it is that you seek to measure.
Example: What do gender norms and gender attributes mean?
Operationalizing Conceptualization
Involves identifying the specific indicators that signify or define the gender norms you are interested in.
Challenges to Conceptualization
Determining which indicators will be acceptable within society.
Defining standards for abstract concepts (e.g., defining the standard of intelligence, defining the standard of poverty).
Manifest vs. Latent Constructs
Manifest Constructs: Easily observable and directly measurable.
Examples: Height, weight.
Latent Constructs: Less observable and not directly measurable; inferred from other measures.
Example: Intelligence.
Dimensions
Constructs that have multiple layers or components treated as a system.
Example: Health is not a simple "yes" or "no" state; it requires multiple measures to establish a standard of health and understand someone's health status comprehensively.
Operationalization
What is Operationalization?
Taking the concept of a construct and translating it into a concrete form of action.
Refers to the specific procedure used to measure the construct itself.
Relies on indicators to keep the measurement within measurable bounds.
Protocol
The set of rules for operating the measurement instruments.
Outlines the process and procedure of what needs to be done.
Provides instructions that guide the entire measurement process.
Proxies
Measurements that are taken on behalf of the actual concept when the concept itself is difficult or impossible to measure directly.
Indicators
The attributes that place a data point into a specific category or "bin" of concept measurement as a response.
Every indicator will have associated random errors.
Errors can depend on the instruments used and how the participant responded.
Response Format
Refers to how participants respond to a question or statement.
Includes the range of choices and potential responses offered.
Also encompasses the tools used for participants to respond.
Example: Rating a statement on a numeric scale.
Example: The Big Five Inventory uses specific response formats.
Validity
What is Validity?
How well a measure truly represents the construct it is intended to measure.
Types of Validity
Face Validity: How well the measurement appears, at face value, to capture what we intended to measure.
Content-Related Validity: The extent to which the measurement covers the full scope of the construct, including all of its relevant dimensions.
Criterion-Related Validity: Uses empirical evidence to demonstrate how well a measure correlates with a criterion.
Concurrent Validity: If measures are strongly correlated between two different measures taken at the same time, then there is strong concurrent validity.
Predictive Validity: Your measurement accurately predicts a future outcome or a pattern of a concept.
Example: If a job satisfaction measure is valid, those who score low on satisfaction are more likely to quit their job in the following months.
Convergent Validity: The measure relates to other variables that it is theoretically expected to be related to; we expect the two variables to be correlated.
Example: Health and age are generally expected to be correlated.
Discriminant Validity: The opposite of convergent validity. A measure of a construct should not be correlated with another construct from which it is theoretically distinct.
This indicates that something else might be measured inadvertently within the construct if it is correlated to an unrelated construct.
Example: If there is a supposed decrease in health correlating with a specific religion (which is a latent construct), it suggests a potential error or bias being measured rather than a true relationship between health and religion.
Nomological Validity: (Mentioned, but not elaborated on in the transcript).
Measurement Error
What is Measurement Error?
Error inherent in the measurement of a construct.
Types of Measurement Error
Random/Noisy Error:
Error with an average of $0$; the expected value of the error is $0$.
Error that occurs unpredictably and by chance.
Systematic Error:
Consistently over- or underestimating the actual result of the measurement.
Often referred to as bias.
Can arise from the use of the instrument itself.
Can stem from the way a participant answers a question (e.g., social desirability, where answers follow perceived social norms).
Important Distinction: Bias in research methods and bias in statistics do not mean the same thing.
Example of Sampling Bias: Trying to measure the average height of a population but only picking taller people; the sample mean will then be biased upward.
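To make the distinction between random and systematic error concrete, here is a minimal simulation sketch. The true value of 170 cm, the noise level, and the 1.5 cm bias are all hypothetical numbers chosen for illustration; they do not come from the lecture.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 170.0  # hypothetical true height in cm
BIAS = 1.5          # hypothetical constant bias, e.g. a miscalibrated tape

# Random error only: noise centred on zero.
random_only = [TRUE_VALUE + random.gauss(0, 2.0) for _ in range(10_000)]

# Systematic error: the same kind of noise plus the constant bias.
biased = [TRUE_VALUE + BIAS + random.gauss(0, 2.0) for _ in range(10_000)]

mean_random = statistics.fmean(random_only)
mean_biased = statistics.fmean(biased)

print(f"mean with random error only: {mean_random:.2f}")
print(f"mean with systematic error:  {mean_biased:.2f}")
```

With many measurements, the random errors roughly cancel and the first mean lands close to the true value. The second mean stays off by about the size of the bias no matter how many measurements are taken.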
True Value
The standard or benchmark against which your observed measurement will be compared, based on the categorized indicators.
Reliability
What is Reliability?
Refers to the consistency of a measure. More noise leads to less reliability.
Why Reliability Matters
When Calculating Averages:
If the sample size is large enough, random measurement errors will tend to cancel each other out.
However, a high degree of random error will still lower confidence in the average.
Confidence Interval:
An estimated range of values.
A statement like "95% of the time, the true value will fall within this range" depends on reliability.
Estimating Relationships: Random errors can weaken observed relationships between variables.
Classifying Individuals: Reliable measures are crucial for accurate classification.
Tracking Changes Over Time: Essential for monitoring performance (e.g., police responsiveness, hospital efficiency).
These processes are often supported and affected by external factors like changes in technology.
How to Test Reliability
Test-Retest Reliability: Testing the same individuals more than once to check for consistency and differences in scores.
Interrater Reliability: Two or more interviewers or observers check each other's answers or observations to test for consistent questions and similar answers/interpretations.
Split-Half Reliability: (Mentioned, but not detailed).
Parallel Forms Reliability: Administer two equivalent versions of a test and compare the results to measure where they agree and disagree.
Example: Checking the quality of teachers and the level of interpretation agreement among students on a given task.
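One common way to summarize test-retest reliability is the correlation between the two sets of scores. The sketch below assumes that approach; the six pairs of scores are made up, and `pearson` is a small helper written here for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical scores for the same six people, tested twice.
first_test = [12, 15, 9, 20, 17, 11]
second_test = [13, 14, 10, 19, 18, 10]

r = pearson(first_test, second_test)
print(f"test-retest correlation: {r:.3f}")
```

A correlation near 1 suggests the measure ranks people consistently across the two occasions; more noise in the measure would pull it toward 0.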
Levels of Measurement
Quantitative Variables
Interval
Ratio
Categorical Variables
Nominal
Ordinal
Ethnography Assignment
Purpose: Conduct an ethnography in any public location or setting.
Starting Point: Begin with a specific research question, such as gender, race, or age segregation among bus riders.
Process:
Take detailed notes.
Record observations.
Transcribe recorded observations.
Write up the notes.
Read existing ethnographies for reference to inform the process and provide context.
Data Fundamentals
Primary and Secondary Data
Quantitative Data:
Defined as a numerical form of recorded and coded information.
Quantitative Variables: Inherently numeric variables.
Categorical Variables as Quantitative: Many categorical variables can be converted into quantitative forms for analysis.
Example 1: The number of years of education is numerical, but it can also be grouped into categories when analyzing other variables like income and health status.
Example 2: Gender can be numerically coded (e.g., Male $= 0$, Female $= 1$) for use in statistical analyses like regression.
Level of Aggregation
Microdata:
Represents the unit of observation at the most specific or basic level.
Example: The individual persons being measured in a national census.
Aggregate Data:
Summaries (such as totals or averages) of microdata that have been grouped by category.
Example: The average class score of a quiz, which combines and summarizes the microdata (individual student scores).
Multilevel Data:
Refers to the sub-tiers or hierarchical levels within an aggregate set of data.
Example: State-by-state level data collected within a national census, representing an intermediate level between individual microdata and the full national aggregate (population or sample).
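The three levels of aggregation can be sketched with a tiny made-up dataset. The states "A" and "B" and the income figures are hypothetical.

```python
from collections import defaultdict
from statistics import fmean

# Microdata: one record per person (hypothetical values).
microdata = [
    {"state": "A", "income": 40_000},
    {"state": "A", "income": 60_000},
    {"state": "B", "income": 30_000},
    {"state": "B", "income": 50_000},
    {"state": "B", "income": 40_000},
]

# Intermediate level: group individual incomes by state.
incomes_by_state = defaultdict(list)
for person in microdata:
    incomes_by_state[person["state"]].append(person["income"])

state_means = {state: fmean(values) for state, values in incomes_by_state.items()}

# Full aggregate: one number summarizing everyone.
national_mean = fmean([person["income"] for person in microdata])

print(state_means)    # {'A': 50000.0, 'B': 40000.0}
print(national_mean)  # 44000.0
```

Note that the national mean (44,000) is not the simple average of the two state means (45,000), because state B contributes more people.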
Data Types: Time Dimensions
Cross-Sectional Data:
A "snapshot" of public opinion and data captured at a single, specific point in time.
It analyzes different populations at one particular time.
Longitudinal Data:
Involves studying the same subjects or entities over an extended period to analyze trends and relationships.
Example: Studying the academic and personal development of the same student from one grade to the next.
Panel Data:
A type of longitudinal study that examines the change in a dependent variable ($Y$) given multiple independent variables over a specific amount of time.
Can involve measurements taken year over year.
Pooled Cross-Sectional Data:
Combines two or more cross-sectional datasets that may not necessarily be seeking identical objectives.
Example: A year-over-year regression analysis of the income of new lessees in apartments with a maximum lease term, against various household data.
Time Series Data:
Consists of data recorded sequentially over time.
Each data point is specifically associated with a particular moment or interval in time.
Surveys
What is a Survey?
A systematic collection of information from individuals or organizations, with the subsequent organization of answers to support a specific research question.
Steps in a Survey
Identify the Population: Clearly define the target group for the survey.
Develop a Questionnaire: Create a set of questions designed to elicit the desired information.
Pretest Questionnaire: Test the questionnaire with a small group similar to the target population to:
Ensure the wording is unambiguous and clear.
Validate the questions, ensuring they measure what they intend to measure.
Recruit and Train Interviewers: If applicable, select and equip interviewers with the necessary skills to avoid biasing interviewees.
Collect Data: Administer the survey to the identified population.
Analyze and Present Findings: Process the collected data and communicate the results.
Modes of Data Collection
Intercept Interview Surveys:
An interview technique known for constant feedback and generally high response rates.
Typically conducted in high-traffic public spaces such as shopping malls.
Household Interview Surveys:
In-person interviews conducted in the respondents' homes.
Common for sensitive topics like income or health.
Telephone Surveys:
Conducted over the phone, often for customer satisfaction or public opinion polling.
Automated Telephone Surveys:
Surveys where questions are asked and responses collected using automated voice systems.
Mail Self-Administered Surveys:
Questionnaires sent via mail for respondents to complete and return on their own.
Group Administered Surveys:
A practical and cost-effective method where surveys are completed by a group of respondents simultaneously, often in a classroom or meeting setting.
Web Surveys:
Surveys administered online, typically via email invitations or website links.
Establishment Surveys:
Surveys focused on collecting data from companies and/or organizations rather than individuals.
Ethics of Survey Research
Informed Consent: Ensuring participants understand the survey's purpose, risks, and benefits before agreeing to participate.
Pushing for High Response Rate: Balancing the need for sufficient data with avoiding undue pressure on potential respondents.
Overburdening Respondents: Designing surveys that are not excessively long or demanding, respecting respondents' time and effort.
Protecting Privacy and Confidentiality: Safeguarding personal information and ensuring responses cannot be linked back to individual participants.
Surveying Minors and Other Vulnerable Populations: Implementing additional safeguards and obtaining appropriate permissions when surveying individuals who may be more susceptible to coercion or harm.
Making Survey Data Available for Public Use: Considering the ethical implications of sharing data, including anonymization and data security.
Statistics
Standard Deviation ($\sigma$ or $s$):
A measure of how dispersed the data points are in relation to the mean of the dataset.
A low or small standard deviation indicates that data points are tightly clustered around the mean.
A high or large standard deviation signifies that data points are more spread out from the mean.
Confidence Interval (CI):
Shows the range of values within which you expect the true population estimate to fall, at a specified level of confidence (e.g., a 95% CI).
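As a rough sketch of how these two quantities are computed, the example below uses ten made-up quiz scores. It assumes a 95% confidence level and the normal approximation (multiplier 1.96); with a sample this small, a t-based multiplier would give a somewhat wider interval.

```python
import math
import statistics

# Hypothetical quiz scores.
scores = [72, 85, 90, 66, 78, 88, 95, 70, 82, 74]

n = len(scores)
mean = statistics.fmean(scores)
s = statistics.stdev(scores)         # sample standard deviation
standard_error = s / math.sqrt(n)    # standard error of the mean

Z_95 = 1.96  # assumed multiplier: normal approximation at the 95% level
ci_low = mean - Z_95 * standard_error
ci_high = mean + Z_95 * standard_error

print(f"mean = {mean:.1f}, s = {s:.2f}")
print(f"approximate 95% CI: ({ci_low:.2f}, {ci_high:.2f})")
```

A larger standard deviation or a smaller sample widens the interval, which is one way noise in a measure lowers confidence in the average.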
Normal Distribution:
A probability distribution where the population mean is located at the highest point of a symmetrical bell curve.
The peak marks the center of the data set, with data values tapering off symmetrically on either side.
Multivariate Regression
Definition: Explains a dependent variable (Y) using more than one independent variable.
General Formula: The dependent variable (Y) is represented as:
$Y = a + b_1 X_1 + b_2 X_2 + \dots + b_k X_k + e$
Where:
$a$ is the constant (intercept).
$b_1, b_2, \dots, b_k$ are the coefficients (slopes) for each independent variable.
$X_1, X_2, \dots, X_k$ are the independent variables.
$e$ is the error term.
Example Application: Predicting earnings based on education and experience.
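The earnings example can be sketched in code. The block below is a minimal illustration, not a production estimator: `ols` is a helper written here that solves the least-squares normal equations by Gauss-Jordan elimination, and the data are constructed without noise as earnings = 10 + 2 × education + 1 × experience (hypothetical units) so that the recovered coefficients can be checked by eye.

```python
def ols(X, y):
    """Least-squares coefficients [a, b1, ..., bk] via the normal equations."""
    rows = [[1.0] + list(r) for r in X]  # prepend a column of 1s for the intercept
    k = len(rows[0])
    # Build X'X and X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting on [X'X | X'y].
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for row in range(k):
            if row != col:
                factor = aug[row][col]
                aug[row] = [vr - factor * vc for vr, vc in zip(aug[row], aug[col])]
    return [aug[i][k] for i in range(k)]

# Hypothetical data: (years of education, years of experience).
X = [(12, 5), (16, 2), (12, 10), (14, 8), (18, 1), (10, 15)]
# Earnings built as 10 + 2*education + 1*experience, with no noise.
y = [39, 44, 44, 46, 47, 45]

a, b_educ, b_exper = ols(X, y)
print(f"a = {a:.3f}, b_educ = {b_educ:.3f}, b_exper = {b_exper:.3f}")
```

With real data the fit would not be exact, and in practice a statistics library would normally be used instead of a hand-written solver.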
Why Regression is Used
Prediction: To predict the dependent variable using a combination of independent variables.
Description of Data: Provides a way to describe relationships within the dataset.
Causal Effect Estimation: Used to estimate whether one variable causally affects another and to determine the magnitude of that effect. For example, using regression on tobacco use to find its effect on health insurance.
Best Fit Linear Predictor: Regression finds the linear relationship that best fits the data points. The tightness of this fit is quantified by $R^2$.
Relationship and Interpretation: Explores the relationships between independent and dependent variables, and the interpretation of the coefficients is crucial for understanding these relationships.
Interpretation of Coefficients
Constant ($a$): Represents the predicted value of the dependent variable when all independent variables are equal to $0$.
Coefficient/Slope ($b$): Each slope describes the change in the dependent variable for a one-unit increase in its corresponding independent variable, assuming all other independent variables are held constant.
What Variation Matters?
Collinearity: If independent variables are highly correlated with each other (e.g., education and experience tend to move together in the sample), it can be difficult to isolate their individual effects.
Adding Explanatory Variables: Including additional independent variables, such as experience, can help to better explain how increases in education lead to increased earnings by accounting for other contributing factors.
R Squared ($R^2$)
Definition: Represents the proportion of the variation in the dependent variable that is predicted or explained by all independent variables combined.
Interpretation: A higher $R^2$ indicates a better prediction because it suggests a tighter fit to the data and less error in the model.
Effect of Adding Variables: Adding a new independent variable never decreases $R^2$ and almost always increases it, even if that variable does not contribute any meaningful predictive power to the model. This is a limitation of $R^2$.
Adjusted R Squared
Definition: A modified version of $R^2$ that accounts for the number of predictors (independent variables) included in a regression model.
Purpose: It penalizes the inclusion of unnecessary predictors that do not significantly improve the model's predictive power.
Benefit: Provides a more reliable measure of model quality, especially when comparing different models that have varying numbers of predictors.
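For reference, the standard textbook formulas behind these two measures are given below. The symbols are assumptions introduced here, not taken from the lecture: $n$ is the number of observations, $k$ the number of independent variables (excluding the constant), $\hat{Y}_i$ the predicted value, and $\bar{Y}$ the sample mean of the dependent variable.

```latex
R^2 = 1 - \frac{\sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2}
\qquad
R^2_{\text{adj}} = 1 - (1 - R^2)\,\frac{n - 1}{n - k - 1}
```

Because the factor $\frac{n-1}{n-k-1}$ grows with $k$, adding a predictor raises the adjusted value only if the gain in $R^2$ outweighs the penalty.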
Multicollinearity
Definition: Occurs when two or more independent variables in a regression model are highly correlated with each other.
Challenge: Makes it difficult to accurately disentangle the individual effects of the independent variables on the dependent variable.
Perfect Multicollinearity: An extreme case where one independent variable is a perfect linear function of one or more other independent variables, making it impossible to estimate the separate coefficients.
Precision
Achieving Higher Precision: Higher precision is associated with a lower standard error of the coefficient estimates.
Factors Influencing Precision:
Large Sample Size: More data points generally lead to more precise estimates.
Tighter Fits (Higher $R^2$): Models that explain a larger proportion of the dependent variable's variance usually have more precise estimates.
More Variation in Independent Variables: Greater variability in the independent variables provides more information for the model to work with, leading to better precision.
Less Multicollinearity: Reducing multicollinearity helps in obtaining more precise and reliable estimates of individual variable effects.
Effect of Adding Independent Variables: The effect of an additional independent variable on precision is complex; it might raise or lower precision depending on its contribution to explained variance and its correlation with existing variables.
Dummy Variables
Definition: A variable that takes a value of $1$ if a certain condition is true and $0$ if it is false.
Example (Female Dummy): Takes a value of $1$ if an individual is a woman and $0$ if not.
Reference Category: In dummy variable coding, the category assigned a value of $0$ for all dummy variables is known as the reference category. All comparisons are made relative to this category.
Categorical Variables: Categorical variables with more than two categories can be converted into a set of dummy variables (typically, if there are $k$ categories, $k - 1$ dummy variables are created).
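The coding scheme described above can be sketched as follows. The `make_dummies` helper, the `region` variable, and its three categories are hypothetical; "north" is chosen arbitrarily as the reference category.

```python
def make_dummies(values, reference):
    """Turn a categorical variable into 0/1 dummies, omitting the reference category."""
    categories = sorted(set(values) - {reference})
    return [{f"is_{c}": int(v == c) for c in categories} for v in values]

# Hypothetical categorical variable with three categories.
regions = ["north", "south", "west", "south", "north"]

dummies = make_dummies(regions, reference="north")
for region, row in zip(regions, dummies):
    print(region, row)
# north {'is_south': 0, 'is_west': 0}
# south {'is_south': 1, 'is_west': 0}
# west {'is_south': 0, 'is_west': 1}
# south {'is_south': 1, 'is_west': 0}
# north {'is_south': 0, 'is_west': 0}
```

Three categories produce two dummies. Rows that are 0 on every dummy belong to the reference category, so each dummy's coefficient is read as a difference from "north".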
Interaction Term
Definition: A term created by multiplying two or more independent variables together. It allows for testing whether the effect of one variable on the dependent variable depends on the level of another variable.
Purpose: Used to test for conditional effects and to model more realistic relationships where the impact of one factor is not constant but varies depending on another factor.
Example (Exam Scores): The effect of sleep time on exam scores might depend on study time.
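Written out for the exam-score example, an interaction model might look like the following. The symbols are assumptions for illustration: $a$ is the constant, $b_1, b_2, b_3$ are coefficients, and $e$ is the error term.

```latex
\text{Score} = a + b_1\,\text{Sleep} + b_2\,\text{Study} + b_3\,(\text{Sleep} \times \text{Study}) + e
```

```latex
\frac{\partial\,\text{Score}}{\partial\,\text{Sleep}} = b_1 + b_3\,\text{Study}
```

The second line shows why this is a conditional effect: the change in score from one more hour of sleep is not a single number but depends on study time. If $b_3 = 0$, the model reduces to the ordinary case where the effect of sleep is constant.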
Causation and Validity of Research Studies
Causation
Definition: Relationship between two events where a change in one event causes a change in another.
Key Concepts:
Reverse Causation: The presumed outcome actually influences the presumed cause, so the outcome appears to occur before the exposure.
Spurious Correlation: Two variables appear causally related due to a third variable that is unaccounted for.
Bias: Influenced by confounding variables or common causes, e.g., the correlation between shoe size and reading ability (both increase with age).
Distortion of Causal Relationship: Caused by extraneous variables or measurement biases, such as underreporting socially undesirable behaviors.
Sampling Bias
Representational bias can arise when certain groups are overrepresented.
Reverse Causation and Simultaneity Bias: Unclear if variable X is affecting Y or vice versa.
Intervening Variables: Explain the relationship between X and Y. Examples include communication patterns affecting family dynamics.
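The shoe size and reading ability example can be simulated to show how a common cause produces a spurious correlation. Everything numeric below is hypothetical: ages run from 6 to 12, and both shoe size and reading score are generated from age plus independent noise, with no direct link between them.

```python
import math
import random

random.seed(1)  # fixed seed so the sketch is repeatable

def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Age is the common cause; shoe size and reading never affect each other.
ages = [random.randint(6, 12) for _ in range(2000)]
shoe_size = [0.5 * a + random.gauss(0, 0.5) for a in ages]
reading = [10 * a + random.gauss(0, 10) for a in ages]

raw_r = pearson(shoe_size, reading)

# Hold the common cause fixed by looking only at eight-year-olds.
idx = [i for i, a in enumerate(ages) if a == 8]
within_r = pearson([shoe_size[i] for i in idx], [reading[i] for i in idx])

print(f"correlation across all ages:     {raw_r:.2f}")
print(f"correlation among 8-year-olds:   {within_r:.2f}")
```

Across all ages the two variables are strongly correlated, but once age is held fixed the correlation is close to zero, which is what controlling for a common cause is meant to reveal.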
Direct and Indirect Effects
Direct Effect: Independent variable influences dependent variable directly.
Indirect Effect: The independent variable influences the dependent variable through an intervening variable.
Evidence for Causation
Cause should precede the effect.
Cause must be related to the effect.
Strength of evidence increases when plausible common causes are controlled for.
Endogeneity
Occurs when the independent variable is not truly independent, leading to bias due to correlation with error terms.
Sources of Endogeneity:
Omitted variable bias.
Measurement error.
Simultaneity.
Internal Validity
A measure of how well evidence supports the claim about cause and effect.
Trustworthiness of study conclusions regarding the relationship between independent and dependent variables.
Threats to Internal Validity:
Ambiguous Temporal Precedence: Unclear temporal ordering of events.
Selection Bias: Non-representative samples.
History Effects: External events impacting study outcomes.
Maturation: Participant changes over time affecting results.
Attrition: Dropout of participants can skew results if remaining participants are systematically different.
Testing and Instrumentation Bias: Changes in testing or measurement tools may alter data.
External Validity
Extent to which study results can be generalized beyond specific conditions.
Replication: Results should hold in different settings over time.
Threats to External Validity:
Interaction of causal relationships with participant characteristics.
Variability in treatment outcomes based on conditions.
Context-Dependent Mediation: Treatment effects or outcomes may change based on the environment. An example is how cultural factors affect program implementation success.
Modality of Causal Inference
Causal inference is about determining whether an observed association between two variables reflects a causal relationship (one variable causing changes in another), not just correlation.
The instructor emphasizes caution: correlation and causation are related but distinct, and they should not be used interchangeably in the same sentence.
The term “causal inference” will be used throughout the course, along with the potential outcomes framework, to analyze when an observed association can be interpreted as a causal effect.
Real-world discussions (policy, social science, economics) often use these terms interchangeably, but rigorous causal inference requires careful consideration of temporal ordering, confounders, and study design.
Simple intuition: Sunniness and acing a math test may be correlated but Sunniness does not cause acing the test; instead, study behavior may be the underlying driver (a confounder). The lecture uses this to motivate the need for a formal framework.
Key Concepts and Definitions
Correlation: Two variables move together (A and B tend to vary together) but there may be no causal link between them.
Causation/Causal relationship: A change in one variable (the cause) leads to a change in another variable (the effect).
Causal inference: A process to determine whether an observed association reflects a causal relationship, often using the potential outcomes framework.
Potential outcomes framework (two potential outcomes per individual): For each unit i, there are two possible outcomes depending on treatment status: $Y_{1i}$ if treated and $Y_{0i}$ if not treated.
Treatment indicator: $D_i \in \{0, 1\}$, where 1 = treated, 0 = untreated (control).
Observed outcome: You observe only one of the two potential outcomes for each individual, depending on their treatment status.
Counterfactual: The outcome that would have occurred under the alternative treatment status (the unobserved potential outcome).
Two key estimands:
Individual Treatment Effect (ITE): $\text{ITE}_i = Y_{1i} - Y_{0i}$
Average Treatment Effect (ATE): $\text{ATE} = E[Y_{1i} - Y_{0i}]$
How we observe data under the framework:
If $D_i = 1$ (treated), observed outcome is $Y_{1i}$; the counterfactual is $Y_{0i}$ (what would have happened without treatment).
If $D_i = 0$ (control), observed outcome is $Y_{0i}$; the counterfactual is $Y_{1i}$ (what would have happened with treatment).
Fundamental problem of causal inference: For any individual, we can only observe one of the two potential outcomes, not both.
The Potential Outcomes Framework: Notation and Concepts
For each individual i:
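Treatment indicator: $D_i \in \{0, 1\}$
Potential outcomes: $Y_{1i}$ (if treated), $Y_{0i}$ (if untreated)
Observed outcome: $Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}$
ITE: $\text{ITE}_i = Y_{1i} - Y_{0i}$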
The two potential outcomes per individual lead to the concept of a counterfactual for each unit.
Counterfactuals:
If treated ($D_i = 1$), counterfactual is $Y_{0i}$ (what would have happened without treatment).
If untreated ($D_i = 0$), counterfactual is $Y_{1i}$ (what would have happened with treatment).
The reason we use the potential outcomes framework is to formalize what we would need to know to claim a causal effect for individuals and for populations.
Why it matters: Understanding ITEs and ATE helps connect theory (what would happen under different treatment states) to empirical estimations from data.
Core Equations and Estimands
Observed data relationship: $Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}$
Individual treatment effect (ITE): $\text{ITE}_i = Y_{1i} - Y_{0i}$
Average treatment effect (ATE): $\text{ATE} = E[Y_{1i} - Y_{0i}]$
If we consider a finite sample of N individuals, the population ATE is estimated as the average difference in potential outcomes across individuals, though only one of the two outcomes is observed per individual.
A simple population-based expression sometimes used in practice (requires randomization or strong unconfoundedness for validity): $E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]$
In the lecture example:
If some health insurance program affects the number of doctor visits, we would compare average outcomes between those who received insurance (treated) and those who did not (control) to estimate the program’s impact.
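The health insurance comparison can be sketched as a simulation. Unlike real data, a simulation lets us generate both potential outcomes for every person, so the true ATE can be compared with the difference in observed group means. All numbers are hypothetical: the baseline number of doctor visits, the true average effect of 1.0 extra visit, and the coin-flip (randomized) assignment.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable
N = 20_000

# Both potential outcomes for every person (never available in real data).
y0 = [random.gauss(3.0, 1.0) for _ in range(N)]        # visits without insurance
y1 = [v + 1.0 + random.gauss(0, 0.5) for v in y0]      # visits with insurance

# Individual treatment effects and their average (the true ATE).
ite = [a - b for a, b in zip(y1, y0)]
true_ate = statistics.fmean(ite)

# Randomized assignment: D_i is a coin flip, independent of the outcomes.
d = [random.randint(0, 1) for _ in range(N)]

# Only one potential outcome is observed per person.
y_obs = [di * a + (1 - di) * b for di, a, b in zip(d, y1, y0)]

treated = [y for y, di in zip(y_obs, d) if di == 1]
control = [y for y, di in zip(y_obs, d) if di == 0]
diff_in_means = statistics.fmean(treated) - statistics.fmean(control)

print(f"true ATE:            {true_ate:.3f}")
print(f"difference in means: {diff_in_means:.3f}")
```

Under randomized assignment the two numbers are close. If assignment instead depended on the potential outcomes (for example, sicker people buying insurance), the difference in means would generally drift away from the true ATE.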
Conditions for Identifying Causal Effects (Internal Validity)
Causality requires meeting key conditions to attribute observed differences to the treatment, not to confounding factors:
Temporal precedence (Cause must precede the effect): The cause must occur before the observed effect in time.
Variation in the cause must correlate with variation in the effect: More exposure or stronger exposure to the cause should be associated with larger changes in the outcome.
No alternative explanations (confounding must be ruled out): There should be no third variable that causes both the treatment and the outcome.
Exogenous assignment (randomization) or controlled study design: Assignment to treatment should be independent of potential outcomes (or methods should account for non-random assignment).
The goal of policy evaluation and program evaluation is to use quasi-experimental and experimental designs to satisfy these criteria and attribute observed changes in outcomes to the treatment.
The instructor emphasizes that the three core conditions (temporal order, covariation, and ruling out alternative explanations) are central to establishing causality and the internal validity of a study.
Treatments, Outcomes, and Study Design Examples
Treatments and outcomes:
Example: A social safety net program (treatment) and child education or nutrition outcomes (outcomes).
In notation: treat some individuals with the program ($D_i = 1$) and compare outcomes to those not treated ($D_i = 0$).
Exemplar language from the lecture:
The potential outcomes framework is used to reason about what would have happened under both the treatment and non-treatment scenarios for each individual.
The analysis focuses on the treatment’s effect on outcomes of interest, driven by theory and prior evidence.
Example discussion points:
If some individuals have health insurance ($D_i = 1$) and others do not ($D_i = 0$), we can compare their outcomes (doctor visits) to estimate the program’s impact, acknowledging the possible presence of confounding factors.
Counterfactual thinking is central: for each individual, only one of the two potential outcomes is observed, and the other is counterfactual.
Observed vs Counterfactual Outcomes (Intuition)
For treated individuals ($D_i = 1$):
Observed outcome: $Y_{1i}$
Counterfactual: $Y_{0i}$ (what would have happened if not treated)
For control individuals ($D_i = 0$):
Observed outcome: $Y_{0i}$
Counterfactual: $Y_{1i}$ (what would have happened if treated)
This asymmetry (we cannot observe both outcomes for the same individual) is what makes causal estimation challenging and why assumptions or randomized designs are needed.
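A standard textbook decomposition, not stated in the lecture itself, shows where a plain comparison of treated and control averages can go wrong. Here $Y_{1i}$ and $Y_{0i}$ are the potential outcomes with and without treatment, $D_i$ is the treatment indicator, and the first term on the right is usually called the average treatment effect on the treated.

```latex
E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]
  = \underbrace{E[Y_{1i} - Y_{0i} \mid D_i = 1]}_{\text{effect on the treated}}
  + \underbrace{E[Y_{0i} \mid D_i = 1] - E[Y_{0i} \mid D_i = 0]}_{\text{selection bias}}
```

The selection bias term compares what the treated and control groups would have looked like without treatment. When treatment is randomly assigned, $D_i$ is independent of the potential outcomes, so this term is zero and the first term equals the ATE.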
Practical Implications and Takeaways
The key takeaway is that causal inference seeks to determine whether a treatment causes a change in an outcome, while carefully considering temporal order, covariation, and exclusions of alternative explanations.
The potential outcomes framework provides a clear language for discussing observed outcomes, unobserved counterfactuals, and the estimands ITE and ATE.
In policy and social science research, quasi-experimental and experimental methods are used to approximate exogenous treatment assignment and to satisfy internal validity conditions.
Real-world examples (e.g., sunny days vs. test performance; health insurance vs. doctor visits) illustrate how correlation can mislead without a causal framework.
Connections to Foundational Principles and Real-World Relevance
Links to basic statistics and econometrics: understanding of causality, counterfactuals, and randomization is foundational for program evaluation and policy analysis.
Relevance to research design: when planning a study, researchers must consider how to ensure temporal order, covariation, and minimal confounding (or use methods to adjust for confounding).
Ethical and practical implications: misattributing causality can lead to ineffective or harmful policy decisions; rigorous design helps ensure that interventions produce the intended effects.
Summary Quick Reference
Causality vs correlation: correlation does not imply causation; causality requires a mechanism and appropriate study design.
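Causal inference frameworks rely on potential outcomes: for each unit i, there are two potential outcomes, $Y_{1i}$ and $Y_{0i}$.
Observed outcome depends on treatment: $Y_i = D_i Y_{1i} + (1 - D_i) Y_{0i}$
Individual effect: $\text{ITE}_i = Y_{1i} - Y_{0i}$
Average effect: $\text{ATE} = E[Y_{1i} - Y_{0i}]$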
Counterfactual reasoning underlines why we cannot observe both potential outcomes for the same individual and why design choices matter for causal interpretation.
Three core causal-identification conditions include temporal precedence, covariation, and ruling out alternative explanations (often via exogenous assignment or quasi-experimental designs).
Practical examples illustrate the distinction between correlation and causation and show how the potential outcomes framework guides interpretation and estimation.