4 types of research questions
Factual/Procedural
Describes the facts of the world
Hypothetical
What might be in the future
Normative
What the world should be
Empirical
How the world is, how the world works
What makes a good research question?
Asks WHY something happens (implicit or explicit)
Often starts with a puzzle or intriguing outcome
Focuses on explaining general patterns
Has interesting implications (for policy, understanding history, etc.)
Theory
"An interconnected set of propositions that shows how or why something occurs"
"A reasoned and precise speculation about the answer to a research question, including a statement about why the proposed answer is correct"
Variables
A concept or phenomenon that can take on various values, as opposed to a constant, which does not vary.
Independent Variable
A phenomenon or factor that affects or causes the DV
Dependent Variable
A phenomenon that is affected by other variables. The thing you're trying to explain.
Causal Mechanism
Provides a specific chain of steps, series of links, or other specific accounting of how or why changes in the causal variable (IV) affect the outcome variable (DV)
Scope conditions/Domain
The temporal and spatial domain in which theories are expected to operate- Assumptions may imply a specific spatial or temporal domain.
Temporal- Time period
Spatial- Space the theory operates in.
Assumptions
Things that your theory assumes are true and that are needed to generate your prediction/ for your causal mechanism to operate. Claims or beliefs (often implicit) about how the world operates.
Expectation/Prediction
Hypothesis. A tentative expectation about the relationship between 2 or more phenomena.
Inductive theory building
Bottom up. Get data, look for patterns, find theory
Deductive Theory Building
Top down. Make assumptions, deduce a prediction, get data to test the theory
Hypothesis
A tentative expectation about the relationship between 2 or more phenomena. What you think we will observe in the data, according to the theory.
Deterministic Laws
If X occurs then Y will occur with certainty. A single counterexample can falsify a theory. Extremely rare.
Probabilistic Laws
An increase in X increases the PROBABILITY of Y occurring. On average.
Establishing Causation
The 4 hurdles to establishing causation (know all 4 hurdles)
Is there a credible causal mechanism connecting X to Y
Can you rule out reverse causation
Is there covariation/a correlation between X and Y
Correlation: An association between 2 variables
Have you controlled for all confounding variables
A confounding variable (Z) correlates with both X and Y and alters the relationship between them. It often causes both.
How to identify a confounder
Ask whether Z plausibly has a causal relationship with both the IV and the DV. Then control for Z to see if a relationship between X and Y still exists.
Correlation
An association between 2 variables. In stats- strength and direction of a linear relationship.
Spurious Correlation
A correlation that is not what it appears to be, typically because of a failure to control for confounders.
Unit of Analysis
The cases or entities being studied, the unit of observation
Conceptualization
The development and clarification of concepts. Produce a theoretical definition of a variable.
Review the literature on the concept
Define/refine the meaning of the concept
Operationalization
Produces an operational definition: A detailed description of the research procedures necessary to assign UoA to variable categories.
Specify empirical indicators
Observable characteristics. "A single, concrete proxy for a concept"
Identify procedures for applying them to measure a concept
How to collect data, how to turn data into measure
Empirical Indicator
Observable characteristics. "A single, concrete proxy for a concept." Example: A survey item or a piece of quantitative info.
Validity
Does the indicator capture the phenomenon we're interested in?
Reliability
Does the indicator produce the same result if different people use it?
Relationship between validity and reliability
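Reliability is necessary for validity, but it is not sufficient: a measure can give consistent results and still fail to capture the concept.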
Measurement Error
Gap between measure and concept due to…
Poor match between operational definition and concepts
Unclear measurement procedures
Researchers applying the definition inaccurately
Convergent Validation
Compare your measure against other measures that aim to measure the same thing
Construct Validation
Validation based on an accumulation of research evidence showing that a measure is related to other variables as theoretically expected.
Test-retest
Measure the same thing or person on different days; have the same person take the same questionnaire on different days, etc. Should be a correlation of at least .80 between measurements.
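A minimal sketch of a test-retest check, assuming the "correlation" on this card is the Pearson correlation and using made-up scores for five respondents. The function name `pearson_r` and the data are illustrative only.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical questionnaire scores for the same five people on two days.
day_1 = [10, 12, 15, 18, 20]
day_2 = [11, 12, 14, 19, 20]

r = pearson_r(day_1, day_2)
meets_threshold = r >= 0.80  # the .80 rule of thumb from the card
print(f"test-retest r = {r:.2f}, reliable enough: {meets_threshold}")
```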
Internal Consistency
If using a composite measure like an index or scale, is there generally agreement among items?
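One common way to put a number on internal consistency is Cronbach's alpha. The card does not name a statistic, so treating alpha as the measure is an assumption here, and the item scores below are invented for illustration.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of items.

    `items` holds one list of scores per item, with respondents in the
    same order in every list.
    """
    k = len(items)
    # Sum of the variances of the individual items.
    sum_item_variances = sum(variance(item) for item in items)
    # Each respondent's total score across all items.
    totals = [sum(scores) for scores in zip(*items)]
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))


# Hypothetical answers from five respondents to three index items.
item_1 = [1, 2, 3, 4, 5]
item_2 = [2, 2, 3, 5, 5]
item_3 = [1, 3, 3, 4, 4]

alpha = cronbach_alpha([item_1, item_2, item_3])
print(f"Cronbach's alpha = {alpha:.2f}")
```

Values closer to 1 indicate that the items move together; what counts as "high enough" is a convention that varies by field.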
Inter-Coder Reliability
Have multiple people make the same measurement then assess similarity between them. Especially helpful for subjective measurements.
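A rough sketch of the simplest similarity check between two coders: the share of cases they coded identically. Percent agreement is an assumption about how "similarity" is assessed, it does not correct for agreement that would occur by chance, and the codes below are hypothetical.

```python
def percent_agreement(coder_a, coder_b):
    """Share of cases on which two coders assigned the same category."""
    matches = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    return matches / len(coder_a)


# Hypothetical regime-type codes assigned by two coders to the same five cases.
coder_a = ["democracy", "autocracy", "democracy", "hybrid", "democracy"]
coder_b = ["democracy", "autocracy", "hybrid", "hybrid", "democracy"]

agreement = percent_agreement(coder_a, coder_b)
print(f"coders agree on {agreement:.0%} of cases")
```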
Verbal Self Report
Based on respondents' answers to questions in an interview, survey, etc.
Observations Data Source
Direct. Observe and record a behavior or outcome directly
Archival Records
Existing recorded information.
Nominal
Measurement scale, in which numbers serve as labels only to identify or classify an object.
Ordinal
Groups variables into categories, just like the nominal scale, but also conveys the order of the variables.
Interval
Measured along a numerical scale that has equal distances between adjacent values
Ratio
Extension of interval measurement. Deals with data that have a natural zero point.
Index
A composite score derived from aggregating measures of multiple constructs (called components) using a set of rules and formulas.
Scale
The different ways in which variables are defined and grouped into different categories.
Target Population
The entire group of people or things you want to study. Population to which you would like to generalize your results
Sample
A subset of cases selected from a population
Sampling Frame
Set of all cases from which you will select the sample.
Coverage Error
Mismatch between the sampling frame and the target population
Nonresponse Bias
Nonresponse (i.e., some people not responding) is not a problem if it happens completely at random. But nonresponse becomes a "bias" if respondents differ systematically from nonrespondents in terms of the issues you are trying to study.
Sampling Error
Difference between an actual population value and the population value estimated from the sample (value of sample statistic minus actual population value)
Standard Error
A statistical measure of the average sampling error for a particular sample size, which shows how much the statistic should vary from random sample to random sample. The standard error gets smaller the larger your sample is.
Margin of Error
A statistic that tells you how much sampling error to expect, on average, given your sample size. It is calculated using the standard error. The MOE is defined so that 95% of the time, the sampling error will be that size or smaller. Gets smaller the larger the random sample.
Confidence Interval and How to Construct/Interpret
Using the MOE, you can easily calculate the 95% confidence interval.
A range of values defined so that we can have 95% confidence that the true population value falls within that range
Calculating a 95% Confidence Interval
The sample estimate, +/- the margin of error
Example: If our sample statistic= 50% and our MOE= 4%
Then, our 95% confidence interval is 50% +/- 4%
From 46% to 54%
Often written "(46%, 54%)"
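The 50% +/- 4% example can be reproduced with a short sketch. The formula for the standard error of a proportion, the 1.96 multiplier for 95% confidence, and the sample size of 600 are assumptions added here (they are not on the cards) and apply to a simple random sample.

```python
from math import sqrt


def proportion_ci(p_hat, n, z=1.96):
    """Standard error, margin of error, and confidence interval for a sample proportion.

    Assumes a simple random sample; z=1.96 corresponds to 95% confidence.
    """
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion
    moe = z * se                        # margin of error
    return se, moe, (p_hat - moe, p_hat + moe)


# Hypothetical poll: 50% support among 600 respondents.
se, moe, (low, high) = proportion_ci(0.50, 600)
print(f"MOE = {moe:.1%}, 95% CI = ({low:.0%}, {high:.0%})")

# Quadrupling the sample size roughly halves the margin of error.
_, moe_big, _ = proportion_ci(0.50, 2400)
print(f"MOE with n=2400: {moe_big:.1%}")
```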
Probability vs. Nonprobability Sampling
Probability- Sampling based on random selection, where each case in the population has an equal or known chance of being included in the sample
Nonprobability- Methods of case selection other than random sampling
Random Digit Dialing
Used when constructing a sampling frame is not possible and another solution is needed: phone numbers are generated at random instead of being drawn from a list.
Convenience Sampling
Sampling by selecting cases that are conveniently available
Snowball Sampling
Uses chain referral, where each contact is asked to ID additional members of the target population, who then ID others, etc.
Purposive Sampling
Use expert judgment to select cases that reflect “important” attributes of the target population
Theoretical Sampling
Choosing cases to build theory inductively
Cross-Sectional Design
Compare across individuals in one time period
Longitudinal Design
Looking at patterns over time can be helpful because we have variation both across and within units
Proxy
A measure that stands in for something that cannot be measured directly
Coding Data
"Code" data: transform data into numbers
Bivariate Analysis
Simple regression. Includes only the IV and DV
Inspecting Data
Central tendency: mean and median
Dispersion: range, standard deviation
Shape / skew: histogram
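A small sketch of these checks using Python's `statistics` module. The income figures are hypothetical and include one extreme value to show how the mean and median can diverge.

```python
from statistics import mean, median, stdev

# Hypothetical incomes in thousands; the last value is unusually large.
incomes = [30, 32, 35, 38, 40, 41, 45, 200]

summary = {
    "mean": mean(incomes),                    # central tendency
    "median": median(incomes),                # central tendency, robust to extremes
    "range": max(incomes) - min(incomes),     # dispersion
    "stdev": stdev(incomes),                  # dispersion (sample standard deviation)
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")

# A mean well above the median is one hint of right skew.
right_skew_hint = summary["mean"] > summary["median"]
```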
Outliers
Unusual or suspicious values that are far removed from the preponderance of observations for a variable. These can be a clue to errors in the data and can be problematic for statistical analysis.
Missing Values
When there is no data for an observation for one or more variables
Listwise Deletion
Drop all observations with any missing data. This can produce problems similar to nonresponse bias, if the cases you drop are systematically different from the cases you keep!
Imputation
Fill in missing values using data from other observations (for example, the mean for all observations)
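The two approaches to missing values can be compared side by side. This is a sketch on invented data; the variable names `gdp` and `turnout` are hypothetical, and mean imputation is just the example mentioned on the card, not the only imputation method.

```python
from statistics import mean

# Hypothetical dataset; None marks a missing value.
rows = [
    {"country": "A", "gdp": 10.0, "turnout": 60.0},
    {"country": "B", "gdp": None, "turnout": 55.0},
    {"country": "C", "gdp": 30.0, "turnout": None},
    {"country": "D", "gdp": 20.0, "turnout": 70.0},
]
variables = ["gdp", "turnout"]

# Listwise deletion: keep only observations with no missing values.
complete = [r for r in rows if all(r[v] is not None for v in variables)]

# Mean imputation: replace each missing value with the mean of the observed values.
imputed = [dict(r) for r in rows]  # copy so the original rows stay untouched
for v in variables:
    observed_mean = mean(r[v] for r in rows if r[v] is not None)
    for r in imputed:
        if r[v] is None:
            r[v] = observed_mean

print(f"listwise deletion keeps {len(complete)} of {len(rows)} rows")
print(f"imputation keeps {len(imputed)} of {len(rows)} rows")
```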
Four Types of Measurement
Nominal, Ordinal, Interval, Ratio
Nominal
Categories that have no "order" to them
Ordinal
Categories that can be ordered
Interval
Has ranking qualities of ordinal, but also assumes equal distances between each “number” of the measure.
Ratio
Includes features of all the others but also has a true, non-arbitrary "0" (zero means none of the quantity)
Regression Analysis
Statistical method for analyzing bivariate (simple regression) and multivariate (multiple regression) relationships among interval or ratio-scale variables. To do regression, calculate the line that best fits your data.
Regression Coefficients; how to interpret regression coefficients
In a bivariate analysis: a statistic indicating how much the DV increases or decreases for every one-unit change in the IV; the slope of the regression line.
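A minimal sketch of a bivariate regression, assuming "best fit" means ordinary least squares. The data and variable names are made up so the slope is easy to read.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical data: years of education (IV) and income in thousands (DV).
years_of_education = [8, 10, 12, 14, 16]
income = [20, 26, 30, 36, 40]

intercept, slope = fit_line(years_of_education, income)
print(f"income = {intercept:.1f} + {slope:.1f} * years_of_education")
```

Read the slope as: in this invented sample, each additional year of education is associated with an income that is 2.5 thousand higher, on average.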
Statistical significance (p-value); conventional levels of statistical significance
The likelihood that the association between variables could have arisen by chance. Report statistical significance using the "p-value"
Convention is that a p-value of .05 or smaller (p<.05) is “statistically significant at conventional levels”
So: p<.05, p<.01, or p<.001
But, p<.10 typically not considered “statistically sig”
Pure reverse causation
When the DV causes the IV (but IV does not cause the DV)
Simultaneity Bias
When the IV causes the DV and the DV causes the IV
Lagging Variables
Use the value of that variable from the previous time period (t-1) instead of the current time period (t). Can help rule out reverse causation because the DV hasn't happened yet.
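Lagging can be sketched by pairing each period's DV with the IV from the period before. The yearly data and the names `aid` and `growth` are hypothetical.

```python
# Hypothetical yearly observations for one country, already sorted by year.
panel = [
    {"year": 2000, "aid": 5, "growth": 1.0},
    {"year": 2001, "aid": 7, "growth": 1.5},
    {"year": 2002, "aid": 6, "growth": 2.1},
    {"year": 2003, "aid": 9, "growth": 1.8},
]

# Pair each year's DV (growth at t) with the IV from the previous year (aid at t-1).
lagged = [
    {"year": current["year"], "aid_lag1": previous["aid"], "growth": current["growth"]}
    for previous, current in zip(panel, panel[1:])
]

for row in lagged:
    print(row)
```

The first year drops out because it has no earlier period to draw the lagged value from.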
Omitted Variable Bias
Mistakenly attributing a causal effect to X when it was really due to Z or underestimating (even failing to detect) a causal effect when one really exists!
Confounding Variables
A variable (Z) that is correlated with both the IV (X) and the DV (Y) and somehow alters the relationship between the two
Control Variables
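Variables (Z) that are included in the analysis and held constant, so that the effect of the IV on the DV can be compared among otherwise similar cases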
Multiple Regression
Includes the IV, DV, and control variables (Z)
Multivariate Analysis
Estimate a "partial regression coefficient" It is the estimated effect of each IV on the DV when all other IVs are held constant
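A sketch of what "holding Z constant" does to a coefficient. The data are constructed so that Z drives Y entirely and X only tracks Z, which makes the contrast stark. The partial coefficient is computed by residualizing both X and Y on Z, which gives the same coefficient on X as a multiple regression of Y on X and Z; all names and numbers are illustrative.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    return mean_y - slope * mean_x, slope


def residuals(x, y):
    """The part of y that is left over after a linear fit on x."""
    intercept, slope = fit_line(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def partial_slope(x, y, z):
    """Coefficient on x when z is held constant (both x and y residualized on z)."""
    _, slope = fit_line(residuals(z, x), residuals(z, y))
    return slope


# Constructed example: Z is a confounder; Y depends only on Z, and X follows Z.
z = [1, 2, 3, 4, 5, 6]
x = [2, 1, 4, 3, 6, 5]        # Z plus an alternating +1 / -1
y = [3, 6, 9, 12, 15, 18]     # exactly 3 * Z, so X has no effect of its own

_, bivariate = fit_line(x, y)
controlled = partial_slope(x, y, z)
print(f"bivariate slope: {bivariate:.2f}")
print(f"slope controlling for Z: {controlled:.2f}")
```

The bivariate slope looks large, but it shrinks to about zero once Z is controlled, which is the pattern a spurious correlation produces.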
Endogeneity
When an explanatory variable is correlated with the error term in a statistical analysis. When one or more of your predictor variables causes other predictor variables.
It means that by controlling for Z, we are controlling for a consequence of X, which biases our causal estimate of X
Instrumental Variables
Solution for endogeneity.
A variable that predicts one of the problem variables but not the other
Face-to-Face
Advantages: restate and clarify questions and answers, fast response rate, long interviews/complicated questions
Disadvantages: expensive, hard to keep anonymous, social desirability bias (SDB), non-response, F2F respondents likely to be older, more female
Telephone (CATI or Robopolls)
Advantages: cheap, easy to access, less SDB compared to F2F
Disadvantages: hard to ask complex Qs, more SDB than some other modes, high non-response bias
Mail
Advantages: cheap, anonymous
Disadvantages: nonresponse bias, only enthusiastic people respond, people skip questions, can't clarify
Internet
Advantages: inexpensive, anon, less SDB
Disadvantages: nonresponse
Close ended questions
The researcher provides the answer choices. Scales, multiple choice, ranking, etc.
Open ended questions
The researcher allows the respondent to say or write whatever they want (sometimes with word or time limit)
Advantages and Disadvantages of Close Ended Questions
Advantages: time-efficient, researcher doesn't have to make potentially subjective coding decisions about how to interpret answers
Disadvantages: the answer choices provided affect the answers- people may not feel their options reflect their feelings.
Advantages and Disadvantages of Open Ended Questions
Advantages: nuanced answers
Disadvantages: more expensive (longer for respondents to answer), more difficult to analyze in a valid and reliable way
Trend Studies
Longitudinal: draw a new sample from the same population over time, and ask the same questions each time
Panel Studies
Longitudinal: ask the same people the same questions over time
How to diagnose and address common problems with survey research
Poor question design
Inattention
Problems with drawing the sample
Social desirability bias
Poor Question Design
Leading and double barreled questions
Inattention
Not paying much attention when responding
Problems with drawing the sample
Selection error, sample frame error, non-response
Social desirability bias
Tendency of respondents to present themselves in a socially desirable way
Random Digit Dialing RDD
A type of probability sampling in which phone numbers are randomly generated using a software system and used to create the sample for a research project.
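A toy sketch of the idea, assuming ten-digit numbers made of a three-digit area code plus seven random digits. The area codes, number format, and function name are hypothetical, and real RDD designs involve more than this (for example, screening out non-working numbers).

```python
import random


def random_digit_dial(n, area_codes, seed=None):
    """Generate n distinct random phone numbers within the given area codes."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    numbers = set()
    while len(numbers) < n:
        area = rng.choice(area_codes)
        line = rng.randrange(10_000_000)  # any value from 0000000 to 9999999
        numbers.add(f"{area}-{line:07d}")
    return sorted(numbers)


# Hypothetical target area: two area codes, five numbers to call.
sample = random_digit_dial(5, ["212", "415"], seed=42)
for number in sample:
    print(number)
```

Because every possible number in the chosen area codes has the same chance of being generated, unlisted numbers can be reached even though no list of them exists.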