statistics
the science of data; involves collecting, classifying, summarizing, organizing, analyzing and interpreting numerical info
descriptive statistics
utilizes numerical and graphical methods to explore data, summarize info, and present it in a convenient form
inferential statistics
utilizes sample data to make estimates, decisions, predictions, or other generalizations about a larger set of data
experimental (or observational) unit
object upon which we collect data
population
set of units we are interested in studying
variable
property of an individual experimental unit
sample
subset of the units of a population
statistical inference
estimate, prediction, or generalization about a population based on info contained in a sample
parameter
a numerical value that describes a characteristic of an entire population (ex. the true average height of all adult women in Canada)
statistic
a numerical value that describes a characteristic of a sample (a subset of the population) - ex. the average height of a random sample of 1,000 adult women from Canada; this statistic can estimate the population parameter in situations where measuring the parameter is impractical
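A minimal sketch of the parameter/statistic distinction, assuming a made-up population of heights (the numbers are simulated, not real Canadian data). The mean of the whole list plays the role of the parameter; the mean of a random sample of 1,000 is the statistic that estimates it.

```python
import random
import statistics

random.seed(0)  # fixed seed only so the sketch is repeatable

# Hypothetical population: 100,000 simulated heights in cm (not real data).
population = [random.gauss(162, 7) for _ in range(100_000)]

# Parameter: computed from every unit in the population.
parameter = statistics.mean(population)

# Statistic: computed from a random sample of 1,000 units.
sample = random.sample(population, 1_000)
statistic = statistics.mean(sample)

# The gap between the two is the error from using a sample.
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```

The statistic lands close to the parameter but not exactly on it; that gap is why an inference needs a statement about its uncertainty.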
sampling error and applications
the degree of uncertainty that arises because statistics are based only on samples, not on the whole population; applications:
designing studies: choosing the right sample size and ensuring the sample represents the population
interpreting results: avoiding overgeneralization and communicating findings together with their limitations
measure of reliability
a statement about the degree of uncertainty associated with a statistical inference
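One common form such a statement can take is a margin of error around an estimate. The sketch below assumes a large simulated sample and uses the usual large-sample approximation (1.96 standard errors for roughly 95% confidence); the data and the choice of 95% are illustrative assumptions, not part of the definition.

```python
import math
import random
import statistics

rng = random.Random(8)  # seeded only so the sketch is repeatable

# Hypothetical sample of 400 measurements (simulated, not real data).
sample = [rng.gauss(50, 12) for _ in range(400)]

mean = statistics.mean(sample)
# Standard error of the sample mean: sample standard deviation / sqrt(n).
std_error = statistics.stdev(sample) / math.sqrt(len(sample))
# Approximate 95% margin of error for a large sample.
margin = 1.96 * std_error

print(f"estimate = {mean:.2f} +/- {margin:.2f}")
```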
data generating process (DGP)
the underlying mechanism or process that produces the data we observe; economists can identify patterns, test theories, and propose policy interventions by studying the DGP
why understanding the DGP matters
we may only want to describe the outcome of a process (ex. the distribution of wages), but understanding the DGP reveals the relationships between variables, which helps economists and policymakers make informed decisions and design more effective policies. ex. if the DGP shows that education significantly increases wages, this could justify prioritizing policies that enhance educational opportunities
quantitative data
measurements that are recorded on a naturally occurring numerical scale
qualitative data
measurements that cannot be measured on a natural numerical scale - can only be classified into one of a group of categories
obtaining data (ways)
published source, designed experiment, observational study (like a survey)
observational study
a data-collection method where the experimental units sampled are observed in their natural setting; no attempt is made to control the characteristics of the experimental units sampled (ex. opinion polls, surveys, etc.)
designed experiment
a data-collection method where the researcher exerts full control over the characteristics of the experimental units sampled. these experiments typically involve a group of experimental units that are assigned the treatment and an untreated (control) group
randomized control trials (RCTs)
a research design in which the investigator randomly assigns participants to two or more groups that then receive a systematic treatment (or, in some cases, no treatment), and the outcomes for each group are compared; RCTs help remedy the lack of data in poor countries and are cheaper to conduct there than in rich countries
importance of selection (a sample)
how a sample is selected from a population is of vital importance in statistical inference because the probability of an observed sample will be used to infer the characteristics of the population
representative sample
exhibits characteristics typical of those possessed by the population of interest
simple random sample
a sample selected from the population in such a way that every different sample of size n has an equal chance of selection
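A minimal sketch of drawing a simple random sample with Python's standard library. The population of 1,000 unit IDs and the sample size of 10 are hypothetical.

```python
import random

rng = random.Random(42)  # seeded only so the sketch is repeatable

# Hypothetical population of 1,000 unit IDs.
population = list(range(1, 1001))
n = 10

# random.sample draws n distinct units without replacement,
# so every possible subset of size n is equally likely.
srs = rng.sample(population, n)
print(sorted(srs))
```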
random number generator
an algorithm that generates a sequence of numbers that seem to occur in random order
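The numbers only "seem" random because the algorithm is deterministic: the same starting seed reproduces the same sequence. A small sketch using Python's built-in generator (the seed value 2024 is arbitrary):

```python
import random

# Two generators started from the same seed.
gen_a = random.Random(2024)
gen_b = random.Random(2024)

seq_a = [gen_a.randint(1, 100) for _ in range(5)]
seq_b = [gen_b.randint(1, 100) for _ in range(5)]

print(seq_a)
print(seq_a == seq_b)  # True: same seed, same "random" sequence
```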
stratified random sampling (purpose)
ensures that specific subgroups (strata) within a population are adequately represented
the process of stratified random sampling
divide the population into strata based on important characteristics
randomly sample from each stratum. the number of units sampled from each stratum can be proportional to the stratum's size relative to the population or the same across strata
combine the samples from all strata to form the final sample
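The three steps above can be sketched as follows, assuming proportional allocation and a hypothetical population of 1,000 units split into three made-up strata.

```python
import random
from collections import defaultdict

rng = random.Random(1)  # seeded only so the sketch is repeatable

# Hypothetical population: (unit_id, stratum) pairs.
population = (
    [(i, "urban") for i in range(600)]
    + [(i, "suburban") for i in range(600, 900)]
    + [(i, "rural") for i in range(900, 1000)]
)
sample_size = 50

# Step 1: divide the population into strata.
strata = defaultdict(list)
for unit_id, stratum in population:
    strata[stratum].append(unit_id)

# Step 2: randomly sample from each stratum, proportional to its size.
# Step 3: combine the per-stratum samples into the final sample.
sample = []
for stratum, units in strata.items():
    k = round(sample_size * len(units) / len(population))
    sample.extend((unit_id, stratum) for unit_id in rng.sample(units, k))

counts = {s: sum(1 for _, t in sample if t == s) for s in strata}
print(counts)  # {'urban': 30, 'suburban': 15, 'rural': 5}
```

With other stratum sizes the rounded counts may not add up exactly to the target sample size, so a real design would need a rule for the leftover units.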
cluster sampling (purpose)
makes sampling more practical and cost-effective for large, naturally divided populations
cluster sampling (process)
divide the population into naturally occurring groups (clusters) (ex. schools, neighborhoods)
randomly select a sample of clusters
collect data from all units within the selected clusters
ex. studying student performance by randomly selecting a few schools and surveying all students within those schools
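A rough sketch of the school example, assuming 20 hypothetical schools with 30 students each and 3 schools selected.

```python
import random

rng = random.Random(7)  # seeded only so the sketch is repeatable

# Step 1: the population is already divided into clusters (hypothetical schools).
schools = {
    f"school_{s}": [f"s{s}_student_{i}" for i in range(30)]
    for s in range(20)
}

# Step 2: randomly select a sample of clusters.
chosen = rng.sample(sorted(schools), 3)

# Step 3: collect data from all units within the selected clusters.
sample = [student for school in chosen for student in schools[school]]
print(chosen, len(sample))
```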
systematic sampling
a method where you select every kth item from a list of all items in the population
systematic sampling (process)
list the population
choose a random starting point
select every kth item until the end of the list
ex. select every 10th student from a list of 1,000 students
*problem: if the list has a periodic pattern, this method might introduce bias
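The process above can be sketched with list slicing, assuming a hypothetical list of 1,000 students and k = 10.

```python
import random

rng = random.Random(3)  # seeded only so the sketch is repeatable

# Step 1: list the population (hypothetical student names).
students = [f"student_{i}" for i in range(1, 1001)]
k = 10

# Step 2: choose a random starting point among the first k items.
start = rng.randrange(k)

# Step 3: select every kth item until the end of the list.
sample = students[start::k]
print(start, len(sample))
```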
randomized response sampling
a technique to obtain truthful responses to sensitive questions by reducing the likelihood of false answers
randomized response sampling (process)
participants answer one of two questions based on a randomizing device (ex. coin flip)
the pollster doesn't know which question was answered, ensuring anonymity
only the overall proportion of "yes"/"no" answers in the sample will be known
ex. participants flip a coin to decide whether to answer a sensitive or a neutral question
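A simulation sketch of how the overall "yes" proportion can still recover the sensitive rate. It assumes a fair coin and a neutral question whose "yes" rate is known in advance (here 0.5); both values, and the true rate of 0.20, are made up for illustration.

```python
import random

rng = random.Random(11)  # seeded only so the sketch is repeatable

TRUE_RATE = 0.20    # hypothetical true share of "yes" on the sensitive question
P_SENSITIVE = 0.5   # fair coin: heads means answer the sensitive question
NEUTRAL_YES = 0.5   # assumed known "yes" rate of the neutral question
n = 100_000

yes = 0
for _ in range(n):
    if rng.random() < P_SENSITIVE:
        yes += rng.random() < TRUE_RATE    # answered the sensitive question
    else:
        yes += rng.random() < NEUTRAL_YES  # answered the neutral question

# Only the overall proportion of "yes" answers is observed.
p_yes = yes / n

# P(yes) = P_SENSITIVE * rate + (1 - P_SENSITIVE) * NEUTRAL_YES, solved for rate.
estimate = (p_yes - (1 - P_SENSITIVE) * NEUTRAL_YES) / P_SENSITIVE
print(round(estimate, 3))
```

No individual answer reveals which question was answered, yet the estimate comes out close to the true rate.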
selection bias
occurs when a subset of the experimental units in the population is excluded, giving those units no chance of being selected for the sample; the sample may not accurately represent the population, leading to biased results
ex. conducting a phone survey that only includes landlines may exclude a significant portion of the population that only uses mobile phones
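A simulation sketch of the landline example, assuming (purely for illustration) that 30% of people have a landline and that landline owners are older on average. All numbers are made up.

```python
import random
import statistics

rng = random.Random(5)  # seeded only so the sketch is repeatable

# Hypothetical population: landline owners skew older (made-up numbers).
population = []
for _ in range(50_000):
    has_landline = rng.random() < 0.3
    age = rng.gauss(60, 10) if has_landline else rng.gauss(35, 10)
    population.append((age, has_landline))

true_mean = statistics.mean(age for age, _ in population)

# Landline-only frame: mobile-only units have no chance of being selected.
frame = [age for age, has_landline in population if has_landline]
biased_mean = statistics.mean(rng.sample(frame, 1_000))

print(f"population mean age={true_mean:.1f}, landline-only sample mean={biased_mean:.1f}")
```

A larger sample from the same frame would not fix the gap, because the excluded units never enter the draw.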
nonresponse bias
occurs when the researchers conducting a survey or study are unable to obtain data from all the experimental units selected for the sample; if non-respondents differ significantly from respondents, the results may not reflect the true characteristics of the population
measurement error
refers to inaccuracies in the values of the data recorded, often due to faulty data collection methods or instruments; can lead to incorrect conclusions about the population, reducing the reliability of the results
ex. a survey question worded ambiguously causes respondents to interpret it differently