Statistics
is the process of collecting, organizing, summarizing, and analyzing data to draw conclusions or answer questions
Statistical conclusions
come with a measure of confidence (i.e., we will quantify how confident we are in our conclusions, like a margin of error)
Population
the entire group of individuals to be studied (Ex. students at UMaine)
Sample
a subset of the population (Ex. population: students at UMaine; sample: 20 students enrolled in STS 132 selected at random)
Individual
a member of the population (i.e. any student at UMaine)
A statistic
is a numerical summary of a sample
A parameter
is a numerical summary of a population
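A minimal sketch of the statistic/parameter distinction, assuming a made-up population of 1,000 ages (the data, the seed, and the sample size of 20 are illustrative assumptions, not from the course):

```python
import random
from statistics import mean

# Hypothetical population: ages of N = 1000 individuals (made-up data).
rng = random.Random(0)
population = [rng.randint(18, 30) for _ in range(1000)]

# Parameter: a numerical summary of the entire population.
parameter = mean(population)

# Statistic: a numerical summary of a sample (here n = 20 individuals
# chosen at random, without replacement).
sample = rng.sample(population, 20)
statistic = mean(sample)

print("parameter (population mean):", parameter)
print("statistic (sample mean):", statistic)
```

The statistic will usually differ somewhat from the parameter, which is why inference attaches a measure of confidence to conclusions.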
Descriptive statistics
involves organizing and summarizing the data obtained from a sample
Inferential statistics
involves taking a result about a sample, extending it to the whole population, then measuring the reliability of that result
The process of statistics
(1) Identify the research objective, (2) collect the data needed to answer the question, (3) describe the data, (4) perform inference
variables
are the characteristics of individuals in a population (note: these vary within individuals and among individuals)
Qualitative variable (categorical variable)
classifies individuals based on some attribute or characteristic
Quantitative variables
provide a numerical measure of individuals (when in doubt these values can be added or subtracted to obtain meaningful results)
Discrete quantitative variable
its value results from counting
Continuous quantitative variable
its value is measured
Data
the set of observed values for a variable
Explanatory variable
In research, we determine how variation in this affects the value of the response variable
Response variable
In research, we determine how variation in the explanatory variable affects this
Observational study
measures the response variable without attempting to influence the value of the explanatory variable
Designed experiment
the researcher assigns individuals to groups, intentionally influences the value of the explanatory variable, and measures the value of the response variable for each group
Confounding
occurs when the effects of two or more explanatory variables are not separated. So any perceived relation between an explanatory variable and the response variable may be due to some other variable that was not accounted for
Lurking variable
An explanatory variable that was not considered in a study, but affects the value of the response variable (we manage these by considering whether the individuals in our study differ in any significant way)
Observational studies allow
researchers to claim association but not causation
Benefits of observational studies
lower cost, better access to individuals, maybe more ethical
Confounding variable
an explanatory variable that was considered in a study whose effect cannot be distinguished from a second explanatory variable in the study
Categories of observational studies
cross-sectional, case-control, cohort
Cross-sectional studies
observational studies that collect info about individuals at a specific point in time, or over a very short period of time
case-control studies
These studies are retrospective, meaning they require individuals to look back in time or require the researcher to look at historical records
Cohort studies
Identifies a group of individuals to participate in the study. The group is then observed over a long period of time and characteristics about the individuals are recorded. These are prospective
Random sampling
the process of using chance to select individuals from a population to be included in the sample (chance = randomness)
Randomness
__________ is the key to obtaining a sample that's representative of the population
Convenience
Avoid using ________ to select a sample
Simple random sampling process
Identify the population (with N individuals), select a sample (of n individuals) in such a way that each individual has an equal chance of being selected
Frame
a list of all the individuals within the population
Sample without replacement
an individual who is selected is removed from the population and cannot be chosen again
Sample with replacement
a selected individual is placed back into the population and could be chosen a second time
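One way to sketch simple random sampling with and without replacement, using Python's standard `random` module. The function name `simple_random_sample` and the numbered frame are hypothetical, chosen only for illustration:

```python
import random

def simple_random_sample(frame, n, replace=False, seed=None):
    """Draw n individuals from the frame (a list of all individuals).

    replace=False: a selected individual cannot be chosen again.
    replace=True:  a selected individual goes back and may be chosen again.
    """
    rng = random.Random(seed)
    if replace:
        # Each draw is made from the full frame, so repeats are possible.
        return rng.choices(frame, k=n)
    # Every possible group of n distinct individuals is equally likely.
    return rng.sample(frame, n)

# Hypothetical frame: individuals numbered 1..100 (N = 100).
frame = list(range(1, 101))
print(simple_random_sample(frame, 10, seed=1))
print(simple_random_sample(frame, 10, replace=True, seed=1))
```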
Stratified sample
Is obtained by separating the population into non-overlapping groups called strata and then obtaining a simple random sample from each stratum (the individuals within each stratum should be homogeneous in some way)
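A rough sketch of stratified sampling: group individuals into strata, then take a simple random sample from each stratum. The function `stratified_sample`, the `stratum_of` callback, and the class-year data are assumptions made for this example:

```python
import random
from collections import defaultdict

def stratified_sample(individuals, stratum_of, n_per_stratum, seed=None):
    """Return {stratum name: simple random sample of that stratum}."""
    rng = random.Random(seed)
    # Separate the population into non-overlapping strata.
    strata = defaultdict(list)
    for individual in individuals:
        strata[stratum_of(individual)].append(individual)
    # Draw a simple random sample (without replacement) from each stratum.
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical data: (student id, class year) pairs.
students = [(i, "first-year" if i % 2 else "senior") for i in range(100)]
result = stratified_sample(students, lambda s: s[1], 5, seed=0)
for name, chosen in result.items():
    print(name, chosen)
```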
Systematic sample
is obtained by selecting every kth individual from the population. The first individual selected corresponds to a random number between 1 and k
Steps in Systematic Sampling
(1) Compute N/n and round down to the nearest integer. This is k. (2) Randomly select a number between 1 and k. Call it p. (3) The sample consists of individuals p, p+k, p+2k, ..., p+(n-1)k
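The systematic sampling steps can be sketched as follows, assuming the individuals are numbered 1 through N and that n is no larger than N. The function name `systematic_sample` is hypothetical:

```python
import random

def systematic_sample(N, n, seed=None):
    """Return the positions p, p+k, ..., p+(n-1)k of the selected individuals."""
    rng = random.Random(seed)
    k = N // n                 # N/n rounded down to the nearest integer
    p = rng.randint(1, k)      # random start between 1 and k, inclusive
    return [p + i * k for i in range(n)]

# Example: N = 100 individuals, n = 8 wanted, so k = 100 // 8 = 12.
print(systematic_sample(100, 8, seed=3))
```

Because p is at most k, the last position p+(n-1)k is at most nk, which never exceeds N.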
Cluster sample
is obtained by selecting all individuals within a randomly selected cluster or group of individuals
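A small sketch of cluster sampling: randomly pick whole clusters, then keep every individual in the chosen clusters. The function `cluster_sample` and the dorm-style cluster names are assumptions for illustration:

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """clusters maps a cluster name to the list of individuals in it."""
    rng = random.Random(seed)
    # Randomly select clusters (not individuals).
    chosen = rng.sample(sorted(clusters), n_clusters)
    # Include all individuals from each selected cluster.
    return [individual for name in chosen for individual in clusters[name]]

# Hypothetical clusters, e.g. residence halls with their residents.
clusters = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9]}
print(cluster_sample(clusters, 2, seed=0))
```

Note the contrast with a stratified sample, which takes some individuals from every group rather than all individuals from some groups.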
Convenience sample
the individuals are easily obtained and not based on randomness (we should be skeptical of these)
Voluntary response samples
Individuals in the sample are self-selected (meaning the individuals themselves decide to participate in the study). This is a type of convenience sample
Bias
A sample has this if it's not representative of the population
Three sources of bias
sampling bias, nonresponse bias, response bias
Sampling bias
occurs when the sampling methods tend to favor one part of the population over another
Undercoverage (source of sampling bias)
occurs when the proportion of one segment of the population is lower in a sample than it is in the population
Nonresponse bias
occurs when sampled individuals who do not participate in a survey have different opinions from those who do (can be reduced through the use of callbacks or rewards/incentives)
Response bias
exists when the answers on a survey do not reflect the true feelings of the respondent
Types of response bias
Interviewer error, misrepresented answers, wording of questions, ordering of questions or words, type of question
Interviewer Error
Can be avoided with a trained/skilled interviewer
misrepresented answers
can cause response bias. Some questions elicit inaccurate responses (Ex. questions about salaries, crime)
Wording of questions
can cause response bias, but can be avoided by asking questions in a balanced form and avoiding vague wording
Ordering of Questions or words
Can cause response bias, but can be avoided by distributing surveys with questions switched around
Open question
allows the respondent to write their own response (using their own words)
Closed question
requires the respondent to choose from a list of predetermined responses
Weakness of closed questions
Respondents are likely to choose earlier choices rather than later choices. Choices can be shuffled/rotated to mitigate this
Common practice regarding open/closed questions
conduct a small survey with open questions, then use common responses as choices for closed questions
Data entry error
The researcher enters the data incorrectly, which can lead to results that are not representative of the population
Sampling error
results from using a sample to estimate information about a population. This type of error is unavoidable because a sample gives incomplete information about a population
Nonsampling error
results from undercoverage, nonresponse bias, response bias, or data-entry error. Could be present even if we sampled the whole population.
experiment
is a controlled study conducted to determine the effect of one or more explanatory variables (aka factors) on a response variable
factors
explanatory variables
Treatment
Any combination of the values of the factors
Experimental unit (subject)
is a person, object, or some other well-defined item upon which a treatment is applied
control group
serves as a baseline treatment that can be used to compare to other treatments
Placebo
Inert medication, such as a sugar tablet or saline injection, that looks/tastes/smells like the experimental drug but has no effect otherwise
Blinding
refers to nondisclosure of the treatment an experimental unit is receiving
Single-blind experiment
the experimental unit (or subject) does not know which treatment they're receiving
double-blind experiment
neither the experimental unit nor the researcher in contact with the experimental unit knows which treatment the experimental unit is receiving
Design an experiment
means to describe the overall plan in conducting the experiment
Steps to designing an experiment
(1) Identify the problem to be solved, (2) determine the factors that affect the response variable, (3) determine the number of experimental units, (4) determine the level of each factor, (5) conduct the experiment with replication, (6) test the claim
Claim
statement of the problem (should be explicit and must identify the response variable and the population to be studied)
Two ways to deal with factors
Control or randomize
set the level of a factor at one value throughout the experiment if...
you are not interested in its effect on the response variable
Set the level of a factor at various levels if...
you are interested in its effect on the response variable
Randomize a factor by...
randomly assigning the experimental units to treatment groups
Replication
when each treatment is applied to more than one experimental unit (should always be done)
inferential statistics
the process that allows us to make a conclusion about a population using results obtained from a sample
Completely randomized design
is an experimental design where each experimental unit is randomly assigned to a treatment
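A minimal sketch of a completely randomized design: shuffle the experimental units, then deal them out to the treatments. Dealing in round-robin order to keep group sizes balanced is a choice made for this example, and the function name `completely_randomized` is hypothetical:

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Randomly assign every experimental unit to one treatment."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)          # chance decides the order of the units
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        # Deal the shuffled units out so group sizes differ by at most one.
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 12 subjects, two treatments.
print(completely_randomized(range(12), ["drug", "placebo"], seed=0))
```

With several units per treatment, this assignment also satisfies replication.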
Matched-pairs design
an experimental design in which the experimental units are paired up (the pairs are matched up so that they are somehow related -- same person before and after, twins, same geographical location, husband/wife, etc.)
Blocking
the process of grouping together homogeneous experimental units and then randomly assigning the experimental units within each group to a treatment
block
a group of homogeneous individuals
Randomized block design
used when the experimental units are divided into homogeneous groups called blocks. Within each block, the experimental units are randomly assigned to treatments.
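A rough sketch of a randomized block design: form blocks of similar units first, then randomize to treatments separately inside each block. The function `randomized_block`, the `block_of` callback, and the example units are assumptions for illustration:

```python
import random
from collections import defaultdict

def randomized_block(units, block_of, treatments, seed=None):
    """Return {unit: treatment}, randomizing within each block."""
    rng = random.Random(seed)
    # Group the experimental units into blocks of similar units.
    blocks = defaultdict(list)
    for unit in units:
        blocks[block_of(unit)].append(unit)
    assignment = {}
    for members in blocks.values():
        rng.shuffle(members)       # randomize only within this block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical units: (label, block) pairs with two blocks of six.
units = [("u%d" % i, "block-1" if i < 6 else "block-2") for i in range(12)]
for unit, treatment in randomized_block(units, lambda u: u[1], ["A", "B"], seed=0).items():
    print(unit, treatment)
```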