Unit 1 AP Statistics vocabulary/concepts

59 Terms

1
cluster sampling
A probability sampling technique in which clusters of participants within the population of interest are selected at random, followed by data collection from all individuals in each cluster
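A minimal sketch of the idea, assuming a made-up population of four homerooms that act as clusters (the names and the `cluster_sample` helper are hypothetical): whole clusters are chosen at random, and then everyone inside each chosen cluster is surveyed.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly pick whole clusters, then keep every individual in them."""
    rng = random.Random(seed)
    # Choose clusters at random, without repeats.
    chosen = rng.sample(sorted(clusters), num_clusters)
    # Every member of each chosen cluster is included in the sample.
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: four homerooms acting as clusters.
homerooms = {
    "Room A": ["Ana", "Ben", "Cho"],
    "Room B": ["Dev", "Eli", "Fay"],
    "Room C": ["Gus", "Hal", "Ivy"],
    "Room D": ["Jon", "Kim", "Lou"],
}
sample = cluster_sample(homerooms, num_clusters=2, seed=1)
print(sample)
```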
2
advantages/disadvantages of cluster sampling
Advantages: cluster sampling is widely used because of its cost-effectiveness and ease of implementation. In many cases, the only representative sampling frame available to researchers is one based on clusters.

Disadvantages: a primary disadvantage of cluster sampling is that the members of a cluster are often homogeneous (similar to one another). The more homogeneous the cluster, the less precise the sample estimates. Another concern is the appropriateness of the designated cluster factor used to identify the sampling units within clusters.
3
stratified sampling
a variation of random sampling; the population is divided into subgroups (strata) that share a characteristic (similar people get grouped together), and a random sample is then taken from each stratum
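A small sketch of stratified sampling, assuming a made-up list of students grouped by grade level and an equal number drawn from each stratum (the `stratified_sample` helper and the data are hypothetical):

```python
import random

def stratified_sample(population, stratum_of, per_stratum, seed=None):
    """Split the population into strata, then randomly sample within each one."""
    rng = random.Random(seed)
    strata = {}
    for person in population:
        strata.setdefault(stratum_of(person), []).append(person)
    sample = []
    for label in sorted(strata):
        # Random selection happens separately inside every stratum.
        sample.extend(rng.sample(strata[label], per_stratum))
    return sample

# Hypothetical students as (name, grade) pairs; grade is the stratum.
students = [("Ana", 9), ("Ben", 9), ("Cho", 9),
            ("Dev", 10), ("Eli", 10), ("Fay", 10)]
picked = stratified_sample(students, stratum_of=lambda s: s[1],
                           per_stratum=2, seed=3)
print(picked)
```

Unlike cluster sampling, every stratum contributes to the sample, but only some of its members are chosen.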
4
Stratified sampling advantages/disadvantages
Advantages: the sample accurately reflects the population structure and can guarantee proportional representation of groups within the population. Disadvantages: it can be costly and time-consuming, individuals must be clearly classified into distinct strata, and selection within each stratum suffers from the same disadvantages as simple random sampling.
5
undercoverage bias
occurs when some groups in the population are left out of the process of choosing the sample
6
non-response bias
Bias introduced into survey results because individuals refuse to participate or are not reachable
7
What phrase do you have to include when using a calculator to generate random numbers?
"regenerate if repeated values"
8
steps to SRS using random digits table

1. Determine the population size (N) and the sample size (n).

2. ASSIGN each member of the population a unique label from 01 to N, using the same number of digits for every label.

3. Select a starting point on the random digits table. (On a calculator, randInt takes the place of the table.)

4. Read off groups of X digits, where X is the number of digits in N (for instance, if N is a 3-digit number, X is 3). Keep each group that falls between 01 and N; skip groups that are out of range or repeat a label already chosen.

5. Continue this way through the table until you have selected your entire sample of n individuals.
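The steps above can be sketched in code. This is only an illustration under stated assumptions: the `srs_from_digits` helper is hypothetical, the line of digits is an example used for demonstration, and the population is assumed to have 30 members labeled 01 to 30.

```python
def srs_from_digits(digits, population_size, sample_size):
    """Read fixed-width labels off a string of random digits."""
    width = len(str(population_size))  # X digits per label
    # Ignore the spaces that random digit tables use for readability.
    digits = "".join(ch for ch in digits if ch.isdigit())
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Skip labels outside 1..N and labels already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            return chosen
    raise ValueError("ran out of digits before the sample was complete")

# Example line of random digits, used only for illustration.
line = "19223 95034 05756 28713 96409 12531"
print(srs_from_digits(line, population_size=30, sample_size=4))
# prints [19, 22, 5, 13]
```

Reading two digits at a time gives 19, 22, 39, 50, 34, 05, 75, 62, 87, 13; the groups 39, 50, 34, 75, 62 and 87 are skipped because they are larger than 30.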
9
systematic sampling
select a random starting point and then select every kth (e.g. every 3rd) element in the population
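A short sketch of systematic sampling, assuming a hypothetical roster of 20 numbered people and k = 5 (the `systematic_sample` helper is illustrative only):

```python
import random

def systematic_sample(frame, k, seed=None):
    """Pick a random start among the first k positions, then take every kth item."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random index from 0 to k - 1
    return frame[start::k]     # every kth element from that start

# Hypothetical roster: people numbered 1 through 20.
roster = list(range(1, 21))
picked = systematic_sample(roster, k=5, seed=0)
print(picked)
```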
10
advantages/disadvantages of systematic sampling
Advantages: simple and quick to use, no need to label each member of the sample, suitable for large samples and large populations. Disadvantages: it can introduce bias if the order of the list follows a pattern that lines up with k (e.g. numbering off people for dodgeball when students stand in a certain position to be on the same team).
11
Sample survey procedure
figure out → 1) the population we want to describe, 2) what we want to measure, 3) how to choose the sample
12
Census
a study that collects data from every individual in the entire population
13
Population
everybody who has the characteristic of interest → this does not mean every person in the world unless the characteristic of interest applies to the entire population of the world
14
collecting a census advantages/disadvantages
advantages - gives the most accurate picture of the population, with no need to sample at all; disadvantages - becomes nearly impossible as the population gets bigger
15
What are some bad sampling methods
convenience sample, voluntary response sample, etc.
16
what makes a bad sampling method bad
they are biased: they systematically overestimate or underestimate the value being measured, so the results cannot be generalized to the larger population.
17
random sampling
a method of poll selection that gives each person in a group the same chance of being selected
18
Simple Random Sample (SRS)
a sample in which each set of n elements in the population has an equal chance of selection
19
confounding
When two or more variables are associated in such a way that their effects on a response variable cannot be distinguished from each other. (testing for two things at once, like the sitting and closing eyes heart rate test that we did during class)
20
Control group
Experimental group whose primary purpose is to provide a baseline for comparing the effects of the other treatments. Depending on the purpose of the experiment, a control group may be given a placebo or an active treatment
21
Sampling with replacement
any number can be selected multiple times because each selected item is put back before the next draw (e.g. a lottery or prize drawing where the same person can win three pieces of candy)
22
Sampling without replacement
a number cannot be repeated (ignore duplicates) - the same item cannot be selected more than once
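The difference between the two can be sketched with Python's standard `random` module; the list of names is made up for illustration. `choices` draws with replacement (repeats are possible) and `sample` draws without replacement (repeats are impossible).

```python
import random

rng = random.Random(42)
names = ["Ana", "Ben", "Cho", "Dev", "Eli"]  # hypothetical class list

# With replacement: each draw comes from the full list, so a name may repeat.
with_replacement = rng.choices(names, k=5)

# Without replacement: once a name is drawn it cannot be drawn again.
without_replacement = rng.sample(names, k=3)

print(with_replacement)
print(without_replacement)
```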
23
How do both experimental and observational studies choose their study subjects
Both types of studies can (and ideally do) use random sampling to select participants; only experiments then go on to randomly assign treatments
24
Experimental study versus observational study
experimental studies involve actively assigning participants specific tasks/treatments - manipulating a variable; observational studies observe participants’ behaviors without directing them/observing data that already exists without changing anything
25
Setting of experimental study versus observational study
Experimental studies are often conducted in structured environments like labs; observational studies are often conducted in natural environments without researcher control
26
Cost and length difference between experimental study versus observational study
Experimental studies are generally more expensive when there are more controls; observational studies are usually less expensive but can go on for longer
27
treatment
a specific condition that is applied to the individuals in an experiment → what each experimental group receives
28
experimental units
smallest collection of individuals to which treatments are applied. When the units are people, they are often called “subjects”
29
response variable
dependent variable → measures an outcome of a study
30
explanatory variable
independent variable → the treatment/the condition; explains or predicts changes in a response variable
31
advantages of an experiment over an observational study
can establish causation, control over variables, precision and accuracy, takes less time, replicability, manipulation of variables, experimental conditions (allow researchers to create conditions that might not naturally occur or would take too long to occur)
32
double blind
neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received (real versus placebo)
33
Inference for sampling
Drawing conclusions that go beyond the data at hand → using the information from a sample to make inferences/generalizations about the entire population it was drawn from
34
Completely randomized design
Design in which the experimental units are assigned to the treatments completely by chance
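A minimal sketch of a completely randomized design, assuming twelve hypothetical subjects and two treatments of equal size (the `completely_randomized` helper is illustrative): the units are shuffled by chance and then dealt out to the treatments.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle all units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical subjects S1..S12 and two treatments.
subjects = [f"S{i}" for i in range(1, 13)]
groups = completely_randomized(subjects, ["drug", "placebo"], seed=7)
print(groups)
```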
35
Block
Group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to the treatments
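One way blocks are commonly used is a randomized block design, where the random assignment of treatments is carried out separately inside each block. The sketch below assumes made-up plants blocked by light condition; the `randomized_block` helper and the data are hypothetical.

```python
import random

def randomized_block(units, block_of, treatments, seed=None):
    """Group units into blocks, then randomly assign treatments within each block."""
    rng = random.Random(seed)
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for label in sorted(blocks):
        members = blocks[label][:]
        rng.shuffle(members)  # randomization happens inside the block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment

# Hypothetical plants as (id, light condition); light condition is the block.
plants = [("P1", "sun"), ("P2", "sun"), ("P3", "sun"), ("P4", "sun"),
          ("P5", "shade"), ("P6", "shade"), ("P7", "shade"), ("P8", "shade")]
assignment = randomized_block(plants, block_of=lambda p: p[1],
                              treatments=["fertilizer", "none"], seed=5)
print(assignment)
```

Because each block receives every treatment, differences between blocks (sun versus shade here) cannot be mistaken for a treatment effect.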
36
Statistically significant
An observed effect so large that it would rarely occur by chance alone (commonly judged by a P-value below a chosen significance level, such as 5 percent)
37
Placebo
A fake treatment with no active ingredient (typically given to the control group so the effects of the real treatment can be isolated)
38
Prospective study
A prospective study watches for outcomes, such as the development of a disease, during the study period and relates this to other factors such as suspected risk or protection factor(s). The study usually involves taking a cohort of subjects and watching them over a long period.
39
What are effective means to display categorical data
bar chart, pie chart
40
What are effective means to display quantitative data
histograms, percentage polygons, line graphs, cumulative percentage graphs
41
response bias
Response bias is the tendency of respondents to give inaccurate, or even false, answers to self-report questions, such as those asked on surveys or in structured interviews; it can also be introduced by how a question is worded.
42
population parameter
a population parameter is a number that describes something about an entire group or population; e.g. the mean, median, standard deviation etc
43
principles of experimental design
comparison, random assignment, control, replication
44
replication (in an experiment)
applying each treatment to enough different experimental units that real treatment effects can be told apart from chance differences between the groups
45
common response variable
a type of lurking/confounding variable → neither explanatory nor response, but it influences both of these variables and so creates or distorts the apparent relationship between them
46
Control (in an experiment)
When conducting an experiment, control means keeping the other variables that might affect the response the same for every group, so they remain unchanged and only the treatment differs
47
what is always the first step of any type of sampling (cluster, stratified etc)
defining the population you would like to take your sample from
48
what are the three types of experimental designs
completely randomized design, matched pairs design, and repeated measures design
49
matched pairs design
Pairs of people who share similar characteristics are formed, and within each pair the two treatments are randomly assigned, one to each person. Then, their results are compared to see if the treatment has an effect on what is being measured.
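A brief sketch of the assignment step in a matched pairs design, assuming the pairs have already been formed from similar people (the names and the `matched_pairs` helper are hypothetical). Chance decides which member of each pair gets which treatment.

```python
import random

def matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, randomly decide who receives which treatment."""
    rng = random.Random(seed)
    assignment = {}
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # acts like a coin flip for the pair
        assignment[first], assignment[second] = order
    return assignment

# Hypothetical pairs, already matched on similar characteristics.
pairs = [("Ana", "Ben"), ("Cho", "Dev"), ("Eli", "Fay")]
assignment = matched_pairs(pairs, seed=11)
print(assignment)
```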
50
repeated measures design
Repeated measures design is a research design that involves multiple measures of the same variable taken on the same or matched subjects either under different conditions or over two or more time periods
51
what are some examples of nonsampling error
data entry errors, biased survey questions (which might lead to response bias), biased processing/decision making, non-responses, inappropriate analysis conclusions, and false information provided by respondents.
52
sampling frame
a list of the items or people forming a population from which a sample is taken.
53
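what is the difference between block design and stratified random sampling
block design is a method used for designing an experiment (similar experimental units are grouped before treatments are randomly assigned), while stratified random sampling is a method of gathering a sample (similar individuals are grouped before the sample is randomly selected)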
54
sampling variability
the value of a statistic (such as the sample mean) varies from sample to sample
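A small simulation can make this visible. The sketch below assumes a made-up population of 10,000 values and repeatedly draws samples of size 25; all numbers are hypothetical choices for illustration.

```python
import random
import statistics

rng = random.Random(2024)

# Hypothetical population: 10,000 values centered near 70.
population = [rng.gauss(70, 10) for _ in range(10_000)]

# Draw 200 separate samples of size 25 and record each sample mean.
sample_means = [statistics.mean(rng.sample(population, 25)) for _ in range(200)]

# The sample means differ from one another even though the population is fixed.
spread = max(sample_means) - min(sample_means)
print(round(min(sample_means), 1), round(max(sample_means), 1), round(spread, 1))
```

The individual sample means scatter around the population mean; that scatter is the sample-to-sample variation.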
55
margin of error
how far off, at most, we expect our sample estimate to be from the true value for the population
56
factor
another word for explanatory variable
57
how should you choose which variable to block for
blocks should be formed based on the most important sources of variability.
58
levels
the different values or settings of a factor (explanatory variable) that are used in the experiment
59
population of generalization
the population that you are pulling your sample from, and therefore the group your results can be generalized to