AP Statistics Chapter 4: Designing Studies

70 Terms

1

Population

In a statistical study, the entire group of individuals we want information about (population of interest; not to be confused with population of inference)

2

Census

Collects data from every individual in the population

3

Sample

Subset of individuals in the population from which we actually collect data; collecting data from a representative sample allows us to make an inference about the population

4

Steps to Choosing a sample

We often draw conclusions about a whole population on the basis of a sample; in choosing a sample from a large, varied population, we must:

Step 1: Define the population we want to describe

Step 2: Say exactly what we want to measure (give exact definitions of variables)

Step 3: Decide how to choose a sample from the population (slips of paper, technology, Table D)

5

Sample Survey

A study that uses an organized plan to choose a sample that represents some specific population

6

Convenience Sample

Choosing individuals from the population who are easy to reach; often produces unrepresentative data and is almost guaranteed to show bias

7

Bias

The design of a statistical study shows this factor if it would consistently underestimate or consistently overestimate the value you want to know (over or underrepresentation of a group)

8

Voluntary response sample

Self-selected sample; consists of people who choose themselves by responding to a general invitation (ex: email). Shows bias because people who feel strongly about an issue, often in the same direction, are most likely to respond. People can often also respond more than once.

9

Random sampling

Involves using a chance process to determine which members of a population are included in the sample; a sample chosen by chance rules out both favoritism by the sampler and self-selection by respondents.

10

Simple Random Sample (SRS)

A sample of size n chosen in such a way that every group of n individuals in the population has an equal chance to be selected as the sample. It gives every possible sample of the desired size an equal chance to be chosen.

11

Table of Random Digits

In practice, people use random numbers generated by a computer or calculator to choose samples; if technology is not available, this resource can be used

12

Choosing an SRS with technology

Step 1: Label. Give each individual in the population a distinct numerical label from 1 to N.

Step 2: Randomize. Use a random number generator to obtain n different integers from 1 to N.
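
The two steps above can be sketched in Python. This is only an illustration: the function name `choose_srs`, the population size of 500, and the seed are assumptions made for the example, not part of the card.

```python
import random

def choose_srs(population_size, sample_size, seed=None):
    """Return sample_size different labels chosen from 1..population_size."""
    rng = random.Random(seed)  # seed only makes the example repeatable
    # Step 1 (Label): individuals are labeled 1, 2, ..., population_size.
    labels = range(1, population_size + 1)
    # Step 2 (Randomize): random.sample draws without replacement,
    # so no label can be chosen twice.
    return sorted(rng.sample(labels, sample_size))

# Hypothetical population of 500 individuals, sample of 10.
srs_labels = choose_srs(population_size=500, sample_size=10, seed=1)
print(srs_labels)
```

Because `random.sample` never repeats a value, there is no need to discard duplicates by hand.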

13

Choosing an SRS with Table D

Step 1: Label. Give each member of the population a numerical label with the same number of digits. Use as few digits as possible.

Step 2: Randomize. Read consecutive groups of digits of the appropriate length from left to right across a line in Table D. Ignore any group of digits that wasn't used as a label or that duplicates a label already in the sample. Stop when you have chosen n different labels. Your sample contains the individuals whose labels you find.
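
The reading procedure can be mimicked in code. This is a sketch under stated assumptions: the digit string below is an illustrative line of digits typed in for the example (it is not claimed to be a particular line of Table D), and `srs_from_digits` is a hypothetical helper name.

```python
def srs_from_digits(digits, population_size, sample_size):
    """Pick labels by reading fixed-width groups of digits left to right."""
    digits = "".join(ch for ch in digits if ch.isdigit())  # drop spaces
    width = len(str(population_size))  # as few digits as possible
    chosen = []
    for start in range(0, len(digits) - width + 1, width):
        label = int(digits[start:start + width])
        # Ignore groups that are not labels and groups already chosen.
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        if len(chosen) == sample_size:
            break  # stop once n different labels are found
    return chosen  # may be shorter than sample_size if digits run out

# Illustrative digits; population labeled 01-30, sample of 5.
line = "19223 95034 05756 28713 96409 12531 42544 82853"
table_sample = srs_from_digits(line, population_size=30, sample_size=5)
print(table_sample)  # [19, 22, 5, 13, 25]
```

Reading pairs gives 19, 22, 39, 50, 34, 05, 75, 62, 87, 13, 96, 40, 91, 25; only 19, 22, 05, 13, and 25 are valid, unused labels.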

14

Stratified random sample

Sometimes there are statistical advantages to using more complex sampling methods. To get a ____, start by classifying the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the sample. Can be hard to use this sampling method when populations are large and spread out over a wide area.
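
A minimal sketch of the procedure described above: take a separate SRS in each stratum, then combine them. The strata names, sizes, and the helper `stratified_sample` are hypothetical, chosen only to make the example concrete.

```python
import random

def stratified_sample(strata, per_stratum, seed=None):
    """Choose a separate SRS within each stratum and combine the results."""
    rng = random.Random(seed)
    combined = []
    for name, members in strata.items():
        # SRS within this stratum (drawn without replacement)
        combined.extend(rng.sample(members, per_stratum[name]))
    return combined

# Hypothetical school population classified by grade level.
strata = {
    "freshmen": [f"F{i}" for i in range(1, 41)],
    "seniors": [f"S{i}" for i in range(1, 31)],
}
strat_sample = stratified_sample(strata, {"freshmen": 4, "seniors": 3}, seed=2)
print(strat_sample)
```

Every stratum is guaranteed to appear in the sample, which an SRS of the whole population does not promise.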

15

Strata

Classifications of the population into groups of similar individuals. When we choose strata that are similar within but different between, stratified random samples give more precise estimates than simple random samples of the same size.

16

Cluster sample

A method that selects groups of individuals that are "near" each other. Start by classifying the population into groups of individuals that are near each other (clusters), then choose an SRS of the clusters; all individuals in the chosen clusters are included in the sample. Used for practical reasons of saving money and time (higher efficiency is the greatest benefit of this sampling method).
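
A minimal sketch of cluster sampling, assuming the usual procedure of choosing an SRS of whole clusters and keeping every individual in each chosen cluster. The homeroom data and the helper `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters; keep everyone in the chosen clusters."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)
    return {name: clusters[name] for name in chosen_names}

# Hypothetical clusters: students grouped by homeroom.
homerooms = {
    "Room 101": ["Ana", "Ben", "Cy"],
    "Room 102": ["Dee", "Eli"],
    "Room 103": ["Fay", "Gus", "Hal"],
    "Room 104": ["Ivy", "Jon"],
}
chosen_clusters = cluster_sample(homerooms, n_clusters=2, seed=3)
print(chosen_clusters)
```

Unlike a stratified sample, only some groups are visited, which is where the savings in time and money come from.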

17

Clusters

Best chosen when they are different within but similar between, so that each cluster is just like the population but on a smaller scale. Individuals within a cluster are more varied than individuals within a stratum.

18

Inference

The purpose of a sample is to give us information about a larger population; inference is the process of drawing conclusions about a population on the basis of sample data. Larger random samples give better information about the population than smaller random samples.
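
The claim about sample size can be checked with a small simulation. This is a sketch under assumptions: the population is treated as very large with a hypothetical true proportion of 0.6, and the sample sizes (25 and 400) and number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics

def sample_proportions(true_p, n, reps, rng):
    """Simulate reps random samples of size n; return each sample proportion."""
    props = []
    for _ in range(reps):
        successes = sum(rng.random() < true_p for _ in range(n))
        props.append(successes / n)
    return props

rng = random.Random(0)
small = sample_proportions(true_p=0.6, n=25, reps=500, rng=rng)
large = sample_proportions(true_p=0.6, n=400, reps=500, rng=rng)

# Spread of the estimates from sample to sample
spread_small = statistics.pstdev(small)
spread_large = statistics.pstdev(large)
print(spread_small, spread_large)
```

Both sets of estimates center near 0.6, but the estimates from the larger samples vary much less from sample to sample.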

19

Samples that do not allow inference

Convenience and voluntary response samples do not allow us to infer about the population because the sample is misleading and contains bias; therefore, it does not fairly represent the population.

20

Reliance on random sampling

Avoids bias in selecting samples from the list of available individuals; the laws of probability allow trustworthy inference about the population.

21

Margin of error

Results from random samples come with _____ that sets bounds on the size of the likely error.

22

4.1 Tips

Describe, fully explain, and justify full steps in processes and answers.

Things to write in a short answer asking for a sample from a table of random digits: population, sample, value you want to measure (with units), how to choose the SRS, "From Table D," number of digits, line number, n number of individuals selected, "without replacement."

23

Unbiased samples

Will still produce estimates that differ from the value we want to know simply by chance. However, these estimates will be too small about half the time and too large the other half of the time.

24

Without replacement

When sampling, you must explicitly state that repeated integers should be ignored, or say that you will generate random integers until n different numbers are selected from the given range.

25

Best variables to choose for stratification

Those that would most accurately predict the response.

26

Observational study

Observes individuals and measures variables of interest but does not attempt to influence the response; therefore, we cannot determine cause and effect (this includes sample surveys). The purpose is to describe a group or situation, compare groups, and examine relationships between variables

27

Experiment

Deliberately and actively imposes some treatment on individuals to measure their response. When our goal is to understand cause and effect, this is the only source of fully convincing data, because it directly answers the question of interest. The purpose is to determine if a treatment causes a change in response.

28

Confounding

Occurs when two variables are associated in such a way that their effects on a response variable cannot be distinguished from each other. Reason why observational studies of the effect of an explanatory variable often fail; occurs between the explanatory variable and one or more other variables; the dynamics of relationship between variables cannot be determined. Well-designed experiments take steps to prevent it.

Steps: identify confounding variable, explain how it is associated with the explanatory variable and the effects it has on the response variable

29

Response variable

Measures the outcome of a study or experiment

30

Explanatory variable

Helps explain or predict changes in a response variable

31

Experiment

A statistical study in which we actually do something (a treatment) to people, animals, or objects (the experimental units) to observe the response

32

Experimental units

The smallest collection/entity of individuals to which a treatment is applied (can be objects, plants, animals, humans)

33

Subjects

Name for experimental units when they are human beings

34

Treatment

A specific condition applied to the individuals in an experiment; if an experiment has several explanatory variables, a treatment is a combination of specific values of these variables.

35

Experiment conditions

Experiments often use a design: Experimental units -> treatment -> measure response. In the lab environment, simple designs often work well. Field experiments and experiments with animals or people deal with more variable conditions. Outside the lab, badly designed experiments often yield worthless results because of confounding

36

Factors

Explanatory variables; each treatment is formed by combining a specific value (level) of each ___.
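
To see how treatments are formed from factor levels, here is a small sketch. The two factors (fertilizer and watering) and their levels are hypothetical examples, not taken from the card.

```python
from itertools import product

# Hypothetical factors and their levels.
factors = {
    "fertilizer": ["none", "low", "high"],
    "watering": ["daily", "weekly"],
}

# Each treatment combines one level of every factor.
treatments = [
    dict(zip(factors, combo)) for combo in product(*factors.values())
]

for treatment in treatments:
    print(treatment)
print(len(treatments))  # 3 levels x 2 levels = 6 treatments
```

With three levels of one factor and two of the other, the experiment has 3 x 2 = 6 treatments.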

37

Experiment

Allows us to study the combined effects of several factors; interactions among several factors can produce effects that cannot be predicted by looking at each of the factors alone

38

Comparative experiment

Remedy for confounding: an experiment in which some units receive one treatment and similar units receive another. Most well-designed experiments compare two or more treatments

39

Random assignment

If treatments are given to groups that differ greatly, like self-selected groups, bias will result. We use ____ so that units are assigned to treatments using a chance process; this helps ensure that the effects of other variables are spread roughly evenly among the treatment groups. Groups are usually made equal (or nearly equal) in size, although random assignment itself does not require it.
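
A minimal sketch of assigning units to treatments by chance. The helper name `randomly_assign`, the unit labels, and the treatment names are hypothetical; dealing the shuffled units out in turn is one design choice that keeps group sizes as equal as possible.

```python
import random

def randomly_assign(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment: 10 subjects, two treatments.
subjects = [f"subject_{i}" for i in range(1, 11)]
groups = randomly_assign(subjects, ["new drug", "placebo"], seed=4)
print(groups)
```

Because the shuffle ignores every characteristic of the units, neither the researcher nor the subjects can influence who ends up in which group.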

40

Chance

Assigns individuals to groups but will always cause some difference between the groups

41

Control

Prevents confounding, reduces variability in the response variable, and provides a baseline for comparison

42

4 Principles of experimental design

1. Comparison. Use a design that compares two or more treatments.

2. Random assignment. Use chance to assign experimental units to treatments; doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among treatment groups.

3. Control. Keep other variables that might affect the response the same for all groups.

4. Replication. Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups.

43

Completely Randomized Design

The treatments are assigned to all the experimental units completely by chance. Some experiments may include a control group. Using chance to assign treatments in an experiment does NOT guarantee a completely randomized design.

44

Control group

A group that receives an inactive treatment or an existing baseline treatment. It is okay to have no control group when the goal is simply to compare the effects of several treatments, as opposed to determining whether any of them works better than an inactive treatment

45

Good experiments

The logic of a randomized comparative experiment depends on our ability to treat all the subjects the same in every way except for the actual treatments being compared; good experiments require careful attention to details to ensure that all subjects really are treated identically.

46

Placebo effect

A response to a dummy treatment; it can be very strong, because subjects' expectations bias the results

47

Double-blind experiment

Neither the subjects nor those who interact with them and measure the response variable know which treatment a subject received

48

Single-blind

Either the subjects or those who interact with them and measure the response variable do not know which treatment a subject is receiving, but not both. For example, the individuals interacting with the subjects may know the treatment while the subjects are still unaware of it.

49

Confounding variables

Confounding occurs when existing differences in the experimental units are not taken into account; different variables might systematically affect the response to treatments

50

Statistically Significant

In an experiment, researchers usually hope to see a difference in the responses so large that it is unlikely to happen just because of chance variation. We can use the laws of probability, which describe chance behavior, to learn whether the treatment effects are larger than we would expect to see if only chance were operating. An observed effect so large that it would rarely occur by chance is called statistically significant.
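
One way to ask "would this rarely occur by chance?" is to simulate the random assignment many times. The sketch below is only an illustration: the response values are made-up numbers, `randomization_p_value` is a hypothetical helper, and a one-sided comparison of means is just one reasonable choice.

```python
import random
import statistics

def randomization_p_value(group_a, group_b, reps=2000, seed=0):
    """Estimate how often chance alone gives a difference this large."""
    rng = random.Random(seed)
    observed = statistics.mean(group_a) - statistics.mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # re-do the random assignment
        diff = statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / reps

# Made-up responses, for illustration only.
treatment = [12, 15, 14, 16, 13, 17, 15, 14]
control = [10, 11, 9, 12, 10, 11, 13, 10]
observed_diff, p_value = randomization_p_value(treatment, control)
print(observed_diff, p_value)
```

If only a tiny fraction of the reshuffled assignments produce a difference as large as the observed one, the observed effect is hard to explain by chance alone.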

51

Statistically Significant association

In general, association does not imply causation, but a statistically significant association in data from a well-designed experiment (a randomized comparative experiment) does imply causation.

52

Blocking

When a population consists of groups of individuals that are similar within but different between, a stratified random sample gives a better estimate than a simple random sample. This same logic applies to experiments.

53

Block

Group of experimental units that are known before the experiment to be similar in some way that is expected to affect the response to treatments. When blocks are formed wisely, it is easier to find convincing evidence that one treatment is more effective than the other; blocks should be based on the most important unavoidable sources of variation in the experimental units. Each block acts as its own control.

54

Randomized block design

The random assignment of experimental units to treatments is carried out separately within each block; it averages out the effects of other remaining variables and allows the unbiased comparison of the treatments
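
A minimal sketch of carrying out the random assignment separately within each block. The blocks (plots grouped by soil type), the treatment names, and the helper `randomized_block_assignment` are hypothetical.

```python
import random

def randomized_block_assignment(blocks, treatments, seed=None):
    """Randomly assign treatments separately inside every block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)  # a fresh shuffle for this block only
        for i, unit in enumerate(shuffled):
            assignment[unit] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks: plots grouped by soil type.
blocks = {
    "sandy": ["plot1", "plot2", "plot3", "plot4"],
    "clay": ["plot5", "plot6", "plot7", "plot8"],
}
block_assignment = randomized_block_assignment(blocks, ["A", "B"], seed=5)
print(block_assignment)
```

Each block ends up with the same mix of treatments, so differences between blocks cannot be mistaken for differences between treatments.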

55

Helpful reminder

Control what you can, block what you can't, randomize to create comparable groups

56

Matched pairs design

Common type of randomized block design for comparing two treatments; the idea is to create blocks by matching pairs of similar experimental units. Chance is used to determine which unit in each pair gets each treatment. Sometimes a pair in a matched pairs design consists of a single unit that receives both treatments; since the order of the treatments can influence the response, chance is used to determine which treatment is applied first for each unit. Matching greatly reduces variability in the response variable. The results are compared within each block and as a whole.
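
A minimal sketch of the chance step in a matched pairs design. The pairs of subjects, the treatment names, and the helper `assign_matched_pairs` are hypothetical.

```python
import random

def assign_matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """Within each pair, let chance decide which unit gets which treatment."""
    rng = random.Random(seed)
    result = []
    for first, second in pairs:
        order = list(treatments)
        rng.shuffle(order)  # like a coin flip for this pair
        result.append({first: order[0], second: order[1]})
    return result

# Hypothetical pairs of similar subjects (for example, matched on age).
pairs = [("Ann", "Bea"), ("Cal", "Dan"), ("Eve", "Flo")]
pair_assignment = assign_matched_pairs(pairs, seed=6)
print(pair_assignment)
```

When a single unit receives both treatments, the same shuffle can decide the order of the treatments instead: `order[0]` first, `order[1]` second.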

57

Large amount of variability

We are unable to draw conclusions based on experiments if the response variable shows _____.

58

Design of study

Determines appropriate method of analysis

59

Scope of inference

Random selection allows inference about the population; random assignment allows inference about cause and effect relationships

60

Inference about cause and effect

Well-designed experiments randomly assign individuals to treatment groups. However, most experiments don't select experimental units at random from the larger population. That limits such experiments to inference about cause and effect for units like those in the study.

61

Inference about the population

Observational studies don't randomly assign individuals to groups, which rules out inference about cause and effect. Observational studies that use random sampling can make inferences about the population

62

Challenges of establishing causation

Well-designed experiments tell us that changes in the explanatory variable cause changes in the response variable. Lack of realism can limit our ability to apply the conclusions of an experiment to the settings of greatest interest (lab settings vs. reality with little control)

63

Causation from Observational Studies

It is sometimes possible to build a strong case for causation based on data from observational studies; criteria include: the association is strong, the association is consistent, larger values of the explanatory variable are associated with stronger responses, the alleged cause precedes the effect in time, and the alleged cause is plausible.

64

Data Ethics

Complex issues of data ethics arise when we collect data from people

65

Basic Data Ethics criteria

Basic standards of data ethics that must be obeyed by all studies that gather data from human subjects, both observational studies and experiments:

1. All planned studies must be reviewed in advance by an institutional review board charged with protecting the safety, rights, and well-being of the subjects. The board's purpose is to protect the subjects from possible harm, not to decide whether the study will produce valuable information or whether it is statistically sound.

2. All individuals who are subjects in a study must give their written informed consent before data are collected. They must be informed of the purpose and nature of the study, any risks of harm, and the time it will take.

3. All individual data must be kept confidential. Only statistical summaries for groups of subjects may be made public (different from anonymity, which is not usually desired).

66

Sampling frame

List of all individuals from which a sample will be drawn

67

Undercoverage

Occurs when some members of the population cannot be chosen in a sample

68

Nonresponse

Occurs when individuals chosen for the sample can't be contacted or refuse to participate; often exceeds 50%

69

Response bias

A systematic pattern of incorrect responses in a sample survey leads to _____; it can stem from the ethnicity, gender, age, race, or behavior of the interviewer or the respondent.

70

Wording

The wording of the questions and the order in which they are presented to individuals are the most important influences on the answers given to a sample survey. Confusing or leading questions can introduce strong bias and greatly affect the survey's outcome.