Designing Studies [The Practice of Statistics- Chapter 4]

Introduction

The goal of many statistical studies is to show that changes in one variable cause changes in another variable.

4.1- Sampling and Surveys

To find the percentage of drivers in the US who text while driving, the ideal approach would be to take a census and ask everyone. Since a census isn’t practical, we instead collect data from a sample chosen to represent the entire population.

  • The population in a statistical study is the entire group of individuals we want information about

  • A census collects data from every individual in the population

  • A sample is a subset of individuals in the population from which we actually collect data

The Idea of a Sample Survey

We often draw conclusions about a whole population on the basis of a sample. The first step in planning a sample survey is to say exactly what population we want to describe. The second step is to say exactly what we want to measure, that is, to give the exact definitions of our variables. The final step in planning a sample survey is to decide how to choose a sample from the population.

We reserve the term “sample survey” for studies that use an organized plan to choose a sample that represents some specific population. The population in a sample survey can consist of people, animals, or things.

How to Sample Badly

Convenience sampling often produces unrepresentative data.

  • Choosing individuals from the population who are easy to reach results in a convenience sample

Convenience sampling leads to bias: it uses a method that favors some outcomes over others.

  • The design of a statistical study shows bias if it would consistently underestimate or consistently overestimate the value you want to know

Bias is not just bad luck in one sample. It’s the result of a bad study design that will consistently miss the truth about the population in the same way. Voluntary response samples are also an example of biased sampling.

  • A voluntary response sample consists of people who choose themselves by responding to a general invitation

People who choose to respond to call-in, text-in, or many internet polls are usually not representative of some larger population of interest. Voluntary response samples attract people who feel strongly about an issue, and who often share the same opinion.

How to Sample Well: Simple Random Sampling

The best way to avoid convenience and voluntary response bias is by letting chance choose the sample. That’s the idea of random sampling.

  • Random sampling involves using a chance process to determine which members of a population are included in the sample

The easiest way to choose a random sample is to have people write their names on a slip of paper, put them all in a hat, and pull out slips until the desired sample size is chosen. The resulting sample is a simple random sample, or SRS.

  • A simple random sample (SRS) of size n is chosen in such a way that every group of n individuals in the population has an equal chance to be selected as the sample.

There are many ways to take an SRS, and using a set of random digits (Table D) is a common way.

How to Choose an SRS Using Table D

  1. Label- Give each member of the population a numerical label with the same number of digits. Use as few digits as possible

  2. Randomize- Read consecutive groups of digits of the appropriate length from left to right across a line in Table D. Ignore any group of digits that wasn’t used as a label or that duplicates a label already in the sample. Stop when you have chosen n different labels
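The two steps above can be sketched in Python. This is only an illustration: the function name `srs_from_digits` and the line of digits are made up for this example and are not taken from Table D.

```python
def srs_from_digits(digits, population_size, n):
    """Choose n distinct labels (1..population_size) by reading a string of digits."""
    # Label: every label has the same number of digits (01, 02, ..., 30 for 30 people)
    width = len(str(population_size))
    chosen = []
    # Randomize: read consecutive groups of `width` digits from left to right
    for i in range(0, len(digits) - width + 1, width):
        label = int(digits[i:i + width])
        # Ignore groups that are not labels or that repeat a label already chosen
        if 1 <= label <= population_size and label not in chosen:
            chosen.append(label)
        # Stop when n different labels have been chosen
        if len(chosen) == n:
            break
    return chosen  # may hold fewer than n labels if the digits run out


# Hypothetical line of digits standing in for one row of a random digit table
line = "1922395034057562871396409125314254482853"
print(srs_from_digits(line, population_size=30, n=4))  # → [19, 22, 5, 13]
```

In practice, software can do the same job directly. For example, Python’s `random.sample(range(1, 31), 4)` picks 4 different labels from 1 to 30 by chance.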

Other Random Sampling Methods

Sometimes it is difficult to take an SRS and collect data from every individual chosen, and a more complex sampling method is easier to use. One of the most common alternatives to an SRS involves sampling groups (strata) of similar individuals within the population separately. Then these separate “subsamples” are combined to form one stratified random sample.

  • To get a stratified random sample, start by classifying the population into groups of similar individuals, called strata. Then choose a separate SRS in each stratum and combine these SRSs to form the sample

When we can choose strata that are “similar within but different between,” stratified random samples give more precise estimates than simple random samples of the same size.
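A minimal sketch of a stratified random sample, assuming the population is a list and that a function tells us each individual’s stratum. The names `stratified_sample` and `stratum_of`, and the class-year example, are hypothetical.

```python
import random


def stratified_sample(population, stratum_of, n_per_stratum, seed=None):
    """Take a separate SRS of size n_per_stratum in each stratum, then combine them."""
    rng = random.Random(seed)
    # Classify the population into strata of similar individuals
    strata = {}
    for individual in population:
        strata.setdefault(stratum_of(individual), []).append(individual)
    # Choose an SRS within every stratum and combine the SRSs
    sample = []
    for name in sorted(strata):
        sample.extend(rng.sample(strata[name], n_per_stratum))
    return sample


# Hypothetical population: 40 students, stratified by class year
students = [(f"student{i}", "junior" if i % 2 else "senior") for i in range(40)]
sample = stratified_sample(students, lambda s: s[1], n_per_stratum=5, seed=1)
print(len(sample))  # → 10 (5 juniors and 5 seniors)
```

Every stratum is guaranteed to appear in the sample, which is what separates this method from an SRS of the whole population.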

When populations are large and spread out, using a method that selects clusters of individuals near each other can be easier. That’s the idea of a cluster sample.

  • To get a cluster sample, start by classifying the population into groups of individuals that are located near each other, called clusters. Then choose an SRS of the clusters. All individuals in the chosen clusters are included in the sample

Cluster samples are often used for practical reasons, like saving time and money.
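For contrast with a stratified sample, here is a minimal sketch of a cluster sample. It assumes the clusters are already given as a dictionary mapping a cluster name to its members; `cluster_sample` and the homeroom data are hypothetical.

```python
import random


def cluster_sample(clusters, n_clusters, seed=None):
    """Choose an SRS of clusters and include every individual in the chosen clusters."""
    rng = random.Random(seed)
    # The chance process picks whole clusters, not individuals
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [individual for name in chosen for individual in clusters[name]]


# Hypothetical clusters: homerooms in a school
homerooms = {
    "Room 101": ["Ana", "Ben", "Cal"],
    "Room 102": ["Dee", "Eli", "Fay"],
    "Room 103": ["Gus", "Hal", "Ivy"],
}
print(cluster_sample(homerooms, n_clusters=2, seed=3))
```

In a stratified sample, some individuals are chosen from every stratum. In a cluster sample, all individuals are taken from only some of the clusters.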

IMPORTANT: Be sure to understand the difference between strata and clusters.

Inference for Sampling

The purpose of a sample is to give us information about a larger population. The process of drawing conclusions about a population on the basis of sample data is called inference because we infer information about the population from what we know about the sample.

Inference from convenience samples or voluntary response samples would be misleading because these methods of choosing a sample are biased and don’t fairly represent the population. The first reason to rely on random sampling is to avoid bias in choosing a sample.

In addition, larger random samples give better information about the population than smaller samples. This helps to reduce the margin of error.
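A small simulation can illustrate why larger samples help. This sketch assumes a very large population in which a known proportion `true_p` has some trait, so that each sampled individual has the trait with probability `true_p`. The function name and the numbers are made up for illustration.

```python
import random
import statistics


def spread_of_estimates(true_p, n, reps=200, seed=0):
    """Standard deviation of the sample proportion across many random samples of size n."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(reps):
        # Count how many of the n sampled individuals have the trait
        hits = sum(rng.random() < true_p for _ in range(n))
        estimates.append(hits / n)
    return statistics.stdev(estimates)


small = spread_of_estimates(true_p=0.3, n=100)
large = spread_of_estimates(true_p=0.3, n=1000)
print(small > large)  # the estimates from larger samples vary less
```

Less variation from sample to sample is what a smaller margin of error describes.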

Sample Surveys: What Can Go Wrong?

Sampling is often done using a list of individuals in the population, but such lists are often inaccurate or incomplete. This results in undercoverage.

  • Undercoverage occurs when some members of the population cannot be chosen in a sample

Most samples suffer from some degree of undercoverage. Well-designed sample surveys avoid bias in the sampling process. The real problems start after the sample is chosen. One of the biggest sources of bias in sample surveys is nonresponse.

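  • Nonresponse occurs when an individual chosen for the sample can’t be contacted or refuses to participate

Some students misuse the term “voluntary response” to explain why certain individuals don’t respond in a sample survey. Their reasoning is that participation in the survey is optional, so anyone can refuse to take part. What they are describing is nonresponse, which happens after a sample has been chosen. A voluntary response sample, by contrast, consists of people who choose themselves.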

Another form of bias occurs when people give incorrect answers to survey questions. A systematic pattern of inaccurate answers in a survey leads to response bias. The wording of questions is also an important influence on the answers given.

4.2- Experiments

A sample survey aims to gather information about a population without disturbing the population in the process. Sample surveys are a type of observational study. There are also statistical designs for experiments, a different way to produce data.

Observational Study vs Experiment

The main difference between an observational study and an experiment is that an experiment imposes a treatment to measure the response.

  • An observational study observes individuals and measures variables of interest but doesn’t attempt to influence the responses

  • An experiment deliberately imposes some treatment on individuals to measure their responses

Experiments are the only source of fully convincing data when trying to understand cause and effect. In an experiment, the explanatory variable is the treatment(s), and the response variable is how the subject responds to the treatment(s).

When the effects of two variables on a response variable can’t be separated from each other, it’s called confounding. If there’s no difference between the groups with respect to the other variable, there can be no confounding.

  • Confounding occurs when two variables are associated in such a way that their effects on a response variable can’t be distinguished from each other

Observational studies of the effect of an explanatory variable on a response variable often fail because of confounding between the explanatory variable and one or more other variables. Well-designed experiments take steps to prevent confounding.

The Language of Experiments

An experiment is a statistical study in which treatments are applied to people, animals, or objects (the experimental units) to observe the response.

  • A specific condition applied to the individuals in an experiment is called a treatment. If an experiment has several explanatory variables, a treatment is a combination of these variables

  • The experimental units are the smallest collection of individuals to which treatments are applied. When the units are human beings, they are often called subjects

The big advantage of experiments over observational studies is that experiments can give good evidence for causation. Sometimes, the explanatory variables in an experiment are called factors. In experiments where the joint effect of several factors is studied, each treatment is formed by combining a specific value (often called a level) of each of the factors.

How to Experiment Well

The remedy for some confounding variables is to do a comparative experiment. Most well-designed experiments compare two or more treatments.

Comparison alone isn’t enough to produce results we can trust. If the treatments are given to groups that are very different when the experiment begins, bias will result. The solution to the problem of bias in experiments is random assignment.

  • In an experiment, random assignment means that experimental units are assigned to treatments using a chance process

Although random assignment should create groups that are roughly equivalent to begin with, we still have to ensure that the only consistent difference between the groups during the experiment is the treatment they receive. We can control the effects of some variables by keeping them the same for all groups.

The idea of replication is to use enough experimental units to distinguish a difference in the effects of the treatment from chance variations due to random assignment.

Principles of Experimental Design

  1. Comparison- Use a design that compares two or more treatments

  2. Random assignment- Use chance to assign experimental units to treatments. Doing so helps create roughly equivalent groups of experimental units by balancing the effects of other variables among the treatment groups

  3. Control- Keep other variables that might affect the response the same for all groups

  4. Replication- Use enough experimental units in each group so that any differences in the effects of the treatments can be distinguished from chance differences between the groups

Completely Randomized Designs

There are statistical reasons for using treatment groups that are about equal in size. One type of design to accomplish this is a completely randomized design.

  • In a completely randomized design, the experimental units are assigned to the treatments completely by chance

The main purpose of a control group is to provide a baseline for comparing the effects of the other treatments.
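A minimal sketch of random assignment in a completely randomized design. It shuffles the units and deals them out in turn, so the treatment groups come out about equal in size. The name `completely_randomized` and the treatment labels are hypothetical.

```python
import random


def completely_randomized(units, treatments, seed=None):
    """Assign every experimental unit to a treatment entirely by chance."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)  # chance decides the order of the units
    groups = {t: [] for t in treatments}
    # Deal the shuffled units out in turn so group sizes differ by at most one
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups


# Hypothetical experiment: 10 subjects, a new drug versus a placebo control group
groups = completely_randomized(range(1, 11), ["new drug", "placebo"], seed=2)
print({t: len(members) for t, members in groups.items()})  # → {'new drug': 5, 'placebo': 5}
```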

Experiments: What Can Go Wrong?

The logic of a randomized comparative experiment depends on our ability to treat all the subjects the same in every way except for the actual treatments being compared. Good experiments, therefore, require careful attention to details to ensure that all subjects really are treated identically. Many patients respond favorably to any treatment, even a placebo, perhaps because they trust the doctor. The response to a dummy treatment is called the placebo effect.

Because the placebo effect is so strong, it would be foolish to tell subjects in a medical experiment whether they are receiving a new drug or a placebo. Knowing that they are getting “just a placebo” might weaken the placebo effect and bias the experiment in favor of the other treatments. Knowing which subjects receive a placebo can also change doctors’ expectations and therefore, potentially, the outcome of the experiment. Due to this, experiments with human subjects should be double-blind whenever possible.

  • In a double-blind experiment, neither the subject nor those who interact with them and measure the response variable know which treatment a subject received

If only one of these two parties is unaware of the treatment assignment, the experiment is single-blind: either the subjects don’t know which treatment they are receiving, or the individuals who interact with the subjects and measure the response variable don’t know.

Inference for Experiments

In an experiment, researchers hope to see a difference in the responses so large that it is unlikely to happen just because of chance variation. We can use the laws of probability, which describe chance behavior, to decide whether the treatment effects are larger than we would expect to see by chance alone. If they are, we call them statistically significant.

  • An observed effect so large that it would rarely occur by chance is called statistically significant

A statistically significant association in data from a well-designed experiment does imply causation.
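One way to see what “would rarely occur by chance” means is to redo the random assignment many times and check how often chance alone produces a difference at least as large as the one observed. The sketch below does this by shuffling the responses between two groups. It is an illustration under stated assumptions, not a method given in this chapter: the name `chance_of_difference` and the data are made up.

```python
import random


def chance_of_difference(group_a, group_b, reps=2000, seed=0):
    """Estimate how often random assignment alone gives a difference in means
    at least as large as the observed one."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    count = 0
    for _ in range(reps):
        rng.shuffle(pooled)  # pretend the treatments had no effect and reassign
        a, b = pooled[:n_a], pooled[n_a:]
        if abs(sum(a) / len(a) - sum(b) / len(b)) >= observed:
            count += 1
    return count / reps


# Hypothetical responses for two treatment groups
treatment = [10, 11, 12, 13, 14, 15]
control = [1, 2, 3, 4, 5, 6]
print(chance_of_difference(treatment, control) < 0.05)  # → True
```

A small value means the observed difference would rarely occur by chance, which is what calling it statistically significant claims.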

Blocking

When a population consists of groups of individuals that are “similar within but different between,” a stratified random sample gives a better estimate than a simple random sample. This same logic applies in experiments with randomized block designs.

  • A block is a group of experimental units that are known before the experiment to be similar in some way that’s expected to affect the response to the treatments

  • In a randomized block design, the random assignment of experimental units to treatments is carried out separately within each block

Using a randomized block design allows us to account for the variation in the response that is due to the blocking variable. This makes it easier to determine if one treatment is actually more effective than the other.
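A minimal sketch of a randomized block design, assuming a function gives each unit’s block. The only change from a completely randomized design is that the shuffle happens separately inside each block. The names `randomized_block` and `block_of`, and the example data, are hypothetical.

```python
import random


def randomized_block(units, block_of, treatments, seed=None):
    """Randomly assign units to treatments separately within each block."""
    rng = random.Random(seed)
    # Form the blocks before any random assignment takes place
    blocks = {}
    for unit in units:
        blocks.setdefault(block_of(unit), []).append(unit)
    assignment = {}
    for name in sorted(blocks):
        members = blocks[name][:]
        rng.shuffle(members)  # chance acts only within this block
        for i, unit in enumerate(members):
            assignment[unit] = treatments[i % len(treatments)]
    return assignment


# Hypothetical units: 12 subjects blocked by age group
subjects = [(f"subject{i}", "under 40" if i < 6 else "40 and over") for i in range(12)]
plan = randomized_block(subjects, lambda s: s[1], ["A", "B"], seed=5)
print(sum(1 for s in subjects if s[1] == "under 40" and plan[s] == "A"))  # → 3
```

Each treatment appears equally often in every block, so the blocking variable cannot be confounded with the treatments.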

A common type of randomized block design for comparing two treatments is a matched pairs design. The idea is to create blocks by matching pairs of similar experimental units. Then we can use chance to decide which member of a pair gets the first treatment. The other subject in that pair receives the other treatment. That is, the random assignment of subjects to treatments is done within each matched pair. Just as with other forms of blocking, matching helps account for the variation among the experimental units.
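The matched pairs idea can be sketched as a coin flip within each pair. The sketch assumes the pairs have already been matched. The name `matched_pairs`, the treatment labels, and the pairs are hypothetical.

```python
import random


def matched_pairs(pairs, treatments=("A", "B"), seed=None):
    """For each matched pair, let chance decide which member gets the first treatment."""
    rng = random.Random(seed)
    assignment = []
    for first, second in pairs:
        if rng.random() < 0.5:  # a coin flip within the pair
            assignment.append({first: treatments[0], second: treatments[1]})
        else:
            assignment.append({first: treatments[1], second: treatments[0]})
    return assignment


# Hypothetical pairs of similar subjects
pairs = [("Ann", "Bea"), ("Cy", "Dan"), ("Eve", "Flo")]
for pair_plan in matched_pairs(pairs, seed=7):
    print(pair_plan)
```

Every pair receives both treatments, so comparisons are always made between similar units.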

4.3- Using Studies Wisely

Scope of Inference

Random sampling avoids bias and produces trustworthy estimates of the truth about the population, so inferences about the population can be made. Using volunteer subjects in an experiment limits the scientists’ ability to generalize their findings to a larger population of individuals.

The Challenges of Establishing Causation

A well-designed experiment tells us that changes in the explanatory variable cause changes in the response variable. Lack of realism can limit the ability to apply the conclusions of an experiment to a larger population. For some questions (such as whether smoking causes lung cancer), people can’t be randomly assigned to treatments, so researchers must rely on observational studies.

What are the criteria for establishing causation when we can’t do an experiment?

  • The association is strong

  • The association is consistent

  • Larger values of the explanatory variable are associated with stronger responses

  • The alleged cause precedes the effect in time

  • The alleged cause is plausible

Data Ethics

Basic Data Ethics

  • All planned studies must be reviewed in advance by an institutional review board charged with protecting the safety and well-being of the subjects

  • All individuals who are subjects in a study must give their informed consent before data are collected

  • All individual data must be kept confidential. Only statistical summaries for groups of subjects may be made public

Institutional review boards

The purpose of an institutional review board is, in the words of one university’s board, “to protect the rights and welfare of human subjects (including patients) recruited to participate in research activities.” The board continues to monitor the progress of the study after it begins.

Informed consent

Subjects must be informed in advance about the nature of a study and any risk of harm it may bring, then must consent in writing.

Confidentiality

Ethical problems don’t disappear once a study has been cleared by the review board, has obtained consent from its participants, and has actually collected data about them. It is important to protect individuals’ privacy by keeping all data about them confidential. Confidentiality is not the same as anonymity. Anonymity means that individuals are anonymous—their names are not known even to the director of the study. Anonymity is rare in statistical studies. Any breach of confidentiality is a serious violation of data ethics.
