Statistics and Global Issues Notes
Introduction
- Decision-making in the real world is intensely data-driven.
- Decisions are based on diverse and ubiquitous data.
- Decision-makers use statistics to:
- Collect data
- Analyze data
- Make interpretations
- Build models
- Describe data using graphs and summary measurements
- Develop methods for designing experiments and gathering data that are cost-effective and reduce bias.
Statistics and Global Issues
- Example: Pharmaceutical companies developing a vaccine during a global pandemic.
- This process uses a controlled experiment.
- A typical setup involves:
- Creating two groups sampled from the population.
- Treatment group (or experimental group): Receives the vaccine.
- Control group: Does not receive the vaccine.
- Random assignment of participants to either group.
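- Because assignment is random, the two groups are alike on average before the treatment, so a difference in the effect of the virus on the two groups (beyond what chance alone would produce) can be attributed to the treatment (the vaccine).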
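The setup above can be sketched in a few lines of Python. This is a minimal illustration, not a real trial design: the participant IDs, the group size of 200, and the two infection probabilities are all hypothetical values chosen only so the example runs, and `assign_groups` and `infection_rate` are helper names introduced here.

```python
import random


def assign_groups(participants, seed=None):
    """Randomly split participants into a treatment and a control group."""
    rng = random.Random(seed)
    shuffled = list(participants)  # copy so the caller's list is not modified
    rng.shuffle(shuffled)          # random order means random assignment
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def infection_rate(outcomes):
    """Fraction of a group that became infected (outcomes are True/False)."""
    return sum(outcomes) / len(outcomes)


# Hypothetical participant IDs standing in for people sampled from the population.
participants = [f"P{i:03d}" for i in range(1, 201)]
treatment, control = assign_groups(participants, seed=42)

# Assumed infection probabilities, invented for illustration only.
ASSUMED_RISK_VACCINATED = 0.05
ASSUMED_RISK_UNVACCINATED = 0.20

rng = random.Random(7)
treatment_outcomes = [rng.random() < ASSUMED_RISK_VACCINATED for _ in treatment]
control_outcomes = [rng.random() < ASSUMED_RISK_UNVACCINATED for _ in control]

print(f"Treatment group: {len(treatment)} people, "
      f"infection rate {infection_rate(treatment_outcomes):.2%}")
print(f"Control group:   {len(control)} people, "
      f"infection rate {infection_rate(control_outcomes):.2%}")
```

Shuffling the full list and cutting it in half gives every participant the same chance of landing in either group, which is the property that lets a difference between the groups be credited to the treatment.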
Definitions
- Controlled Experiment: An experiment conducted under controlled conditions, where only one or two factors are changed at a time while all others are held constant, to determine whether a relationship exists between variables.
- Treatment Group: The group that receives the treatment in the experiment.
- Control Group: The group that does not receive the treatment; it provides a baseline to determine if the treatment has an effect.
- Treatment: Something applied or administered to one or more groups in a controlled experiment.
- Population: The set of all objects or elements about which we are interested in making inferences.
- Frame: A list containing all members of the population.
- Population Parameters: Numerical facts about the population, such as a percentage or an average.
- A population can have many parameters.
- Example: Researchers studying registered voters in a state:
- Population: Registered voters in the state.
- Frame: A list of all registered voters in that state.
- Population Parameters of Interest:
- Percentage of voters that will vote on election day.
- Percentage of voters that favor a particular candidate.
- Average income of voters that favor a particular candidate.
- For a specific population at a specific point in time, population parameters do not change; they are fixed numbers.
- The value of a population parameter is seldom known because it involves all population measurements, which are usually too expensive or time-consuming to collect.
- It is the statistician's job to estimate these values from sample data.
Samples and Statistics
- Sample: A subset of the population used to gain insight about the population.
- Samples are used to represent a larger group, the population.
- Statistic: A numerical fact or characteristic of a sample, calculated from the sample data.
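To make the difference between a parameter and a statistic concrete, here is a small sketch based on the registered-voter example. The frame is simulated: the 10,000 voters, the 55% support level used to generate them, and the sample size of 500 are assumptions made up for this illustration, and `percent_in_favor` is a helper name introduced here.

```python
import random

rng = random.Random(2024)

# Hypothetical frame: one record per registered voter in the state.
# The 0.55 used to generate the data is an invented value.
frame = [
    {"voter_id": i, "favors_candidate": rng.random() < 0.55}
    for i in range(10_000)
]


def percent_in_favor(voters):
    """Percentage of the given voters who favor the candidate."""
    return 100 * sum(v["favors_candidate"] for v in voters) / len(voters)


# Parameter: computed from every member of the population (rarely possible in practice).
parameter = percent_in_favor(frame)

# Sample: a subset of the population, drawn at random from the frame.
sample = rng.sample(frame, k=500)

# Statistic: the same calculation, applied to the sample only.
statistic = percent_in_favor(sample)

print(f"Population parameter: {parameter:.1f}% in favor")
print(f"Sample statistic:     {statistic:.1f}% in favor")
```

Here the parameter can be computed only because the whole population is simulated. In a real study only the statistic would be available, and it would serve as the estimate of the unknown parameter.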
Visualizations
- Population (large blue oval):
- Described by parameters.
- Parameter values usually cannot be obtained because collecting data from every member of the population is too expensive or time-consuming.
- Sample (smaller green oval):
- A smaller group taken from the population.
- Its characteristics are the statistics.
- The goal is to use statistics from the sample to describe the characteristics (parameters) of the population.
Cyclical Nature of Statistics
- Population → (Sample) - From the population, look at a small group called the sample.
- Sample → (Statistics) - From the sample, calculate a statistic.
- Statistics → (Parameter) - The value of the statistic gives an estimate of the value of the parameter.
- Parameter → (Population) - The parameter is a characteristic of the population.
- The cycle:
- Start with a population.
- Look at a smaller group (sample).
- Calculate the statistic from the sample.
- Use the statistic to estimate the parameter.
- Use the estimated parameter to describe the population.
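The cycle can be run several times to see how it behaves. In the sketch below, the population of incomes is simulated (a mean of 50,000 and a spread of 12,000 are invented values), the sample size of 400 is an arbitrary choice, and `estimate_mean` is a helper name introduced here. Each pass draws a new sample and computes a new statistic, while the parameter stays fixed.

```python
import random
import statistics

rng = random.Random(1)

# Hypothetical population: incomes of voters who favor a particular candidate.
population = [rng.gauss(50_000, 12_000) for _ in range(20_000)]

# The parameter is a fixed number for this population at this point in time.
parameter = statistics.mean(population)


def estimate_mean(population, sample_size, rng):
    """One pass through the cycle: draw a sample, return its mean as the estimate."""
    sample = rng.sample(population, k=sample_size)  # population -> sample
    return statistics.mean(sample)                  # sample -> statistic


# Repeat the cycle five times; every pass uses a different random sample.
estimates = [estimate_mean(population, 400, rng) for _ in range(5)]

print(f"Population parameter (average income): {parameter:,.0f}")
for i, estimate in enumerate(estimates, start=1):
    print(f"Cycle {i}: sample statistic = {estimate:,.0f} "
          f"(off by {estimate - parameter:+,.0f})")
```

The statistics differ from one sample to the next but cluster around the parameter, which is why a single sample statistic is treated as an estimate of the parameter rather than as its exact value.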