Statistics Vocab
Bias: A systematic error in the way data are collected, worded, or analyzed that leads to inaccurate results.
Biased question: A question that is flawed in a way that leads to inaccurate results, such as "Do you agree that we should take a field trip to a science museum this year?" because it encourages a particular response.
Biased sample: A sample that misrepresents the population, such as when an environmental magazine surveys its own readers about recycling. Those readers likely hold strong opinions about recycling, so that viewpoint is overrepresented.
Cluster sample: A sample in which a population is divided into groups (clusters), one or more clusters are randomly selected, and every member of the selected clusters is included. For example, surveying every booth holder in a randomly chosen section of a convention center.
Confidence interval: An interval estimate of a population parameter, built by a method that captures the actual value in c% of repeated samples. For a mean it is typically written (x̄ - E, x̄ + E), where x̄ is the sample mean and E is the margin of error.
Control group: The group in an experiment that receives no treatment (or a placebo) and is otherwise kept under the same conditions as the treatment group, serving as a baseline for comparison.
Controlled experiment: An experiment in which two groups are studied under identical conditions except for one variable, typically with a treatment group and a control group.
Convenience sample: A sample in which only members of a population that are easy to reach are selected, such as surveying only students in your homeroom to determine their opinions.
Descriptive statistics: The branch of statistics that involves organizing, summarizing, and displaying data to describe its main features.
Experiment: A method that imposes a treatment on individuals to collect data on their response to the treatment in order to test hypotheses.
Hypothesis: A claim or statement about a characteristic of a population that is subject to investigation and testing.
Inferential statistics: The branch of statistics that involves using sample data to make inferences or predictions about a population.
Information design: The process of designing data and information so that it can be easily understood and used effectively.
Margin of error: The maximum expected difference between a sample result and the population parameter it estimates, often represented as a percentage.
Normal curve: The symmetric, bell-shaped graph of a normal distribution, which approximates the distribution of many naturally occurring phenomena.
Normal distribution: A continuous probability distribution characterized by a symmetric, bell-shaped curve.
Observational study: A study in which individuals are observed and variables are measured without intervention or manipulation by the researcher.
Parameter: A numerical summary of a population characteristic.
Placebo: A harmless substance or treatment with no therapeutic effect, often used in controlled experiments to test the efficacy of another treatment.
Probability distribution: A function that assigns probabilities to the outcomes of a random experiment.
Population: The entire group of individuals or instances about which we want information.
Random sample: A sample in which each member of the population has an equal chance of being selected.
Random variable: A variable whose value is subject to variations due to chance.
Randomization: The process of randomly assigning subjects to different treatment groups or experimental conditions.
Randomized comparative experiment: An experiment in which subjects are randomly assigned to either a control group or a treatment group to compare the effects of different treatments.
Replication: The repetition of an experiment or study under similar conditions to validate or refute the results.
Sample: A subset of individuals or instances from a larger population.
Sampling distribution: The probability distribution of a sample statistic based on all possible samples of a given size from a population.
Self-selected sample: A sample in which individuals voluntarily choose to participate, potentially introducing bias.
Simulation: The use of a model or computer program to mimic a real-world process or situation.
Standard error of the mean: The standard deviation of the sampling distribution of the sample mean.
Standard normal distribution: A normal distribution with a mean of 0 and a standard deviation of 1.
Statistic: A numerical summary of a sample characteristic.
Stratified sample: A sample obtained by dividing the population into homogeneous subgroups and then randomly selecting samples from each subgroup.
Survey: A research method that collects data from a sample of individuals or instances to generalize findings to a larger population.
Systematic sample: A sample obtained by selecting every nth individual from the population after an initial random start.
Treatment group: The group in an experiment that receives the treatment or intervention being studied.
Unbiased sample: A sample that is representative of the population and is free from systematic error or bias.
Z-score: A standardized score that indicates how many standard deviations a data point lies above or below the mean, computed as z = (value - mean) / standard deviation.
Standard deviation: A measure of how spread out a set of values is around its mean; it is the square root of the variance.
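
Worked example: The short Python sketch below shows how several of the terms above (standard deviation, z-score, standard error of the mean, margin of error, and confidence interval) fit together. The scores are made-up illustrative data, and the names z_score, sem, z_star, and margin are assumptions chosen for this example. It also assumes a normal critical value for a 95% interval, which is a simplification; with a sample this small a t-based value would usually be preferred.

```python
import math
import statistics

# Hypothetical sample of 10 test scores (illustrative data only).
scores = [72, 85, 90, 66, 78, 88, 95, 70, 82, 74]

n = len(scores)
mean = statistics.mean(scores)   # sample mean (a statistic)
sd = statistics.stdev(scores)    # sample standard deviation (n - 1 denominator)


def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd


# Standard error of the mean: estimated standard deviation of the
# sampling distribution of the sample mean.
sem = sd / math.sqrt(n)

# Critical value for a 95% interval, taken from the standard normal
# distribution (assumption: normal approximation is acceptable).
z_star = statistics.NormalDist().inv_cdf(0.975)

# Margin of error E and the confidence interval (mean - E, mean + E).
margin = z_star * sem
interval = (mean - margin, mean + margin)

print("mean:", mean)
print("standard deviation:", round(sd, 2))
print("z-score of 95:", round(z_score(95, mean, sd), 2))
print("standard error:", round(sem, 2))
print("95% interval:", tuple(round(x, 2) for x in interval))
```

A positive z-score means the value is above the sample mean and a negative one means it is below. Increasing the sample size shrinks the standard error, and with it the margin of error and the width of the interval.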