Statistics Vocab

  1. Bias: An error that results in a misrepresentation of a population.

  2. Biased question: A question that is flawed in a way that leads to inaccurate results. For example, "Do you agree that we should take a field trip to a science museum this year?" is biased because its wording encourages a particular response.

  3. Biased sample: A sample that overrepresents or underrepresents part of a population. For example, when an environmental magazine sends a survey on recycling to its own readers, who likely have strong opinions about recycling, that viewpoint is overrepresented.

  4. Cluster sample: A sample in which a population is divided into groups (clusters), one or more clusters are randomly selected, and all members of the selected clusters are included. For example, randomly choosing a section of a convention center and surveying every booth holder in that section.

  5. Confidence interval: An interval that, at a confidence level of c%, is expected to contain the actual value of a population parameter. It is typically written as (x - E, x + E), where x is the sample estimate and E is the margin of error.

  6. Control group: The group in an experiment that receives no treatment and is kept under ordinary conditions, serving as a baseline for comparison.

  7. Controlled experiment: An experiment in which two groups are studied under identical conditions except for one variable, typically with a treatment group and a control group.

  8. Convenience sample: A sample in which only members of a population that are easy to reach are selected, such as surveying only students in your homeroom to determine their opinions.

  9. Descriptive statistics: The branch of statistics that involves organizing, summarizing, and displaying data to describe its main features.

  10. Experiment: A method that imposes a treatment on individuals to collect data on their response to the treatment in order to test hypotheses.

  11. Hypothesis: A claim or statement about a characteristic of a population that is subject to investigation and testing.

  12. Inferential statistics: The branch of statistics that involves using sample data to make inferences or predictions about a population.

  13. Information design: The process of designing data and information so that it can be easily understood and used effectively.

  14. Margin of error: The maximum expected difference between a sample result and the population parameter it estimates, often represented as a percentage.

  15. Normal curve: The symmetric, bell-shaped graph of a normal distribution, which models the distribution of many naturally occurring phenomena.

  16. Normal distribution: A continuous probability distribution characterized by a symmetric, bell-shaped curve.

  17. Observational study: A study in which individuals are observed and variables are measured without intervention or manipulation by the researcher.

  18. Parameter: A numerical summary of a population characteristic.

  19. Placebo: A harmless substance or treatment with no therapeutic effect, often used in controlled experiments to test the efficacy of another treatment.

  20. Probability distribution: A function that assigns probabilities to the outcomes of a random experiment.

  21. Population: The entire group of individuals or instances about which we want information.

  22. Random sample: A sample in which each member of the population has an equal chance of being selected.

  23. Random variable: A variable whose value is subject to variations due to chance.

  24. Randomization: The process of randomly assigning subjects to different treatment groups or experimental conditions.

  25. Randomized comparative experiment: An experiment in which subjects are randomly assigned to either a control group or a treatment group to compare the effects of different treatments.

  26. Replication: The repetition of an experiment or study under similar conditions to validate or refute the results.

  27. Sample: A subset of individuals or instances from a larger population.

  28. Sampling distribution: The probability distribution of a sample statistic based on all possible samples of a given size from a population.

  29. Self-selected sample: A sample in which individuals voluntarily choose to participate, potentially introducing bias.

  30. Simulation: The use of a model or computer program to mimic a real-world process or situation.

  31. Standard error of the mean: The standard deviation of the sampling distribution of the sample mean.

  32. Standard normal distribution: A normal distribution with a mean of 0 and a standard deviation of 1.

  33. Statistic: A numerical summary of a sample characteristic.

  34. Stratified sample: A sample obtained by dividing the population into homogeneous subgroups (strata) and then randomly selecting members from each subgroup.

  35. Survey: A research method that collects data from a sample of individuals or instances to generalize findings to a larger population.

  36. Systematic sample: A sample obtained by selecting every nth individual from the population after an initial random start.

  37. Treatment group: The group in an experiment that receives the treatment or intervention being studied.

  38. Unbiased sample: A sample that is representative of the population and is free from systematic error or bias.

  39. Z-score: A standardized score that indicates how many standard deviations a data value lies above or below the mean of its distribution, computed as z = (value - mean) / (standard deviation).

  40. Standard deviation: A measure of the amount of variation or dispersion of a set of values.
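
The four probability-based sampling methods defined above (random, systematic, stratified, and cluster samples) can be contrasted with a short Python sketch. This is only an illustration under stated assumptions: the function names, the toy population of student ID numbers, and the grade-level grouping are hypothetical and do not come from the vocabulary list itself.

```python
import random


def simple_random_sample(population, k, rng):
    """Random sample: every member has an equal chance of selection."""
    return rng.sample(population, k)


def systematic_sample(population, step, rng):
    """Systematic sample: random start, then every `step`-th member."""
    start = rng.randrange(step)
    return population[start::step]


def stratified_sample(strata, k_per_stratum, rng):
    """Stratified sample: randomly pick members from EVERY subgroup."""
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, k_per_stratum))
    return sample


def cluster_sample(clusters, n_clusters, rng):
    """Cluster sample: randomly pick whole groups, keep ALL their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]


# Hypothetical population: student IDs 1..100, grouped into four grades.
rng = random.Random(0)  # seeded only so the sketch is repeatable
students = list(range(1, 101))
by_grade = {
    "grade 9": students[0:25],
    "grade 10": students[25:50],
    "grade 11": students[50:75],
    "grade 12": students[75:100],
}

print(simple_random_sample(students, 10, rng))
print(systematic_sample(students, 10, rng))
print(stratified_sample(by_grade, 3, rng))   # 3 students from each grade
print(cluster_sample(by_grade, 1, rng))      # every student in one grade
```

Note the difference between the last two calls: the stratified sample takes a few members from every group, while the cluster sample takes every member from a few groups.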
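
Several of the numerical terms above (standard deviation, z-score, standard error of the mean, margin of error, and confidence interval) fit together in one small calculation, sketched here in Python. The sketch makes assumptions that the definitions do not state: the data values are made up, the sample standard deviation stands in for the unknown population standard deviation, and the critical value 1.96 corresponds to roughly 95% confidence under a normal approximation (for small samples a t critical value would usually be more appropriate).

```python
import math
import statistics


def z_score(value, mean, sd):
    """Number of standard deviations that `value` lies from `mean`."""
    return (value - mean) / sd


def standard_error(sample):
    """Estimated standard error of the mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))


def confidence_interval(sample, z_star=1.96):
    """Interval (x - E, x + E), with margin of error E = z_star * SE.

    z_star=1.96 is an assumed critical value (about 95% confidence
    under a normal approximation).
    """
    x_bar = statistics.mean(sample)
    margin = z_star * standard_error(sample)
    return (x_bar - margin, x_bar + margin)


# Hypothetical data, chosen only for illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

print(statistics.mean(scores))     # sample mean
print(statistics.pstdev(scores))   # standard deviation, treating the data as a whole population
print(z_score(85, 70, 10))         # 1.5: the value 85 is 1.5 SDs above a mean of 70
print(standard_error(scores))
print(confidence_interval(scores))
```

The interval is symmetric about the sample mean, and its half-width is the margin of error E; a larger sample shrinks the standard error and therefore narrows the interval.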