Chapter 8: Sample Surveys & Experiments, Chapter 2: Displaying Categorical Data, Chapter 3: Displaying Quantitative Data & Describing Distributions Numerically, Chapter 4: Regression Scatterplots

Flashcards covering key vocabulary from Chapter 8 (Sample Surveys & Experiments), Chapter 2 (Displaying Categorical Data), Chapter 3 (Displaying Quantitative Data & Describing Distributions Numerically), and Chapter 4 (Regression Scatterplots).

70 Terms

Population

The entire group of individuals that we want information about.

Sample

A subset of the population that we actually examine to gather information about the population.

Voluntary Response Sampling

A sampling method in which people decide for themselves whether to take part (e.g., web polls, call-in polls).

Convenience Sampling

A sampling method where individuals are chosen because they are the easiest to reach.

Simple Random Sampling (SRS)

A sampling method where every possible sample of a given size has an equal chance of being selected, so each member of the population also has an equal chance of being included.

Systematic Sampling

A sampling method where every nth individual from an ordered list of the population is chosen, usually after a random starting point.
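
As a rough illustration of simple random and systematic sampling, the sketch below draws both kinds of sample from a made-up population of 100 numbered individuals. The population, the sample size of 10, and the fixed seed are assumptions chosen only for the example.

```python
import random

# Hypothetical population: 100 individuals labeled 1..100.
population = list(range(1, 101))
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Simple random sample: rng.sample draws without replacement,
# and every set of 10 individuals is equally likely.
srs = rng.sample(population, 10)

# Systematic sample: pick a random start among the first k individuals,
# then take every k-th individual after that.
k = len(population) // 10   # sampling interval, here 10
start = rng.randrange(k)    # random starting index from 0 to k - 1
systematic = population[start::k]

print(sorted(srs))
print(systematic)
```

Both samples contain 10 individuals, but the systematic sample is evenly spaced through the list while the simple random sample is not.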

Cluster Sampling

A sampling method where the population is divided into groups (clusters), some clusters are selected at random, and all individuals within the selected clusters are measured.

Stratified Sampling

A sampling method where the population is first divided into groups of similar individuals (strata), and then a Simple Random Sample (SRS) is taken from each stratum.
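
Cluster and stratified sampling both start by dividing the population into groups, so they are easy to confuse. The sketch below contrasts them on a hypothetical school of four grade levels with 25 students each; the grades, group sizes, and seed are assumptions for illustration only.

```python
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Hypothetical school: 4 grade levels, 25 students each, labeled (grade, id).
groups = {grade: [(grade, i) for i in range(25)] for grade in (9, 10, 11, 12)}

# Stratified sample: an SRS of 5 students from EVERY grade.
stratified = [s for members in groups.values() for s in rng.sample(members, 5)]

# Cluster sample: randomly choose 2 whole grades, then measure EVERYONE in them.
chosen_grades = rng.sample(list(groups), 2)
cluster = [s for grade in chosen_grades for s in groups[grade]]

print(len(stratified), sorted({grade for grade, _ in stratified}))
print(len(cluster), sorted(chosen_grades))
```

The stratified sample always includes some students from all four grades, while the cluster sample includes every student from only two of them.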

Multistage Random Sampling

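A sampling method carried out in stages that combines two or more sampling methods (e.g., randomly selecting clusters, then taking an SRS within each selected cluster).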

Biased Sample

A sample chosen by a method that systematically over- or under-represents parts of the population, for example because not every member has a fair chance of being selected.

Undercoverage

A problem in sampling where some part of the target population is left out of the design of the sample.

Non-response

A problem in sampling where an individual selected cannot be contacted or refuses to cooperate.

Response Bias

A problem in sampling where responses are influenced by how the survey is conducted, such as the interviewer's behavior or the wording of the questions.

Retrospective Study (Observational)

An observational study that looks backward in time.

Prospective Study (Observational)

An observational study that looks forward in time.

Control (Experimental Design)

A principle of experimental design involving managing experimental conditions for all treatment groups to prevent lurking variables from biasing results.

Random Assignment (Experimental Design)

A principle of experimental design stating that experimental units must be randomly assigned to treatments.

Replication (Experimental Design)

A principle of experimental design involving applying each treatment to many experimental units (or repeating the whole study) to reduce chance variation in results.

Placebo or Control (Experimental Design)

A principle of experimental design requiring the use of a dummy treatment or a standard comparison group as one of the treatments.

Double-blind (Medical Experiments)

A principle of experimental design for medical experiments where neither the participant nor the researcher taking measurements knows who received which treatment.

Experimental Units/Subjects

The individuals being studied in an experiment.

Treatment (Experiment)

A specific condition applied to the subjects in an experiment.

Factors (Experiment)

The explanatory (independent) variables that are thought to influence the response (outcome/dependent) variable studied, often combined at specific values (levels) to form a treatment.

Lurking Variable

A variable not among the explanatory or response variables, but which influences the interpretation of their relationship.

Confounding Variable

An additional variable that affects the response and is tied up with the explanatory variable, so that their separate effects on the response cannot be distinguished.

Placebo

A dummy treatment used in experiments.

Double-blind Experiment

An experiment where neither the participant nor the researcher taking measurements knows who had which treatment.

Single-blind Experiment

An experiment where the participants do not know which treatment they have been assigned.

Statistically Significant

An observed effect so large that it would rarely occur by chance.

Completely Randomized Design

An experimental design where subjects are randomly assigned to different treatment groups.
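
One common way to carry out a completely randomized design is to shuffle the list of subjects and split it into groups. The sketch below does this for 12 hypothetical subjects and two groups; the subject labels, group names, and seed are assumptions for illustration.

```python
import random

rng = random.Random(2)  # fixed seed so the sketch is repeatable

subjects = [f"S{i}" for i in range(1, 13)]  # 12 hypothetical subjects

# Shuffle a copy of the list, then split it in half:
# chance alone decides who ends up in which group.
shuffled = subjects[:]
rng.shuffle(shuffled)
half = len(shuffled) // 2
treatment_group = shuffled[:half]
control_group = shuffled[half:]

print("treatment:", treatment_group)
print("control:  ", control_group)
```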

Matched Pairs Design

An experimental design where subjects are paired according to variables that affect the response and then randomly assigned to treatments within pairs.

Block Design

An experimental design where subjects are first grouped into blocks of similar subjects, and then subjects are randomly assigned to treatments separately within each block.
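
The sketch below shows one possible way to randomize within blocks, using age group as a hypothetical blocking variable. The names, age groups, and seed are invented for the example. A matched pairs design can be viewed as the special case where every block holds exactly two subjects.

```python
import random

rng = random.Random(3)  # fixed seed so the sketch is repeatable

# Hypothetical subjects, each tagged with a blocking variable (age group).
subjects = [
    ("Ana", "under 40"), ("Ben", "under 40"),
    ("Cy", "under 40"), ("Dee", "under 40"),
    ("Eli", "40 and over"), ("Fay", "40 and over"),
    ("Gus", "40 and over"), ("Hal", "40 and over"),
]

# Step 1: form blocks of similar subjects.
blocks = {}
for name, age_group in subjects:
    blocks.setdefault(age_group, []).append(name)

# Step 2: randomize separately inside each block.
assignment = {}
for age_group, members in blocks.items():
    shuffled = members[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    for name in shuffled[:half]:
        assignment[name] = "treatment"
    for name in shuffled[half:]:
        assignment[name] = "control"

for name, age_group in subjects:
    print(age_group, name, assignment[name])
```

Because the shuffle happens inside each block, both age groups end up with the same number of treated and control subjects.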

Bar Graphs

Graphs used to display one categorical variable.

Pie Charts

Graphs used to display one categorical variable, showing proportions of a whole.

Contingency Table

A table used to display the relationship between two categorical variables.

Joint Proportions

Values found by dividing each cell frequency by the overall total in a contingency table.

Conditional Proportion

A value found by first conditioning on one category (whose row or column total becomes the denominator) and then dividing a cell frequency by that total.
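
To make the difference between joint and conditional proportions concrete, here is a small sketch using a made-up 2 × 2 contingency table. The categories and counts are hypothetical.

```python
# Hypothetical counts: rows are class year, columns are preferred study method.
table = {
    "junior": {"flashcards": 30, "practice tests": 20},
    "senior": {"flashcards": 10, "practice tests": 40},
}

grand_total = sum(sum(row.values()) for row in table.values())  # 100

# Joint proportion: cell count divided by the overall total.
joint = table["junior"]["flashcards"] / grand_total

# Conditional proportion: condition on "junior", so the junior row
# total (50) becomes the denominator.
junior_total = sum(table["junior"].values())
conditional = table["junior"]["flashcards"] / junior_total

print(joint)        # proportion of ALL students who are juniors using flashcards
print(conditional)  # proportion of JUNIORS who use flashcards
```

With these counts the joint proportion is 30/100 and the conditional proportion is 30/50; only the denominator changes.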

Histograms

Graphs used to display the distribution of quantitative variables.

Stem-plots (and split-stem plots)

Graphs used to display quantitative variables, showing the shape and individual data points.

Time Plots

Graphs used to display quantitative variables over time, showing trends.

Interpreting Quantitative Graphs

Analyzing graphs by evaluating their Shape, Center, Spread, and Outliers.

Graph Shapes (Quantitative)

Descriptions of the distribution of data, including Symmetry, Skewness (Left or Right), Bimodal, Unimodal, or Bell-Shaped.

Mean

The arithmetic average of a dataset: the sum of the values divided by the number of values.

Median

The middle value of a dataset when observations are ordered from smallest to largest (the average of the two middle values when the number of observations is even).

Variance

A measure of the spread or variability of the data: approximately the average of the squared differences from the mean (the sample variance divides the sum of squared differences by n - 1 rather than n).

Standard Deviation

A measure of the spread or variability of the data, calculated as the square root of the variance.
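
The measures of center and spread above can be checked with Python's standard `statistics` module. The data values below are hypothetical; note that `variance` and `stdev` divide by n - 1 (sample versions), while `pvariance` divides by n.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical quiz scores

mean = statistics.mean(data)            # 30 / 6
median = statistics.median(data)        # average of the two middle values, 4 and 5
sample_var = statistics.variance(data)  # sum of squared deviations (24) / (n - 1)
sample_sd = statistics.stdev(data)      # square root of the sample variance
pop_var = statistics.pvariance(data)    # same sum of squares (24) / n

print(mean, median, sample_var, sample_sd, pop_var)
```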

First Quartile (Q1)

The middle value of the smallest half of the data.

Third Quartile (Q3)

The middle value of the largest half of the data.

Five Number Summary

A set of five values that describe the distribution of data: Minimum, Q1, Median, Q3, and Maximum.
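
A minimal sketch of the five number summary, using the "median of each half" definition of Q1 and Q3 given in the cards above. It assumes that when the number of observations is odd, the overall median is left out of both halves; textbooks and software differ on this convention. The function name and data are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    lower_half = data[: n // 2]        # smallest half (middle value excluded if n is odd)
    upper_half = data[(n + 1) // 2 :]  # largest half
    return (min(data), median(lower_half), median(data),
            median(upper_half), max(data))

scores = [1, 3, 4, 6, 7, 8, 10, 12, 15]  # hypothetical data
print(five_number_summary(scores))
```

For these nine values the lower half is 1, 3, 4, 6 and the upper half is 8, 10, 12, 15, giving Q1 = 3.5 and Q3 = 11 around a median of 7.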

Boxplot

A graphical display created using the five number summary to show the distribution and potential outliers of quantitative data.

Modified Boxplot

A boxplot that plots outliers as individual points, often identified using rules such as 1.5 × IQR (or 3 × IQR for extreme outliers) beyond the quartiles.
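
The 1.5 × IQR rule can be sketched as follows, assuming quartiles are computed as the medians of the lower and upper halves of the sorted data. The data values are hypothetical.

```python
from statistics import median

data = sorted([2, 5, 6, 7, 8, 9, 10, 30])  # hypothetical values
n = len(data)

q1 = median(data[: n // 2])        # median of the smallest half
q3 = median(data[(n + 1) // 2 :])  # median of the largest half
iqr = q3 - q1                      # interquartile range

# Fences: anything beyond 1.5 * IQR from the quartiles is flagged.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < low_fence or v > high_fence]

print(q1, q3, iqr, low_fence, high_fence)
print(outliers)
```

Here Q1 = 5.5 and Q3 = 9.5, so the fences fall at -0.5 and 15.5 and only the value 30 is flagged.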

Resistant Measures

Statistical measures (like median and quartiles) that are not significantly affected by outliers or skewness in the data.

Non-resistant Measures

Statistical measures (like mean and standard deviation) that are significantly affected by outliers or skewness in the data.
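
A quick way to see the difference between resistant and non-resistant measures is to change one value into an outlier and compare. The two small datasets below are hypothetical and differ only in their largest value.

```python
from statistics import mean, median, stdev

original = [10, 12, 13, 15, 20]
with_outlier = [10, 12, 13, 15, 200]  # largest value replaced by an outlier

# The median stays put, while the mean and standard deviation move a lot.
print(median(original), median(with_outlier))
print(mean(original), mean(with_outlier))
print(stdev(original), stdev(with_outlier))
```

The median is 13 in both datasets, but the mean jumps from 14 to 50.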

Z-score

A standardized score (Z = (X - µ) / σ) indicating how many standard deviations a value is from the mean; used to compare values from different distributions (such as two different normal distributions).
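
As a small worked example of the z-score formula, the sketch below compares scores on two hypothetical tests with different means and standard deviations. The helper name `z_score` and all the numbers are made up for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical tests: which score is more impressive relative to its class?
z_a = z_score(85, mu=75, sigma=5)   # (85 - 75) / 5
z_b = z_score(90, mu=80, sigma=10)  # (90 - 80) / 10

print(z_a, z_b)
```

The raw score of 90 is higher, but the 85 sits 2 standard deviations above its mean while the 90 sits only 1 above, so the 85 is the stronger result relative to its group.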

Explanatory (x) Variable

The independent variable in a scatterplot, thought to influence the response variable.

Response (y) Variable

The dependent variable in a scatterplot, thought to be influenced by the explanatory variable.

Scatterplot

A graph that displays the relationship between two quantitative variables.

Form (of Scatterplot)

Describes the overall pattern of the relationship in a scatterplot, such as Linear, Curved, or Clusters.

Direction (of Scatterplot)

Describes whether the relationship between variables is a positive association (y tends to increase as x increases) or a negative association (y tends to decrease as x increases).

Strength (of Scatterplot)

Describes how closely the points in a scatterplot lie to a simple form, such as a line.

Outliers (in Scatterplot)

Extreme observations in a scatterplot that deviate from the overall pattern.

Correlation Coefficient (r)

A numerical measure (between -1 and +1) that quantifies the strength and direction of a linear relationship between two quantitative variables.
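
The cards do not give a formula for r, so the sketch below uses the standard Pearson formula, r = Sxy / sqrt(Sxx · Syy), on a small hypothetical dataset. The variable names are chosen for the example.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]  # hypothetical explanatory values
y = [2, 4, 5, 4, 5]  # hypothetical response values

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of products and squares of deviations from the means.
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)

r = sxy / sqrt(sxx * syy)
print(round(r, 4))
```

For this data r is about 0.77, a moderately strong positive linear association.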

Regression Line

A line that describes how a response variable y changes as an explanatory variable x changes, used for interpretation of slope, predictions, and residual calculations.

R² (Coefficient of Determination)

The square of the correlation coefficient, which measures the predictive power of the regression equation. It represents the proportion (usually stated as a percentage) of the variability in y that is explained by the regression line.

Residual

The error in prediction, calculated as the observed y-value minus the predicted y-value.

Negative Residual

Indicates that the prediction made by the regression line was too high compared to the observed value.

Positive Residual

Indicates that the prediction made by the regression line was too low compared to the observed value.
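
The regression, residual, and R² cards can be tied together with a short sketch. It assumes the regression line is the least-squares line (slope = Sxy / Sxx, intercept = ȳ - slope · x̄), which the cards do not state explicitly, and it uses a small hypothetical dataset.

```python
x = [1, 2, 3, 4, 5]  # hypothetical explanatory values
y = [2, 4, 5, 4, 5]  # hypothetical response values

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares slope and intercept (an assumption about how the line is fit).
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual = observed y minus predicted y.
predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]

# R^2 = 1 - (unexplained variation / total variation).
sse = sum(e ** 2 for e in residuals)
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

for xi, e in zip(x, residuals):
    verdict = "prediction too low" if e > 0 else "prediction too high"
    print(xi, round(e, 2), verdict)
print(round(slope, 2), round(intercept, 2), round(r_squared, 2))
```

For this data the fitted line is roughly y = 2.2 + 0.6x and R² is about 0.6. The first point has a negative residual (the line predicts too high) and the third has a positive residual (the line predicts too low).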

Residual Plot

A graph plotting residuals against the explanatory variable (x). A pattern in this plot (e.g., fanning, curvature) suggests that the linear regression line is not a good fit.

Extrapolation

Making predictions outside of the range for which there is available data, which can be unreliable.

Correlation does not imply Causation

A caution in regression analysis: a correlation between two variables does not by itself show that one causes the other, because lurking variables may be involved.