Vocabulary flashcards covering the fundamentals of statistical inference, data types, measures of center and variability, sampling methods, and experimental design rules.
Statistical Inference
Drawing reasonable conclusions about a population based on information from a good sample.
Sample
A subset of a population that is considered good if it is random and representative.
Data
A set of observations denoted as x={x1,x2,…}.
Discrete
A type of data with a finite or countable number of possible values.
Continuous
A type of data with an uncountably infinite number of possible values, such as any real number in an interval.
Statistics
The science of designing experiments, collecting data, analyzing that data, and drawing conclusions.
Population (of Interest)
All individuals about whom we want to obtain information.
Parameter
A value that describes the population, such as the average length of ALL boxes.
Statistic
A value that describes the sample, such as the average length of 100 boxes.
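The parameter/statistic distinction can be sketched in a few lines of Python. The population of box lengths below is simulated and purely hypothetical (the mean of 30 and spread of 2 are assumptions made for illustration); the point is only that the parameter uses ALL boxes while the statistic uses 100 of them.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: lengths (in cm) of ALL 10,000 boxes.
population = [random.gauss(30.0, 2.0) for _ in range(10_000)]

# Parameter: a value computed from the entire population.
parameter = statistics.mean(population)

# Statistic: the same kind of value computed from a sample of 100 boxes.
sample = random.sample(population, 100)
statistic = statistics.mean(sample)

print(f"parameter (all boxes): {parameter:.2f}")
print(f"statistic (100 boxes): {statistic:.2f}")
```

The two numbers are usually close but not identical, which is why inference from a statistic to a parameter needs a good sample.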
Quantitative/Numerical Data
Data consisting of numbers.
Categorical/Qualitative Data
Data consisting of characters or labels.
Histogram
A frequency distribution that bins quantitative data.
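As a rough illustration of binning, the sketch below counts how many values fall into each bin of a fixed width. The function name `histogram_counts` and the choice to label each bin by its lower edge are assumptions for this example, not a standard API.

```python
from collections import Counter

def histogram_counts(values, bin_width):
    """Count values per bin; each bin is labeled by its lower edge."""
    # Floor division maps a value to the lower edge of its bin,
    # e.g. with bin_width=5 the value 7 falls in the bin starting at 5.
    return Counter((v // bin_width) * bin_width for v in values)

counts = histogram_counts([1, 2, 7, 12, 13, 14], bin_width=5)
for lower_edge in sorted(counts):
    print(f"[{lower_edge}, {lower_edge + 5}): {'*' * counts[lower_edge]}")
```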
Dotplot
An alternative to a histogram used for displaying quantitative data.
Pie/Bar/Stacked Bar Chart
Types of charts used for visualizing categorical data.
Skew
A characteristic of an asymmetric distribution; the side with the longer tail determines whether it is right skewed or left skewed.
Mean (average)
A measure of center best suited for symmetric distributions (for example, normal or uniform); it is not robust to outliers.
Median (middle)
A measure of center best suited for skewed data; it is considered robust and is also known as Q2.
Mode (most frequent)
A measure of center best suited for categorical data.
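A small sketch of the three measures of center, using Python's standard `statistics` module. The income and color values are made up for illustration; the income list is deliberately right skewed so the mean and median disagree.

```python
import statistics

# Hypothetical right-skewed data (incomes in thousands): one large value.
incomes = [30, 32, 35, 38, 40, 41, 300]

mean_income = statistics.mean(incomes)      # pulled toward the long right tail
median_income = statistics.median(incomes)  # middle value, robust to the 300

# Hypothetical categorical data: the mode is the most frequent label.
colors = ["red", "blue", "red", "green", "red"]
most_common_color = statistics.mode(colors)

print(mean_income, median_income, most_common_color)
```

Here the median stays near the bulk of the data while the mean is dragged upward by the single large value.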
Variance or Standard Deviation
Measures of variability best suited for symmetrical unimodal distributions.
Interquartile range (IQR)
A measure of variability best for skewed data, calculated as Q3−Q1.
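The sketch below compares the standard deviation with the IQR on made-up data, with and without an extreme value. It assumes the default quartile convention of Python's `statistics.quantiles`; other textbooks and tools compute Q1 and Q3 slightly differently, so exact values may not match a given class.

```python
import statistics

def iqr(values):
    """Interquartile range, Q3 - Q1 (default 'exclusive' quartile method)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

data = [1, 2, 3, 4, 5, 6, 7, 8, 9]
with_outlier = [1, 2, 3, 4, 5, 6, 7, 8, 90]  # same data, largest value replaced

print("stdev:", statistics.stdev(data), "->", statistics.stdev(with_outlier))
print("IQR:  ", iqr(data), "->", iqr(with_outlier))
```

The standard deviation grows sharply once the extreme value is present, while the IQR does not move, which is the sense in which the IQR suits skewed data.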
Range
A measure of variability, calculated as Max−Min, that is not used in this specific class.
Outliers
Unusually extreme values, drawn as individual dots beyond the whiskers of a boxplot; an outlier can be the minimum or maximum value of the data.
5 Number Summary
A condensed summary of data consisting of the Min, Q1, Q2, Q3, and Max.
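A minimal sketch of computing the summary, assuming the default quartile convention of Python's `statistics.quantiles` (quartile conventions vary, so a calculator or textbook may give slightly different Q1 and Q3). The helper name `five_number_summary` is hypothetical.

```python
import statistics

def five_number_summary(values):
    """Return (Min, Q1, Q2, Q3, Max) for a list of numbers."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

summary = five_number_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
print(summary)  # (1, 2.5, 5.0, 7.5, 9)
```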
Bias
A situation where the subject or researcher favors a specific outcome.
Convenience sampling
A bad sampling method where the sample is obtained simply because it is convenient.
Volunteer Sample
A bad sampling method involving individuals who volunteer to participate, such as a call-in response.
Simple Random Sample (SRS)
A good sampling method where every possible sample of a given size has an equal probability of being selected, often using a random number generator.
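A minimal sketch of drawing an SRS with a random number generator. The population of ID numbers and the helper name `simple_random_sample` are assumptions for illustration; `random.Random.sample` draws without replacement.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Draw n distinct units from the population, without replacement."""
    rng = random.Random(seed)  # a seed makes the draw reproducible
    return rng.sample(population, n)

# Hypothetical population: subjects identified by ID numbers 0..999.
population_ids = list(range(1000))
srs = simple_random_sample(population_ids, 10, seed=1)
print(srs)
```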
Stratified Random Sample
A sampling method that involves dividing the population into groups (strata) by a common trait, followed by a proportional SRS from each stratum.
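The procedure can be sketched as follows. The student data, the helper name `stratified_sample`, and the choice to sample the same fraction from every stratum (one way to make the sample proportional) are assumptions for this example.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take an SRS of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by shared trait
    sample = []
    for members in strata.values():
        k = round(len(members) * fraction)     # proportional allocation
        sample.extend(rng.sample(members, k))  # SRS within the stratum
    return sample

# Hypothetical population: 60 freshmen and 40 seniors.
students = [("freshman", i) for i in range(60)] + [("senior", i) for i in range(40)]
chosen = stratified_sample(students, stratum_of=lambda s: s[0], fraction=0.1, seed=2)
print(chosen)
```

With a 10% fraction, the sample contains 6 freshmen and 4 seniors, matching the 60/40 split of the population.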
Cluster Sample
An SRS of naturally occurring clusters in a population; the clusters are similar to one another (mutually homogeneous) but internally heterogeneous.
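A rough sketch of a one-stage cluster sample, assuming that every subject in each selected cluster is included. The classroom data and the helper name `cluster_sample` are hypothetical.

```python
import random

def cluster_sample(clusters, n_clusters, seed=None):
    """Pick whole clusters by SRS and keep every unit inside them."""
    rng = random.Random(seed)
    chosen_names = rng.sample(sorted(clusters), n_clusters)  # SRS of clusters
    return [unit for name in chosen_names for unit in clusters[name]]

# Hypothetical clusters: five classrooms with four students each.
classrooms = {f"room{r}": [f"room{r}-student{s}" for s in range(4)] for r in range(5)}
sampled_students = cluster_sample(classrooms, n_clusters=2, seed=3)
print(sampled_students)
```

The randomness applies to which clusters are chosen, not to individuals, which is the main contrast with a stratified sample.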
Systematic Random Sample
A sampling method where every kth subject is selected from an ordered list of the population, typically starting from a randomly chosen position.
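A minimal sketch of the every-kth-subject idea, assuming the starting position is chosen at random among the first k subjects. The helper name `systematic_sample` and the numbered list are made up for illustration.

```python
import random

def systematic_sample(ordered_units, k, seed=None):
    """Select every k-th unit after a random start within the first k."""
    rng = random.Random(seed)
    start = rng.randrange(k)   # random starting index in 0..k-1
    return ordered_units[start::k]

# Hypothetical ordered population: subjects numbered 0..99 in list order.
every_tenth = systematic_sample(list(range(100)), k=10, seed=0)
print(every_tenth)
```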
Bad Sampling Frame
An issue where the list from which the sample is drawn (the sampling frame) is missing some subjects from the population.
Undercoverage
A situation where a group's share of the sample is smaller than its actual representation in the population.
Nonresponse Bias
A sampling issue occurring when selected subjects do not respond.
Response Bias
A situation where subjects give false responses, which can be caused by leading questions.
Experiments
The process of assigning treatment(s) and control to units/subjects from a sample.
Response
The variable that researchers are interested in measuring in an experiment.
Explanatory
The variable that researchers change to try and affect the response.
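One common way to assign treatment and control is at random, which the sketch below assumes; the cards above do not specify the assignment rule. The subject names and the helper `assign_groups` are hypothetical. The group label plays the role of the explanatory variable, and whatever is later measured on each subject is the response.

```python
import random

def assign_groups(subjects, seed=None):
    """Randomly split subjects into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

groups = assign_groups([f"subject{i}" for i in range(20)], seed=4)
print(groups["treatment"])
print(groups["control"])
```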
Observational Study
A study where units/subjects are observed without the researcher assigning any treatment; it can only conclude association, not cause and effect.