statistics exam definitions

Last updated 6:06 PM on 6/23/26

What is statistics?

Science of collecting, organizing, presenting and interpreting data.

Main steps of statistics

Explore → Summarize → Model → Estimate → Test.

Descriptive statistics

Describes the given data using tables, charts and summary statistics.

Inductive / inferential statistics

Uses sample data to draw conclusions about an unknown population.

Population

The whole group we are interested in.

Sample

A subset of the population.

Observational unit

One object/case/subject whose characteristics are measured.

Attribute

Characteristic measured on observational units, e.g. age, grade, color.

Attribute value

Concrete value of an attribute, e.g. age = 20, color = red.

Parameter

Information about the population, e.g. true mean μ.

Statistic

Information calculated from the sample, e.g. x̄.

Raw data list

Data in ungrouped form, listed value by value.

Representative sample

Sample that reflects the population well.

Simple random sample

Every object in the population has an equal chance of being selected, and every possible sample of the same size is equally likely.

Random error

Chance difference between a sample result and the true population value; it tends to shrink as the sample size grows.

Systematic bias

Non-random sampling error caused by how the data are selected or measured; a larger sample does not remove it, and it is hard to fix statistically.

Nominal scale

Categories without natural order. Example: blood type, cuisine, ticket type.

Ordinal scale

Categories with natural order, but distances are not objectively interpretable. Example: pain rating, satisfaction 1–5.

Metric discrete

Countable numerical values. Example: number of children, website visits.

Metric continuous

Measurable values on a continuum. Example: time, income, volume, length.

Quantitative data

Numerical data where distances are meaningful.

Categorical data

Category labels, e.g. color, gender, blood type.

Metric scale includes ordinal property?

Yes, metric values can be ordered.

Nominal scale includes ordinal property?

No, nominal categories have no natural order.

Frequency distribution

Shows how often each value/class occurs.

Frequency table

Tabular summary of counts and percentages.
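
A frequency table like the one described above can be sketched in a few lines of Python. The blood-type values are made up for illustration, and the name `table` is just a convenient label, not part of the course material.

```python
from collections import Counter

# Hypothetical raw data list: one blood type per observational unit.
blood_types = ["A", "O", "B", "A", "AB", "A", "O", "O", "A", "B"]

counts = Counter(blood_types)  # absolute frequencies
n = len(blood_types)

# Map each value to (count, percentage), most frequent value first.
table = {value: (count, 100 * count / n) for value, count in counts.most_common()}

for value, (count, percent) in table.items():
    print(f"{value:>2}  {count:>2}  {percent:5.1f}%")
```

The counts always add up to the sample size, and the percentages add up to 100.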

Cross tabulation

Table describing relationship between two categorical variables.

Class limits

Boundaries of intervals for grouped data.

Why use classes?

Continuous data often has too many different values, so grouping helps.

Measures of location

Mean, median, mode, quantiles.

Measures of variability

Range, IQR, variance, standard deviation, CV.

Mean is sensitive to outliers?

Yes. Extreme values can pull the mean strongly.

Median is robust?

Yes. Median is less affected by outliers.
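
The difference between the mean and the median under outliers is easy to check numerically. The sketch below uses invented income figures; only the pattern matters, not the numbers.

```python
from statistics import mean, median

# Hypothetical monthly incomes without and with one extreme value.
incomes = [2000, 2100, 2200, 2300, 2400]
with_outlier = incomes + [50000]

mean_before, mean_after = mean(incomes), mean(with_outlier)
median_before, median_after = median(incomes), median(with_outlier)

print(mean_before, mean_after)      # the mean is pulled far upward
print(median_before, median_after)  # the median moves only from 2200 to 2250
```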

Range weakness

Uses only the minimum and maximum, so it is very sensitive to outliers.

IQR advantage

More robust because it focuses on middle 50%.

Variance meaning

Average squared distance from the mean.

Standard deviation meaning

Typical distance of observations from the mean.

Population variance vs sample variance

Population: divide by N. Estimator from sample: divide by n − 1.
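
As a small worked example of the two divisors, the sketch below computes both variances by hand and compares them with Python's `statistics` module. The data values are illustrative only.

```python
from statistics import pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values, mean is 5
n = len(data)
m = sum(data) / n
squared_deviations = sum((x - m) ** 2 for x in data)  # 32.0

pop_var = squared_deviations / n           # population formula: 4.0
sample_var = squared_deviations / (n - 1)  # sample estimator: about 4.57

# The standard library uses the same two divisors.
print(pop_var, pvariance(data))
print(sample_var, variance(data))
```

The sample version is always a little larger, and the gap shrinks as n grows.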

Boxplot shows

Minimum, Q1, median, Q3, maximum, and sometimes outliers.

Bivariate data

Data with two paired variables (x_i, y_i).

Scatterplot

Visualizes relationship between two variables.

Covariance sign

Positive = variables tend to move together; negative = one tends to increase while the other decreases.

Covariance weakness

Not standardized, depends on units.

Pearson correlation

Standardized measure of linear relationship.
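
To see why covariance depends on units while Pearson's r does not, the sketch below measures the same (invented) heights once in metres and once in centimetres. The helper names `covariance` and `pearson` are defined here for illustration.

```python
from math import sqrt

def covariance(xs, ys):
    """Sample covariance (divides by n - 1)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)

def pearson(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Hypothetical paired data: body height and weight.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weights_kg = [55, 65, 72, 80, 90]
heights_cm = [h * 100 for h in heights_m]

cov_m, cov_cm = covariance(heights_m, weights_kg), covariance(heights_cm, weights_kg)
r_m, r_cm = pearson(heights_m, weights_kg), pearson(heights_cm, weights_kg)

print(cov_m, cov_cm)  # covariance grows by a factor of 100
print(r_m, r_cm)      # correlation is unchanged
```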

r=1

Perfect positive linear relationship.

r=−1

Perfect negative linear relationship.

r=0

No linear relationship, but nonlinear relationship may still exist.

Pearson affected by outliers?

Yes. Use carefully if scatterplot has outliers.

Spearman correlation

Correlation based on ranks; useful for ordinal data or outliers.
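
One way to picture the rank idea: Spearman's coefficient is Pearson's r applied to the ranks. The sketch below assumes there are no tied values (ties would need average ranks) and uses a made-up monotone but nonlinear relationship.

```python
from math import sqrt

def ranks(values):
    """Rank 1 for the smallest value; assumes no ties, for simplicity."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]  # strictly increasing, but not a straight line

r_pearson = pearson(xs, ys)                 # below 1: relation is not linear
r_spearman = pearson(ranks(xs), ranks(ys))  # 1: ranks agree perfectly

print(r_pearson, r_spearman)
```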

Regression goal

Predict Y from X using a line.

Slope interpretation

Expected change in Y if X increases by 1.

Intercept interpretation

Predicted Y when X = 0; only meaningful if X = 0 is a realistic value.

Extrapolation danger

Prediction far outside observed X-range can be unreliable.

R^2 meaning

Share of variation in Y explained by the regression model.
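
The regression cards above (slope, intercept, R^2) can be tied together with a small least-squares calculation. This is a minimal sketch on invented data, using the usual simple-regression formulas; the variable names are chosen here for readability.

```python
xs = [1, 2, 3, 4, 5]  # hypothetical X values
ys = [2, 4, 5, 4, 5]  # hypothetical Y values

n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n

sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
sxx = sum((x - mx) ** 2 for x in xs)

slope = sxy / sxx            # expected change in Y per unit of X (0.6 here)
intercept = my - slope * mx  # predicted Y at X = 0 (about 2.2 here)

predicted = [intercept + slope * x for x in xs]
ss_residual = sum((y - p) ** 2 for y, p in zip(ys, predicted))
ss_total = sum((y - my) ** 2 for y in ys)
r_squared = 1 - ss_residual / ss_total  # share of variation explained (about 0.6)

print(slope, intercept, r_squared)
```

A prediction such as `intercept + slope * 50` is computable, but X = 50 lies far outside the observed range of 1 to 5, which is exactly the extrapolation danger.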

Random experiment

Experiment with uncertain outcome.

Sample space Ω

Set of all possible outcomes.

Event

Subset of the sample space.

Atomic event

Event with one outcome only.

Impossible event

Event that cannot happen, probability 0.

Certain event

Event that always happens, probability 1.

Disjoint events

Events that cannot happen together.

Independent events

Occurrence of one event does not change probability of the other.

Disjoint vs independent

Disjoint events with nonzero probabilities are dependent, because if one happens, the other cannot.
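
A fair die makes the distinction concrete. The sketch below checks independence through the product rule P(A and B) = P(A) · P(B); the events and the helper names `prob` and `independent` are chosen here for illustration.

```python
from fractions import Fraction

omega = set(range(1, 7))  # sample space of one fair die roll

def prob(event):
    """Classical probability: favourable outcomes / all outcomes."""
    return Fraction(len(event & omega), len(omega))

def independent(a, b):
    """Product rule: P(A and B) == P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
odd = {1, 3, 5}
low = {1, 2}

# Disjoint: even and odd cannot happen together, so P(A and B) = 0,
# while P(A) * P(B) = 1/4. They are therefore dependent.
print(prob(even & odd), independent(even, odd))

# Not disjoint, but independent: P(even and low) = 1/6 = 1/2 * 1/3.
print(prob(even & low), independent(even, low))
```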

Theoretical probability

Based on model/formula.

Empirical probability

Based on observed data or simulations.

Law of large numbers

With many repetitions, empirical average/probability approaches theoretical value.
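
The law of large numbers can be watched in a simulation: the empirical share of heads in coin flips settles near the theoretical probability 0.5 as the number of flips grows. This is a rough sketch; the seed and the flip counts are arbitrary choices.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same numbers

def heads_share(n_flips):
    """Empirical probability of heads in n_flips simulated fair coin flips."""
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    return heads / n_flips

for n in (10, 100, 10_000, 100_000):
    print(n, heads_share(n))
```

Small runs can land far from 0.5; large runs typically stay very close to it.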

Conditional probability

Probability of A, given that B happened.

Bayes idea

Updates probability after receiving new evidence.

Base-rate problem

Even with an accurate test, the posterior probability after a positive result can be low if the event is very rare.
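
A worked Bayes calculation shows the base-rate effect. The prevalence, sensitivity and specificity below are invented round numbers, and `posterior_given_positive` is a name introduced here, not a standard function.

```python
def posterior_given_positive(prevalence, sensitivity, specificity):
    """P(condition | positive test) via Bayes' rule."""
    true_positive = prevalence * sensitivity
    false_positive = (1 - prevalence) * (1 - specificity)
    return true_positive / (true_positive + false_positive)

# Hypothetical rare condition: 1 in 1000, with a test that is 99% accurate both ways.
rare = posterior_given_positive(prevalence=0.001, sensitivity=0.99, specificity=0.99)

# Same test, but the condition affects 1 in 10.
common = posterior_given_positive(prevalence=0.10, sensitivity=0.99, specificity=0.99)

print(rare)    # about 0.09: most positives are false alarms
print(common)  # about 0.92
```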

Random variable

Numerical outcome of a random experiment.

Discrete random variable

Countable possible values.

Continuous random variable

Uncountably many possible values, e.g. any value in an interval.

PMF (probability mass function) for discrete variables

Gives probabilities P(X = x).

Density for continuous variables

Area under curve gives probability.

Why P(X=c)=0 for continuous X?

A single point has zero area.

CDF

F(x)=P(X≤x).

Expected value

Long-run average/theoretical mean.

Variance

Theoretical spread around expected value.
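
For a discrete random variable, the probabilities P(X = x), the CDF, the expected value and the variance can all be computed directly. The sketch below uses a fair six-sided die as the example and exact fractions to avoid rounding.

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6.
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x): add up the probabilities of all values up to x."""
    return sum(p for value, p in pmf.items() if value <= x)

expected = sum(x * p for x, p in pmf.items())                    # 7/2
variance = sum((x - expected) ** 2 * p for x, p in pmf.items())  # 35/12

print(cdf(3), expected, variance)
```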

Quantile

Value below which a certain probability lies.

Bernoulli distribution

One trial, two outcomes: success/failure.

Binomial distribution

Number of successes in n independent Bernoulli trials with the same success probability.

Hypergeometric distribution

Number of successes when drawing n objects without replacement from a finite population.

Binomial vs hypergeometric

Binomial = with replacement/independent; hypergeometric = without replacement/dependent.
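
The two distributions can be compared on the same urn. The sketch below assumes an urn with 10 balls, 4 of them red, and 3 draws; the function names and the urn are illustrative choices.

```python
from math import comb

def binom_pmf(k, n, p):
    """P(k successes in n independent draws with replacement)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def hypergeom_pmf(k, N, K, n):
    """P(k successes in n draws without replacement; N objects, K successes)."""
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)

N, K, n = 10, 4, 3  # hypothetical urn: 10 balls, 4 red, draw 3

with_replacement = binom_pmf(2, n, K / N)        # about 0.288
without_replacement = hypergeom_pmf(2, N, K, n)  # 36/120 = 0.3

print(with_replacement, without_replacement)
```

When the population is much larger than the sample, the two results move close together, because removing a few objects barely changes the success probability.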

Poisson distribution

Counts rare independent events in fixed time/area.

Normal distribution

Symmetric distribution; used for measurement errors and many natural processes.

Standard normal

Normal distribution with mean 0 and variance 1.

z-transformation purpose

Converts any normal variable to standard normal.
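
The z-transformation is z = (x − μ) / σ. The sketch below, with made-up values μ = 100 and σ = 15, checks that a probability computed on the original scale matches the one read from the standard normal.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical mean and standard deviation
x = 130

z = (x - mu) / sigma  # 2.0: x lies two standard deviations above the mean

p_direct = NormalDist(mu, sigma).cdf(x)  # P(X <= 130) on the original scale
p_standard = NormalDist().cdf(z)         # P(Z <= 2) on the standard normal

print(z, p_direct, p_standard)  # both probabilities are about 0.977
```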

Central limit theorem

Sums/means of many independent variables with finite variance are approximately normally distributed.
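
A simulation gives a feel for the central limit theorem. The sketch below averages uniform random numbers, which are far from normal individually, and checks that the sample means behave roughly like a normal distribution. Sample size, number of repetitions and seed are arbitrary choices.

```python
import random
from statistics import mean, stdev

random.seed(1)  # fixed seed for repeatable output

n = 30       # size of each sample
reps = 5000  # number of simulated samples

# Each entry is the mean of n independent Uniform(0, 1) draws.
sample_means = [sum(random.random() for _ in range(n)) / n for _ in range(reps)]

center = mean(sample_means)  # close to the true mean 0.5
spread = stdev(sample_means)
theory_spread = (1 / 12) ** 0.5 / n ** 0.5  # sigma / sqrt(n) for Uniform(0, 1)

# For a normal distribution, about 68% of values lie within one standard deviation.
within_one_sd = sum(abs(m - center) <= spread for m in sample_means) / reps

print(center, spread, theory_spread, within_one_sd)
```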

Chi-square distribution

Sum of squared independent standard normal variables; the number of terms is the degrees of freedom.

Point estimator

One numerical estimate of an unknown population parameter.

Weakness of point estimator

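Very precise (a single value), but low reliability: for continuous parameters it almost never equals the true value exactly, and it gives no indication of uncertainty.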

Confidence interval

Interval of plausible values for unknown parameter.

Confidence level 1−α

Long-run probability that the method captures the true parameter.

Precision of CI

Shorter interval = higher precision.

Confidence vs precision

Higher confidence usually means wider interval.

Larger sample size effect

More precision and/or more confidence.
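
The trade-off between confidence, precision and sample size shows up directly in the interval width. The sketch below assumes the simplest textbook case, a confidence interval for a mean with known population standard deviation (x̄ ± z · σ / √n); with unknown σ a t-based interval would normally be used instead. All numbers and the helper names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci(xbar, sigma, n, confidence):
    """z-interval for the mean, assuming sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # e.g. about 1.96 for 95%
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def width(interval):
    return interval[1] - interval[0]

ci_95 = mean_ci(xbar=50, sigma=10, n=25, confidence=0.95)
ci_99 = mean_ci(xbar=50, sigma=10, n=25, confidence=0.99)
ci_95_large = mean_ci(xbar=50, sigma=10, n=100, confidence=0.95)

print(ci_95, width(ci_95))              # roughly 46.1 to 53.9
print(ci_99, width(ci_99))              # higher confidence: wider interval
print(ci_95_large, width(ci_95_large))  # larger sample: narrower interval
```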

Unbiased estimator

Hits the true parameter on average.

Consistent estimator

Converges to the true parameter as sample size increases.

Efficient estimator

Among unbiased estimators, has smallest variance.