Introduction to Data and Sampling Flashcards

Vocabulary terms and definitions from introductory statistics covering data types, study designs, and sampling methods.

Last updated 10:54 PM on 6/24/26

39 Terms

1

Summary statistic

A single number that condenses a lot of data, such as a proportion.

2

Treatment group

The group in a study that receives the intervention being tested.

3

Control group

The comparison group that does not receive the intervention; it serves as the reference point.

4

Data matrix

Data organized as rows (cases) × columns (variables).

5

Case / observational unit

One row in a data matrix representing one entity measured, such as a patient or a county.

6

Variable

One column in a data matrix representing a characteristic measured on every case.
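
As a minimal sketch of the data matrix, case, and variable cards, the layout can be modeled as a list of records. The county names, the values, and the `column` helper are hypothetical and used only for illustration.

```python
# Hypothetical data matrix: each dict is one case (row),
# and each key is one variable (column).
data_matrix = [
    {"county": "A", "population": 1200, "median_income": 41000.0},
    {"county": "B", "population": 560, "median_income": 38500.0},
    {"county": "C", "population": 9800, "median_income": 52250.0},
]


def column(matrix, variable):
    """Return the values of one variable across all cases."""
    return [case[variable] for case in matrix]


n_cases = len(data_matrix)          # number of rows
n_variables = len(data_matrix[0])   # number of columns
populations = column(data_matrix, "population")
```

Here `county` would be a nominal categorical variable, `population` a discrete numerical variable, and `median_income` a continuous numerical variable.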

7

Numerical variable

Values where arithmetic is meaningful.

8

Continuous numerical variable

Numerical values over a range, such as height or rate.

9

Discrete numerical variable

Numerical values that are counts or jumps, such as population or number of siblings.

10

Categorical variable

Values that represent categories.

11

Level

A possible value or category of a categorical variable.

12

Nominal categorical variable

Categorical values with no natural order, such as state names.

13

Ordinal categorical variable

Categorical values with a natural order, such as education levels ranging from "below hs" to "bachelors".

14

Scatterplot

A graph of two numerical variables, with one dot per case.

15

Associated (dependent) variables

Variables that show a discernible pattern together.

16

Positive association

A relationship where both variables move in the same direction.

17

Negative association

A relationship where one variable rises as the other falls.
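
One rough way to check the direction of an association between two numerical variables is the sign of their sample covariance. This is a sketch under that assumption: the cards do not prescribe a method, covariance only captures linear trends, and the data and function names below are hypothetical.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def direction(xs, ys):
    """Label the direction of the linear trend between xs and ys."""
    c = covariance(xs, ys)
    if c > 0:
        return "positive"
    if c < 0:
        return "negative"
    return "no linear trend"


# Hypothetical measurements on five cases.
hours_studied = [1, 2, 3, 4, 5]
test_score = [52, 60, 61, 70, 75]   # tends to rise with hours studied
errors_made = [9, 7, 6, 4, 2]       # tends to fall as hours studied rise
```

With these made-up values, `direction(hours_studied, test_score)` gives a positive association and `direction(hours_studied, errors_made)` a negative one.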

18

Independent variables

Variables with no evident relationship; variables are either associated or independent, never both.

19

Explanatory variable

The variable suspected of affecting the other.

20

Response variable

The variable suspected of being affected by the explanatory variable.

21

Observational study

A study in which data are collected without interfering with how they arise; it can show association only, not causation.

22

Experiment

A study where treatments are actively assigned to subjects.

23

Randomized experiment

An experiment where subjects are assigned to groups at random, which licenses causal claims.
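
Random assignment can be sketched as shuffling the subjects and splitting them into a treatment group and a control group. The `randomize` function, the subject labels, and the even split are illustrative assumptions, not a prescribed procedure.

```python
import random


def randomize(subjects, seed=None):
    """Randomly split subjects into treatment and control groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)   # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}


# Hypothetical subject identifiers.
subjects = [f"subject_{i}" for i in range(10)]
groups = randomize(subjects, seed=1)
```

Because chance alone decides who lands in each group, other variables should balance out across the groups on average, which is what supports the causal reading.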

24

Population (target population)

The full set of cases the question is about.

25

Sample

A subset of the population that is actually measured.

26

Anecdotal evidence

Haphazard data from a few striking cases that is usually unrepresentative.

27

Bias

Systematic skew that makes a sample unrepresentative.

28

Simple random sample

A sample where every case has an equal chance of being selected and the selection of one case tells you nothing about which other cases are selected, like a raffle.
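
The raffle idea can be sketched with Python's standard `random.sample`, which draws without replacement. The population of numbered tickets and the `simple_random_sample` wrapper are hypothetical.

```python
import random


def simple_random_sample(population, n, seed=None):
    """Draw n distinct cases, each equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(list(population), n)


# Hypothetical population: raffle tickets numbered 1 to 100.
population = list(range(1, 101))
srs = simple_random_sample(population, 10, seed=42)
```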

29

Non-response bias

Skew that occurs when sampled people do not respond, so those who do respond may not represent the whole sample.

30

Convenience sample

A potentially unrepresentative sample where only easily-reached cases are included.

31

Observational data

Data collected in a setting where no treatment is explicitly applied or withheld.

32

Confounding variable (lurking variable)

A variable correlated with both the explanatory and the response variable; it is the reason observational data cannot prove causation.

33

Prospective study

A study that follows cases forward as events unfold.

34

Retrospective study

A study that looks backward through records after events have occurred.

35

Stratified sampling

Splitting the population into groups of similar cases (strata), then taking a random sample within each stratum.

36

Cluster sampling

Splitting the population into clusters, randomly picking a few whole clusters, and taking all cases within them.

37

Multistage sampling

A process similar to cluster sampling, except that a random sample is taken within each chosen cluster instead of all of its cases.
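
The three designs differ only in where the random draw happens, which a short side-by-side sketch can make concrete. The function names, the group labels, and the sample sizes are hypothetical, and each group is assumed to be large enough for the requested draw.

```python
import random


def stratified_sample(strata, n_per_stratum, rng):
    """Random-sample within every stratum."""
    return {name: rng.sample(cases, n_per_stratum)
            for name, cases in strata.items()}


def cluster_sample(clusters, n_clusters, rng):
    """Randomly pick whole clusters and keep all of their cases."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}


def multistage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Randomly pick clusters, then random-sample within each one."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: rng.sample(clusters[name], n_per_cluster)
            for name in chosen}


# Hypothetical population: 5 groups of 20 cases each.
rng = random.Random(0)
groups = {f"g{i}": [f"g{i}_case{j}" for j in range(20)] for i in range(5)}

strat = stratified_sample(groups, 3, rng)      # all 5 groups, 3 cases each
clust = cluster_sample(groups, 2, rng)         # 2 groups, all 20 cases each
multi = multistage_sample(groups, 2, 4, rng)   # 2 groups, 4 cases each
```

Stratified sampling touches every group, while cluster and multistage sampling touch only the chosen groups, which is why the latter two are often cheaper to carry out.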

38

Sample statistic

A number computed from the sample used as an estimate.

39

Population parameter

The true value for the whole population, which is usually unknown.
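
The difference between a sample statistic and a population parameter can be illustrated with a made-up population whose true proportion is known. In practice the parameter is usually unknown; the population below, the sample size, and the variable names are assumptions chosen only for the sketch.

```python
import random

# Hypothetical population of 1000 cases: 1 = has the trait, 0 = does not.
population = [1] * 300 + [0] * 700

# Population parameter: the true proportion, known here only because
# the whole population was invented for the example.
parameter = sum(population) / len(population)

# Sample statistic: the same proportion computed from a simple random
# sample of 50 cases, used as an estimate of the parameter.
rng = random.Random(7)
sample = rng.sample(population, 50)
statistic = sum(sample) / len(sample)
```

The statistic varies from sample to sample, while the parameter stays fixed at 0.3 for this invented population.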