HL AI - Analyzing Data (stats)

32 Terms

Population

the entire group of individuals about which information is being collected

Census

collects data from every individual in the population

Sample

collects data from a subset of individuals in the population

Bias

when a study systematically under- or over-estimates a value because of how the data was collected

Convenience Sampling

choosing individuals who are easiest to reach (choose conveniently, no strata)

Quota Sampling

a sampling technique where a population is divided into groups and then participants are chosen conveniently (choose conveniently, with strata)

Simple Random Sampling

every member of the population has an equal probability of being selected for the sample (choose randomly, no strata)

Stratified Sampling

the population is divided into strata, then individuals are chosen randomly from each stratum (choose randomly, with strata)

Systematic Sampling

created by selecting members of the population at regular intervals, eg. every tenth person on a list of people in UWC grade 11 (choose at fixed intervals, ideally from a random starting point, no strata)
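
The random sampling methods above can be compared side by side in a short sketch. The student list, the two strata, the sample sizes, and the function names are all hypothetical and chosen only for illustration; the seed is fixed so the result is repeatable.

```python
import random

def simple_random_sample(population, n, seed=0):
    # Every member has the same chance of being picked (no strata).
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, seed=0):
    # Pick a random starting point, then take every `step`-th member.
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, seed=0):
    # Sample randomly *within* each stratum.
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population of 40 students split into two strata.
students = [f"student_{i:02d}" for i in range(1, 41)]
strata = {"grade_11": students[:20], "grade_12": students[20:]}

srs = simple_random_sample(students, 8)
systematic = systematic_sample(students, 10)
stratified = stratified_sample(strata, 4)

print(srs)
print(systematic)
print(stratified)
```

Convenience and quota sampling are not sketched because their selection step is not random: the researcher simply takes whoever is easiest to reach.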

Bivariate Data

data showing the relationship between two variables

Continuous Data

data that can take any value within a range, with no discrete groups (eg. height, arm span, etc)

Discrete Data

data that can take only certain separate values, with no intermediate values in between (eg. number of siblings, number of goals scored, etc)

Cumulative Frequency

the total frequency up to a certain data value
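
As a small illustration, cumulative frequency is a running total of the frequency column. The table below (number of siblings and how many students reported each value) is made-up data.

```python
from itertools import accumulate

# Hypothetical frequency table: number of siblings -> number of students.
values = [0, 1, 2, 3, 4]
frequencies = [3, 7, 5, 4, 1]

# Running total of the frequencies up to and including each value.
cumulative = list(accumulate(frequencies))

for value, freq, cum in zip(values, frequencies, cumulative):
    print(f"{value} siblings: frequency {freq}, cumulative frequency {cum}")

# cumulative is [3, 10, 15, 19, 20]
```

The last cumulative frequency always equals the total number of data points, here 20.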

Correlation

the degree of association between two variables (can be positive or negative)

Outlier

a value that is far outside the general range of the data: below Q1 − 1.5 × IQR or above Q3 + 1.5 × IQR
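
A minimal sketch of the 1.5 × IQR rule, using made-up data. It assumes Python's `statistics.quantiles` with `method="inclusive"` for the quartiles; calculators and textbooks may use a slightly different quartile convention, so Q1 and Q3 can differ a little from a GDC result. The function name `iqr_outliers` is hypothetical.

```python
import statistics

def iqr_outliers(data):
    # Quartiles via the "inclusive" convention (an assumption; conventions vary).
    q1, _median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower = q1 - 1.5 * iqr
    upper = q3 + 1.5 * iqr
    outliers = [x for x in data if x < lower or x > upper]
    return outliers, (lower, upper)

# Hypothetical data set with one suspiciously large value.
data = [2, 4, 5, 5, 6, 7, 8, 9, 30]
outliers, fences = iqr_outliers(data)

print(fences)    # (0.5, 12.5)
print(outliers)  # [30]
```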

Standard Deviation

the measure of how much a set of values differs from its mean (a measure of spread)

When you add to each data point


  • average (mean, median, mode) increases by the number added

  • spread (IQR, range, standard deviation, variance) stays the same

When you multiply each data point


  • average (mean, median, mode) is multiplied by the same number

  • spread (IQR, range, standard deviation) is multiplied by the absolute value of that number, while variance is multiplied by the square of that number
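
Both rules (adding a constant and multiplying by a constant) can be checked numerically. This is a small sketch with made-up data; the constants 10 and 3 are arbitrary, and `summary` is a hypothetical helper. Note that variance scales by the square of the multiplier because it is the standard deviation squared.

```python
import statistics

def summary(values):
    # Population versions of spread are used here; the scaling rules
    # hold for the sample versions as well.
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "range": max(values) - min(values),
        "stdev": statistics.pstdev(values),
        "variance": statistics.pvariance(values),
    }

data = [2, 4, 4, 6, 9]             # hypothetical data
shifted = [x + 10 for x in data]   # add 10 to each data point
scaled = [x * 3 for x in data]     # multiply each data point by 3

base = summary(data)
after_add = summary(shifted)
after_mult = summary(scaled)

for name, result in [("original", base), ("+10", after_add), ("x3", after_mult)]:
    print(name, result)
```

Adding 10 moves the mean and median up by 10 and leaves every spread measure unchanged; multiplying by 3 triples the mean, median, range, and standard deviation, and multiplies the variance by 9.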

Variance

the square of a data set’s standard deviation (so outliers are weighted more heavily than data that is closer to the mean)

Pearson’s Correlation Coefficient

a measure of the linear relationship between two variables, and it is used on the raw data (affected more by outliers)

Spearman’s Correlation Coefficient

a measure of the monotonic relationship between two variables, calculated on the ranks of the data rather than the raw values (affected less by outliers)
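
To see the difference between the two coefficients, here is a hand-rolled sketch on made-up data that rises steadily but not in a straight line. The helper names are hypothetical, and the ranking function assumes there are no tied values (ties need averaged ranks, which this sketch does not handle).

```python
import math

def pearson(xs, ys):
    # Pearson's r computed on the raw values.
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    # Rank 1 = smallest value. Assumes no ties.
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman(xs, ys):
    # Spearman's coefficient is Pearson's r applied to the ranks.
    return pearson(ranks(xs), ranks(ys))

# Hypothetical data: y doubles each time, so the trend is monotonic but curved.
x = [1, 2, 3, 4, 5, 6]
y = [2, 4, 8, 16, 32, 64]

r_pearson = pearson(x, y)
r_spearman = spearman(x, y)

print(round(r_pearson, 3))   # about 0.906: strong, but not perfectly linear
print(round(r_spearman, 3))  # 1.0: the ranks agree perfectly
```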

Reliability of data can be compromised by


  • missing data

  • small sample size

  • errors in handling

  • outliers

Causation Disclaimer

must remember that correlation between variables does not necessarily mean one causes the other

Extrapolation

estimating or predicting values beyond the known data points (could lead to incorrect predictions)

s vs σ

  • s estimates the spread of the entire population when your data is only a sample (calculated with n − 1 in the denominator)

  • σ is the spread of the entire population, used only when your data covers the entire population (calculated with n in the denominator)
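
In Python's standard library the two versions are separate functions: `statistics.stdev` divides by n − 1 (s) and `statistics.pstdev` divides by n (σ). A quick sketch on made-up data:

```python
import statistics

data = [2, 4, 4, 6, 9]  # hypothetical data

s = statistics.stdev(data)       # treat data as a sample: divide by n - 1
sigma = statistics.pstdev(data)  # treat data as the whole population: divide by n

print(round(s, 3))      # 2.646
print(round(sigma, 3))  # 2.366
```

For the same data, s is always slightly larger than σ, and the gap shrinks as the number of data points grows.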

Reliability

when a test produces similar results each time it is carried out

Validity

when a test measures what it claims to measure

Test-Retest

a technique to check reliability by repeating the same test with the same people at different times and comparing the results

Parallel Forms

a technique to check reliability by giving the same people two different but equivalent versions of a test at around the same time and comparing the results

Content Validity

refers to whether a test actually covers all aspects of the question it is trying to answer

Criterion Validity

the extent to which one test can predict the outcome of another test (eg. mock exams and IB exams)
