Biology HL statistics


57 Terms

1
New cards

Mean

sum of the data points divided by the number of data points
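
A minimal sketch of this definition in Python. The data values are made up for illustration and are not part of the notes.

```python
from statistics import mean

# Hypothetical data points, e.g. leaf lengths in cm (not from the notes)
values = [4.0, 5.0, 6.0, 7.0, 8.0]

# Sum of the data points divided by the number of data points
manual_mean = sum(values) / len(values)

# The standard library function gives the same result
assert manual_mean == mean(values)
print(manual_mean)
```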

2
New cards

Normal distribution

Bell curve.

Most likely to appear when the sample size is large (the means of large samples are more likely to be close to the population mean, so there is less variation between samples)

has a spike in the middle with the most values, and fewer on either side

Tall and narrow when values are closer together

flatter and wider if data is more spread out

mean value at the peak of the curve

3
New cards

x

represents a single value

4
New cards

n

represents the total number of values in a set

5
New cards

x̄ (x bar)

mean of a set of values

6
New cards

Σ

the sum of values

7
New cards

s

standard deviation of a sample
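
The symbols x, n, x̄, Σ and s fit together in the usual formulas for the mean and the sample standard deviation. The notes do not state the standard deviation formula, so the n − 1 version below is an assumption (it is the standard one for a sample).

```latex
\bar{x} = \frac{\Sigma x}{n}
\qquad
s = \sqrt{\frac{\Sigma (x - \bar{x})^{2}}{n - 1}}
```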

8
New cards

±

plus or minus

9
New cards

To do a t-test or analyse spread using standard deviation

requires a spread of data close to a normal distribution; this is why it's best to get as large a sample as possible

10
New cards

Standard deviation

Shows the spread of all the values around the mean, i.e. the variability of the data set. In a normal distribution, about 68% of the data lies within 1 standard deviation of the mean on the horizontal axis

Higher standard deviation: more spread around the mean

Lower standard deviation: less spread around the mean, values more clumped together

11
New cards

What percent of the values in a sample fall within ±1 standard deviation of the mean

68%

12
New cards

What percent of the values in a sample fall within ±2 standard deviations of the mean

95%
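
The 68% and 95% figures can be checked against an ideal normal distribution. This is a rough sketch using Python's standard library; the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution lying within +/- k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1_sd = share_within(1)   # about 0.68
within_2_sd = share_within(2)   # about 0.95
print(f"{within_1_sd:.1%}, {within_2_sd:.1%}")
```

Real samples only approximate these percentages, and do so better the closer they are to a normal distribution.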

13
New cards

variability

Measure of how spread out the data is from the centre of the data, which can be the mean

14
New cards

how to calculate two standard deviations from one

multiply by two

15
New cards

how to calculate standard deviation

In Excel, type =STDEV( and select all the cells that contain the data, then close the bracket
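
Outside Excel, the same calculation can be sketched in Python. The sample values are hypothetical. `statistics.stdev` is the sample standard deviation (it divides by n − 1), which is also what Excel's STDEV computes.

```python
from statistics import stdev

# Hypothetical sample (not from the notes)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

# Sample standard deviation: divides the sum of squared deviations by n - 1
s = stdev(sample)
print(round(s, 2))
```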

16
New cards

higher/lower standard deviation

Higher: more variation. Lower: less variation

17
New cards

Standard deviation can give additional information on

whether the differences between two samples are likely to be significant. The mean can be the same, but the spread around the mean can be different.

18
New cards

error bars

a way of showing either range or standard deviation of data, show variability

19
New cards

Range

spread of data from the lowest to highest value in the distribution

20
New cards

How do error bars work

The mean is plotted on either a bar graph or a scatter plot, and the error bar is drawn around the mean, either:

to show the highest and lowest values in the set (this shows the range; used on smaller sets), or

to show standard deviation (used on bigger sets, as then you have enough values for a normal distribution)

21
New cards

To find the range of values in standard deviation

add and subtract the standard deviation from the mean
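
As a rough illustration, the limits of a standard deviation error bar could be computed as below. The function name `sd_interval` and the sample values are made up for this sketch.

```python
from statistics import mean, stdev

def sd_interval(values, k=1):
    """Return (lower, upper) = mean minus/plus k standard deviations."""
    centre = mean(values)
    spread = k * stdev(values)
    return centre - spread, centre + spread

# Hypothetical sample (not from the notes)
sample = [2, 4, 4, 4, 5, 5, 7, 9]

low_1, high_1 = sd_interval(sample)        # one SD either side of the mean
low_2, high_2 = sd_interval(sample, k=2)   # two SDs: multiply the SD by two
print(low_1, high_1)
print(low_2, high_2)
```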

22
New cards

t-test

Used to find if the difference between two sets of data is significant

23
New cards

the t test compares

the mean and standard deviation of two sets of samples to see if they are the same or different (e.g. leaves on a tree at the front of the school vs leaves on a tree at the back)

24
New cards

how to calculate p value

A t value is calculated using a formula.

Find the degrees of freedom: the sum of the sample sizes of the two groups of data, minus two.

degrees of freedom = (n1 + n2) − 2, where n is the number of values in a sample

Look up the t value in a t table at that number of degrees of freedom to find the p value, which may be given as a range between two percentages.

You can also look up the critical t value for 5% and check whether your t value is greater or less than it, to estimate whether the difference is statistically significant.
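
The steps above can be sketched in Python. The notes do not give the t formula, so this sketch assumes the equal-variance (pooled) two-sample version, which matches the degrees of freedom n1 + n2 − 2. The leaf data, the function name `pooled_t` and the variable names are hypothetical; the critical value 2.101 is the standard two-tailed 5% table entry for 18 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample_1, sample_2):
    """Return (t, degrees_of_freedom) for an equal-variance two-sample t-test."""
    n1, n2 = len(sample_1), len(sample_2)
    df = n1 + n2 - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom
    pooled_var = ((n1 - 1) * variance(sample_1) + (n2 - 1) * variance(sample_2)) / df
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(sample_1) - mean(sample_2)) / standard_error, df

# Hypothetical leaf lengths (cm) from two trees
front = [10, 12, 11, 13, 14, 10, 12, 11, 13, 14]
back = [15, 17, 16, 18, 19, 15, 17, 16, 18, 19]

t, df = pooled_t(front, back)

# Critical t for p = 5% (two-tailed) at 18 degrees of freedom, from a t table
CRITICAL_T_DF18 = 2.101

# The sign of t only shows which mean is larger, so compare its size
significant = abs(t) > CRITICAL_T_DF18
print(df, significant)
```

Comparing against a table value keeps the sketch close to the manual method in the card; a statistics package would report the p value directly.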

25
New cards

p value

the probability that the difference between the two data sets was caused by chance

26
New cards

under 5% is

statistically significant: there is less than a 5% probability that the difference between the two data sets was caused by chance, so you reject the null hypothesis

27
New cards

5 % is a

critical value. Scientists use it because living things have natural variation that can cause differences between data sets by chance; below this threshold, the differences are taken to be no longer due to chance

28
New cards

Null hypothesis

there is no significant difference between the two data sets (above 5%)

29
New cards

Alternative hypothesis

there is a significant difference between the two data sets (below 5%)

30
New cards

What if you get a value like 6-15%

The conclusion is less certain. If you still suspect the null hypothesis holds (that there is no significant difference), repeat with a bigger sample size

31
New cards

population vs sample set

Population: all the students in the class. Sample: a subset of them, e.g. the top ten students

32
New cards

Correlation

describes the degree of a relationship between two variables

33
New cards

what can correlation do

It can suggest a causal relationship between 2 variables, but cannot establish one on its own

34
New cards

Positive correlation

as x value increases, so does y value

35
New cards

Negative correlation

as x value increases, y value decreases (an inverse relationship)

36
New cards

Causal relationship

When a change in one variable causes the change in the other (e.g. more vaccines, fewer deaths: a negative correlation)

37
New cards

Does a trend always mean a causal relationship

No, the two variables can be totally unrelated and still show a trend. Experiments must be used to provide evidence showing the cause of the correlation

38
New cards

Median

Middle value of a set of results when they are arranged in order. Useful if you have outliers, because it is not pulled towards them the way the mean is

39
New cards

Mode

value that appears the greatest number of times
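
A small sketch of how the mean, median and mode respond to an outlier, using Python's standard library. The results list is made up for illustration.

```python
from statistics import mean, median, mode

# Hypothetical results with one outlier (100)
results = [2, 3, 3, 4, 100]

print(mean(results))    # pulled upwards by the outlier
print(median(results))  # middle value of the sorted results
print(mode(results))    # value that appears the greatest number of times
```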

40
New cards

Continuous variation

Quantitative

Different characteristics within a population

Range

Height, body mass, intelligence

41
New cards

Discontinuous

You either have it or you don't

Distinct features

Qualitative

Tongue roll, ear lobe, blood group

42
New cards

Positively skewed

tail on the positive end

43
New cards

negatively skewed

tail on the negative end

44
New cards

Error analysis

evaluating the uncertainty associated with a data measurement

45
New cards

Double blinded

Neither the doctor nor the patients know who has the placebo and who has the real treatment

46
New cards

Interquartile range

Calculate the main median, then split the data into the two halves either side of it and calculate their medians (the lower quartile, 25%, and the upper quartile, 75%). Subtract the lower quartile from the upper quartile to find the IQR. It shows the range of the middle 50% of your sample and is useful with outliers
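
The median-of-halves method described in this card might look like this in Python. The function name and data are hypothetical, and leaving the main median out of both halves for odd-sized sets is one common convention (others exist and can give slightly different quartiles).

```python
from statistics import median

def interquartile_range(values):
    """IQR by the median-of-halves method."""
    ordered = sorted(values)
    half = len(ordered) // 2
    lower_half = ordered[:half]
    # With an odd number of values, leave the main median out of both halves
    upper_half = ordered[half + 1:] if len(ordered) % 2 else ordered[half:]
    return median(upper_half) - median(lower_half)

iqr_even = interquartile_range([1, 3, 5, 7, 9, 11, 13, 15])
iqr_odd = interquartile_range([7, 1, 4, 2, 6, 3, 5])
print(iqr_even, iqr_odd)
```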

47
New cards

standard error

Calculates how representative your sample is of your population, how accurate a random sample’s mean would be in comparison with the population’s mean

Calculated using SD and sample size

You can decrease this by having a larger sample size
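
The notes say standard error is calculated from SD and sample size without giving the formula. The sketch below assumes the usual standard error of the mean, s / √n; the function name and data are made up.

```python
from math import sqrt
from statistics import stdev

def standard_error(values):
    """Standard error of the mean: s / sqrt(n)."""
    return stdev(values) / sqrt(len(values))

# Hypothetical sample (not from the notes)
small_sample = [2, 4, 4, 4, 5, 5, 7, 9]
# Same values repeated four times: similar spread, larger n
large_sample = small_sample * 4

se_small = standard_error(small_sample)
se_large = standard_error(large_sample)
print(se_small > se_large)
```

Because n sits under a square root in the denominator, a larger sample with a similar spread gives a smaller standard error.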

48
New cards

Standard error vs. standard deviation

Standard deviation is variability within a sample; standard error is variability across samples

49
New cards

High Standard error vs low standard error

High: sample means are widely spread around the population mean

Low: sample means are closely distributed around the population mean

50
New cards

Higher mean on graph

Bell curve shifts to the right

51
New cards

Higher frequency of the mean on graph

Bell curve becomes taller and thinner

52
New cards

Pearson correlation coefficient r and r²

Used when data is continuous and normally distributed

r measures the strength and direction of the linear relationship between two variables

(+1: perfect positive correlation, 0: no correlation, −1: perfect negative correlation)

r² measures how well the data fits the linear model, i.e. how much of the variability in the data the line explains and how far the points are from the line

0% represents a model that does not explain any of the variance in the data

100% represents a model that explains all of the variance in the data

aim for 80% or higher
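
A minimal sketch of calculating r and r² by hand in Python, using the standard product-moment formula (the notes do not state it). The function name and paired values are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values."""
    x_bar, y_bar = mean(xs), mean(ys)
    covariation = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    x_spread = sum((x - x_bar) ** 2 for x in xs)
    y_spread = sum((y - y_bar) ** 2 for y in ys)
    return covariation / sqrt(x_spread * y_spread)

# Hypothetical paired measurements
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r = pearson_r(xs, ys)
r_squared = r ** 2   # proportion of the variance explained by the linear model
print(round(r, 2), round(r_squared, 2))
```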

53
New cards

Spearman test

Can be used with discontinuous data as well as continuous

Its coefficient, r s, is read in the same way as the r explained above: it describes the strength and direction of the relationship between the two variables
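
A rough sketch of the Spearman calculation, assuming no tied values so that the shortcut formula 1 − 6Σd² / (n(n² − 1)) applies, where d is the difference between the two ranks of each pair. Function names and data are made up.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical data: y always rises with x, but not in a straight line
xs = [1, 2, 3, 4, 5]
ys = [1, 4, 9, 16, 25]

rs = spearman_rs(xs, ys)
print(rs)
```

Because it works on ranks rather than raw values, the coefficient only depends on the order of the data, not on whether the relationship is a straight line.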

54
New cards

error bars overlap

share the same values

55
New cards

Variation

A general description of the difference between any two measurements

56
New cards

How to know if the difference between the two sets of data in a t-test is caused by the independent variable

The calculated t value will be higher than the critical t value for p = 5%, showing that the p value is lower than 5% and the differences are most likely not due to chance

57
New cards

as p value decreases

t value increases
