MCAT Physics and Math - Data-Based and Statistical Reasoning

0.0(0)
studied byStudied by 8 people
learnLearn
examPractice Test
spaced repetitionSpaced Repetition
heart puzzleMatch
flashcardsFlashcards
Card Sorting

1/56

Study Analytics
Name
Mastery
Learn
Test
Matching
Spaced

No study sessions yet.

57 Terms

1
New cards

Measures of central tendency

describe the middle(s) of a sample

2
New cards

mean / (arithmetic) average

calculated by adding up all of the individual values within the data set and dividing the result by the number of values

where xi to xn are the values of all of the data points in the set and n is the number of data points in the set

good indicator of central tendency when all of the values tend to be fairly close to one another

<p>calculated by adding up all of the individual values within the data set and dividing the result by the number of values</p><p>where x<sub>i</sub> to x<sub>n</sub> are the values of all of the data points in the set and n is the number of data points in the set</p><p>good indicator of central tendency when all of the values tend to be fairly close to one another</p>
3
New cards

outlier

extremely large or extremely small value compared to the other data values

4
New cards

median

midpoint, where half of data points are greater than the value and half are smaller; data set must first be listed in increasing fashion

where n is the number of data values

least susceptible to outliers, but may not be useful for data sets with very large ranges or multiple modes

<p>midpoint, where half of data points are greater than the value and half are smaller; data set must first be listed in increasing fashion</p><p>where n is the number of data values</p><p>least susceptible to outliers, but may not be useful for data sets with very large ranges or multiple modes</p>
5
New cards

mode

the number that appears the most often in a set of data; may be multiple, one or none; represented graphically as peaks; not directly used as measure of central tendency but relationships can be enlightening

6
New cards

normal distribution / bell curve

mean = median = mode

68% of the distribution is within one standard deviation of the mean, 95% within two, and 99% within three

<p>mean = median = mode</p><p>68% of the distribution is within one standard deviation of the mean, 95% within two, and 99% within three</p>
7
New cards

standard distribution

mean of zero and a standard deviation of one, can be extrapolated from any normal distribution

8
New cards

skewed distribution

one that contains a tail on one side or the other of the data set; visual shift in the data appear opposite the direction of the skew

<p>one that contains a tail on one side or the other of the data set; visual shift in the data appear opposite the direction of the skew</p>
9
New cards

negative skew

tail to left

mean < median < mode

<p>tail to left</p><p>mean &lt; median &lt; mode</p>
10
New cards

positive skew

tail to right

mode < median < mean

<p>tail to right</p><p>mode &lt; median &lt; mean</p>
11
New cards

bimodal

distribution containing two peaks with a valley in between; might have only one mode if one peak is slightly higher than the other; can often be analyzed as two separate distribution

<p>distribution containing two peaks with a valley in between; might have only one mode if one peak is slightly higher than the other; can often be analyzed as two separate distribution</p>
12
New cards

range

difference between its largest and smallest values; does not consider the number of items of the data set, nor the placement of any measures of central tendency; possible to approximate the standard deviation as one-fourth

range = xmax − xmin

13
New cards

Quartiles

divide data (when placed in ascending order) into groups that comprise one-fourth of the entire set

14
New cards

Calculate quartiles

To calculate the position of the first quartile (Q1) in a set of data sorted in ascending order, multiply n by ¼.

median (Q2)

To calculate the position of the third quartile (Q3), multiply the value of n by ¾.

  • If this is a whole number, the quartile is the mean of the value at this position and the next highest position.

  • If this is a decimal, round up to the next whole number, and take that as the quartile position.

15
New cards

Interquartile range

calculated by subtracting the value of the first quartile from the value of the third quartile

IQR = Q3 – Q1

16
New cards

outlier

Any value that falls more than 1.5 interquartile ranges below the first quartile or above the third quartile OR that lies more than three standard deviations from the mean

may be: true statistical anomaly, measurement error, distribution that is not approximated by the normal distribution

17
New cards

Standard deviation

calculated by taking the difference between each data point and the mean, squaring this value, dividing the sum of all of these squared values by the number of points in the data set minus one, and then taking the square root of the result

where σ is the standard deviation, xi to xn are the values of all of the data points in the set, is the mean, and n is the number of data points in the set.

<p>calculated by taking the difference between each data point and the mean, squaring this value, dividing the sum of all of these squared values by the number of points in the data set minus one, and then taking the square root of the result</p><p>where σ is the standard deviation, xi to xn are the values of all of the data points in the set, is the mean, and n is the number of data points in the set.</p>
18
New cards

independent events

have no effect on one another

19
New cards

Dependent events

have an impact on one another, such that the order changes the probability

20
New cards

Mutually exclusive outcomes

cannot occur at the same time

21
New cards

exhaustive

there are no other possible outcomes

22
New cards

probability of two or more independent events

product of their probabilities alone

P(A ∩ B) = P(A and B) = P(A) × P(B)

23
New cards

probability of at least one of two independent events

equal to the sum of their initial probabilities, minus the probability that they will both occur.

P(A ∪ B) = P(A or B) = P(A) + P(B) − P(A and B)

24
New cards

Hypothesis testing

begins with an idea about what may be different between two populations

25
New cards

null hypothesis

always a hypothesis of equivalence; says that two populations are equal, or that a single population can be described by a parameter equal to a given value

26
New cards

alternative hypothesis

a hypothesis contrary to the null hypothesis

27
New cards

nondirectional

alternative hypothesis that the populations are not equal

28
New cards

directional

alternative hypothesis that the mean of population A is greater than the mean of population B

29
New cards

z- or t-tests

most common hypothesis tests which rely on the standard distribution or the closely related t-distribution

30
New cards

test statistic

calculated and compared to a table to determine the likelihood that that statistic was obtained by random chance

31
New cards

p-value

the likelihood that that statistic was obtained by random chance under the assumption that our null hypothesis is true

32
New cards

significance level (α)

comparison of p-value; 0.05 is commonly used

p-value > α, fail to reject the null hypothesis - not a statistically significant difference between the two populations

p-value < α, reject the null hypothesis - there is a statistically significant difference between the two groups

33
New cards

type I error

the likelihood that we report a difference between two populations when one does not actually exist; probability is α; false positive

34
New cards

type II error

incorrectly fail to reject the null hypothesis; probability is β; false negative

35
New cards

power

The probability of correctly rejecting a false null hypothesis; equal to 1 − β

36
New cards

confidence

probability of correctly failing to reject a true null hypothesis

37
New cards

Results of Hypothesis Testing

knowt flashcard image
38
New cards

Confidence intervals

essentially the reverse of hypothesis testing; determine a range of values from the sample mean and standard deviation

39
New cards

Charts

present information in a visual format and are frequently used for categorical data

40
New cards

Pie / circle charts

used to represent relative amounts of entities and are especially popular in demographics; may be labeled with raw numerical values or with percent values

as the number of represented categories increases, the visual representation loses impact and becomes confusing

<p>used to represent relative amounts of entities and are especially popular in demographics; may be labeled with raw numerical values or with percent values</p><p>as the number of represented categories increases, the visual representation loses impact and becomes confusing</p>
41
New cards

Bar charts

used for categorical data, which sort data points based on predetermined categories; may then be sorted by increasing or decreasing bar length; length of a bar is generally proportional to the value it represents

breaks should be avoided in the chart because of the potential to distort scale

<p>used for categorical data, which sort data points based on predetermined categories; may then be sorted by increasing or decreasing bar length; length of a bar is generally proportional to the value it represents</p><p>breaks should be avoided in the chart because of the potential to distort scale</p>
42
New cards

Histograms

present numerical data rather than discrete categories; particularly useful for determining the mode of a data set because they are used to display the distribution of a data set

43
New cards

Box plots

used to show the range, median, quartiles and outliers for a set of data

44
New cards

box-and-whisker

a labeled box plot; box is bounded by Q1 and Q3; Q2 is the line in the middle of the box; ends of the whiskers correspond to maximum and minimum values of the data set

<p>a labeled box plot; box is bounded by Q1 and Q3; Q2 is the line in the middle of the box; ends of the whiskers correspond to maximum and minimum values of the data set</p>
45
New cards

Maps

data can be illustrated geographically; relatively easy to comprehend and may show geographic clustering for some data

<p>data can be illustrated geographically; relatively easy to comprehend and may show geographic clustering for some data</p>
46
New cards

Linear graphs

show the relationships between two variables; curve may be linear, parabolic, exponential, or logarithmic; axes of a linear graph will be consistent in the sense that each unit will occupy the same amount of space

47
New cards

linear shape graph

knowt flashcard image
48
New cards

Parabolic graph

knowt flashcard image
49
New cards

Exponential graph

knowt flashcard image
50
New cards

Logarithmic graph

knowt flashcard image
51
New cards

Slope (m)

change in the y-direction divided by the change in the x-direction for any two points

<p>change in the y-direction divided by the change in the x-direction for any two points</p>
52
New cards

Semilog graphs

specialized representation of a logarithmic data set; curved nature of the logarithmic data is made linear by a change in the axis ratio

<p>specialized representation of a logarithmic data set; curved nature of the logarithmic data is made linear by a change in the axis ratio</p>
53
New cards

axis ratio

spacing based on a ratio, usually 10, 100, 1000, and so on

54
New cards

Tables

more likely to contain disjointed information than either charts or graphs because they often contain categorical data or experimental results; significant organization is likely to be relevant; should be able to convert it to a rough graph or to a linear equation

55
New cards

Correlation

a connection—direct relationship, inverse relationship, or otherwise—between data

56
New cards

correlation coefficient

number between –1 and +1 that represents the strength and direction of the relationship

+1 = strong positive relationship

–1 = strong negative relationship

0 = no apparent relationship

57
New cards

causation

manipulation of one variable is the reason for an effect in another