Stats 301 Exam 1 Review


89 Terms

1
New cards

Defining Statistics

Set of tools & techniques used to describe, organize, and interpret data

2
New cards

Goals of science & how stats helps achieve those

Stats helps describe, predict, and explain data

3
New cards

Descriptive Stats

Organize and describe data

4
New cards

Inferential Stats

Infer (guess) something about a larger group (population) from smaller groups (sample)

5
New cards

What is a sample?

A portion or subset OF the population

6
New cards

What is a population?

The overarching group you are studying (large)

7
New cards

What is a variable in stats?

Something that can change (vary) or have different values for different individuals

EX: Age, Major, etc

8
New cards

What is data in stats?

Information collected from the sample on the variables we are interested in (actual numbers & measurements & characteristics)

EX: Engineering, psych, business OR 18,19,20,21, etc

9
New cards

What is continuous data?

variables that can assume any value along some underlying continuum.

EX: height, weight, time

10
New cards

What is categorical data?

a variable that can take on one of a limited, usually fixed, number of possible values.

EX: political affiliation, marital status, and education level

11
New cards
12
New cards

What is central tendency?

a statistical measure that identifies a typical (average) value in a data distribution

EX: mean, median, and mode

13
New cards

What is the mean and how do you calculate it?

The AVERAGE of the data

  • most sensitive to outliers

  • best used when there are NO extreme values in the data set

How to calculate:

x̄ = Σx / n (the sum of all values divided by the number of values)
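A minimal sketch of the mean calculation above. The `ages` list is a made-up sample, used only to show how a single extreme value pulls the mean.

```python
def mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    if not values:
        raise ValueError("mean requires at least one value")
    return sum(values) / len(values)


ages = [18, 19, 20, 21, 22]        # hypothetical sample, no extreme values
print(mean(ages))                  # 20.0
print(mean(ages + [70]))           # one extreme value pulls the mean upward
```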

14
New cards

What is the median and how do you calculate it?

The MIDDLE number in a data set

  • NOT sensitive to extreme values

  • Use when extreme values ARE present

How to calculate:

  1. Put data in numerical order

  2. If an odd number of values, find the value in the center

OR

  • If even number of values, find the two values in the center, add them, and divide by 2.
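The median steps above can be sketched as follows. The sample values are hypothetical.

```python
def median(values):
    """Return the middle value of the data after sorting."""
    if not values:
        raise ValueError("median requires at least one value")
    ordered = sorted(values)             # step 1: numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                       # odd count: take the center value
        return ordered[mid]
    # even count: average the two center values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([18, 19, 20, 21, 22]))      # odd number of values
print(median([18, 19, 20, 21, 22, 70]))  # even number; the extreme value barely matters
```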

15
New cards

What is the mode, and how do you calculate it?

The MOST FREQUENT occurring value in the data set

  • typically used in CATEGORICAL data

  • you CAN have multiple modes in the data set (bimodal or multimodal)

  • LEAST precise and LEAST affected by extreme values

How to calculate:

  1. put values in numerical order

  2. identify the value that occurs MOST often

  3. if two or more values tie for most frequent, they are ALL modes of the data set
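One way to sketch the mode, including ties, using a frequency count. The `majors` list is a hypothetical categorical sample.

```python
from collections import Counter


def modes(values):
    """Return every value that ties for the highest frequency, sorted."""
    if not values:
        raise ValueError("modes requires at least one value")
    counts = Counter(values)                 # value -> number of occurrences
    highest = max(counts.values())
    return sorted(v for v, c in counts.items() if c == highest)


majors = ["psych", "business", "psych", "engineering", "business"]
print(modes(majors))        # two values tie, so both are modes
print(modes([1, 2, 2, 3]))  # a single mode
```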

16
New cards

When to use which measure of central tendency?

3 Rules

  1. Use mode when data is CATEGORICAL

  2. Use mean when the data is CONTINUOUS and NO outliers

  3. Use median when the data is CONTINUOUS and you think the mean is misleading because of extreme scores

When in doubt, report BOTH!

17
New cards

How do extreme values affect which of the mean, median, and mode to use?

  • Mean = DON’T use for extreme values

  • Median = can use for extreme values

  • Mode = can use for extreme values

18
New cards

What is the measure of Variability?

Tells us how DIFFERENT the scores are from each other.

  • represent the spread or dispersion in the dataset

19
New cards

Why is variability important?

helps us understand the nature of our SAMPLE and the nature of our VARIABLES

20
New cards

What are the 3 measures of variability?

  • Range

  • Standard Deviation

  • Variance

21
New cards

What is range and how do we calculate it?

The DIFFERENCE between the highest and lowest score of a data set

  • only considers MOST EXTREME values

  • not very accurate

How to calculate:

Range = h - l
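A small sketch of the range formula; `value_range` is a hypothetical helper name and the scores are made up.

```python
def value_range(values):
    """Return the highest score minus the lowest score."""
    if not values:
        raise ValueError("range requires at least one value")
    return max(values) - min(values)


scores = [18, 19, 20, 21, 70]
print(value_range(scores))   # driven entirely by the two most extreme values
```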

22
New cards

What is standard deviation and how do we calculate it?

The AVERAGE distance scores are from the MEAN

  • The most commonly used measure of variability

  • SMALLER stand dev. means scores are closer to the mean

  • LARGE stand dev. means scores are further away from the mean

How to calculate:

(x - x̄) = a single deviation (one score minus the mean)

Σ(x - x̄)² = sum of ALL squared deviations
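The formula image for this card did not survive. As a sketch, here is the usual sample standard deviation built from the pieces named above. Dividing by n - 1 is an assumption about which version the course uses, and the scores are made up.

```python
import math


def sample_sd(values):
    """Sample standard deviation: sqrt(sum of squared deviations / (n - 1))."""
    n = len(values)
    if n < 2:
        raise ValueError("sample_sd requires at least two values")
    x_bar = sum(values) / n
    squared_deviations = [(x - x_bar) ** 2 for x in values]   # (x - x̄)² per score
    return math.sqrt(sum(squared_deviations) / (n - 1))       # n - 1 is assumed


scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(round(sample_sd(scores), 3))
```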

23
New cards

What is variance and how do we calculate it?

The standard deviation SQUARED

  • rarely used to report descriptive stats

  • more used as a concept

How to calculate:

Variance = SD²

24
New cards

What are the important Standard Deviation concepts?

  • By definition, the deviations from the mean always sum (and average) to ZERO, for any distribution and not only a normal one

  • Because of this, we must square the deviations

  • Values are squared so that they do NOT cancel each other out

  • SD is sensitive to extreme values

  • We use the sq root to REVERT back to original units
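A quick check of why the deviations are squared, using made-up scores: the raw deviations around the mean cancel out, while the squared deviations do not.

```python
scores = [2, 4, 4, 4, 5, 5, 7, 9]           # hypothetical data, mean is 5
x_bar = sum(scores) / len(scores)

deviations = [x - x_bar for x in scores]    # positives and negatives cancel
squared = [d ** 2 for d in deviations]      # squaring removes the signs

print(sum(deviations))   # 0.0
print(sum(squared))      # 32.0
```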

25
New cards

What is an outlier or extreme value?

A data point that appears to deviate markedly from other data points in the sample

26
New cards

What is the rule of thumb for outliers and extreme values?

  • Anything more than two standard deviations away from the mean is a potential outlier.

  • Anything more than three standard deviations away from the mean is likely an outlier.

27
New cards

Formula to calculate outliers

x̄ ± (c × s), where c is the cutoff value (e.g., 2 or 3) and s is the standard deviation
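A sketch of the cutoff formula applied to a made-up data set. The helper names are hypothetical, and the default cutoff of 2 follows the "potential outlier" rule of thumb.

```python
import statistics


def outlier_bounds(values, c=2):
    """Return (low, high) = mean -/+ c * sample standard deviation."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)
    return x_bar - c * s, x_bar + c * s


def flag_outliers(values, c=2):
    """Return the values that fall outside the cutoff bounds."""
    low, high = outlier_bounds(values, c)
    return [x for x in values if x < low or x > high]


data = [10, 11, 12, 11, 10, 12, 11, 10, 12, 11, 40]
print(outlier_bounds(data))
print(flag_outliers(data))   # values more than 2 SDs from the mean
```

Note that the extreme value itself inflates both the mean and the standard deviation, which is one reason the cutoffs are only a rule of thumb.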

28
New cards

How do you use standard deviation to understand an individual data point?

  • determine how far the point deviates from the mean (avg) of the dataset comparing it to the overall data spread

  • calculate the mean and standard deviation

  • find the “z” score and use the outlier identification formula

29
New cards

What is a “Z” score aka standard score?

The raw scores that have been adjusted for the mean and standard deviation of the distribution from which the raw scores came.
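As a sketch, a z score is usually computed as z = (x - x̄) / s, which expresses a raw score as a number of standard deviations above or below the mean. The data below are hypothetical.

```python
import statistics


def z_score(x, values):
    """Return how many sample standard deviations x lies from the mean."""
    x_bar = statistics.mean(values)
    s = statistics.stdev(values)      # raises if fewer than two values
    return (x - x_bar) / s


scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(z_score(5, scores))             # a score at the mean has z = 0
print(round(z_score(9, scores), 2))   # positive z: above the mean
```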

30
New cards

What are histograms and how do you identify them?

  • They show distributions of continuous variables

  • The height of the bar is the number of times that value occurs

  • The bars touch on the graph

31
New cards

What are bar graphs and how do you identify them?

  • They show the frequency of categorical responses

  • The bars have spaces in between them on graph

32
New cards

How is central tendency described as a distribution?

  • Distributions can differ in central tendency (where the center falls) but not differ otherwise

  • the mean, median, and mode are all the same within each symmetrical distribution

  • aka the same variability, different average

33
New cards

How is variability described as a distribution?

  • Can have the same central tendency - but different amounts of variability

  • Some can have the same range but different standard deviations

34
New cards

What is skewness and how is it described in a distribution?

The lack of symmetry in a graph

35
New cards

What is a positive skew and which way does the tail face

When the curve's tail is on the right side of the graph.

  • Mode sits at the highest point (peak) of the curve, on the left side

  • The median is typically in the middle

  • Mean is pulled toward the tail on the right side, where the curve is lowest

36
New cards

What is a negative skew and which way does the tail face?

When the curve's tail is on the left side of the graph

  • Mode sits at the highest point (peak) of the curve, on the right side

  • The median is in the middle

  • Mean is pulled toward the tail on the left side

37
New cards

What does skewness reflect about the mean, median, and mode?

Skewness reflects where the mean, median, and mode fall relative to one another

38
New cards

What is the floor effect?

When there is a bottom bound for the values of a data set. MUCH of the data falls around the BOTTOM bound.

  • creates a positive skew!

  • majority values fall on the LOW end of the distribution

39
New cards

What is the ceiling effect?

When there is an upper bound for the values of the data set

  • Creates a negative skew

  • majority of the values fall at the HIGH end of the distribution

40
New cards

What is kurtosis?

How peaked vs flat the distribution is

41
New cards

What is platykurtic?

LOW kurtosis

  • relatively FLAT

  • HIGH variability

42
New cards

What is leptokurtic?

HIGH kurtosis

  • relatively PEAKED

  • LOW variability

43
New cards

What can make graphs misleading?

Graphs become misleading when the visual representation is distorted, for example by manipulating the axes or scales

44
New cards

What are correlations?

How changes in one variable relate to changes in another variable

  • THE RELATIONSHIP BETWEEN TWO VARIABLES

45
New cards

When do we use correlations?

They are used when you want to quantify the strength and direction of a linear relationship between two continuous variables

46
New cards

What is a correlation coefficient?

a single number that describes the relationship between two variables

47
New cards

How is correlation coefficient abbreviated, and what does it range from?

  • Abbreviated as “r”

  • Ranges from -1 to 1

48
New cards

What is direction in correlation coefficient?

The sign of the coefficient tells us the direction in which one variable changes relative to the other

49
New cards

What is the relationship of a positive coefficient?

DIRECT relationship

  • as x increases, y increases

50
New cards

What is the relationship of a negative coefficient?

INVERSE relationship

  • as x increases, y decreases

51
New cards

What is strength of a correlation coefficient?

The closer the coefficient is to -1 or 1, the stronger the relationship is

52
New cards

What are scatterplots in relation to correlations?

A chart or graph that uses dots to represent values for two different numeric variables

53
New cards

What is an important idea to remember about correlation coefficient?

Correlation does NOT equal causation. Just because two variables are closely related, does not mean that one causes the other.

54
New cards

Understand the chart of correlation relationships

55
New cards

Understand scatter plots and correlation examples

56
New cards

What are the limitations of correlation coefficients?

  • Can only be used to identify LINEAR relationships

  • NO curvilinear relationships

  • Restriction of range

57
New cards

What is the restriction of range?

When there are too many scores that have similar values for a variable, the coefficient cannot capture the true relationship.

58
New cards

Do outliers have a significant effect on correlation coefficients?

YES! They have a huge impact on the correlation coefficient.

59
New cards

What is the coefficient of determination? And how do we calculate it?

The representation of how much variance two variables share

  • how much of the variance in x can be accounted for by y (and vice versa)

    How to calculate it?

  • simply square the coefficient!

60
New cards

How do we calculate/compute the correlation coefficient?

The formula used:

  • rxy = the correlation between x and y

  • n is the sample size

  • X is each individual's score on the X variable

  • Y is each individual’s score on the Y variable

  • XY is the product of each X score times its corresponding Y score

  • X² is each individual's X score squared

  • Y² is each individual’s Y score squared
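The formula itself is missing from this card (it was likely an image). The terms listed match the standard raw-score formula for Pearson's r, so the sketch below assumes that version: r = [nΣXY - ΣXΣY] / sqrt([nΣX² - (ΣX)²][nΣY² - (ΣY)²]). The data are made up.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation via the raw-score (computational) formula."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)                                   # sample size
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # ΣXY
    sum_x2 = sum(x * x for x in xs)               # ΣX²
    sum_y2 = sum(y * y for y in ys)               # ΣY²
    numerator = n * sum_xy - sum_x * sum_y        # how much X and Y go together
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )                                             # how much X and Y vary on their own
    return numerator / denominator


hours = [1, 2, 3, 4, 5]
print(pearson_r(hours, [2, 4, 6, 8, 10]))   # perfect direct relationship
print(pearson_r(hours, [10, 8, 6, 4, 2]))   # perfect inverse relationship
```

The function divides by zero if either variable has no variability at all, which mirrors the fact that r is undefined in that case.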

61
New cards

What are the numerator and denominator relationships when computing a correlation coefficient?

numerator = how much do x and y go together

denominator = how much do x and y vary on their own

62
New cards

What is an example on how to report a correlation coefficient?

We found a strong/weak, positive/negative correlation between ___ and ___ (r = ___), suggesting that ...

63
New cards

What is coefficient of determination?

The more two variables have in common, the more variance they share

64
New cards

What is the coefficient of alienation (aka coefficient of nondetermination)?

The variance that is left over (NOT shared) after calculating r², i.e., 1 - r²

65
New cards

What is a correlation matrix?

A simple way to report a bunch of correlations at one time

66
New cards

What is r² and how do you calculate it?

This is known as the coefficient of determination and is calculated by squaring the value of r.
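A short sketch of the calculation. The value r = -0.70 is a made-up example, and the "unexplained" line simply shows what is left over after subtracting r² from 1.

```python
def coefficient_of_determination(r):
    """Return r squared: the proportion of variance two variables share."""
    if not -1 <= r <= 1:
        raise ValueError("r must be between -1 and 1")
    return r ** 2


r = -0.70                                   # hypothetical correlation
shared = coefficient_of_determination(r)    # the sign disappears when squaring
print(f"shared variance: {shared:.0%}")
print(f"unexplained variance: {1 - shared:.0%}")
```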

67
New cards

What is important to remember about correlation vs causation?

Correlation does NOT equal causation

  • we can NEVER definitively assume causation from a correlational relationship

68
New cards

What is reverse causation?

The causal direction may be opposite from what has been hypothesized

69
New cards

What is reciprocal causation?

When two variables cause each other

  • spiral effect

70
New cards

What is measurement (in the context of reliability and validity)?

Measurement is the act or process of assigning numbers to phenomena according to a rule.

71
New cards

What are the 4 measurement scales from least to most precise?

  • Nominal Scale: measure split into categories. A person cannot be in more than one category. Data is presented as counts or percentages.

    Ex: hair color, political affiliation

  • Ordinal Scale: categories are ranked in a hierarchy.

    Ex: class ranking

  • Interval Scale: ranked continuous variables, with equal spacing (intervals) between values

    Ex: 1-5 strongly agree to strongly disagree

  • Ratio Scale: similar to interval, but has a true zero value.

    0= complete absence of the attribute

72
New cards

What is an independent variable?

Something that can be manipulated or changed in an experiment.

Ex: the amount of water used

73
New cards

What is a dependent variable?

What you measure/observe as a result of change

Ex: how much the plants had grown

74
New cards

What is reliability?

a measure that is consistent in the values it outputs

75
New cards

What is validity?

the measure is actually measuring what you intended to measure

76
New cards

What is a key note to remember about reliability and validity?

A measure can be reliable and NOT be valid.

But a measure cannot be valid and NOT be reliable.

77
New cards

What is the idea of garbage in, garbage out?

if the data you collected is based on an invalid or unreliable measure, your results will be useless.

78
New cards

What is the goal for reliability and validity/ overall stats and testing?

MINIMIZE the error!

79
New cards

What is an observed score?

the ACTUAL score a person receives

80
New cards

What is a true score?

the theoretical score representing a person's actual ability or trait without measurement errors. (aka the perfect score)

81
New cards

What is an error score?

AKA measurement error, the discrepancy between observed and true score.

82
New cards

What are the types of reliability?

  • Test-retest: does a person receive the SAME score when they complete the measure at two different points in time?

  • Parallel test forms: are different versions of the same measurements equivalent?

  • Internal consistency: do all items in a measure assess the same concept you are trying to measure? Is there a strong correlation between individual items and total scores?

    (Assessed with Cronbach's alpha)

  • Inter-rater: does the measure produce the same results regardless of who is grading the scale? Can be evaluated by looking at the correlation between raters.

83
New cards

What is important to remember about test-retest and parallel forms?

  • both can be measured using correlation

  • the CLOSER the coefficient is to 1, the more reliable the measure is.

84
New cards

What is Cronbach's alpha in relation to internal consistency?

a stat that reflects the degree of internal consistency of items. Should typically fall between ZERO and ONE. The closer to 1, the better.
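The card gives no formula, so this is only a sketch using the commonly cited one: alpha = (k / (k - 1)) × (1 - Σ item variances / variance of total scores), where k is the number of items. The function name and survey data are hypothetical.

```python
import statistics


def cronbach_alpha(item_scores):
    """item_scores: one list per item, each holding one score per respondent."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score per respondent: add up that respondent's score on every item.
    totals = [sum(responses) for responses in zip(*item_scores)]
    sum_item_variances = sum(statistics.variance(item) for item in item_scores)
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Three items answered by five respondents (made-up data).
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(items), 2))
```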

85
New cards

How do you improve Cronbach's alpha?

  1. Increase # of items in the survey

  2. properly format instructions

  3. make sure the admin of the measure is standardized

  4. remove unclear or confusing items

86
New cards

Can validity be assessed with stats?

NO.

requires theory, critical thinking, and lots of data

87
New cards

What are the 3 types of validity?

  1. Content: does the measure cover ALL of what we are trying to measure?

  2. Criterion: does the measure predict other indicators of the same construct?

  3. Construct: is the measure related to things it should be, and is it not related to things it shouldn’t be? Does it measure the underlying concept you set out to measure? Requires psychological theory

88
New cards

What are concurrent and predictive validity within criterion validity?

Concurrent validity: do the measures taken correlate with pre-existing measures that have already been validated?

Predictive validity: the ability of the measure to predict outcomes in the future.

89
New cards

What are convergent and discriminant validity within construct validity?

Convergent validity: does the measure relate to things that it should?

Discriminant validity: does the measure NOT relate to things that it should NOT be related to?
