Stats 301 Exam 1 Review


89 Terms

1

Defining Statistics

Set of tools & techniques used to describe, organize, and interpret data

2

Goals of science & how stats helps achieve those

Stats helps describe, predict, and explain data

3

Descriptive Stats

Organize and describe data

4

Inferential Stats

Infer (guess) something about a larger group (population) from a smaller group (sample)

5

What is a sample?

A portion or subset OF the population

6

What is a population?

The overarching group you are studying (large)

7

What is a variable in stats?

Something that can change (vary) or have different values for different individuals

EX: Age, Major, etc

8

What is data in stats?

Information collected from the sample on the variables we are interested in (actual numbers & measurements & characteristics)

EX: Engineering, psych, business OR 18,19,20,21, etc

9

What is continuous data?

variables that can assume any value along some underlying continuum.

EX: height, weight, time

10

What is categorical data?

a variable that can take on one of a limited, usually fixed, number of possible values.

EX: political affiliation, marital status, and education level

11
12

What is central tendency?

a statistical measure that identifies a single typical (average) value in a data distribution

EX: mean, median, and mode

13

What is the mean and how do you calculate it?

The AVERAGE of the data

  • most sensitive to outliers

  • best used when there are NO extreme values in the data set

How to calculate:

x̄ = Σx / n (x bar = the sum of all x values, divided by the number of values n)
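As a quick illustration of the formula, and of how sensitive the mean is to outliers, here is a minimal Python sketch. The scores are made up for the example.

```python
# Hypothetical exam scores, invented for illustration.
scores = [70, 75, 80, 85, 90]

# x-bar = (sum of x) / n
mean = sum(scores) / len(scores)

# Adding one extreme value pulls the mean sharply upward.
scores_with_outlier = scores + [400]
mean_with_outlier = sum(scores_with_outlier) / len(scores_with_outlier)

print(mean)                         # 80.0
print(round(mean_with_outlier, 2))  # 133.33
```

One extreme score moves the mean well above every typical score, which is why the mean works best when there are no extreme values.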

14

What is the median and how do you calculate it?

The MIDDLE number in a data set

  • NOT sensitive to extreme values

  • Use when extreme values ARE present

How to calculate:

  1. Put data in numerical order

  2. If there is an odd number of values, find the value in the center

     OR

     If there is an even number of values, find the two values in the center, add them, and divide by 2
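The steps above can be sketched in a few lines of Python. The function name and the sample values are made up for illustration.

```python
def median(values):
    """Return the middle value of a non-empty list of numbers."""
    ordered = sorted(values)   # step 1: put the data in numerical order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:             # odd count: take the center value
        return ordered[mid]
    # even count: add the two center values and divide by 2
    return (ordered[mid - 1] + ordered[mid]) / 2

print(median([3, 1, 9, 7, 5]))     # 5
print(median([3, 1, 9, 7]))        # 5.0
print(median([1, 2, 3, 4, 1000]))  # 3
```

The last call shows why the median is preferred when extreme values are present: the 1000 does not move it.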

15

What is the mode, and how do you calculate it?

The MOST FREQUENT occurring value in the data set

  • typically used in CATEGORICAL data

  • you CAN have multiple in the data set (bimodal, multimodal)

  • LEAST precise and LEAST affected by extreme values

How to calculate:

  1. put values in numerical order (or group identical categories together)

  2. identify the value that occurs MOST often

  3. if two or more values tie for the most occurrences, they are ALL modes of the data set
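A minimal Python sketch of finding the mode, including the case where more than one value ties. The helper name `modes` and the data are hypothetical.

```python
from collections import Counter

def modes(values):
    """Return every value tied for the highest frequency."""
    counts = Counter(values)     # count how often each value occurs
    top = max(counts.values())   # the highest frequency
    return sorted(v for v, c in counts.items() if c == top)

# Works for categorical data, where the mode is typically used.
print(modes(["psych", "business", "psych", "engineering"]))  # ['psych']

# Two values tie, so the data set is bimodal.
print(modes([1, 2, 2, 3, 3, 4]))                             # [2, 3]
```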

16

When to use which measure of central tendency?

3 Rules

  1. Use mode when data is CATEGORICAL

  2. Use mean when the data is CONTINUOUS and NO outliers

  3. Use median when the data is CONTINUOUS and you think the mean is misleading because of extreme scores

When in doubt, report BOTH!

17

How do extreme values affect the choice of mean, median, and mode?

  • Mean = DON’T use for extreme values

  • Median = can use for extreme values

  • Mode = can use for extreme values

18

What is the measure of Variability?

Tells us how DIFFERENT the scores are from each other.

  • represent the spread or dispersion in the dataset

19

Why is variability important?

helps us understand the nature of our SAMPLE and the nature of our VARIABLES

20

What are the 3 measures of variability?

  • Range

  • Standard Deviation

  • Variance

21

What is range and how do we calculate it?

The DIFFERENCE between the highest and lowest score of a data set

  • only considers MOST EXTREME values

  • not very accurate

How to calculate:

Range = h - l (h = highest score, l = lowest score)
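A short Python sketch of the calculation, using made-up scores:

```python
scores = [12, 7, 3, 18, 9]   # hypothetical data

# Range = highest score - lowest score
data_range = max(scores) - min(scores)
print(data_range)  # 15
```

Only the two most extreme scores (18 and 3) enter the calculation, which is why the range is a rough measure of variability.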

22

What is standard deviation and how do we calculate it?

The AVERAGE distance of scores from the MEAN

  • The most commonly used measure of variability

  • SMALLER standard deviation means scores are closer to the mean

  • LARGER standard deviation means scores are further away from the mean

How to calculate:

(x - x̄) = a single score's deviation from the mean

Σ(x - x̄)² = sum of ALL squared deviations

SD = √( Σ(x - x̄)² / (n - 1) ) for a sample (divide by n instead for a whole population)
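The calculation can be walked through step by step in Python. The formula image on this card did not survive, so this sketch assumes the sample formula (dividing by n - 1); dividing by n instead would give the population version. The data are made up.

```python
import math

scores = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data
n = len(scores)
mean = sum(scores) / n               # 5.0

# (x - x-bar): each score's deviation from the mean
deviations = [x - mean for x in scores]

# sum of all squared deviations
sum_squared_dev = sum(d ** 2 for d in deviations)   # 32.0

# Assumption: sample standard deviation, so divide by n - 1.
sd = math.sqrt(sum_squared_dev / (n - 1))
print(round(sd, 3))  # 2.138
```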

23

What is variance and how do we calculate it?

The standard deviation SQUARED

  • rarely used to report descriptive stats

  • more used as a concept

How to calculate:

Variance = SD²
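A quick check of this relationship using Python's standard `statistics` module, whose `stdev` and `variance` functions compute the sample versions. The data are made up.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]        # hypothetical data
sd = statistics.stdev(scores)             # sample standard deviation
variance = statistics.variance(scores)    # sample variance

# Variance equals the standard deviation squared (up to floating-point rounding).
print(abs(variance - sd ** 2) < 1e-9)     # True
```

Because variance is in squared units, the standard deviation is usually the one reported.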

24

What are the important Standard Deviation concepts?

  • By definition, the deviations from the mean always sum (and therefore average) to ZERO, whatever the shape of the distribution

  • Because of this, we must square the deviations

  • Values are squared so that positive and negative deviations do NOT cancel each other out

  • SD is sensitive to extreme values

  • We take the square root to REVERT back to the original units
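A small Python sketch of why the deviations are squared: the raw deviations from the mean cancel out to zero, here even for a deliberately skewed, made-up data set.

```python
scores = [1, 2, 3, 4, 50]             # hypothetical, skewed data
mean = sum(scores) / len(scores)       # 12.0

deviations = [x - mean for x in scores]
print(sum(deviations))                 # 0.0 (negatives and positives cancel)

squared = [d ** 2 for d in deviations]
print(sum(squared))                    # 1810.0 (squaring removes the signs)
```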

25

What is an outlier or extreme value?

A data point that appears to deviate markedly from other data points in the sample

26

What is the rule of thumb for outliers and extreme values?

  • Anything more than two standard deviations away from the mean is a potential outlier.

  • Anything more than three standard deviations away from the mean is likely an outlier.

27

Formula to calculate outliers

x̄ ± (c × s), where c = the cutoff value (2 or 3, per the rule of thumb) and s = the standard deviation
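A minimal Python sketch of applying the cutoff formula. The data and the helper name `cutoffs` are made up, and the sample standard deviation is assumed.

```python
import statistics

scores = [10, 12, 11, 13, 12, 11, 10, 12, 11, 40]   # hypothetical data
mean = statistics.mean(scores)
s = statistics.stdev(scores)    # sample standard deviation (assumption)

def cutoffs(c):
    """Return (lower, upper) bounds: x-bar minus/plus c * s."""
    return mean - c * s, mean + c * s

# Scores beyond 2 standard deviations are potential outliers.
low2, high2 = cutoffs(2)
potential = [x for x in scores if x < low2 or x > high2]
print(potential)  # [40]
```

In this data set the score of 40 is beyond 2 standard deviations but still inside 3. The outlier itself inflates the standard deviation, so it counts as a potential outlier rather than a likely one.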

28

How do you use standard deviation to understand an individual data point?

  • determine how far the point deviates from the mean (avg) of the dataset comparing it to the overall data spread

  • calculate the mean and standard deviation

  • find the “z” score and use the outlier identification formula

29

What is a “Z” score aka standard score?

The raw scores that have been adjusted for the mean and standard deviation of the distribution from which the raw scores came.
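The card describes the z score in words; the standard formula is z = (x - x̄) / s. A minimal Python sketch with made-up data, assuming the sample standard deviation:

```python
import statistics

scores = [60, 70, 80, 90, 100]    # hypothetical data
mean = statistics.mean(scores)     # 80
s = statistics.stdev(scores)       # sample standard deviation (assumption)

def z_score(x):
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / s

print(round(z_score(100), 2))  # 1.26
print(z_score(80))             # 0.0 (a score equal to the mean)
```

A positive z score is above the mean and a negative one is below it.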

30

What are histograms and how do you identify them?

  • They show distributions of continuous variables

  • The height of the bar is the number of times that value occurs

  • The bars touch on the graph

31

What are bar graphs and how do you identify them?

  • They show the frequency of categorical responses

  • The bars have spaces in between them on graph

32

How is central tendency described as a distribution?

  • Distributions can differ in central tendency (where the center sits) but not differ otherwise

  • within any one symmetrical distribution, all 3 m’s (mean, median, mode) are the same value

  • aka the same variability, different average

33

How is variability described as a distribution?

  • Can have the same central tendency - but different amounts of variability

  • Some can have the same range but different standard deviations

34

What is skewness and how is it described in a distribution?

The lack of symmetry in a graph

35

What is a positive skew and which way does the tail face?

When the curve's tail is on the right side of the graph.

  • Mode is at the highest point (the peak), on the left side

  • The median is typically in the middle

  • Mean is pulled toward the tail on the right side, where the curve is lowest (mode < median < mean)
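A small Python sketch of how the three measures line up in a positively skewed data set. The values are made up, with a long tail to the right.

```python
import statistics

values = [1, 1, 1, 2, 2, 3, 4, 10]   # hypothetical data, tail on the right

mode = statistics.mode(values)        # 1   (the peak, on the left)
median = statistics.median(values)    # 2.0 (in the middle)
mean = statistics.mean(values)        # 3   (pulled toward the tail)

print(mode < median < mean)           # True
```

For a negative skew the order reverses, with the mean pulled toward the tail on the left.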

36

What is a negative skew and which way does the tail face?

When the curve's tail is on the left side of the graph

  • Mode is at the highest point (the peak), on the right side

  • The median is in the middle

  • Mean is pulled toward the tail on the left side (mean < median < mode)

37

What does skewness reflect about the mean, median, and mode?

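Reflects how the three relate to one another: they are equal in a symmetrical distribution, and in a skewed one the mean is pulled furthest toward the tail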

38

What is the floor effect?

When there is a bottom bound for the values of a data set. MUCH of the data falls around the BOTTOM bound.

  • creates a positive skew!

  • majority values fall on the LOW end of the distribution

39

What is the ceiling effect?

When there is an upper bound for the values of the data set

  • Creates a negative skew

  • majority of the values fall at the HIGH end of the distribution

40

What is kurtosis?

How peaked vs flat the distribution is

41

What is platykurtic?

LOW kurtosis

  • relatively FLAT

  • HIGH variability

42

What is leptokurtic?

HIGH kurtosis

  • relatively PEAKED

  • LOW variability

43

What can make graphs misleading?

This can occur when visual representations are distorted by manipulating axes, scales, and more

44

What are correlations?

How changes in one variable relate to changes in another variable

  • THE RELATIONSHIP BETWEEN TWO VARIABLES

45

When do we use correlations?

They are used when you want to quantify the strength and direction of a linear relationship between two continuous variables

46

What is a correlation coefficient?

a single number that describes the relationship between two variables

47

How is correlation coefficient abbreviated, and what does it range from?

  • Abbreviated as “r”

  • Ranges from -1 to 1

48

What is direction in correlation coefficient?

The sign of the coefficient tells us the direction in which one variable changes as the other changes

49

What is the relationship of a positive coefficient?

DIRECT relationship

  • as x increases, y increases

50

What is the relationship of a negative coefficient?

INVERSE relationship

  • as x increases, y decreases

51

What is strength of a correlation coefficient?

The closer the coefficient is to -1 or 1, the stronger the relationship is

52

What are scatterplots in relation to correlations?

A chart or graph that uses dots to represent values for two different numeric variables

53

What is an important idea to remember about correlation coefficient?

Correlation does NOT equal causation. Just because two variables are closely related, does not mean that one causes the other.

54

Understand the chart of correlation relationships

55

Understand scatter plots and correlation examples

56

What are the limitations of correlation coefficients?

  • Can only be used to identify LINEAR relationships

  • NO curvilinear relationships

  • Restriction of range

57

What is the restriction of range?

When the scores on a variable cover only a narrow range (too many scores have similar values), the coefficient cannot capture the true relationship.

58

Do outliers have a significant effect on correlation coefficients?

YES! They have a huge impact on the correlation coefficient.

59

What is the coefficient of determination? And how do we calculate it?

The representation of how much variance two variables share

  • how much of the variance in x can be accounted for by y (and vice versa)

    How to calculate it?

  • simply square the correlation coefficient (r²)!
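A tiny Python sketch of the calculation, using a hypothetical correlation of r = -0.70:

```python
r = -0.70                        # hypothetical correlation coefficient

r_squared = r ** 2               # coefficient of determination
print(round(r_squared, 2))       # 0.49 -> 49% of the variance is shared

print(round(1 - r_squared, 2))   # 0.51 -> 51% of the variance is NOT shared
```

Squaring removes the sign, so r = -0.70 and r = 0.70 share the same amount of variance.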

60

How do we calculate/compute the correlation coefficient?

The formula uses these terms:

  • rxy = the correlation between X and Y

  • n is the sample size

  • X is each individual's score on the X variable

  • Y is each individual’s score on the Y variable

  • XY is the product of each X score and its corresponding Y score

  • X² is each individual's X score squared

  • Y² is each individual’s Y score squared
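The formula image on this card did not survive. The terms listed match the raw-score formula for Pearson's r, rxy = [nΣXY - ΣXΣY] / √( [nΣX² - (ΣX)²] [nΣY² - (ΣY)²] ), so the Python sketch below assumes that is the formula the card showed. The function name and the data are made up.

```python
import math

def pearson_r(xs, ys):
    """Correlation between xs and ys via the raw-score formula."""
    n = len(xs)                                   # sample size
    sum_x, sum_y = sum(xs), sum(ys)               # sum of X, sum of Y
    sum_xy = sum(x * y for x, y in zip(xs, ys))   # sum of each X times its Y
    sum_x2 = sum(x * x for x in xs)               # sum of squared X scores
    sum_y2 = sum(y * y for y in ys)               # sum of squared Y scores

    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    return numerator / denominator

hours = [1, 2, 3, 4, 5]     # hypothetical X scores
grades = [2, 4, 5, 4, 5]    # hypothetical Y scores
print(round(pearson_r(hours, grades), 3))  # 0.775
```

The function divides by zero if either variable has no variability at all, which mirrors the fact that a correlation is undefined in that case.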

61

What are the numerator and denominator relationships when computing a correlation coefficient?

numerator = how much do x and y go together

denominator = how much do x and y vary on their own

62

What is an example on how to report a correlation coefficient?

We found a strong/weak, negative/positive correlation between ___ and ___ (r = ___), suggesting that ...

63

What is coefficient of determination?

The more two variables have in common, the more variance they share

64

What is the coefficient of alienation (nondetermination)?

The variance that is left over (NOT shared) after calculating the coefficient of determination: 1 - r²

65

What is a correlation matrix?

A simple way to report a bunch of correlations at one time

66

What is r² and how do you calculate it?

This is known as the coefficient of determination and is calculated by squaring the value of r.

67

What is important to remember about correlation vs causation?

Correlation does NOT equal causation

  • we can NEVER definitively assume causation from a correlational relationship

68

What is reverse causation?

The causal direction may be opposite from what has been hypothesized

69

What is reciprocal causation?

When two variables cause each other

  • spiral effect

70

What is measurement in reliability and validity?

the act or process of assigning numbers to phenomena according to a rule.

71

What are the 4 measurement scales from least to most precise?

  • Nominal Scale: measure split into categories. A person cannot be in more than one category. Data is presented as counts or percentages.

    Ex: hair color, political affiliation

  • Ordinal Scale: categories are ranked in a hierarchy.

    Ex: class ranking

  • Interval Scale: ranked continuous variables, with equal spacing (intervals) between values

    Ex: 1-5 strongly agree to strongly disagree

  • Ratio Scale: similar to interval, but has a true zero value.

    0= complete absence of the attribute

72

What is an independent variable?

Something that can be manipulated or changed in an experiment.

Ex: the amount of water used

73

What is a dependent variable?

What you measure/observe as a result of change

Ex: how much the plants had grown

74

What is reliability?

a measure that is consistent in the values it outputs

75

What is validity?

the measure is actually measuring what you intended to measure

76

What is a key note to remember about reliability and validity?

A measure can be reliable and NOT be valid.

But a measure cannot be valid and NOT be reliable.

77

What is the idea of garbage in, garbage out?

if the data you collected is based on invalid or unreliable measure, your results will be useless.

78

What is the goal for reliability and validity/ overall stats and testing?

MINIMIZE the error!

79

What is an observed score?

the ACTUAL score a person receives

80

What is a true score?

the theoretical score representing a person's actual ability or trait without measurement errors. (aka the perfect score)

81

What is an error score?

AKA measurement error, the discrepancy between observed and true score.
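The relationship among the three scores is usually written as a single equation (the standard classical test theory form; the subscript labels are just illustrative names):

```latex
X_{\text{observed}} = X_{\text{true}} + X_{\text{error}}
\quad\Longleftrightarrow\quad
X_{\text{error}} = X_{\text{observed}} - X_{\text{true}}
```

The smaller the error score, the closer the observed score is to the true score.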

82

What are the types of reliability?

  • Test-retest: does a person receive the SAME score when they complete the measure at two different points in time?

  • Parallel test forms: are different versions of the same measure equivalent?

  • Internal consistency: do all items in a measure assess the same concept you are trying to measure? Is there a strong correlation between individual items and total scores?

    Cronbach's Alpha is used for this ^

  • Inter-rater: does the measure produce the same results regardless of who is grading the scale? Can be evaluated by looking at the correlation between raters.

83

What is important to remember about test-retest and parallel forms?

  • both can be measured using correlation

  • the CLOSER the coefficient is to 1, the more reliable the measure is.

84

What is Cronbach's Alpha in relation to internal consistency?

a stat that reflects the degree of internal consistency of items. Should normally fall between ZERO and ONE. The closer to 1, the better.
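The card does not give a formula. A common one is alpha = (k / (k - 1)) × (1 - Σ item variances / variance of total scores), where k is the number of items. The Python sketch below assumes that formula and uses sample variances; the function name and the responses are made up.

```python
import statistics

def cronbach_alpha(items):
    """items: one list of scores per item, with respondents in the same order."""
    k = len(items)                                    # number of items
    item_variances = sum(statistics.variance(item) for item in items)
    totals = [sum(person) for person in zip(*items)]  # each person's total score
    total_variance = statistics.variance(totals)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

# Hypothetical responses: 3 items answered by 5 people.
items = [
    [1, 2, 3, 4, 5],
    [2, 2, 3, 5, 5],
    [1, 3, 3, 4, 4],
]
print(round(cronbach_alpha(items), 3))  # 0.954
```

Here the items rise and fall together across people, so alpha comes out close to 1.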

85

How to improve Cronbach's alpha?

  1. Increase # of items in the survey

  2. properly format instructions

  3. make sure the admin of the measure is standardized

  4. remove unclear or confusing items

86

Can validity be assessed with stats?

NO, not with stats alone.

requires theory, critical thinking, and lots of data

87

What are the 3 types of validity?

  1. Content: does the measure cover ALL of what we are trying to measure?

  2. Criterion: does the measure predict other indicators of the same construct?

  3. Construct: is the measure related to things it should be, and is it not related to things it shouldn’t be? Does it measure the underlying concept you set out to measure? Requires psychological theory

88

What are concurrent and predictive validity within criterion validity?

Concurrent validity: do the measures taken correlate with pre-existing measures that have already been validated?

Predictive validity: the ability of the measure to predict outcomes in the future.

89

What are convergent and discriminant validity within construct validity?

Convergent validity: does the measure relate to things that it should?

Discriminant validity: is the measure NOT related to things that it shouldn't relate to?
