Univariate Statistics Quiz 1

Last updated 12:30 AM on 5/1/26

74 Terms

1
New cards

Variable

able to vary, anything that can take on different values for different observations

2
New cards

Constant

something that does not vary, static across observations, no change

3
New cards

Levels

the values that a variable can take on (ex. different ages are the levels of the variable age)

4
New cards

Nominal

(name) a variable's levels differ only in name (ex. eye colors, birth states)

5
New cards

Ordinal

(inherent order) levels also have a natural order, but there is no sense of how much difference or scale there is between them (ex. military rank, soft drink size)

6
New cards

Categorical values

nominal and ordinal

7
New cards

Interval

(consistent interval) a unit interval always means the same thing, so there is a scale, but unlike ratio, zero does not mean “none” (ex. temperature, calendar years, time of day)

8
New cards

Ratio

the value zero means “none”, allows calculations of ratios that make sense (ex. age, amount of money in pocket)

9
New cards

Continuous values

interval and ratio

10
New cards

Populations

the set of all possible scores; often too big to measure, so a subset of the population is selected and measured instead

11
New cards

Parameter

summary value that describes a population

12
New cards

Sample

the subset of the population that is actually measured and summarized using statistics; the goal is to understand the population, not just the sample

13
New cards

Statistics

summary values that describe samples, used to estimate parameters

14
New cards

Inferences

drawing conclusions about a population based on what is observed in a sample; random selection is important because it lets us work with a representative sample of the population

15
New cards

Descriptive statistics

anything that meets the definition of a statistic; values meant to describe the data we observed and measured; they have a UNIT (ex. the mean (average) of a sample, the range of scores)

16
New cards

Descriptive parameters

can describe the population based on descriptive statistics

17
New cards

Test statistics

values computed when doing a test to compare statistics; they quantify a difference and are used to make decisions, but do not help describe what was in the sample; NO UNIT; they exist only for samples, not populations (ex. z, t, F)

18
New cards

Independent variable

cause (the variable presumed to produce the effect)

19
New cards

Dependent variable

effect, dependent on the IV

20
New cards

Selecting an analysis

knowt flashcard image
21
New cards

Mean

can be sensitive to extreme scores

22
New cards

Median

The middle score in a set of ordered scores (the score with the same number of values above and below)

23
New cards

Mode

  • "where are scores most likely to be"

  • Most frequently occurring score

  • Most frequently occurring interval (histogram)

    • Histogram - break up number line into intervals (used in lab 1)

  • Score with highest probability density (top of the curve)

24
New cards

Range

  • max(y) - min(y)

  • Difference between the highest and lowest value

    • Limited because it describes only two points in the data set

25
New cards

Variance

Average of the squared differences between score and population mean

26
New cards

Standard deviation

Square-root of variance (back to the variable's original metric)
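
A minimal sketch of the descriptive statistics on the cards above (mean, median, mode, range, variance, standard deviation), using Python's standard `statistics` module. The scores are made up for illustration; 25 is included as an extreme score to show how it pulls the mean away from the median.

```python
import statistics

# Made-up scores for illustration; 25 is an extreme score
scores = [2, 4, 4, 5, 7, 9, 25]

mean = statistics.mean(scores)            # pulled upward by the extreme score
median = statistics.median(scores)        # middle score of the ordered set
mode = statistics.mode(scores)            # most frequently occurring score
score_range = max(scores) - min(scores)   # max(y) - min(y)

# Treating these scores as the whole population:
variance = statistics.pvariance(scores)   # average squared difference from the mean
sd = statistics.pstdev(scores)            # square root of variance, original metric

print(mean, median, mode, score_range)
print(round(variance, 2), round(sd, 2))
```

Here the mean (8) sits above the median (5) because of the single extreme score.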

27
New cards

Frequency distribution

  • Heights of bars represent counts for the number of scores within each interval

28
New cards

Probability distribution

  • Showing the scores in each interval as a proportion of the larger total number of scores

29
New cards

Probability density

used to describe the independent variable; integrate to find the area under the curve, which gives a probability

the whole curve represents the data; integrating over a range gives the probability of pulling a score from that range
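
A rough sketch of "area under the curve = probability", assuming a hypothetical normal population with mean 100 and SD 15 (these numbers are not from the cards). It compares a simple numerical integral of the density with the exact answer from the cumulative distribution function.

```python
from statistics import NormalDist

# Hypothetical population: normal with mean 100 and SD 15
population = NormalDist(mu=100, sigma=15)

def area_under_density(dist, low, high, steps=10_000):
    """Approximate the integral of the density from low to high (midpoint rule)."""
    width = (high - low) / steps
    return sum(dist.pdf(low + (i + 0.5) * width) for i in range(steps)) * width

# Probability of pulling a score between 85 and 115 (within one SD of the mean)
p_numeric = area_under_density(population, 85, 115)
p_exact = population.cdf(115) - population.cdf(85)

print(round(p_numeric, 4), round(p_exact, 4))  # both about 0.6827
```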

30
New cards

Skew

asymmetry (the normal distribution is symmetrical)

31
New cards

Leptokurtosis

more peaked than a normal curve

32
New cards

Platykurtosis

 flatter than a normal curve

33
New cards

Definition of rare

alpha = .05

34
New cards

Distribution based probabilities

  • find probability within a specific range using an integral

35
New cards

Critical values

value(s) beyond which lies a proportion of the population equal to alpha

“what value of y is going to give you a specific probability”
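
As an illustration, critical values on the standard normal distribution can be looked up with the inverse of the cumulative distribution function. This sketch assumes alpha = .05 and a z metric; other distributions (t, F) would need their own tables or libraries.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, SD 1

# One-tailed: the value with a proportion alpha of the population above it
z_one_tailed = standard_normal.inv_cdf(1 - alpha)

# Two-tailed: alpha is split between the two tails
z_two_tailed = standard_normal.inv_cdf(1 - alpha / 2)

print(round(z_one_tailed, 3), round(z_two_tailed, 3))  # about 1.645 and 1.96
```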

36
New cards

Standardizing equation

Z, simplified metric easier to understand
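
The card does not show the equation itself. The usual textbook form, assuming y is a score, mu the population mean, and sigma the population standard deviation, is:

```latex
z = \frac{y - \mu}{\sigma}
```

A z of 1.5, for example, means the score is 1.5 standard deviations above the mean.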

37
New cards

Standard normal distribution

knowt flashcard image
38
New cards

Point estimates

specific point on the number line, precise, almost always wrong (at least a little)

no confidence that it’s the right answer

(ex. 5.7)

39
New cards

Confidence interval

less precise, much more likely correct

(ex. [5.4, 6.0])

40
New cards

“Confidence”

stated as a percentage

90% confidence means that if a confidence interval is calculated for many, many samples, 90% of the samples’ confidence intervals will include the parameter being estimated

relationship between percent confidence (C) and alpha: C = 100(1 − alpha)%, so 90% confidence corresponds to alpha = .10
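
The "many, many samples" idea can be checked with a small simulation. This is only a sketch: the population (normal, mean 50, SD 10), the sample size, and the number of samples are all made-up assumptions, and sigma is treated as known.

```python
import random
from statistics import NormalDist

random.seed(1)                      # fixed seed so the run is repeatable
mu, sigma, n = 50, 10, 25           # hypothetical population and sample size

# 90% confidence -> alpha = .10, with .05 in each tail
z_crit = NormalDist().inv_cdf(0.95)
half_width = z_crit * sigma / n ** 0.5

trials = 10_000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    ybar = sum(sample) / n
    # Does this sample's interval include the parameter being estimated?
    if ybar - half_width <= mu <= ybar + half_width:
        hits += 1

coverage = hits / trials
print(coverage)  # should land close to 0.90
```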

41
New cards

Confidence interval equation with sigma SD known and unknown

knowt flashcard image
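
The image for this card was not captured. The standard textbook forms are given here as an assumption about what it showed, with ybar the sample mean, n the sample size, and s the sample standard deviation:

```latex
% sigma known: use z
\bar{y} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}

% sigma unknown: estimate it with s and use t with n - 1 degrees of freedom
\bar{y} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}
```
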
42
New cards

Central limit theorem

describes the sampling distribution of the mean, distribution of sample means
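
A small simulation sketch of the sampling distribution of the mean. It assumes a deliberately skewed hypothetical population (exponential, with mean 1 and SD 1) and samples of n = 30; the sample means should center on the population mean with a spread close to sigma / sqrt(n).

```python
import random
from statistics import mean, pstdev

random.seed(2)                      # fixed seed so the run is repeatable
n = 30                              # scores per sample
number_of_samples = 5_000

# Skewed hypothetical population: exponential with mean 1 and SD 1
sample_means = [
    sum(random.expovariate(1.0) for _ in range(n)) / n
    for _ in range(number_of_samples)
]

center = mean(sample_means)         # should be near the population mean, 1
spread = pstdev(sample_means)       # should be near sigma / sqrt(n)
expected_spread = 1 / n ** 0.5

print(round(center, 3), round(spread, 3), round(expected_spread, 3))
```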

43
New cards

variance sum theorem

for any two unrelated (independent) variables a and b, the following relationship holds:

the variance of the sum or difference is equal to the sum of the variances
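
Written as an equation for independent variables a and b (the notation here is an assumption, since the card gives the theorem only in words):

```latex
\sigma^2_{a + b} = \sigma^2_{a - b} = \sigma^2_a + \sigma^2_b
```

Note that the variances add even when the variables are subtracted.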

44
New cards

Significance testing

a decision-making tool that helps figure out when we can make an inference

45
New cards

Goal of significance testing

  • Decision-making

  • Inference

  • i.e. deciding whether the observed results are strong enough to support a conclusion that the population from which the sample was drawn is different from the null population

  • If results support such an inference, they are said to be "significant"

46
New cards

Statistical significance vs. practical significance

  • Practical significance: a result is big, strong, or important

    • "the temperature dropped significantly over the weekend"

  • Statistical significance: a result supports the inference that an effect or characteristic exists in a population

  • Effects that are of great practical significance can be statistically non-significant

  • Effects that are of very little practical significance can be statistically significant

47
New cards

Null hypothesis

the absence of what we are looking for

48
New cards

Data

something we have seen/measured

49
New cards

Target

an effect or characteristic we are looking for

50
New cards

Probability

describes sample relative to null hypothesis

“low” p means the observed data would be “rare” if the null hypothesis were true

51
New cards

Plum tree example:

  • We are trying to find a plum tree (target). Observe a tree with purple fruit

  • Null hypothesis (H): this type of tree is not a plum tree with purple fruit

  • Purple fruit would be rare among non-plum trees (if H were true, the observed data would be rare)

  • Conclusion: we reject the not-a-plum-tree idea (reject H) and conclude that we have found a plum tree (assume that the null hypothesis is right until proven wrong)

    • Focus on what we would expect to see if the thing you're looking for does not exist or isn't true

52
New cards

Significance testing procedure

  1. Define "rare" as an arbitrarily low probability (usually alpha = .05)

  2. Specify a target - what are we looking for?

  3. State the null hypothesis

  4. Collect data

  5. Find the value of a test statistic (z, t, etc.)

  6. Use the test statistic value to find a p-value or critical value

  7. Decide: if p < alpha (or if, e.g., z(observed) > z(critical)), reject H

  8. If H is rejected, describe the result as "significant" and use descriptive statistics to describe the (estimated) population. E.g., "Symptom Severity is significantly lower among those who received the drug (ybar = 19) than among those who received the placebo (ybar = 73)."

    1. Rejecting the null hypothesis means that you found what you were looking for
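
The procedure can be sketched as a one-sample z test. Everything specific here is a made-up assumption: a null population with mean 100 and known SD 15, nine invented scores, and a two-tailed test.

```python
from statistics import NormalDist

alpha = 0.05                         # step 1: definition of "rare"
# step 2: target = a population whose mean differs from 100
mu_null, sigma = 100, 15             # step 3: H says the population mean is 100
sample = [112, 105, 109, 120, 109, 101, 115, 108, 111]  # step 4: made-up data

n = len(sample)
ybar = sum(sample) / n

# step 5: test statistic (distance of ybar from the null mean, in standard errors)
z_observed = (ybar - mu_null) / (sigma / n ** 0.5)

# step 6: two-tailed p-value from the standard normal distribution
p_value = 2 * (1 - NormalDist().cdf(abs(z_observed)))

# step 7: decide
reject_null = p_value < alpha

print(ybar, z_observed, round(p_value, 4), reject_null)
```

With these numbers ybar is 110 and z is 2.0, which gives a p-value just under .05, so H is rejected.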

53
New cards

Why do we reject the null hypothesis when probability < alpha?

knowt flashcard image
54
New cards

Outcomes: type I error, type II error, power, “ok”

Type I error:

  • reject the null hypothesis when, in reality, the null hypothesis is true

  • “false positive”

Type II error:

  • fail to reject the null hypothesis (don’t discover anything) when, in reality, there is something to discover

  • “false negative”, a miss

power:

  • reject the null hypothesis when, in reality, the null hypothesis is false

  • “hit”

“ok”: fail to reject null hypothesis, nothing to discover
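
A simulation sketch of the four outcomes, assuming a two-tailed z test with known sigma = 1, n = 20, and a made-up true effect of 0.5 for the "something to discover" case. When H is true, the rejection rate should sit near alpha; when H is false, the rejection rate is the power.

```python
import random
from statistics import NormalDist

random.seed(3)                       # fixed seed so the run is repeatable
alpha, sigma, n = 0.05, 1.0, 20
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

def rejection_rate(true_mean, trials=10_000):
    """Proportion of samples for which H (population mean = 0) is rejected."""
    rejections = 0
    for _ in range(trials):
        ybar = sum(random.gauss(true_mean, sigma) for _ in range(n)) / n
        z = ybar / (sigma / n ** 0.5)
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials

type_1_rate = rejection_rate(true_mean=0.0)  # H true: each rejection is a false positive
power = rejection_rate(true_mean=0.5)        # H false: each rejection is a hit
type_2_rate = 1 - power                      # H false but not rejected: a miss

print(type_1_rate, power, type_2_rate)
```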

55
New cards

Selecting an analysis

knowt flashcard image
56
New cards

Population variance equation

knowt flashcard image
57
New cards

Population covariance equation

knowt flashcard image
58
New cards

Sample covariance equation

knowt flashcard image
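
The images for the three equation cards above were not captured. The standard textbook forms are given here as an assumption about what they showed, with N the population size and n the sample size:

```latex
% population variance
\sigma^2_y = \frac{\sum_{i=1}^{N} (y_i - \mu_y)^2}{N}

% population covariance
\sigma_{xy} = \frac{\sum_{i=1}^{N} (x_i - \mu_x)(y_i - \mu_y)}{N}

% sample covariance
s_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}
```
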
59
New cards

No Relationship scatterplot

knowt flashcard image
60
New cards

Positive covariance scatterplot

direct relationship

61
New cards

Negative covariance scatterplot

inverse relationship

62
New cards

Correlation coefficient

a standardized form of the covariance; unlike the covariance, its size does not depend on the scale of the variables
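
A short sketch of why standardizing matters, using made-up scores. Re-expressing x in different units (here multiplied by 100) changes the covariance but leaves the correlation untouched. The function names are illustrative, not from the cards.

```python
from statistics import mean, stdev

def covariance(x, y):
    """Sample covariance: sum of cross-products of deviations over n - 1."""
    x_bar, y_bar = mean(x), mean(y)
    return sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (len(x) - 1)

def correlation(x, y):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(x, y) / (stdev(x) * stdev(y))

x = [1, 2, 3, 4, 5]                  # made-up scores
y = [2, 4, 5, 4, 5]
x_rescaled = [100 * v for v in x]    # same variable in smaller units

print(covariance(x, y), covariance(x_rescaled, y))
print(round(correlation(x, y), 4), round(correlation(x_rescaled, y), 4))
```

The covariance goes from 1.5 to 150 under rescaling, while the correlation stays at about 0.77.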

63
New cards

"Predicting" scores: mean vs. regression line

  • "I now want to be able to guess how much someone is going to spend"

  • With no other information, the mean is the best guess in the long run

64
New cards

Regression line

a procedure that gives us a way to come up with a best guess (the predicted value)

65
New cards

Sum of squares: total

  • “how far off is the mean when you use it as a best guess”

  • “amount of inaccuracy if you used the mean as a best guess”

66
New cards

Sum of squares: error

  • variability that cannot be explained by the regression model

  • “how far are the individual values from the regression line”

  • “is it doing better than the mean as a best guess”

67
New cards

Sum of squares: explained

  • difference between the regression line and the mean

  • “how much is changing from mean as best guess to regression line as best guess”

  • each subject: the predicted value minus the mean
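
The three sums of squares can be sketched with a tiny made-up data set and an ordinary least-squares line (the fitting method is an assumption; the cards do not say how the line is found). The total should split exactly into the error and explained parts.

```python
x = [1, 2, 3, 4, 5]                  # made-up IV scores
y = [2, 4, 5, 4, 5]                  # made-up DV scores
n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Least-squares regression line
cross_products = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
ss_x = sum((xi - x_bar) ** 2 for xi in x)
slope = cross_products / ss_x
intercept = y_bar - slope * x_bar
predicted = [intercept + slope * xi for xi in x]

# Total: how far off is the mean as a best guess?
ss_total = sum((yi - y_bar) ** 2 for yi in y)
# Error: how far are the individual values from the regression line?
ss_error = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))
# Explained: each subject's predicted value minus the mean
ss_explained = sum((pi - y_bar) ** 2 for pi in predicted)

print(round(ss_total, 2), round(ss_error, 2), round(ss_explained, 2))
```

For these scores the total of 6.0 splits into 2.4 (error) and 3.6 (explained).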

68
New cards

Source table

  • MS = mean square (average of the squared deviations), another way of saying variance

69
New cards

How to use F

F is used to test the null hypothesis, “the IV cannot predict the DV”
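
A minimal sketch of how a source table leads to F in simple regression (one IV). The sums of squares and sample size are made-up values, and the degrees of freedom follow the usual simple-regression convention: 1 for explained, n - 2 for error.

```python
# Made-up sums of squares for a simple regression with n = 5 subjects
ss_explained, ss_error, n = 3.6, 2.4, 5

df_explained = 1                     # one IV
df_error = n - 2

# Mean squares: each sum of squares divided by its degrees of freedom
ms_explained = ss_explained / df_explained
ms_error = ss_error / df_error

# F compares explained variability to unexplained variability
f_ratio = ms_explained / ms_error

print(round(ms_explained, 2), round(ms_error, 2), round(f_ratio, 2))
```

The observed F (about 4.5 here) would then be compared with a critical F for (1, 3) degrees of freedom, or converted to a p-value, to decide whether to reject H.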

70
New cards

Regression assumptions

  • Linearity

    • for every level of the IV, all population DV means fall on the same line

  • Normality of errors

    • assume that the errors are normally distributed

  • homogeneity of error variance

    • population variance in the DV is the same for all levels of the IV

  • independence of errors

    • no two or more errors are similar to one another because they come from a common source

71
New cards

Selecting analysis: IV categorical, DV categorical

chi-square, test of independence
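
A hand-computed sketch of a chi-square test of independence on a made-up 2x2 table of counts. Expected counts come from the row and column totals; the critical value 3.841 is the standard table value for df = 1 at alpha = .05.

```python
# Made-up counts: rows are levels of the IV, columns are levels of the DV
observed = [
    [30, 10],
    [20, 40],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Count expected in this cell if the IV and DV were independent
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
critical_value = 3.841               # chi-square table value for df = 1, alpha = .05
reject_null = chi_square > critical_value

print(round(chi_square, 3), df, reject_null)
```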

72
New cards

Selecting analysis: IV continuous, DV categorical

logistic regression and discriminant function analysis

73
New cards

Selecting analysis: IV categorical, DV continuous

t-test and/or analysis of variance
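
A sketch of an independent-samples t test with two made-up groups, assuming equal population variances (pooled variance). The critical value 2.306 is the standard two-tailed table value for df = 8 at alpha = .05.

```python
from statistics import mean, variance

group_a = [4, 5, 6, 5, 5]            # made-up DV scores, IV level A
group_b = [7, 8, 6, 8, 6]            # made-up DV scores, IV level B
n_a, n_b = len(group_a), len(group_b)

# Pooled variance: the two sample variances weighted by their degrees of freedom
df = n_a + n_b - 2
pooled_variance = ((n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)) / df

# Standard error of the difference between the two means
standard_error = (pooled_variance * (1 / n_a + 1 / n_b)) ** 0.5
t_observed = (mean(group_a) - mean(group_b)) / standard_error

critical_value = 2.306               # two-tailed t table value for df = 8, alpha = .05
reject_null = abs(t_observed) > critical_value

print(round(t_observed, 3), df, reject_null)
```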

74
New cards

Selecting analysis: IV continuous, DV continuous

correlation or simple regression