Univariate Statistics Quiz 1

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/73

There's no tags or description

Looks like no tags are added yet.

Last updated 12:30 AM on 5/1/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

74 Terms

New cards

Variable

able to vary, anything that can take on different values for different observations

New cards

Constant

something that does not vary, static across observations, no change

New cards

Levels

the values that a variable can take on (ex. different ages being the level of the variable of age)

New cards

Nominal

(name) a variables level are different only in name (ex. eye colors, birth states)

New cards

Ordinal

(inherent order) levels also have a natural order, no sense of how much difference or scale there is between them (ex. military rank, soft drink size)

New cards

Categorical values

nominal and ordinal

New cards

Interval

(consistent interval) unit interval always means the same thing, a unit has a meaning, there’s a scale, unlike ratio (ex. temperature, calendar years, time of day)

New cards

Ratio

the value zero means “none”, allows calculations of ratios that make sense (ex. age, amount of money in pocket)

New cards

Continuous values

interval and ratio

New cards

Populations

often too big to measure, instead select a subset of the population to pull out and measure, the set of all possible scores

New cards

Parameter

summary value that describes a population

New cards

Sample

the subset of the population that is actually measured, goal of the sample is to understand the population not just the sample, a summary value described using statistics

New cards

Statistics

summary values that describe samples, used to estimate parametersIn

New cards

Inferences

drawing conclusions about a population based on what is observed in a sample, importance of the idea of random selection to work with a representative sample of the population

New cards

Descriptive statistics

anything that meats the definition of a statistic, wanting to describe the thing that we observed, values meant to describe the data measured, have a UNIT used to describe them (ex. the mean of a sample (range of scores), average)

New cards

Descriptive parameters

can describe the population based on descriptive statistics

New cards

Test statistics

doing a test to compare statistics, gives a value to find a difference, used to make decisions, don’t help to describe what was in the sample, NO UNIT, only exist in samples not in parameters (ex. z, t, f)

New cards

Independent variable

causes

New cards

Dependent variable

effect, dependent on the IV

New cards

Selecting an analysis

New cards

Mean

can be sensitive to extreme scores

New cards

Median

The middle score in a set of ordered scores (the score with the same number of values above and below)

New cards

Mode

"where are scores most likely to be"
Most frequently occurring score
Most frequently occurring interval (histogram)
- Histogram - break up number line into intervals (used in lab 1)
Score with highest probability density (top of the curve)

New cards

Range

max(y) - min(y)
Difference between the highest and lowest value
- Limited while only describing two points in the data set

New cards

Variance

Average of the squared differences between score and population mean

New cards

Standard deviation

Square-root of variance (back to the variable's original metric)

New cards

Frequency distribution

Heights of bars represent counts for the number of scores within each interval

<ul><li><p><span><span>Heights of bars represent counts for the number of scores within each interval</span></span></p></li></ul><p></p>

New cards

Probability distribution

Showing the scores in each interval as a proportion of the larger total number of scores

<ul><li><p><span><span>Showing the scores in each interval as a proportion of the larger total number of scores</span></span></p></li></ul><p></p>

New cards

Probability density

used to describe the independent variable, integral to find area under curve to find probability

whole curve is the data, integral to find the probability of pulling a score from that certain range

New cards

Skew

asymmetry (the normal distribution is symmetrical)

New cards

Leptokurtosis

more peaked than a normal curve

New cards

Platykurtosis

flatter than a normal curve

New cards

Definition of rare

alpha = .05

New cards

Distribution based probabilities

find probability within a specific range using an integral

<ul><li><p>find probability within a specific range using an integral</p></li></ul><p></p>

New cards

Critical values

value(s) beyond which is a proportion of a population equal to alpha

“what value of y is going to give you a specific probability”

New cards

Standardizing equation

Z, simplified metric easier to understand

New cards

Standard normal distribution

New cards

Point estimates

specific point on the number line, precise, almost always wrong (at least a little)

no confidence that it’s the right answer

(ex. 5.7)

New cards

Confidence interval

less precise, much more likely correct

(ex. [5.4,6.0]

New cards

“Confidence”

stated as a percentage

90% confidence means that if a confidence interval is calculated for many, many samples, 90% of the samples’ confidence intervals will include the parameter being estimated

relationship between percent and confidence (C) and alpha

New cards

Confidence interval equation with sigma SD known and unknown

New cards

Central limit theorem

describes the sampling distribution of the mean, distribution of sample means

New cards

variance sum theorem

for any two unrelated (independent) variables a and b, the following relationship holds:

the variance of the sum or difference is equal to the variance of the sum

New cards

Significance testing

decision making tool to help figure out when we can make an inference

New cards

Goal of significance testing

Decision-making
Inference
i.e. deciding whether the observed results are strong enough to support a conclusion that the population from which the sample was drawn is different from the null population
If results support such inference, they are said to be "significant

New cards

Statistical significance vs. practical significance

Practical significance: a result is big, strong, or important
- "the temperature dropped significantly over the weekend"
Statistical significance: a result supports the inference that an effect or characteristic exists in a population

Effects that are of great practical significance can be statistically non-significant
Effects that are of very little practical significance can be statistically significant

New cards

Null hypothesis

the absence of what we are looking for

New cards

Data

something we have seen/measures

New cards

Target

an effect or characteristic we are looking for

New cards

Probability

describes sample relative to null hypothesis

“low” p means” the observed data would be “rare” if the null hypothesis was true

New cards

Plum tree example:

We are tying to find a plum tree (target). Observe a tree with purple fruit
Null hypothesis (H): this type of tree is not a plum tree with purple fruit
Purple fruit would be rare among non-plum trees (if H were true, the observed date would be rare)
Conclusion: we reject the not-a-plum-tree idea (reject H) and conclude that we have found a plum tree (assume that the null hypothesis is right until proven wrong)
- Focus on knowing what we know that the thing you're looking for does not exist or isn't true

New cards

Significance testing procedure

Define "rare" as an arbitrarily low probability (usually alpha = .05)
Specify a target - what are we looking for?
State the null hypothesis
collect data
Find the value of a test statistic (z, t, etc.)
Use the test statistic value to find a p-value or critical value
Decide: if p<a (or if, e.g. z(observed) > z(critical)), reject H
If H is rejected, describe result as "significant" and use descriptive statistics to describe the (estimated) population. E.g., "Symptom Severity is significantly lower among those who received the drug (ybar = 19) than among those who received the placebo (ybar = 73)
1. If rejecting then null hypothesis, that means that you found what you were looking for

New cards

Why do we reject the null hypothesis when probability < alpha?

New cards

Outcomes: type I error, type II error, power, “ok”

Type I error:

reject null hypothesis, reality that null hypothesis is true
“false positive”

Type II error:

don’t discover anything, something to discover
“false negative”, a miss

power:

reject null hypothesis, reality that null hypothesis is false
“hit”

“ok”: fail to reject null hypothesis, nothing to discover

<p>Type I error: </p><ul><li><p>reject null hypothesis, reality that null hypothesis is true</p></li><li><p>“false positive”</p></li></ul><p>Type II error:</p><ul><li><p>don’t discover anything, something to discover</p></li><li><p>“false negative”, a miss</p></li></ul><p>power:</p><ul><li><p>reject null hypothesis, reality that null hypothesis is false</p></li><li><p>“hit”</p></li></ul><p>“ok”: fail to reject null hypothesis, nothing to discover</p>

New cards

Selecting an analysis

New cards

Population variance equation

New cards

Population covariance equation

New cards

Sample covariance equation

New cards

No Relationship scatterplot

New cards

Positive covariance scatterplot

direct relationship

New cards

Negative covariance scatterplot

inverse relationship

New cards

Correlation coefficient

form of standardizing the covariance, previously determined by the scale of the variables

New cards

"Predicting" scores" mean vs. regression line

"I now want to be able to guess how much someone Is going to spend
Mean is always the best guess in the long run in general

New cards

Regression line

procedure of giving us a way to come up with a best guess, the predicted value

New cards

Sum of squares: total

“how far off is the mean when you use it as a best guess”
“amount of inaccuracy if you used the mean as a best guess”

<ul><li><p>“how far off is the mean when you use it as a best guess”</p></li><li><p>“amount of inaccuracy if you used the mean as a best guess”</p></li></ul><p></p>

New cards

Sum of squares: error

variability that cannot be explained by the regression model
“how far are the individual values from the regression line”
“is it doing better then the mean at a best guess”

<ul><li><p>variability that cannot be explained by the regression model</p></li><li><p>“how far are the individual values from the regression line”</p></li><li><p>“is it doing better then the mean at a best guess”</p></li></ul><p></p>

New cards

Sum of squares: explained

difference between the regression line and the mean
“how much is changing from mean as best guess to regression line as best guess”
each subject: the predicted value minus the mean

<ul><li><p>difference between the regression line and the mean</p></li><li><p>“how much is changing from mean as best guess to regression line as best guess”</p></li><li><p>each subject: the predicted value minus the mean</p></li></ul><p></p>

New cards

Source table

MS = mean square (average of the square deviations), another way of saying variance

<ul><li><p>MS = mean square (average of the square deviations), another way of saying variance</p></li></ul><p></p>

New cards

How to use F

F is used to test the null hypothesis, “the IV cannot predict the DV”

New cards

Regression assumptions

Linearity
- for every line of the IV, all population DV means fall on the same line
Normality of errors
- assume that their will be a normal variance of errors
homogeneity of error variance
- population variance in the DV is the same for all levels of the IV
independence of errors
- no two or more errors are similar to one another because they come from a common source

New cards

Selecting analysis: IV categorical, DV categorical

chi-square, test of independence

New cards

Selecting analysis: IV continuous, DV categorical

logistic regression and discriminant function analysis

New cards

Selecting analysis: IV categorical, DV continuous

t-test and/or analysis of variance

New cards

Selecting analysis: IV continuous, DV continuous

correlation or simple regression