POS Final

Last updated 8:45 PM on 11/11/25
87 Terms

1

Univariate Distribution

Describing and analyzing one variable.

2

Statistic

An estimate of a parameter based on a sample.

3

Descriptive Statistics

Measurements used to summarize and organize the observed values of one variable.

4

Inferential statistics

Measurements used to make decisions about variable(s) in a population by interpreting data from a sample.

5

Frequency Distributions

Number of cases for each category of a variable. To construct one, list the categories of the variable and then list the number of observations in each.

6

Proportion

Number of observations in each category divided by the total number of observations

7

Percentage

Proportion multiplied by 100.
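
Example: a minimal sketch of how a frequency distribution, proportions, and percentages fit together. The survey responses and the function name frequency_table are made up for illustration.

```python
from collections import Counter

# Hypothetical sample: party identification of 10 respondents.
responses = ["Dem", "Rep", "Ind", "Dem", "Dem", "Rep", "Ind", "Dem", "Rep", "Dem"]

def frequency_table(values):
    """Return {category: (count, proportion, percentage)}."""
    counts = Counter(values)          # frequency distribution
    total = len(values)
    return {
        category: (count, count / total, 100 * count / total)
        for category, count in counts.items()
    }

table = frequency_table(responses)
for category, (count, proportion, percentage) in table.items():
    print(f"{category}: n={count}, proportion={proportion:.2f}, percentage={percentage:.1f}%")
```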

8

Central Tendency

The value around which most of the data are clustered.

9

Mode

The value that appears most frequently

10

Median

The midpoint in an ordered series of data. Its position in the ordered list is (N+1)/2. For an even number of cases, it is the average of the two middle values.

11

Bimodal

When two categories tie for the highest frequency, so that the distribution has two modes.

12

Mean

The sum of the observed values divided by the number of cases
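
Example: a short sketch of the three measures of central tendency (mode, median, mean) using Python's standard statistics module. The data values are hypothetical.

```python
from statistics import mean, median, multimode

# Hypothetical observations of one variable.
data = [2, 3, 3, 5, 7, 8]

data_mean = mean(data)        # sum of the values divided by the number of cases
data_median = median(data)    # even N, so the average of the two middle values
data_modes = multimode(data)  # list of the most frequent value(s)

# A bimodal example: two values tie for the highest frequency.
bimodal_modes = multimode([1, 1, 2, 2, 3])

print(data_mean, data_median, data_modes, bimodal_modes)
```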

13

Dispersion

The extent to which the data are spread out from their central tendency

14

Range

Indicates the difference between the lowest and highest values of the distribution

15

Quantiles

points taken at regular intervals of an ordered data set that divide the set into equal groups from lowest to highest.

16

Quartiles

4 equal parts

17

Quintiles

5 equal parts

18

Deciles

Ten equal parts

19

Percentiles

100 equal parts
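
Example: a sketch of quantile cut points using statistics.quantiles from the Python standard library. The data are hypothetical, and the exact cut points depend on the interpolation method, so no specific values are claimed here. Dividing into n groups always yields n - 1 cut points.

```python
from statistics import quantiles

# Hypothetical ordered data set of 20 observations.
data = list(range(1, 21))

quartile_cuts = quantiles(data, n=4)    # 3 cut points -> 4 equal groups
quintile_cuts = quantiles(data, n=5)    # 4 cut points -> 5 equal groups
decile_cuts = quantiles(data, n=10)     # 9 cut points -> 10 equal groups

print(quartile_cuts)
print(quintile_cuts)
print(decile_cuts)
```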

20

Variance

A measure of dispersion for interval or ratio level data based on finding the variation around the mean value of the distribution.

It is the average of the squared deviations from the mean of the data.

How far the data is spread from the mean.

21

Standard Deviation

The square root of the variance - since the variance has transformed the data into squared units and we want to report them in original units.
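
Example: a sketch of variance and standard deviation computed by hand and checked against the standard library. It assumes the "average of the squared deviations" definition (dividing by N, the population form); many textbooks divide by N - 1 for samples. The data are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical interval-level data

def variance(values):
    """Average of the squared deviations from the mean (divides by N)."""
    center = mean(values)
    return sum((v - center) ** 2 for v in values) / len(values)

var = variance(data)   # in squared units
sd = sqrt(var)         # back in the original units

print(var, sd)
print(pvariance(data), pstdev(data))  # library equivalents
```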

22

Standard Deviation Advantages (over other measures of dispersion)

it is more stable from sample to sample since it is based on all observations

23

Standard Deviation is most useful when…

Standard deviations are most useful when they are interpreted comparatively, that is, by comparing the dispersion of several clusters of data.

24

Six examples of frequency curve

  1. Bell shaped

  2. U-shaped

  3. Positively Skewed j-curve

  4. Negatively skewed j-curve

  5. Bimodal

  6. Rectangular

25

Bell Shaped

A symmetrical distribution (mean and median are identical and frequencies going toward the right and left tails are identical) - where most of the data (the mode) is centered near the median and mean.

26

U-shaped

A symmetrical distribution where most of the data is spread evenly away from the mean and median (Very few cases are average and most are at either end of the extreme).

27

Positively skewed j-curve

A non-symmetrical distribution with a large number of low scores and a few extremely high scores.

28

Negatively skewed j-curve

A non-symmetrical distribution with a large number of high scores and a few extremely low scores

29

Bimodal

A distribution where the data is clustered at two different points away from the mean. Takes an “m” shape.

30

Rectangular

A symmetrical distribution where the data is spread equally in every category.

31

Normal Curve

A special type of bell curve where the distribution of values bears a direct and known relationship to the size of the standard deviation. A given percentage of cases are within 1, 2, or 3 standard deviations from the mean.

32

Four properties of Normal Curves

  1. Symmetrical and bell-shaped

  2. Mode, median, and mean coincide at the center of the distribution

  3. Curve is based on an infinite number of observations

  4. A fixed proportion of observations lies between the mean and fixed units of standard deviations

33

Normal Distribution

When data is distributed normally the mean divides the data in half. The following holds true:

  1. 68.26% of the data lies within ±1 sd from the mean

  2. 95.44% of the observations fall within ±2 sd from the mean

  3. 99.73% of the observations fall within ±3 sd from the mean
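
Example: a sketch that recomputes these shares from the standard normal curve with statistics.NormalDist. The helper name share_within is made up for illustration. The results agree with the listed percentages up to rounding in the last digit, which varies between printed tables.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of a normal distribution lying within ±k sd of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within ±{k} sd: {100 * share_within(k):.2f}%")
```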

34

Outlier

An observation in a normally distributed data set that lies beyond ±3 standard deviations from the mean (only .27% of observations fall in this category). 

Outliers are sometimes dropped from the data when it is analyzed, since they do not represent the average cases and tend to skew the results.

35

Z-scores

Also known as standardized score; is the number of standard deviations an observation is above or below the mean. A positive score is above the mean, and a negative is below the mean.

Z-scores are used if the data is normally distributed.

36

To compute z-scores

Subtract the mean of the data from the score of a specific observation. Then divide the results by the standard deviation of the data.
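
Example: a minimal sketch of the z-score computation described above. The scores are hypothetical, and the standard deviation is assumed to be the divide-by-N (population) form.

```python
from statistics import mean, pstdev

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data: mean 5, sd 2

def z_score(value, data):
    """Number of standard deviations that value lies above (+) or below (-) the mean."""
    return (value - mean(data)) / pstdev(data)

z_high = z_score(9, scores)  # above the mean, so positive
z_low = z_score(2, scores)   # below the mean, so negative
print(z_high, z_low)
```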

37

Bivariate Relationships

Relationships between two variables.

38

Contingency Tables

aka Crosstabulation. Compares or cross-tabulates two nominal and/or ordinal variables in a table to see if the values of one are contingent on the other.

This helps determine if there is a relationship between the variables.

39

Cross Tabulations

a statistical method used to analyze the relationship between two or more categorical variables by displaying the frequency of their combinations in a table

40

“Percentage” the Table

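Convert the cell counts to percentages within each category of the independent variable, then subtract across the categories of the independent variable to compute the percentage difference.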

41

Difference of Means

Comparing the appropriate central tendency in two groups to look for patterns

42

Tests of Statistical Significance

Determines whether a relationship between variables in a probability sample can be generalized to the population from which the sample was selected.

43

Significance Level

states the probability that a relationship in a probability sample occurred by chance and doesn’t really exist in the population. Frequently symbolized with the letter “p” for probability - “p-value”

44

Probability or “p” value

the significance level. The probability that a relationship in a probability sample occurred by chance.

45

.05 Cutoff Level

95% confidence. The most commonly used cutoff point. If the reported significance level is greater than .05, the relationship is assumed not to be significant. If it is less than .05, the relationship is assumed to be significant.

46

Chi - squared statistic for statistical significance

Used for contingency tables of nominal or ordinal data.

Computed by figuring out the difference between the observed and expected values of the variables in contingency table.

If it is significant at .05 or less, then you can state that the sample relationship in the contingency table is significant. For example, a level of .01 or .001 is significant; a level of .20 or .30 is not.
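
Example: a sketch of the chi-squared computation for a contingency table, comparing observed counts with the counts expected if the two variables were unrelated. The table is hypothetical. The value 3.841 is the standard critical value for the .05 level with 1 degree of freedom; other table sizes need a different critical value.

```python
def chi_squared(observed):
    """Return (chi-squared statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, count in enumerate(row):
            # Expected count if the variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (count - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df

# Hypothetical 2x2 table: rows are DV categories, columns are IV categories.
table = [[30, 10],
         [20, 40]]

statistic, df = chi_squared(table)
CRITICAL_05_DF1 = 3.841  # critical value at the .05 level for df = 1
significant = statistic > CRITICAL_05_DF1
print(statistic, df, significant)
```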

47

Measures of Association 

Allow us to summarize the strength of a relationship in more accurate ways than relying on percentage differences.

48

Strength of Relationships

Extent to which changes in one variable are accompanied by changes in another variable.

49

Yule’s Q

A summary measure for use in a bivariate table with nominal (or ordinal) data that indicates the strength of the relationship.

Is based on the number of cases that show a positive relationship minus the number of cases that show a negative relationship.

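Q = (ad - bc) / (ad + bc), where a, b, c, and d are the four cell counts of a 2x2 table (a and d on one diagonal, b and c on the other).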
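
Example: a minimal sketch of Yule's Q for a 2x2 table. The cell counts are hypothetical, with a and d on one diagonal and b and c on the other.

```python
def yules_q(a, b, c, d):
    """Yule's Q = (ad - bc) / (ad + bc) for the cells of a 2x2 table."""
    return (a * d - b * c) / (a * d + b * c)

# Hypothetical table:
#        IV=1  IV=2
# DV=1    30    10
# DV=2    20    40
q = yules_q(a=30, b=10, c=20, d=40)
print(round(q, 3))
```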

50

Proportional Reduction of Error (PRE)

The amount by which errors in predicting the dependent variable can be reduced by knowing the relationship between the DV and the IV. 

The extent to which we can reduce possible errors in estimating the value of a case if we know its value on a second variable.

PRE = (errors without knowledge of IV - errors with knowledge of IV) / (errors without knowledge of IV)

51

Lambda

Used with two nominal variables or with one nominal and one ordinal variable. Can be used for tables that are larger than two by two. It compares the modal value for each value of the IV.
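
Example: a sketch of lambda as a proportional-reduction-of-error measure. It assumes the table is laid out with IV categories as columns and DV categories as rows, and that errors are counted as cases falling outside the modal DV category. The counts and the function name lambda_measure are hypothetical.

```python
def lambda_measure(table):
    """Lambda for a table with DV categories as rows and IV categories as columns."""
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    # Errors when always guessing the overall modal DV category.
    errors_without_iv = n - max(row_totals)
    # Errors when guessing the modal DV category within each IV category.
    errors_with_iv = sum(sum(col) - max(col) for col in zip(*table))
    if errors_without_iv == 0:
        return 0.0  # nothing to reduce; lambda is uninformative here
    return (errors_without_iv - errors_with_iv) / errors_without_iv

table = [[30, 10],
         [20, 40]]
lam = lambda_measure(table)
print(lam)
```

With these counts, guessing without the IV produces 40 errors and guessing with it produces 30, so knowing the IV reduces errors by one quarter.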

52

Gamma

An ordinal measure of relationships. Measures the number of similarly ordered pairs as a proportion of all relevant pairs. Does not include tied pairs.

Identical to Yule’s Q for a 2x2 table. For ordinal-level data. It is frequently higher than the other measures and can overestimate relationships when there are many tied pairs, since it does not include them.
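
Example: a sketch of gamma computed from similarly ordered (concordant) and oppositely ordered (discordant) pairs, ignoring ties. It assumes both the rows and the columns of the table are listed in order from low to high. The counts are hypothetical.

```python
def gamma(table):
    """Gamma = (C - D) / (C + D) for a table with ordered rows and columns."""
    concordant = 0
    discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:    # higher on both variables
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # higher on one, lower on the other
                        discordant += table[i][j] * table[k][l]
                    # l == j is a tie on the column variable and is skipped
    return (concordant - discordant) / (concordant + discordant)

g_2x2 = gamma([[30, 10], [20, 40]])  # same value as Yule's Q for this table
g_3x3 = gamma([[20, 10, 5], [10, 20, 10], [5, 10, 20]])
print(round(g_2x2, 3), round(g_3x3, 3))
```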

53

Kendall’s Tau

An ordinal measure of relationships. It is based on pairs of cases.

For ordinal-level data; can be used for nominal data when lambda is inappropriate (one DV category very high). Tau-b is for square tables (2x2 or 3x3, for instance) and tau-c is for rectangular ones (2x3, 2x4, or 3x4, for instance). Includes tied pairs (those not on the diagonal) in the denominator and thus gives a more accurate (lowest, most conservative) measure. Tau-c is harder to interpret and can only say which of two tables of similar proportions is stronger.

54

Somers’ D

An ordinal measure of relationship.

Generally for ordinal level data. Only counts pairs tied on the DV in the denominator.  This has the effect of focusing on pairs of cases where the IV actually changes.  Thus it is better for causal analysis. Usually gives a moderate measure between those arrived at by Gamma and Tau.

55

List the Variables:

Dependent (Y), Independent (X), Intervening (Z)

56

Dependent Variable

(Y). The variable we wish to explain or predict; its value is influenced by the Independent Variable

57

Independent Variable

(X) The variable we think causes a change in (has an effect on) the Dependent variable

58

Intervening Variable

(Z) A third variable that can affect the relationship between the independent variable and dependent variable. It is also another independent variable.

59

Multivariate Analysis

Examining a relationship between more than two variables; usually looks at the effect of several Independent variables on one Dependent variable.

60

Control

Examining a relationship between an Independent Variable and a Dependent variable by holding another Independent variable constant.

This attempts to rule out the third variable as an alternative explanation. It may or may not turn out to be an intervening variable depending on the results.

61

Elaboration

Breaking down the original contingency table into two or more tables based on the values of the control variable

62

Contingency Table

Independent Variable across the top; Dependent Variable down the side

63

Gross Effect

The percentage difference in the DV across categories of the IV in the original table, before controlling for any other variable.

64

Net Effect

The percentage difference in the DV across categories of the IV after controlling for the new variable.

65

Replication

If the relationships are basically the same in the different tables, then the control variable is not really an intervening variable.

It does not have an effect on the relationship between the DV and IV.

The original contingency table repeats itself after controlling

66

Spuriousness

There is an association between the variables in the original table which disappears after elaboration.

67

Conditioning

The relationship in the original table is modified so that a different association appears in one or both (or all) sub-groups of the elaborated table

68

Additive Effect

If the modification is about the same for both or all subgroups, then the relationship is considered additive.

They both are a little or a lot higher or both a little or a lot lower.

69

Interaction

If the modification is different for one or more of the subgroups, then the relationship is considered an interaction.

One is higher and one is lower; or one is the same and one is higher; or one is the same and one is lower; or one is a little higher and one is a lot higher; or one is a little lower and one is a lot lower

70

Linear Regression

A method that summarizes the relationship between two variables with a straight line, drawn so that the prediction line “fits” as closely as possible to the actual data.

Is often used when both the DV and IV are measured at the interval/ratio level but can also be used without much modification when variables are at the ordinal level.

71

Scatterplot

A graph constructed by plotting the values from 2 interval variables along an X and Y axis.

Allows us to visualize the relationship between the IV and DV. If the dots fall about randomly then there is not much of a relationship.

72

Criterion of Least Squares

The object of regression is to draw a line that fits the data “best” and then rely on the formula for that line to predict and explain other values. 

We draw the line that minimizes the sum of the squares of the deviations of the observed values of y from those values of y predicted by the regression line.
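
Example: a sketch of the least-squares criterion for one IV, using the standard closed-form solution for the line Y = a + bX. The data points and function names are hypothetical.

```python
from statistics import mean

def least_squares(xs, ys):
    """Return (a, b) for the line Y = a + bX that minimizes the sum of squared deviations."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx          # slope
    a = y_bar - b * x_bar  # y-intercept
    return a, b

def predict(a, b, x):
    """Predicted value of the DV for a given value of the IV."""
    return a + b * x

xs = [1, 2, 3, 4, 5]  # hypothetical IV values
ys = [2, 4, 5, 4, 5]  # hypothetical DV values
a, b = least_squares(xs, ys)
print(a, b, predict(a, b, 6))
```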

73

Regression Line

When a researcher collects interval/ratio data, this can be computed to see if the IV has an impact on the DV

74

Regression line equation

Y = a + bX

Y = values of DV

X = values of IV

75

Regression constant “a” - y intercept

“a” is the constant or y-intercept (when x=0 then y=a)- that is, the value of the DV when the independent variable is 0

76

Regression coefficient “b” slope

A 1 unit rise in the IV results in a “b” unit change in the DV. The slope can be positive or negative, indicating the direction of the relationship.

Slope also tells how much of a change in one variable is associated with the other.

77

Prediction Line

once a regression equation is computed, it can be used to predict values of the DV by inserting different values for the IV.

78

Pearson’s R Correlation Coefficient

A measure of closeness of fit that indicates the strength of a relationship between interval/ratio level data.

A summary measure that tells how likely it is that an estimate of a value of the DV will be more accurate given information about the IV.

Runs from -1 to +1. The closer the cases fit the regression line, the closer r will be to -1 or +1.

79

R-squared

A PRE measure or goodness of fit that tells the extent to which variation in the Y (the DV) is explained by variation in the X (the IV).

Tells us to what extent knowing X enables us to predict Y.

Runs from 0 to 1. The higher the number the more variance explained and the better the fit.

Is computed by multiplying Pearson’s r by itself (r x r)
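
Example: a sketch of Pearson's r and R-squared for one IV and one DV, computed from deviations around the means. The data are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two interval/ratio variables."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

xs = [1, 2, 3, 4, 5]  # hypothetical IV values
ys = [2, 4, 5, 4, 5]  # hypothetical DV values

r = pearson_r(xs, ys)
r_squared = r * r  # share of the variation in Y explained by X
print(round(r, 4), round(r_squared, 4))
```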

80

T-statistic

Sometimes researchers compute this to determine statistical significance for interval/ratio data.

Indicates the probability that Pearson’s correlation coefficient is not zero in the population.

Commonly used in regression to say whether or not the relationship under study occurred by chance.

81

Multiple Regression

Using regression to explain the variation in a dependent variable with two or more independent variables. 

Allows the researcher to isolate the effect of each IV included in the model while controlling for the effects of the other IVs

Y = a + b1X1 + b2X2 + b3X3 + ... + bnXn

82

Partial Regression Coefficient

each “b” in Multiple Regression equation. There are several of them in one equation and each accounts for only part of the overall change in the DV.

83

F-statistic

Statistical significance of the overall regression model.

Will usually be reported with asterisks indicating the level of statistical significance.

No asterisk at all would mean that the model overall is not statistically significant

84

Beta Coefficient

Regression coefficients that have been standardized.

Is interpreted as how many standard deviations of change are caused in the DV by a one standard deviation change in the IV.

85

Adjusted R-squared

Smaller than regular R-squared.

Gives total amount of variance explained in DV by all the IV, but it takes into account the number of IV in the model and the sample size.

Is considered more reliable and conservative measure of total amount of variance explained.
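
Example: a sketch of the usual adjustment formula, 1 - (1 - R²)(n - 1) / (n - k - 1), where n is the sample size and k is the number of IVs. The input values are hypothetical.

```python
def adjusted_r_squared(r_squared, n, k):
    """Adjust R-squared for sample size n and number of independent variables k."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical model: R-squared of .60 with 3 IVs and 50 cases.
adj = adjusted_r_squared(r_squared=0.60, n=50, k=3)
print(round(adj, 4))
```

Adding IVs or shrinking the sample lowers the adjusted value, which is why it is treated as the more conservative figure.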

86

Regression with Dummy Variables

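Regression analysis can still be used when an IV is measured at the nominal level, by recoding it as one or more dummy variables coded 0 or 1.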

87

Logistic Regression

Regression analysis when the DV is measured at the nominal level and has only two categories.