Descriptive Stats & Statistical Inference

103 Terms

1
New cards

What are categorical/qualitative variables?

Assigned numerals that represent categories (ranked or unranked)

2
New cards

What kind of variables are limited statistically?

Categorical/qualitative variables

3
New cards

What are the different types of categorical/qualitative variables?

  • Nominal

  • Ordinal

4
New cards

What are nominal variables?

  • A type of categorical/qualitative variable

  • Entities divided into 2+ distinct UNRANKED categories

5
New cards

The following is an example of what kind of variable?

Type of hip replacement surgery (0 = anterior, 1 = posterior)

Nominal

6
New cards

What is an ordinal variable?

  • A type of categorical/qualitative variable

  • 2+ ORDERED/RANKED categories

7
New cards

The following is an example of what kind of variable?

Self rated health (1 = poor, 2 = fair, 3 = good, 4 = very good)

Ordinal

8
New cards

What is an important thing to do when working with categorical/qualitative variables?

Distinguish actual variable from levels of the variable

9
New cards

What are numeric quantitative variables?

Numbers that represent an amount/quantity of an entity and not a category or ranking

10
New cards

What are the different types of numeric/quantitative variables?

  • Discrete

  • Continuous

11
New cards

What is a discrete variable?

  • A type of numeric/quantitative variable

  • Variables that can only take on specific values within a given range (whole integer units)

12
New cards

The following is an example of what variable?

Number of tibial fractures

Discrete variable

13
New cards

What is a continuous variable?

  • A type of numeric quantitative variable

  • Variable whose scores can, in theory, fall anywhere along a continuum but are constrained by the precision of the measuring instrument

14
New cards

The following is an example of what variable?

Distance walked (in meters) during a 6MWT

Continuous variable

15
New cards

Define parametric testing

Testing that assumes the data are normally distributed (a bell-shaped pattern)

16
New cards

Which type of testing is more powerful, parametric or nonparametric testing?

Parametric testing

17
New cards

Parametric testing is used with what kind of data with what kind of distribution?

Quantitative data w/ a normal distribution

18
New cards

What kind of testing assumes normality?

Parametric testing

19
New cards

What kind of testing does not assume normality?

Nonparametric testing

20
New cards

What kind of data is used in nonparametric testing?

  • Categorical data

  • Quantitative data that don’t follow a normal distribution

21
New cards

What kind of graphs could you use when you have continuous/quantitative data?

  • Boxplot

  • Error bar chart

  • Histogram

  • Q-Q/P-P plot

  • Stem-and-leaf plot

22
New cards

What kind of stat procedure can you use for either continuous/quantitative or categorical variables?

Frequency table
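As a rough illustration of a frequency table, the sketch below counts how often each level of a variable occurs and converts the counts to percentages. The `ratings` data are made up for the example (hypothetical self-rated health scores), and the variable names are not from the course material.

```python
from collections import Counter

# Hypothetical self-rated health scores (1 = poor ... 4 = very good)
ratings = [3, 4, 2, 3, 3, 1, 4, 3, 2, 4]

counts = Counter(ratings)
n = len(ratings)

# Each row: (level, frequency, percent of the sample)
freq_table = [(level, counts[level], 100 * counts[level] / n)
              for level in sorted(counts)]

for level, freq, pct in freq_table:
    print(f"level {level}: n = {freq} ({pct:.0f}%)")
```

The same approach works for a discrete quantitative variable; a continuous variable would usually be binned first.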

23
New cards

What kind of stat procedures can you use for continuous/quantitative variables?

  • Descriptive stats

    • N

    • Mean

    • SD

    • 5-number summary

      • Min

      • 1st quartile

      • Median

      • 3rd quartile

      • Max

  • 1 Sample t-test

  • Freq table

24
New cards

What kind of graphs can you use when you have categorical variables?

  • Bar chart

  • Pie chart

25
New cards

What kind of stat procedures can you use if you have categorical variables?

  • Freq table

  • Binomial test

  • Chi-sq goodness of fit

26
New cards

What do descriptive stats characterize?

  • Shape/distribution within a set of data

  • Central tendency within a set of data

  • Dispersion/variability within a set of data

27
New cards

What is the first step in the data analysis process?

Checking shape/distribution of data

28
New cards

Why do you need to go through the data analysis process?

  • Gives us a sense of the data

  • Informs decisions about normality and parametric vs. nonparametric testing

29
New cards

What are some visual checks you can do to examine distribution and evaluate normality?

  • Histograms

  • Stem-and-leaf plots

  • P-P plots

  • Q-Q plots

30
New cards

What needs to be done prior to any further statistical analysis with quantitative variables?

Checking for normality

31
New cards

What are some numeric checks for examining distribution and evaluating normality?

  • Frequency tables

  • Values for skewness and kurtosis

  • Statistical tests for normality

    • K-S test

    • S-W test

32
New cards

What is a histogram?

A graph plotting values of observations on the horizontal axis with a bar showing how many times each value occurred in the data set

33
New cards

What is on the x axis of a histogram?

Values of observations

34
New cards

What is on the y-axis of a histogram?

Frequency

35
New cards

What does a skew mean?

The distribution is asymmetrical and the normality assumption cannot be met

36
New cards

What is a positive skew?

When scores are bunched at low values w/ the tail pointing to high values

37
New cards

What is a negative skew?

When scores are bunched at high values with the tail pointing to low values

38
New cards

What is kurtosis?

The “peakedness” and degree to which scores cluster at tails

39
New cards

What is positive kurtosis/leptokurtic?

When a histogram is too peaked and has very long (heavy) tails

40
New cards

What is negative kurtosis/platykurtic?

When a histogram is too flat and has short tails

41
New cards

What is a Q-Q plot?

A quantile-quantile plot: the quantiles of a variable plotted against the quantiles of a theoretical distribution

42
New cards

What is on the y-axis of a Q-Q plot?

The expected data in SPSS

43
New cards

What is on the x-axis on a Q-Q plot?

The observed data in SPSS

44
New cards

What would indicate data normality on a Q-Q plot?

The data falls along a straight line on a plot
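To make the idea concrete, here is a minimal sketch of how the points of a normal Q-Q plot could be computed by hand. The data are hypothetical, and the plotting positions `(i - 0.5) / n` are an assumption chosen for simplicity; statistical packages such as SPSS may use a slightly different formula, so their expected values can differ a little.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical sample of a continuous variable
data = [4.1, 5.0, 5.2, 5.9, 6.3, 6.8, 7.4, 8.0]
n = len(data)

# Normal distribution with the same mean and SD as the sample
fitted = NormalDist(mean(data), stdev(data))

# Observed quantiles are simply the sorted scores
observed = sorted(data)

# Expected quantiles: what a normal distribution predicts at each rank
# (plotting position (i - 0.5) / n is an assumed convention)
expected = [fitted.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]

# One (observed, expected) pair per point; points near the line
# expected == observed suggest approximate normality
qq_points = list(zip(observed, expected))

for obs, exp in qq_points:
    print(f"observed {obs:5.2f}   expected {exp:5.2f}")
```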

45
New cards

What does kurtosis look like on a Q-Q plot?

The data is S-shaped around the line (the tails depart from the line in opposite directions)

46
New cards

What does skewness look like on a Q-Q plot?

The data bows away from the line in a single curve, sagging consistently above or below it

47
New cards

Positive skewness/kurtosis values indicate…

Positive skew/kurtosis

48
New cards

Negative skewness/kurtosis values indicate…

Negative skew/kurtosis

49
New cards

The further skewness/kurtosis values are from 0, the more…

Skew/kurtosis there is

50
New cards

How do you convert a skewness/kurtosis value to a z-score?

Divide the skewness/kurtosis value by its standard error

51
New cards

If the absolute value of a skewness/kurtosis z-score is greater than _____, then it is significantly different from 0 and the data is NOT normally distributed

1.96

52
New cards

The absolute value of a skewness/kurtosis z-score has to be under what number for the data to be considered normally distributed?

Under 1.96
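The conversion described in the last few cards can be sketched as follows. The skewness statistic and its standard error use the common small-sample formulas (the ones I understand SPSS to report); treat that choice, the function name `skewness_z`, and both data sets as assumptions made for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def skewness_z(data):
    """Return (skewness, standard error, z-score) for a sample."""
    n = len(data)
    m, s = mean(data), stdev(data)
    # Adjusted sample skewness
    skew = n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in data)
    # Standard error of skewness depends only on the sample size
    se = sqrt(6 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))
    return skew, se, skew / se

# Hypothetical data sets
symmetric = [1, 2, 3, 4, 5, 6, 7, 8, 9]
right_skewed = [1, 1, 1, 2, 2, 2, 3, 3, 4, 20]

for name, sample in [("symmetric", symmetric), ("right_skewed", right_skewed)]:
    skew, se, z = skewness_z(sample)
    verdict = "significant skew" if abs(z) > 1.96 else "no significant skew"
    print(f"{name}: skew = {skew:.2f}, SE = {se:.2f}, z = {z:.2f} -> {verdict}")
```

The same divide-by-standard-error step applies to kurtosis, with its own standard error.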

53
New cards

What is the most common way to check for normality?

  • Kolmogorov-Smirnov (K-S) test

and

  • Shapiro-Wilk (S-W) test

54
New cards

When running a K-S or S-W test, what does p need to be less than to be considered significantly different from a normal distribution?

Less than .05

55
New cards

When running a K-S or S-W test, what does p need to be greater than to indicate that the data is normally distributed?

Greater than .05

56
New cards

The S-W test is better for studies with what sample size?

Fewer than 50 observations

57
New cards

What are 3 common measures of where the center of a distribution lies?

  • Mode

  • Median

  • Mean

58
New cards

What is the mode?

The most frequent score

59
New cards

What is the mode useful for?

Summarizing categorical data

60
New cards

What is a downside of the mode?

It can take on several values

61
New cards

What is the median?

The middle score when you order the data

62
New cards

What is the median useful for?

  • Ordinal data

  • Quantitative data

63
New cards

The median is preferred over what other method of central tendency when working with skewed distributions or data with outliers?

Mean

64
New cards

What is the mean?

The average score

65
New cards

What is a downside of using the mean?

Not useful for categorical data or for quantitative data that is skewed or has outliers
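A small sketch of the three measures of central tendency, using made-up scores. It also shows why the median is preferred when there is an outlier, and how the mode can take on several values.

```python
from statistics import mean, median, multimode

# Hypothetical scores, then the same scores with one extreme outlier
scores = [2, 3, 3, 4, 5]
with_outlier = scores + [40]

print(mean(scores), median(scores), multimode(scores))
print(mean(with_outlier), median(with_outlier), multimode(with_outlier))

# The mode works for categorical data, but it is not always unique
surgery_type = ["anterior", "posterior", "anterior", "posterior"]
print(multimode(surgery_type))
```

Adding the outlier pulls the mean from 3.4 up to 9.5, while the median only moves from 3 to 3.5.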

66
New cards

List the measures of dispersion/variability in a distribution

  • Range

  • Quantiles

  • Variance

  • Standard deviation

  • Coefficient of variation

67
New cards

What is the range?

The largest score minus the smallest score

68
New cards

What are quantiles?

Values that split the ordered data into equal-sized portions

69
New cards

What is the interquartile range?

Upper (3rd) quartile minus lower (1st) quartile

70
New cards

What are all the parts of a 5-point/number summary?

  • Minimum

  • 25th percentile (lower quartile)

  • 50th percentile (median)

  • 75th percentile (upper quartile)

  • Max
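The range, interquartile range, and 5-number summary can be computed along these lines. The data are hypothetical, and the `"inclusive"` quartile method is an assumption: software packages define quartiles in slightly different ways, so results may not match SPSS output exactly.

```python
from statistics import quantiles

# Hypothetical ordered scores
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

# Quartiles split the ordered data into four equal parts
q1, q2, q3 = quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
data_range = max(data) - min(data)   # largest score minus smallest score
iqr = q3 - q1                        # upper quartile minus lower quartile

print("5-number summary:", five_number_summary)
print("range:", data_range, "IQR:", iqr)
```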

71
New cards

Define deviance

How different a score is from the center of a distribution

72
New cards

What does the sum of squares (SS), or sum of squared errors, represent?

The total dispersion or total deviance of scores from the mean

73
New cards

What is the sum of squares (SS), or sum of squared errors, dependent on?

The number of scores in the data

74
New cards

What is the term for the average dispersion?

Variance

75
New cards

Variance gives us a value in units ____

Squared

76
New cards

If you have variance, how can you get to SD?

By square rooting the variance

77
New cards

If you have SD, how can you get to variance?

Squaring SD

78
New cards

What is the coefficient of variation (CV)?

The ratio of standard deviation to mean expressed as a percentage

79
New cards

The coefficient of variation (CV) has no units. True or false?

True

80
New cards

Why is it helpful that the coefficient of variation (CV) does not have any units?

It allows us to compare distributions from different samples that may have different means or units
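The chain from deviance to sum of squares, variance, SD, and CV can be traced step by step. This is a sketch on made-up data; it assumes the sample formula for variance (dividing by N - 1), which is what most statistics packages report.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical scores
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
m = mean(data)

deviances = [x - m for x in data]      # how far each score is from the mean
ss = sum(d ** 2 for d in deviances)    # sum of squares: total dispersion
var = ss / (n - 1)                     # variance: average dispersion (units squared)
sd = sqrt(var)                         # SD: back in the original units
cv = 100 * sd / m                      # CV: SD as a percentage of the mean (unitless)

print(f"SS = {ss}, variance = {var:.2f}, SD = {sd:.2f}, CV = {cv:.1f}%")
```

Because SS grows with every added score, dividing by N - 1 is what makes the variance comparable across samples of different sizes.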

81
New cards

In a boxplot, any points outside of ____ are considered outliers

Whiskers

82
New cards

Statistical inference is based on what assumption?

That sample data represents population characteristics

83
New cards

The assumption for statistical inference is based on what concepts?

  • Probability (the likelihood that an event will occur given all the possible outcomes)

  • Sampling error (extent to which a statistic varies in samples taken from the same population)

84
New cards

What percentage of the population is within 1 SD of the mean?

68.26%

85
New cards

What percentage of the population is within 2 SD?

  • 95.44%

    • 95% within 1.96 SD

86
New cards

What percentage of a population is within 3 SD?

99.74%
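These percentages assume a normal distribution, and they can be checked against the standard normal curve. The sketch below does that with Python's `NormalDist`; the helper name `within` is just for illustration. The computed values agree with the card answers to within the rounding of printed z-tables.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, SD 1

def within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 1.96, 2, 3):
    print(f"within {k} SD: {100 * within(k):.2f}%")
```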

87
New cards

What does a z-score represent?

How many SDs a score is away from the mean

88
New cards

What is a sampling error?

The difference/deviance between each sample mean and the population mean

89
New cards

Do you want a large or a small sampling error?

Small

90
New cards

What is the standard error of the mean (SEM)?

The SD of the sampling distribution of the mean (the distribution of sample means)

91
New cards

What does the standard error of the mean (SEM) indicate?

How close a sample mean is to the true population mean

92
New cards

What is a confidence interval?

The range within which we believe the true population parameter lies

93
New cards

What is the z-score for a 95% CI?

1.96
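Putting the SEM and the z-score of 1.96 together gives a 95% confidence interval for the mean. The sketch below uses hypothetical 6MWT distances and assumes the z-based interval (mean ± 1.96 × SEM); with a sample this small, a t-based interval would normally be somewhat wider.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Hypothetical 6MWT distances in meters
distances = [412, 455, 389, 501, 470, 433, 398, 462, 445, 420]
n = len(distances)

sample_mean = mean(distances)
sem = stdev(distances) / sqrt(n)        # standard error of the mean

# z for a 95% CI: leaves 2.5% in each tail (about 1.96)
z_95 = NormalDist().inv_cdf(0.975)

ci_lower = sample_mean - z_95 * sem
ci_upper = sample_mean + z_95 * sem

print(f"mean = {sample_mean:.1f} m, SEM = {sem:.1f} m")
print(f"95% CI: {ci_lower:.1f} to {ci_upper:.1f} m")
```

A smaller SEM (less variability or a larger sample) gives a narrower interval around the sample mean.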

94
New cards

Statistical tests are based on the _____ hypothesis

Null

95
New cards

If p is less than or equal to 0.05, do you reject the null and accept the alternative or accept the null and reject the alternative?

  • Reject null

  • Accept alternative

96
New cards

If p is greater than 0.05, do we fail to reject the null or alternative?

Fail to reject the null

97
New cards

What is a type I error?

When you reject the null but it’s actually true

98
New cards

Is alpha or beta the probability of making a type I error?

Alpha

99
New cards

What is a type II error?

When you fail to reject the null when the null is actually false

100
New cards

Is alpha or beta the probability of making a type II error?

Beta
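The meaning of alpha can be illustrated with a small simulation: when the null hypothesis is true and we reject whenever p ≤ 0.05, about 5% of studies end in a type I error. Everything here is an assumption made for the demonstration: a one-sample z-test with a known population SD, a hypothetical population (mean 100, SD 15), and the helper name `z_test_p`.

```python
import random
from math import sqrt
from statistics import NormalDist, mean

ALPHA = 0.05
std_normal = NormalDist()

def z_test_p(sample, mu0, sigma):
    """Two-sided p-value for a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - std_normal.cdf(abs(z)))

random.seed(0)  # fixed seed so the run is repeatable
trials = 2000
rejections = 0

for _ in range(trials):
    # The null is TRUE here: the population mean really is 100
    sample = [random.gauss(100, 15) for _ in range(30)]
    if z_test_p(sample, mu0=100, sigma=15) <= ALPHA:
        rejections += 1  # rejecting a true null = type I error

type_1_rate = rejections / trials
print(f"type I error rate: {type_1_rate:.3f}")
```

Estimating beta would need the opposite setup: simulate samples from a population where the null is false and count how often the test fails to reject.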