Stats Final Definitions/Concepts

Last updated 5:07 AM on 4/10/26
186 Terms

1

Statistics

Science dealing with the collection, analysis, interpretation, and presentation of numerical data

2

Descriptive statistics

Using data gathered on a group to reach conclusions about the same group

3

Inferential statistics

Using data gathered on a group to reach conclusions about the population

4

Population

Collection of persons, objects, or items of interest

5

Census

Gathering data from the entire population

6

Sample

A portion of the population that represents the entire population

7

Parameter

A descriptive measure of the population

8

Statistic

Descriptive measure of the sample

9

Variable

Characteristic of any entity being studied that is capable of taking on different values

10

Measurement

Occurs when a standard process is used to assign numbers to particular attributes or characteristics of a variable

11

Data

Measurement that is recorded and stored

12

Nominal

Data used only to classify or categorize

-no value statement

-no order

13

Ordinal

Data that is used to order/rank items

-no value statement

14

Interval

Data that can be ranked, where the distance between consecutive values has meaning

15

Ratio

Data that can be ranked, where the distance between consecutive values has meaning; additionally, zero means the absence of the characteristic being measured

16

Big data

Large amount of either organized or unorganized data from different sources that is difficult to process

17

Variety

Different forms of data

18

Velocity

Speed at which data is available and can be processed

19

Veracity

Quality and accuracy of the data

20

Volume

Size of the data

21

Data Mining

Process of collecting, exploring, and analyzing large volumes of data in an effort to uncover hidden patterns/relationships

22

Data visualization

Study of the visual representation of data, employed to convey data or information by imparting it as visual objects displayed in graphs

23

Ungrouped data

Raw data or data that has not been summarized

24

Grouped data

Data that has been organized into a frequency distribution

25

Frequency distribution

Summary of data presented in the form of class intervals and frequencies

26

Range

The difference between the largest and smallest values in a set of numbers

-used to choose the class width for a frequency distribution, which generally has between 5 and 15 classes

27

Class midpoint

Value halfway across a class interval

28

Relative Frequency

Proportion of the total frequency that is in any given class interval in a frequency distribution

= Individual class frequency/Total frequency (proportion of the total that the individual class makes up)

29

Cumulative frequency

Running total of frequencies through the classes of a frequency distribution
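
A small sketch of how relative and cumulative frequencies could be computed from a frequency distribution. The class intervals and counts below are made-up illustration values, not data from these notes.

```python
from itertools import accumulate

# Hypothetical class intervals and frequencies (illustration only).
classes = ["0-10", "10-20", "20-30", "30-40"]
frequencies = [4, 6, 8, 2]

total = sum(frequencies)

# Relative frequency = individual class frequency / total frequency.
relative = [f / total for f in frequencies]

# Cumulative frequency = running total of the class frequencies.
cumulative = list(accumulate(frequencies))

for c, f, r, cf in zip(classes, frequencies, relative, cumulative):
    print(f"{c:>6}  freq={f:<2}  relative={r:.2f}  cumulative={cf}")
```

The relative frequencies should sum to 1, and the last cumulative frequency should equal the total frequency.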

30

Histogram

Vertical bar chart constructed by graphing segments for frequencies

-frequency on Y axis

-classes on X axis

31

Frequency Polygon

Graphical display of class frequencies

-line graph that connects class midpoints

32

Ogive

Line graph connecting the cumulative frequency of class endpoints

33

Stem and Leaf Plot

Consists of stems (leftmost digits) in the first column and leaves (rightmost digit) coming out of the stems

-Stems ordered with the lowest value at the top

-Leaves ordered with the lowest value at the left

34

Pie chart

Circular depiction of data where area of the whole pie represents 100%

35

Bar chart

Chart containing two or more categories along one axis and bars along the other

36

Pareto chart

A vertical bar chart with the categories graphed in descending order of frequency (highest value on the left)

-often includes a cumulative frequency line

-80/20 rule

37

Cross Tabulation

Process for producing a two dimensional table, displaying frequency counts for two variables

38

Scatter plot

Two dimensional plot of pairs of points from two variables

39

Time series

Data gathered on a given characteristic over a period of time at regular intervals

40

Measures of central tendency

One type of measure that is used to yield information about the center of a group of numbers

-Mean, Median, Mode

41

Mean

Average of a group of numbers

42

Median

Middle value in an ordered array of numbers

-the (N+1)/2 term

43

Mode

The most frequently occurring value in a set of data

44

Bimodal

Data set that has two modes

45

Multimodal

Data set that has more than two modes

46

Percentiles

Measures of central tendency that divide a group into 100 parts

nth percentile means at least n% of the data is below that value

-always rounds down

47

Average of the Ith and (I+1)th numbers

When calculating percentile, if I is a whole number, what do you do to find location of the percentile?

48

The number located at the whole number part of I+1

When calculating percentile, if I is not a whole number, what do you do to find location of the percentile?
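
The two location rules on these percentile cards can be sketched in code. This assumes the usual textbook definition I = (P/100)*N, where P is the percentile and N is the number of values; that formula is not written out on the cards. The function name `percentile` and the sample data are illustrative only.

```python
def percentile(values, p):
    """Return the p-th percentile (integer p with 0 < p < 100) by the location rule."""
    ordered = sorted(values)
    n = len(ordered)
    # I = p * n / 100, split into its whole part and a remainder.
    whole, remainder = divmod(p * n, 100)
    if remainder == 0:
        # I is a whole number: average the I-th and (I+1)-th values (1-based).
        return (ordered[whole - 1] + ordered[whole]) / 2
    # I is not a whole number: use the value at the whole number part of I + 1.
    # In 0-based indexing, 1-based position (whole + 1) is index `whole`.
    return ordered[whole]


data = [14, 12, 19, 23, 5, 13, 28, 17]  # hypothetical data set, N = 8
print(percentile(data, 30))  # I = 2.4, so the 3rd ordered value
print(percentile(data, 25))  # I = 2, so the average of the 2nd and 3rd values
```

Other software may use different percentile conventions (for example, linear interpolation), so results can differ slightly from this rule.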

49

Quartiles

Measures of central tendency that divide a group of data into four parts

Q1 = 25th percentile

Q2 = Median

Q3 = 75th percentile

50

Measures of variability

Statistics that describe the spread or dispersion of a set of data

51

Interquartile range

Q3 - Q1

52

68%, 95%, 99.7%

The empirical rule states that if data is normally distributed, (blank)% of data is within 1 standard deviation of the mean, (blank)% of data is within 2 standard deviations, and (blank)% of data is within 3 standard deviations

53

1 - 1/K²

Chebyshev’s Theorem states that at least (blank) proportion of values will fall within K standard deviations of the mean (for K > 1)

-works regardless of shape of distribution
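
As a quick check of the 1 - 1/K² bound, the sketch below evaluates it for a few values of K. The function name `chebyshev_lower_bound` is made up for this example.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem is only informative for k > 1")
    return 1 - 1 / k**2


for k in (2, 3, 4):
    print(f"k={k}: at least {chebyshev_lower_bound(k):.4f} of the values")
```

For K = 2 the bound is 0.75, which is looser than the 95% given by the empirical rule because Chebyshev's theorem makes no assumption about the shape of the distribution.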

54

Z score

The number of standard deviations by which a value is above or below the mean of a set of numbers, when the data is normally distributed
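
A minimal sketch of the z score calculation, z = (x - mean) / standard deviation. The numbers are hypothetical.

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev


print(z_score(70, mean=50, std_dev=10))  # 2 standard deviations above the mean
print(z_score(35, mean=50, std_dev=10))  # 1.5 standard deviations below the mean
```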

55

Skewness

The degree of asymmetry of a distribution around its mean

-left skewed means the long tail is on the left (right skewed means the long tail is on the right)

-Left skewed: mean < median < mode

-Right skewed: mode < median < mean

-Symmetrical: mean, median, and mode all in the middle

56

Box and Whisker plot

Diagram with the interquartile range (Q1 to Q3) as the box

Inner fences at Q1 - 1.5*IQR and Q3 + 1.5*IQR

Outer fences at Q1 - 3*IQR and Q3 + 3*IQR

-values between the inner and outer fences are mild outliers

-values beyond the outer fences are extreme outliers

-if the median in the box is to the right, skewed left
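
The fence rules for outliers can be sketched as follows, taking the inner fences at 1.5*IQR and the outer fences at 3*IQR beyond the quartiles. The function names and the quartile values are illustrative assumptions.

```python
def fences(q1, q3):
    """Return the (lower, upper) inner and outer fences for a box and whisker plot."""
    iqr = q3 - q1
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer


def classify(x, q1, q3):
    """Label a value as an extreme outlier, a mild outlier, or not an outlier."""
    inner, outer = fences(q1, q3)
    if x < outer[0] or x > outer[1]:
        return "extreme outlier"
    if x < inner[0] or x > inner[1]:
        return "mild outlier"
    return "not an outlier"


# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10.
for value in (15, 40, 60):
    print(value, classify(value, q1=10, q3=20))
```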

57

Classical method (probability)

Assigning probability based on laws or rules (number of times event occurs/total number of outcomes)

58

Relative frequency of occurrence method (probability)

Probability based on historical data (number of times event occurred/number of times it could have occurred)

59

Subjective method (probability)

Probability based on feelings or insight

60

Experiment

Process that produces outcomes

61

Event

Outcome of an experiment

-Broken down furthest into elementary events

62

Sample space

Complete roster or listing of all elementary events of an experiment

-can be denoted using set notation

63

Union

Combination of all the numbers that are in either of two sets (X or Y)

-numbers don’t get repeated when listing them

64

Intersection

Numbers that are common to both sets

65

Mutually exclusive events

Events such that the occurrence of one means the other cannot occur

Ex. Making a shot vs missing a shot

66

Independent events

Events such that the occurrence of one has no effect on the occurrence of the other

67

Collectively exhaustive events

A list of events that together contains all possible elementary events of an experiment

-The entire sample space

68

Complement

An event that comprises all the elementary events not in a given event A

-Denoted A’

-P(A’) = 1 - P(A)

69

m*n counting rule

When one operation can be done m ways and a second operation can be done n ways, what rule gives the total number of possible combinations?

Ex. A cake that comes in 5 flavours and 5 sizes has 5*5 = 25 possible combinations

70

N^n

When sampling with replacement, how many different possibles can occur?

-where N is population size and n is sample size

71

N!/(n!(N-n)!)

When sampling without replacement, how many possibles can occur?

-where N is population size and n is sample size

72

n!/(n-r)!

When sampling where order matters, how many possible permutations are there?
-where n is the population size and r is the sample size
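
The counting rules on the last four cards can be checked with Python's standard `math` module. The population size of 5 and sample size of 2 are arbitrary illustration values.

```python
from math import comb, perm

# m*n rule: 5 cake flavours and 5 sizes.
flavours, sizes = 5, 5
print("m*n rule:", flavours * sizes)

N, n = 5, 2  # hypothetical population size and sample size

# Sampling with replacement: N^n possible samples.
print("with replacement:", N**n)

# Sampling without replacement, order ignored: N! / (n!(N-n)!) combinations.
print("combinations:", comb(N, n))

# Sampling where order matters: N! / (N-n)! permutations.
print("permutations:", perm(N, n))
```

`math.comb` and `math.perm` require Python 3.8 or later.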

73

Random variable

A variable that contains the outcome of a chance experiment

74

Discrete variable

A random variable whose set of possible values is finite or countably infinite

75

Continuous variable

A random variable that has values at every point over a given interval

76

Binomial distribution

Discrete distribution with only 2 possible outcomes in a given trial (ex. success, failure)

-Assumption: Replacement/independence

77

n < 5% N

When sampling without replacement, you can still use the binomial distribution if what rule regarding n (sample size) and N (population size) holds?

78

Number of trials, Number of successes desired, Probability of success, Probability of failure (n, x, p, q)

What information do you need to solve a binomial problem using the binomial formula?
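
With n, x, p, and q in hand, the binomial formula is P(x) = C(n, x) * p^x * q^(n-x). A minimal sketch follows; the function name `binomial_probability` and the coin-flip numbers are illustrative.

```python
from math import comb


def binomial_probability(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q**(n - x)


# Hypothetical example: exactly 2 heads in 4 fair coin flips.
print(binomial_probability(n=4, x=2, p=0.5))

# The probabilities over all possible x should sum to 1.
print(sum(binomial_probability(4, x, 0.5) for x in range(5)))
```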

79

Normal distribution (Z)

-Continuous distribution

-Symmetrical about the mean

-Asymptotic (doesn’t touch horizontal axis)

-Unimodal

-Family of curves

-Area under the curve = 1

80

np > 5 and nq > 5

If (this condition) is met, we can use the Z distribution to solve binomial problems, after applying a correction factor

81

+0.50, -0.50, -0.50, +0.50

When using Z distribution to solve binomial problems, what is the correction factor for solving for:
X >

X >=

X <

X <=
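
The correction factors above can be sketched with the standard library's `statistics.NormalDist`, using mean = n*p and standard deviation = sqrt(n*p*q) for the approximating normal curve. The function name `normal_approx` and the example numbers are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Correction for continuity, keyed by the type of inequality.
CORRECTION = {">": +0.5, ">=": -0.5, "<": -0.5, "<=": +0.5}


def normal_approx(n, p, op, x):
    """Approximate a binomial probability P(X op x) with the normal distribution."""
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("need np > 5 and nq > 5 for the normal approximation")
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    boundary = x + CORRECTION[op]
    below = dist.cdf(boundary)  # area to the left of the corrected boundary
    return 1 - below if op in (">", ">=") else below


# Hypothetical example: n = 60 trials, p = 0.30, probability of 25 or more successes.
print(round(normal_approx(60, 0.30, ">=", 25), 4))
```

Note that X > 24 and X >= 25 lead to the same corrected boundary (24.5), as expected for a discrete variable.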

82

Frame

A list, map, directory, or any source that can be used to represent a population

-can be overregistered or underregistered

83

Random sampling

Sampling in which every unit of the population has the same probability of being selected

84

Simple random sampling

The most elementary of the random sampling techniques, using a random number generator to pick items

85

Stratified random sampling

Random sampling in which the population is divided into various strata (ex. age), then items are picked from each stratum

-can be proportionate (pick so sample reflects the proportions of each strata in the population) or disproportionate

86

Systematic sampling

A random sampling technique in which every kth item or person in a randomized list is selected

-where k = N/n
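
A rough sketch of systematic sampling with k = N/n (rounded down). Picking a random starting point within the first k items is a common convention and an assumption here; the card itself only says every kth item is taken. The function name `systematic_sample` is made up.

```python
import random


def systematic_sample(population, n, seed=None):
    """Select n items by taking every k-th item, where k = N // n."""
    N = len(population)
    if not 0 < n <= N:
        raise ValueError("sample size must be between 1 and the population size")
    k = N // n
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed convention: random start in the first k items
    return [population[start + i * k] for i in range(n)]


# Hypothetical frame of 100 numbered units, sample of 10, so k = 10.
frame = list(range(1, 101))
print(systematic_sample(frame, 10, seed=42))
```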

87

Cluster Sampling

A random sampling technique in which the population is divided into clusters and elements are randomly sampled from clusters

-Homogeneity between clusters, heterogeneity within clusters

88

Non random sampling

Sampling in which not every unit of the population has the same probability of being selected for the sample

-not scientific

89

Convenience sampling

Selecting a sample at researcher’s convenience

90

Judgement sampling

Selecting a sample at researcher’s judgement

91

Quota sampling

Sample is selected non randomly to fit a desired quota

92

Snowball sampling

Survey subjects are selected based on referral from others

93

Sampling error

The error that results if the sample is not representative of the population

94

Central limit theorem

Regardless of the shape of a population, the distributions of sample means and sample proportions are approximately normally distributed as long as n is large (n > 30 for means; np > 5 and nq > 5 for proportions)

-Thus we can use Z to solve sample problems

95

Sqrt((N-n)/(N-1))

When working with a finite population (and n is more than 5% of the population), what correction factor do we apply?
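
A small sketch of the finite correction factor, sqrt((N - n)/(N - 1)), which multiplies the standard error when the sample is a sizeable share of the population. The function name and the numbers are illustrative assumptions.

```python
from math import sqrt


def finite_correction_factor(N, n):
    """Finite population correction factor for population size N and sample size n."""
    return sqrt((N - n) / (N - 1))


# Hypothetical case: population of 1000, sample of 100 (10% of the population).
N, n, sigma = 1000, 100, 20
standard_error = sigma / sqrt(n)
corrected = standard_error * finite_correction_factor(N, n)
print(f"uncorrected: {standard_error:.4f}, corrected: {corrected:.4f}")
```

The factor is always below 1 when n > 1, so the corrected standard error is smaller than the uncorrected one.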

96

T distribution

What distribution should you use when the population standard deviation is unknown but the sample standard deviation is known?

-Also assuming population is normally distributed

97

Robust

A term used to describe statistical techniques that are relatively insensitive to minor violations of their assumptions

98

Area between mean and the Z

What area does the Z value give?

99

Area between T and the upper/lower tail

What area does the T value give?

100

n - 1

What is degrees of freedom for T?