Stats Final Definitions/Concepts

Last updated 5:07 AM on 4/10/26

186 Terms

New cards

Statistics

Science dealing with the collection, analysis, interpretation, and presentation of numerical data

New cards

Descriptive statistics

Using data gathered on a group to reach conclusions about the same group

New cards

Inferential statistics

Using data gathered on a group to reach conclusions about the population

New cards

Population

Collection of persons, objects, or items of interest

New cards

Census

Gathering data from the entire population

New cards

Sample

A portion of the population that represents the entire population

New cards

Parameter

A descriptive measure of the population

New cards

Statistic

Descriptive measure of the sample

New cards

Variable

Characteristic of any entity being studied that is capable of taking on different values

New cards

Measurement

Occurs when a standard process is used to assign numbers to particular attributes or characteristics of a variable

New cards

Data

Measurement that is recorded and stored

New cards

Nominal

Data used only to classify or categorize

-no value statement

-no order

New cards

Ordinal

Data that is used to order/rank items

-no value statement

New cards

Interval

Data that can be ranked and in which the distance between consecutive values is meaningful

New cards

Ratio

Data that can be ranked and in which the distance between consecutive values is meaningful; additionally, zero means the absence of the characteristic being measured

New cards

Big data

Large amount of either organized or unorganized data from different sources that is difficult to process

New cards

Variety

Different forms of data

New cards

Velocity

Speed at which data is available and can be processed

New cards

Veracity

Quality and accuracy of the data

New cards

Volume

Size of the data

New cards

Data Mining

Process of collecting, exploring, and analyzing large volumes of data in an effort to uncover hidden patterns/relationships

New cards

Data visualization

Study of the visual representation of data, employed to convey data or information by imparting it as visual objects displayed in graphs

New cards

Ungrouped data

Raw data or data that has not been summarized

New cards

Grouped data

Data that has been organized into a frequency distribution

New cards

Frequency distribution

Summary of data presented in the form of class intervals and frequencies

New cards

Range

The difference between the largest and smallest value in a set of numbers

-used to set the class width of a frequency distribution, which generally has between 5 and 15 classes

New cards

Class midpoint

Value halfway across a class interval

New cards

Relative Frequency

Proportion of the total frequency that is in any given class interval in a frequency distribution

= Individual class frequency/Total frequency (proportion of the total that the individual class makes up)

New cards

Cumulative frequency

Running total of frequencies through the class of a frequency distribution
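
The relative and cumulative frequency cards can be checked with a short sketch. The function name `frequency_table` and the sample class frequencies are made up for illustration; they are not from the card set.

```python
from itertools import accumulate

def frequency_table(class_frequencies):
    """Return (relative, cumulative) frequencies for a list of class counts."""
    total = sum(class_frequencies)
    # Relative frequency = individual class frequency / total frequency
    relative = [f / total for f in class_frequencies]
    # Cumulative frequency = running total through each class
    cumulative = list(accumulate(class_frequencies))
    return relative, cumulative

# Hypothetical frequency distribution with three classes
relative, cumulative = frequency_table([4, 10, 6])
print(relative)    # proportions that sum to 1
print(cumulative)  # the last entry equals the total frequency
```

The last cumulative frequency always equals the total frequency, which is a quick way to check a hand-built table.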

New cards

Histogram

Vertical bar chart constructed by graphing segments for frequencies

-frequency on Y axis

-classes on X axis

New cards

Frequency Polygon

Graphical display of class frequencies

-line graph that connects class midpoints

New cards

Ogive

Line graph connecting the cumulative frequency of class endpoints

New cards

Stem and Leaf Plot

Consists of stems (the leftmost digits) in the first column and leaves (the rightmost digit) coming out of the stems
-Stems ordered with the lowest value at the top

-Leaves ordered with the lowest value at the left

New cards

Pie chart

Circular depiction of data where area of the whole pie represents 100%

New cards

Bar chart

Chart containing two or more categories along one axis and bars along the other

New cards

Pareto chart

A vertical bar chart with the categories graphed in descending order (highest value on the left)

-often includes a cumulative frequency line

-80/20 rule

New cards

Cross Tabulation

Process for producing a two dimensional table, displaying frequency counts for two variables

New cards

Scatter plot

Two dimensional plot of pairs of points from two variables

New cards

Time series

Data gathered on a given characteristic over a period of time at regular intervals

New cards

Measures of central tendency

One type of measure that is used to yield information about the center of a group of numbers

-Mean, Median, Mode

New cards

Mean

Average of a group of numbers

New cards

Median

Middle value in an ordered array of numbers

-located at the (N+1)/2th term; if N is even, average the two middle values

New cards

Mode

The most frequently occurring value in a set of data

New cards

Bimodal

Data set that has two modes

New cards

Multimodal

Data set that has more than two modes
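
As a quick illustration of mean, median, and mode, here is a minimal sketch using Python's standard `statistics` module. The data values are invented for the example.

```python
import statistics

# Made-up data set, already in order; 3 and 7 each occur twice
data = [2, 3, 3, 5, 7, 7, 8]

mean = statistics.mean(data)        # sum of values / number of values
median = statistics.median(data)    # middle value of the ordered data
modes = statistics.multimode(data)  # every value tied for most frequent

print(mean)    # 5
print(median)  # 5
print(modes)   # [3, 7], so this data set is bimodal
```

`multimode` returns a list, so a list of length two corresponds to a bimodal data set and a longer list to a multimodal one.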

New cards

Percentiles

Measures of central tendency that divide a group into 100 parts

nth percentile means at least n% of the data is below that value

-always rounds down

New cards

Average the Ith and (I+1)th numbers

When calculating a percentile, if the location I = (P/100)*N is a whole number, what do you do to find the location of the percentile?

New cards

The number at the position given by the whole number part of I + 1

When calculating a percentile, if the location I = (P/100)*N is not a whole number, what do you do to find the location of the percentile?
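
The two percentile-location cards can be sketched as one function. This assumes the location is computed as I = (P/100) * N on the ordered data, which the cards imply but do not state; the function name and the data are illustrative only.

```python
def percentile(values, p):
    """Locate the p-th percentile (1 <= p <= 99) using the two location rules."""
    ordered = sorted(values)
    n = len(ordered)
    i = p * n / 100  # assumed location formula: I = (P/100) * N
    if i.is_integer():
        i = int(i)
        # I is a whole number: average the I-th and (I+1)-th ordered values
        return (ordered[i - 1] + ordered[i]) / 2
    # I is not a whole number: take the value at the whole number part of I + 1
    position = int(i + 1)
    return ordered[position - 1]

data = [14, 12, 19, 23, 5, 13, 28, 17]  # made-up values, N = 8
print(percentile(data, 30))  # I = 2.4, so the 3rd ordered value: 13
print(percentile(data, 25))  # I = 2, so the average of the 2nd and 3rd values: 12.5
```

Other textbooks and software packages use different interpolation rules, so results from a library function may not match this method exactly.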

New cards

Quartiles

Measures of central tendency that divide a group of data into four parts

Q1 = 25th percentile

Q2 = Median

Q3 = 75th percentile

New cards

Measures of variability

Statistics that describe the spread or dispersion of a set of data

New cards

Interquartile range

Q3 - Q1

New cards

68%, 95%, 99.7%

The empirical rule states that if data is normally distributed, (blank)% of data is within 1 standard deviation, (blank)% of data is within 2 standard deviations, and (blank)% of data is within 3 standard deviations

New cards

1 - 1/K²

Chebyshev’s Theorem states that at least (blank) of the values will fall within K standard deviations of the mean (for K > 1)

-works regardless of shape of distribution
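
A minimal sketch of the Chebyshev bound, for comparison with the empirical rule. The function name is hypothetical, and the restriction to K > 1 is the usual condition under which the bound is informative.

```python
def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's bound is only informative for k > 1")
    return 1 - 1 / k**2

print(chebyshev_lower_bound(2))  # 0.75, versus about 95% under the empirical rule
print(chebyshev_lower_bound(3))  # about 0.889, versus about 99.7%
```

The bounds are lower than the empirical-rule percentages because they hold for any distribution shape, not only the normal one.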

New cards

Z score

The number of standard deviations by which a value is above or below the mean of a set of numbers, when the data is normally distributed
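
The Z score definition corresponds to the standard formula z = (x - mean) / standard deviation. A small sketch with invented numbers:

```python
def z_score(x, mean, std_dev):
    """Number of standard deviations by which x lies above (+) or below (-) the mean."""
    return (x - mean) / std_dev

# Hypothetical values: mean 50, standard deviation 10
print(z_score(70, 50, 10))  # 2.0, two standard deviations above the mean
print(z_score(45, 50, 10))  # -0.5, half a standard deviation below the mean
```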

New cards

Skewness

The lack of symmetry in a distribution around its mean

-left skewed means the long tail is on the left (right skewed means the long tail is on the right)

-Left skewed (order from left to right): Mean, Median, Mode

-Right skewed (order from left to right): Mode, Median, Mean

-Symmetrical: All in the middle

New cards

Box and Whisker plot

Diagram with the interquartile range (Q1 to Q3) as the box

Inner fences at 1.5*IQR below Q1 and above Q3

Outer fences at 3*IQR below Q1 and above Q3

-values between the inner and outer fences are mild outliers

-values beyond the outer fences are extreme outliers

-if the median in the box is to the right, skewed left
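
A sketch of how the fences classify outliers. It assumes the usual convention that the fences are measured outward from Q1 and Q3; the function names and quartile values are made up.

```python
def fences(q1, q3):
    """Return (inner, outer) fences as (lower, upper) pairs."""
    iqr = q3 - q1  # interquartile range
    inner = (q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    outer = (q1 - 3 * iqr, q3 + 3 * iqr)
    return inner, outer

def classify(x, q1, q3):
    """Label a value as an extreme outlier, a mild outlier, or neither."""
    inner, outer = fences(q1, q3)
    if x < outer[0] or x > outer[1]:
        return "extreme outlier"
    if x < inner[0] or x > inner[1]:
        return "mild outlier"
    return "not an outlier"

# Hypothetical quartiles: Q1 = 10, Q3 = 20, so IQR = 10
print(fences(10, 20))        # ((-5.0, 35.0), (-20, 50))
print(classify(40, 10, 20))  # mild outlier
print(classify(60, 10, 20))  # extreme outlier
```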

New cards

Classical method (probability)

Assigning probability based on laws or rules (number of times event occurs/total number of outcomes)

New cards

Relative frequency of occurrence method (probability)

Probability based on historical data (number of times event occurred/number of times it could have occurred)

New cards

Subjective method (probability)

Probability based on feelings or insight

New cards

Experiment

Process that produces outcomes

New cards

Event

Outcome of an experiment

-Broken down furthest into elementary events

New cards

Sample space

Complete roster or listing of all elementary events of an experiment

-can be denoted using set notation

New cards

Union

Combination of all the elements that are in either of two sets (X or Y)

-numbers don’t get repeated when listing them

New cards

Intersection

Numbers that are common to both sets

New cards

Mutually exclusive events

Events such that the occurrence of one means the other cannot occur
Ex. Making a shot vs missing a shot

New cards

Independent events

Events such that the occurrence of one has no effect on the occurrence of the other

New cards

Collectively exhaustive events

Set of events that together contain all possible elementary events of an experiment

-The entire sample space

New cards

Complement

An event that comprises all the elementary events not in a given event A

-Denoted A’

-P(A’) = 1 - P(A)

New cards

M*n counting rule

When one choice can be made m ways and a second choice can be made n ways, what rule should you apply to figure out the total number of possible combinations?

Ex. A cake that comes in 5 flavours and 5 sizes has 5*5 = 25 possible combinations

New cards

N^n

When sampling with replacement, how many different possible samples can occur?

-where N is population size and n is sample size

New cards

N!/(n!(N-n)!)

When sampling without replacement, how many different possible samples can occur?

-where N is population size and n is sample size

New cards

n!/(n-r)!

When sampling where order matters, how many possible permutations are there?
-where n is the population and r is the sample size
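
The four counting rules above can be compared side by side with Python's standard `math` module. The population and sample sizes below are made up for the example.

```python
import math

# m*n counting rule: 5 flavours and 5 sizes
mn_rule = 5 * 5

# Hypothetical population of N = 6 items and sample of n = 2
N, n = 6, 2
with_replacement = N ** n            # N^n
without_replacement = math.comb(N, n)  # N!/(n!(N-n)!), order does not matter
permutations = math.perm(N, n)       # N!/(N-n)!, order matters

print(mn_rule)              # 25
print(with_replacement)     # 36
print(without_replacement)  # 15
print(permutations)         # 30
```

Permutations always give a count at least as large as combinations for the same N and n, because each unordered sample can be arranged in n! different orders.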

New cards

Random variable

A variable that contains the outcome of a chance experiment

New cards

Discrete variable

A random variable whose set of possible values is finite or countably infinite

New cards

Continuous variable

A random variable that has values at every point over a given interval

New cards

Binomial Distribution

Discrete distribution with only 2 possible outcomes in a given trial (ex. success, failure)

-Assumption: Replacement/independence

New cards

n < 5% of N

When sampling without replacement, you can still use the binomial distribution if what rule regarding n and N holds?

New cards

Number of trials, Number of successes desired, Probability of success, Probability of failure (n, x, p, q)

What information do you need to solve a binomial problem using the binomial formula?
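
With n, x, p, and q in hand, the binomial formula P(x) = C(n, x) * p^x * q^(n-x) can be sketched as follows. The function name and the coin-flip numbers are illustrative, not from the card set.

```python
import math

def binomial_probability(n, x, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return math.comb(n, x) * p**x * q**(n - x)

# Hypothetical example: exactly 2 heads in 4 fair coin flips
print(binomial_probability(4, 2, 0.5))  # 0.375
```

Summing the function over x = 0 to n gives 1, which is a convenient check that the inputs are sensible.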

New cards

Normal distribution (Z)

-Continuous distribution

-Symmetrical about the mean

-Asymptotic (doesn’t touch horizontal axis)

-Unimodal

-Family of curves

-Area under the curve = 1

New cards

np > 5 and nq > 5

If (this condition) is met, we can use the Z distribution to solve binomial problems, after applying a correction factor

New cards

+0.50, -0.50, -0.50, +0.50

When using Z distribution to solve binomial problems, what is the correction factor for solving for:
X >

X >=

X <

X <=
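
A sketch of the normal approximation with the correction factors listed above. It assumes the usual mean np and standard deviation sqrt(npq) for the approximating normal curve; the function name, the dictionary, and the example numbers are hypothetical.

```python
import math
from statistics import NormalDist

# Correction applied to x for each form of the problem, as on the card
CORRECTION = {">": +0.5, ">=": -0.5, "<": -0.5, "<=": +0.5}

def normal_approximation(n, p, op, x):
    """Approximate a binomial probability such as P(X >= x) with the normal curve."""
    q = 1 - p
    if not (n * p > 5 and n * q > 5):
        raise ValueError("np and nq should both exceed 5 for this approximation")
    curve = NormalDist(mu=n * p, sigma=math.sqrt(n * p * q))
    corrected = x + CORRECTION[op]
    below = curve.cdf(corrected)  # area to the left of the corrected value
    return 1 - below if op in (">", ">=") else below

# Hypothetical problem: n = 60, p = 0.30, find P(X >= 25)
print(normal_approximation(60, 0.30, ">=", 25))
```

Because X >= 25 includes 25 itself, the correction moves the boundary down to 24.5 so that the whole bar for 25 is counted.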

New cards

Frame

A list, map, directory, or any source that can be used to represent a population

-can be overregistered or underregistered

New cards

Random sampling

Sampling in which every unit of the population has the same probability of being selected

New cards

Simple random sampling

The most elementary of the random sampling techniques, using a random number generator to pick items

New cards

Stratified random sampling

Random sampling in which the population is divided into various strata (ex. age), then items are picked from each stratum

-can be proportionate (pick so the sample reflects the proportion of each stratum in the population) or disproportionate

New cards

Systematic sampling

A random sampling technique in which every kth item or person in a randomized list is selected

-where k = N/n
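
A minimal sketch of systematic sampling with k = N/n. Picking a random starting point within the first k items is a common convention assumed here, not something the card states; the function name and population are made up.

```python
import random

def systematic_sample(population, n, seed=None):
    """Select every k-th item from the list, where k = N // n."""
    N = len(population)
    k = N // n  # sampling interval
    rng = random.Random(seed)
    start = rng.randrange(k)  # assumed: random start within the first k items
    return [population[start + j * k] for j in range(n)]

# Hypothetical frame of 100 numbered units, sample of 10
sample = systematic_sample(list(range(100)), 10, seed=1)
print(sample)  # ten units spaced exactly k = 10 apart
```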

New cards

Cluster Sampling

A random sampling technique in which the population is divided into clusters and elements are randomly sampled from clusters

-Homogeneity between clusters, heterogeneity within clusters

New cards

Non random sampling

Sampling in which not every unit of the population has the same probability of being selected for the sample

-not scientific

New cards

Convenience sampling

Selecting a sample at researcher’s convenience

New cards

Judgement sampling

Selecting a sample at researcher’s judgement

New cards

Quota sampling

Sample is selected non randomly to fit a desired quota

New cards

Snowball sampling

Survey subjects are selected based on referral from others

New cards

Sampling error

The error that results if the sample is not representative of the population

New cards

Central limit theorem

Regardless of the shape of a population, the distributions of sample means and sample proportions are approximately normal as long as n is large (n > 30 for means, or np > 5 and nq > 5 for proportions)

-Thus we can use Z to solve sample problems

New cards

Sqrt((N-n)/(N-1))

When working with a finite population (and n is more than 5% of the population), what correction factor do we apply?
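
A sketch of the finite correction factor applied to the standard error of the mean. It assumes the usual standard error sigma/sqrt(n), which the cards do not state; the function names and numbers are illustrative.

```python
import math

def finite_correction(N, n):
    """Finite population correction factor: sqrt((N - n) / (N - 1))."""
    return math.sqrt((N - n) / (N - 1))

def standard_error_of_mean(sigma, n, N=None):
    """Standard error sigma/sqrt(n), corrected when n exceeds 5% of N."""
    se = sigma / math.sqrt(n)
    if N is not None and n / N > 0.05:
        se *= finite_correction(N, n)
    return se

# Hypothetical values: sigma = 10, n = 25
print(standard_error_of_mean(10, 25))          # 2.0, population size not given
print(standard_error_of_mean(10, 25, N=1000))  # 2.0, n is only 2.5% of N
print(standard_error_of_mean(10, 25, N=100))   # smaller, since n is 25% of N
```

The factor is always below 1 when n > 1, so the correction can only shrink the standard error.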

New cards

T distribution

What distribution should you use when the population standard deviation is unknown but the sample standard deviation is known?

-Also assuming population is normally distributed

New cards

Robust

A term used to describe statistical techniques that are relatively insensitive to minor violations of their assumptions

New cards

Area between mean and the Z

What area does the Z value give?

New cards

Area in the upper/lower tail beyond T

What area does the T value give?

New cards

n - 1

What is degrees of freedom for T?