ST 311


Description and Tags

ncsu


126 Terms


Data

Collections of observations (measurements, counts, and survey responses)


Population

The complete collection of all measurements or data being considered; also called the population of interest


Sample

A subset of members selected from a population


How to select a sample

Should be random and representative of the population


Parameter

Numerical measurement describing some characteristic of a population


Statistic

Numerical measurement describing some characteristic of a sample


Quantitative Data

aka numerical data; consists of numbers representing counts or measurements


Examples of quantitative data

age of an athlete, weight of a letter


Categorical Data

aka qualitative data; consists of names or labels


Example of categorical data

college major, hometown


Discrete Data

results when the data values are quantitative and the number of possible values is finite or countable


Example of discrete data

the number of tosses of a coin before getting tails


Continuous/numerical Data

results from infinitely many possible values on a continuous scale, so the values are uncountable


Example of continuous/numerical data

the arm span of high school seniors


Bias

biased samples are more likely to produce some outcomes than others, so the resulting statistics may be systematically too high or too low


Convenience

those samples that are easy to collect (often have some bias or don’t represent the population in general)


Volunteers

a self-selected sample of people who respond to a general appeal


Simple random sample

a sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen
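
A minimal sketch of drawing a simple random sample, assuming a made-up population of 100 ID numbers and a sample size of 10 (neither comes from the cards). Python's `random.sample` picks distinct members so that every subset of the requested size is equally likely.

```python
import random

# Hypothetical population: ID numbers 1..100 (an assumption for illustration).
population = list(range(1, 101))
n = 10  # sample size

rng = random.Random(311)  # fixed seed so the sketch is repeatable

# Draw n distinct members; every possible sample of size n is equally likely.
srs = rng.sample(population, n)
print(srs)
```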


Stratified sample

subdivide the population into at least two different groups (strata), so that the subjects within the same subgroup share the same characteristics, then draw a sample from each subgroup
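
One way this could look in code, assuming a hypothetical population grouped by class year and an equal draw of 3 from each subgroup (the group names, sizes, and the equal-draw choice are all illustrative assumptions).

```python
import random

# Hypothetical subgroups; labels and sizes are assumptions, not from the cards.
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}

rng = random.Random(311)
per_stratum = 3  # same number from every subgroup (one possible choice)

# Draw a separate random sample inside each subgroup.
stratified = {
    name: rng.sample(members, per_stratum) for name, members in strata.items()
}
print(stratified)
```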


Cluster sample

divide the population area into naturally occurring sections (clusters), then randomly select some of those clusters and choose all members from the selected clusters
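
A rough sketch of the same idea, assuming 8 hypothetical city blocks of 5 houses each: two whole blocks are picked at random and every house in them is kept.

```python
import random

# Hypothetical clusters: 8 blocks with 5 houses each (assumed for illustration).
clusters = {
    f"block_{b}": [f"house_{b}_{h}" for h in range(5)] for b in range(8)
}

rng = random.Random(311)

# Step 1: randomly select whole clusters.
chosen_clusters = rng.sample(sorted(clusters), 2)

# Step 2: keep every member of each selected cluster.
cluster_sample = [unit for name in chosen_clusters for unit in clusters[name]]
print(chosen_clusters, cluster_sample)
```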


Systematic sample

select some starting point and then select every nth element in the population. Works well when units are in some order (e.g., houses on a block)
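
As a sketch, assuming 50 hypothetical houses listed in street order and a step of 5 (both made up), a random starting point followed by a fixed step can be written with list slicing.

```python
import random

# Hypothetical ordered list of units (assumed for illustration).
houses = [f"house_{i}" for i in range(1, 51)]
step = 5  # take every 5th house

rng = random.Random(311)
start = rng.randrange(step)  # random starting index among the first `step` units

# Slice from the starting point, jumping `step` units each time.
systematic = houses[start::step]
print(start, systematic)
```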


Multistage sample

collect data by using some combination of the basic sampling methods


Bad sampling frame

when attempting to list all members of a population, some subjects are missing. can be difficult to obtain a complete list


Undercoverage

the sampling frame is missing groups from the population


Non-response bias

some parts of the sampled population choose not to respond, so their views are missing from the results


Response bias

responses given are not truthful


Wording/order

the wording or order of the questions leads respondents toward a particular response


Experiment

the process of applying some treatment and then observing its effects. almost always compares two (or more) groups (treatment vs control)


Observational study

the process of observing and measuring specific characteristics without attempting to modify the individuals being studied. tells what’s happening and cannot describe cause-effect relationships


Response variable

measures an outcome of a study


Explanatory variable

explains or influences changes in the response variable


Treatment effects

different treatments produce different outcomes in the response variable (what the experiment is looking for)


Experimental error

variability among observed values of the response variable for experimental units that receive the same treatment


Lurking variables

a variable that is not among the explanatory variables in a study and yet may influence the interpretation of the relationship among response and explanatory variables


Confounding variables

two variables are confounded when their effects on the response variable cannot be distinguished from each other


Control

control the effects of lurking/confounding variables by careful planning (control group receives no treatment)


Randomization

randomly assign experimental units to treatments to reduce or eliminate bias


Replication

measure the effect of each treatment on many units to reduce chance variation in results


Completely randomized design

participants are randomly assigned to treatments (including control group). Assumes that on average lurking variables will affect each treatment group equally
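
A small sketch of random assignment, assuming 20 hypothetical participants split evenly into a treatment group and a control group (the names and the even split are assumptions).

```python
import random

# Hypothetical participants (assumed for illustration).
participants = [f"P{i}" for i in range(1, 21)]

rng = random.Random(311)
shuffled = participants[:]  # copy so the original order is untouched
rng.shuffle(shuffled)       # chance alone decides the order

# First half gets the treatment, second half is the control group.
half = len(shuffled) // 2
groups = {"treatment": shuffled[:half], "control": shuffled[half:]}
print(groups)
```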


Randomized block design

divides participants into subgroups called blocks, chosen so that variability within blocks is less than variability between blocks. Participants within each block are then randomly assigned to treatments.


Matched pairs design

used when an experiment has only 2 treatment groups; participants can be grouped into pairs and within pairs are randomly assigned to different treatments
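
A possible sketch, assuming three hypothetical pairs of similar participants: within each pair, a coin flip decides who receives which of the two treatments.

```python
import random

# Hypothetical matched pairs (assumed for illustration).
pairs = [("A1", "A2"), ("B1", "B2"), ("C1", "C2")]

rng = random.Random(311)
assignment = {}
for first, second in pairs:
    # Coin flip: swap the order half of the time.
    if rng.random() < 0.5:
        first, second = second, first
    assignment[first] = "treatment 1"
    assignment[second] = "treatment 2"

print(assignment)
```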


The placebo effect

the tendency to react to a drug or treatment regardless of its actual physical function.


Hawthorne effect

behavior is different because the subject knows they are being watched


Blinding

When individuals associated with an experiment are not aware of how subjects have been assigned


Single blind study

those who could influence the results are blinded


Double blind study

those who evaluate the results are blinded as well


Measure of center

a value at or near the middle of a data set (mean, median, mode)


∑

denotes a sum, “sigma”


x

denotes an individual data value


n

denotes the number of values in a sample, “sample size”


N

denotes the number of values in a population


x̅

denotes the sample mean, “x bar”


μ

denotes the population mean, “mu”


Mean

found by adding all values and dividing by the number of values in a data set (uses every data value, so it is pulled toward outliers and is not a good summary for skewed data)


Median

the value in the middle when listed in ascending order (not affected by outliers, can be used with any data set)


Mode

the value that occurs with the greatest frequency (only useful for multimodal or qualitative data)
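
A quick sketch of the three measures of center using Python's `statistics` module, on a made-up data set with one large outlier (the numbers are assumptions chosen so the arithmetic is easy to check by hand).

```python
import statistics

# Hypothetical data with one outlier (40).
data = [2, 3, 3, 4, 5, 6, 40]

mean = statistics.mean(data)      # (2+3+3+4+5+6+40) / 7 = 63 / 7 = 9
median = statistics.median(data)  # middle of the sorted values = 4
mode = statistics.mode(data)      # most frequent value = 3

print(mean, median, mode)
```

The outlier pulls the mean up to 9 while the median stays at 4, which matches the notes on which measure is affected by outliers.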


Unimodal

dataset with one mode


Bimodal

dataset with two modes


Multimodal

dataset with more than two modes


Which measure of center do you choose?

Quantitative = mean or median

Categorical = mode


Horizontal axis of a histogram

represents the quantitative data values


Vertical axis of a histogram

represents frequency


Right skewed histogram

the tallest bars are on the left, with a long tail extending to the right


Left skewed histogram

the tallest bars are on the right, with a long tail extending to the left


Symmetrical

mean = median = mode


Right skewed (pos)

mode < median < mean


Left skewed (neg)

mean < median < mode
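
The orderings on the skew cards suggest a rough rule of thumb: compare the mean with the median. The helper below, `skew_direction`, is a hypothetical name and only a heuristic sketch, not a formal test of skewness; the three data sets are made up.

```python
import statistics

def skew_direction(data):
    """Guess the skew by comparing mean and median (a heuristic only)."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    if mean > median:
        return "right skewed"  # mean pulled toward a long right tail
    if mean < median:
        return "left skewed"   # mean pulled toward a long left tail
    return "roughly symmetric"

print(skew_direction([1, 2, 3, 4, 30]))     # mean 8, median 3
print(skew_direction([1, 27, 28, 29, 30]))  # mean 23, median 28
print(skew_direction([1, 2, 3, 4, 5]))      # mean 3, median 3
```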


Range

the difference between the maximum and minimum

R = max value - min value (highly affected by outliers)


Interquartile range

provides a range of values that are not as affected by potential outliers

IQR = Q₃ - Q₁
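
A sketch comparing the range and the IQR on made-up data with one outlier. Quartile conventions differ between textbooks and software; this example assumes the "inclusive" method of `statistics.quantiles`, so a course calculator may give slightly different quartiles.

```python
import statistics

# Hypothetical data with one outlier (100).
data = [1, 3, 5, 7, 9, 11, 13, 15, 100]

data_range = max(data) - min(data)  # 100 - 1 = 99

# Quartiles under the "inclusive" convention (an assumption, see above).
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1  # 13 - 5 = 8

print(data_range, q1, q3, iqr)
```

The single outlier stretches the range to 99, while the IQR stays at 8.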


Variance

V = (standard deviation)²


Standard deviation

SD = √V


Standard deviation

a measure of how much data values deviate from the mean. Increases with 1 or more outliers (never negative)


σ²

population variance


σ

population standard deviation


s²

sample variance


s

sample standard deviation
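
A sketch of population versus sample variance and standard deviation with the `statistics` module. The cards do not state the divisors; the usual convention, assumed here, is that the population versions divide by N and the sample versions divide by n - 1. The data are made up.

```python
import statistics

# Hypothetical data; the mean is 5 and the squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = statistics.pvariance(data)  # sigma^2 = 32 / 8 = 4
pop_sd = statistics.pstdev(data)      # sigma = sqrt(4) = 2
samp_var = statistics.variance(data)  # s^2 = 32 / 7
samp_sd = statistics.stdev(data)      # s = sqrt(32 / 7)

print(pop_var, pop_sd, samp_var, samp_sd)
```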


z-Scores

measures how many standard deviations a value lies above or below the mean; useful when you want to compare two numbers from different groups relative to their own groups


Positive z-score

data value is above average


Negative z-score

data value is below average


z-score equation

z=\frac{x-\mu}{\sigma}, that is, (value - mean) / standard deviation
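
A short sketch of the z-score formula, comparing two made-up test scores from different classes (the scores, means, and standard deviations are assumptions; `z_score` is a hypothetical helper name).

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies from the mean mu."""
    return (x - mu) / sigma

# Hypothetical scores: 85 in a class with mean 75 and SD 5,
# and 90 in a class with mean 82 and SD 8.
math_z = z_score(85, 75, 5)     # (85 - 75) / 5 = 2.0
reading_z = z_score(90, 82, 8)  # (90 - 82) / 8 = 1.0

print(math_z, reading_z)
```

Although 90 is the larger raw score, the 85 is further above its own group's mean.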


-1 \sigma to +1 \sigma

68% of the data lie between these


-2 \sigma to +2 \sigma

95% of the data lie between these


-3 \sigma to +3 \sigma

99.7% of the data lie between these


The empirical rule

for a normal distribution, approximately 68% of data falls within 1 standard deviation of the mean, 95% within 2 standard deviations, and 99.7% within 3 standard deviations.
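
The percentages in the empirical rule can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library and computes the area under the standard normal curve within 1, 2, and 3 standard deviations of the mean.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

# Area between -k and +k standard deviations, for k = 1, 2, 3.
within = {k: Z.cdf(k) - Z.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```

The results round to about 0.68, 0.95, and 0.997.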


Significantly low

values are considered significantly low (unusual) if they are at -2 \sigma or lower, that is, at least 2 standard deviations below the mean


Significantly high

values are considered significantly high (unusual) if they are at +2 \sigma or higher, that is, at least 2 standard deviations above the mean
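
A sketch of the 2-standard-deviation cutoffs as a small function. The name `classify` and the example numbers (mean 75, standard deviation 5) are assumptions for illustration.

```python
def classify(x, mu, sigma):
    """Label a value using the +/- 2 standard deviation cutoffs."""
    z = (x - mu) / sigma
    if z <= -2:
        return "significantly low"
    if z >= 2:
        return "significantly high"
    return "not significant"

# Hypothetical group with mean 75 and standard deviation 5.
print(classify(60, 75, 5))  # z = -3
print(classify(85, 75, 5))  # z = 2
print(classify(80, 75, 5))  # z = 1
```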


Probability

represented by the area under the density curve


Normal distribution (total area under the curve is equal to 1)

a continuous probability distribution for a random variable. Mean, mode, and median are equal. Bell-shaped and is symmetric about the mean.


Parameters

The mean locates the center of the curve and the standard deviation determines its spread


Normal distribution

X~N(\mu, \sigma)


The standard normal distribution

the distribution of z-scores, has a mean of zero, and a standard deviation of one.

Z~N(0,1)
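
A sketch showing that a probability for X~N(\mu, \sigma) matches the probability for the corresponding z-score under Z~N(0,1). The values mean 100, standard deviation 15, and x = 130 are made up for illustration.

```python
from statistics import NormalDist

mu, sigma = 100, 15        # hypothetical parameters
X = NormalDist(mu, sigma)  # X ~ N(mu, sigma)
Z = NormalDist()           # Z ~ N(0, 1)

x = 130
z = (x - mu) / sigma       # (130 - 100) / 15 = 2.0

# Probability as area under the density curve to the left of the value.
p_from_x = X.cdf(x)
p_from_z = Z.cdf(z)

print(z, round(p_from_x, 4), round(p_from_z, 4))
```

Both areas come out to roughly 0.977, so converting to z-scores does not change the probability.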


Probability distribution

describes how likely the values of the variable are to occur


Binomial distribution

a binomial random variable counts the number of successes in a fixed number of trials (certain conditions must be true)


Qualities to make a distribution binomial

Fixed number of trials/observations labelled as “n”
Independent trials (outcome of one doesn’t affect the probability in the others)
Each trial results in either a success (S) or a failure (F)


Success in binomial distributions

the outcome that the random variable counts; the probability of success is constant for each trial.


Success equation

P(S) = p


Binomial equation

X~Bin(n, p)

n = # of trials & p = probability of success
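
The cards give the notation but not the probability formula. As a sketch, the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k) is assumed below; `binomial_pmf` is a hypothetical helper name, and the example uses 10 fair coin tosses.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X~Bin(n, p), using the standard binomial formula."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: 10 tosses of a fair coin

p_five = binomial_pmf(5, n, p)  # 252 / 1024
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))

print(p_five, total)
```

The probabilities over k = 0, ..., n add up to 1, as expected for a probability distribution.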


Mean of binomial distribution

\mu=n\cdot p (the mean of a random variable, aka E(x), the expected value)
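
A numerical check of \mu = n \cdot p, assuming n = 20 and p = 0.3 (made-up values) and the standard binomial probability formula, which the cards do not state. The weighted sum of outcomes should agree with n times p.

```python
from math import comb

n, p = 20, 0.3  # hypothetical number of trials and success probability

# Binomial probabilities for k = 0, 1, ..., n successes.
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# Expected value as a weighted sum of the outcomes.
mean_from_definition = sum(k * prob for k, prob in enumerate(pmf))
mean_from_formula = n * p

print(mean_from_definition, mean_from_formula)
```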


E(x)

the expected value, a weighted mean of the outcomes (likely outcomes get more “weight” than unlikely)



Expected value vs mean of random variable

expected value of a discrete random variable is equal to the mean of the random variable
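
A sketch of the expected value as a weighted mean, assuming a fair six-sided die as the discrete random variable (an example not taken from the cards). Exact fractions are used so the result can be checked by hand.

```python
from fractions import Fraction

# Hypothetical discrete random variable: a fair six-sided die.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# E(X) = sum of (value * probability) over all outcomes.
expected_value = sum(value * prob for value, prob in die.items())

print(expected_value)  # 21/6 reduces to 7/2, i.e. 3.5
```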