Statistics Final Flashcards (ST 311 NCSU)

197 Terms

1

Statistics

the science of planning studies and experiments, obtaining data, and organizing, summarizing, analyzing, and interpreting those data and then drawing conclusions based on them

New cards
2

Conducting a statistical study includes 3 phases:

  1. Prepare: consider the population, data types, and sampling method

  2. Analyze: describe the data you collected and use appropriate statistical methods to help with drawing conclusions

  3. Conclude: using statistical inference, make reasonable judgements and answer broad questions

New cards
3

Data

collections of observations, such as measurements, counts, descriptions, or survey responses

New cards
4

Population

the complete collection of all measurements or data that are being considered. Typically, a population is the complete collection of all data we would like to better understand or describe. We also call it the population of interest

New cards
5

Sample

a subset of members selected from a population (ideally chosen at random)

New cards
6

Parameter

a numerical measurement describing some characteristic of a population

New cards
7

Statistic

a numerical measurement describing some characteristic of a sample

New cards
8

Quantitative (numerical) data

consists of numbers representing counts or measurements (2 types: discrete or continuous)

New cards
9

Categorical data (qualitative)

consists of names or labels (NOT numbers that represent counts or measurements)

New cards
10

Discrete data (quantitative)

result when the data values are quantitative and the number of values is finite or countable (ex: # of tosses of a coin before getting tails)

New cards
11

Continuous data (numerical)

result from infinitely many possible quantitative values where the collection of values is not countable (ex: the arm spans in inches of high school seniors)

New cards
12

Our goal is to answer a question about a ___

population

New cards
13

We want our sample to be random and ___ of the population

representative

New cards
14

Simple Random Sample (SRS)

A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance (probability) of being chosen
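
A minimal sketch of drawing a simple random sample with Python's standard library. The population of ID numbers 1 to 100 and the helper name `simple_random_sample` are made up for illustration.

```python
import random

# Hypothetical population: ID numbers 1..100 (made up for illustration).
population = list(range(1, 101))

def simple_random_sample(units, n, seed=None):
    """Draw n units without replacement, so every subset of size n is equally likely."""
    rng = random.Random(seed)  # a seed only makes the draw repeatable
    return rng.sample(units, n)

sample = simple_random_sample(population, n=10, seed=311)
print(sample)
```

Sampling is done without replacement, so no subject can appear twice in the sample.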

New cards
15

Stratified Sample

Subdivide the population into 2+ subgroups (or strata) so that the subjects in the same subgroups share the same characteristics. Then draw a sample from each subgroup. The number sampled from each stratum may be done proportionally with respect to population size.
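
One way to sketch proportional stratified sampling in Python. The strata (class years with made-up student IDs) and the function name `stratified_sample` are assumptions for illustration, not part of the course material.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Draw a random sample from every stratum, sized in proportion to the stratum."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation, rounded to the nearest whole subject.
        k = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical strata: four class years with made-up student IDs.
strata = {
    "freshman": [f"F{i}" for i in range(400)],
    "sophomore": [f"S{i}" for i in range(300)],
    "junior": [f"J{i}" for i in range(200)],
    "senior": [f"R{i}" for i in range(100)],
}
picked = stratified_sample(strata, total_n=50, seed=1)
sizes = {name: len(ids) for name, ids in picked.items()}
print(sizes)
```

With these made-up stratum sizes, a total sample of 50 splits into 20, 15, 10, and 5 subjects. When the proportions do not divide evenly, rounding can make the total differ slightly from the target.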

New cards
16

Cluster Sample

Divide the population area into naturally occurring sections (or clusters), then randomly select some of these clusters and choose all the members of those selected clusters

New cards
17

Systematic Sample

select some starting point and then select every kth element in the population. Works well when units are in the same order like an assembly line
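
A short sketch of systematic sampling, assuming the starting point is chosen at random among the first k units. The list of 100 numbered items stands in for an assembly line and is made up for illustration.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a random start among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0..k-1
    return units[start::k]

# Hypothetical assembly line of 100 numbered items.
line = list(range(1, 101))
chosen = systematic_sample(line, k=10, seed=0)
print(chosen)
```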

New cards
18

Multistage sample

Collect data by using some combination of the basic sampling methods

New cards
19

Convenience Sampling

Select the first k subjects that you come across

New cards
20

Bad Sampling Frame

When attempting to list all members of a population, some subjects are missing. It can be difficult to make a complete list

New cards
21

Non-response bias

Some part of the population chooses not to respond, or subjects were selected but are not able to be contacted

New cards
22

Response bias

Responses to questions are not truthful. This may occur when people are unwilling to reveal personal matters, admit to illegal activity, or tailor their responses to “please” the investigator

New cards
23

Wording and Order Bias

The way questions are worded may be leading/inflammatory to elicit a response. Or the order of questions influences answers.

New cards
24

Measure of center

a value at or near the center or middle of a data set, “typical” values for a group

EX: mean, median, mode

New cards
25

Σ

denotes a sum, “sigma”

New cards
26

x

denotes individual data value

New cards
27

n

denotes # of values in a sample, “sample size”

New cards
28

N

denotes number of values in a population

New cards
29

x̄

denotes the sample mean, “x bar”

New cards
30

μ

denotes the population mean, “mu” (pronounced “mew”)

New cards
31

Mean

found by adding all values and dividing by the number of values in the set. A sample mean is the mean of a sample. A population mean is the mean of an entire population.

New cards
32

Median

the value that is in the middle when listed in ascending order. Shows what # separates the bottom 50% of the data from the top 50%. Roughly half of all values are below, and half are above it.

New cards
33

Mode

the value that occurs with the greatest frequency. There could be no mode. One mode: unimodal, two modes: bimodal, more than two modes: multimodal
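
The three measures of center can be checked quickly with Python's `statistics` module. The data values below are made up for illustration.

```python
import statistics

# Made-up data set of 9 values, already in ascending order.
data = [2, 3, 3, 5, 7, 8, 8, 8, 10]

mean = statistics.mean(data)        # sum of values divided by the number of values
median = statistics.median(data)    # middle value of the ordered data
modes = statistics.multimode(data)  # list of the most frequent value(s)

print(mean, median, modes)

# A data set with two equally frequent values is bimodal.
bimodal = statistics.multimode([1, 1, 2, 2, 3])
print(bimodal)
```

Note that `multimode` returns every value when no value repeats, which corresponds to the "no mode" case on the card.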

New cards
34

Histogram

the graph of a frequency distribution, a graph of bars of equal width drawn adjacent to each other, a horizontal scale representing classes of quantitative data values, a vertical scale (height) represents frequency

New cards
35

Dotplot

shows each value in a dataset as a dot above a number line

New cards
36

Measures of variation (or spread)

Range, IQR, variance, standard deviation

New cards
37

Range

max data value - min data value (highly affected by outliers)

New cards
38

Interquartile Range (IQR)

uses quartiles to provide a range of values that are not as affected by potential outliers as the range

The quartiles (Q1, Q2, Q3) split the ordered data into four parts, so about 1/4 of the data lies between 2 consecutive quartiles

IQR= Q3-Q1

New cards
39

The 3 quartiles together with the min and max values constitute the 5-number summary:

  1. minimum

  2. Q1 (median of the first half of the dataset)

  3. Median

  4. Q3 (median of the second half of the dataset)

  5. Maximum
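
A sketch of the 5-number summary using the "median of each half" method described on this card. The data are made up, and the function name `five_number_summary` is hypothetical. Statistical software often uses slightly different quartile conventions, so results may not match exactly.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]          # first half of the ordered data
    upper = ordered[half + n % 2:]  # second half (skips the middle value when n is odd)
    return (
        ordered[0],
        statistics.median(lower),
        statistics.median(ordered),
        statistics.median(upper),
        ordered[-1],
    )

# Made-up data set of 9 values.
data = [1, 3, 4, 6, 7, 8, 10, 12, 15]
minimum, q1, med, q3, maximum = five_number_summary(data)
iqr = q3 - q1                   # interquartile range
data_range = maximum - minimum  # range
print(minimum, q1, med, q3, maximum, iqr, data_range)
```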

New cards
40

Variance

(Standard deviation)²

New cards
41

Standard deviation

sqrt(variance)

A measure of how much data values deviate from the mean.

  • its value is never negative, and it is zero ONLY when all data values are the same

  • larger values indicate greater amounts of variation

  • SD can increase a lot with one or more outliers

  • units of SD are the same as the units of the original data values

New cards
42

Population variance

σ² = Σ(x − μ)² / N

New cards
43

σ or s

standard deviation (σ for a population, s for a sample)

New cards
44

sample variance

s² = Σ(x − x̄)² / (n − 1)
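
A worked sketch of variance and standard deviation, assuming the usual textbook definitions: divide the sum of squared deviations by N for a population and by n − 1 for a sample. The four data values are made up for illustration.

```python
import math
import statistics

data = [4, 8, 6, 2]  # made-up values
n = len(data)
mean = sum(data) / n  # 5.0

# Squared deviations from the mean: 1, 9, 1, 9 (sum = 20).
squared_devs = [(x - mean) ** 2 for x in data]

population_variance = sum(squared_devs) / n    # treat the data as a whole population
sample_variance = sum(squared_devs) / (n - 1)  # treat the data as a sample

population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

# Cross-check against the standard library.
assert math.isclose(population_variance, statistics.pvariance(data))
assert math.isclose(sample_variance, statistics.variance(data))

print(population_variance, sample_variance, population_sd, sample_sd)
```

The standard deviation is the square root of the variance in both cases, so it stays in the same units as the data.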

New cards
45

Experiment

the process of applying some treatment and then observing the effect

  • almost always compares 2+ groups: treatment and control group

  • the individuals in an experiment are called units

New cards
46

Control group

the group that receives no treatment (or a placebo) and serves as the baseline for comparison

New cards
47

Units

the individuals in an experiment

New cards
48

Observational study

the process of observing and measuring specific characteristics without attempting to modify the individuals studied

  • tells “what’s happening” but can’t establish cause-effect relationships

  • accessing reliable records counts as observational

New cards
49

Response variable

measures outcome of a study

New cards
50

explanatory variable

explains/influences changes in the response variable

New cards
51

Design of experiment

plan for collecting the sample

New cards
52

Treatment

a specific experimental condition applied to the units/subjects

New cards
53

Variability in Experiments

There will be variability from treatment effects, experimental error, lurking variables, and confounding variables

New cards
54

Treatment effects

different treatments cause different outcomes

New cards
55

Experimental error

variability among observed values of the response variable for units receiving the same treatment; it should be kept as small as possible

New cards
56

Lurking variables

a variable that is not among the explanatory variables in a study but still affects the response variable

New cards
57

Confounding variables

2 variables are confounded when their effects on the response variable can’t be distinguished from each other

New cards
58

Principles of Experiment Design

Control, randomization, and replication

New cards
59

Control

Control the effects of lurking/confounding variables by carefully planning

New cards
60

Randomization

randomly assign experimental units to treatments to decrease bias

New cards
61

Replication

measure the effect of each treatment on many units to reduce chance variation

New cards
62

Completely Randomized Design

participants randomly assigned to treatments, so lurking variables affect each group equally
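
A minimal sketch of random assignment in a completely randomized design. The 20 unit labels, the two group names, and the helper `randomize` are hypothetical.

```python
import random

def randomize(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the original order is left untouched
    rng.shuffle(shuffled)
    groups = {name: [] for name in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical experiment with 20 units and 2 groups.
units = [f"unit{i}" for i in range(1, 21)]
groups = randomize(units, ["treatment", "control"], seed=42)
print(groups)
```

Dealing the shuffled units out in turn keeps the group sizes as equal as possible while leaving the assignment itself to chance.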

New cards
63

Randomized Block Design

the experimenter divides participants into subgroups called blocks, so that variability within blocks is less than variability between blocks. Then, participants within each block are randomly assigned to treatment groups.

New cards
64

Matched Pairs Design

a special case of randomized block design, used when only 2 treatment groups are present. Participants are grouped in pairs based on one or more blocking variables. Then, in each pair, participants are randomly assigned to different treatments

New cards
65

Placebo

a fake (inactive) treatment, such as a sugar pill, that subjects believe is real

New cards
66

Placebo effect

tendency to react to a drug/treatment regardless of function

New cards
67

Bias of Subjects

subjects may want to please researcher/hope for specific outcome (Hawthorne Effect, when people behave differently b/c they know they are being watched)

New cards
68

Bias of Researchers

people behave in ways that favor what they believe; researchers may assign subjects to groups or report results in a biased way

New cards
69

Blinding

when individuals in experiments are not aware of how subjects are assigned, so they are less likely to respond with bias

New cards
70

Single-blind study

those who could influence the results are blinded

New cards
71

Double-blind study

those who evaluate the results are blinded too

New cards
72

z-score

the number of standard deviations away from the mean a certain data value is

New cards
73

positive z-score

data value is above average

New cards
74

negative z-score

data value is below average

New cards
75

Standardizing

the process of converting a data value (often labeled x) to a z-score

New cards
76

𝑧 = (𝑥−𝜇) / 𝜎

converting x-value to z-score
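
The standardizing formula as a small Python function. The mean of 70 and SD of 10 are made-up exam-score parameters for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma

# Hypothetical exam scores with mean 70 and SD 10.
z_high = z_score(85, mu=70, sigma=10)  # above average, so positive
z_low = z_score(62, mu=70, sigma=10)   # below average, so negative
print(z_high, z_low)
```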

New cards
77

Empirical Rule

When a distribution is bell-shaped/normal, the mean and standard deviation have the following relationship:

99.7% of the data is within 3 standard deviations of the mean, 95% of the data is within 2 SD’s, and 68% of the data is within 1 SD of the mean (34% lies between the mean and 1 SD below it, and 34% lies between the mean and 1 SD above it).

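The 34, 14, 2.5 rule (on each side of the mean: about 34% within 1 SD, about 13.5%, rounded to 14%, between 1 and 2 SDs, and about 2.5% beyond 2 SDs)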
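
The rounded percentages in the rule can be compared with the exact areas under a standard normal curve. This sketch uses `statistics.NormalDist` from the standard library; the helper name `within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist(mu=0, sigma=1)  # standard normal distribution

def within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return Z.cdf(k) - Z.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within(k):.4f}")
```

The exact values are about 0.6827, 0.9545, and 0.9973, which the rule rounds for easy recall.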

New cards
78

Significantly low value

values are generally considered significant or unusual if they are (μ − 2σ) or lower

New cards
79

Significantly high value

values are generally considered significant or unusual if they are (μ + 2σ) or higher

New cards
80

Values not significant

between (μ − 2σ) and (μ + 2σ)
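
A small sketch of this 2-standard-deviation guideline, treating values at or beyond the cutoffs as significant. The mean of 100 and SD of 15 are made-up parameters, and `classify` is a hypothetical helper name.

```python
def classify(x, mu, sigma):
    """Label a value using the cutoffs mu - 2*sigma and mu + 2*sigma."""
    low_cutoff = mu - 2 * sigma
    high_cutoff = mu + 2 * sigma
    if x <= low_cutoff:
        return "significantly low"
    if x >= high_cutoff:
        return "significantly high"
    return "not significant"

# Hypothetical distribution with mean 100 and SD 15, so the cutoffs are 70 and 130.
for value in (65, 110, 130):
    print(value, classify(value, mu=100, sigma=15))
```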

New cards
81

We will use a significance % of ___ as a general guide for significant values

5%

New cards
82

Density curve

the curve we get if we scale the bell curve model so that the total area under the curve = 1

New cards
83

Probability, in a continuous probability distribution, is consequently the ____ the density curve.

area under

New cards
84

Probability Statement

P(smaller # ≤ x ≤ bigger #)
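
For a normally distributed variable, a probability statement of this form is the area under the curve between the two values. A sketch with `statistics.NormalDist`, using made-up parameters (mean 70, SD 10) and a hypothetical helper `prob_between`:

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a <= X <= b) for X ~ N(mu, sigma), computed as an area under the curve."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical exam scores: X ~ N(70, 10). Find P(60 <= X <= 80).
p = prob_between(60, 80, mu=70, sigma=10)
print(round(p, 4))
```

Because 60 and 80 are each 1 SD from the mean here, the result is close to 0.68.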

New cards
85

The graph of a normal distribution is called the ___

normal curve

New cards
86

In a normal curve…

The mean, median, and mode are EQUAL

The normal curve is bell-shaped and is symmetric about the mean.

The total area under the normal curve is EQUAL TO 1.

The normal curve approaches, but never touches, the x-axis as it extends further away from the mean.

New cards
87

Distribution of z-scores

Standard normal distribution

New cards
88

Notation

X ~ N(μ, σ), where the ~ symbol reads “is distributed”

New cards
89

The random variable X is distributed normally with mean μ and SD σ, and its z-scores follow

Z ~ N(0,1)

New cards
90

Distribution

describes the possible values of a variable, how often they occur, and what pattern they create

New cards
91

Probability distribution

does the same thing as other distributions, but describes how likely (instead of how often) the values of the variable are to occur

New cards
92

Continuous Random Variable

has an uncountable number of possible outcomes, represented by an interval on the number line

New cards
93

Discrete Random Variable

has a finite or countable number of possible outcomes that can be listed. Countable refers to the fact that there might be infinitely many values, but they can be associated with a counting process.

New cards
94

Criteria for Binomial Distribution

  1. There is a fixed number of trials/observations, labeled n.

  2. The trials are independent (the outcome of any individual trial doesn’t affect the probabilities in the other trials)

  3. Each outcome can be classified as a success or failure. The outcome that a random variable is counting is labeled the success.

  4. The probability of a success is constant for each trial. The probability of success is denoted by P(S) = p.
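
When all four criteria hold, the probability of exactly k successes can be computed with the standard binomial formula, C(n, k) × p^k × (1 − p)^(n − k). The formula is not stated on the cards, and the coin-toss numbers below are made up for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Hypothetical example: 10 tosses of a fair coin, success = heads.
n, p = 10, 0.5
probs = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(binomial_pmf(5, n, p))  # chance of exactly 5 heads
print(sum(probs))             # probabilities over all outcomes add up to 1
```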

New cards
95

Binomial Notation

X ~ Bin (n,p)

New cards
96

parameters of the distribution

number of trials (n), probability of success (p)

New cards
97

Expected Value

E(X), the mean of a random variable

New cards
98

The expected value of a random variable is a ___

weighted mean of the outcomes

New cards
99

The expected value of a discrete random variable is equal to the ____ of the random variable

mean
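
A sketch of the expected value as a weighted mean, where each outcome is weighted by its probability. The distribution below (number of heads in 2 fair coin tosses) is a made-up example.

```python
# Hypothetical discrete random variable: number of heads in 2 fair coin tosses.
outcomes = [0, 1, 2]
probabilities = [0.25, 0.5, 0.25]

# Weighted mean: each outcome multiplied by its probability, then summed.
expected_value = sum(x * p for x, p in zip(outcomes, probabilities))
print(expected_value)
```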

New cards
100

Binomial Variance

σ² = n × p × q, where q = 1 − p
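
A short sketch that applies the variance formula, along with the standard binomial mean μ = n × p (a well-known result that is not stated on this card). The values n = 20 and p = 0.25 are made up, and `binomial_summary` is a hypothetical helper name.

```python
import math

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) for X ~ Bin(n, p)."""
    q = 1 - p
    mean = n * p
    variance = n * p * q
    return mean, variance, math.sqrt(variance)

# Hypothetical example: 20 trials with success probability 0.25.
mean, variance, sd = binomial_summary(20, 0.25)
print(mean, variance, round(sd, 3))
```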

New cards
