ST 311

Description and Tags

ncsu

126 Terms

1
New cards

Data

Collections of observations (measurements, counts, and survey responses)

2
New cards

Population

Complete collection of all measurements or data that are being considered; also known as the population of interest

3
New cards

Sample

A subset of members selected from a population

4
New cards

How to select a sample

Should be random and representative of the population

5
New cards

Parameter

Numerical measurement describing some characteristic of a population

6
New cards

Statistic

Numerical measurement describing some characteristic of a sample

7
New cards

Quantitative Data

aka numerical data; consists of numbers representing counts or measurements

8
New cards

Examples of quantitative data

age of an athlete, weight of a letter

9
New cards

Categorical Data

aka qualitative data; consists of names or labels

10
New cards

Example of categorical data

college major, hometown

11
New cards

Discrete Data

results when the data values are quantitative and the possible values are countable (finite, or countably infinite)

12
New cards

Example of discrete data

the number of tosses of a coin before getting tails

13
New cards

Continuous/numerical Data

results from infinitely many possible values on a continuous scale; the values are measured, not counted

14
New cards

Example of continuous/numerical data

the arm span of high school seniors

15
New cards

Bias

those samples that are more likely to produce some outcomes than others (resulting statistics might be too high or too low)

16
New cards

Convenience

those samples that are easy to collect (often have some bias or don’t represent the population in general)

17
New cards

Volunteers

a self-selected sample of people who respond to a general appeal

18
New cards

Simple random sample

a sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen

19
New cards

Stratified sample

subdivide the population into at least two different groups, so that the subjects within the same subgroup share the same characteristics. then draw a sample from each subgroup

20
New cards

Cluster sample

divide the population area into naturally occurring sections then randomly select some of those clusters and choose all members from the selected cluster

21
New cards

Systematic sample

select some starting point and then select every nth element in the population. works well when units are in some order (e.g., houses on a block)
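
A minimal Python sketch of the simple random, stratified, and systematic sampling methods described in these cards. The function names, the `houses` list, and the `majors` groups are hypothetical, made up only for illustration.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Every possible sample of size n is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, step, seed=None):
    """Pick a random starting point, then take every `step`-th element."""
    rng = random.Random(seed)
    start = rng.randrange(step)
    return population[start::step]

def stratified_sample(strata, n_per_stratum, seed=None):
    """Draw a simple random sample from each subgroup (stratum)."""
    rng = random.Random(seed)
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

# Hypothetical population: 20 houses on a block, numbered 1 to 20.
houses = list(range(1, 21))
srs = simple_random_sample(houses, 5, seed=1)
every_fourth = systematic_sample(houses, 4, seed=1)

# Hypothetical strata: students grouped by major.
majors = {"stats": ["A", "B", "C", "D"], "biology": ["E", "F", "G"]}
by_major = stratified_sample(majors, 2, seed=1)
```

The `seed` argument is only there to make the example repeatable; a real study would not fix it.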

22
New cards

Multistage sample

collect data by using some combination of the basic sampling methods

23
New cards

Bad sampling frame

when attempting to list all members of a population, some subjects are missing. can be difficult to obtain a complete list

24
New cards

Undercoverage

the sampling frame is missing groups from the population

25
New cards

Non-response bias

some parts of the population chose not to respond

26
New cards

Response bias

responses given are not truthful

27
New cards

Wording/order

the wording or order of questions leads respondents toward a particular response

28
New cards

Experiment

the process of applying some treatment and then observing its effects. almost always compares two (or more) groups (treatment vs control)

29
New cards

Observational study

the process of observing and measuring specific characteristics without attempting to modify the individuals being studied. tells what’s happening and cannot describe cause-effect relationships

30
New cards

Response variable

measures an outcome of a study

31
New cards

Explanatory variable

explains or influences changes in the response variable

32
New cards

Treatment effects

different treatment = different outcome (what we want)

33
New cards

Experimental error

variability among observed values of the response variable for experimental units that receive the same treatment

34
New cards

Lurking variables

a variable that is not among the explanatory variables in a study and yet may influence the interpretation of the relationship among response and explanatory variables

35
New cards

Confounding variables

two variables are confounded when their effects on the response variable cannot be distinguished from each other

36
New cards

Control

control the effects of lurking/confounding variables by careful planning (control group receives no treatment)

37
New cards

Randomization

randomly assign experimental units to treatments to reduce or eliminate bias

38
New cards

Replication

measure the effect of each treatment on many units to reduce chance variation in results

39
New cards

Completely randomized design

participants are randomly assigned to treatments (including control group). Assumes that on average lurking variables will affect each treatment group equally
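
One way to sketch the random assignment in a completely randomized design is to shuffle the participants and deal them out to the groups in turn. The function name and the participant labels below are hypothetical.

```python
import random

def completely_randomized_design(units, treatments, seed=None):
    """Shuffle the units, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

# Hypothetical participants P1..P12 split between a control and a treatment group.
participants = [f"P{i}" for i in range(1, 13)]
assignment = completely_randomized_design(
    participants, ["control", "treatment"], seed=42
)
```

Because assignment depends only on the shuffle, no characteristic of a participant can influence which group they land in.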

40
New cards

Randomized block design

divides participants into subgroups called blocks. Variability within blocks is less than variability between blocks. Participants from each block are then randomly grouped.

41
New cards

Matched pairs design

used when an experiment has only 2 treatment groups; participants can be grouped into pairs and within pairs are randomly assigned to different treatments

42
New cards

The placebo effect

the tendency to react to a drug or treatment regardless of its actual physical function.

43
New cards

Hawthorne effect

behavior is different because the subject knows they are being watched

44
New cards

Blinding

When individuals associated with an experiment are not aware of how subjects have been assigned

45
New cards

Single blind study

those who could influence the results are blinded

46
New cards

Double blind study

those who evaluate the results are blinded as well

47
New cards

Measure of center

a value at or near the middle of a data set (mean, median, mode)

48
New cards

Σ

denotes a sum, “sigma”

49
New cards

x

denotes an individual data value

50
New cards

n

denotes the number of values in a sample, “sample size”

51
New cards

N

denotes the number of values in a population

52
New cards

x̄

denotes the sample mean, “x bar”

53
New cards

μ

denotes the population mean, “mu”

54
New cards

Mean

found by adding all values and dividing by the number of values in a data set (uses every data value, so it is pulled toward outliers and is a poor summary of skewed data)

55
New cards

Median

the value in the middle when listed in ascending order (not affected by outliers, can be used with any data set)

56
New cards

Mode

the value that occurs with the greatest frequency (only useful for multimodal or qualitative data)
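
As a small illustration of the three measures of center, here is a sketch using Python's standard `statistics` module. The data values are made up, with one deliberate outlier (40) to show how the mean reacts and the median does not.

```python
from statistics import mean, median, multimode

# Hypothetical data with one large outlier.
values = [2, 3, 3, 4, 5, 5, 40]

center_mean = mean(values)      # sum of values / number of values = 62 / 7
center_median = median(values)  # middle value of the sorted list
modes = multimode(values)       # every value tied for the highest frequency

print(center_mean, center_median, modes)
```

Here the outlier drags the mean well above the median, and the data set has two modes, so it is bimodal.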

57
New cards

Unimodal

dataset with one mode

58
New cards

Bimodal

dataset with two modes

59
New cards

Multimodal

dataset with more than two modes

60
New cards

Which measure of center do you choose?

Quantitative = mean or median

Categorical = mode

61
New cards

Horizontal axis of a histogram

represents quantitative data

62
New cards

Vertical axis of a histogram

represents frequency

63
New cards

Right skewed histogram

the highest bars are on the left, with a long tail stretching to the right

64
New cards

Left skewed histogram

the highest bars are on the right, with a long tail stretching to the left

65
New cards

Symmetrical

mean = median = mode (for a symmetric distribution with a single peak)

66
New cards

Right skewed (pos)

mode < median < mean

67
New cards

Left skewed (neg)

mean < median < mode

68
New cards

Range

the difference between the maximum and minimum

R = max value - min value (highly affected by outliers)

69
New cards

Interquartile range

provides a range of values that are not as affected by potential outliers

IQR = Q3 - Q1
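
A short sketch of the range and IQR formulas, using `statistics.quantiles` from the Python standard library. The data are made up, and note that quartile conventions differ between textbooks and software, so Q1 and Q3 from another tool may not match exactly.

```python
from statistics import quantiles

# Hypothetical data set.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

data_range = max(data) - min(data)  # R = max value - min value

# quantiles(..., n=4) returns the three cut points Q1, Q2 (the median), Q3.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1                       # IQR = Q3 - Q1

# Replacing the largest value with an extreme outlier changes the range
# a great deal, but leaves the IQR untouched.
with_outlier = data[:-1] + [1000]
outlier_range = max(with_outlier) - min(with_outlier)
o1, _, o3 = quantiles(with_outlier, n=4)
outlier_iqr = o3 - o1
```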

70
New cards

Variance

V = (standard deviation)²

71
New cards

Standard deviation

SD = √V

72
New cards

Standard deviation

a measure of how much data values deviate from the mean. Increases with 1 or more outliers (never negative)

73
New cards

σ²

population variance

74
New cards

σ

population standard deviation

75
New cards

s²

sample variance

76
New cards

s

sample standard deviation
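
The population and sample versions of variance and standard deviation in the cards above can be compared with Python's `statistics` module. This is only a sketch on made-up data. The detail that the sample versions divide by n - 1 and the population versions by N is standard, but it is not stated in the cards.

```python
from statistics import pvariance, pstdev, variance, stdev

# Hypothetical data, treated first as a whole population, then as a sample.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = pvariance(data)  # population variance: divides by N
pop_sd = pstdev(data)      # population standard deviation: sqrt(pop_var)

samp_var = variance(data)  # sample variance: divides by n - 1
samp_sd = stdev(data)      # sample standard deviation: sqrt(samp_var)

print(pop_var, pop_sd, samp_var, samp_sd)
```

In both cases the standard deviation is the square root of the variance, matching SD = √V.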

77
New cards

z-Scores

when you want to compare two numbers from different groups relative to their own groups

78
New cards

Positive z-score

data value is above average

79
New cards

Negative z-score

data value is below average

80
New cards

z-score equation

z=\frac{x-\mu}{\sigma}, that is, (value - mean) / standard deviation
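
A quick sketch of the z-score formula, used the way the z-score cards describe it: comparing two values relative to their own groups. The exam scores, means, and standard deviations are invented for illustration.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical: an 85 on an exam with mean 75 and standard deviation 5,
# versus a 90 on a different exam with mean 80 and standard deviation 10.
z_exam_a = z_score(85, mean=75, sd=5)   # (85 - 75) / 5  = 2.0
z_exam_b = z_score(90, mean=80, sd=10)  # (90 - 80) / 10 = 1.0
```

Although 90 is the higher raw score, the 85 is the stronger result relative to its own group because its z-score is larger.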

81
New cards

-1 \sigma  to +1 \sigma  

68% of the data lie between these

82
New cards

-2 \sigma to +2 \sigma

95% of the data lie between these

83
New cards

-3 \sigma to +3 \sigma

99.7% of the data lie between these

84
New cards

The empirical rule

for a normal distribution, approximately 68% of data falls within 1 standard deviation of the mean, 95% within 2, and 99.7% within 3.
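
The 68%, 95%, and 99.7% figures can be checked numerically. This sketch uses `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `area_within` is made up.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def area_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

within_1 = area_within(1)  # roughly 0.68
within_2 = area_within(2)  # roughly 0.95
within_3 = area_within(3)  # roughly 0.997
```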

85
New cards

Significantly low

values are considered significantly low (unusual) if they are at -2 \sigma or lower, that is, at least 2 standard deviations below the mean

86
New cards

Significantly high

values are considered significantly high (unusual) if they are at +2 \sigma or higher, that is, at least 2 standard deviations above the mean

87
New cards

Probability

represented by the area under the density curve

88
New cards

Normal distribution (total area under the curve is equal to 1)

a continuous probability distribution for a random variable. Mean, mode, and median are equal. Bell-shaped and is symmetric about the mean.

89
New cards

Parameters

The mean locates the center of the curve and the standard deviation determines its spread

90
New cards

Normal distribution

X~N(\mu, \sigma)

91
New cards

The standard normal distribution

the distribution of z-scores, has a mean of zero, and a standard deviation of one.

Z~N(0,1)
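
To connect probability as area under the curve with the standard normal distribution, the sketch below computes P(X < 115) twice: once directly, and once after converting 115 to a z-score. The distribution N(100, 15) is a made-up example.

```python
from statistics import NormalDist

# Hypothetical normal variable: X ~ N(mu=100, sigma=15).
x_dist = NormalDist(mu=100, sigma=15)
p_direct = x_dist.cdf(115)        # area under the curve to the left of 115

# Standardize 115 and use Z ~ N(0, 1) instead.
z = (115 - 100) / 15              # z = 1.0
p_standard = NormalDist().cdf(z)  # area to the left of z on the standard normal

# By symmetry, the area to the right of the mean is one half of the total area 1.
p_above_mean = 1 - x_dist.cdf(100)
```

Both routes give the same probability, about 0.84, which is why a single table of z-scores works for any normal distribution.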

92
New cards

Probability distribution

describes how likely the values of the variable are to occur

93
New cards

Binomial distribution

a binomial random variable counts the number of successes in a fixed number of trials; certain conditions must be true for the distribution to be binomial

94
New cards

Qualities to make a distribution binomial

  1. Fixed number of trials/observations labelled as “n”

  2. Independent trials (outcome of one doesn’t affect the probability in the others)

  3. Either a success (S) or failure (F)

95
New cards

Success in binomial distributions

the outcome that the random variable counts; the probability of success is constant for each trial.

96
New cards

Success equation

P(S) = p

97
New cards

Binomial equation

X~Bin(n, p)

n = # of trials & p = probability of success
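
For X~Bin(n, p), the probability of exactly k successes is given by the standard binomial formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). That formula does not appear in these cards, so treat the sketch below as a supplement; the coin-flip numbers are made up.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p): exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Hypothetical: 4 flips of a fair coin, where success = heads (p = 0.5).
n, p = 4, 0.5
p_two_heads = binomial_pmf(2, n, p)                    # 6 * 0.5**4 = 0.375
total = sum(binomial_pmf(k, n, p) for k in range(n + 1))  # all outcomes together
```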

98
New cards

Mean of binomial distribution

\mu=n\cdot p  (the mean of a random variable, aka E(x), the expected value)

99
New cards

E(x)

the expected value, a weighted mean of the outcomes (likely outcomes get more “weight” than unlikely)

100
New cards

Expected value vs mean of random variable

expected value of a discrete random variable is equal to the mean of the random variable
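
A small sketch of the expected value as a weighted mean, checked against the binomial shortcut \mu=n\cdot p. The distribution is the number of heads in two fair coin flips, chosen only as an illustration.

```python
def expected_value(distribution):
    """Weighted mean: each outcome times its probability, summed over all outcomes."""
    return sum(outcome * prob for outcome, prob in distribution.items())

# Number of heads in 2 fair coin flips, X ~ Bin(n=2, p=0.5):
# P(0) = 0.25, P(1) = 0.5, P(2) = 0.25.
heads = {0: 0.25, 1: 0.5, 2: 0.25}

e_x = expected_value(heads)  # 0*0.25 + 1*0.5 + 2*0.25 = 1.0
n_times_p = 2 * 0.5          # binomial shortcut: mu = n * p = 1.0
```

The likely outcome (1 head) contributes the most weight, and the result agrees with n·p.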