STA 215 Test 1

100 Terms

1
New cards
Population
The entire set of people or objects that you want to draw conclusions about
2
New cards
Statistical Inference
Using the results of a sample to draw conclusions about a population, along with a measure of how good those conclusions are
3
New cards
Simple Random Sample
A sample taken in such a way that every possible group of n individuals has the same chance of being the sample
4
New cards
Describing shape of a graph
Symmetry, modes, outliers
5
New cards
Variable
The characteristic being observed
6
New cards
Variables are denoted by
Capital letters
7
New cards
Value
The category or number of the variable for that individual
8
New cards
Values are denoted by
Lowercase letters
9
New cards
Two types of variables are:
Categorical and numerical
10
New cards
Ordinal variables
Can be categorical or numerical
11
New cards
Distribution of a variable tells you
Possible values of the variable and how often they occur
12
New cards
Area Principle
The areas for values in the picture must be proportional to the percent of observations which take on that value in the population or sample
13
New cards
Two pictures drawn to represent categorical variables
Pie charts and bar graphs
14
New cards
Does a bar graph have a shape? (Y/N)
N
15
New cards
Difference between bar charts using frequencies/relative frequencies/percents
Only the labelling of the y-axis; the relative areas remain the same
16
New cards
To find the angle at the tip of a wedge in a pie chart
Multiply relative frequency by 360 degrees
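The rule on this card can be sketched in a few lines of Python. The class-year counts below are made up for illustration, and `wedge_angles` is a hypothetical helper name, not something from the course.

```python
def wedge_angles(counts):
    """Return the angle (in degrees) at the tip of each category's wedge."""
    total = sum(counts.values())
    # relative frequency = count / total, and angle = relative frequency * 360.
    # Multiplying before dividing keeps the arithmetic exact for these counts.
    return {category: count * 360 / total for category, count in counts.items()}


# Hypothetical counts, invented for this example
counts = {"Freshman": 10, "Sophomore": 20, "Junior": 30, "Senior": 40}
angles = wedge_angles(counts)
```

Because the relative frequencies add up to 1, the angles always add up to 360 degrees.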
17
New cards
Area principle holds over what kind of variables
Categorical and numerical
18
New cards
Symmetric
Left and right sides are mirror images of each other
19
New cards
Skewed
If a long tail is present, the graph is skewed in the direction of the tail
20
New cards
Mode
Peak of a distribution
21
New cards
When there is more than one mode in the distribution, it usually means
There is more than one subgroup in the population
22
New cards
Outlier
A data point that falls away from the others, outside of the overall pattern
23
New cards
First thing to do when you find an outlier
Determine if the outlier is actually a correct data value
24
New cards
Two most common pictures for numerical data
Stemplots and histograms
25
New cards
If needed, stems in stemplot can be split into
2 or 5
26
New cards
What is the difference between the shape of a histogram and the shape of a stemplot?
None, they have the same shape
27
New cards
For small data sets, you should use a (stemplot/histogram)
Stemplot
28
New cards
For large data sets, you should use a (stemplot/histogram)
Histogram
29
New cards
Advantages of stemplot over histogram
Part of the original data can be recovered and it is fast to rank data from smallest to largest
30
New cards
Advantages of histogram over stemplot
Can make interval widths anything you want
31
New cards
Timeplot
Shows how a variable evolves over time
32
New cards
Does a timeplot have shape? (Y/N)
N
33
New cards
When analyzing a timeplot, you must analyze
Trends, cycles, and departures
34
New cards
Trends
A general up or down movement
35
New cards
Cycles
A recurring up and down movement
36
New cards
Departures
Outliers from the regular pattern
37
New cards
When you find a departure:
Look for a reason for it
38
New cards
It is possible to change the way that any picture looks by changing:
The scale
39
New cards
Why do categorical graphs not have a shape but numerical graphs do?
The values of a categorical variable have no inherent order, but the values of a numerical variable are ordered
40
New cards
Average
Measure of the center of the data set
41
New cards
Median
Middle number in the data set when numbers are ranked from smallest to largest
42
New cards
First step in finding the median
Rank the numbers from smallest to largest
43
New cards
Sample median is denoted by
m
44
New cards
Sample mean is denoted by
x bar
45
New cards
Mean can be found by
Adding up numbers and dividing by n
46
New cards
p% trimmed mean can be found by
Trimming the smallest and largest p% of data from the data set, then taking the mean of the values that remain
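A minimal sketch of a p% trimmed mean, assuming the usual convention that the mean is taken over whatever remains after trimming. Rounding the number of trimmed values down is an assumption made here; textbooks differ on how to handle fractional counts. The data are made up.

```python
def trimmed_mean(data, p):
    """Drop the smallest p% and largest p% of the values, then average the rest."""
    if not 0 <= p < 50:
        raise ValueError("p must be at least 0 and less than 50")
    ranked = sorted(data)
    # Number of values to drop from EACH end (rounded down: an assumption)
    k = int(len(ranked) * p / 100)
    kept = ranked[k:len(ranked) - k]
    return sum(kept) / len(kept)


# Hypothetical data with one very large value
data = [2, 4, 5, 6, 7, 8, 9, 10, 11, 98]
plain_mean = sum(data) / len(data)      # 16.0, pulled up by 98
ten_percent = trimmed_mean(data, 10)    # drops 2 and 98, leaving a mean of 7.5
```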
47
New cards
What is the difference between a parameter and a statistic?
A parameter is calculated from the population; a statistic is calculated from a sample
48
New cards
What does it mean for a statistic to be resistant to the effect of outliers?
Large changes in a few of the data points do not change the value of the statistic by very much
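To see resistance in action, the sketch below changes a single value in a small made-up data set and compares how the mean and the median react. It uses Python's standard `statistics` module.

```python
from statistics import mean, median

# Hypothetical data set, invented for this example
original = [3, 5, 7, 9, 11]
# Same data with the largest value replaced by an extreme one
with_outlier = [3, 5, 7, 9, 1100]

mean_before, mean_after = mean(original), mean(with_outlier)
median_before, median_after = median(original), median(with_outlier)
# The mean jumps from 7 to 224.8; the median stays at 7.
```

The median only depends on the middle of the ranked list, so one extreme value leaves it unchanged here, while the mean moves a long way.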
49
New cards
If there are outliers in the data set, the (mean/median) is the better measure of the center of the data set
Median
50
New cards
If there are no outliers in the data set, the (mean/median) is the better measure of the center of the data set
Mean
51
New cards
Why is the mean the better measure of center for a data set with no outliers?
It uses all the data, not just the middle values
52
New cards
A parameter is (constant/variable)
Constant
53
New cards
A statistic is (constant/variable)
Variable
54
New cards
In a picture of a distribution, the mean is
The balance point
55
New cards
In a picture of the distribution, the median is
Where the area is split in half
56
New cards
If the distribution is roughly symmetric, the mean and median will be
Close to the same number
57
New cards
If the mean and the median are the same number, the distribution is
Cannot tell
58
New cards
How do you know if data is skewed or has outliers?
Draw a picture
59
New cards
The pth percentile of a data set
A number such that at least p% of the numbers in the data set are at or below it and at least (100 – p)% of the numbers in the data set are at or above it
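The definition on this card can be checked directly. The sketch below is a hypothetical helper, `is_pth_percentile`, that tests whether a candidate number satisfies both conditions; it is not a method for computing percentiles, and the data are made up.

```python
def is_pth_percentile(data, value, p):
    """Check the two conditions in the definition of the pth percentile."""
    n = len(data)
    at_or_below = sum(1 for x in data if x <= value)
    at_or_above = sum(1 for x in data if x >= value)
    # "At least p%" and "at least (100 - p)%", written without division
    return at_or_below * 100 >= p * n and at_or_above * 100 >= (100 - p) * n


# Hypothetical data set
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
fifty_ok = is_pth_percentile(data, 5.5, 50)    # True: 5 values below, 5 above
fifty_bad = is_pth_percentile(data, 3, 50)     # False: only 30% are at or below 3
```

Note that more than one number can satisfy the definition: in this data set every value from 5 to 6 qualifies as a 50th percentile.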
60
New cards
The 50th percentile is the
Median
61
New cards
5 numbers in the five number summary
Min, Q1, median, Q3, max
62
New cards
The five number summary splits the data into four parts that each contain about what fraction of the data?
One fourth
63
New cards
Box plot
A picture of the five number summary
64
New cards
Lower fence
Q1 - 1.5(IQR)
65
New cards
Upper fence
Q3 + 1.5(IQR)
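A short sketch of both fence formulas. It assumes IQR = Q3 - Q1 and the common boxplot convention that points outside the fences are flagged as possible outliers; neither is stated on these cards. The quartiles are passed in directly because different quartile rules give slightly different values, and all numbers are made up.

```python
def fences(q1, q3):
    """Return (lower fence, upper fence) from the first and third quartiles."""
    iqr = q3 - q1  # interquartile range (assumed definition)
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def outside_fences(data, q1, q3):
    """List the values that fall below the lower fence or above the upper fence."""
    lower, upper = fences(q1, q3)
    return [x for x in data if x < lower or x > upper]


# Hypothetical quartiles and data
lower, upper = fences(q1=10, q3=20)                            # (-5.0, 35.0)
flagged = outside_fences([8, 12, 15, 18, 22, 40], q1=10, q3=20)  # [40]
```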
66
New cards
Can you tell the shape of a distribution from a boxplot? (Y/N)
N
67
New cards
Biggest use of boxplots
Side by side comparison of data sets
68
New cards
If the data set has outliers or is skewed, the (IQR/standard deviation) is used
IQR
69
New cards
Spread
The average distance of all points from the middle
70
New cards
The mean of the deviations from the mean is always
Zero
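A quick numeric check of this card, reading "deviation" as each value minus the mean of the data set. The three data values are made up.

```python
from statistics import mean

# Hypothetical data set
data = [2, 4, 9]
center = mean(data)                       # 5
deviations = [x - center for x in data]   # [-3, -1, 4]
mean_deviation = mean(deviations)         # negative and positive deviations cancel
```

The cancellation happens for any data set, which is why spread is measured with squared (or absolute) deviations instead.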
71
New cards
Why is sample standard deviation divided by n - 1?
Samples tend to be less spread out than the populations that they are drawn from
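The effect can be shown exactly on a tiny example. The sketch below lists every possible sample of size 2 from a made-up population of three values, assuming sampling with replacement, and averages the sample variance (the square of the standard deviation) computed with divisor n and with divisor n - 1.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3]   # hypothetical population
n = 2                    # sample size

mu = Fraction(sum(population), len(population))
population_variance = sum((x - mu) ** 2 for x in population) / len(population)


def sum_of_squared_deviations(sample):
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)


# Every ordered sample of size n, drawn with replacement (an assumption)
samples = list(product(population, repeat=n))

average_dividing_by_n = sum(
    sum_of_squared_deviations(s) / n for s in samples) / len(samples)
average_dividing_by_n_minus_1 = sum(
    sum_of_squared_deviations(s) / (n - 1) for s in samples) / len(samples)
```

Here the population variance is 2/3. Dividing by n averages out to 1/3, which is too small, while dividing by n - 1 averages out to exactly 2/3.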
72
New cards
Sampling distribution
Distribution of a statistic, treated as a variable whose value changes from sample to sample
73
New cards
Experiment
Observation of a random phenomenon to see what happens
74
New cards
Trial
A single repetition of an experiment
75
New cards
Sample space
Set of all possible outcomes for an experiment
76
New cards
Event
A set of outcomes in the sample space S that possesses some characteristic
77
New cards
Disjoint/mutually exclusive
Two events with no outcomes in common and therefore cannot happen at the same time
78
New cards
Complement of A
All elements in S that are not in A
79
New cards
Probability of an event
Its long-term relative frequency
80
New cards
Independence
Two events are independent if knowing that one event happened does not affect the probability of the other event
81
New cards
Law of Large Numbers
The long-run relative frequency of repeated independent events gets closer to the true relative frequency as the number of trials increases
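A small simulation can illustrate the idea, assuming a fair coin with P(heads) = 0.5. The seed and checkpoint values are arbitrary choices made for this sketch, and a simulation only illustrates the law; it does not prove it.

```python
import random


def running_relative_frequency(num_trials, seed=215):
    """Simulate fair coin flips and record the relative frequency of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    checkpoints = {}
    for trial in range(1, num_trials + 1):
        heads += rng.random() < 0.5   # True counts as 1 head
        if trial in (10, 100, 1000, 10000, 100000):
            checkpoints[trial] = heads / trial
    return checkpoints


frequencies = running_relative_frequency(100000)
# frequencies maps each checkpoint to the relative frequency of heads so far
```

Early checkpoints can be far from 0.5, but the later ones settle close to it as the number of trials grows.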
82
New cards
The probability P(A) of any event A is between
0 and 1
83
New cards
If S is the sample space, P(S) =
1
84
New cards
P(A U B) =
P(A) + P(B) - P(AB)
85
New cards
P(AB) =
P(A)P(B|A)
86
New cards
P(A|B) =
P(AB) / P(B)
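The three rules above can be checked on one roll of a fair six-sided die, assuming all six outcomes are equally likely. The events A (even number) and B (number greater than 3) are made up for this sketch.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll
A = {2, 4, 6}            # hypothetical event: roll is even
B = {4, 5, 6}            # hypothetical event: roll is greater than 3


def prob(event):
    """Probability of an event when every outcome in S is equally likely."""
    return Fraction(len(event & S), len(S))


p_a_or_b = prob(A | B)          # P(A U B), counted directly: 2/3
p_ab = prob(A & B)              # P(AB), outcomes in both A and B: 1/3
p_b_given_a = p_ab / prob(A)    # P(B|A) = P(AB) / P(A): 2/3
p_a_given_b = p_ab / prob(B)    # P(A|B) = P(AB) / P(B): 2/3

addition_rule_holds = p_a_or_b == prob(A) + prob(B) - p_ab
multiplication_rule_holds = p_ab == prob(A) * p_b_given_a
```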
87
New cards
Discrete random variable
Takes on a finite or countably infinite number of values
88
New cards
Continuous random variable
Can take on any value in an interval
89
New cards
For a continuous random variable X, the probability that X equals any single value is
Zero
90
New cards
z-scores always have mean
Zero
91
New cards
z-scores always have standard deviation
One
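A sketch that checks both z-score cards on a made-up data set. It assumes the usual formula z = (x - mean) / standard deviation, which is not spelled out on these cards, and uses the sample standard deviation throughout.

```python
from statistics import mean, stdev


def z_scores(data):
    """Standardize each value: subtract the mean, divide by the standard deviation."""
    center = mean(data)
    spread = stdev(data)   # sample standard deviation (divides by n - 1)
    return [(x - center) / spread for x in data]


# Hypothetical data set
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)
z_mean = mean(z)   # 0, up to floating-point rounding
z_sd = stdev(z)    # 1, up to floating-point rounding
```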
92
New cards
What % of observations are within one standard deviation of the mean under a normal curve?
68
93
New cards
What % of observations are within two standard deviations of the mean under a normal curve?
95
94
New cards
What % of observations are within three standard deviations of the mean under a normal curve?
99.7
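The three percentages on these cards are rounded values. They can be reproduced with `statistics.NormalDist` from the Python standard library (Python 3.8 or later); the helper name `within` is made up for this sketch.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


one_sd = within(1)     # about 0.6827
two_sd = within(2)     # about 0.9545
three_sd = within(3)   # about 0.9973
```

Working with the standard normal curve is enough, because the proportions are the same for every normal distribution once distances are measured in standard deviations.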
95
New cards
How can you tell if a data set has a normal distribution?
Draw a picture
96
New cards
Sampling distribution of a statistic
Distribution of the numbers you get if you take every possible sample of size n from the population and evaluate the statistic for each one
97
New cards
What does it mean for a statistic to be unbiased?
The mean of its sampling distribution equals the parameter it’s estimating
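As an illustration, the sketch below builds the full sampling distribution of the sample mean for a made-up population of four values, taking every possible sample of size 2 without replacement, and compares its mean with the population mean.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4]   # hypothetical population
n = 2                       # sample size

population_mean = Fraction(sum(population), len(population))

# Evaluate the statistic (the sample mean) for every possible sample of size n
sample_means = [Fraction(sum(s), n) for s in combinations(population, n)]

mean_of_sampling_distribution = sum(sample_means) / len(sample_means)
# Both population_mean and mean_of_sampling_distribution equal 5/2
```

Individual sample means range from 3/2 to 7/2, but they average out to the parameter, which is what unbiased means.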
98
New cards
Why is the sample range an underestimate?
The sample range cannot be bigger than the population range, but it can be smaller. On average, it will be too small
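The same kind of enumeration shows the bias in the sample range. The population is made up, and samples of size 2 are taken without replacement.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3, 4]   # hypothetical population
population_range = max(population) - min(population)   # 3

# Range of every possible sample of size 2
sample_ranges = [max(s) - min(s) for s in combinations(population, 2)]

average_sample_range = Fraction(sum(sample_ranges), len(sample_ranges))
# No sample range exceeds 3, and their average is 5/3
```

Only the one sample that contains both 1 and 4 reaches the population range; every other sample falls short, so the average is too small.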
99
New cards
Spread of sampling distribution tells you:
Margin of error
100
New cards
Decrease margin of error by:
Increasing sample size
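One way to see why a larger sample helps: the sketch below measures the spread of the sampling distribution of the sample mean for several sample sizes. It assumes a made-up population (the faces of a die) and sampling with replacement, and it uses the standard deviation of the sample means as the measure of spread.

```python
from itertools import product
from statistics import pstdev

population = [1, 2, 3, 4, 5, 6]   # hypothetical population


def spread_of_sample_means(n):
    """Standard deviation of the sample mean over every possible sample of size n."""
    sample_means = [sum(s) / n for s in product(population, repeat=n)]
    return pstdev(sample_means)


spreads = {n: spread_of_sample_means(n) for n in (1, 2, 4)}
# The spread shrinks as n grows; quadrupling n cuts it in half.
```

Since the margin of error is tied to this spread, a larger sample size gives a smaller margin of error.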