ST 311 Midterm 1 NCSU

58 Terms

1

Parameter

A parameter is a numerical summary of a variable for the entire population (typically unknown)

2

Variable

A variable is the characteristic of the units that we want to learn about

3

Statistic

A statistic is the numerical summary of a variable for a sample (used to estimate the parameter)

4

Sampling Frame

A sampling frame is the list of units from which a sample is selected

5

Census

A census is a special case when every unit in the population is measured or surveyed. It is often difficult or impossible to conduct a census.

6

Sample

A sample is the part of the population that we actually examine in order to gather information. In equations, the sample size is represented as n.

7

Population

A population is the entire group of items or individuals (units) that we want information about.

8

Randomization

Randomization is necessary to ensure meaningful inference.

9

The Sample Statistic is used to estimate...

The sample statistic is used to estimate the population parameter.

10

Random Sampling

Random Sampling uses a chance mechanism to select units, which avoids bias. With Random Sampling we are more likely to get a sample statistic that reflects the population parameter. Appropriate inference is only assured when a random sample is selected.

11

A biased sample may produce sample statistics that are...

A biased sample may produce sample statistics that are consistently higher or lower than the population parameter.

12

Voluntary Response Sample

This is a bad sampling method. In a Voluntary Response Sample, only those who volunteer to participate are included in the sample. People in a Voluntary Response Sample tend to have stronger opinions than the general population. Example: online polls.

13

Convenience Sample

This is a bad sampling method. In a Convenience Sample, the most convenient or readily available group is used as the sample. Example: surveying people walking by the Brickyard.

14

Simple Random Sample (SRS)

This is a good sampling method. In a Simple Random Sample, every different possible sample of the desired size has the same chance of being selected.
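
A minimal sketch of drawing a simple random sample with Python's standard library. The population of unit IDs, the sample size of 10, and the seed are assumptions made up for illustration; they are not from the course material.

```python
import random

# Hypothetical population: unit IDs 1 through 100.
population = list(range(1, 101))

rng = random.Random(311)  # fixed seed so the draw can be repeated

# random.sample draws without replacement, so every possible group of
# 10 distinct units is equally likely to be the sample.
srs = rng.sample(population, k=10)
print(sorted(srs))
```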

15

Stratified Random Sample

This is a good sampling method. A Stratified Random Sample is when the population is first divided into non-overlapping groups called strata. Then, a random sample is selected from each group. Within a stratum, every person has the same chance of being selected.
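
A short sketch of stratified random sampling, assuming made-up strata (class years) and a sample of 5 per stratum. The names and sizes are hypothetical.

```python
import random

rng = random.Random(311)

# Hypothetical non-overlapping strata: students grouped by class year.
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}

# Draw a separate random sample of 5 units from every stratum,
# so each group is guaranteed to appear in the final sample.
stratified_sample = {
    name: rng.sample(units, k=5) for name, units in strata.items()
}
print(stratified_sample)
```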

16

Cluster Sample

This is a good sampling method. In a Cluster Sample, the population is first divided into non-overlapping groups called clusters. Then a random sample of clusters is selected, and all the individuals in the selected clusters are included in the sample. Every cluster has the same chance of being selected; however, not every group is necessarily represented in the sample.
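
A short sketch of cluster sampling. The clusters (dorm floors with 8 residents each) and the choice of 2 clusters are hypothetical values for illustration.

```python
import random

rng = random.Random(311)

# Hypothetical clusters: six dorm floors with eight residents each.
clusters = {
    f"floor_{c}": [f"floor_{c}_resident_{i}" for i in range(8)]
    for c in range(1, 7)
}

# Step 1: randomly choose whole clusters, not individuals.
chosen_clusters = rng.sample(sorted(clusters), k=2)

# Step 2: include every individual from each chosen cluster.
cluster_sample = [person for name in chosen_clusters for person in clusters[name]]
print(chosen_clusters, len(cluster_sample))
```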

17

Systematic Sample

This is a good sampling method. In a Systematic Sample, the population is a list divided into consecutive segments. One individual is randomly selected from the first segment, and the individual in the same position is selected from each of the remaining segments. In other words, select every kth unit after a random starting point.
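
A short sketch of systematic sampling, assuming a hypothetical ordered list of 100 units and a segment length of k = 10.

```python
import random

rng = random.Random(311)

population = list(range(1, 101))  # hypothetical ordered list of 100 units
k = 10                            # segment length: take every 10th unit

# Pick a random position inside the first segment, then step by k.
start = rng.randrange(k)
systematic_sample = population[start::k]
print(systematic_sample)
```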

18

Selection Bias

Selection Bias is when sample participants tend to systematically differ from the population of interest.

19

Undercoverage

Type of survey bias. Undercoverage is the tendency for a sample to differ from the corresponding population because the sampling frame excludes some parts of the population (for example, a minority group left out of the frame).

20

Nonresponse Bias

Type of survey bias. Nonresponse Bias is the tendency for a sample to differ from the corresponding population because a subset of the sample cannot be contacted or does not respond.

21

Response Bias

Type of survey bias. Response Bias is the tendency for a sample to differ from the corresponding population because participants respond differently from how they truly feel.

22

Categorical Variable

A Categorical Variable places a unit into one of several groups or categories such as major, car type, hair color, or letter grades (A, A+, B, B-). Categorical Data is displayed using pie charts or bar graphs.

23

Quantitative Variable

A Quantitative Variable takes numeric values for which arithmetic operations such as adding and averaging make sense such as height, age, exam score, and points. Quantitative data is displayed using histograms, dot plots, or box plots.

24

Left Skewed

In left-skewed data, the median is typically larger than the mean, because the mean is pulled toward the long left tail.

25

Right Skewed

In right-skewed data, the mean is typically larger than the median, because the mean is pulled toward the long right tail.

26

Symmetric

In symmetric data, the mean and median are approximately equal.

27

Mean

The Mean is the average; to get the mean, add all the observed values and divide by the number of observations. The Mean is sensitive to outliers. The Mean moves in the direction of skewness.

28

Median

The Median is the middle value of ordered data. To find the Median, order all the values from smallest to largest and find the middle number.
The Median is not sensitive to outliers. The Median is not affected by skewness. About 50% of the data lie above the median and about 50% lie below it.
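
A small illustration of why the mean is sensitive to outliers while the median is not. The data values are made up for the example.

```python
from statistics import mean, median

values = [2, 3, 4, 5, 6]        # hypothetical data, roughly symmetric
with_outlier = values + [60]    # the same data plus one extreme value

# Without the outlier the two measures of center agree.
print(mean(values), median(values))

# With the outlier the mean is pulled toward it; the median barely moves.
print(mean(with_outlier), median(with_outlier))
```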

29

Center

The Center is the middle of the data, or where the distribution would balance. Measures of center include the mean (average) and the median (the middle value). Measures of center are affected by adding, subtracting, multiplying, and dividing the original data by the same value.

30

Spread

The Spread is the measure of variability. Measures of spread include the range, the interquartile range (IQR), and the standard deviation. Measures of spread are only affected by multiplying and dividing the data by the same value, not by adding or subtracting a constant.
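
A quick check of how shifting and rescaling data change center and spread. The data set and the constants 5 and 2 are arbitrary choices for illustration.

```python
from statistics import mean, stdev

data = [10, 12, 14, 16, 18]           # hypothetical data
shifted = [x + 5 for x in data]       # add the same value to every point
scaled = [x * 2 for x in data]        # multiply every point by the same value

# Adding a constant moves the mean but leaves the standard deviation alone.
print(mean(data), mean(shifted), stdev(data), stdev(shifted))

# Multiplying by a constant changes both the mean and the standard deviation.
print(mean(scaled), stdev(scaled))
```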

31

Deviations

Deviations are departures from the overall pattern of the data. Look for possible outliers, or unusual points that are not consistent with the rest of the data.

32

Interquartile Range (IQR)

The interquartile range is a single number equal to the third quartile minus the first quartile.
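
A minimal sketch of computing the IQR. It assumes the common textbook convention that the first and third quartiles are the medians of the lower and upper halves of the ordered data; software packages may use slightly different quartile rules. The data are made up.

```python
from statistics import median

data = sorted([7, 1, 13, 5, 15, 3, 11, 9])  # hypothetical data, ordered

half = len(data) // 2          # size of each half (middle value excluded if n is odd)
q1 = median(data[:half])       # first quartile: median of the lower half
q3 = median(data[-half:])      # third quartile: median of the upper half
iqr = q3 - q1
print(q1, q3, iqr)
```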

33

Standard Deviation

The Standard Deviation is a measure of how far, on average, the data values are from the mean. The Standard Deviation cannot be negative and is rarely zero (zero means there is no variation because all the values are the same number). Standard Deviation is a measure of spread and is affected by multiplying or dividing all the values by the same number. Standard Deviation takes into account all of the data.
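
A step-by-step sketch of the sample standard deviation, assuming the usual n - 1 divisor (the card does not state the formula). The data are made up, and the result is compared with `statistics.stdev`.

```python
from math import sqrt
from statistics import mean, stdev

data = [4, 8, 6, 5, 7]                      # hypothetical sample
ybar = mean(data)                           # sample mean

# Squared distance of every value from the mean.
squared_devs = [(y - ybar) ** 2 for y in data]

# Sample standard deviation: divide by n - 1, then take the square root.
s = sqrt(sum(squared_devs) / (len(data) - 1))
print(s, stdev(data))

# When every value is the same there is no variation, so the result is zero.
print(stdev([3, 3, 3]))
```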

34

Normal Distribution

The Normal Distribution is symmetric and bell shaped.
The Normal Distribution is characterized by its mean, which is at the center of the distribution, and its standard deviation, which determines its spread.

35

The Empirical Rule

The Empirical Rule says that for any bell-shaped (normal) curve, about 68% of the data fall within 1 standard deviation of the mean, about 95% fall within 2 standard deviations of the mean, and about 99.7% fall within 3 standard deviations of the mean, in either direction.
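
The percentages in the Empirical Rule can be checked against the standard normal curve. This sketch uses `statistics.NormalDist` from the Python standard library.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area within k standard deviations of the mean, for k = 1, 2, 3.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(k, round(area, 4))
```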

36

Standardize / Z Score

The Z Score is the distance between an observation and the mean, measured in number of standard deviations. We use the Z Score to find percentages or probabilities for a normal distribution with any mean and any standard deviation.
Values that are above (greater than) the mean have positive z-scores. Values that are below (less than) the mean have negative z-scores. Most z-scores will be between -3 and +3. When the original variable is normal, the z-score follows a standard normal distribution with a mean of zero and a standard deviation of 1.
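
A minimal sketch of standardizing a value: subtract the mean, then divide by the standard deviation. The function name `z_score` and the exam numbers (mean 70, standard deviation 10) are hypothetical.

```python
def z_score(value, mean, sd):
    """Number of standard deviations that value lies from the mean."""
    return (value - mean) / sd

# Hypothetical exam scores with mean 70 and standard deviation 10.
above = z_score(85, 70, 10)   # above the mean, so positive
below = z_score(60, 70, 10)   # below the mean, so negative
print(above, below)
```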

37

Finding Probabilities from Table Z

P(Z < z) = the value in Table Z for z. P(Z > z) = 1 - the value in Table Z for z.
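
The same lookups can be sketched in code, using the standard normal cumulative probability from `statistics.NormalDist` in place of the printed table. The value z = 1.5 is an arbitrary example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)
z = 1.5  # arbitrary example value

p_below = std_normal.cdf(z)   # P(Z < z): the area to the left, as the table gives
p_above = 1 - p_below         # P(Z > z): one minus the area to the left
print(round(p_below, 4), round(p_above, 4))
```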

38

The Standard Normal Distribution

The Standard Normal Distribution has a mean of ZERO and a standard deviation of ONE.

39

Sample Proportion

The Sample Proportion is the number of items that fall into a given category divided by the total number of observations in the sample. It summarizes categorical data, is written as p hat, and answers a yes-or-no question.
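
A small sketch of computing p hat from yes/no responses. The responses are made up.

```python
# Hypothetical answers to a yes-or-no survey question.
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes", "yes", "no"]

# p hat: count in the category of interest divided by the sample size.
p_hat = responses.count("yes") / len(responses)
print(p_hat)
```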

40

Sampling Variability

Sampling Variability is the variation in sample statistics that results from selecting different random samples.
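
A simulation sketch of sampling variability: several random samples from the same population give different sample proportions. The population proportion 0.4, the sample size 50, and the helper name `sample_proportion` are assumptions for illustration.

```python
import random

rng = random.Random(311)

p = 0.4   # hypothetical population proportion (one fixed value)
n = 50    # size of each random sample

def sample_proportion(rng, p, n):
    """Simulate n yes/no units and return the proportion of 'yes'."""
    return sum(rng.random() < p for _ in range(n)) / n

# Five different random samples give five values of p hat.
p_hats = [sample_proportion(rng, p, n) for _ in range(5)]
print(p_hats)
```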

41

p

The population proportion. There can be only one value of p.

42

p hat

The sample proportion. There can be several values of p hat, because each sample gives its own value.

43

Sampling Distributions are..

Sampling Distributions are predictable.

44

Distributions need to have...

Distributions need to have shape, center, and spread.

45

(Z*) Z Multiplier

The Z multiplier tells us how many standard deviations away from the true parameter we believe our estimate could be at the chosen confidence level. Look it up in Table Z or Table t.

46

Confidence Level

The Confidence Level is the proportion of samples, if we took many samples from the same population, that would produce a confidence interval containing the true population parameter.
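
A simulation sketch of what a 95% confidence level means. It assumes the standard large-sample interval for a proportion (p hat plus or minus 1.96 standard errors), which this card does not spell out, and a hypothetical population proportion of 0.5 that is known only because the data are simulated.

```python
import random
from math import sqrt

rng = random.Random(311)

p_true = 0.5    # hypothetical population proportion
n = 100         # size of each sample
z_star = 1.96   # multiplier for a 95% confidence level
reps = 1000     # number of repeated samples

covered = 0
for _ in range(reps):
    p_hat = sum(rng.random() < p_true for _ in range(n)) / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    # Count the intervals that contain the true parameter.
    if p_hat - margin <= p_true <= p_hat + margin:
        covered += 1

coverage = covered / reps
print(coverage)  # expected to land near 0.95
```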

47

If an outlier is present in a data set it...

If an outlier is present in a data set it can make the mean and median very different from each other

48

If N is less than 30 and the population is skewed then the sampling distribution will be..

If N is less than 30 and the population is skewed then the sampling distribution will be skewed but not as much as the population.

49

If N (sample) is more than 30 and the population is skewed then the sampling distribution will be...

If N is more than 30 and the population is skewed then the sampling distribution will be approximately normal

50

If N (sample) is less than 30 and the population is normal then the sampling distribution will be...

If N is less than 30 and the population is normal then the sampling distribution will be approximately normal

51

If N (sample) is large and the population is normal then the sampling distribution will be...

If N is large and the population is normal then the sampling distribution will be approximately normal.

52

Does the sampling distribution always have more or less variability than the population?

The sampling distribution of the sample mean has less variability than the population (for any sample size greater than 1).

53

As the sample size INCREASES the variability in the sample mean...

As the sample size INCREASES the variability in the sample mean DECREASES.
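
A simulation sketch comparing the variability of the sample mean for a small and a large sample size. The uniform population, the sizes 10 and 100, and the helper name `sd_of_sample_means` are assumptions for illustration.

```python
import random
from statistics import stdev

rng = random.Random(311)

def sd_of_sample_means(n, reps=1000):
    """Standard deviation of the sample mean across many samples of size n."""
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(reps)]
    return stdev(means)

sd_small = sd_of_sample_means(10)    # small samples: means vary more
sd_large = sd_of_sample_means(100)   # large samples: means vary less
print(sd_small, sd_large)
```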

54

Central Limit Theorem (CLT)

The Central Limit Theorem states that if the variable Y follows ANY distribution with mean μ and standard deviation σ, and the sample size is large (more than 30), then Y BAR from a simple random sample approximately follows a NORMAL DISTRIBUTION.
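
A simulation sketch of the Central Limit Theorem using a strongly right-skewed population. The exponential population (mean 1, standard deviation 1), the sample size 40, and the number of repetitions are assumptions chosen for illustration. The check at the end uses the standard fact that the sample mean has standard deviation equal to the population standard deviation divided by the square root of n.

```python
import random
from math import sqrt
from statistics import mean, stdev

rng = random.Random(311)

n = 40        # sample size, larger than 30
reps = 2000   # number of repeated samples

# Each entry is the mean of one sample from a skewed (exponential) population.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)
]

center = mean(sample_means)    # should be close to the population mean, 1
spread = stdev(sample_means)   # should be close to 1 / sqrt(n)
print(center, spread, 1 / sqrt(n))
```

Plotting a histogram of `sample_means` would show a roughly bell-shaped curve even though the population itself is skewed.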

55

If the parent population is normal then the sampling distribution will be...

If the parent population is normal then the sampling distribution will be normal no matter the sample size.

56

Inference

Inference is the process of using sample information to make conclusions about the population of interest.

57

Standard Error

Standard Error is an estimate of the standard deviation of the sampling distribution. The Standard Error depends only on sample quantities. Standard Error is key in calculating confidence intervals.
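
A minimal sketch of a standard error, assuming the usual formula for a sample proportion, sqrt(p hat × (1 - p hat) / n). The card does not give a formula, and the values of p hat and n are hypothetical.

```python
from math import sqrt

n = 100       # hypothetical sample size
p_hat = 0.6   # hypothetical sample proportion

# Only sample quantities (p hat and n) appear in the calculation.
standard_error = sqrt(p_hat * (1 - p_hat) / n)
print(round(standard_error, 4))
```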

58

Margin of Error

The Margin of Error is the distance on either side of the population parameter within which most of the possible values of a sample statistic will fall.
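
A sketch of one common way to compute a margin of error: the Z multiplier times the standard error. This formula and the proportion-based standard error are assumptions that the card does not state, and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

n = 100             # hypothetical sample size
p_hat = 0.6         # hypothetical sample proportion
confidence = 0.95   # chosen confidence level

# Z multiplier: leaves (1 - confidence) / 2 in each tail of the standard normal.
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

# Confidence interval: estimate plus or minus the margin of error.
interval = (p_hat - margin_of_error, p_hat + margin_of_error)
print(round(z_star, 2), round(margin_of_error, 3), interval)
```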