ST 311 Midterm 1 NCSU

61 Terms

1

Parameter

a numerical summary of a variable for the entire population (typically unknown)

2

Variable

the characteristic of the units that we want to learn about

3

Statistic

the numerical summary of a variable for a sample (used to estimate the parameter)

4

Sampling Frame

the list of units from which a sample is selected

5

Census

  • a special case when every unit in the population is measured or surveyed

  • often difficult or impossible to conduct

6

Sample

  • the smaller group

  • part of the population we actually examine in order to gather information

  • its size is represented as n in equations

7

Population

the entire group of items or individuals (units) that we want information about.

8

Randomization

necessary to ensure a meaningful inference

9

The Sample Statistic is used to estimate...

the population parameter

10

Random Sampling

  • uses chance mechanism which avoids bias

  • we are more likely to have a sample statistic that better reflects the population parameter

  • Appropriate inference is only assured when a random sample is selected.

11

A biased sample may produce sample statistics that are...

consistently higher or lower than the population parameter.

12

Voluntary Response Sample

  • This is a bad sampling method.

  • only those who volunteer to participate are included in the sample

  • volunteers tend to have stronger opinions than the general population

  • ex. online polls

13

Convenience Sample

  • This is a bad sampling method

  • the most convenient or readily available group is considered as the sample

  • ex. people walking by the Brickyard

14

Simple Random Sample (SRS)

  • This is a good sampling method

  • every different possible sample of the desired size has the same chance of being selected.
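
A minimal sketch of drawing a simple random sample, assuming a hypothetical sampling frame of 100 numbered units (the frame, seed, and sample size are illustrative, not from the course). `random.sample` draws without replacement, so every possible set of 10 units is equally likely.

```python
import random

# Hypothetical sampling frame: units numbered 1 through 100.
sampling_frame = list(range(1, 101))

rng = random.Random(311)  # fixed seed so the sketch is reproducible

# Draw n = 10 units without replacement; every possible set of 10 units
# has the same chance of being chosen.
srs = rng.sample(sampling_frame, 10)
print(sorted(srs))
```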

15

Stratified Random Sample

  • This is a good sampling method

  • when the population is first divided into non-overlapping groups called strata

  • Then, a random sample is selected from each group.

  • Within a stratum, every person has the same chance of being selected.
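
The two steps above can be sketched as follows. The strata (class years), the unit labels, and the choice to sample the same 10% fraction from each stratum are assumptions made for illustration; the card only requires a random sample from every stratum.

```python
import random

# Hypothetical population split into non-overlapping strata (class year).
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}

rng = random.Random(311)

def stratified_sample(strata, fraction, rng):
    """Take a simple random sample of the given fraction from every stratum."""
    sample = {}
    for name, units in strata.items():
        k = max(1, round(len(units) * fraction))  # at least one unit per stratum
        sample[name] = rng.sample(units, k)
    return sample

stratified = stratified_sample(strata, 0.10, rng)
print({name: len(units) for name, units in stratified.items()})
```

Because every stratum contributes units, all groups are guaranteed to appear in the sample.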

16

Cluster Sample

  • This is a good sampling method

  • the population is first divided into non-overlapping groups called clusters

  • Then a random sample of clusters is selected and all the individuals in the selected cluster are included in the sample

  • Every cluster has the same chance of being selected; however, not all clusters are necessarily represented in the sample.
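
A minimal sketch of cluster sampling, assuming hypothetical residence halls as the clusters (hall names and residents are made up for illustration). Whole clusters are chosen at random, and then everyone in a chosen cluster is included.

```python
import random

# Hypothetical clusters: residence halls, each listing its residents.
clusters = {
    "Hall A": ["A1", "A2", "A3"],
    "Hall B": ["B1", "B2"],
    "Hall C": ["C1", "C2", "C3", "C4"],
    "Hall D": ["D1", "D2", "D3"],
}

rng = random.Random(311)

# Step 1: randomly choose whole clusters (each cluster is equally likely).
chosen = rng.sample(sorted(clusters), 2)

# Step 2: include every individual from the chosen clusters.
cluster_sample = [person for hall in chosen for person in clusters[hall]]
print(chosen, cluster_sample)
```

The halls that were not chosen contribute nobody, which is why some groups can be missing from a cluster sample.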

17

Systematic Sample

  • This is a good sampling method

  • when the population is a list divided into consecutive segments

  • One individual is randomly selected from the first segment and the same position is selected from each of the remaining groups

    • select every kth unit from the random starting point
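
The procedure above can be sketched as follows, assuming a hypothetical ordered roster of 100 units and k = 10 (both illustrative). A random position is chosen in the first segment, and the same position is taken from every later segment.

```python
import random

def systematic_sample(units, k, rng):
    """Pick a random start within the first k units, then take every k-th unit."""
    start = rng.randrange(k)  # random position in the first segment
    return units[start::k]

roster = list(range(1, 101))  # hypothetical ordered list of 100 units
rng = random.Random(311)
picked = systematic_sample(roster, 10, rng)
print(picked)
```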

18

Selection Bias

when sample participants tend to systematically differ from the population of interest.

19

Undercoverage

  • Type of survey bias.

  • the tendency for a sample to differ from the corresponding population because the sampling frame excludes some parts of the population

    • ex. minority groups that are missing from the sampling frame

20

Nonresponse Bias

  • Type of survey bias

  • the tendency for a sample to differ from the corresponding population because a subset of the sample cannot be contacted or does not respond.

21

Response Bias

  • Type of survey bias

  • the tendency for a sample to differ from the corresponding population because participants respond differently from how they truly feel.

22

Categorical Variable

  • places a unit into one of several groups or categories such as major, car type, hair color, or letter grades (A, A+, B, B-).

  • displayed using pie charts or bar graphs.

23

Quantitative Variable

  • takes numeric values for which arithmetic operations such as adding and averaging make sense such as height, age, exam score, and points

  • displayed using histograms, dot plots, or box plots.

24

Left Skewed

  • the median is typically bigger than the mean

  • most of the data sit toward the right of the graph, with a long tail to the left

25

Right Skewed

  • the mean is typically bigger than the median

  • most of the data sit toward the left of the graph, with a long tail to the right

26

Symmetric

  • the mean and median are approximately equal

  • ex. bell curve shape

27

Mean

  • the average

  • add all the observed values and divide by the number of observations

  • sensitive to outliers

  • moves in the direction of skewness.

28

Median

  • the middle value of ordered data

  • order all the values from smallest to largest and find the middle number.

  • not sensitive to outliers

  • not affected by skewness

  • about 50% of the data lie above it and 50% below it.
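
A small sketch of how one outlier pulls the mean but barely moves the median. The exam scores are hypothetical values chosen for illustration.

```python
from statistics import mean, median

scores = [70, 72, 75, 78, 80]      # hypothetical exam scores
with_outlier = scores + [200]      # the same scores plus one extreme value

# Without the outlier the mean and median agree.
print(mean(scores), median(scores))

# With the outlier the mean jumps toward it; the median shifts only slightly.
print(mean(with_outlier), median(with_outlier))
```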

29

Center

  • the middle of the data, or where the distribution would balance

  • Measures include mean (average) and median (the middle value)

  • Measures are affected by adding, subtracting, multiplying, and dividing the original data by the same value.

30

Spread

  • the measure of variability

  • measures include range, interquartile range (IQR) and Standard Deviation

  • Measures are affected only by multiplying and dividing, not by adding or subtracting the same value.
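
A quick check of how shifting and rescaling data affect center and spread, using hypothetical values. The helper `value_range` is defined here only for illustration.

```python
from statistics import mean, stdev

def value_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)

data = [2, 4, 4, 6, 9]              # hypothetical values
shifted = [x + 10 for x in data]    # add the same constant to every value
scaled = [x * 3 for x in data]      # multiply every value by the same constant

for label, values in (("original", data), ("plus 10", shifted), ("times 3", scaled)):
    print(label, mean(values), value_range(values), round(stdev(values), 3))
```

Adding 10 moves the mean by 10 but leaves the range and standard deviation unchanged; multiplying by 3 triples all three.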

31

Deviations

  • Deviations from overall pattern

  • Look for possible outliers, or unusual points that are not consistent with the rest of the data.

32

Interquartile Range (IQR)

a single number equal to the third quartile minus the first quartile.

33

Standard Deviation

  • a measure of how far, on average, the data values are from the mean

  • cannot be negative, and is rarely zero (zero means there is no variation because all the values are the same number)

  • a measure of Spread and is affected by multiplying or dividing all the values by the same number

  • takes into account all of the data.
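
A sketch of the calculation by hand, assuming the usual sample standard deviation that divides by n - 1 (the card does not state the formula, so this is an assumption). The data values are hypothetical.

```python
from math import sqrt
from statistics import stdev

values = [2, 4, 4, 6, 9]  # hypothetical data
n = len(values)
y_bar = sum(values) / n   # the mean

# Sum of squared deviations from the mean, divided by n - 1.
variance = sum((y - y_bar) ** 2 for y in values) / (n - 1)
s = sqrt(variance)

print(s, stdev(values))   # the hand calculation matches statistics.stdev
```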

34

Normal Distribution

  • bell shape

  • symmetric

  • characterized by its mean, which is at the center of the distribution, and its standard deviation.

35

The Empirical Rule

  • for any bell-shaped curve, approximately 68% of the data will fall within 1 standard deviation of the mean in either direction

  • 95% of the data will fall within 2 standard deviations of the mean in either direction

  • 99.7% of the observations will fall within 3 standard deviations of the mean in either direction.
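
The three percentages can be checked against the normal curve with Python's `statistics.NormalDist`. This is a verification sketch, not part of the course materials.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

# Area under the curve within k standard deviations of the mean.
within = {k: std_normal.cdf(k) - std_normal.cdf(-k) for k in (1, 2, 3)}

for k, area in within.items():
    print(f"within {k} SD: {area:.4f}")
```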

36

Standardize / Z Score

  • the distance between an observation and the mean, measured in terms of number of standard deviations

  • used to find percentages or probabilities for a normal distribution with any mean and any standard deviation.

  • values that are above (greater than) the mean will have positive z-scores

  • values that are below (less than) the mean will have negative z-scores

  • Most z-scores will be between -3 and +3

  • z-scores of a normally distributed variable follow a standard normal distribution with a mean of zero and a standard deviation of 1.
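
A minimal sketch of standardizing, assuming a hypothetical exam with mean 70 and standard deviation 10. The function name `z_score` is illustrative.

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd

# Hypothetical exam: mean 70, standard deviation 10.
above = z_score(85, 70, 10)   # a score above the mean gives a positive z-score
below = z_score(60, 70, 10)   # a score below the mean gives a negative z-score
print(above, below)
```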

37

Finding Probabilities from Table Z

  • P(Z < number) = use the value in the table

  • P(Z > number) = use 1 - (the value in the table).
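
The same lookups can be reproduced in code, with `NormalDist.cdf` standing in for Table Z. The z-value 1.25 is an arbitrary example.

```python
from statistics import NormalDist

std_normal = NormalDist(0, 1)

z = 1.25                        # arbitrary example z-score
p_below = std_normal.cdf(z)     # P(Z < 1.25): the value a Z table reports
p_above = 1 - p_below           # P(Z > 1.25): one minus the table value

print(round(p_below, 4), round(p_above, 4))
```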

38

The Standard Normal Distribution

has a mean of ZERO and a standard deviation of ONE.

39

Sample Proportion

  • the number of items that fall into a given category divided by the total number of observations in your sample

  • Categorical Data

  • It is written as p hat

  • Answers a yes or no question.
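
A one-line calculation of p hat, assuming a hypothetical list of yes/no survey responses.

```python
# Hypothetical answers to a yes/no question from a sample of 8 people.
responses = ["yes", "no", "yes", "yes", "no", "yes", "no", "yes"]

# p hat = number in the category of interest / total number of observations.
p_hat = responses.count("yes") / len(responses)
print(p_hat)
```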

40

Sampling Variability

the variation in sample statistics that results from selecting different random samples.

41

p

  • population proportion

  • There can be only one value of p.

42

p hat

  • sample proportion

  • There can be several values of p hat.
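
A small simulation of why p has one value while p hat has many. The population proportion 0.40, the sample size 50, and the helper `sample_proportion` are assumptions for illustration; in practice p is unknown.

```python
import random

rng = random.Random(311)
p = 0.40   # hypothetical population proportion (a single fixed value)
n = 50     # sample size

def sample_proportion(p, n, rng):
    """Simulate n yes/no responses and return the proportion of 'yes'."""
    yes_count = sum(rng.random() < p for _ in range(n))
    return yes_count / n

# Five different random samples give five (usually different) values of p hat.
p_hats = [sample_proportion(p, n, rng) for _ in range(5)]
print(p_hats)
```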

43

Sampling Distributions are...

predictable

44

Distributions need to have...

shape, center, and spread

45

(Z*) Z Multiplier

  • tells us how many standard errors away from the true parameter we allow our estimate to be at the chosen confidence level

  • Use Table Z or Table t.
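
For the Table Z case only, the multiplier can be computed instead of looked up. This sketch uses `NormalDist.inv_cdf`; the function name `z_multiplier` is illustrative, and Table t values are not covered here.

```python
from statistics import NormalDist

def z_multiplier(confidence_level):
    """Return z* so that the middle `confidence_level` of a standard normal lies within +/- z*."""
    tail = (1 - confidence_level) / 2        # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

multipliers = {level: round(z_multiplier(level), 3) for level in (0.90, 0.95, 0.99)}
print(multipliers)
```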

46

Confidence Level

if we take many samples from the same population, the proportion of samples that will produce a confidence interval that contains the true population parameter.
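
This definition can be illustrated by simulation. The sketch below assumes a hypothetical true proportion of 0.40, samples of size 100, and the common large-sample interval p hat ± z* × sqrt(p hat(1 - p hat)/n); none of these specifics come from the card.

```python
import random
from math import sqrt

rng = random.Random(311)
p = 0.40        # hypothetical true proportion (known only because we simulate)
n = 100         # sample size
z_star = 1.96   # multiplier for 95% confidence
trials = 1000   # number of repeated samples

covered = 0
for _ in range(trials):
    p_hat = sum(rng.random() < p for _ in range(n)) / n
    margin = z_star * sqrt(p_hat * (1 - p_hat) / n)
    # Count the sample if its interval captures the true proportion.
    if p_hat - margin <= p <= p_hat + margin:
        covered += 1

coverage = covered / trials
print(coverage)
```

The printed proportion should land near 0.95, which is what a 95% confidence level describes.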

47

If an outlier is present in a data set it...

can make the mean and median very different from each other

48

If n (sample size) is less than 30 and the population is skewed then the sampling distribution will be...

skewed but not as much as the population.

49

If n (sample size) is more than 30 and the population is skewed then the sampling distribution will be...

approximately normal

50

If n (sample size) is less than 30 and the population is normal then the sampling distribution will be...

approximately normal

51

If n (sample size) is large and the population is normal then the sampling distribution will be...

approximately normal

52

Does the sampling distribution always have more or less variability than the population?

less variability

53

As the sample size INCREASES the variability in the sample mean...

decreases

54

Central Limit Theorem (CLT)

  • states that if the variable Y follows ANY distribution with mean μ and standard deviation σ, and the sample size n is large (more than 30), then Y BAR from a simple random sample follows an approximately NORMAL DISTRIBUTION with mean μ and standard deviation σ/√n.
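
A simulation sketch of the theorem, assuming a right-skewed exponential population with mean 1 and standard deviation 1 (chosen only for illustration). Sample means from samples of size 40 should center near 1 with a spread near 1/√40 ≈ 0.158.

```python
import random
from statistics import mean, stdev

rng = random.Random(311)
n = 40              # sample size (more than 30)
num_samples = 2000  # number of simulated samples

# Right-skewed population: exponential with mean 1 and standard deviation 1.
sample_means = [
    sum(rng.expovariate(1.0) for _ in range(n)) / n
    for _ in range(num_samples)
]

center = mean(sample_means)    # should be close to the population mean, 1
spread = stdev(sample_means)   # should be close to 1 / sqrt(40)
print(round(center, 3), round(spread, 3))
```

The spread of the sample means is much smaller than the population standard deviation, consistent with the cards on variability of the sampling distribution.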

55

If the parent population is normal then the sampling distribution will be...

normal no matter the sample size.

56

Inference

the process of using sample information to make conclusions about the population of interest.

57

Standard Error

  • an estimate of the standard deviation of the sampling distribution

  • only depends on sample quantities

  • key in calculating confidence intervals
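
The card gives no formula, so the sketch below assumes the two standard large-sample versions: sqrt(p hat(1 - p hat)/n) for a proportion and s/√n for a mean. Function names are illustrative.

```python
from math import sqrt

def se_proportion(p_hat, n):
    """Standard error of a sample proportion, using only sample quantities."""
    return sqrt(p_hat * (1 - p_hat) / n)

def se_mean(s, n):
    """Standard error of a sample mean, from the sample standard deviation s."""
    return s / sqrt(n)

print(se_proportion(0.5, 100))  # hypothetical p hat = 0.5, n = 100
print(se_mean(12, 36))          # hypothetical s = 12, n = 36
```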

58

Margin of Error

the distance from the population parameter that will include most of the possible values of a sample statistic
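
A worked sketch, assuming the margin of error is computed as z* times the standard error for a proportion. The values p hat = 0.60, n = 400, and 95% confidence are hypothetical.

```python
from math import sqrt

p_hat = 0.60    # hypothetical sample proportion
n = 400         # hypothetical sample size
z_star = 1.96   # multiplier for 95% confidence

standard_error = sqrt(p_hat * (1 - p_hat) / n)
margin_of_error = z_star * standard_error

# The confidence interval is the estimate plus or minus the margin of error.
interval = (p_hat - margin_of_error, p_hat + margin_of_error)
print(round(margin_of_error, 4), interval)
```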

59

The “box” in a boxplot indicates…

the start and end of the middle 50% of the data

60

The “whiskers” in a boxplot indicate…

  • the range of the data, ending at the minimum and maximum values that are not considered outliers

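    • a value is considered an outlier if it lies more than 1.5 times the Interquartile Range (IQR) below the first quartile or above the third quartile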
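
A sketch of where the box and whiskers end, assuming hypothetical data with one extreme value. Quartile conventions differ between textbooks and software; `method="inclusive"` is one choice and may not match the course's hand method exactly.

```python
from statistics import quantiles

data = sorted([1, 2, 3, 4, 5, 6, 7, 8, 30])  # hypothetical data with one extreme value

q1, med, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1                    # width of the box (middle 50% of the data)

lower_fence = q1 - 1.5 * iqr     # values below this count as outliers
upper_fence = q3 + 1.5 * iqr     # values above this count as outliers

outliers = [x for x in data if x < lower_fence or x > upper_fence]
inside = [x for x in data if lower_fence <= x <= upper_fence]

# Whiskers end at the most extreme values that are not outliers.
whisker_low, whisker_high = min(inside), max(inside)
print(q1, med, q3, iqr, outliers, whisker_low, whisker_high)
```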

61

A boxplot is constructed of…

  • the minimum value

  • the first quartile

  • the median

  • the third quartile

  • the maximum value