ST 311 Final Exam


194 Terms

1
New cards

Statistics

When referring to the field, this is the science of planning studies and experiments; obtaining data; and organizing, summarizing, analyzing, and interpreting those data.

2
New cards

Prepare

First phase of constructing a statistical study. Consider the population, data types, and sampling method.

3
New cards

Analyze

Second phase of constructing a statistical study. Describe the data you collected and use appropriate statistical methods to help with drawing conclusions.

4
New cards

Conclude

Third phase of constructing a statistical study. Using statistical inference, make reasonable judgments and answer broad questions.

5
New cards

Data

Collections of observations, such as measurements, counts, descriptions, or survey responses. They help us understand our world. Data vary and are imperfect.

6
New cards

Population

The complete collection of all measurements or data that are being considered. Typically is the complete collection of all data that we would like to better understand or describe. Also called the population of interest.

7
New cards

Sample

A subset of members selected from a population. For good results, this should be random and representative of the population.

8
New cards

Parameter

A numerical measurement describing some characteristic of a population. Note: both this and population begin with p!

9
New cards

Statistic

A numerical measurement describing some characteristic of a sample. Different from the field.

10
New cards

Quantitative data

Also known as numerical data. Consists of numbers representing counts or measurements. Examples: Age of a professional athlete, weight of a letter.

11
New cards

Categorical data

Also known as qualitative or attribute data. Consists of names or labels. Numbers used only as labels still count as categorical data. Examples: college major, hometown.

12
New cards

Discrete Data

Results when the data values are quantitative and the number of possible values is finite or “countable.” Example: the number of tosses of a coin before getting tails.

13
New cards

Continuous data

Results from infinitely many possible quantitative values, where the collection of values is not countable.

14
New cards

Important information

What you want to know and who you want to know it about.

15
New cards

Biased samples

Samples that are more likely to produce some outcomes than others. The resulting statistic may be too high or too low.

16
New cards

Convenience samples

Samples that are easy to collect. Often have some bias or do not represent the population in general.

17
New cards

Volunteer response sample

A self-selected sample of people who respond to a general appeal.

18
New cards

Random samples

Samples in which chance determines who is selected. Their results follow a predictable pattern over repeated sampling.

19
New cards

Simple random sample

A sample of n subjects is selected in such a way that every possible sample of the same size n has the same chance of being chosen. Compile a numbered list of the units in the population. Then use a computer, calculator, or table to generate random numbers. Those whose numbers are generated are selected to be in the sample.
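
The numbered-list procedure above can be sketched in Python. This is a minimal illustration, not a required method: the population of 50 numbered units, the sample size, and the seed are all made up for the example.

```python
import random

def simple_random_sample(units, n, seed=None):
    """Draw n units so that every possible sample of size n is equally likely."""
    rng = random.Random(seed)  # the seed only makes this illustration repeatable
    return rng.sample(list(units), n)  # sampling without replacement

# Hypothetical numbered list of 50 population units
population = list(range(1, 51))
sample = simple_random_sample(population, n=5, seed=311)
print(sample)
```

Here `random.sample` plays the role of the computer, calculator, or table of random numbers.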

20
New cards

Stratified sample

Subdivide the population into at least two different subgroups (strata) so that the subjects within the same subgroup share the same characteristics. Then draw a sample from each subgroup. The number sampled from each stratum may be chosen in proportion to that stratum's share of the population.
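
A rough sketch of proportional stratified sampling follows. The strata (class years), their sizes, and the function name are hypothetical, and the rounding step is an assumption: with other sizes the rounded counts may not add up exactly to the requested total.

```python
import random

def stratified_sample(strata, total_n, seed=None):
    """Sample from every stratum, in proportion to its share of the population."""
    rng = random.Random(seed)
    population_size = sum(len(members) for members in strata.values())
    sample = {}
    for name, members in strata.items():
        # Proportional allocation; rounding is a simplification (see lead-in)
        k = round(total_n * len(members) / population_size)
        sample[name] = rng.sample(members, k)
    return sample

# Hypothetical population of 100 students split by class year
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "sophomore": [f"S{i}" for i in range(30)],
    "junior": [f"J{i}" for i in range(20)],
    "senior": [f"R{i}" for i in range(10)],
}
picked = stratified_sample(strata, total_n=10, seed=1)
print({name: len(members) for name, members in picked.items()})
```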

21
New cards

Cluster sample

Divide the population area into naturally occurring sections (clusters), then randomly select some of those clusters and choose all the members from the selected clusters.
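
As a small sketch of the idea, assuming the clusters are residence halls (a made-up example), the code below picks whole clusters at random and keeps every member of each chosen cluster.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters, then take all members of each one."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    members = [person for name in chosen for person in clusters[name]]
    return chosen, members

# Hypothetical clusters: three residents in each of four halls
clusters = {
    "North Hall": ["N1", "N2", "N3"],
    "South Hall": ["S1", "S2", "S3"],
    "East Hall": ["E1", "E2", "E3"],
    "West Hall": ["W1", "W2", "W3"],
}
chosen, members = cluster_sample(clusters, num_clusters=2, seed=7)
print(chosen, members)
```

Contrast this with a stratified sample, which takes some members from every subgroup rather than all members from some subgroups.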

22
New cards

Systematic sample

Select some starting point and then select every kth element in a population. Works well when units are in some order.
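
A minimal sketch of systematic sampling, assuming an ordered list of 100 units and k = 10 (both made up). Choosing the starting point at random among the first k units is an assumption of this sketch; the card only says to select some starting point.

```python
import random

def systematic_sample(units, k, seed=None):
    """Pick a starting point among the first k units, then take every kth unit."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # index 0 .. k-1
    return units[start::k]

# Hypothetical ordered list of 100 units
units = list(range(1, 101))
every_tenth = systematic_sample(units, k=10, seed=3)
print(every_tenth)
```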

23
New cards

Multistage sample

Collect data by using some combination of the basic sampling methods.

24
New cards

Bad sampling frame

When attempting to list all members of a population, some subjects are missing. It can be difficult to obtain a full, complete list.

25
New cards

Undercoverage

The sampling frame is missing groups from the population or the groups have smaller representation in the sample than in the population.

26
New cards

Non-response bias

Some part of the population chooses not to respond, or subjects were selected but are not able to be contacted.

27
New cards

Response bias

Responses given to questions or surveys are not truthful. This may occur when people are unwilling to reveal personal matters, admit to illegal activity, or otherwise tailor their responses to “please” the investigator.

28
New cards

Wording and order

The way questions are worded may be leading or inflammatory to elicit a particular response. The order in which questions are asked may influence the answers.

29
New cards

Experiment

The process of applying some treatment and then observing its effects. Almost always compares two (or more) groups. Typically this involves a treatment group and a control group. Type of study better at establishing causation.

30
New cards

Experimental units

The individuals in experiments. Called subjects when referring to people. Object upon which the response is measured or individuals on which the treatment is done.

31
New cards

Observational study

The process of observing and measuring specific characteristics without attempting to modify the individuals being studied. Tells what is happening but cannot establish cause-and-effect relationships. Good for establishing whether two variables are related, or for learning characteristics of a population.

32
New cards

Response variable

Measures the outcome of a study.

33
New cards

Explanatory variable

Explains or influences changes in the response variable.

34
New cards

Design of experiment

Plan for collecting the sample.

35
New cards

Treatment

A specific experimental condition applied to the units or subjects.

36
New cards

Treatment effects

What we’re looking for in an experiment. Different treatments causing different outcomes.

37
New cards

Experimental error

Variability among observed values of the response variable for experimental units that receive the same treatment. We want this to be as small as possible.

38
New cards

Lurking variables

A variable that is not among the explanatory variables in a study and yet may influence the interpretation of the relationship among response and explanatory variables.

39
New cards

Confounding variables

Two variables are confounded when their effects on the response variable cannot be distinguished from each other.

40
New cards

Control group

A group that receives no treatment and is used as a baseline or comparison for the treatment group.

41
New cards

Randomization

Randomly assign experimental units already in a sample to a treatment group to reduce or eliminate bias.

42
New cards

Replication

Measure the effect of each treatment on many units to reduce chance variation in the result.

43
New cards

Completely randomized design

Participants are randomly assigned to treatments (including control groups). By randomly assigning subjects to treatments, the experimenter assumes that, on average, lurking variables will affect each treatment group equally. Any significant differences between groups can be fairly attributed to the explanatory variable.
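
One way to picture the random assignment step is the sketch below. The twelve subject labels, the two treatment names, and the shuffle-then-deal approach are illustrative assumptions; any method that gives every subject the same chance of each treatment would serve.

```python
import random

def completely_randomized(subjects, treatments, seed=None):
    """Shuffle all subjects, then deal them out to the treatments in turn."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    groups = {treatment: [] for treatment in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical subjects and treatments (the control group counts as a treatment)
subjects = [f"subject_{i}" for i in range(1, 13)]
groups = completely_randomized(subjects, ["drug", "placebo"], seed=42)
print(groups)
```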

44
New cards

Randomized block design

The experimenter divides participants into subgroups called blocks, such that the variability within blocks is less than the variability between blocks. Then, participants within each block are randomly assigned to treatments.
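
The difference from a completely randomized design is that the shuffle happens separately inside each block. In this sketch the blocking variable (age group), the subject labels, and the treatment names are all hypothetical.

```python
import random

def randomized_block(blocks, treatments, seed=None):
    """Randomly assign treatments within each block, one block at a time."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)  # randomization happens inside the block only
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks formed by age group
blocks = {
    "under 40": ["A", "B", "C", "D"],
    "40 and over": ["E", "F", "G", "H"],
}
assignment = randomized_block(blocks, ["drug", "placebo"], seed=5)
print(assignment)
```

Because each block is split evenly between treatments, the blocking variable cannot end up lopsided across treatment groups.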

45
New cards

Matched pairs designs

A special case of the randomized block design. It is used when the experiment has only two treatment groups and participants can be grouped into pairs based on one or more blocking variables. Then, within each pair, participants are randomly assigned to different treatments. This can also be done as a before-and-after experiment, where the same subject is recorded before and after the treatment.

46
New cards

Placebo

A false drug or treatment that the subjects believe is real. Examples include sugar pills, saline solutions, fake treatments, etc.

47
New cards

The placebo effect

The tendency to react to a drug or treatment regardless of its actual physical function. People believe that a drug will make them better, so they get better whether the drug is real or not.

48
New cards

Bias of the subjects

Similar to response bias in sampling: subjects may want to please the researcher or hope for a specific outcome.

49
New cards

Hawthorne effect

When people behave differently because they know they are being watched.

50
New cards

Bias of the researcher

People subconsciously behave in ways that favor what they believe. Researchers, even when following a protocol, are no different. They may assign subjects to groups or report results in a biased way. They may treat people or animals differently when holding certain expectations of their research.

51
New cards

Blinding

When individuals associated with an experiment (as a subject or experimenter) are not aware of how subjects are assigned (treatment or control, treatment or placebo). Without this knowledge, the subjects are less likely to respond with bias and the researchers are less likely to allow their biases to influence the study.

52
New cards

Single-blind study

Those who could influence the results (subjects, administrators, technicians, etc.) are blinded.

53
New cards

Double-blind study

Those who evaluate the results (judges, physicians, analysts, etc.) are blinded as well, in addition to those who could influence the results.

54
New cards

Frequency distribution

When working with large data sets, this is helpful in organizing and summarizing data. Ex: In a sample, how many people say each flavor of ice cream is their favorite?

55
New cards

Measure of center

A value at or near the center or middle of a data set. Often interpreted as “typical” values of a group. The most common ones are mean, median, and mode.

56
New cards

Σ

Denotes a sum, sigma

57
New cards

x

denotes an individual data value

58
New cards

n

denotes the number of values in a sample

59
New cards

N

denotes the number of values in a population

60
New cards

x̄

Denotes the sample mean. Pronounced “x-bar”

61
New cards

μ

Denotes the population mean. Pronounced “mew”

62
New cards

Mean

Found by adding all values and dividing by the number of values in the set. Is highly affected by outliers and is not a good measure of center for skewed data sets.

63
New cards

Median

The value that is in the middle when the data are listed in ascending order. Separates the bottom 50% from the top 50%. Not affected by outliers and can be used with any quantitative data set.

64
New cards

Mode

The value that occurs with the greatest frequency in a data set. Not necessarily in the center, not affected by outliers. The only measure of center that can be used with qualitative data.
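
The three measures of center can be compared on a small made-up data set using Python's standard `statistics` module. The values are invented, and 40 is included on purpose as an outlier.

```python
import statistics

data = [2, 3, 3, 4, 5, 6, 40]  # made-up values; 40 is a deliberate outlier

mean = statistics.mean(data)      # sum of the values / number of values
median = statistics.median(data)  # middle value of the sorted list
mode = statistics.mode(data)      # most frequent value

# Drop the outlier to see which measures move
trimmed = data[:-1]
mean_without_outlier = statistics.mean(trimmed)
median_without_outlier = statistics.median(trimmed)

print(mean, median, mode)
print(mean_without_outlier, median_without_outlier)
```

Removing the outlier pulls the mean from 9 down to about 3.83, while the median only moves from 4 to 3.5.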

65
New cards

Unimodal

A data set with one mode

66
New cards

Bimodal

A data set with two modes.

67
New cards

Multimodal

A data set with more than two modes.

68
New cards

Histogram

The graph of a frequency distribution. Consists of bars of equal width drawn adjacent to each other, a horizontal scale representing classes of quantitative data values, and a vertical scale representing frequency.

69
New cards

Normal distribution

Unimodal and symmetric, the bell curve. A continuous probability distribution for a random variable, x. Historically, it is a very important distribution in statistics.

70
New cards

Right-skewed distribution

Positively skewed. mode < median < mean. Outliers appear on the right side.

71
New cards

Left-skewed distribution

Negatively skewed. mean<median<mode. Outliers appear on the left side.

72
New cards

Uniform distribution

Equal spread, no peaks.

73
New cards

Symmetric distribution

The left and right sides are mirror images. When unimodal, mean = median = mode.

74
New cards

Variability

The extent to which data points in a statistical distribution or data set diverge from the average value, as well as the extent to which these data points differ from each other.

75
New cards

Range

The difference between the maximum and minimum. Since this is calculated using only the two most extreme data values, it is highly affected by outliers.

76
New cards

Interquartile range

Uses quartiles to provide a range that is not as affected by potential outliers as the range is. The difference between the third and first quartiles.

77
New cards

Quartiles

Values that separate a data set into fourths.

78
New cards

Q1

The first quartile

79
New cards

Q2

The second quartile, a.k.a. the median

80
New cards

Q3

The third quartile

81
New cards

Five number summary

1: Minimum, 2: Q1, 3: Median, 4: Q3, 5: Maximum
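
A sketch of computing the five-number summary and the interquartile range on made-up data. Note the assumption: textbooks and software use slightly different quartile conventions, and this example uses the `"inclusive"` method of `statistics.quantiles`, which may not match the hand method taught in a given course.

```python
import statistics

data = [1, 3, 5, 7, 9, 11, 13, 15, 17]  # made-up values, already sorted

# Quartile conventions differ between sources; "inclusive" is one common choice
q1, q2, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, q2, q3, max(data))
iqr = q3 - q1  # interquartile range: third quartile minus first quartile

print(five_number_summary)
print(iqr)
```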

82
New cards

Boxplot

A visual representation of the five-number summary that also helps to identify outliers. Can be displayed vertically or horizontally.

83
New cards

Variance

(standard deviation)²

84
New cards

Standard deviation

How much data values deviate from the mean. Never negative. Zero only when all the data values are exactly the same. Can increase dramatically with one or more outliers. Units are the same as the original data value.

85
New cards

Population variance

σ² (sigma squared)

86
New cards

Population standard deviation

σ (sigma)

87
New cards

Sample variance

s² (s-squared)

88
New cards

Sample standard deviation

s. Used to estimate the population standard deviation, σ, in confidence intervals for the mean, since σ is often not known.
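
The population and sample versions can be compared with Python's `statistics` module. The cards do not spell out the formulas, so this sketch relies on the standard definitions: the population variance divides the sum of squared deviations by N, while the sample variance divides by n - 1. The data values are made up.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up values with mean 5

# Treating the data as an entire population (divide by N)
pop_variance = statistics.pvariance(data)  # sigma squared
pop_sd = statistics.pstdev(data)           # sigma

# Treating the data as a sample from a larger population (divide by n - 1)
sample_variance = statistics.variance(data)  # s squared
sample_sd = statistics.stdev(data)           # s

print(pop_variance, pop_sd)
print(sample_variance, sample_sd)
```

In both cases the standard deviation is the square root of the variance, so it is in the same units as the data.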

89
New cards

Z-score

The number of standard deviations a certain data value is away from the mean.

90
New cards

Positive z-score

Data value is above average

91
New cards

Negative z-score

Data value is below average.

92
New cards

Standardizing

The process of converting a data value (which is often labelled x) to a z-score. The formula used to do this is

z = (x - μ)/σ

z = (value of interest - mean)/standard deviation
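
The standardizing formula as a short function. The numbers used (a value of 130 from a population with mean 100 and standard deviation 15) are invented for illustration.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical population: mean 100, standard deviation 15
z_high = z_score(130, mu=100, sigma=15)
z_low = z_score(85, mu=100, sigma=15)
print(z_high, z_low)
```

The first value is two standard deviations above the mean (z = 2.0) and the second is one below (z = -1.0).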

93
New cards

Significantly low

When values are (μ-2σ) or lower (beyond z=-2)

94
New cards

Significantly high

When values are (μ+2σ) or higher (beyond z=+2)
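
The two cutoffs, μ - 2σ and μ + 2σ, can be checked in a few lines. The function name and the example population (mean 100, standard deviation 15) are hypothetical.

```python
def classify(x, mu, sigma):
    """Label a value using the cutoffs mu - 2*sigma and mu + 2*sigma."""
    if x <= mu - 2 * sigma:
        return "significantly low"
    if x >= mu + 2 * sigma:
        return "significantly high"
    return "not significant"

# Hypothetical population: cutoffs fall at 70 and 130
for value in (70, 100, 131):
    print(value, classify(value, mu=100, sigma=15))
```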

95
New cards

Density curve

A curve with a total area under the curve equal to one.

96
New cards

Area under the density curve

Represents probability in a continuous probability distribution.

97
New cards

Probability statement

P(A < x < B). Read as: the probability that a randomly observed value falls between A and B.
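
For a concrete case, the sketch below assumes the density curve is a normal curve with mean 100 and standard deviation 15 (an assumption for illustration; the card applies to any continuous distribution). The area between A and B is the cumulative area up to B minus the cumulative area up to A.

```python
from statistics import NormalDist

def prob_between(a, b, mu, sigma):
    """P(a < X < b) for a normal X: area under the density curve from a to b."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(b) - dist.cdf(a)

# Hypothetical example: within one standard deviation of the mean
p = prob_between(85, 115, mu=100, sigma=15)
print(round(p, 4))
```

The result is about 0.68, the area within one standard deviation of the mean.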

98
New cards

Normal curve

The graph of a normal distribution.

Properties:

  1. The mean, median, and mode are equal

  2. Bell shaped and symmetric about the mean

  3. Total area under the curve is equal to one

  4. Approaches, but never touches, the x-axis as it extends farther and farther away from the mean.

99
New cards

X ~ N(mean, standard deviation)

The random variable x is distributed normally with mean μ and standard deviation σ.

100
New cards

Standard Normal Distribution

The distribution of z-scores. Has a mean, μ, of 0 and a standard deviation, σ, of 1.
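
A short check of why standardizing is useful: a probability for any normal distribution equals the matching probability for the standard normal distribution once the value is converted to a z-score. The population (mean 100, standard deviation 15) and the value 130 are made up.

```python
from statistics import NormalDist

mu, sigma = 100, 15  # hypothetical population
x = 130

z = (x - mu) / sigma  # standardize the value

p_original = NormalDist(mu, sigma).cdf(x)  # P(X < 130) on the original scale
p_standard = NormalDist().cdf(z)           # P(Z < 2); NormalDist() has mean 0, sd 1

print(z, p_original, p_standard)
```

Both probabilities come out to roughly 0.977.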