Introduction to Statistical Inference and Sampling Techniques


100 Terms

1

Statistics

the science of collecting, presenting, analyzing, and interpreting data

2

Data

facts and information of interest

3

Analytics

the process of using data to make decisions; its three types are descriptive, predictive, and prescriptive

4

Statistical Inference

uses sample data and probability to draw general conclusions about a population

5

Data set

a collection of information that can be grouped by variables

6

Elements

the units on which data is collected

7

Variables

general characteristics of interest

8

Numerical (or Quantitative) variables

variables that are measurable and can be expressed as numbers

9

Categorical (or Qualitative) variables

variables that represent categories or groups

10

Nominal

a categorical variable whose values are labels or names with no natural order. Example: favourite ice cream flavour

11

Ordinal

a categorical variable whose values are labels or names AND can be ordered. Example: preference for mint ice cream (dislike, somewhat like, love)

12

Interval

a numerical variable with a fixed unit of measure; the difference between (or sum of) two data values can be found; zero is a value on the scale, not the absence of the quantity. Example: the temperature of ice cream from different freezers

13

Ratio

a numerical variable with the characteristics of interval data for which the product or quotient of two data values ALSO makes sense; zero means non-existent. Example: the number of ounces of ice cream in different-sized containers

14

Available Data

Data that were produced in the past may be a cost-effective way to help answer a present question

15

Internal data

Available data from inside an organization, such as personnel files, cash-flow reports, and inventory records

16

External data

Available data from outside an organization; sources may be free (library, internet, researchers) or charge a fee (Bloomberg, Nielsen Co.)

17

Newly Produced Data

Some questions may require data to be produced. Newly produced data can be obtained through the design of observational or experimental studies.

18

Observational Study

Elements are observed and variables of interest are measured. An observation is the set of measurements for a single element

19

Experimental Study

A treatment is deliberately imposed on the elements and their responses are measured

20

Cross-Sectional

Data from an observational study or experiment that are collected at a single point in time

21

Time Series

Data from an observational study or experiment that are collected over several time periods

22

Population

the collection of ALL elements of interest; the group about which we want to draw a conclusion

23

Sample

a subset/part of the population used to gather information; gives insight about the population

24

Census

an attempt to gather information from every element in the population; usually impossible or too expensive. Example: the US government's census

25

Parameter

a number that describes a population; a fixed value that is usually unknown (unless you have a census). Example: the mean height of all Purdue students

26

Statistic

a number that describes a sample; a value that can vary depending on which sample is collected; estimates a parameter. Example: the mean height of a sample of 350 Purdue students
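
The difference between a parameter and a statistic can be seen in a small simulation. The sketch below is illustrative only: the population of 40,000 heights is made up (it is not real Purdue data), and the names `population`, `sample_a`, and `sample_b` are hypothetical.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 40,000 student heights in inches (made-up numbers).
population = [rng.gauss(67, 4) for _ in range(40_000)]

# Parameter: describes the whole population (normally unknown without a census).
parameter = statistics.mean(population)

# Statistic: describes one sample of 350 and changes from sample to sample.
sample_a = rng.sample(population, 350)
sample_b = rng.sample(population, 350)
statistic_a = statistics.mean(sample_a)
statistic_b = statistics.mean(sample_b)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic from sample A:     {statistic_a:.2f}")
print(f"statistic from sample B:     {statistic_b:.2f}")
```

The two sample means should differ slightly from each other while both landing near the population mean, which is why a statistic can be used to estimate a parameter.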

27

Conclusion

We can use the sample proportion to estimate the population proportion

28

Ordered List

The data values arranged from least to greatest; shows the least and greatest data values

29

Frequency Table

Shows the total counts for particular intervals

30

Relative Frequency Table

Shows the total count in a particular interval relative to the entire set of data; usually represented as a percent
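
A frequency table and a relative frequency table can be built from the same counts. The following is a minimal sketch with made-up ages and an assumed interval width of 10; `interval_label` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical data: ages of 10 customers.
ages = [21, 25, 28, 31, 34, 35, 42, 47, 48, 55]

def interval_label(value, width=10):
    """Return the label of the interval containing value, e.g. '20-29' for width 10."""
    low = (value // width) * width
    return f"{low}-{low + width - 1}"

# Frequency table: total count in each interval.
frequency = Counter(interval_label(a) for a in ages)

# Relative frequency table: each count relative to the entire data set.
total = sum(frequency.values())
relative_frequency = {label: count / total for label, count in frequency.items()}

for label in sorted(frequency):
    print(f"{label}: count={frequency[label]}, relative={relative_frequency[label]:.0%}")
```

The relative frequencies always add up to 1 (100%), which is a quick check that no data value was dropped.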

31

Histogram

A graphical display of a frequency or relative frequency table

32

Mean

the sum of all data values divided by the number of data values

33

Median

the midpoint of an ordered data set; ½ of the observations are below it and ½ are above it

34

Mode

the data value (or values) that occurs most often

35

Range

Max - min, where max is the maximum (largest) data value and min is the minimum (smallest) data value; the range is a single number

36

Sample Variance

a measure of how far, on average, data values are from the mean; the sum of squared deviations from the sample mean divided by n - 1

37

Sample Standard Deviation

Square root of sample variance
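
The measures of center and spread defined on the preceding cards can be computed by hand or with Python's `statistics` module. This is a small sketch on a made-up sample of six values; the sample variance is written out explicitly, dividing the squared deviations by n - 1.

```python
import math
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical sample

mean = statistics.mean(data)          # sum of values / number of values
median = statistics.median(data)      # midpoint of the ordered data
modes = statistics.multimode(data)    # most frequent value(s)
data_range = max(data) - min(data)    # max - min, a single number

# Sample variance: squared deviations from the mean, divided by n - 1.
n = len(data)
sample_variance = sum((x - mean) ** 2 for x in data) / (n - 1)

# Sample standard deviation: square root of the sample variance.
sample_sd = math.sqrt(sample_variance)

print(mean, median, modes, data_range, sample_variance, round(sample_sd, 3))
```

The hand-written variance agrees with `statistics.variance(data)`, which also divides by n - 1.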

38

Interquartile Range

Q3-Q1 (Q = quartile)

39

Categorical Variable

Qualitative data represented by categories.

40

Numerical Variable

Quantitative data represented by numbers.

41

Bar Graph

Visual representation of categorical data.

42

Pie Chart

Circular chart showing proportions of categories.

43

Stemplot

Graph displaying quantitative data in stems and leaves.

44

Shape of Distribution

Describes the form of data distribution (bell/skewed).

45

Unusual Data

Gaps or extreme values in the data.

46

Mean (x̄)

Average value of a data set.

47

Median (M)

Middle value when data is ordered.

48

Spread

Variability or consistency of data values.

49

Standard Deviation (s)

Measures data variability around the mean.

50

Population Variance

Variance calculated for an entire population; the average squared deviation from the population mean (divide by N, not n - 1).

51

Interquartile Range (IQR)

Difference between Q3 and Q1.

52

Quartiles

Values dividing ordered data into four equal parts.

53

Percentile

Value below which a given percentage of the data falls.

54

5-Number Summary

Minimum, Q1, Median, Q3, Maximum values.
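
The 5-number summary and the IQR can be computed as in the sketch below, which uses made-up data. Textbooks and software follow several slightly different quartile conventions; the choice of `method="inclusive"` here is an assumption, so a course's hand method may give slightly different Q1 and Q3 on other data sets.

```python
import statistics

data = sorted([7, 1, 12, 4, 15, 3, 10, 6, 8])  # hypothetical data, ordered first

# quantiles(..., n=4) returns the three cut points Q1, median, Q3.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")

five_number_summary = (min(data), q1, median, q3, max(data))
iqr = q3 - q1  # interquartile range: Q3 - Q1

print(five_number_summary)
print("IQR:", iqr)
```

A boxplot draws exactly these five numbers: the box runs from Q1 to Q3 with a line at the median, and the whiskers reach to the minimum and maximum.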

55

Boxplot

Graphical summary of the 5-number summary.

56

Modified Boxplot

Boxplot that highlights outliers.

57

Outlier

Data point significantly different from others.
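
The cards do not say how "significantly different" is decided. One common rule of thumb, assumed here and not taken from the cards, flags any value more than 1.5 × IQR below Q1 or above Q3; a modified boxplot typically plots such points individually. A sketch on made-up data:

```python
import statistics

data = [1, 3, 4, 6, 7, 8, 10, 12, 40]  # hypothetical data with one extreme value

# Quartile convention is an assumption; other methods can shift Q1 and Q3 slightly.
q1, _, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Assumed rule of thumb: fences sit 1.5 * IQR outside the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_fence or x > upper_fence]

print("fences:", lower_fence, upper_fence)
print("outliers:", outliers)
```

Here only the value 40 falls outside the fences, so it is the only point flagged.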

58

Z-score

Standardized score indicating distance from mean.

59

Empirical Rule

For bell-shaped distributions, about 68% of data values fall within 1 standard deviation of the mean, about 95% within 2, and about 99.7% within 3.
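
The empirical rule is usually stated as roughly 68%, 95%, and 99.7% of values lying within 1, 2, and 3 standard deviations of the mean. Those percentages can be checked against an exact normal (bell-shaped) curve; the sketch below uses the standard normal distribution from Python's `statistics` module.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of the distribution within k standard deviations of the mean.
shares = {}
for k in (1, 2, 3):
    shares[k] = standard_normal.cdf(k) - standard_normal.cdf(-k)
    print(f"within {k} standard deviation(s): {shares[k]:.1%}")
```

Real data are only approximately bell-shaped, so observed percentages will be close to these values rather than equal to them.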

60

Skewness

Measure of asymmetry in data distribution.

61

Kurtosis

Measure of how heavy the tails of a data distribution are (often described as its peakedness).

62

Descriptive Statistics

Summarizes and describes characteristics of data.

63

z-scores

A measure of the number of standard deviations a data value is from the mean.
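
A z-score is computed as (data value - mean) / standard deviation. The sketch below applies that to a made-up sample, using the sample standard deviation s.

```python
import statistics

data = [2, 4, 4, 5, 7, 8]  # hypothetical sample

mean = statistics.mean(data)
s = statistics.stdev(data)  # sample standard deviation (divides by n - 1)

# z = (x - mean) / s: how many standard deviations x lies from the mean.
z_scores = [(x - mean) / s for x in data]

for x, z in zip(data, z_scores):
    print(f"value {x}: z = {z:+.2f}")
```

A value equal to the mean has z = 0, values above the mean have positive z-scores, and values below it have negative z-scores.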

64

Random Experiment

A random process that generates well-defined outcomes.

65

Sample space, S

The set of all possible outcomes (or sample points).

66

Event

A collection of one or more outcomes (or sample points).

67

Probability

A measure of the likelihood that an event will occur.

68

Classical (Theoretical) Method

Assigns probabilities by assuming all outcomes are equally likely.

69

Relative Frequency Method

Estimates probabilities by conducting many trials of a random experiment and using the proportion of trials in which the event occurs.
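
The classical and relative frequency methods can be compared on a fair six-sided die. The sketch below is a simulation under assumed settings (60,000 simulated rolls, fixed seed); the classical answer for rolling a 6 is 1/6.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
trials = 60_000

# Relative frequency method: repeat the random experiment many times.
rolls = [rng.randint(1, 6) for _ in range(trials)]
estimate = rolls.count(6) / trials

# Classical method: 1 favorable outcome out of 6 equally likely outcomes.
classical = 1 / 6

print(f"relative frequency estimate: {estimate:.4f}")
print(f"classical probability:       {classical:.4f}")
```

With more trials the estimate tends to settle closer to the classical value.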

70

Counting Rule

If a random experiment consists of a sequence of k steps with n1 outcomes for the 1st step, n2 outcomes for the 2nd step, ..., nk outcomes for the kth step, then there are (n1)(n2)...(nk) total outcomes for the experiment.
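
The counting rule can be checked by listing every outcome. In this sketch the three steps (a coin flip, a die roll, and a colour draw) are made up for illustration, giving (2)(6)(3) = 36 outcomes.

```python
from itertools import product
from math import prod

# Hypothetical 3-step experiment: flip a coin, roll a die, draw a colour.
step_outcomes = [
    ["H", "T"],
    [1, 2, 3, 4, 5, 6],
    ["red", "blue", "green"],
]

# Counting rule: multiply the number of outcomes at each step.
total_by_rule = prod(len(step) for step in step_outcomes)

# Brute force: list every sample point in the sample space.
sample_points = list(product(*step_outcomes))

print(total_by_rule, len(sample_points))  # 36 36
```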

71

Factorial

A notation for showing a special product, n! = n × (n - 1) × (n - 2) × ... × 1.

72

Permutations

The arrangements of n things taken r at a time, where order matters.

73

Combinations

The selection of n things taken r at a time, where order does not matter.
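
The standard counts are n!/(n - r)! permutations and n!/(r!(n - r)!) combinations. The sketch below computes both for an assumed n = 5 and r = 3 and cross-checks them against Python's `math.perm` and `math.comb` (available in Python 3.8 and later) and against brute-force listing.

```python
import math
from itertools import combinations, permutations

n, r = 5, 3
items = "ABCDE"  # hypothetical set of n = 5 things

n_permutations = math.factorial(n) // math.factorial(n - r)  # n! / (n - r)!
n_combinations = n_permutations // math.factorial(r)          # n! / (r! (n - r)!)

# Cross-check against the standard library and against brute-force listing.
assert n_permutations == math.perm(n, r) == len(list(permutations(items, r)))
assert n_combinations == math.comb(n, r) == len(list(combinations(items, r)))

print(n_permutations, n_combinations)  # 60 10
```

There are fewer combinations than permutations because each selection of 3 things can be arranged in 3! = 6 different orders.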

74

Complement of An Event

The set of all outcomes in S that are not outcomes of event A.

75

Union of A and B

The set of all outcomes that belong to A or B or both, denoted A ∪ B.

76

Intersection of A and B

The set of all outcomes that belong to both A and B, denoted A ∩ B.

77

Venn Diagrams

Visual displays of the relationship of the outcomes of combined events and the sample space.

78

Addition Law for 2 events

P(A ∪ B) = P(A) + P(B) - P(A ∩ B).
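
Complements, unions, intersections, and the addition law can be tried out with Python sets. The sketch below assumes one roll of a fair die, with events A (even) and B (greater than 3) chosen for illustration; `prob` is a hypothetical helper that applies the classical method.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}            # sample space for one roll of a fair die
A = {x for x in S if x % 2 == 0}  # event A: roll is even
B = {x for x in S if x > 3}       # event B: roll is greater than 3

def prob(event):
    """Classical probability: favorable outcomes / total outcomes."""
    return Fraction(len(event), len(S))

union = A | B            # outcomes in A or B or both
intersection = A & B     # outcomes in both A and B
complement_of_A = S - A  # outcomes in S that are not in A

# Addition law: P(A or B) = P(A) + P(B) - P(A and B)
assert prob(union) == prob(A) + prob(B) - prob(intersection)

print(prob(A), prob(B), prob(intersection), prob(union))  # 1/2 1/2 1/3 2/3
```

Subtracting P(A ∩ B) corrects for the outcomes 4 and 6, which would otherwise be counted twice.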

79

Mutually Exclusive Events

Two events that do not share any outcomes and cannot occur at the same time.

80

Complementary events

Two events that share no outcomes and together include every outcome in S (A and not A); exactly one of them must occur, so their probabilities add to 1.

81

Probability Statement

P(A) = (number of favorable outcomes) / (TOTAL number of outcomes), when all outcomes are equally likely.

82

P(A)

The probability of event A occurring.

83

P(A and B)

The probability that both events A and B occur.

84

P(A or B)

The probability that event A or event B (or both) occurs.

85

P(not A)

The probability that event A does not occur.

86

P(G)

Probability that a home has a garage.

87

P(P)

Probability that a home has a swimming pool.

88

P(G and P)

Probability that a home has both a garage and a swimming pool.

89

P(G or P)

Probability that a home has a garage or a swimming pool (or both).

90

N

The total number of employees or items in a sample.

91

P(L)

Probability that an employee completed their work late.

92

P(D)

Probability that an employee's work was defective.

93

P(L and D)

Probability that an employee's work was both late and defective.

94

P(L or D)

Probability that an employee's work was late or defective (or both).

95

Joint Distribution

Probability distribution of two variables considered together; gives the probability of each combination of outcomes.

96

Marginal Distribution

Distribution of a single variable, obtained from the joint distribution by adding across the other variable.

97

Joint Probability

Probability of two events happening simultaneously.

98

Marginal Probability

Probability of a single event occurring.

99

Conditional Probability

Probability of an event given that another event has occurred; P(A|B) = P(A and B) / P(B), provided P(B) > 0.

100

Multiplication Law

P(A and B) = P(A) * P(B|A).
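
Joint, marginal, and conditional probabilities, and the multiplication law, can all be read off one table of counts. The sketch below reuses the late (L) and defective (D) events from the earlier cards, but the counts for N = 50 employees are made up for illustration.

```python
from fractions import Fraction

# Hypothetical counts for N = 50 employees (made-up numbers).
counts = {
    ("late", "defective"): 2,
    ("late", "ok"): 3,
    ("on_time", "defective"): 4,
    ("on_time", "ok"): 41,
}
N = sum(counts.values())

# Joint probabilities: one per cell of the table.
joint = {cell: Fraction(c, N) for cell, c in counts.items()}

# Marginal probabilities: add the joint probabilities across the other variable.
p_late = sum(p for (timing, _), p in joint.items() if timing == "late")
p_defective = sum(p for (_, quality), p in joint.items() if quality == "defective")

p_late_and_defective = joint[("late", "defective")]        # P(L and D)
p_defective_given_late = p_late_and_defective / p_late     # P(D|L)
p_late_or_defective = p_late + p_defective - p_late_and_defective  # addition law

# Multiplication law: P(L and D) = P(L) * P(D|L)
assert p_late * p_defective_given_late == p_late_and_defective

print(p_late, p_defective, p_late_and_defective)    # 1/10 3/25 1/25
print(p_defective_given_late, p_late_or_defective)  # 2/5 9/50
```

In this made-up table P(D|L) = 2/5 is much larger than P(D) = 3/25, so knowing that work was late changes the probability that it was defective.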