STA013 Fall 24 Final Exam

151 Terms

1
New cards

A _______ is a characteristic that changes or varies over time and/or for different individuals or objects under consideration

variable

2
New cards

An ___________________ is the individual or object on which a variable is measured

experimental unit

3
New cards

A single ______ or data value results when a variable is actually measured on an experimental unit

measurement

4
New cards

A _______________ is the set of all measurements of interest to the investigator

population

5
New cards

A ______ is a subset of measurements selected from the population of interest

sample

6
New cards

_______________ data result when a single variable is measured on a single experimental unit

univariate

7
New cards

____________ data result when two variables are measured on a single experimental unit

bivariate

8
New cards

________ data result when more than two variables are measured

multivariate

9
New cards

___________ variables measure a numerical quantity or amount on each experimental unit

quantitative

10
New cards

A _____________ variable can assume infinitely many values corresponding to the points on a line interval. There are no gaps

continuous

11
New cards

When constructing a graph, we need to first construct a _________________ and then use it to create a graph called a ______________

statistical table, data distribution

12
New cards

The sum of the relative frequencies is ______

1

13
New cards

A ____________ is the familiar circular graph that shows how the measurements are distributed among the categories

pie chart

14
New cards

A _______________ shows the same distribution of measurements among the categories, with the height of the bar measuring how often a particular category was observed

bar chart

15
New cards

For a pie chart, the angle of the sector for a category = ____________ * 360

relative frequency
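
As a quick illustration of the sector-angle formula, here is a minimal Python sketch. The category names and counts are made up for the example and do not come from the course material.

```python
# Hypothetical category counts, invented for illustration only.
counts = {"A": 10, "B": 25, "C": 15}

total = sum(counts.values())

# Relative frequency = category frequency / total number of measurements.
relative_frequencies = {cat: n / total for cat, n in counts.items()}

# Sector angle in degrees = relative frequency * 360.
angles = {cat: rf * 360 for cat, rf in relative_frequencies.items()}

for cat in counts:
    print(cat, relative_frequencies[cat], angles[cat])
```

Because the relative frequencies sum to 1, the sector angles sum to 360 degrees.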

16
New cards

Pie charts and bar charts are _____________ to qualitative data

not exclusive

17
New cards

A variable that can take on as many values as there are numbers in an interval is called a _____________ variable

continuous

18
New cards

Time series data are most effectively presented on a ___________ with time as the horizontal axis. The idea is to try to find a pattern or __________ that will likely continue into the future

line chart, trend

19
New cards

For a histogram, a ______ is a subinterval created when you divide up the interval from the smallest to the largest measurement

class

20
New cards

The ________ is the difference between the upper and lower class boundaries

width

21
New cards

The class _____________ is the number of measurements falling into that particular class

frequency

22
New cards

Histogram Steps
1. Choose the number of classes, usually between 5 and ______. The more data you have, the more ______ you should use

12, classes

23
New cards

2. Find the approximate class _______ by dividing the difference between the largest and smallest values by the number of classes

width

24
New cards

3. Round the approximate class width up to a convenient number
4. If the data is discrete, you might assign one class for each integer value. For a large number of integer values, you may need to group them into classes

5. List the class boundaries. The _________ class must include the smallest measurement. Then add the remaining classes, including the left boundary point but not the right.

lowest

25
New cards

6. Build a statistical table containing the classes, their ___, and their relative frequencies.
7. Draw the histogram like a bar graph, with the class intervals on the horizontal axis and relative frequencies as the bar height

frequencies
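
The histogram steps above can be sketched in Python as follows. The data values, the choice of 5 classes, the rounded width of 1.5, and the starting boundary of 2.0 are all assumptions picked for illustration; in practice the rounding in step 3 is a judgment call.

```python
# Hypothetical measurements, invented for illustration only.
data = [2.1, 3.4, 3.9, 4.4, 5.0, 5.2, 5.8, 6.1, 6.7, 7.3, 7.9, 8.6]

# Step 1: choose the number of classes (assumed here).
num_classes = 5

# Step 2: approximate class width = (largest - smallest) / number of classes.
approx_width = (max(data) - min(data)) / num_classes

# Step 3: round up to a convenient number (chosen by hand here).
width = 1.5

# Step 5: list the class boundaries; the lowest class must include the minimum.
start = 2.0
boundaries = [start + i * width for i in range(num_classes + 1)]

# Step 6: build the statistical table. Each class includes its left
# boundary point but not its right one.
table = []
for left, right in zip(boundaries, boundaries[1:]):
    freq = sum(1 for x in data if left <= x < right)
    table.append((left, right, freq, freq / len(data)))

for left, right, freq, rel_freq in table:
    print(f"[{left}, {right})  frequency={freq}  relative frequency={rel_freq:.3f}")
```

Step 7 (drawing the bars) is left out; the relative frequencies in the table are the bar heights.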

26
New cards

A distribution is ___________ if the left and right sides of the distribution, when divided at the middle value, form mirror images

symmetrical

27
New cards

A distribution is _____________________ if a greater proportion of the measurements lie to the right of the peak value

skewed to the right

28
New cards

A distribution is ___________ if it has one peak

unimodal

29
New cards

There are three measures of variability: ____________, ___________, and ___________

range, variance, standard deviation

30
New cards

The _____________ of a set of n measurements is defined as the difference between the largest and smallest measurements

range

31
New cards

The variance of a population of N measurements is the average of the squares of the ____________ of the measurements about their mean μ

deviations

32
New cards

The variance of a sample of n measurements is the sum of the ______ deviations of the measurements about their mean _______ divided by _________

squared, x̄ (x-bar), n − 1
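
To make the two variance definitions concrete, here is a small Python sketch. The five measurements are made up, and the standard library's `statistics` module is used only as a cross-check.

```python
import math
import statistics

# Hypothetical measurements, invented for illustration only.
sample = [5, 7, 1, 2, 4]
n = len(sample)

# Sample mean (x-bar).
x_bar = sum(sample) / n

# Squared deviations of the measurements about their mean.
squared_deviations = [(x - x_bar) ** 2 for x in sample]

# Sample variance divides by n - 1.
sample_variance = sum(squared_deviations) / (n - 1)

# Treating the same values as a whole population divides by N instead.
population_variance = sum(squared_deviations) / n

# Standard deviation is the positive square root of the variance.
sample_sd = math.sqrt(sample_variance)

print(sample_variance, population_variance, sample_sd)
print(statistics.variance(sample), statistics.pvariance(sample))
```

Since every term is a square, neither variance can be negative, and both are zero only when all measurements share the same value.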

33
New cards

The measures of variability can be negative. This statement is ________

false

34
New cards

If the measure of variability is equal to zero, all the data should have ____________

the same value

35
New cards

The range and standard deviation have the same _________ as the original data

unit

36
New cards

By Tchebysheff's Theorem, given a number k greater than or equal to 1 and a set of n measurements, at least ________ of the measurements will lie within k _________ of their mean

1 − 1/k², standard deviations

37
New cards

Suppose μ is the population mean and σ is the standard deviation. Answer the following questions using Tchebysheff's Theorem:
a. At least none of the measurements lie in the interval μ__σ to μ__σ
b. At least 3/4 of the measurements lie in the interval μ__σ to μ__σ
c. At least 8/9 of the measurements lie in the interval μ__σ to μ__σ

a. -1, +1
b. -2 , +2
c. -3 ,+3
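
The three answers above follow directly from the bound 1 − 1/k². A minimal Python sketch, where the function name `tchebysheff_lower_bound` is a hypothetical helper introduced for this example:

```python
from fractions import Fraction

def tchebysheff_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    if k < 1:
        raise ValueError("Tchebysheff's Theorem is stated for k >= 1")
    return 1 - 1 / Fraction(k) ** 2

for k in (1, 2, 3):
    print(k, tchebysheff_lower_bound(k))
```

For k = 1 the bound is 0, which is why part (a) can only say "at least none".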

38
New cards

If the data are ____________, the Empirical Rule applies:
a. The interval (μ ± σ) contains approximately ______% of the measurements
b. The interval (μ ± 2σ) contains approximately _____% of the measurements
c. The interval (μ ± 3σ) contains approximately _________% of the measurements

mound shaped
a. 68
b. 95
c. 99.7
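
The Empirical Rule percentages can be checked against an exactly normal distribution, which is used here as a stand-in for "mound shaped" (that substitution is an assumption of this sketch, not a statement from the cards).

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist()

def within_k_sd(k):
    """Probability that a normal measurement lies within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(k, round(100 * within_k_sd(k), 1))
```

Real data that are only roughly mound shaped will match these percentages only approximately.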

39
New cards

The Empirical Rule requires the distribution to be ____________. Tchebysheff's Theorem makes no assumption about the shape of the distribution

mound shaped

40
New cards

A measure of center is a measure along the ____________ that locates the _____ of the distribution

horizontal axis, center

41
New cards

There are three different measures: __________, __________, ____________

mean, median, mode

42
New cards

The arithmetic mean is the sum of the data points of interest divided by ___________. For a population, we use the notation ________. For a sample, we use the notation _________

total number of data points, mu (μ), x̄ (x-bar)

43
New cards

The ___________ m of a set of n measurements is the value of x that falls in the middle position when the measurements are ordered from _______ to __________

median, smallest, largest

44
New cards

Mean and median ____________ coincide with each other. We can use them to infer the shape of the distribution

do not always

45
New cards

When the distribution is ________, mean and median are the same

symmetric

46
New cards

When the distribution is skewed to the right, mean is ___________ than the median

larger

47
New cards

When the distribution is skewed to the left, mean is __________ than the median

smaller
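
The relationship between mean, median, and skewness can be seen on three tiny data sets. The values below are made up so that one set is symmetric, one has a long right tail, and one has a long left tail.

```python
import statistics

# Hypothetical data sets, invented for illustration only.
symmetric = [1, 2, 3, 4, 5]
skewed_right = [1, 2, 3, 4, 20]   # one large value stretches the right tail
skewed_left = [-14, 2, 3, 4, 5]   # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("skewed right", skewed_right),
                   ("skewed left", skewed_left)]:
    print(name, statistics.mean(data), statistics.median(data))
```

The median stays at 3 in all three sets, while the mean is pulled toward the extreme value in the tail.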

48
New cards

The _______________ is the category that occurs most frequently, or the most frequently occurring value of x

mode

49
New cards

Mode is generally used to describe a ________ dataset

large

50
New cards

Mean and median can be used for both ________ and _______ datasets

large, small

51
New cards

It is _____________ to have more than one mode in the dataset

possible

52
New cards

Do we want more or less variability in the data in the following examples?
a. the lifetime of machines produced by a company
b. The SAT score

a. less
b. more

53
New cards

Measures of __________ can help you create a mental picture of the spread of the data

variability

54
New cards

The lower quartile (first quartile), Q1, is the value of x that is greater than ______ of the measurements and is less than the remaining _________

25%, 75%

55
New cards

The second quartile is the ________

median

56
New cards

The upper quartile (third quartile), Q3, is the value of x that is greater than ______ of the measurements and is less than the remaining _________

75%, 25%

57
New cards

The interquartile range for a set of measurements is the difference between the ___________ and ______________

third quartile, first quartile

58
New cards

We can use five numbers to summarize the data: _____________, _______, _________, ________, and __________

minimum, Q1, median, Q3, maximum
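
A five-number summary and the interquartile range can be computed as sketched below. Quartile conventions differ between textbooks and software; this sketch assumes the rule that interpolates at position p(n + 1) in the ordered data, which is what `statistics.quantiles` does with `method="exclusive"`. The data values are made up.

```python
import statistics

# Hypothetical measurements, invented for illustration only.
data = [16, 25, 4, 18, 11, 13, 20, 8, 11, 9]

# Quartiles under the assumed p(n + 1) position rule.
q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")

five_number_summary = (min(data), q1, median, q3, max(data))

# Interquartile range = third quartile - first quartile.
iqr = q3 - q1

print(five_number_summary, iqr)
```

A different quartile convention may give slightly different Q1 and Q3 for the same data, so the course's own rule should take precedence.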

59
New cards

A box plot can be used to detect ________

outliers

60
New cards

An ___________ is the process by which an observation (or measurement) is obtained

experiment

61
New cards

A ___________ is the outcome observed on a single repetition of an experiment

simple event

62
New cards

Experiment: Toss a die and observe the number on the upper face. List the simple events in the experiment:

1, 2, 3, 4, 5, 6

63
New cards

An _________ is a collection of simple events.

Event

64
New cards

Two events are _________________ if, when one event occurs, the other cannot, and vice versa

mutually exclusive

65
New cards

Simple events are all mutually exclusive (true/false)

true

66
New cards

The set of all simple events is called the __________

sample space

67
New cards

Some experiments can be generated in stages, and the sample space can be displayed in a ______________

tree diagram

68
New cards

If you repeat the experiment more and more times, n becomes larger and larger, and eventually you generate the entire population. In this population, the _________________ of the event A is defined as the probability of event A

relative frequency

69
New cards

Each probability must lie between ____ and ____

0, 1

70
New cards

The sum of the probabilities for all _____________ in the sample space S equals 1

simple events

71
New cards

The probability of an event A is equal to the sum of the probabilities of the _______________ contained in A

simple events

72
New cards

How to calculate the probability of an event
1. List all the ___________ in the sample space

simple events

73
New cards

How to calculate the probability of an event
2. Assign an appropriate ________ to each simple event

probability

74
New cards

How to calculate the probability of an event
3. Determine which simple events result in the __________ of interest

event

75
New cards

How to calculate the probability of an event
4. ____________ the probabilities of the simple events that result in the event of interest

sum
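
The four steps above can be sketched in Python using the die-toss experiment from the earlier card. The event "an even number is observed" and the assumption of a fair die are chosen for illustration.

```python
from fractions import Fraction

# Step 1: list all the simple events in the sample space.
simple_events = [1, 2, 3, 4, 5, 6]

# Step 2: assign a probability to each simple event (fair die assumed).
probabilities = {e: Fraction(1, 6) for e in simple_events}

# Step 3: determine which simple events result in the event of interest.
event_a = [e for e in simple_events if e % 2 == 0]

# Step 4: sum the probabilities of those simple events.
p_a = sum(probabilities[e] for e in event_a)

print(event_a, p_a)
```

Exact fractions are used so the probabilities over the whole sample space sum to exactly 1.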

76
New cards

What are the three rules for counting the number of simple events?
1. The ________ rule

mn

77
New cards

What are the three rules for counting the number of simple events?

2. A counting rule for ____

permutations

78
New cards

What are the three rules for counting the number of simple events?
3. A counting rule for ______________

combinations
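
The three counting rules can be tried out with Python's standard library. The specific numbers (two six-sided dice, choosing 3 objects from 5) are arbitrary examples.

```python
import math
from itertools import product

# The mn rule: m outcomes in stage one and n in stage two give m * n pairs.
m, n = 6, 6
pairs = list(product(range(1, m + 1), range(1, n + 1)))
mn_count = m * n

# Permutations: ordered arrangements of 3 objects chosen from 5.
ordered = math.perm(5, 3)

# Combinations: unordered selections of 3 objects chosen from 5.
unordered = math.comb(5, 3)

print(len(pairs), mn_count, ordered, unordered)
```

Each unordered selection of 3 objects can be arranged in 3! ways, so the permutation count is 3! times the combination count.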

79
New cards

Z-score is a measurement of _______________

relative standing

80
New cards

Z-score measures the distance between a particular observation x and the ________, measured in units of ____________. Its formula is z = (measurement − mean) / standard deviation

mean, standard deviation
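
A short Python sketch of the z-score calculation follows. The function name `z_score` and the numbers used are hypothetical and serve only to show the arithmetic.

```python
def z_score(x, mean, sd):
    """Distance between x and the mean, in units of the standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (x - mean) / sd

# Hypothetical values: an observation of 70 with mean 60 and standard deviation 5.
print(z_score(70, 60, 5))   # 2 standard deviations above the mean
print(z_score(55, 60, 5))   # 1 standard deviation below the mean
```

A positive z-score means the observation lies above the mean, and a negative one means it lies below.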

81
New cards

A percentile is another measure of relative standing, most often used for __________ data sets

large

82
New cards

The p-th percentile is the value of x that is greater than __________% of the measurements and is less than the remaining ________%

p, 100-p

83
New cards

When the ordering or arrangement of the objects is important, you can use a counting rule for ________

permutations

84
New cards

Sometimes the ordering or arrangement of the objects is not important, but only the objects that are chosen. In this case, you can use a counting rule for ____________

combinations

85
New cards

The ________ of events A and B, denoted by A ∪ B, is the event that either A or B or both occur

union

86
New cards

The ___________ of events A and B, denoted by A ∩ B, is the event that both A and B occur

intersection

87
New cards

The _________ of an event A, denoted by A^c, is the event that A does not occur

complement

88
New cards

Simple events are mutually exclusive (true/false)

true

89
New cards

Event A and its complement are mutually exclusive no matter what A is (true/false)

true

90
New cards

Are mutually exclusive events independent (yes/no)

no

91
New cards

Are two independent events mutually exclusive (yes/no)

no
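
The difference between mutually exclusive and independent events can be checked on a fair die. This sketch uses the standard definition of independence, P(A ∩ B) = P(A)P(B), which is not spelled out in the cards above; the events and helper names are chosen for illustration.

```python
from fractions import Fraction

sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event on a fair die (equally likely simple events)."""
    return Fraction(len(event & sample_space), len(sample_space))

def mutually_exclusive(a, b):
    """True when the two events share no simple events."""
    return len(a & b) == 0

def independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(a & b) == prob(a) * prob(b)

even = {2, 4, 6}
odd = {1, 3, 5}
low = {1, 2}

print(mutually_exclusive(even, odd), independent(even, odd))
print(mutually_exclusive(even, low), independent(even, low))
```

Here "even" and "odd" are mutually exclusive but not independent, while "even" and "low" are independent but not mutually exclusive.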

92
New cards

A _____________ (type ___ error) is the event that the test is positive for a given condition, given that the person does not have the condition

false positive, I

93
New cards

A ______________ (type _____ error) is the event that the test is negative for a given condition, given that the person has the condition

false negative, II

94
New cards

A variable X is a __________________ if the value that it assumes, corresponding to the outcome of an experiment, is a chance or random event

random variable

95
New cards

Quantitative variables are classified as either ___________ or ______________, according to the values that X can assume

discrete, continuous

96
New cards

We defined probability as the limiting value of the _______________________ as the experiment is repeated over and over again

relative frequency

97
New cards

Now we define the probability distribution for a random variable X as the ____________________ distribution constructed for the entire population of measurements

relative frequency

98
New cards

The ________________ for a discrete random variable is a formula, table, or graph that gives all the possible values of X, and the probability p(x)=P(X=x) associated with each value x

probability distribution

99
New cards

Requirements for a Discrete Probability Distribution
A. __________ ≤ p(x) ≤ _________
B. Σ p(x), summed over all values of x, = _______

A. 0, 1
B. 1
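
Both requirements can be checked mechanically. In the sketch below, `is_valid_distribution` is a hypothetical helper, and the example distribution (number of heads in two tosses of a fair coin) is chosen for illustration.

```python
import math

def is_valid_distribution(p):
    """Check both requirements for a discrete probability distribution.

    p maps each possible value x to its probability p(x).
    """
    # Requirement A: every probability lies between 0 and 1.
    in_range = all(0 <= prob <= 1 for prob in p.values())
    # Requirement B: the probabilities sum to 1 (within floating-point tolerance).
    sums_to_one = math.isclose(sum(p.values()), 1.0)
    return in_range and sums_to_one

# Number of heads in two tosses of a fair coin.
heads = {0: 0.25, 1: 0.5, 2: 0.25}
print(is_valid_distribution(heads))

# Not a distribution: the probabilities sum to 1.1.
print(is_valid_distribution({0: 0.5, 1: 0.6}))
```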

100
New cards

Comparing a relative frequency distribution and a probability distribution: the difference is that the relative frequency distribution describes a ________ of n measurements, while the probability distribution is constructed as a model for the entire __________ of measurements

sample, population