STA013 Fall 24 Final Exam

151 Terms


A _______ is a characteristic that changes or varies over time and/or for different individuals or objects under consideration

variable


An ___________________ is the individual or object on which a variable is measured

experimental unit


A single ______ or data value results when a variable is actually measured on an experimental unit

measurement


A _______________ is the set of all measurements of interest to the investigator

population


A ______ is a subset of measurements selected from the population of interest

sample


_______________ data results when a single variable is measured on a single experimental unit

univariate


____________ data results when two variables are measured on a single experimental unit

bivariate


________ data results when more than two variables are measured

multivariate


___________ variables measure a numerical quantity or amount on each experimental unit

quantitative


A _____________ variable can assume infinitely many values corresponding to the points on a line interval. There are no gaps

continuous


When constructing a graph, we need to first construct a _________________ and then use it to create a graph called a ______________

statistical table, data distribution


The sum of the relative frequencies is ______

1


A ____________ is the familiar circular graph that shows how the measurements are distributed among the categories

pie chart


A _______________ shows the same distribution of measurements among the categories, with the height of the bar measuring how often a particular category was observed

bar chart


For a pie chart, the angle of the sector for a category = ____________ * 360

relative frequency


Pie charts and bar charts are _____________ to qualitative data

not exclusive


A variable that can take on as many values as there are numbers in an interval is called a _____________ variable

continuous


Time series data are most effectively presented on a ___________ with time as the horizontal axis. The idea is to try to find a pattern or __________ that will likely continue into the future

line chart, trend


For a histogram, a ______ is a subinterval created when you divide up the interval from the smallest to the largest measurement

class


The ________ is the difference between the upper and lower class boundaries

width


The class _____________ is the number of measurements falling into that particular class

frequency


Histogram Steps
1. Choose the number of classes, usually between 5 and ______. The more data you have, the more ______ you should use

12, classes


2. Find the approximate class _______ by dividing the difference between the largest and smallest values by the number of classes

width


3. Round the approximate class width up to a convenient number
4. If the data is discrete, you might assign one class for each integer value. For a large number of integer values, you may need to group them into classes
5. List the class boundaries. The _________ class must include the smallest measurement. Then add the remaining classes, including the left boundary point but not the right.

lowest


6. Build a statistical table containing the classes, their ___, and their relative frequencies.
7. Draw the histogram like a bar graph, with the class intervals on the horizontal axis and the relative frequencies as the bar heights

frequencies
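
The histogram steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data set `scores` is made up, the function name `histogram_table` is hypothetical, and "round up to a convenient number" is taken to mean the next whole number strictly above the approximate width, so the largest measurement always lands inside the last class.

```python
import math

def histogram_table(data, num_classes):
    """Build (left, right, frequency, relative frequency) rows for a histogram."""
    smallest, largest = min(data), max(data)
    # Step 2: approximate class width = (largest - smallest) / number of classes.
    approx_width = (largest - smallest) / num_classes
    # Step 3 (assumption): round up to the next whole number above the approximation.
    width = math.floor(approx_width) + 1
    rows = []
    for i in range(num_classes):
        # Step 5: each class includes its left boundary but not its right one.
        left = smallest + i * width
        right = left + width
        frequency = sum(1 for x in data if left <= x < right)
        # Step 6: store the frequency and the relative frequency.
        rows.append((left, right, frequency, frequency / len(data)))
    return rows

# Made-up data for illustration.
scores = [1, 2, 2, 3, 5, 6, 7, 7, 8, 10]
table = histogram_table(scores, num_classes=5)

# Step 7: a crude text histogram, one '#' per measurement in the class.
for left, right, freq, rel in table:
    print(f"[{left}, {right})  freq={freq}  rel={rel:.1f}  {'#' * freq}")
```

The relative frequencies in the table add up to 1, which matches the earlier card on relative frequencies.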


A distribution is ___________ if the left and right sides of the distribution, when divided at the middle value, form mirror images

symmetrical


A distribution is _____________________ if a greater proportion of the measurements lie to the right of the peak value

skewed to the right


A distribution is ___________ if it has one peak

unimodal


There are three types of measures of variability: ____________, ___________, and ___________

range, variance, standard deviation


The _____________ of a set of n measurements is defined as the difference between the largest and smallest measurements

range


The variance of a population of N measurements is the average of the squares of the ____________ of the measurements about their mean μ

deviations


The variance of a sample of n measurements is the sum of the ______ of the measurements about their mean _______, divided by _________

squared deviations, x̄ (x bar), n-1
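
As a small worked example of the range, variance, and standard deviation, the sketch below computes each one by hand and cross-checks against Python's standard `statistics` module. The data set is made up, and the helper names `population_variance` and `sample_variance` are hypothetical.

```python
import math
import statistics

def population_variance(xs):
    """Average of the squared deviations about the population mean (divide by N)."""
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

def sample_variance(xs):
    """Sum of squared deviations about the sample mean, divided by n - 1."""
    x_bar = sum(xs) / len(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (len(xs) - 1)

# Made-up measurements for illustration.
data = [5, 7, 1, 2, 4]

data_range = max(data) - min(data)      # largest minus smallest
s2 = sample_variance(data)              # sample variance
s = math.sqrt(s2)                       # sample standard deviation

print("range:", data_range)
print("sample variance:", s2)
print("sample standard deviation:", s)
print("population variance:", population_variance(data))

# Cross-check against the standard library.
print(math.isclose(s2, statistics.variance(data)))
```

Because every term in the sum is a square, none of these measures can be negative, and they are all zero only when every measurement is the same.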


The measures of variability can be negative. This statement is ________

false


If the measure of variability is equal to zero, all the data should have ____________

the same value


The range and standard deviation have the same _________ as the original data

units


By Tchebysheff's Theorem, given a number k greater than or equal to 1 and a set of n measurements, at least ________ of the measurements will lie within k _________ of their mean

1 - 1/k^2, standard deviations


Suppose μ is the population mean and σ is the standard deviation. Answer the following questions using Tchebysheff's Theorem:
a. At least none of the measurements lie in the interval μ__σ to μ__σ
b. At least 3/4 of the measurements lie in the interval μ__σ to μ__σ
c. At least 8/9 of the measurements lie in the interval μ__σ to μ__σ

a. -1, +1
b. -2, +2
c. -3, +3
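
The bounds in this card follow directly from the formula 1 - 1/k^2. The sketch below evaluates it with exact fractions and then checks one made-up, strongly skewed data set against the k = 2 bound. The function names and the data are assumptions chosen for illustration.

```python
import math
from fractions import Fraction

def tchebysheff_lower_bound(k):
    """Minimum fraction of measurements within k standard deviations of the mean."""
    k = Fraction(k)
    return 1 - 1 / (k * k)

def proportion_within(data, k):
    """Observed fraction of measurements inside (mean - k*sd, mean + k*sd)."""
    mu = sum(data) / len(data)
    sd = math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))
    inside = [x for x in data if mu - k * sd <= x <= mu + k * sd]
    return len(inside) / len(data)

for k in (1, 2, 3):
    print(f"k={k}: at least {tchebysheff_lower_bound(k)} of the measurements")

# Made-up skewed data: the theorem still holds, whatever the shape.
skewed = [1, 2, 2, 3, 3, 3, 4, 20]
print(proportion_within(skewed, 2), ">=", float(tchebysheff_lower_bound(2)))
```

For k = 1 the bound is 0, which is why part (a) says "at least none".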


If the data is ____________, the Empirical Rule applies:
a. The interval (μ ± σ) contains approximately ______% of the measurements
b. The interval (μ ± 2σ) contains approximately _____% of the measurements
c. The interval (μ ± 3σ) contains approximately _________% of the measurements

mound shaped
a. 68
b. 95
c. 99.7
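
The 68 / 95 / 99.7 figures can be reproduced from the normal curve, which is the idealized mound-shaped distribution. The sketch below assumes a standard normal model and uses `statistics.NormalDist` from the Python standard library; the function name `normal_proportion_within` is hypothetical.

```python
from statistics import NormalDist

# Assumption: a standard normal curve stands in for "mound-shaped" data.
standard_normal = NormalDist(mu=0, sigma=1)

def normal_proportion_within(k):
    """Area under the standard normal curve between -k and +k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {normal_proportion_within(k):.4f}")
```

Real data that is only roughly mound shaped will match these percentages approximately, not exactly.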


The Empirical Rule requires the distribution to be ____________. Tchebysheff's Theorem places no requirement on the shape of the distribution

mound shaped (unimodal and roughly symmetric)


A measure of center is a measure along the ____________ that locates the _____ of the distribution

horizontal axis, center


There are three different measures of center: __________, __________, ____________

mean, median, mode


Arithmetic mean is the sum of the data points of interest divided by ___________. For a population, we use the notation ________. For a sample, we use the notation _________

total number of data points, mu (μ), x̄ (x bar)


The ___________ m of a set of n measurements is the value of x that falls in the middle position when the measurements are ordered from _______ to __________

median, smallest, largest


Mean and median ____________ coincide with each other. We can use them to infer the shape of the distribution

do not always


When the distribution is ________, mean and median are the same

symmetric


When the distribution is skewed to the right, mean is ___________ than the median

larger


When the distribution is skewed to the left, mean is __________ than the median

smaller
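
The relationship between mean, median, and skewness is easy to see on small data sets. The three lists below are made up for illustration; the left-skewed list is simply the mirror image of the right-skewed one.

```python
from statistics import mean, median

# Made-up data sets for illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]          # long tail to the right
left_skewed = [21 - x for x in right_skewed]      # mirror image: long tail to the left

for name, data in [("symmetric", symmetric),
                   ("right skewed", right_skewed),
                   ("left skewed", left_skewed)]:
    print(f"{name:13s} mean={mean(data):6.2f}  median={median(data):6.2f}")
```

The extreme value pulls the mean toward the tail while the median barely moves, which is why comparing the two hints at the shape of the distribution.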


The _______________ is the category that occurs most frequently, or the most frequently occurring value of x

mode


Mode is generally used to describe a ________ dataset

large


Mean and median can be used for both ________ and _______ datasets

large, small


It is _____________ to have more than one mode in the dataset

possible


Do we want more or less variability in the data in the following examples?
a. the lifetime of machines produced by a company
b. The SAT score

a. less
b. more


Measures of __________ can help you create a mental picture of the spread of the data

variability


The lower quartile (first quartile), Q1, is the value of x that is greater than ______ of the measurements and is less than the remaining _________

25%, 75%


The second quartile is the ________

median


The upper quartile (third quartile), Q3, is the value of x that is greater than ______ of the measurements and is less than the remaining _________

75%, 25%


The interquartile range for a set of measurements is the difference between the ___________ and ______________

third quartile, first quartile


We can use five numbers to summarize the data: _____________, _______, _________, ________, and __________

minimum, Q1, median, Q3, maximum


A box plot can be used to detect ________

outliers
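
The quartiles, interquartile range, five-number summary, and outlier check fit together as sketched below. Two assumptions are made here: quartiles are located with the (n + 1)p position rule, which is what `statistics.quantiles` does by default, and outliers are flagged with the common 1.5 × IQR fences, a convention the cards themselves do not spell out. The data and function names are made up.

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    q1, med, q3 = quantiles(data, n=4)   # default method: position p * (n + 1)
    return min(data), q1, med, q3, max(data)

def outliers(data):
    """Values beyond the 1.5 * IQR fences (a common convention, assumed here)."""
    _, q1, _, q3, _ = five_number_summary(data)
    iqr = q3 - q1                        # third quartile minus first quartile
    lower_fence, upper_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [x for x in data if x < lower_fence or x > upper_fence]

# Made-up measurements for illustration.
data = [16, 25, 4, 18, 11, 13, 20, 8, 11, 9]
print(five_number_summary(data))
print(outliers(data))            # nothing unusual in this set
print(outliers(data + [60]))     # adding an extreme value flags it
```

Other quartile conventions exist, so software and textbooks can disagree slightly on Q1 and Q3 for small data sets.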


An ___________ is the process by which an observation (or measurement) is obtained

experiment


A ___________ is the outcome observed on a single repetition of an experiment

simple event


Experiment: Toss a die and observe the number on the upper face. List the simple events in the experiment:

1, 2, 3, 4, 5, 6


An _________ is a collection of simple events.

Event


Two events are _________________ if, when one event occurs, the other cannot, and vice versa

mutually exclusive


Simple events are all mutually exclusive (true/false)

true


The set of all simple events is called the __________

sample space


Some experiments can be generated in stages, and the sample space can be displayed in a ______________

tree diagram


If you repeat the experiment more and more times, n becomes larger and larger, and eventually you generate the entire population. In this population, the _________________ of event A is defined as the probability of event A

relative frequency
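
A quick simulation illustrates the idea: as the number of die tosses grows, the relative frequency of an event settles near its probability. The sketch assumes a fair six-sided die and a fixed random seed so the run is repeatable; the function name is hypothetical.

```python
import random

def relative_frequency_of_six(n_tosses, seed=0):
    """Toss a simulated fair die n_tosses times; return the share of sixes."""
    rng = random.Random(seed)            # fixed seed so the run is repeatable
    sixes = sum(1 for _ in range(n_tosses) if rng.randint(1, 6) == 6)
    return sixes / n_tosses

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n={n:>7}: relative frequency of a six = {relative_frequency_of_six(n):.4f}")

print("theoretical probability:", round(1 / 6, 4))
```

Small runs can land far from 1/6; only the long-run relative frequency is expected to be close.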


Each probability must lie between ____ and ____

0, 1


The sum of the probabilities for all _____________ in the sample space S equals 1

simple events


The probability of an event A is equal to the sum of the probabilities of the _______________ contained in A

simple events


How to calculate the probability of an event
1. List all the ___________ in the sample space

simple events


How to calculate the probability of an event
2. Assign an appropriate ________ to each simple event

probability


How to calculate the probability of an event
3. Determine which simple events result in the __________ of interest

event


How to calculate the probability of an event
4. ____________ the probabilities of the simple events that result in the event of interest

sum
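
The four steps can be followed literally in code. The sketch below reuses the die-toss experiment from the earlier card and, as an assumed example, takes the event of interest to be "the number is even"; a fair die is also an assumption.

```python
from fractions import Fraction

# Step 1: list all the simple events in the sample space.
sample_space = [1, 2, 3, 4, 5, 6]

# Step 2: assign a probability to each simple event (fair die assumed).
probabilities = {outcome: Fraction(1, len(sample_space)) for outcome in sample_space}

# Step 3: determine which simple events result in the event of interest.
def is_even(outcome):
    return outcome % 2 == 0

favorable = [outcome for outcome in sample_space if is_even(outcome)]

# Step 4: sum the probabilities of those simple events.
p_even = sum(probabilities[outcome] for outcome in favorable)

print("simple events in A:", favorable)
print("P(A) =", p_even)
```

Exact fractions are used so the probabilities over the whole sample space add up to exactly 1.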


What are the three rules for counting the number of simple events?
1. The ________ rule

mn


What are the three rules for counting the number of simple events?
2. A counting rule for ____

permutations


What are the three rules for counting the number of simple events?
3. A counting rule for ______________

combinations


Z-score is a measurement of _______________

relative standing


Z-score measures the distance between a particular observation x and the ________, measured in units of ____________. Its formula is z = (measurement - mean) / standard deviation

mean, standard deviation
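
A z-score is a one-line calculation; the subtraction has to happen before the division. The numbers below (a score of 85 on a test with mean 70 and standard deviation 10) are made up for illustration.

```python
def z_score(measurement, mean, standard_deviation):
    """Distance from the mean, expressed in standard deviations."""
    return (measurement - mean) / standard_deviation

# Made-up example: a score of 85 when the mean is 70 and the standard deviation is 10.
print(z_score(85, 70, 10))   # 1.5 standard deviations above the mean
print(z_score(60, 70, 10))   # 1 standard deviation below the mean
```

A positive z-score means the observation lies above the mean and a negative one means it lies below.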


A percentile is another measure of relative standing, most often used for __________ data sets

large


The p-th percentile is the value of x that is greater than __________% of the measurements and is less than the remaining ________%

p, 100-p


When the ordering or arrangement of the objects is important, you can use a counting rule for ________

permutations


Sometimes the ordering or arrangement of the objects is not important, but only the objects that are chosen. In this case, you can use a counting rule for ____________

combinations
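
The difference between the two counting rules shows up clearly on a small case. The sketch below counts the ways to pick 3 objects out of 5 with `math.perm` and `math.comb` (Python 3.8+) and confirms the counts by listing every arrangement with `itertools`. The five letters are made up for illustration.

```python
import math
from itertools import combinations, permutations

objects = ["A", "B", "C", "D", "E"]   # made-up objects
n, r = len(objects), 3

# Order matters: counting rule for permutations, n! / (n - r)!
n_permutations = math.perm(n, r)

# Order does not matter: counting rule for combinations, n! / (r! (n - r)!)
n_combinations = math.comb(n, r)

print("permutations:", n_permutations)
print("combinations:", n_combinations)

# Cross-check by brute-force enumeration.
print(len(list(permutations(objects, r))), len(list(combinations(objects, r))))
```

Each combination of 3 objects can be arranged in 3! = 6 orders, which is why there are 6 times as many permutations as combinations here.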


The ________ of events A and B, denoted by A ∪ B, is the event that either A or B or both occur

union


The ___________ of events A and B, denoted by A ∩ B, is the event that both A and B occur

intersection


The _________ of an event A, denoted by A^c, is the event that A does not occur

complement
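
Events are sets of simple events, so Python's built-in `set` type mirrors union, intersection, and complement directly. The sketch uses the die-toss sample space; the events A ("even number") and B ("at most 3") are made-up examples.

```python
# Sample space for one toss of a die.
S = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # made-up event: the number is even
B = {1, 2, 3}   # made-up event: the number is at most 3

union = A | B            # A ∪ B: A or B or both occur
intersection = A & B     # A ∩ B: both A and B occur
complement_of_A = S - A  # A^c: A does not occur

print("A ∪ B =", sorted(union))
print("A ∩ B =", sorted(intersection))
print("A^c   =", sorted(complement_of_A))

# An event and its complement never share a simple event.
print(A & complement_of_A == set())
```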


Simple events are mutually exclusive (true/false)

true


Event A and its complement are mutually exclusive no matter what A is (true/false)

true


Are mutually exclusive events independent (yes/no)

no (not when both events have nonzero probability)


Are two independent events mutually exclusive (yes/no)

no (not when both events have nonzero probability)
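
Both questions can be checked on a fair die, using the standard definition that A and B are independent when P(A ∩ B) = P(A) P(B). The events below are made-up examples and the helper names are hypothetical.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # fair die assumed

def prob(event):
    """Probability of an event when all simple events are equally likely."""
    return Fraction(len(event), len(S))

def mutually_exclusive(a, b):
    return len(a & b) == 0

def independent(a, b):
    return prob(a & b) == prob(a) * prob(b)

# Made-up mutually exclusive events with nonzero probability.
A, B = {1, 2}, {3, 4}
print(mutually_exclusive(A, B), independent(A, B))

# Made-up independent events: "even" and "at most 2".
C, D = {2, 4, 6}, {1, 2}
print(independent(C, D), mutually_exclusive(C, D))
```

If A and B are mutually exclusive, knowing that A occurred tells you B did not, so the events cannot be independent unless one of them has probability zero.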


A _____________ (type ___ error) is the event that the test is positive for a given condition, given that the person does not have the condition

false positive, I


A ______________ (type _____ error) is the event that the test is negative for a given condition, given that the person has the condition

false negative, II


A variable X is a __________________ if the value that it assumes, corresponding to the outcome of an experiment, is a chance or random event

random variable


Quantitative variables are classified as either ___________ or ______________, according to the values that X can assume

discrete, continuous


We defined probability as the limiting value of the _______________________ as the experiment is repeated over and over again

relative frequency


Now we define the probability distribution for a random variable X as the ____________________ distribution constructed for the entire population of measurements

relative frequency


The ________________ for a discrete random variable is a formula, table, or graph that gives all the possible values of X, and the probability p(x)=P(X=x) associated with each value x

probability distribution


Requirements for a Discrete Probability Distribution
A. __________ ≤ p(x) ≤ _________
B. Sum of p(x) over all x = _______

A. 0, 1
B. 1
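
Both requirements are easy to verify mechanically. The sketch below checks them for an assumed example, the number of heads in two tosses of a fair coin, and for a made-up table that breaks the second requirement. The function name is hypothetical.

```python
from fractions import Fraction

def is_valid_distribution(p):
    """Check both requirements for a discrete probability distribution p(x)."""
    values = list(p.values())
    each_between_0_and_1 = all(0 <= value <= 1 for value in values)   # requirement A
    total_is_one = sum(values) == 1                                   # requirement B
    return each_between_0_and_1 and total_is_one

# Assumed example: X = number of heads in two tosses of a fair coin.
heads_in_two_tosses = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

# Made-up table whose probabilities add up to more than 1.
broken = {0: Fraction(1, 2), 1: Fraction(3, 5)}

print(is_valid_distribution(heads_in_two_tosses))
print(is_valid_distribution(broken))
```

Exact fractions avoid floating-point rounding in the sum; with decimal probabilities a tolerance check such as `math.isclose` would be the safer comparison.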


Comparing the relative frequency distribution and the probability distribution: the difference is that the relative frequency distribution describes a ________ of n measurements, while the probability distribution is constructed as a model for the entire __________ of measurements

sample, population