Experimental Unit
An object (person, thing, event, transaction, etc.) upon which we collect data
Population
A set of units (usually people, objects, transactions, or events) that we are interested in studying
Sample
A subset of the units of a population
Parameter
A summary measure (e.g., an average or a proportion) that describes the entire population
Statistic
A summary measure computed from the data collected in a sample
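The difference between a parameter and a statistic can be sketched in Python. The population (the integers 1 through 100), the sample size of 10, and the seed are made-up values chosen only for illustration.

```python
import random
from statistics import mean

# Hypothetical population: the integers 1 through 100 (illustrative only).
population = list(range(1, 101))

# Parameter: a summary measure of the entire population.
population_mean = mean(population)

# Statistic: the same summary measure computed from a sample.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
sample = rng.sample(population, 10)
sample_mean = mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is fixed for a given population, while the statistic changes from sample to sample.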
Examples of Populations
All people in the U.S. (each person is an experimental unit)
All U.S. males age 18-35
All sales made by the ND Bookstore last week
Variable
A characteristic or property of an individual unit. Often notated with capital letters such as X or Y.
Examples of Variables
What is your eye color?
How much cash do you have on you?
Do you suffer from diabetes?
Qualitative/Categorical
Data collected on non-numerical variables. Can be expressed in words
Quantitative
Data collected on numerical variables. Can be mathematically summarized
Nominal
No “order” exists for the categories
Ordinal
The values can be ordered (arranged from lowest to highest)
Discrete
Can list the possible values and the pattern of values (e.g., integers)
Continuous
The possible values can be any value in a range, with any number of decimal places
Published sources
Includes government agencies such as the U.S. Census and Bureau of Labor Statistics, newspapers, academic journals, universities, and websites
Designed Experiment
The researcher actively imposes a treatment on the sample units.
Observational Studies
The researcher does not actively impose treatment on the experimental units, but records the values of the variable of interest.
Sample Surveys
A researcher can also conduct surveys via mail, telephone, e-mail, or in-person interviews
Simple Random Sample
A sample selected from the population in such a way that every sample of size n has an equal chance of selection
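As a rough illustration of this definition, the sketch below lists every possible sample of size n from a tiny population and then draws one at random. The five unit labels, n = 2, and the seed are hypothetical values, not part of the definition.

```python
import random
from fractions import Fraction
from itertools import combinations

population = ["A", "B", "C", "D", "E"]  # hypothetical units
n = 2

# Every possible sample of size n (order does not matter).
all_samples = list(combinations(population, n))

# Under simple random sampling, each of these samples is equally likely.
chance_per_sample = Fraction(1, len(all_samples))

# random.sample draws n distinct units without replacement.
rng = random.Random(1)  # fixed seed so the sketch is repeatable
srs = rng.sample(population, n)

print(len(all_samples), "possible samples, each with chance", chance_per_sample)
print("one simple random sample:", srs)
```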
Selection Bias
This results when some subset of the experimental units in the population is excluded so that these units have no chance of being selected in the sample
Nonresponse Bias
This occurs when the researchers conducting the survey or study are unable to obtain data on all experimental units selected for the sample.
Measurement Error
This refers to inaccuracies in the values of the data recorded. Errors may be due to ambiguous or leading questions.
Experiment
A planned operation carried out under controlled conditions
Outcome
A single result of an experiment
Sample Space
The set of all possible outcomes
Event
One or more of the outcomes
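The terms experiment, outcome, sample space, and event can be made concrete with a small Python sketch. The experiment chosen here (tossing a coin twice) and the event "at least one head" are illustrative assumptions.

```python
from itertools import product

# Experiment: toss a coin twice (an illustrative choice).
# Each outcome is a tuple such as ("H", "T").
sample_space = set(product("HT", repeat=2))

# Event: a set of one or more outcomes, here "at least one head".
at_least_one_head = {outcome for outcome in sample_space if "H" in outcome}

print("sample space:", sorted(sample_space))
print("event:", sorted(at_least_one_head))
```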
Probability
The predicted chance (or long-term relative frequency) of an event.
Probabilities must be…
values between 0 and 1, inclusive: 0 ≤ P(A) ≤ 1
Probabilities of all outcomes must add to
1
Classical
All outcomes are assumed equally likely
Empirical
Probabilities are based on results of a random sample
Subjective
An expert provides the probabilities
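The classical and empirical approaches can be compared with a short sketch. It assumes a fair six-sided die and uses simulated rolls as a stand-in for collected sample data; the number of rolls and the seed are arbitrary choices.

```python
import random
from fractions import Fraction

faces = [1, 2, 3, 4, 5, 6]
event = {2, 4, 6}  # "roll an even number"

# Classical: all outcomes assumed equally likely, so count favorable outcomes.
classical = Fraction(len(event), len(faces))

# Empirical: relative frequency of the event in observed (here simulated) data.
rng = random.Random(42)  # fixed seed so the sketch is repeatable
rolls = [rng.choice(faces) for _ in range(10_000)]
empirical = sum(roll in event for roll in rolls) / len(rolls)

print("classical:", classical)
print("empirical:", empirical)
```

With many rolls the empirical value should land close to the classical one, which reflects the long-term relative frequency view of probability.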
Intersection of two events (A and B):
the group of outcomes in both A and B
Union of two events (A or B):
the group of outcomes in A or B (or both)
Complement of an Event, A: A' =
the group of outcomes NOT in A
Conditional Event, A|B:
A occurs given that B is known to occur
Mutually Exclusive Events:
A and B have no overlap
Independent Events:
Knowing that A has occurred does not affect the probability of B (and vice-versa).
Mutually Exclusive
Events that cannot occur at the same time.
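Python sets give one way to sketch these event relationships. The example assumes one roll of a fair die with equally likely outcomes, and the events A, B, C, and D are made up for illustration. It also uses two standard results not spelled out in the cards: P(A|B) = P(A and B) / P(B), and A and B are independent exactly when P(A and B) = P(A) P(B).

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space for one roll of a fair die

def prob(event):
    """Classical probability: favorable outcomes over total outcomes."""
    return Fraction(len(event & S), len(S))

A = {2, 4, 6}  # even number
B = {4, 5, 6}  # greater than 3
C = {1, 3}     # hypothetical event with no overlap with A
D = {1, 2}     # hypothetical event that turns out to be independent of A

p_a_and_b = prob(A & B)                # intersection
p_a_or_b = prob(A | B)                 # union
p_not_a = prob(S - A)                  # complement
p_a_given_b = prob(A & B) / prob(B)    # conditional probability

a_c_mutually_exclusive = A.isdisjoint(C)
a_d_independent = prob(A & D) == prob(A) * prob(D)
a_b_independent = prob(A & B) == prob(A) * prob(B)

print(p_a_and_b, p_a_or_b, p_not_a, p_a_given_b)
print(a_c_mutually_exclusive, a_d_independent, a_b_independent)
```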
Contingency Table
Provides a way of portraying data that can facilitate calculating probabilities
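A minimal sketch of reading probabilities off a contingency table follows. The row and column labels and all counts are invented for illustration only.

```python
from fractions import Fraction

# Hypothetical counts: rows are groups, columns are survey answers.
table = {
    "student": {"yes": 30, "no": 20},
    "staff": {"yes": 10, "no": 40},
}

total = sum(sum(row.values()) for row in table.values())
student_total = sum(table["student"].values())
yes_total = sum(row["yes"] for row in table.values())

# Marginal probabilities use row or column totals.
p_student = Fraction(student_total, total)
p_yes = Fraction(yes_total, total)

# Joint probability uses a single cell.
p_student_and_yes = Fraction(table["student"]["yes"], total)

# Conditional probability restricts attention to one row.
p_yes_given_student = Fraction(table["student"]["yes"], student_total)

print(p_student, p_yes, p_student_and_yes, p_yes_given_student)
```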
Tree Diagram
A special type of graph used to determine the outcomes of an experiment
Venn Diagram
A picture that represents the outcomes of an experiment. Generally consists of a box that represents the sample space S together with circles or ovals that represent events
Random Variable
A numerical description of the outcome of an experiment.
Discrete Random Variable
The possible values are distinct, predictable, and can be listed. The number of values can be either finite or infinite. Probabilities can be assigned to each value, but they might not be equal.
Continuous Random Variable
The possible values cannot be listed individually since the variables can take on any value in a range. It is not possible to list every distinct value. The number of values is always infinite and the probability of any specific value is zero.
Probability Distribution
A table, graph, or formula that specifies the probability associated with each possible value that the random variable can assume
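As a small illustration, the sketch below builds the probability distribution of a discrete random variable as a table (a dictionary). It assumes two tosses of a fair coin with equally likely outcomes, and X, the number of heads, is an example chosen here.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space for two tosses of a fair coin (illustrative assumption).
outcomes = list(product("HT", repeat=2))

def X(outcome):
    """Random variable: number of heads in the outcome."""
    return outcome.count("H")

# Count how many equally likely outcomes map to each value of X.
counts = Counter(X(outcome) for outcome in outcomes)
distribution = {x: Fraction(c, len(outcomes)) for x, c in sorted(counts.items())}

for x, p in distribution.items():
    print(f"P(X = {x}) = {p}")
```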
Requirements for the Probability Distribution of a Discrete Random Variable X
0 ≤ p(x) ≤ 1, for all x
Σ p(x) = 1, where the summation of p(x) is over all possible values of x
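Both requirements can be checked in a few lines of Python. The function name is_valid_distribution and the three example tables are hypothetical; a small floating-point tolerance is assumed when testing that the probabilities sum to 1.

```python
import math

def is_valid_distribution(pmf):
    """pmf maps each possible value x to its probability p(x)."""
    # Requirement 1: every probability lies between 0 and 1, inclusive.
    each_in_range = all(0 <= p <= 1 for p in pmf.values())
    # Requirement 2: the probabilities sum to 1 (within rounding error).
    sums_to_one = math.isclose(sum(pmf.values()), 1.0)
    return each_in_range and sums_to_one

valid = {0: 0.25, 1: 0.5, 2: 0.25}
bad_sum = {0: 0.5, 1: 0.6}        # sums to 1.1
negative = {0: -0.1, 1: 1.1}      # values outside [0, 1]

print(is_valid_distribution(valid))
print(is_valid_distribution(bad_sum))
print(is_valid_distribution(negative))
```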
Characteristics of a Binomial Random Variable
The experiment consists of n identical trials
There are only 2 possible outcomes on each trial. We denote one outcome by S (or 1) for success, and F (or 0) for failure.
The probability of success is constant from trial to trial. This probability is denoted by p, and the probability of failure is q = 1 - p
The trials are independent
The binomial random variable x represents the number of successes in n trials
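Under these characteristics, the probability of x successes in n trials is given by the standard binomial formula P(x) = C(n, x) p^x q^(n - x), which is not stated in the cards above. A minimal Python sketch, with n = 3 and p = 0.5 chosen purely as example values:

```python
from math import comb

def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    q = 1 - p  # probability of failure
    return comb(n, x) * p**x * q**(n - x)

n, p = 3, 0.5  # hypothetical example: three tosses of a fair coin
pmf = [binomial_pmf(x, n, p) for x in range(n + 1)]

for x, prob in enumerate(pmf):
    print(f"P(x = {x}) = {prob}")
```

The probabilities over x = 0, 1, ..., n sum to 1, as any discrete probability distribution must.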
Unimodal
only has one peak in the distribution
Bimodal
has two peaks in the distribution
Multi-modal
has more than one peak (mode) in the distribution
frequency
the number of observations in a data set falling in a particular category
relative frequency
frequency divided by the total number of observations in the data set.
n
the number of data observations
percentage frequency
class relative frequency multiplied by 100
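Frequency, relative frequency, and percentage frequency can be tabulated with a short Python sketch. The eye-color observations below are invented for illustration.

```python
from collections import Counter

# Hypothetical observations of a qualitative variable (eye color).
data = ["brown", "blue", "brown", "green", "brown",
        "blue", "brown", "hazel", "blue", "brown"]

n = len(data)  # number of data observations

# Frequency: number of observations in each category.
frequency = Counter(data)

# Relative frequency: frequency divided by n.
relative_frequency = {category: f / n for category, f in frequency.items()}

# Percentage frequency: relative frequency multiplied by 100.
percentage_frequency = {category: 100 * f / n for category, f in frequency.items()}

for category in frequency:
    print(category, frequency[category],
          relative_frequency[category], percentage_frequency[category])
```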