census
a sample survey that attempts to include the entire population in the sample
experiment
a study that deliberately imposes some treatment on individuals in order to observe their responses. The purpose is to study whether the treatment causes a change in the response
individuals
the objects described by a set of data. They may be people, but they may also be animals or things
observational study
a study that observes individuals and measures variables of interest, yet does not involve any intervention that will influence the responses. The purpose of such a study is to describe some group or situation
population
in a statistical study, this is the entire group of individuals about which we want information
response variable
a variable that measures an outcome or result of a study
sample
the part of the population from which we actually collect information; it is used to draw conclusions about the whole
sample survey
a type of observational study in which only a few members of a particular group are studied. These group members are selected not because they are of special interest, but because they represent the larger group
variable
any characteristic of an individual. It can take different values for different individuals
bias
(1) When the design of a statistical study systematically favors certain outcomes. (2) Consistent, repeated deviation of the sample statistic from the population parameter in the same direction when we take many samples.
convenience sampling
selection of participants for a statistical study based on how easy they are to reach. Because this method of sampling may not take an entire population into account, it is often biased
simple random sample
denoted SRS, consists of n individuals from the population chosen in such a way that every set of n individuals has an equal chance to be the sample actually selected
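As an illustration of the definition above, here is a minimal sketch of drawing an SRS in Python. The population of 30 labeled students, the function name `simple_random_sample`, and the seed are all hypothetical; `random.sample` draws without replacement so that every set of n individuals is equally likely.

```python
import random

def simple_random_sample(population, n, seed=None):
    """Return n distinct individuals; every set of n is equally likely."""
    rng = random.Random(seed)  # the seed is only here to make the draw repeatable
    return rng.sample(list(population), n)

# Hypothetical population: 30 students labeled 1 through 30.
students = list(range(1, 31))
srs = simple_random_sample(students, 5, seed=1)
print(srs)
```

Changing or omitting the seed gives a different sample each run, which is the point: chance, not the sampler, decides who is chosen.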
voluntary response sample
a sample that chooses itself by responding to a general appeal, and is therefore often biased. Examples include write-in or call-in opinion polls
table of random digits
a table consisting of a long string of the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 with the following two properties: 1. Each entry in the table is equally likely to be any of the 10 digits 0 through 9. 2. The entries are independent of each other. That is, knowledge of one part of the table gives no information about any other part
parameter
a number that describes a population. It is a fixed number, but in practice we don't know the actual value of this number
statistic
A number that describes a sample. This is a known value when we have taken a sample, but it can change from sample to sample. It is often used to estimate an unknown parameter.
margin of error
A bound on how far the sample statistic is likely to lie from the population parameter, at the stated level of confidence.
level of confidence
The percentage of all possible samples for which the sample statistic falls within the margin of error of the population parameter.
confidence statement
A statement that says how accurate our conclusions about the population are. The statement has two parts: the margin of error and the level of confidence.
variability
Describes how spread out the values of the sample statistic are when we take many samples.
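Variability can be seen by simulation. The sketch below assumes a hypothetical population of 10,000 individuals in which the parameter (the proportion of 1s) is 0.6, takes many simple random samples of two sizes, and compares how spread out the sample proportions are. The population, sample sizes, and seed are illustrative assumptions, not values from the source.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Hypothetical population: 6,000 ones and 4,000 zeros, so the parameter is 0.6.
population = [1] * 6000 + [0] * 4000

def sample_proportion(n):
    """Statistic: proportion of 1s in one simple random sample of size n."""
    return sum(rng.sample(population, n)) / n

small = [sample_proportion(100) for _ in range(1000)]
large = [sample_proportion(1000) for _ in range(1000)]

# Both sets of statistics center near 0.6 (little bias),
# but the larger samples are less spread out (lower variability).
print(statistics.mean(small), statistics.stdev(small))
print(statistics.mean(large), statistics.stdev(large))
```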
exit poll
A poll in which voters are interviewed as they are leaving the voting place.
pre-election poll
A sample survey that asks people how they will vote in the future. Because people often change their minds before the election, these polls are not very reliable.
response error
A type of nonsampling error that occurs when a subject gives an incorrect response. In this case, the subject's response may be a lie, the subject may remember incorrectly, or the subject may guess at an answer without fully understanding the question.
nonsampling errors
Errors that are not related to the act of selecting a sample from the population. They can be present even in a census.
processing error
A type of nonsampling error that involves mistakes in mechanical tasks such as doing arithmetic or entering responses into a computer.
random sampling error
The deviation between the sample statistic and the population parameter caused by chance in selecting a random sample. The margin of error in a confidence statement includes only this type of error.
nonresponse
The failure to obtain data from an individual selected for a sample. This could be because the subject does not cooperate, or because the subject could not be contacted.
probability sample
A sample chosen by chance.
cluster
A collection of individuals that are grouped together based on location.
sampling errors
Errors that are caused by the act of taking a sample. These errors may cause sample results to be different from the results of a census.
sampling frame
A list of individuals from which one draws a sample.
undercoverage
When some groups in a population are left out of the process of choosing a sample.
strata
Groups of individuals in a population that are formed based on a determined similarity. A single such group is a stratum.
stratified random sample
A sample in which the sampling frame is first divided into various strata. A simple random sample is then taken in each of these strata, with those selected combined to form the complete sample.
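The two steps in this definition (divide the frame into strata, then take an SRS within each) can be sketched as follows. The frame of undergraduate and graduate students, the function name, and the per-stratum sample sizes are hypothetical.

```python
import random
from collections import defaultdict

def stratified_random_sample(frame, stratum_of, n_per_stratum, seed=None):
    """Take an SRS inside each stratum and combine the results."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for individual in frame:
        strata[stratum_of(individual)].append(individual)  # step 1: divide the frame
    sample = []
    for name in sorted(strata):
        # step 2: a simple random sample within this stratum
        sample.extend(rng.sample(strata[name], n_per_stratum[name]))
    return sample

# Hypothetical sampling frame: 40 undergraduates and 10 graduate students.
frame = [("ug", i) for i in range(40)] + [("grad", i) for i in range(10)]
sample = stratified_random_sample(
    frame, stratum_of=lambda ind: ind[0], n_per_stratum={"ug": 4, "grad": 2}, seed=2
)
print(sample)
```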
clinical trial
A medical experiment involving human subjects.
treatment
Any specific experimental condition that is applied to the subjects. If an experiment has several explanatory variables, this is a combination of specific values of these variables.
placebo effect
When an individual responds to a dummy treatment, perhaps due to the individual's expectation that the treatment will produce an influential outcome.
confounded
Two variables are said to be this when their effects on a response variable cannot be distinguished from each other.
lurking variable
A variable that has an important effect on the relationship among the variables in a study but is not one of the explanatory variables studied.
randomized comparative experiment
An experiment in which two or more treatments are compared and chance is used to decide which subjects get each treatment. In this type of experiment, enough subjects are used so that the effects of chance are small.
placebo
A dummy treatment with no active ingredients.
double-blind experiment
An experiment in which neither the subjects nor those performing the experiment know which treatment the subjects received.
explanatory variable
A variable that we think explains or causes changes in the response variable.
control group
A group that does not receive the treatment. This group may or may not receive a placebo.
statistically significant
When differences among the effects of the treatments are so large that they would rarely happen just by chance.
subjects
The individuals studied in an experiment.
dropout
A subject who begins an experiment but does not complete it.
completely randomized design
Experimental design where all the experimental subjects are allocated at random among all the treatments.
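One way to carry out such an allocation is to shuffle the subjects and deal them out to the treatments in turn. This is only a sketch; the subject labels, treatment names, and function name are hypothetical.

```python
import random

def completely_randomized(subjects, treatments, seed=None):
    """Allocate all subjects at random among all treatments, in near-equal groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)  # chance alone decides the order
    groups = {t: [] for t in treatments}
    for i, subject in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(subject)
    return groups

# Hypothetical experiment: 12 subjects, 3 treatments.
subjects = [f"s{i}" for i in range(1, 13)]
groups = completely_randomized(subjects, ["A", "B", "placebo"], seed=3)
for treatment, members in groups.items():
    print(treatment, members)
```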
nonadherer
A subject who participates in an experiment but does not follow the experimental treatment.
block
A group of experimental subjects that are known before the experiment to be similar in some way that is expected to affect the response to the treatments.
matched pairs
A type of experiment that compares just two treatments and combines matching with randomization. Each subject receives both treatments in a random order, or the subjects are matched in pairs as closely as possible, and one subject in each pair receives each treatment.
block design
Where the random assignment of subjects to treatments is carried out separately within each block.
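In code, a block design differs from a completely randomized design only in that the shuffle happens inside each block. The sketch below assumes two hypothetical blocks and two treatments; none of these names come from the source.

```python
import random

def block_design(blocks, treatments, seed=None):
    """Randomly assign treatments separately within each block."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, members in blocks.items():
        shuffled = list(members)
        rng.shuffle(shuffled)  # randomization is carried out within this block only
        for i, subject in enumerate(shuffled):
            assignment[subject] = (block_name, treatments[i % len(treatments)])
    return assignment

# Hypothetical blocks formed before the experiment begins.
blocks = {"women": ["w1", "w2", "w3", "w4"], "men": ["m1", "m2", "m3", "m4"]}
plan = block_design(blocks, ["drug", "placebo"], seed=4)
for subject, (block_name, treatment) in sorted(plan.items()):
    print(subject, block_name, treatment)
```

Because each block is split evenly, the treatment groups are balanced on the blocking variable by construction rather than by luck.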
informed consent
Type of consent a subject must give, usually in writing, before participating in a study. The subject must be told in advance about the nature of the study and any risk of harm it may bring.
anonymity
When subjects are anonymous and their names are not known even to the director of the study
confidential
When an individual's responses in a study are not released to the public. Only statistical summaries of groups of subjects may be made public.
institutional review board
A board that reviews all of an organization's planned studies in advance in order to protect the subjects from possible harm. The goal is to ensure that such studies are ethical.
instrument
A tool used to make a measurement.
measurement
The assignment of a number to some property of a person or thing.
rate
Expressed as a fraction, a percentage, or a proportion, this is often a more valid measure than a simple count of occurrences.
predictive validity
When the measurement of a property can be used to predict success on tasks that are related to the property measured.
random error
This occurs when repeated measurements on the same individual give different results.
valid measurement
When a variable is relevant or appropriate as a representation of a property.
reliable measurement
When the random error in a measurement process is small.
average
In terms of measurement, the mean of several repeated measurements of the same individual. This is considered less variable, and thus more reliable, than a single measurement.
variance
(1) A quantity that is used to determine if the random error associated with a measurement is small. (2) The average squared distance of the observations in a distribution from their mean. Also referred to as the square of the standard deviation.
implausible
Numbers in data that are surprisingly large or small. This can be a sign that the data is being incorrectly used in order to support a particular argument.
inconsistencies
When numbers in statistical data do not agree as they should. This can be a sign that the data is being incorrectly used in order to support a particular argument.
bar graph
A graph used to display the distribution of a categorical variable. More generally, it allows one to compare any set of numbers measured in the same units.
pie chart
A graph used to display the distribution of a categorical variable by showing how a whole is divided into parts.
distribution
Describes what values a variable takes and how often the variable takes these values.
roundoff error
When there is a discrepancy between a calculated figure and the actual figure due to rounding.
categorical variable
A variable that places an individual into one of several groups or categories.
line graph
A graph used to show how a quantitative variable changes over time. It plots the values of the variable (vertical scale) against time (horizontal scale), with the data points connected by lines.
quantitative variable
A variable that takes numerical values for which arithmetic operations such as adding and averaging make sense.
deviation
Data that is inconsistent with the overall pattern.
pictogram
A bar graph in which pictures replace the bars
trend
A long-term upward or downward movement over time.
seasonal variation
A pattern that repeats itself at known regular intervals of time.
seasonally adjusted
When seasonal variation is removed before the data are published.
histogram
The most common graph of the distribution of a quantitative variable. This type of graph is usually favored for larger data sets.
center
The midpoint of a distribution. This is often used as a means of helping to describe the overall pattern of a histogram or stemplot.
outlier
Any individual observation that falls outside the overall pattern of the other observations.
shape
The visual pattern of the distribution, such as symmetric or skewed. This is often used as a means of helping to describe the overall pattern of a histogram or stemplot.
spread
The range of a distribution, from its lowest value to its highest value. This is often used as a means of helping to describe the overall pattern of a histogram or stemplot.
right-skewed
Description given to a distribution when the right side of the histogram (the side containing the half of the observations with larger values) extends out much farther than the left side.
left-skewed
Description given to a distribution when the left side of the histogram (the side containing the half of the observations with smaller values) extends out much farther than the right side.
symmetric
When both the right and left sides of a graph of a distribution are mirror images of each other.
stemplot
A graphical display of a quantitative variable, usually favored for smaller data sets, in which each observation is separated into a stem consisting of all but the final (rightmost) digit and a leaf (the final digit). The stems are placed in a vertical column with the smallest at the top, a vertical line is drawn to the right of this column, and each leaf is placed to the right of its corresponding stem, in increasing order out from the stem.
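The construction described above can be sketched for nonnegative whole numbers, where the stem is everything but the last digit and the leaf is the last digit. The test scores are hypothetical data, and the function name `stemplot` is an illustrative choice.

```python
def stemplot(values):
    """Return the lines of a stemplot for nonnegative integers."""
    data = sorted(values)
    # Include every stem between the smallest and largest, even if it has no leaves.
    stems = range(data[0] // 10, data[-1] // 10 + 1)
    lines = []
    for stem in stems:
        # data is sorted, so leaves come out in increasing order
        leaves = "".join(str(v % 10) for v in data if v // 10 == stem)
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical test scores.
scores = [72, 85, 91, 68, 77, 85, 90, 59, 88, 73]
for line in stemplot(scores):
    print(line)
```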
median
The midpoint of a distribution; the number such that half the observations are smaller and the other half are larger. It is often denoted as M and can be found in a set of data, when all observations are arranged in order from smallest to largest, by counting (n + 1)/2 observations up from the bottom of the list.
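The (n + 1)/2 location rule can be sketched as below. One assumption goes beyond the definition above: when n is even the location falls halfway between two observations, and the sketch averages those two, which is the usual convention.

```python
def median(observations):
    """Find M by counting (n + 1)/2 observations up from the bottom of the sorted list."""
    data = sorted(observations)
    n = len(data)
    location = (n + 1) / 2  # 1-based position in the ordered list
    if location.is_integer():
        return data[int(location) - 1]  # odd n: a single middle observation
    # Even n (assumed convention): average the two observations around the location.
    lower = data[int(location) - 1]
    upper = data[int(location)]
    return (lower + upper) / 2

print(median([5, 1, 3]))     # location 2, so M = 3
print(median([4, 1, 3, 2]))  # location 2.5, so M = 2.5
```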
minimum
The smallest observation in a data set.
maximum
The largest observation in a data set.
five-number summary
A description of a distribution that consists of the smallest observation (minimum), the first quartile, the median, the third quartile, and the largest observation (maximum). These numbers are written from smallest to largest. In symbols, it is: Minimum Q1 M Q3 Maximum
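A minimal sketch of computing the five-number summary follows. It assumes one common textbook convention for quartiles: Q1 is the median of the observations below the median's location and Q3 is the median of those above it. Software packages often use slightly different rules, so results may differ a little. The scores are hypothetical.

```python
import statistics

def five_number_summary(observations):
    """Return (Minimum, Q1, M, Q3, Maximum) for at least two observations."""
    data = sorted(observations)
    n = len(data)
    half = n // 2
    lower = data[:half]          # observations below the median's location
    upper = data[half + n % 2:]  # observations above it (skips the middle one if n is odd)
    return (
        data[0],
        statistics.median(lower),
        statistics.median(data),
        statistics.median(upper),
        data[-1],
    )

# Hypothetical test scores.
scores = [72, 85, 91, 68, 77, 85, 90, 59, 88, 73]
print(five_number_summary(scores))
```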
boxplot
A graph of the five-number summary. A central box spans the quartiles. A line in the box marks the median. Lines extend from the box out to the smallest and largest observations.
mean
The arithmetic average of the observations in a data set: the sum of the observations divided by the number of observations.
quartile
Together with the median, the quartiles divide the observations in a distribution into quarters. One-quarter of the observations fall below the first quartile, which is denoted as Q1. Three-quarters of the observations fall below the third quartile, which is denoted as Q3.
standard deviation
Denoted s and used to measure spread, this is roughly the average distance of the observations in a distribution from their mean. It is calculated by averaging the squared distances from the mean (for a sample, the sum is conventionally divided by n - 1) and then taking the square root.
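The calculation can be sketched step by step. The sketch assumes the usual sample convention of dividing the sum of squared distances by n - 1, which is also what Python's `statistics.stdev` does; the data values are hypothetical.

```python
import math
import statistics

def standard_deviation(observations):
    """Sample standard deviation s, using the n - 1 divisor (assumed convention)."""
    n = len(observations)
    mean = sum(observations) / n
    squared_distances = [(x - mean) ** 2 for x in observations]
    variance = sum(squared_distances) / (n - 1)  # "average" squared distance
    return math.sqrt(variance)                   # back to the original units

# Hypothetical data with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]
s = standard_deviation(data)
print(s, statistics.stdev(data))  # the two values should agree
```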
density curve
A curve with area exactly 1 underneath it whose shape describes the overall pattern of a distribution.
Normal curve
A symmetric, bell-shaped density curve that has the following properties: 1. It is completely described by giving its mean and its standard deviation. 2. The mean determines the center of the distribution. It is located at the center of symmetry of the curve. 3. The standard deviation determines the shape of the curve. It is the distance from the mean to the change-of-curvature points on either side.
68-95-99.7 rule
Rule that states that in any Normal distribution, approximately 68% of the observations fall within one standard deviation of the mean, 95% of the observations fall within two standard deviations of the mean, and 99.7% of the observations fall within three standard deviations of the mean.
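The rule can be checked numerically with the standard library's `statistics.NormalDist`. The helper name `share_within` and the example mean of 100 and standard deviation of 15 are illustrative assumptions.

```python
from statistics import NormalDist

def share_within(k, mean=0.0, sd=1.0):
    """Proportion of a Normal distribution within k standard deviations of the mean."""
    dist = NormalDist(mean, sd)
    return dist.cdf(mean + k * sd) - dist.cdf(mean - k * sd)

for k in (1, 2, 3):
    print(k, round(share_within(k), 4))

# The proportions do not depend on which Normal distribution is used.
print(round(share_within(2, mean=100, sd=15), 4))
```

The exact proportions are about 0.6827, 0.9545, and 0.9973, which is why the rule is stated as an approximation.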
positive association
Two variables are said to have this when above-average values of one variable tend to accompany above-average values of the other variable. Below-average values of the two variables also tend to occur together. As a result, the scatterplot slopes upward as we move from left to right.