Statistics
the science of planning studies and experiments, obtaining data, and then organizing, summarizing, presenting, analyzing, interpreting, and drawing conclusions based on the data.
Data
collections of observations.
Descriptive Statistics
organizing and summarizing data, either graphically or with numerical values (such as an average).
Inferential Statistics
uses methods that take a result from a sample, extend it to the population, and measure the reliability of the result.
Probability
the chance of an event occurring.
Population
the complete collection of all individuals to be studied.
Sample
a subcollection of members selected from a population.
Sampling
selecting a portion (or subset) of the larger population and studying that portion (the sample) to gain information about the population. Data are the result of sampling from a population.
Parameter
a numerical measurement describing some characteristic of a population.
Statistic
a numerical measurement describing some characteristic of a sample.
Representative Sample
the idea that the sample must contain the characteristics of the population. One of the main concerns in the field of statistics is how accurately a statistic estimates a parameter.
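The difference between a parameter and a statistic can be sketched in a few lines. The population below (scores 1 through 100), the sample size, and the seed are all made up for illustration; the sample mean will usually differ somewhat from the population mean.

```python
import random

# Hypothetical population: scores for 100 individuals (values 1..100).
population = list(range(1, 101))

# Parameter: a number computed from every member of the population.
population_mean = sum(population) / len(population)

# Statistic: a number computed from a sample drawn from the population.
rng = random.Random(7)  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, 10)
sample_mean = sum(sample) / len(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```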
Variable
a characteristic or measurement that can be determined for each member of a population.
Mean
the arithmetic average: the sum of the values divided by the number of values.
Proportion
a part expressed as a fraction of the whole (total).
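As a small worked example of both terms, assuming a made-up sample of eight survey answers to "how many pets do you own?":

```python
# Hypothetical sample: number of pets owned by eight respondents.
pets = [0, 1, 1, 2, 0, 3, 1, 0]

# Mean: the sum of the values divided by the number of values.
mean_pets = sum(pets) / len(pets)

# Proportion: the part (respondents with at least one pet) out of the total.
owners = sum(1 for p in pets if p > 0)
proportion_owners = owners / len(pets)

print(mean_pets)          # 1.0
print(proportion_owners)  # 0.625
```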
Quantitative (or numerical) data
data that consists of numbers representing counts or measurements.
Qualitative (or Categorical) data
data that consists of names or labels that are not numbers representing counts or measurements.
Discrete data
quantitative data which results when the number of possible values is either a finite number or a countable number.
Continuous data
quantitative data which results when there are infinitely many possible values corresponding to some continuous scale that covers a range of values without gaps, interruptions, or jumps.
Pie Chart
categories of data are represented by wedges in a circle and are proportional in size to the percentage of individuals in each category.
Bar Graph
the length of the bar for each category is proportional to the number or percent of individuals in each category.
Pareto chart
consists of bars that are sorted into order by category size (largest to smallest).
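The ordering behind a Pareto chart can be sketched without any plotting. The complaint categories below are invented for illustration; only the sorting step is shown.

```python
from collections import Counter

# Hypothetical qualitative data: one complaint category per customer.
complaints = ["late", "damaged", "late", "wrong item", "late", "damaged"]

# Count each category, then list categories from largest to smallest count.
counts = Counter(complaints)
pareto_order = counts.most_common()

print(pareto_order)  # [('late', 3), ('damaged', 2), ('wrong item', 1)]
```

Drawing the bars in this order gives the Pareto layout.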
Simple random sample
A sample of n subjects selected in such a way that every possible sample of the same size n has the same chance of being chosen.
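A minimal sketch of a simple random sample, assuming a made-up population of 100 ID numbers. Python's `random.sample` draws without replacement, so every subset of size n is equally likely.

```python
import random

# Hypothetical population: ID numbers 1..100.
population = list(range(1, 101))

rng = random.Random(42)  # fixed seed only so the sketch is repeatable
n = 10

# Draw n distinct members; each possible group of n has the same chance.
srs = rng.sample(population, n)
print(srs)
```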
Systematic sample
A sample in which the researcher selects some starting point and then selects every kth element in the population.
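A systematic sample can be sketched the same way. The population, the value k = 10, and the random choice of starting point are assumptions made for this example.

```python
import random

# Hypothetical population: ID numbers 1..100, in list order.
population = list(range(1, 101))

k = 10                    # take every kth element
rng = random.Random(0)    # fixed seed only so the sketch is repeatable
start = rng.randrange(k)  # starting index chosen among the first k positions

# Slice from the starting point in steps of k.
systematic = population[start::k]
print(systematic)
```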
Stratified sample
A sample in which the researcher subdivides the population into at least two different subgroups (or strata), and then draws a sample from each subgroup.
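One way to sketch a stratified sample, assuming two invented strata and an equal number drawn from each (proportional allocation is another common choice):

```python
import random

rng = random.Random(1)  # fixed seed only so the sketch is repeatable

# Hypothetical strata: every member of the population belongs to exactly one.
strata = {
    "freshman": [f"F{i}" for i in range(40)],
    "senior": [f"S{i}" for i in range(60)],
}

per_stratum = 5  # assumed sample size taken from each stratum

# Draw a random sample from every stratum.
stratified = {
    name: rng.sample(members, per_stratum)
    for name, members in strata.items()
}
print(stratified)
```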
Cluster sample
A sample in which the researcher first divides the population into sections (or clusters), randomly selects some of those clusters, and then includes all members of the selected clusters.
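A cluster sample differs from a stratified sample in what gets randomized: the clusters are chosen at random, and everyone in a chosen cluster is included. A minimal sketch, with invented classroom clusters:

```python
import random

rng = random.Random(2)  # fixed seed only so the sketch is repeatable

# Hypothetical clusters: classrooms and the students in each.
clusters = {
    "room_A": ["A1", "A2", "A3"],
    "room_B": ["B1", "B2"],
    "room_C": ["C1", "C2", "C3", "C4"],
    "room_D": ["D1", "D2"],
}

# Randomly choose whole clusters, not individual members.
chosen = rng.sample(list(clusters), 2)

# Include every member of each chosen cluster.
cluster_sample = [m for name in chosen for m in clusters[name]]
print(chosen, cluster_sample)
```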
Convenience sample
A sample in which the researcher simply uses results that are very easy to get. This is not a valid sampling method and will likely result in biased data.
Bias
occurs when the results of the sample are not representative of the population.
Sampling bias
occurs when the technique used to select individuals for the sample tends to favor one part of the population over another.
Nonresponse bias
when individuals selected to be in the sample who do not respond to a survey have different opinions from those who do.
Response bias
when answers on a survey do not reflect the true feelings of the respondent.
Interview error
error that arises when an interviewer lacks the training needed to obtain accurate information. A trained interviewer has the skill to elicit responses and make the interviewee feel comfortable.
Misrepresented Answers
some survey questions result in responses that misrepresent facts or are flat-out lies.
Loaded Questions
The wording and presentation of questions play a large role in the type of response given. The way a question is worded can lead to response bias, so questions must always be asked in a balanced form.
Ordering of Questions/Words
Questions can be unintentionally loaded by the order of items being considered. Many surveys rearrange the order of the questions within a questionnaire so that responses are not affected by prior questions.
Data-entry error
although not technically a form of response bias, data-entry errors also lead to results that are not representative of the population.
Problems with samples
A sample must be representative of the population. A sample that is not representative of the population is biased.
Self-selected samples
Responses only by people who choose to respond, such as call-in surveys, are often unreliable.
Sample size issues
Samples that are too small may be unreliable. Larger samples are better, if possible. In some situations, small samples are unavoidable and can still be used to draw conclusions.
Undue influence
collecting data or asking questions in a way that influences the response.
Non-response or refusal of participation
When selected individuals do not respond or refuse to participate, the collected responses may no longer be representative of the population. People with strong positive or negative opinions are often the ones who answer surveys, which can skew the results.
Causality
A relationship between two variables does not mean that one causes the other to occur. They may be related (correlated) because of their relationship through a different variable.
Misleading use of data
improperly displayed graphs, incomplete data, or lack of context.
Confounding
When the effects of multiple factors on a response cannot be separated.
Frequency
The number of times a value of the data occurs
Relative frequency
The ratio (fraction or proportion) of the number of times a value of the data occurs in the set of all outcomes to the total number of outcomes.
Cumulative relative frequency
The running total of relative frequencies: the sum of the relative frequencies of the current value and all smaller values.
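The three frequency terms can be computed together from a small data set. The ten values below are made up for illustration.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: ten observations.
data = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5]
n = len(data)

freq = Counter(data)   # frequency: how many times each value occurs
values = sorted(freq)  # distinct values in increasing order

# Relative frequency: frequency divided by the total number of outcomes.
rel_freq = [freq[v] / n for v in values]

# Cumulative relative frequency: running count divided by the total.
cum_counts = list(accumulate(freq[v] for v in values))
cum_rel_freq = [c / n for c in cum_counts]

for v, r, c in zip(values, rel_freq, cum_rel_freq):
    print(v, freq[v], r, c)
```

The last cumulative relative frequency is always 1, since it accounts for every observation.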
Nominal scale level
data that cannot be ordered and cannot be used in calculations.
Ordinal scale level
data that can be ordered; the differences cannot be measured
Interval scale level
data with a definite ordering but no true zero (starting point); the differences can be measured, but ratios are not meaningful.
Ratio scale level
data that can be ordered and that have a true zero (starting point); the differences have meaning and ratios can be calculated.
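Temperature is a common illustration of the interval and ratio levels, and it is used here as an assumed example rather than one taken from these notes. Celsius has no true zero, so only differences are meaningful; Kelvin starts at absolute zero, so ratios are meaningful too.

```python
# Two temperatures in degrees Celsius (interval scale).
c_a, c_b = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = c_b - c_a  # 10.0 degrees warmer

# This ratio is NOT meaningful: 20 C is not "twice as hot" as 10 C.
naive_ratio = c_b / c_a  # 2.0

# Convert to Kelvin (ratio scale), where zero is a true starting point.
k_a, k_b = c_a + 273.15, c_b + 273.15
kelvin_ratio = k_b / k_a  # roughly 1.035

print(difference, naive_ratio, round(kelvin_ratio, 3))
```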
Explanatory variable
The variable whose effect you want to study; the independent variable.
Response variable
the variable that you suspect is affected by the other variable; the dependent variable.
Experimental Unit
a single object or individual to be measured
Placebo
A treatment that cannot influence the response variable
Double-blinded experiment
one in which both the subjects and the researchers working with them are blinded, meaning neither group knows which treatment each subject receives.
Nonsampling Error
an issue that affects the reliability of sampling data other than natural variation
Institutional Review Board
a committee tasked with oversight of research programs that involve human subjects