statistics
the science of collecting, organizing, analyzing, interpreting, and presenting data
variable
a characteristic or attribute that can assume different values
data
- values (measurements or observations) that the variables can assume
- consists of information coming from observations, counts, measurements, or responses
descriptive and inferential
two types of statistics
descriptive statistics
describes the features of a data set; does not allow us to draw any conclusions beyond the given data
inferential statistics
making inferences or conclusions about a population from sample data; determines relationships among variables and makes predictions
population (N)
entire collection of objects/outcomes about which data are collected
physical population
well defined population, often finite and available at the time the sample is collected
conceptual population
population consisting of all the values that might possibly have been observed
sample (n)
a subset of the population, containing the objects or outcomes actually observed and the resulting data
summary measure
single numeric value that describes a particular feature of the data
statistic (from sample data)
parameter (from population data)
types of summary measure
qualitative variables
variables that can be placed into distinct categories, according to some characteristic or attribute
quantitative variables
variables that can be counted or measured
dependent variable
The outcome factor; the variable that may change in response to manipulations of the independent variable.
independent variable
The experimental factor that is manipulated; the variable whose effect is being studied.
nominal scale
a scale in which objects or individuals are assigned to categories that have no numerical properties
e.g. gender, nationality, civil status, student number
ordinal scale
a scale of measurement in which the measurement categories form a rank order along a continuum
e.g. scale ratings, education level, economic status
interval scale
A quantitative measurement scale that has no "true zero," and in which the numerals represent equal intervals (distances) between levels
e.g. temperature (°C or °F), scores, time of day in hours, IQ level, calendar years
ratio scale
a quantitative scale of measurement in which the numerals have equal intervals and the value of zero truly means "nothing"; starts from absolute or true zero point
e.g. height, weight, area
primary and secondary
types of data according to nature of collection
survey study
systematic method of gathering information about the characteristics of a population
direct / interview method
researcher has direct contact with interviewee; asks questions to obtain needed information
indirect / questionnaire method
researcher distributes a questionnaire to respondents and expects them to answer the questions
registration method
method of collecting data whose recording is required by law (e.g. birth and death rates, number of registered voters)
retrospective study
studies primary data collected by another source
observational study
observes individuals and measures variables of interest but does not attempt to influence the responses
experimental design
used to find out cause and effect relationships; often used by scientists
simulation study
process of collecting data from a simulation, which is a computer model of a system that mimics its behavior
observation unit
basic unit of observation; the object on which a measurement is taken
target population
complete collection of observations we want to study
sampled population
the collection of all possible observation units that might have been chosen in a sample; the population from which the sample was taken
sampling unit
unit that can be selected for a sample
sampling frame
a list of sampling units in the population from which a sample may be selected
selection bias
occurs when some part of the target population is not in the sampled population; when some population units are sampled at a different rate than intended by the investigator
measurement error
when a response in the survey differs from the true value; happens when people lie, forget, give different answers, try to impress the interviewer, or misinterpret the question
sampling error
error that results from taking one sample instead of examining the whole population
non-sampling error
errors that cannot be attributed to sample-to-sample variability; examples include selection bias and measurement error
simple random sample
systematic random sample
stratified random sample
cluster sample
types of simple probability samples
simple random sampling
every member/unit of the population has an equal probability of being selected for the sample
simple random sampling with replacement
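one unit is randomly selected from the population to be the first sampled unit, with probability 1/N; the unit is then returned to the population and the selection is repeated until the sample has n units, so the same unit can appear more than once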
simple random sampling without replacement
every possible subset of n distinct units in the population has the same probability of being selected as the sample; each individual unit is included in the sample with probability n/N
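The two simple random sampling schemes above can be sketched with Python's standard `random` module. This is only an illustration: the function names and the `seed` parameter are assumptions made for the example, not part of the original notes.

```python
import random

def srs_with_replacement(units, n, seed=None):
    """Draw n units; each draw picks any unit with probability 1/N,
    so the same unit may be selected more than once."""
    rng = random.Random(seed)
    return [rng.choice(units) for _ in range(n)]

def srs_without_replacement(units, n, seed=None):
    """Draw n distinct units; every subset of size n is equally likely."""
    rng = random.Random(seed)
    return rng.sample(units, n)

population = list(range(1, 21))  # hypothetical population, N = 20
print(srs_with_replacement(population, 5, seed=1))
print(srs_without_replacement(population, 5, seed=1))
```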
Yamane or Slovin's formula
use this formula with simple random sampling when the population size is known and finite
n = N / (1 + Ne^2)
n - sample size; N - population size; e - margin of error (written as a decimal, e.g. 0.05 for 5%)
Yamane or Slovin's formula
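As a worked example of the Yamane/Slovin formula n = N / (1 + Ne^2), here is a minimal sketch. The function name is hypothetical, and rounding the result up to the next whole unit is an assumption, since the notes do not state a rounding rule for this formula.

```python
import math

def slovin_sample_size(population_size, margin_of_error):
    """n = N / (1 + N * e^2), with e given as a decimal (0.05 for 5%)."""
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return math.ceil(n)  # assumption: round up to a whole unit

# N = 1000, e = 0.05: 1000 / (1 + 1000 * 0.0025) = 1000 / 3.5 = 285.71...
print(slovin_sample_size(1000, 0.05))  # 286
```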
Cochran's Sample Size Formula
use this formula with simple random sampling when the population size is unknown or effectively infinite
n = p(1-p)z^2 / e^2
n - sample size, p - estimated population proportion, e - acceptable margin of error (as a decimal), z - z-score for the chosen confidence level
Cochran's Sample Size Formula
90%: z = 1.645
95%: z = 1.96
99%: z = 2.576
z-score at confidence level
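Cochran's formula and the z-scores listed above can be combined in a short sketch. The function name and the lookup table are illustrative assumptions, and the result is rounded up on the assumption that a fractional respondent is not possible.

```python
import math

# z-scores for the confidence levels listed in the notes
Z_SCORES = {90: 1.645, 95: 1.96, 99: 2.576}

def cochran_sample_size(p, e, confidence=95):
    """n = p(1 - p) z^2 / e^2, with p and e given as decimals."""
    z = Z_SCORES[confidence]
    return math.ceil(p * (1 - p) * z ** 2 / e ** 2)

# p = 0.5, e = 0.05, 95% confidence: 0.25 * 3.8416 / 0.0025 = 384.16
print(cochran_sample_size(0.5, 0.05, 95))  # 385
```

Using p = 0.5 gives the largest value of p(1 - p), so it is a common conservative choice when the proportion is not known in advance.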
systematic sampling
sampling technique: select some starting point and then select every kth element in the population
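A minimal sketch of systematic sampling follows. The notes do not say how k is chosen; taking k = N // n and a random start within the first k elements is a common convention, assumed here for illustration. The function name is hypothetical.

```python
import random

def systematic_sample(units, n, seed=None):
    """Pick a random start among the first k units, then every kth unit."""
    k = len(units) // n          # sampling interval (assumed: k = N // n)
    rng = random.Random(seed)
    start = rng.randrange(k)     # random starting index in 0..k-1
    return [units[start + i * k] for i in range(n)]

population = list(range(100))    # hypothetical ordered population, N = 100
print(systematic_sample(population, 10, seed=3))
```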
stratified random sampling
a random sampling technique in which the researcher identifies particular demographic categories of interest (called "strata") and then randomly selects individuals within each category.
n_stratum = n * (stratum size / total population size)
n - computed overall sample size
(always round up each stratum's sample size)
stratified random sampling formula
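The stratified allocation formula can be illustrated with a short sketch. The stratum names and sizes below are made-up example values, and the function name is hypothetical.

```python
import math

def proportional_allocation(stratum_sizes, n):
    """Allocate an overall sample size n across strata in proportion
    to stratum size, rounding each stratum's share up."""
    total = sum(stratum_sizes.values())
    return {name: math.ceil(n * size / total)
            for name, size in stratum_sizes.items()}

strata = {"A": 500, "B": 300, "C": 200}   # hypothetical strata, N = 1000
print(proportional_allocation(strata, 286))
# {'A': 143, 'B': 86, 'C': 58}
```

Because each stratum is rounded up, the allocated sizes can add up to slightly more than n (287 here instead of 286).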
cluster sampling
clusters of participants within the population of interest are selected at random, followed by data collection from all individuals in each cluster.
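Cluster sampling as described above can be sketched as follows: whole clusters are chosen at random, and every member of each chosen cluster is included. The cluster names and members are hypothetical example data.

```python
import random

def cluster_sample(clusters, num_clusters, seed=None):
    """Randomly choose whole clusters and return all of their members."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), num_clusters)
    return [member for name in chosen for member in clusters[name]]

# hypothetical clusters, e.g. class sections and their students
sections = {"S1": ["Ana", "Ben"], "S2": ["Cy", "Dee"], "S3": ["Eli", "Fay"]}
print(cluster_sample(sections, 2, seed=0))
```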
non-probability sampling
method of selecting sampling units from a target population using a subjective or non-random method
convenience sampling
choosing individuals who are easiest to reach (e.g. interviewing people passing by)
purposive sampling
researchers deliberately choose qualified participants to take part in the study
quota sampling
based on certain quotas or predetermined criteria; forces the inclusion of members of different subpopulations
snowball sampling
selection of participants through referrals from earlier participants; used if the population of interest is hard to find (people with certain disabilities, victims of specific crimes, drug users)
30 to 500
30 for each category
preferred sample size for most research
primary data
data collected specifically for a desired analysis (e.g. surveys)
secondary data
data that have already been collected and are available for statistical analysis
raw data
information obtained by observing values of a variable
discrete data
data obtained by counting; values of a quantitative variable that can take only separate, countable values (e.g. number of students)
continuous data
data obtained by measuring; values of a quantitative variable that can take any value within an interval (e.g. height, weight)
ungrouped data
data that are not organized, or that are only arranged in numerical order
grouped data
data that are organized and arranged into different classes or categories
tabular, textual, graphical
methods of presenting data
frequency distribution
organization of raw data in table form, using classes and frequencies
frequency distribution table (FDT)
A statistical table showing the frequency or number of observations contained in each of the defined classes or categories
relative frequency
obtained by dividing a class frequency by the sum of all frequencies; often expressed as a percentage
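A small sketch of a frequency distribution with relative frequencies, using `collections.Counter` from the standard library. The data values and the function name are made up for illustration.

```python
from collections import Counter

def frequency_table(values):
    """Return {category: (frequency, relative frequency in percent)}."""
    counts = Counter(values)
    total = sum(counts.values())
    return {cat: (freq, 100 * freq / total) for cat, freq in counts.items()}

responses = ["yes", "yes", "no", "undecided"]  # hypothetical raw data
for category, (freq, pct) in frequency_table(responses).items():
    print(category, freq, f"{pct}%")
```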
class limit
endpoints of a class interval
class boundaries
the numbers used to separate the classes so that there are no gaps in the frequency distribution; end in .5 when the class limits are whole numbers
lower limit - 0.5
lower boundary
upper limit + 0.5
upper boundary
class width (i)
difference between the upper and lower boundaries of any class
class mark
midpoint of a class interval
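The class boundary, class width, and class mark definitions can be checked with a short sketch. It assumes whole-number class limits, so that the 0.5 adjustment applies; the function name is hypothetical.

```python
def class_summary(lower_limit, upper_limit):
    """Boundaries, width, and mark for one class with whole-number limits."""
    lower_boundary = lower_limit - 0.5
    upper_boundary = upper_limit + 0.5
    return {
        "boundaries": (lower_boundary, upper_boundary),
        "width": upper_boundary - lower_boundary,   # class width (i)
        "mark": (lower_limit + upper_limit) / 2,    # midpoint of the class
    }

# class 10-19: boundaries 9.5 and 19.5, width 10, mark 14.5
print(class_summary(10, 19))
```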