statistics
statistics are mathematical techniques used by social scientists to analyze data in order to answer questions and test theories
variable
any trait that can change values from case to case and can be measured or categorized.
hypothesis
a statement about the relationship between variables that is specific and exact
a variable is said to be discrete if…plus example
it has a basic unit of measurement that cannot be subdivided…example—number of people per household
a variable is said to be continuous if….plus example
it has scores that can be subdivided infinitely (at least theoretically), such as time
examples of nominal levels of measurement:
zip code, political party affiliation, religious preference, and gender
examples of ordinal levels of measurement
socioeconomic class, pain scale, satisfaction level—the scores have no absolute or objective meaning and the distance between scores cannot be described in precise terms
interval-ratio level
actual numbers that can be analyzed with all possible statistical techniques—add or multiply scores, they have equal intervals and true zero points—number of children, life expectancy, years married, number of siblings, years of education, income
measures of association
statistics that summarize the strength and direction of the relationship between variables
descriptive statistics
the branch of statistics concerned with summarizing the distribution of a single variable or measuring the relationship between two or more variables
discrete variable
a variable with a basic unit of measurement that cannot be subdivided
continuous variable
a variable with a unit of measurement that can be subdivided infinitely
inferential statistics
the branch of statistics concerned with making generalizations from samples to populations
theory
a generalized explanation of the relationship between two or more variables
cumulative frequency
an optional column in a frequency distribution that displays the number of cases within an interval and all preceding intervals
frequency distribution
a table that displays the number of cases in each category of a variable
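A minimal sketch of a frequency distribution with a cumulative frequency column. The household sizes are made-up illustration data, not from the course.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical data: number of people per household (a discrete variable)
household_sizes = [1, 2, 2, 3, 3, 3, 4]

freq = Counter(household_sizes)            # cases per category
categories = sorted(freq)                  # categories in ascending order
counts = [freq[c] for c in categories]     # frequency column
cumulative = list(accumulate(counts))      # this category plus all preceding ones

for category, f, cf in zip(categories, counts, cumulative):
    print(f"{category}\t{f}\t{cf}")
```

The last cumulative frequency always equals the total number of cases.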
inference
conclusion based on a data point
statistic
data point derived from evidence
nominal and ordinal levels of measurement are also known as…
grouped/discrete levels of measurement
interval/ratio levels of measurement are also known as
continuous
sex, race, religion, are examples of ____ levels of measurement (classify not order)
nominal
social class, attitude, and opinion are examples of _____ level of measurement
ordinal—classify into categories and rank them
age, number of children, income are all examples of ____ level of measurement
interval-ratio: the spacing between numbers is quantifiable, with equal units
three commonly used measures of central tendency:
mode (most common score), median (the score of the middle case), mean (average score)
what is the formula to calculate the mean?
summation of scores/the number of cases
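A small sketch of the three measures of central tendency, using Python's standard `statistics` module. The scores are invented for illustration.

```python
from statistics import mean, median, mode

scores = [2, 3, 3, 5, 7]  # hypothetical scores

# Mean by the formula: summation of scores / number of cases
average = sum(scores) / len(scores)

middle = median(scores)      # score of the middle case
most_common = mode(scores)   # most common score

print(average, mean(scores), middle, most_common)
```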
measures of variation describe…
the variety, diversity, or heterogeneity of a distribution of scores
interquartile range (Q) is
the range of the middle 50% of cases in a distribution; the difference between the third and first quartile
deviation
the difference between the score and the mean
standard deviation
a statistic that quantifies the amount of variation around the mean
variance
the square of the standard deviation
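A sketch tying deviation, variance, and standard deviation together. It assumes the formula that divides by N; some textbooks divide by N - 1 for samples, so check which one your course uses. The scores are made up.

```python
from math import sqrt

scores = [2, 4, 4, 6, 9]  # hypothetical scores
n = len(scores)
mean = sum(scores) / n

# Deviation: the difference between each score and the mean
deviations = [x - mean for x in scores]
sum_dev = sum(deviations)  # deviations cancel out around the mean

# Variance: average squared deviation (dividing by N is an assumption here)
variance = sum(d ** 2 for d in deviations) / n

# Standard deviation: square root of the variance
std_dev = sqrt(variance)

print(deviations, sum_dev, variance, std_dev)
```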
variation
the amount of variety, or heterogeneity, in a distribution of scores
characteristics of the mean
all scores cancel out around the mean (sum of deviations from the mean is zero), the mean is the point of minimized variance (least squares principle), the mean uses all the scores (the mean is affected by extreme values—outliers)
strength and weaknesses of the mean:
strength: the mean uses all the available information from the variable. weaknesses: the mean is affected by every score. If there are some very high or low scores, the mean may be misleading.
for a positive skew, the mean will be ____ than the median
greater than
for a negative skew, the mean will be ___ than the median
less than
the range
high score minus low score. Quick and easy indication of variability. Limitations: it is based on only two scores, so it is distorted by atypically high or low scores and gives no information about variation between the high and low scores.
percentiles
let you know where you lie in relation to others; statistics that indicate where an observation lies relative to others
the median is the ___ percentile
50th percentile
Interquartile range (Q)
avoids some problems of R by focusing on the middle 50 percent of scores
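A sketch of why Q is less sensitive than R to one extreme score. It uses `statistics.quantiles` with its default method; textbooks locate quartiles in slightly different ways, so hand calculations may not match exactly. The scores are invented.

```python
from statistics import quantiles

def score_range(scores):
    """R: high score minus low score."""
    return max(scores) - min(scores)

def interquartile_range(scores):
    """Q: third quartile minus first quartile (default quantile method)."""
    q1, _, q3 = quantiles(scores, n=4)
    return q3 - q1

scores = [1, 2, 3, 4, 5, 6, 7, 8]          # hypothetical scores
with_outlier = [1, 2, 3, 4, 5, 6, 7, 80]   # same, but one atypically high score

r, q = score_range(scores), interquartile_range(scores)
r_out, q_out = score_range(with_outlier), interquartile_range(with_outlier)

print(r, q)
print(r_out, q_out)
```

In this example the outlier changes R a great deal while Q stays the same, because Q only looks at the middle 50 percent of cases.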
normal curve
a theoretical distribution of scores that is symmetrical, unimodal, and bell-shaped. The standard normal curve always has a mean of 0 and a standard deviation of 1.
normal curve table
a detailed description of the area between a Z score and the mean of any standardized normal distribution
Z scores
standard scores; the way scores are expressed after they have been standardized to the theoretical normal curve
What are characteristics of the normal curve?
bell-shaped, unimodal, symmetrical, unskewed, mode = median = mean, follows the empirical rule
the empirical rule
68-95-99 rule: about 68%, 95%, and 99.7% of cases fall within 1, 2, and 3 standard deviations of the mean (you typically round up at 0.5, but you usually don’t round a probability up to 1)
z=
(x-x̄)/s or (raw-mean)/standard deviation
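A sketch of the Z score formula, plus the kind of area a normal curve table reports (the area between a Z score and the mean). The raw score, mean, and standard deviation are made-up numbers; `NormalDist` is from Python's standard `statistics` module.

```python
from statistics import NormalDist

def z_score(raw, mean, std_dev):
    """Standardize a raw score: (raw - mean) / standard deviation."""
    return (raw - mean) / std_dev

# Hypothetical test: mean 70, standard deviation 10, raw score 85
z = z_score(85, 70, 10)

# Area between the mean and z on the standard normal curve
area_mean_to_z = NormalDist().cdf(z) - 0.5

print(z, round(area_mean_to_z, 4))
```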
Central Limit Theorem
a theorem that specifies the mean, standard deviation, and the shape of the sampling distribution, given that the sample is large
Cluster sampling
a method of sampling by which geographical units are randomly selected and all cases within each selected unit are tested
EPSEM
the equal probability of selection method for selecting samples. Every element or case in the population must have an equal probability of selection for the sample.
nonprobability sample
any sample that does not meet the EPSEM criterion
parameter
a characteristic of a population
representative sample
a sample that reproduces the major characteristics of the population from which it was drawn
sampling distribution
the distribution of a statistic for all possible sample outcomes of a certain size. Under conditions specified in two theorems, the sampling distribution will be normal in shape, with a mean equal to the population value and a standard deviation equal to the population standard deviation divided by the square root of N
sampling frame
a list of all the cases in a population. The list can be numbered and used to draw a simple random sample, a systematic random sample, or a stratified random sample
simple random sample
a method for choosing cases from a population by which every case and every combination of cases has an equal chance of being included
standard error of the mean
the standard deviation of a sampling distribution of sample means
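A sketch comparing the formula for the standard error (population standard deviation divided by the square root of N) with a small simulation of a sampling distribution. The population values (mean 100, standard deviation 15), the sample size, and the number of samples are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def standard_error(pop_sd, n):
    """Population standard deviation divided by the square root of N."""
    return pop_sd / sqrt(n)

rng = random.Random(42)           # fixed seed so the run is repeatable
POP_MEAN, POP_SD, N = 100, 15, 25  # hypothetical population and sample size

# Draw many samples of size N and keep each sample's mean
sample_means = [
    mean([rng.gauss(POP_MEAN, POP_SD) for _ in range(N)])
    for _ in range(2000)
]

theoretical = standard_error(POP_SD, N)  # from the formula
observed = pstdev(sample_means)          # spread of the simulated sample means

print(theoretical, round(observed, 2), round(mean(sample_means), 2))
```

The simulated sample means should center near the population mean, and their standard deviation should land close to the formula's value.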
stratified sample
a method of sampling by which cases are selected from sublists of the population
systematic sampling
a method of sampling by which the first case from a list of the population is randomly selected. Thereafter, every kth case is selected
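A sketch of systematic sampling from a sampling frame. It assumes the common version in which the random starting case is chosen from the first k cases on the list; the function name and the frame of 100 case numbers are hypothetical.

```python
import random

def systematic_sample(frame, k, rng):
    """Pick a random start among the first k cases, then take every kth case."""
    start = rng.randrange(k)
    return frame[start::k]

frame = list(range(1, 101))  # hypothetical sampling frame: cases numbered 1-100
sample = systematic_sample(frame, 10, random.Random(0))

print(sample)
```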
the central problem of statistics
We want to know about a population. All we’ve got is one sample. We must use that sample to make inferences about the population.
Point estimate
Use the sample statistic as your best guess about the most likely value of the population parameter
Interval estimate/confidence interval
use the sample statistic and other information from the sample and some choices to make an educated guess about the likely range of values of a population parameter
alpha level
alpha levels reflect your willingness to be wrong, i.e., to have the interval “miss” the true population value (the true population value falls outside the interval). Every alpha level has a corresponding z-score, sometimes called a critical value.
smaller alpha level
more likely to be right, but interval is wider, so your estimate is less informative
larger alpha level
less likely to be right but interval is narrower, so your estimate is more informative
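A sketch of how the alpha level sets the critical value and the width of a confidence interval for a mean. It assumes a two-sided interval built from a Z score (large sample), and the sample mean, standard deviation, and N are made-up numbers. `NormalDist.inv_cdf` is from Python's standard `statistics` module.

```python
from math import sqrt
from statistics import NormalDist

def critical_z(alpha):
    """Z score that leaves alpha/2 in each tail of the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha / 2)

def confidence_interval(sample_mean, std_dev, n, alpha):
    """Interval estimate: sample mean plus or minus Z * (standard deviation / sqrt(N))."""
    margin = critical_z(alpha) * std_dev / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Hypothetical sample: mean 50, standard deviation 10, N = 100
low_95, high_95 = confidence_interval(50, 10, 100, alpha=0.05)
low_99, high_99 = confidence_interval(50, 10, 100, alpha=0.01)

print(round(critical_z(0.05), 2), round(low_95, 2), round(high_95, 2))
print(round(critical_z(0.01), 2), round(low_99, 2), round(high_99, 2))
```

The smaller alpha level (0.01) gives the wider interval, matching the trade-off on the cards above.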