2/4
Level of measurement
Nominal - (lowest level of measurement) asking questions whose answers cannot be ranked from low to high, only named
It must be made into new variables
ex - color, gender, marital status, race, college major
Always discrete
Some are dichotomies
Ordinal variable - can rank it from low to high, but rank only, no other information
ex. - the number is a rank code, not an actual count, like 3 meaning "a couple times a month" instead of 3 being the actual number
Always discrete
Can be ranked or ordered
Interval-ratio - rank plus more: true 0 and knowable interval width, so the distance between answers is known
True zero point means we have an actual number
Could be continuous or discrete
The theory and logic of probability sampling
General term for samples selected in accord with probability theory
Often used for large scale surveys
If all members of a population were identical in all respects, there would be no need for careful sampling procedures; this is rarely the case
Sample of individuals from a population must contain the same variations that exist in the population
Population and sampling frames
Sampling frame is the list that identifies the members of the population,
like ID numbers for all USC students, driver's licenses, telephone numbers
List of elements to closely approximate population
Population - pool from which sample is selected (targeted)
All married people in the US
All people born during the depression
Populations are abstract, need to estimate them
Population parameters
Summary description of a given variable in a population
Statistic - the summary description of a variable in sample, used to estimate a population parameter
Importance of randomization
Essential for relying on probability: each sampling element has an equal chance of being selected
Increases representativeness of the sample
Also allows for sampling error to be calculated
Gap between sample results and pop. parameter
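Rough sketch of parameter vs. statistic vs. sampling error. The population here is made-up data (random ages), and the sample size of 100 and the seed are arbitrary choices for illustration, not from lecture.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ages (made-up data, illustration only)
population = [rng.randint(18, 90) for _ in range(10_000)]
parameter = statistics.mean(population)  # population parameter

# Random selection: every element has an equal chance of being picked
sample = rng.sample(population, 100)
statistic = statistics.mean(sample)      # sample statistic

# Sampling error = gap between the sample result and the population parameter
sampling_error = statistic - parameter
print(f"parameter={parameter:.2f} statistic={statistic:.2f} error={sampling_error:+.2f}")
```

In real research the parameter is usually unknown, so the error is estimated rather than computed directly like this.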
Types of randomized sampling methods
Simple random
Systematic
Stratified
Cluster (will have the most error; will be on the quiz)
All of these samples will have errors, although clusters will have the most
Simple random sample
Select cases randomly from sampling frame
Randomly selected w no pattern
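A minimal sketch of a simple random sample, assuming the sampling frame is just a list of IDs. The frame, the function name, and the seed are hypothetical.

```python
import random

def simple_random_sample(frame, n, seed=None):
    """Pick n cases from the sampling frame, no pattern, no repeats."""
    rng = random.Random(seed)
    return rng.sample(frame, n)  # each element has an equal chance

# Hypothetical sampling frame: 500 student ID numbers
frame = [f"ID{i:04d}" for i in range(1, 501)]
sample = simple_random_sample(frame, 10, seed=1)
print(sample)
```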
Systematic sample
The first case is selected randomly, and then the others after it are selected at a fixed interval
Example: every 10th person from a directory
Yields about the same results as simple random
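Sketch of the every-10th-person idea, assuming the directory is an ordered list. The random start falls somewhere in the first interval, and everything after that follows the fixed interval k. Names and data are hypothetical.

```python
import random

def systematic_sample(frame, k, seed=None):
    """Random start within the first k cases, then every k-th case."""
    rng = random.Random(seed)
    start = rng.randrange(k)  # the only random step
    return frame[start::k]

# Hypothetical directory of 100 people, numbered 1..100
directory = list(range(1, 101))
sample = systematic_sample(directory, 10, seed=2)
print(sample)
```

Only the starting point is random, so a directory with a repeating pattern that lines up with k could bias the sample.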
Stratified sample
Used for race to be able to get a little piece of every race
Divide the population into subpopulation/ strata
Strata usually based on some important characteristic
Randomly select cases from each stratum
Good for including proportionally small groups
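Sketch of stratified sampling. This version assumes a fixed number of cases is drawn from every stratum, which guarantees small groups show up; drawing in proportion to stratum size is another common option. The population, group labels, and function name are made up.

```python
import random
from collections import Counter, defaultdict

def stratified_sample(population, key, per_stratum, seed=None):
    """Split the population into strata, then randomly sample within each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for person in population:
        strata[key(person)].append(person)
    sample = []
    for name in sorted(strata):  # sorted so results are repeatable
        members = strata[name]
        sample.extend(rng.sample(members, min(per_stratum, len(members))))
    return sample

# Hypothetical population: group C is proportionally tiny (10 of 1,000)
population = (
    [("A", i) for i in range(900)]
    + [("B", i) for i in range(90)]
    + [("C", i) for i in range(10)]
)
sample = stratified_sample(population, key=lambda p: p[0], per_stratum=5, seed=3)
counts = Counter(group for group, _ in sample)
print(counts)
```

A simple random sample of 15 from this population could easily miss group C entirely; the stratified draw cannot.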
Cluster sample
More geographic
City block, county
Randomly select a cluster, then you randomly select from within that cluster
Test the elements within the cluster, which keeps introducing more rounds
Always has at least 1 more round of sampling
This introduces another round of error
Less accurate than simple random sampling and other forms because it introduces error in every round it undergoes
More clusters are better
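Sketch of a two-round cluster sample, assuming clusters are city blocks and elements are households. Block and household names are made up, as is the function name.

```python
import random

def cluster_sample(clusters, n_clusters, n_per_cluster, seed=None):
    """Round 1: randomly pick clusters. Round 2: randomly pick within each."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)  # round 1
    sample = []
    for name in chosen:
        members = clusters[name]
        # round 2: a second random draw, so a second source of sampling error
        sample.extend(rng.sample(members, min(n_per_cluster, len(members))))
    return chosen, sample

# Hypothetical frame: 10 city blocks with 20 households each
clusters = {
    f"block{b}": [f"block{b}-house{h}" for h in range(1, 21)]
    for b in range(1, 11)
}
chosen, sample = cluster_sample(clusters, n_clusters=3, n_per_cluster=5, seed=3)
print(chosen)
print(sample)
```

Raising n_clusters (and lowering n_per_cluster to keep the total the same) is the "more clusters are better" idea.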
Sample sizes
The larger the population is, the smaller the sampling ratio needs to be
Larger samples are important if
More accuracy is needed
Population is more heterogeneous
More variables will be modeled
Subgroups will be analyzed
draw inferences
One of the key functions of sampling
Smaller sampling error when sample is
Larger
More homogeneous
A larger sample size leads to a smaller sampling error; it narrows the confidence interval
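Sketch of why larger and more homogeneous samples give smaller sampling error. It assumes the usual standard error of the mean (sd / sqrt(n)) and a 95% interval using 1.96 from the normal curve. The mean of 50 and the sd values are made-up numbers.

```python
import math

def standard_error(sd, n):
    """Standard error of the mean: shrinks as n grows or as sd shrinks."""
    return sd / math.sqrt(n)

def ci_95(mean, sd, n):
    """Approximate 95% confidence interval (normal approximation, assumed)."""
    margin = 1.96 * standard_error(sd, n)
    return mean - margin, mean + margin

# Larger sample -> narrower interval
for n in (100, 400, 1600):
    low, high = ci_95(50.0, 10.0, n)
    print(f"n={n:5d}  sd=10  width={high - low:.2f}")

# More homogeneous sample (smaller sd) -> narrower interval at the same n
low, high = ci_95(50.0, 5.0, 100)
print(f"n=  100  sd=5   width={high - low:.2f}")
```

Quadrupling the sample size only halves the interval width, because n sits under a square root.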
Central tendency, levels of measurements are about variables themselves
Mode - least information given, provides the most common response, goes with nominal variables
Median - the middle score, for ordinal variables
Mean - for interval-ratio variables, the value of every single score is used to calculate it
Mode
Good for quick info
For nominal
Median
Exact midpoint of the scores; with an even number of cases, the midpoint of the 2 middle cases
Mean
Typical score
Be aware of:
Numerical middle, meaning the values of all the scores balance around the mean
The mean is affected by every score
You anticipate additional statistical analyses
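Quick sketch matching each measure of central tendency to its level of measurement, using Python's statistics module. All three data sets are made up.

```python
import statistics

majors = ["soc", "psych", "soc", "bio", "soc"]  # nominal
ratings = [1, 2, 2, 3, 4, 5]                    # ordinal rank codes, even number of cases
incomes = [30, 35, 40, 45, 200]                 # interval-ratio (in thousands)

mode_major = statistics.mode(majors)        # most common response: "soc"
median_rating = statistics.median(ratings)  # midpoint of the 2 middle cases (2 and 3): 2.5
mean_income = statistics.mean(incomes)      # uses every score: 70
median_income = statistics.median(incomes)  # middle score: 40

print(mode_major, median_rating, mean_income, median_income)
```

The one income of 200 pulls the mean up to 70 while the median stays at 40, which is the "mean is affected by every score" point.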
Dispersion : standard deviation
Tells us that the mean can't
The smaller the standard deviation, the more similar the scores will be
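Sketch of two made-up sets of scores with the same mean but very different spread. It uses the population standard deviation (pstdev); the sample version (stdev) divides by n - 1 instead of n.

```python
import statistics

tight = [48, 49, 50, 51, 52]   # scores close together
spread = [10, 30, 50, 70, 90]  # scores far apart

# Same mean for both sets, so the mean alone cannot tell them apart
print(statistics.mean(tight), statistics.mean(spread))

# Standard deviation shows the difference in dispersion
sd_tight = statistics.pstdev(tight)    # about 1.41
sd_spread = statistics.pstdev(spread)  # about 28.28
print(round(sd_tight, 2), round(sd_spread, 2))
```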
Levels of Measurement:
Nominal: Lowest level, categories that cannot be ranked (e.g. color, gender). Discrete variables only.
Ordinal: Variables that can be ranked but don't specify the magnitude of difference (e.g. satisfaction ratings). Discrete and ordered.
Interval-Ratio: Incorporates ranking, true zero, and knowable intervals; can be continuous or discrete and provides comprehensive information between values.
Probability Sampling:
A method where samples are selected according to probability theory, essential for large-scale surveys.
Aims for a sample that contains the same variations as the population. Randomization is crucial for representativeness and allows for sampling error calculation.
Sampling Techniques:
Simple Random Sampling: Random selection from a sampling frame, equal chance for all.
Systematic Sampling: Begins with a random selection, followed by sampling at regular intervals (e.g., every 10th person).
Stratified Sampling: Divides the population into strata based on characteristics, ensuring small groups are represented.
Cluster Sampling: Randomly selects clusters (e.g., geographic areas) and samples within them, leading to higher error due to multiple rounds of selection.
Importance of Sample Size:
Larger samples reduce sampling error and increase representativeness, essential for heterogeneous populations.