tamu-alwood
statistics
a set of tools and techniques used for describing, organizing, and interpreting information or data
three basic goals of science
description, prediction, explanation
description
how people behave
rule-based vs. rule-free play behavior among children
prediction
identifying the factors that influence behavior
when a group has more boys than girls, rule-based play is more likely
explanation
identifying the underlying causes of a behavior
e.g., boys may be more competitive, and rule-based games allow for more competition
how does statistics help science?
helps scientists accomplish the three goals
descriptive statistics
used to organize and describe data (counts, means, percentages)
inferential statistics
next step after descriptive
used to make inferences about a larger group from a smaller group
allow you to infer the truth about the larger group based on information you gather from a smaller group of people
sample
the group you are actually collecting data from
(smaller group/subset of the larger group you are interested in)
population
the group you are actually interested in drawing conclusions about
variable
something that can vary or change or have different values for different individuals
data
information collected from the sample on the variables we are interested in
the actual numbers, measurements, or characteristics that represent the ideas we are interested in
continuous data
data measured on a continuum
all numbers between two endpoints are possible scores
categorical data
data that sorts people into categories (only so many options for the variables)
central tendency
a single number that represents a group of scores
mean, median, mode
mean
the average, sensitive to extreme scores
median
the midpoint in a set of scores; the point at which half the scores are bigger and half are smaller; not influenced by extreme scores
mode
the value that occurs most frequently in the data set; used for categorical data (the mode is the category itself, not its count)
bimodal
when there is more than one mode
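A quick sketch of the three measures of central tendency using Python's statistics module; the scores are made up, with one extreme value (30) to show how the mean reacts:

```python
import statistics

scores = [2, 3, 3, 4, 5, 5, 5, 30]

print(statistics.mean(scores))    # 7.125 -- pulled upward by the extreme score
print(statistics.median(scores))  # 4.5   -- midpoint, not influenced by how large 30 is
print(statistics.mode(scores))    # 5     -- the most frequently occurring value
```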
variability
how different scores are from each other
represent the spread or dispersion in the dataset
helps us to understand the nature of our sample and the nature of our variables
range, standard deviation, variance
range
how far apart the scores are from each other; subtract the lowest score from the highest score; considers only the most extreme values
standard deviation
average amount of variability in a set of data
variance
the standard deviation squared; if you know the SD, you know this; rarely reported as a descriptive statistic
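A minimal sketch of the three variability measures on made-up scores, again using Python's statistics module:

```python
import statistics

scores = [4, 6, 7, 8, 10]

data_range = max(scores) - min(scores)    # 10 - 4 = 6; uses only the two extremes
sd = statistics.stdev(scores)             # sample standard deviation, about 2.24
variance = statistics.variance(scores)    # the SD squared, 5.0
print(data_range, sd, variance)
```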
how many standard deviations away from the mean is a potential outlier?
anything more than 2 (at 3 or more, it is likely an outlier)
histogram
allows us to see the distribution of our data
- the height of each bar is the number of times each value occurs in our data set
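A short sketch of drawing a histogram, assuming matplotlib is available; the scores are hypothetical:

```python
import matplotlib.pyplot as plt

scores = [2, 3, 3, 4, 4, 4, 5, 5, 6, 7]

plt.hist(scores, bins=6)        # bar height = how many scores fall in each bin
plt.xlabel("score")
plt.ylabel("frequency")
plt.show()
```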
skewness
lack of symmetry
kurtosis
peaked vs flat distribution
platykurtic
low kurtosis, flat so more variability
leptokurtic
high kurtosis, peaked so less variability
bar graph
show frequency of categorical responses
correlations
how do changes in the value of one variable relate to changes in the value of another variable
- compute when we have scores on two variables
scatterplot
plots one variable on the x-axis and the other on the y-axis
- useful for looking at relationship between two variables
correlation coefficients
assign a single number to describe the relationship between two variables
direction
positive or negative
- the sign
- a positive sign indicates a positive relationship (scores on the variables move in the same direction)
- a negative sign indicates a negative relationship (variables move in opposite directions)
strength
magnitude of the coefficient (how far it is from zero)
- if r is 0, there is no relationship; if r is +1 or -1, there is a perfect relationship
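A sketch of computing a Pearson correlation coefficient with numpy; the variable names and values are made up:

```python
import numpy as np

hours_studied = np.array([1, 2, 3, 4, 5, 6])
exam_score    = np.array([55, 60, 62, 70, 71, 80])

r = np.corrcoef(hours_studied, exam_score)[0, 1]
print(round(r, 2))  # close to +1: a strong positive relationship (same direction)
```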
limitations of correlation coefficients
can only be used to identify linear relationships
- restriction of range: occurs when most subjects have similar scores on one of the variables being correlated
- outliers have a big influence on correlation coefficients
correlation matrix
simple way to report a bunch of correlations at one time
coefficient of determination
r squared; represents how much variance two variables share
coefficient of alienation
the variance left over, i.e., the variance in one variable not shared with the other
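A worked example, assuming a hypothetical correlation of r = .70 between two variables (the leftover share below follows the card's "variance left over" definition):

```python
r = 0.70
shared = r ** 2          # coefficient of determination: about .49, i.e., 49% shared variance
left_over = 1 - shared   # about .51, the variance the two variables do not share
print(round(shared, 2), round(left_over, 2))
```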
reverse causation
the causal direction may be opposite from what has been hypothesized
reciprocal causation
two variables cause each other
measurement
the act or process of assigning numbers to phenomena according to a rule
measurement scales
the type of scale we use has important implications for what kind of statistics we use
nominal scales
measures that split people into categories
- categories must be mutually exclusive
- data is in the form of counts/percentages
- mnemonic: nominal = namable
ordinal scales
the number is a ranking
- mnemonic: ordinal = ordering
- class ranking, candidates for a job, top 10 lists
- not clear how much distance separates the data points on the scale
interval scales
ordering events with equal spacing (based on an underlying continuum)
- most common type of scale in psychology
- equal intervals
ratio scale
- similar to an interval scale, but zero has a specific meaning (a true zero)
- mnemonic: ratio = zero
reliability
how do we know that the measure we use works consistently?
validity
how do we know that the measure we use measures what it is supposed to?
observed score
the score you actually got
true score
true reflection of what you really know
error score
the discrepancy between the observed score and the true score
test retest reliability
does the same person get a similar score when they complete the measure at two different time points
parallel forms reliability
are different forms of the measure equivalent
inter-item reliability
also called internal consistency reliability
- how similar are a person's answers to different items meant to measure the same thing?
inter-rater reliability
how consistent are the observations made by two or more people
Cronbach's alpha
single number that reflects the degree of internal consistency of the items on a scale
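A minimal sketch of computing Cronbach's alpha for a small made-up 3-item scale, assuming numpy is available; the responses are hypothetical 1-5 ratings:

```python
import numpy as np

# rows = respondents, columns = items meant to measure the same thing
responses = np.array([
    [4, 5, 4],
    [3, 3, 2],
    [5, 4, 5],
    [2, 2, 3],
    [4, 4, 4],
])

k = responses.shape[1]                                # number of items
item_var_sum = responses.var(axis=0, ddof=1).sum()    # sum of the item variances
total_var = responses.sum(axis=1).var(ddof=1)         # variance of the total scores
alpha = (k / (k - 1)) * (1 - item_var_sum / total_var)
print(round(alpha, 2))   # closer to 1.0 = more internally consistent (about .90 here)
```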
content validity
is the measure a good sample of the universe of items that could be used to assess this construct?
criterion validity
Does the measure reflect or relate to what it "should" right now and/or in the future?
construct validity
Is the measure related to other constructs it should be related to and not related to other constructs it should not be related to?
concurrent validity
Does the measure correlate with grades in PSBI 301? Class attendance? Etc.
predictive validity
Does the measure predict who will have a job using stats 10 years from now?
convergent validity
Does the measure relate to other things it should?
Statistics and math proficiency
Self-esteem and depression
Discriminant Validity
Does the measure NOT relate to things it shouldn't?
Statistics and language proficiency
Self-esteem and political affiliation
null hypothesis
there will be no difference between your groups or no association between your variables
research hypothesis
stated at the level of the sample and directly linked to our method/measures
non directional research hypothesis
Reflects a difference between groups, but the direction is not specified
directional research hypothesis
direction of difference is specified
inferential logic
process of going from data to a universal truth about the world
what makes a good hypothesis?
based on scientific literature
concise
testable
normal curve
bell-shaped curve
- Mean, Median, and Mode are equivalent
- Perfectly symmetrical
- Asymptotic (the tails approach but never touch the x-axis)
z-scores
Uses the mean and the standard deviation to transform raw scores (x values) into a standard score (z-score)
- shows how far a score is from the mean, using standard deviations as the standardized unit
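A tiny worked example with made-up numbers (any raw score, mean, and SD could be substituted):

```python
raw_score = 85
mean = 75
sd = 5

z = (raw_score - mean) / sd
print(z)   # 2.0 -> this score sits 2 standard deviations above the mean
```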
z-score and probability
the percentages of area under the normal curve represent the probability that a score would occur in the range you've defined
percentile rank
a score that indicates the percentage of people who scored at or below a given raw score
alpha
The accepted cutoff for calling a result "statistically significant"
p value
probability of the observed results if the null hypothesis is true
statistical significance
Hypothesis testing is the process of determining this
type 1 error
Rejecting a null hypothesis that is actually true
- Bad for science and the scientist
- You may waste time trying to replicate the finding
- Others may waste time trying to replicate it
- May be discredited when the study doesn't hold up to replication
- Slows down the progress of science
type 2 errors
Failing to reject a null hypothesis that is actually false
- May stop studying something of interest
◦ Important research findings never make it to light
◦ Truth may never be uncovered
file drawer effect
Studies that reject H0 are more likely to get published than studies that fail to reject H0
effect size
the magnitude of a difference; the absolute size of the difference has to be considered, not just whether the result is statistically significant
confidence intervals
Another way to apply probability via the Z table.
An estimated range for a population mean, given the descriptive stats from a sample
- use the Z table and the descriptive statistics from a sample to calculate a range in the population associated with a certain level of confidence.
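A sketch of a 95% confidence interval around a made-up sample mean; the 1.96 comes from the Z table, and all other numbers are hypothetical:

```python
import math

sample_mean = 100
sample_sd = 15
n = 36
z_95 = 1.96                                   # cuts off the middle 95% of the normal curve

margin = z_95 * (sample_sd / math.sqrt(n))    # 1.96 * (15 / 6) = 4.9
print(sample_mean - margin, sample_mean + margin)   # roughly 95.1 to 104.9
```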
one sample z test
When we want to test the difference between a sample and a population
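A minimal worked sketch of the one-sample z test, assuming hypothetical sample and population values:

```python
import math

sample_mean = 105
pop_mean = 100     # known population mean
pop_sd = 15        # known population standard deviation
n = 25

z = (sample_mean - pop_mean) / (pop_sd / math.sqrt(n))
print(round(z, 2))   # about 1.67; compare to the critical z (1.96 for alpha = .05, two-tailed)
```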
independent samples t test
Use when comparing scores from two separate groups of people
- Also referred to as a between-subjects t-test
degrees of freedom
the number of scores in a sample that are free to vary once the mean is known
- n - 2 for an independent samples t-test, where n is the total number of scores across both groups
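A short sketch of an independent samples t-test on two made-up groups, assuming scipy is available:

```python
from scipy import stats

group_a = [10, 12, 9, 11, 13, 10]
group_b = [14, 15, 13, 16, 14, 15]

t, p = stats.ttest_ind(group_a, group_b)   # between-subjects comparison
df = len(group_a) + len(group_b) - 2       # degrees of freedom: total N minus 2
print(t, p, df)
```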
effect size
A measure of how different two groups are from one another
INDEPENDENT of sample size
Related directly to the magnitude (size) of the difference between groups
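One common effect size for a two-group difference is Cohen's d (not named on the card, so treat it as an illustrative choice); the values below are made up:

```python
mean_a, mean_b = 10.8, 14.5
pooled_sd = 1.4                       # pooled standard deviation of the two groups

d = abs(mean_b - mean_a) / pooled_sd
print(round(d, 2))   # about 2.64; d stays the same no matter how many participants you add
```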
paired samples t test
dependent samples t-test
- Same group of participants tested more than once
dependent pros
Need fewer participants to get to the same degrees of freedom/power
◦ Each participant serves as his/her own control
dependent cons
Order/learning effects
◦ Can't always have the same person do both things (control condition and experimental condition)
◦ May increase potential for alternative explanations
dependent degrees of freedom
n-1
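A short sketch of a paired (dependent) samples t-test on made-up pre/post scores, assuming scipy is available:

```python
from scipy import stats

pre  = [10, 12, 9, 14, 11]
post = [13, 14, 11, 16, 12]

t, p = stats.ttest_rel(pre, post)   # same participants measured twice
df = len(pre) - 1                   # n - 1, where n is the number of pairs
print(t, p, df)
```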
one way anova
Used when you have ONE factor with more than 2 levels
factor
the variable that designates the groups to be compared (aka the independent variable)
levels
the different groups within a factor
Sum of Squares Between
based on the squared differences between each group mean and the "grand mean" (the mean of all scores)
sum of squares within
the sum of the squared differences between each score and the mean of the group it came from
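A small sketch of both sums of squares for three made-up groups, assuming numpy is available:

```python
import numpy as np

groups = [np.array([2, 3, 4]), np.array([5, 6, 7]), np.array([8, 9, 10])]

grand_mean = np.concatenate(groups).mean()

# between: squared distances of each group mean from the grand mean, weighted by group size
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)

# within: squared distances of each score from its own group mean
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
print(ss_between, ss_within)   # 54.0 and 6.0 for these values
```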
factorial anova
used when you have more than one factor
- two or more independent variables
repeated measures anova
used when you test multiple groups multiple times