2 common reasons to summarize data
To clarify the patterns observed and to present the data concisely
Before summarizing data
Identify possible omissions, errors, or other anomalies so they can be corrected or removed before further analysis.
How to analyze your data set
Describe it and determine how to interpret it
Measures of Central Tendency
Mean, Median, and Mode
Frequency
How often a value occurs; typically summarized in tables or graphs
Frequency distribution table
Summarizes the distribution of the data in terms of how often each value occurs. The data can be continuous, categorical, or discrete. The type of data determines how scores are listed: as individual scores or as intervals.
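As a rough illustration of a frequency distribution table for individual scores, here is a minimal Python sketch. The quiz scores are invented for the example and are not from the original notes.

```python
from collections import Counter

# Hypothetical quiz scores (discrete data), invented for illustration.
scores = [3, 5, 4, 5, 2, 5, 4, 3, 5, 4]

# Count how often each value occurs.
frequency = Counter(scores)

# Print one row per value, sorted by score.
print("Score | Frequency")
for value in sorted(frequency):
    print(f"{value:>5} | {frequency[value]}")
```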
Types of graphs for data types
Continuous data: histogram. Discrete data: bar chart or pie chart. Pie charts are usually used when there are more categories than with bar charts.
Mean
The sum of all the scores divided by the number of scores. It is the average. It describes normally distributed data with an interval or ratio scale.
Median
The midpoint of the data distribution. Use the median when data are skewed, because the median is not strongly affected by extreme scores the way the mean is. For example, if we calculated our average annual income and included Beyonce, our mean income would skyrocket and no longer be indicative of the population. Also used with ordinal scale data.
Mode
Value that occurs most often. There may be one mode or more than one. Used in biology or in retail/small businesses. Used with nominal scale data.
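To see how an extreme score pulls the mean but not the median, here is a small Python sketch using the standard `statistics` module. The income figures are hypothetical, chosen only to mimic the Beyonce example.

```python
import statistics

# Hypothetical annual incomes in dollars, invented for illustration.
incomes = [40_000, 45_000, 50_000, 50_000, 60_000]

# Add one extreme earner (the outlier).
incomes_with_outlier = incomes + [100_000_000]

mean_before = statistics.mean(incomes)
mean_after = statistics.mean(incomes_with_outlier)
median_before = statistics.median(incomes)
median_after = statistics.median(incomes_with_outlier)

# multimode returns every value tied for most frequent.
modes = statistics.multimode(incomes)

print("mean:", mean_before, "->", mean_after)
print("median:", median_before, "->", median_after)
print("mode(s):", modes)
```

In this sketch the mean jumps by millions of dollars while the median stays where it was.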
Deviations from the normal distribution
One factor is kurtosis, or how peaked the distribution is. A uniform distribution is one in which all scores have the same frequency. A bimodal distribution is one with two distinct peaks. A skewed distribution is one in which most scores cluster at one end of the distribution.
Mesokurtic
Kurtosis with a moderate peak
Leptokurtic
Kurtosis with a high peak (most scores are clustered in the middle)
Platykurtic
Kurtosis with a flat peak (scores are more spread out)
Variability
Dispersion or spread of scores in a distribution
Range
Difference between the largest and smallest value. Best for data sets without outliers, because it depends only on the two most extreme scores.
Interquartile range
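The range of the middle 50% of scores: the difference between the third and first quartiles, where quartiles divide the distribution into quarters.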
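The following Python sketch compares the range and the interquartile range on a hypothetical data set, with and without one extreme score. It assumes the default quartile method of `statistics.quantiles`; other textbooks and tools may compute quartiles slightly differently.

```python
import statistics

def data_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)

def interquartile_range(values):
    """Third quartile minus first quartile (default 'exclusive' method)."""
    q1, _q2, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical scores, invented for illustration.
scores = [2, 4, 4, 5, 7, 9, 10, 12]
scores_with_outlier = scores + [100]

range_change = data_range(scores_with_outlier) - data_range(scores)
iqr_change = interquartile_range(scores_with_outlier) - interquartile_range(scores)

print("range:", data_range(scores), "->", data_range(scores_with_outlier))
print("IQR:", interquartile_range(scores), "->", interquartile_range(scores_with_outlier))
```

The single outlier changes the range far more than it changes the interquartile range.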
Variance
Average squared distance of sample scores from the mean
Degrees of freedom
For sample variance, the sample size minus one (n - 1)
Sum of squares (SS)
Sum of squared deviations from the mean
Standard deviation
The square root of the variance; roughly the average distance that scores deviate from the mean. Always report the standard deviation when you report the mean.
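The cards on sum of squares, degrees of freedom, variance, and standard deviation fit together as one calculation. A minimal Python sketch, assuming a small hypothetical sample:

```python
import math
import statistics

# Hypothetical sample scores, invented for illustration.
scores = [2, 4, 4, 5, 7]

n = len(scores)
mean = sum(scores) / n

# Sum of squares (SS): sum of squared deviations from the mean.
ss = sum((x - mean) ** 2 for x in scores)

# Degrees of freedom for sample variance: n - 1.
df = n - 1

# Sample variance: SS divided by degrees of freedom.
variance = ss / df

# Standard deviation: square root of the variance.
sd = math.sqrt(variance)

print(f"SS = {ss:.2f}, df = {df}, variance = {variance:.2f}, SD = {sd:.2f}")
```

The results should agree with `statistics.variance(scores)` and `statistics.stdev(scores)`, which also divide by n - 1.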
Empirical rule
Rule for normally distributed data: about 68% of data fall within 1 SD of the mean, about 95% fall within 2 SD, and about 99.7% fall within 3 SD.
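The empirical rule can be checked against the standard normal distribution. This sketch uses `statistics.NormalDist` from the Python standard library; the helper name `proportion_within` is made up for the example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Proportion of a normal distribution within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {proportion_within(k):.1%}")
```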
Types of graphs
Bar graphs for nominal or ordinal scale data, line graphs for interval or ratio scale data, and scatterplots, which are graphical displays of data points
Between subjects design
Different participants are observed one time in each group or at each level of a factor (for example, everyone in one group gets treatment A). It is the only design that can meet all three requirements of an experiment (randomization, manipulation, and inclusion of a comparison/control group), but the sample size needed can be large.
Between subjects experimental design
Levels of a between subjects factor are manipulated, then different participants are randomly assigned to each group or to each level of that factor and observed one time (for example, everyone in one group gets treatment A).
Control
The manipulation of a variable while holding all other variables constant
Experimental or treatment group
Participants are exposed to a manipulation (IV) that is believed to cause a change in the dependent variable
Control Group
Participants treated the same as those in an experimental group except that the manipulation is omitted
Experimental manipulation
Identification of an IV and the creation of two or more groups that constitute the levels of that variable.
Natural manipulation
Manipulation of a stimulus that can be naturally changed with little effort
Staged manipulation
Manipulation that requires the participant to be "set up" to experience a stimulus or event.
Random Assignment
Procedure used to ensure that each participant has the same likelihood of being assigned to a given group
Restricted random assignment
Restricting a sample based on known participant characteristics then using a random procedure to assign participants to each group.
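Random assignment and restricted random assignment can be sketched in a few lines of Python. This is one possible procedure, not the only one: it shuffles participants and deals them out to groups in turn, and it treats "restricting" as keeping only participants who meet a criterion. The participant IDs, ages, and function names are all hypothetical.

```python
import random

def random_assignment(participants, groups, seed=None):
    """Shuffle participants, then deal them out to the groups in turn."""
    rng = random.Random(seed)
    shuffled = list(participants)
    rng.shuffle(shuffled)
    assignment = {group: [] for group in groups}
    for index, person in enumerate(shuffled):
        assignment[groups[index % len(groups)]].append(person)
    return assignment

def restricted_random_assignment(participants, keep, groups, seed=None):
    """Restrict the sample to participants for whom keep() is true,
    then randomly assign the remaining participants."""
    eligible = [p for p in participants if keep(p)]
    return random_assignment(eligible, groups, seed)

# Hypothetical participants as (id, age) pairs, invented for illustration.
ages = [19, 22, 35, 41, 20, 23, 18, 50]
participants = [(f"P{i}", age) for i, age in enumerate(ages, start=1)]
groups = ["treatment", "control"]

everyone = random_assignment(participants, groups, seed=1)
under_30 = restricted_random_assignment(
    participants, lambda p: p[1] < 30, groups, seed=1
)

print(everyone)
print(under_30)
```

Fixing `seed` makes the assignment reproducible; leaving it as `None` gives a different assignment each run.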
Error variance or error
Variance attributed to or caused by the individual differences of participants in each group
Test statistic
Mathematical formula that allows researchers to determine the extent to which differences observed between groups can be attributed to the manipulation
Two independent sample t test
Test hypotheses concerning the difference in interval or ratio scale data between 2 groups
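As an illustration of the test statistic for two independent samples, the sketch below computes t by hand. It assumes equal variances in the two groups (the pooled-variance form) and uses invented scores; it returns only t and its degrees of freedom, which would then be compared against a critical value or converted to a p value.

```python
import math
import statistics

def independent_t(group_a, group_b):
    """Pooled-variance t statistic for two independent samples.

    Returns (t, degrees_of_freedom). Assumes equal population variances.
    """
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)

    df = n_a + n_b - 2
    # Pooled variance: weighted average of the two sample variances.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    # Standard error of the difference between the two means.
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# Hypothetical scores, invented for illustration.
treatment = [6, 8, 7, 9, 10]
control = [4, 5, 6, 5, 5]

t_value, t_df = independent_t(treatment, control)
print(f"t({t_df}) = {t_value:.3f}")
```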
One way between subjects ANOVA
Used to test hypotheses for one factor with two or more levels
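A one-way between subjects ANOVA compares variability between group means to variability within groups. The sketch below computes the F statistic by hand for three hypothetical groups; the function name and the data are invented for the example.

```python
import statistics

def one_way_anova_f(*groups):
    """F statistic for a one-way between subjects ANOVA.

    Returns (F, df_between, df_within).
    """
    all_scores = [x for group in groups for x in group]
    grand_mean = statistics.mean(all_scores)

    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.mean(group) - grand_mean) ** 2
        for group in groups
    )
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum(
        sum((x - statistics.mean(group)) ** 2 for x in group)
        for group in groups
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)

    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within

# Hypothetical scores for three levels of one factor.
f_value, df_between, df_within = one_way_anova_f([1, 2, 3], [2, 3, 4], [5, 6, 7])
print(f"F({df_between}, {df_within}) = {f_value:.2f}")
```

A significant F says only that at least one pair of group means differs; it does not say which pair.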
Post hoc test
Computed following a significant ANOVA to determine which pairs of group means differ significantly
Self reporting measure
Items used in a survey; information is given to you by the subject. It's easy and cost effective, but self-report items are often inaccurate.
Behavioral measure
Behavior is recorded directly, for example the speed and distance traveled by an athlete. It's more direct than self-reporting but can raise ethical problems, and some aspects of behavior are constructs.
Physiological measure
Physical responses of the brain and body, like heart rate or body temperature. When careful collection procedures are used these measures are unbiased, but the equipment and the training needed to operate it are expensive.
Null Hypothesis significance testing
Inferential statistics include a diverse set of tests of statistical significance, collectively called null hypothesis significance testing (NHST).
Null hypothesis
A statement about a population parameter, such as the population mean, that is assumed to be true BUT contradicts the research hypothesis.
In other words, we begin by assuming we're wrong.
Criterion
Probability value for the likelihood of obtaining data in a sample if the null were true for the population.
Retain (accept) the null
Decision made when the p value is greater than .05. Null results alone are rarely published.
Reject the null
Decision made when the p value is less than .05. Power is the probability that we will detect an effect if an effect actually exists in a population. Rejecting the null is the most publishable outcome.
P value
Probability of obtaining a sample outcome if the value stated in the null hypothesis were true. The p value is interpreted as error.
Type II error
Probability of retaining a null hypothesis that is actually false. This means the researcher is reporting no effect in the population when there is one.
Type I error
Probability of rejecting a null hypothesis that is actually true. Researchers directly control for the probability of committing this error by stating the level of significance.
Goodness of fit test
Statistical procedure used to determine whether observed frequencies at each level of one categorical variable are similar to or different from the frequencies expected.
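A goodness of fit test is commonly carried out with the chi-square statistic, which sums the squared differences between observed and expected frequencies, each divided by the expected frequency. A minimal Python sketch, assuming invented frequencies and equal expected frequencies across three levels:

```python
def chi_square_goodness_of_fit(observed, expected):
    """Chi-square statistic for one categorical variable.

    Returns (chi_square, degrees_of_freedom).
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi_square, df

# Hypothetical counts at three levels of one categorical variable.
observed = [30, 50, 40]
expected = [40, 40, 40]  # assumes equal frequencies are expected

gof_chi_square, gof_df = chi_square_goodness_of_fit(observed, expected)
print(f"chi-square({gof_df}) = {gof_chi_square:.2f}")
```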
Test for independence
Statistical procedure used to determine whether frequencies observed at the combination of levels of two categorical variables are similar to or different from frequencies expected.
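For two categorical variables, the expected frequency in each cell is usually taken as (row total × column total) / grand total, and the chi-square statistic is summed over all cells. The sketch below assumes that approach and uses an invented 2 × 2 table.

```python
def chi_square_independence(table):
    """Chi-square statistic for a table of observed frequencies.

    table is a list of rows. Returns (chi_square, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)

    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency if the two variables were independent.
            expected = row_totals[i] * column_totals[j] / grand_total
            chi_square += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi_square, df

# Hypothetical 2 x 2 table of observed frequencies.
observed_table = [
    [20, 30],
    [30, 20],
]

ind_chi_square, ind_df = chi_square_independence(observed_table)
print(f"chi-square({ind_df}) = {ind_chi_square:.2f}")
```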
Effect
A mean difference or discrepancy between what was observed in a sample and what was expected to be observed in the population.
Estimation
A sample statistic is used to estimate the value of an unknown population parameter.