Statistical Methods and Psychological Testing - Unit 1 Notes
Relevance of Statistics in Psychology
Statistics is a branch of applied mathematics focusing on the organization, classification, analysis, and interpretation of data.
Statistics are prevalent in everyday life (GPA, weather forecasts, election predictions, step count).
What matters is being able to identify these statistics and interpret them correctly.
Statistics is used when professors organize test scores into grade distributions and calculate class averages.
Tracking medals won by athletes in Olympic games involves statistics.
Statistics is used to make important decisions.
Example: Restaurant owners use sales data and customer feedback to introduce new menu items, increasing customer satisfaction and revenue.
Example: Biologists use statistical analysis to determine the effect of a new drug on lab animals, ensuring research validity.
Data collected without statistical analysis would have little meaning.
Psychologists use statistical techniques to:
Organize data: Present data in understandable ways using visual displays like graphs, pie charts, frequency distributions, and scatter plots.
Summarize data: Describe observations accurately using formulas to create simple indices (measures of central tendency and variability).
Determine relationships among variables: Measure the direction and strength of relationships between variables using correlation statistics.
Examples: Association between weight loss and depression, relationship between patient satisfaction and health status.
Make inferences based on data: Use inferential statistics to analyze data and draw valid conclusions beyond the immediate data.
Reason from sample data to general conclusions.
Statistics help researchers answer research questions by determining justified general conclusions from specific results.
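To make "direction and strength of relationships" concrete, here is a minimal sketch of the Pearson correlation coefficient. The function name pearson_r and the data are hypothetical, invented only for illustration; the sign of r gives the direction and its magnitude gives the strength.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data (not from any real study).
hours_studied = [2, 4, 6, 8, 10]
test_score = [55, 60, 70, 72, 85]

r = pearson_r(hours_studied, test_score)
print(f"r = {r:.2f}")  # positive sign: scores tend to rise with hours studied
```

A value near +1 or -1 indicates a strong relationship, and a value near 0 a weak one. A correlation by itself does not show that one variable causes the other.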
Statistics includes univariate and multivariate procedures.
Univariate procedures: Used when measuring one variable.
Multivariate procedures: Used when multiple variables are involved to ascertain relationships, make inferences, and extract factors.
Examples of statistical use in psychology:
Personality psychologists study individual differences and their effects on behavior.
Family counselors describe patient behavior and treatment effectiveness.
Sports psychologists analyze athlete performance.
Cross-cultural psychologists study differences among people from different cultures.
Typical steps in quantitative research:
Research Question → Statistical Question → Data Collection → Statistical Conclusion → Research Conclusion
Statistical procedures are a middle step in psychological research.
Incorrect research conclusions can be made despite correct statistical conclusions if confounding variables are not controlled.
Descriptive and Inferential Statistics
Descriptive Statistics
Purpose: To organize and summarize observations to make them easier to comprehend.
Important for visualizing what the data shows, especially with large datasets.
Enables meaningful data presentation, allowing simpler interpretation.
Example: Analyzing 500 students' coursework results to understand overall performance and grade distribution.
Techniques: Tabular, graphical and numerical.
Used to describe data in terms of:
Distribution (frequency tables, bar graphs, pie charts, scatterplots etc.)
Center (measures of central tendency: mean, median, mode)
Spread (measures of variability: range, standard deviation, variance)
Examples:
A teacher uses Grade Point Average (GPA) to summarize student performance.
Ranking players by batting averages in sports.
A college professor uses test scores to interpret student competence levels.
An organization uses the average number of sales to assess telemarketing executive performance.
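As a small illustration of describing data by distribution, center, and spread, the sketch below uses Python's standard statistics module on a hypothetical set of test scores. The scores are invented, and the data are treated as the complete set of interest, so the population forms pvariance and pstdev are used (an assumption; statistics.variance and statistics.stdev would be used for a sample).

```python
import statistics
from collections import Counter

# Hypothetical test scores, invented for illustration.
scores = [62, 70, 70, 75, 81, 88, 94]

# Distribution: how often each score occurs.
frequency_table = Counter(scores)

# Center: measures of central tendency.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)

# Spread: measures of variability.
value_range = max(scores) - min(scores)
variance = statistics.pvariance(scores)
sd = statistics.pstdev(scores)

print(frequency_table)
print(mean, median, mode)
print(value_range, variance, sd)
```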
Inferential Statistics
Purpose: To draw conclusions about conditions in a population (complete set of observations) from a sample (subset) drawn from the population.
Population: All observations of interest (e.g., students’ scores, people’s incomes).
Populations are often too big or inaccessible to study entirely.
Sample: A carefully chosen subset of the population used to infer characteristics of the population.
Allows making inferences from data to reach valid conclusions that extend beyond the analyzed dataset.
Research Process
Begins with a question about a population parameter.
Data are obtained from a sample to compute a sample statistic, which is then used to make an inference about the population parameter.
Parameter: A descriptive index of the population (e.g., mean, median, standard deviation, correlation coefficient).
Statistic: A descriptive index of the sample.
Parameters are the real entities of interest, and statistics are educated guesses at reality.
The aim of inferential statistics is to infer population characteristics (parameter) from sample characteristics (statistic).
The steps used in statistical inference:
Population (X) -> Random sampling -> Sample (X)
The population is described by a parameter; the sample is described by a statistic.
Statistical inference runs from the sample back to the population.
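The population-to-sample steps can be sketched with a simulation. Everything here is an assumption made for illustration: the population is simulated with a mean of about 100 and a standard deviation of about 15, and the sample size of 50 is arbitrary. In real research the parameter is unknown, which is why the statistic is needed.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Simulated population of 10,000 observations (hypothetical scores).
population = [random.gauss(100, 15) for _ in range(10_000)]
parameter = statistics.mean(population)  # descriptive index of the population

# Random sampling: every observation has an equal chance of selection.
sample = random.sample(population, 50)
statistic = statistics.mean(sample)  # descriptive index of the sample

# The statistic is an educated guess at the parameter; the gap is sampling error.
sampling_error = statistic - parameter
print(f"parameter = {parameter:.2f}, statistic = {statistic:.2f}")
```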
Applications
Used to compare differences between treatment groups and make generalizations about the larger population.
Example: Testing the effect of a drug on learning speed by administering it to one sample group and a placebo to another.
If the difference in average learning scores is not due to chance variation, it is inferred that the drug improves learning speed for the population.
Used in exit polls to predict election outcomes by asking a small number of citizens about their voting preferences and drawing inferences.
Important Notes
Both population and sample are defined in terms of observations rather than people.
Example: IQ and self-esteem scores of students represent two populations if the investigator is interested in them.
Population is defined by the interest of the investigator.
Example: If interested in the present class's performance on a mid-term test, the students’ scores constitute the population.
Types of Inferential Statistics
Hypothesis Testing
A specific value of the population parameter is hypothesized in advance and tested using sample statistics.
A decision is made about retaining or rejecting the hypothesized value based on a derived score (t, F).
Used to determine if the population parameter differs from some hypothesized value.
Statistical tests include t-test, Analysis of Variance (ANOVA), chi-square test, etc.
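A minimal sketch of hypothesis testing follows, using an independent-samples t-test with pooled variance for a drug-versus-placebo comparison. The scores and the function name independent_t are hypothetical. The hypothesized value is a population mean difference of zero, and the critical value 2.228 is the two-tailed .05 value for 10 degrees of freedom from a standard t table.

```python
from math import sqrt
import statistics

def independent_t(group_a, group_b):
    """Return (t, df) for an independent-samples t-test with pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / standard_error, df

# Hypothetical learning scores, invented for illustration.
drug = [14, 15, 17, 18, 16, 19]
placebo = [12, 13, 15, 14, 13, 14]

t, df = independent_t(drug, placebo)
T_CRITICAL = 2.228  # two-tailed, alpha = .05, df = 10 (standard t table)
reject_hypothesis = abs(t) > T_CRITICAL
print(f"t({df}) = {t:.2f}, reject hypothesized value: {reject_hypothesis}")
```

If the derived t exceeds the critical value, the hypothesized value of zero difference is rejected; otherwise it is retained.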
Estimation
No value is specified in advance; the question is, “What is the value of the population parameter?”
Suited for research questions where no specific hypothesis is presented, such as determining the percentage of voters preferring a candidate.
Procedures are used to directly estimate the true value of the population parameter from a sample statistic.
Example: Interval estimates (confidence intervals) give a range of values within which the parameter is expected to fall.
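As an illustration of interval estimation, the sketch below computes an approximate 95% confidence interval for the proportion of voters preferring a candidate. The poll numbers and the function name proportion_ci are hypothetical, and the normal approximation used here is one common method, assumed to be adequate for a sample of this size.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence=0.95):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n  # sample statistic (point estimate)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Hypothetical poll: 220 of 400 sampled voters prefer the candidate.
low, high = proportion_ci(successes=220, n=400)
print(f"95% CI for the population proportion: {low:.3f} to {high:.3f}")
```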
Descriptive vs. Inferential Statistics
Descriptive Statistics
Aim: To organize and summarize the current dataset.
Applies only to the dataset at hand, which is treated as the entire group of interest.
It does not allow us to make conclusions beyond the data we have analyzed.
Inferential Statistics
Aim: To draw conclusions about a population outside of the obtained dataset.
Takes a sample of a population, especially if the population is too big to conduct research, or when we don’t have access to the entire population.
It allows us to make conclusions beyond the immediate data we have analyzed.
Note: Descriptive statistics are generally presented even when a data analysis primarily uses inferential statistics.
Scales of Measurement
Overview
Scales: Nominal, Ordinal, Interval, Ratio
Measurement
Definition: A process whereby values (scores) are assigned to properties of people, places, things, or events.
The way values are assigned determines the scale of measurement.
Used to categorize and/or quantify variables.
Level of measurement: Refers to the amount of information the measurement procedure can convey about the actual quantity of the variable and the differences in individuals with different scores.
Nominal Scale
Simplest of the four levels.
Process: Placing observations into categories that differ in some qualitative aspect.
Categories must be mutually exclusive (observations cannot fall into more than one category) and exhaustive (there must be enough categories for all the observations).
Variables: Variables that are qualitative or categorical in nature are usually measured on a nominal scale, because we merely assign category labels.
Examples: Gender, political affiliation, or eye color.
Categories are simply different, with no category having more or less of any particular quality.
When numbers are used to represent categories, they are arbitrary labels and do not designate “more” or “less” of anything.
Example: Room numbers are simply names and do not reflect any quantitative information.
Numerical values are used as a code for nominal categories when data are entered into computer programs.
Example: Coding males with a 0 and females with a 1.
Ordinal Scale
Categories: Must be mutually exclusive and exhaustive, but they also indicate the order of magnitude of some variable.
Outcome: A set of ordered categories or ranks.
Ordering or ranking of responses along some underlying dimension that expresses “more” or “less” of something.
Example: Academic ranks (instructor, assistant professor, associate professor, and professor).
Example: A supervisor estimates the competence of seven workers by arranging them in order of merit.
The relation expressed is that of “greater than.”
Interval between two successive ranks is indeterminate.
The difference between any two consecutive ranks may not be the same as that between another pair of consecutive ranks.
Measurements describe order but not the relative size or degree of difference between the adjacent steps on the scale.
Nothing is implied about the absolute level of merit.
Interval Scale
Properties: Has all the properties of the ordinal scale, but with the further refinement that a given interval (distance) between scores has the same meaning anywhere on the scale.
Tells us about the ordering of observations and indicates the distance between them.
Allows us to know how many units greater than, or less than, one observation is from another on the measured characteristic.
Examples: Degrees of temperature on the Fahrenheit or Celsius scales.
Zero point: An arbitrary reference point; the value of 0 is assigned as a matter of convenience or reference.
Zero does not indicate a total absence of the quality being measured.
Not possible to speak meaningfully about a ratio between two measurements.
Ratio Scale
Properties: Possesses all the properties of an interval scale and in addition has an absolute zero point in which there is total absence of the characteristic being measured.
Possible to speak meaningfully about a ratio between two measurements.
Example: The Kelvin scale has an absolute zero, the point at which a substance would have no molecular motion and, therefore, no heat.
Other examples: Length, weight, and measures of elapsed time.
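The difference between interval and ratio scales can be checked with a short calculation. In the sketch below, the two temperatures are arbitrary examples and the helper names are hypothetical. The ratio between the same two temperatures changes with the scale used, because the Celsius and Fahrenheit zero points are arbitrary, while differences remain consistent.

```python
def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def celsius_to_kelvin(c):
    return c + 273.15

warm_c, cool_c = 20.0, 10.0  # arbitrary example temperatures

# The "ratio" depends on where the arbitrary zero point sits.
ratio_c = warm_c / cool_c
ratio_f = celsius_to_fahrenheit(warm_c) / celsius_to_fahrenheit(cool_c)
ratio_k = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Differences are consistent: 1 Celsius degree = 1.8 Fahrenheit degrees = 1 kelvin.
diff_c = warm_c - cool_c
diff_f = celsius_to_fahrenheit(warm_c) - celsius_to_fahrenheit(cool_c)
diff_k = celsius_to_kelvin(warm_c) - celsius_to_kelvin(cool_c)

print(ratio_c, round(ratio_f, 2), round(ratio_k, 3))
```

Only the Kelvin ratio is physically meaningful, so 20 degrees Celsius is not "twice as hot" as 10 degrees Celsius.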
Properties of Measurement Scales
Identity: Each value on the measurement scale has a unique meaning.
Magnitude: Values on the measurement scale have an ordered relationship to one another.
Equal intervals: Scale units along the scale are equal to one another.
Absolute zero: The scale has a true zero point, below which no values exist.
Scale Characteristics
Nominal Scale
Numbers are assigned to categories as "names."
Assigning numbers is arbitrary.
Numbers only give us the identity of the category assigned.
Ordinal Scales
Have the property of magnitude as well as identity.
Numbers represent a quality being measured (identity) and can tell us whether a case has more or less of the quality measured than another case (magnitude).
The distances between scale points are not necessarily equal.
Interval Scale
Has the properties of identity, magnitude, and equal intervals.
You know not only whether different values are bigger or smaller, you also know how much bigger or smaller they are.
Ratio Scale
Satisfies all four of the properties of measurement: identity, magnitude, equal intervals, and an absolute zero.
Scales of Measurement and Characteristics:
Nominal: Mutually exclusive and exhaustive categories differing in some qualitative aspect. Examples: Sex, ethnic group, religion, eye color, academic major.
Ordinal: Observations ranked in order of magnitude. Ranks express a "greater than" relationship, but with no implication about how much greater. Examples: Military rank, academic standing, workers sorted according to merit.
Interval: Numerical values indicate order of merit and meaningfully reflect relative distances between points along the scale. Examples: Temperature in degrees Celsius or Fahrenheit.
Ratio: Has an absolute zero point. Ratio between measures becomes meaningful. Examples: Length, weight, elapsed time, temperature on the Kelvin scale.
Scales of Measurement and Statistical Treatment
Many measuring instruments in the behavioral sciences lack equal intervals and an absolute zero point.
Consider a spelling test. A score of zero doesn't mean a total absence of spelling ability.
The same is true of midterm tests, IQ tests and the SAT.
Some have argued that calculating certain statistics (such as averages) on tests of mental abilities could be seriously misleading.
Fortunately, the weight of the evidence suggests that in most situations, making statistical conclusions is not seriously hampered by uncertainty about the scale of measurement.
Be aware of scale problems to avoid erroneous conclusions.
Do not say that a person with an IQ of 150 is twice as bright as one with an IQ of 75.
Scale problems may be critical when a test does not have enough “top” or “bottom” to differentiate adequately among the group measured.
In that case, the measuring instrument is simply incapable of showing real differences among the highest scorers because it does not include items of greater difficulty (or, at the bottom, among the lowest scorers because it lacks easier items).
Scales of Measurement and Appropriate Statistics
The level of measurement of a variable tells us which statistics are permissible and appropriate.
Descriptive Statistics
Nominal: Frequency tables, Mode
Ordinal: Frequency tables, Percentiles, Mode, Median, Range
Interval & Ratio: Frequency tables, Mode, Median, Mean, Range, Variance, Standard Deviation
Inferential Statistics
Nominal: Non-parametric tests: Chi-square test
Ordinal: Non-parametric tests: Rank-order correlation, Mann-Whitney U test, Kruskal-Wallis test, Friedman’s ANOVA
Interval and Ratio: Parametric tests: Pearson’s correlation coefficient, t-test, ANOVA, Regression, Factor analysis
Statistical tests are divided into two types: parametric and non-parametric tests.
Parametric tests are more powerful but can be used only with interval or ratio data.
Ordinal and nominal data require the use of non-parametric tests.
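The mapping from level of measurement to permissible statistics can be written as a simple lookup. This is only a sketch of the lists given in these notes, not a complete rule set; the names DESCRIPTIVE, is_permissible, and inference_family are hypothetical.

```python
# Descriptive statistics permitted at each level, as listed in these notes.
DESCRIPTIVE = {
    "nominal": ["frequency table", "mode"],
    "ordinal": ["frequency table", "percentiles", "mode", "median", "range"],
    "interval": ["frequency table", "mode", "median", "mean", "range",
                 "variance", "standard deviation"],
}
DESCRIPTIVE["ratio"] = DESCRIPTIVE["interval"]  # same list for interval and ratio

def is_permissible(statistic, scale):
    """True if the notes list this descriptive statistic for the given scale."""
    return statistic in DESCRIPTIVE[scale]

def inference_family(scale):
    """Parametric tests need interval or ratio data; otherwise non-parametric."""
    return "parametric" if scale in ("interval", "ratio") else "non-parametric"

print(is_permissible("mean", "ordinal"))   # the mean is not listed for ordinal data
print(inference_family("ratio"))
```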
Problems related to Scales of Measurement
Identify the scale of measurement for each of the following examples of data:
Marital status
Number of students who drop a statistics course
The time students spend studying for their first statistics test
The weight loss over the first week of a “fad” diet
The amount owed on a credit card
The part on a new automobile that breaks during the first year of ownership
The rank of a military officer
Graphical Representation of Data
Overview
Topics: Basic Procedures, The Histogram, The Frequency Polygon, The Bar Diagram, Factors affecting the Shape of Graphs
Graphical Representation of Data
Frequency distributions present the main features of data succinctly, but they are still abstract numerical representations.
Graphs can impart the same information more directly by visually presenting the pertinent features of the data.
Graphs are easier to interpret, making them useful for presenting data to the general public.
A well formatted graph helps in visually illustrating certain characteristics and trends in a set of data.
Types of graphs:
Qualitative variables: Bar graphs and pie charts.
Quantitative variables: Histograms, frequency polygons, and cumulative frequency graphs.
Basic Procedures
Graphs have two perpendicular lines called axes: X-axis (horizontal axis, abscissa), Y-axis (vertical axis, ordinate).
The measurement scale (X values or categories) is listed along the X-axis (values increasing from left to right for quantitative variables).
The frequencies (or some function of frequency) are listed on the Y-axis with values increasing from bottom to top.
The point where the two axes intersect should have a value of zero.
Graph height should be approximately three-quarters (3/4) of its width.
The graph should have an informative title, and both the axes should have appropriate labels.
The Histogram
Most commonly used graph to show frequency distributions.
Plots the frequency distribution of a numeric variable as a series of adjacent bars/rectangles.
Each bar represents the scores in one of the class intervals of the distribution.
The two vertical boundaries or the edges of the bar coincide with the real limits of the particular class interval.
The height of a bar represents the frequency of scores for that class interval. Frequencies or relative frequencies can be used.
Steps in construction:
Step 1: Construct a frequency distribution.
Step 2: Decide on a suitable scale for the X-axis by identifying and adding two class intervals, one falling immediately below the lowest class interval and one immediately above the highest.
Step 3: Decide on a suitable scale for the Y-axis by multiplying the width by ¾ (0.75) to find the approximate number of squares for the graph’s height.
Step 4: Draw bars of equal width for each class interval so that the height corresponds to the frequency or relative frequency of that interval. There should be no gaps.
Each edge between two adjacent bars represents both the upper real limit of one interval and the lower real limit of the next higher interval.
Step 5: Identify the class intervals by using either real limits or mid-points.
Step 6: Label the axes and give the histogram a title.
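The construction steps can be sketched in code. The example below builds a grouped frequency distribution and prints a rough text histogram. The scores and the function name grouped_frequencies are hypothetical, and scores are assumed to be whole numbers, so the real limits extend 0.5 below and above each interval's apparent limits.

```python
def grouped_frequencies(scores, lowest, width, n_intervals):
    """Count whole-number scores in consecutive class intervals of equal width."""
    rows = []
    for k in range(n_intervals):
        low = lowest + k * width      # apparent lower limit
        high = low + width - 1        # apparent upper limit
        rows.append({
            "interval": (low, high),
            "real_limits": (low - 0.5, high + 0.5),  # bar edges on the X-axis
            "midpoint": (low + high) / 2,
            "f": sum(low <= s <= high for s in scores),  # bar height
        })
    return rows

# Hypothetical test scores, invented for illustration.
scores = [52, 55, 61, 63, 64, 67, 70, 71, 72, 74, 75, 78, 81, 83, 88]
rows = grouped_frequencies(scores, lowest=50, width=10, n_intervals=4)

# Text histogram turned on its side: one row per class interval, no gaps.
for row in rows:
    low, high = row["real_limits"]
    print(f"{low:>5}-{high:<5} {'#' * row['f']}")
```

Because adjacent intervals share a real limit (for example, 59.5), the bars touch, which is what distinguishes a histogram from a bar diagram.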
The Frequency Polygon
Like a histogram except that points are drawn rather than bars.
Points are plotted above the mid-point of each class interval at a height equal to the frequency or relative frequency of scores in that interval.
The points are then connected by straight lines.
The graph is brought down to the mid-points of the additional first and last class intervals with zero frequencies to ensure it is a closed figure.
Steps in construction:
Step 1: Construct a frequency distribution.
Step 2: Decide on a suitable scale for X-axis and Y-axis.
Step 3: Label the class interval mid-points along the X-axis.
Step 4: Place a dot above the midpoint of each class interval at a height equal to the frequency or relative frequency of the scores in that interval.
Step 5: Connect the dots with straight lines.
Step 6: Label the axes and give the polygon a title.
Normally, the polygon is brought down to the horizontal axis at both ends using class intervals with zero frequencies.
If scores in the next adjacent class interval are not possible, leave the dot “dangling.”
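A minimal sketch of the points plotted for a frequency polygon follows. The midpoints, frequencies, and the function name polygon_points are hypothetical. The function adds one zero-frequency class interval at each end so that the figure closes on the horizontal axis; when scores in an adjacent interval are not possible, that end point would be left off instead.

```python
def polygon_points(midpoints, frequencies, width):
    """Return the (x, y) vertices of a frequency polygon, closed at both ends."""
    first = (midpoints[0] - width, 0)   # extra interval below, zero frequency
    last = (midpoints[-1] + width, 0)   # extra interval above, zero frequency
    return [first, *zip(midpoints, frequencies), last]

# Hypothetical grouped distribution with class intervals of width 10.
midpoints = [54.5, 64.5, 74.5, 84.5]
frequencies = [2, 4, 6, 3]

points = polygon_points(midpoints, frequencies, width=10)
for x, y in points:
    print(x, y)  # connect consecutive points with straight lines
```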
Choosing between a Histogram & a Polygon
Both are used for graphing quantitative data on an interval or ratio scale.
A histogram is often used when graphing an ungrouped frequency distribution of a discrete variable.
The general public seems to find a histogram a little easier to understand than a polygon.
A histogram also has some merit when displaying relative frequency.
However, representing frequencies by bars suggests that scores are evenly distributed within each class interval.
A frequency polygon is often preferred for grouped frequency distribution of a continuous variable because it shows the gradual change over a wide range of scores and suggests continuity of the variable.
Frequency polygons are particularly helpful when comparing two or more distributions. When the distributions are based on different numbers of cases, relative frequencies rather than raw frequencies should be used.
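To see why relative frequencies help when group sizes differ, consider the sketch below. The two classes and their frequencies are hypothetical. The raw frequencies look very different, but the relative frequencies show that the two distributions have the same shape.

```python
def relative_frequencies(frequencies):
    """Convert raw frequencies to proportions of the total number of cases."""
    total = sum(frequencies)
    return [f / total for f in frequencies]

# Hypothetical frequencies for the same four class intervals.
class_a = [2, 4, 6, 3]      # 15 cases
class_b = [8, 16, 24, 12]   # 60 cases

rel_a = relative_frequencies(class_a)
rel_b = relative_frequencies(class_b)

for a, b in zip(rel_a, rel_b):
    print(f"{a:.3f}  {b:.3f}")
```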
The Bar Diagram
Used for depicting qualitative categories on a nominal or ordinal scale of measurement.
Similar to a histogram, except that space appears between the rectangles, suggesting the essential discontinuity of the several categories.
Within categories, subcategories may be displayed as adjacent bars.
Qualitative categories on a nominal scale of measurement have no necessary order and may be arranged in any order.
For ordinal scales of measurement, categories should be arranged in order of rank (e.g., freshman, sophomore, junior, senior).
Factors affecting the Shape of Graphs
Grouping: The same set of raw scores may be grouped in different ways, affecting the graph of the distribution.
Relative scale: The decision about relative scale is arbitrary, and the resulting graph can be squat or slender depending on the choice.
Scale of measurement: The same data can appear very different when graphed, depending on the scale of measurement used for frequency.
The vertical axis should start at zero and be continuous, so that the heights on the graph stay proportional to the class interval frequencies.
Shapes of Graphed Frequency Distributions
Rectangular distribution: There is an equal number of cases in all class intervals.
Skewed distributions: One tail slants to the right (positively skewed) or to the left (negatively skewed).
A negatively skewed distribution results, for example, if the participants are given a very easy test.
A positively skewed distribution results if the test is very hard.
Bimodal distribution: There are two peaks or humps, each with the same maximum frequency.
Multimodal distribution: A graph with three or more humps, each with the same maximum frequency.
Bell-shaped distribution: A specific type of bell-shaped distribution, called the normal curve, is of great importance in statistical inference.
Kurtosis: Refers to the degree of peakedness of a graphed distribution.
Normal distribution: Mesokurtic.
Distribution flatter than the normal curve: Platykurtic.
Distribution more peaked than the normal curve: Leptokurtic.
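Skewness and kurtosis can also be expressed as numbers. The notes do not give formulas, so the moment-based indices below are an assumption: they are one common choice. The data and function names are hypothetical. With this index, skewness is negative for a left tail and positive for a right tail, and excess kurtosis is near 0 for a normal (mesokurtic) curve, negative for platykurtic, and positive for leptokurtic distributions.

```python
def central_moment(data, k):
    """Average of the k-th powers of deviations from the mean."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def skewness(data):
    return central_moment(data, 3) / central_moment(data, 2) ** 1.5

def excess_kurtosis(data):
    return central_moment(data, 4) / central_moment(data, 2) ** 2 - 3

# Hypothetical scores out of 10, invented for illustration.
easy_test = [10, 9, 9, 8, 8, 8, 7, 5, 2]   # scores pile up at the top
hard_test = [0, 1, 1, 2, 2, 2, 3, 5, 8]    # scores pile up at the bottom
rectangular = [1, 2, 3, 4, 5, 6]           # equal frequency for every value

print(skewness(easy_test))           # negative: tail slants to the left
print(skewness(hard_test))           # positive: tail slants to the right
print(excess_kurtosis(rectangular))  # negative: flatter than the normal curve
```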