Alternatives to Experimentation

Correlational and Quasi-Experimental Designs

Overview:
- Nonexperimental designs are discussed, focusing on correlational and quasi-experimental methods.
- Correlational designs establish relationships among pre-existing behaviors and predict behaviors.
- Quasi-experimental designs are used when researchers cannot manipulate or control antecedent conditions.

Correlational Designs

Correlational Designs:
- Used to establish relationships among pre-existing behaviors.
- Can be used to predict one set of behaviors from others (e.g., predicting college grades from entrance exam scores).
- Can show relationships between antecedent conditions and behavioral effects (e.g., smoking and lung cancer).
- Antecedents are pre-existing and not manipulated or controlled by the researcher.
- Advanced methods (path analysis, cross-lagged panel designs) propose cause-and-effect relationships by developing causal models.
- Establishing cause-and-effect relationships conclusively is difficult using correlational techniques.
Causal Models:
- Causal models can be constructed from correlation-based designs.
Interpretation of Results:

Quasi-Experimental Designs

Quasi-Experimental Designs:
- Used when researchers cannot manipulate or control antecedent conditions.
- Lack essential elements of true experiments, such as manipulation of antecedents or random assignment.
- Subjects are selected for different conditions based on pre-existing characteristics.
- Used to compare behavioral differences associated with different types of subjects (e.g., normal vs. schizophrenic children).
- Also used in naturally occurring situations (e.g., one- vs. two-parent homes) or common/unusual events (e.g., birth of sibling, surviving a hurricane).
- Experimenter studies pre-existing antecedent conditions by selecting subjects based on the characteristic or circumstance.
Manipulation of Antecedent Conditions:
- Frequently not an option for researchers.
- Many behaviors of interest to psychologists are pre-existing conditions (e.g., childhood schizophrenia).
- Quasi-experimentation can increase understanding of environmental, biological, cognitive, and genetic attributes of behavioral disorders.
- Offers more systematic control than nonexperimental designs and can be used in various research settings.
Random Assignment:
- Quasi-experimental designs are used when subjects cannot be assigned at random to receive different experimental manipulations or treatments.
- Example: Comparing fluorescent and incandescent lighting on worker productivity in two manufacturing companies where workers are pre-assigned.
Confounding:
- Confounding occurs when other antecedents vary along with the one being studied, so cause cannot be established with certainty.
- Unless other antecedents are carefully controlled, the experiment will not be high in internal validity.
Internal Validity:
- Ability to establish a causal relationship between antecedent conditions (treatments) and observed behavior.
- High internal validity means confidence that the specified antecedents caused the observed differences.
- Quasi-experiments explore consistent differences between pre-existing groups or compare treatments in nonrandom groups, but cause cannot be established with confidence.
External Validity:
- Correlational designs and quasi-experiments tend to be higher in external validity (generalizability) than laboratory experiments.
Manipulation of Antecedents & Imposition of Units:
- Correlations are low in manipulation of antecedents.
- Quasi-experiments vary in the degree of manipulation of antecedents but are not considered high without random assignment.
- Both correlational and quasi-experimental designs tend to be high in the imposition of units (restrict subjects' responses).
- Researchers are typically interested in specific kinds of information from each subject.
- Both methods rely on statistical data analyses to evaluate the significance or importance of results objectively.
- Some designs use correlational analyses; others use inferential statistics.
- Some designs use sophisticated correlational techniques to create causal models but have limited internal validity.

Correlation

Correlation as a Research Method:
- Correlation is a statistical summary of observations and is common in nonexperimental studies.
- It is used with both laboratory and field data.
Ethical and Practical Reasons:
- Some questions cannot be answered experimentally for practical and ethical reasons (e.g., long-term effects of TV violence on aggressiveness).
Correlational Study Definition:
- A correlational study determines the correlation, or degree of relationship, between two traits, behaviors, or events.
- Changes in one are associated with changes in another.
- Researchers explore behaviors that are not yet well understood and seek possible explanations for behaviors by measuring many variables.
Variables:
- A variable is any observable behavior, characteristic, or event that can vary or have different values.
Data & Hypothesis:
- Correlational data may serve as the basis for new experimental hypotheses.
How to conduct:
- Selected traits or behaviors of interest are measured first, and numbers (scores) representing the measured variables are recorded.
- The degree of relationship, or correlation, between the numbers is determined through statistical procedures.
- The researcher measures events without attempting to alter the antecedent conditions.
Predictions:
- Once the correlation is known, it can be used to make predictions.
- If we know a person’s score on one measure, we can make a better prediction of that person’s score on another measure that is highly related to it.
- The higher the correlation, the more accurate our prediction will be.
Example: Television Viewing and Vocabulary:
- The researcher gathers data to determine the relationship between television viewing and vocabulary size.
- An objective measure of vocabulary is devised, and daily television viewing time is measured.
- The degree of relationship, or correlation, between the two measures is then evaluated through statistical procedures.

Simple Correlation

Simple Correlations:
- Relationships between pairs of scores from each subject.
- Pearson Product Moment Correlation Coefficient (r) is most commonly used.
- Values range between -1.00 and +1.00.
Correlation Coefficient:
- $-1.00 \leq r \leq +1.00$
- Sign indicates positive or negative direction.
- Absolute value indicates strength.
Number Line:
- -1.00 ----- -.50 ----- 0 ----- +.50 ----- +1.00
- Correlation coefficients are conventionally reported to two decimal places.
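
To make the computation of r concrete, here is a minimal sketch in Python. The function name `pearson_r` and the six pairs of scores are hypothetical, invented for illustration; they are not data from any study cited in these notes.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation for paired scores (illustrative helper)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 pairs")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations, and sums of squared deviations.
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when one variable has no variability")
    return sp / math.sqrt(ss_x * ss_y)

# Hypothetical data: weekly TV hours and vocabulary scores for six subjects.
tv_hours = [10, 14, 18, 22, 26, 30]
vocab = [82, 78, 75, 70, 66, 61]

r = pearson_r(tv_hours, vocab)
print(f"r = {r:.2f}")  # the sign gives the direction, the absolute value the strength
```

With these made-up scores the sign of r is negative (more viewing goes with lower vocabulary scores), and the value always falls between -1.00 and +1.00.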

Scatterplots

Scatterplots:
- Visual representations of scores for each subject.
- Each dot represents one subject with two scores.
- One score places the dot along the X (horizontal) axis, and the second score places the dot along the Y (vertical) axis.
- Give a rough indication of the direction and strength of the relationship.
Possible Correlation Outcomes:
- Positive Relationship: As viewing increased, vocabulary also increased.
- Negative Relationship: As viewing increased, vocabulary declined.
- No Strong Relationship: Dots form no particular pattern.
Regression Lines:
- Also known as lines of best fit.
- Illustrate the mathematical equation that best describes the linear relationship between the two measured scores.
- Direction corresponds to the direction of the relationship.
Positive Correlation:
- Means that the more a person watches television, the larger his or her vocabulary.
- Also called a direct relationship.
- If r = +1.00, we have a perfect positive correlation, and we can predict the value of one measure with complete accuracy if we know a subject’s score on the other measure.
Negative Correlation:
- Means that the more a person watches television, the smaller his or her vocabulary would be.
- Also called an inverse relationship.
- The direction of the relationship (positive or negative) does not affect our ability to predict scores.
Strength of the Relationship:
- Indexed by the absolute (or unsigned) value of r.
- A correlation of r = -.34 actually represents a stronger relationship than does r = +.16.
No Relationship:
- If r is near zero, our prediction may be no more accurate than any random guess.
Nonlinear Trends, Range Restriction, and Outliers:
- Scatterplots can help identify a nonlinear trend, range restriction, and outliers.
- Correlation coefficients can be strongly affected by these features of the data.
- Nonlinear Trend: Simple correlations use a General Linear Model, which assumes that the direction of the relationship between X and Y generally remains the same.

Curvilinear Relationship

Curvilinear data patterns (or other patterns in which the relationship changes direction) cannot be adequately captured by simple correlations; the rs would be very small (or even zero) because the data do not have a simple, straight-line relationship.
Range truncation:
- Artificial restriction of the range of values of X or Y.
- For example, age and shoe size are strongly and positively correlated in children between the ages of 4 and 16.
- If the age range is restricted to a narrow band (the boxed area of the scatterplot), the positive trend becomes very weak, much like the very weak correlation shown in Figure 5.1c. A correlation coefficient computed for this truncated range would likely be close to zero.
Standardized achievement tests:
- Have you ever wondered why standardized achievement tests are not better predictors of grades? Part of the answer appears to lie in range truncation.
Truncated Ranges:
- For example, the correlation between grades in graduate school and individuals’ Graduate Record Exam (GRE) scores is not very high (rs = .22 to .33; House & Johnson, 1998).
- Testing experts argue that one reason for this is that the ranges of both types of scores are truncated. First, graduate schools tend to admit only the top GRE scorers, resulting in an artificially narrow range of scores. Second, graduate school grades tend to be mostly A’s and B’s, a narrow range indeed. If we plotted data from such restricted ranges, the scatterplot would appear to show only a weak positive trend.
Outliers:
- Extreme scores can dramatically reduce the size of the correlation coefficient because they disturb the general linear trend of the data.
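
The three features described in this section (a nonlinear trend, range truncation, and an outlier) can each be demonstrated with a few lines of Python. This is only a sketch: the `pearson_r` helper is written for this illustration, and every data set is hypothetical, chosen to make each effect easy to see.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for paired scores (helper written for this illustration)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

# 1. Nonlinear trend: Y is perfectly determined by X (Y = X squared),
#    but the relationship changes direction, so r is essentially zero.
x_curve = [-3, -2, -1, 0, 1, 2, 3]
y_curve = [x ** 2 for x in x_curve]
r_curve = pearson_r(x_curve, y_curve)

# 2. Range truncation: hypothetical ages (4 to 16) and shoe sizes.
ages = list(range(4, 17))
shoes = [1, 2, 2, 3, 5, 4, 6, 5, 5, 7, 8, 8, 9]
r_full = pearson_r(ages, shoes)
# Keep only the children aged 8 to 12 and recompute.
pairs = [(a, s) for a, s in zip(ages, shoes) if 8 <= a <= 12]
r_truncated = pearson_r([a for a, _ in pairs], [s for _, s in pairs])

# 3. Outlier: six perfectly linear pairs, then one extreme seventh pair.
x_line = [1, 2, 3, 4, 5, 6]
y_line = [2, 3, 4, 5, 6, 7]
r_clean = pearson_r(x_line, y_line)
r_outlier = pearson_r(x_line + [7], y_line + [0])

print(f"curvilinear: r = {r_curve:.2f}")
print(f"full range: r = {r_full:.2f}; truncated range: r = {r_truncated:.2f}")
print(f"without outlier: r = {r_clean:.2f}; with outlier: r = {r_outlier:.2f}")
```

In each case the scatterplot would reveal the problem at a glance, which is why plotting the data before interpreting r is worthwhile.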

Correlational Studies

Usefulness of Correlational Studies:
- Researchers in every branch of psychology conduct correlational studies because they are useful and relatively easy to conduct.
- They have become indispensable in many areas that cannot be investigated using experimental approaches.
- For example, the link between smoking and many serious health problems was revealed from correlational studies.
Drawbacks:
- Cannot make causal inferences: Correlation does not imply causation.
- Even a perfect correlation (+1.00 or -1.00) does not indicate a causal relationship.
Causal Inferences:
- Cannot be made from correlational data even when a relationship exists between two measures.
- The fact that two measures are strongly related does not mean that one is responsible for the occurrence of the other.
Firmness of a Handshake:
- Firmness of handshake and positivity of first impressions were strongly correlated (r = .56), but other personality variables could have caused both.
Automobiles and Airplanes:
- Positive correlation between the number of automobiles and the number of airplanes in the world, but it would be illogical to say that automobiles cause airplanes or vice versa.
TV Violence and Aggressiveness:
- The causal direction between two variables cannot be determined by simple correlations.
- Cannot be certain which behavior is the cause and which is the effect.
- Cannot be certain that there is a causal relationship at all between the measured behaviors.

Causal Directions

Alternative Possibilities:
- Innate aggressiveness might determine a preference for violent TV—not the other way around.
- Innate aggressiveness results in more exposure to TV violence, but at the same time the more exposure a person has, the more aggressive he or she becomes (bidirectional causation).
- Some third agent may actually be causing the two behaviors to appear to be related (third variable problem).
- Example: A preference for violent TV and a tendency toward aggressiveness both result from an unknown or unmeasured variable, such as underactive autonomic nervous system functioning.

Coefficient of Determination

Calculation:
- Once we have calculated r, it is useful to compute the coefficient of determination (r^2).
- Estimates the amount of variability in scores on one variable that can be explained by the other variable.
Example: Handshake Study:
- Firmness of handshake and positivity of first impressions were correlated r = .56.
- The coefficient of determination is r^2 = .31.
- About 31% of all the fluctuation in subjects’ positivity scores can be accounted for by the firmness of the handshake.
- r^2 \geq .25 can be considered a strong association between two variables (Cohen, 1988).
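
The arithmetic is simple enough to check directly. The sketch below squares the handshake correlation and also compares r = -.34 with r = +.16 from the earlier discussion of strength; the function name is an assumption made for this illustration.

```python
def coefficient_of_determination(r):
    """Proportion of variability in one variable accounted for by the other."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1.00 and +1.00")
    return r ** 2

handshake_r2 = coefficient_of_determination(0.56)
print(f"handshake study: r^2 = {handshake_r2:.2f}")

# Squaring removes the sign, so strength depends only on the absolute value of r.
stronger = coefficient_of_determination(-0.34)
weaker = coefficient_of_determination(0.16)
print(f"r = -.34 gives r^2 = {stronger:.2f}; r = +.16 gives r^2 = {weaker:.2f}")
```

Because squaring discards the sign, the negative correlation of -.34 accounts for more variability than the positive correlation of +.16.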

Linear Regression Analysis

Linear Regression Analysis:
- Correlations can be used for prediction.
- When two behaviors are strongly related, the researcher can estimate a score on one of the measured behaviors from a score on the other.
- Technique is called linear regression analysis.
Regression Equation:
- A formula for a straight line that best describes the relationship between the two variables.
- It has both a slope (the direction and steepness of the line) and an intercept (the value on the Y, or vertical, axis when X = 0).
Calculations:
- To predict someone’s score on one variable (a vocabulary test) when we know only their score on the other variable (TV viewing time), we would need to know the value of r and be able to calculate subjects’ average scores (called means) on both variables and the standard deviations for both sets of scores.
- TV viewing time is designated as variable X; vocabulary scores are designated as variable Y.
- The new score we are trying to predict is labeled Y'.
Computational Formula:
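- $Y' = bX + a$
- $b = r \frac{s_Y}{s_X}$ is the slope of the regression line, where $s_X$ and $s_Y$ are the standard deviations of X and Y.
- $a = \bar{Y} - b\bar{X}$ is the intercept, where $\bar{Y}$ and $\bar{X}$ are the Y and X means.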
Calculations Example:
- TV viewing time and vocabulary scores are strongly correlated, with r = -.64.
- The mean for TV viewing time is 20 hours, and the standard deviation is 4.
- The mean score for the vocabulary test is 70, and the standard deviation is 5.
- We want to calculate an estimated score on the vocabulary test for an adult who watches 24 hours of TV.
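- Applying the computational formula, with $\bar{X}$ and $\bar{Y}$ as the means and $s_X$ and $s_Y$ as the standard deviations: $b = r \frac{s_Y}{s_X} = -.64 \times \frac{5}{4} = -0.8$ and $a = \bar{Y} - b\bar{X} = 70 - (-0.8)(20) = 86$.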
- Y' = -0.8(24) + 86 = 66.8
- The score predicted for this person (66.8) is lower than the average (70).
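
The same calculation can be written as a short Python sketch. The function name and argument names are assumptions made for this illustration; the numbers are the ones given in the example above.

```python
def regression_line(r, mean_x, sd_x, mean_y, sd_y):
    """Return (slope b, intercept a) of the line Y' = bX + a."""
    b = r * (sd_y / sd_x)      # slope: r times the ratio of standard deviations
    a = mean_y - b * mean_x    # intercept: predicted Y when X = 0
    return b, a

# Values from the TV viewing (X) and vocabulary (Y) example.
b, a = regression_line(r=-0.64, mean_x=20, sd_x=4, mean_y=70, sd_y=5)

hours_watched = 24
predicted_vocab = b * hours_watched + a
print(f"b = {b:.1f}, a = {a:.1f}, predicted score = {predicted_vocab:.1f}")
```

Note that a person who watches exactly the mean number of hours (20) is predicted to score exactly the mean on the vocabulary test (70), because the regression line always passes through the point defined by the two means.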

Multiple Correlation and Multiple Regression

Multiple Correlation:
- We can use a statistic known as multiple correlation, represented by R, to test the relationship of several predictor variables (X_1, X_2, X_3, …) with a criterion variable (Y).
- Logically, R is quite similar to r, but R allows us to use information provided by two or more measured behaviors to predict another measured behavior when we have that information available.
Computation:
- We can also compute R^2 to estimate the amount of variability in vocabulary scores that can be accounted for by viewing time and age considered together.
- This multiple correlation would tend to put a damper on the earlier hypothesis that watching TV increases vocabulary, wouldn’t it?
Partial Correlation:
- This analysis allows the statistical influence of one measured variable to be held constant while computing the correlation between the other two—the partial correlation.
- If age is an important third variable that is largely responsible for both increased television viewing and increased vocabulary, statistically controlling for the contribution of age should greatly decrease the correlation between television viewing time and vocabulary.
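
For three variables, a first-order partial correlation can be computed from the three simple correlations alone. The sketch below uses the standard formula; the three correlation values are hypothetical, chosen only to show how a sizable correlation can nearly vanish once age is held constant.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with Z statistically held constant."""
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    if denominator == 0:
        raise ValueError("undefined when Z is perfectly correlated with X or Y")
    return (r_xy - r_xz * r_yz) / denominator

# Hypothetical simple correlations (not from any study cited in these notes).
r_tv_vocab = 0.50   # TV viewing time with vocabulary
r_tv_age = 0.70     # TV viewing time with age
r_vocab_age = 0.70  # vocabulary with age

r_partial = partial_correlation(r_tv_vocab, r_tv_age, r_vocab_age)
print(f"simple r = {r_tv_vocab:.2f}, partial r (age held constant) = {r_partial:.2f}")
```

With these made-up values the partial correlation drops to nearly zero, the pattern we would expect if age were largely responsible for the relationship between viewing time and vocabulary.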
Multiple Regression Analysis:
- When more than two related behaviors are correlated, a multiple regression analysis can be used to predict the score on one behavior from scores on the others.
- We could use multiple regression analysis, for example, to predict vocabulary scores from TV viewing time and age.
- Regression equations determine the weight of each predictor, and we could simply report these weights (called beta weights) in a research report.
- Or we could use the weights in a path analysis, one advanced correlational method, to construct possible causal sequences for the related behaviors.
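
With exactly two predictors, the standardized beta weights and R^2 can be computed directly from the three simple correlations. The sketch below uses the standard two-predictor formulas; the function name and the correlation values are hypothetical, and real analyses with more predictors would normally rely on statistical software.

```python
import math

def two_predictor_regression(r_y1, r_y2, r_12):
    """Standardized beta weights and R^2 for predicting Y from X1 and X2."""
    if abs(r_12) >= 1:
        raise ValueError("predictors must not be perfectly correlated")
    beta_1 = (r_y1 - r_y2 * r_12) / (1 - r_12 ** 2)
    beta_2 = (r_y2 - r_y1 * r_12) / (1 - r_12 ** 2)
    r_squared = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_squared

# Hypothetical correlations: Y = vocabulary, X1 = TV viewing time, X2 = age.
beta_tv, beta_age, r_squared = two_predictor_regression(r_y1=0.50, r_y2=0.70, r_12=0.70)
multiple_r = math.sqrt(r_squared)

print(f"beta (TV) = {beta_tv:.2f}, beta (age) = {beta_age:.2f}")
print(f"R^2 = {r_squared:.2f}, R = {multiple_r:.2f}")
```

In this invented example nearly all of the predictive weight falls on age, and adding TV viewing time raises R^2 only slightly above what age alone explains (.70 squared, or .49).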

Causal Modeling

Sophisticated Research Designs:
- Designs based on advanced correlational techniques have become increasingly frequent in the literature as statistical software has become widely available.
Causal Models:
- Researchers have tools to speculate about whether watching TV violence causes aggressiveness, or whether more aggressive people just naturally gravitate toward programs containing more violent content.
- Researchers have tools for causal modeling in correlation-based designs, such as path analysis and cross-lagged panel designs.

Factor Analysis

Factor analysis:
- A common correlational procedure that is used when individuals are measured on a large number of items.
- Allows us to see the degree of relationship among many traits or behaviors at the same time.
- Such complicated statistical procedures have become common now that computers are available for statistical data analysis.
Factor Loadings:
- Are statistical estimates of how well an item correlates with each of the factors (they can range from -1.00 to +1.00, like other correlational statistics).
Data Reduction:
- For example, when researchers create a new questionnaire, or scale, to measure attitudes or personality, they often begin by testing many more items than they will eventually use in their scale.
- Factor analysis can help to determine which items seem to be measuring similar qualities, allowing researchers to drop some items that are less important or redundant (Kerlinger & Lee, 2000).

Path Analysis

Path Analysis:
- An important correlation-based research method that can be used when subjects are measured on several related behaviors.
- The researcher creates models of possible causal sequences.
- Example: Serbin and her colleagues (Serbin, Zelkowitz, Doyle, Gold, & Wheaton, 1990) were interested in trying to explain differences between boys’ and girls’ academic performance in elementary school.
Descriptive Method:
- Generates important information for prediction and can generate experimental hypotheses.
Limitations:
- The models can only be constructed using the behaviors that have been measured.
- If a researcher omits an important behavior, it will be missing in the model, too.
Path Models:
- Uses beta weights to construct path models, outlining possible causal sequences for the related behaviors.
- Computers can easily compare many multiple regression equations testing different paths, looking for the best model.
- The selected model can be further tested for “goodness of fit” to the actual data.
- It is not difficult to find path analyses in the literature that include multiple sets of interweaving paths linking as many as a dozen or more predictors (e.g., Pedersen, Plomin, Nesselroade, & McClearn, 1992).
Batson, Chang, Orr, and Rowland (2002):
- Investigated the effect of empathy on the amount of help someone was willing to give to stigmatized groups of people.
Researchers who use a path analysis approach are always very careful not to frame their models in terms of causal statements.
Internal Validity:
- It is low because it is based on correlational data.
- The direction from cause to effect cannot be established with certainty, and third variables can never be ruled out completely.

Cross-Lagged Panel Design

Cross-Lagged Panel Design:
- Another method used to create causal models.
- This design uses relationships measured over time to suggest the causal path.
- Subjects are measured at two separate points in time on the same pair of related behaviors or characteristics (the time “lag” can be quite a few years).
- The scores from these measurements are correlated in a particular way, and the pattern of correlations is used to infer the causal path.
Eron, Huesmann, Lefkowitz, and Walder (1972):
- Their study looked at the correlation between a preference for violent TV and aggressiveness in kids as they grew to young adulthood.
- Subjects were assessed once in the third grade and again 10 years later.
- The results of the cross-lagged panel indicated that it was more likely that TV violence caused aggressiveness than the other way around.
Huesmann, Moise-Titus, Podolski, and Eron (2003):
- The new study showed similar patterns of correlations across a time lag of about 15 years, indicating, once again, that the most likely direction of cause and effect was from early television violence viewing to later aggressive behavior.
The correlations along the two diagonals (see Figure 5.7) are the most important for determining the probable causal path because they represent effects across the time lag.
- If vocabulary size is the cause of TV viewing, we would expect that vocabulary size at age 3 and the amount of time spent watching TV at age 8 should be strongly correlated.
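
The comparison of the two diagonal correlations can be sketched in Python. Everything here is hypothetical: the helper functions, the variable names, and the scores for six children measured at ages 3 and 8. A larger diagonal correlation only suggests a more probable causal path; it does not establish one.

```python
import math

def pearson_r(xs, ys):
    """Pearson r for paired scores (helper written for this illustration)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sp = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ss_x = sum((x - mx) ** 2 for x in xs)
    ss_y = sum((y - my) ** 2 for y in ys)
    return sp / math.sqrt(ss_x * ss_y)

def cross_lagged(a_time1, b_time1, a_time2, b_time2):
    """Return the two diagonal (cross-lagged) correlations."""
    return {
        "A at time 1 with B at time 2": pearson_r(a_time1, b_time2),
        "B at time 1 with A at time 2": pearson_r(b_time1, a_time2),
    }

# Hypothetical scores: A = daily TV hours, B = vocabulary test score.
tv_age3 = [1, 2, 3, 4, 5, 6]
vocab_age3 = [10, 12, 11, 13, 12, 11]
tv_age8 = [3, 4, 4, 3, 4, 3]
vocab_age8 = [30, 28, 27, 24, 22, 20]

diagonals = cross_lagged(tv_age3, vocab_age3, tv_age8, vocab_age8)
r_tv3_vocab8 = diagonals["A at time 1 with B at time 2"]
r_vocab3_tv8 = diagonals["B at time 1 with A at time 2"]
for label, value in diagonals.items():
    print(f"{label}: r = {value:.2f}")
```

In this invented data set, early TV viewing is much more strongly related to later vocabulary than early vocabulary is to later TV viewing, so the path from viewing to vocabulary would be judged the more likely one.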

Quasi-Experimental Designs

Quasi-Experimental Designs:
- Can seem like a real experiment, but they lack one or more of its essential elements, such as manipulation of antecedents or random assignment to treatment conditions.
- Can be used to explore the effects of different treatments on preexisting groups of subjects or to investigate the same kinds of naturally occurring events, characteristics, and behaviors that we measure in correlational studies.
Goals:
- In correlational studies, we are looking for relationships or associations between variables, whereas in quasi-experiments, we are comparing different groups of subjects looking for differences between them, or we are looking for changes over time in the same group of subjects.
Researchers who want to compare people exposed to a naturally occurring event with a comparison group (usually unexposed people) often use a quasi-experiment, also called a natural experiment (Shadish, Cook, & Campbell, 2001).
Ganzel and her colleagues (Ganzel, Casey, Glover, Voss, & Temple, 2007) investigated emotional reactivity after the traumatic World Trade Center attacks in New York City on September 11, 2001.
Mellman and his colleagues (Mellman, David, Kulick-Bell, Hebding, & Nolan, 1995) conducted a natural experiment to compare sleep disorders in people who had survived Hurricane Andrew and people who had not been exposed to the hurricane.
The simplest quasi-experiments can look at gender differences in school-age children’s sleep patterns.
The point to remember is that when we conduct quasi-experiments, we can never know for certain what causes the effects we observe—so, relative to true experiments, we say these designs are low in internal validity.
An important difference between experiments and quasi-experiments is the amount of control the researcher has over the subjects who receive treatments.

Ex Post Facto Studies

Ex Post Facto Studies:
- A study in which the researcher systematically examines the effects of subject characteristics (often called subject variables) but without actually manipulating them.
- Researcher forms treatment groups by selecting subjects on the basis of differences that already exist.
- Ex post facto means “after the fact.” In effect, the researcher capitalizes on changes in the antecedent conditions that occurred before the study.
- The experimenter also has no direct control over who belongs to each of the treatment groups of the study.
- These studies generally fall in the low-high portion of Figure 3.1 from Chapter 3.
Membership:
Preexisting differences define membership in different treatment groups in the study: Hannah’s father died last year, so Hannah is placed in a group of subjects who have experienced the loss of a parent.
Advantages:
Like the correlational study, it deals with things as they occur. There is no manipulation of the conditions that interest the researcher.
The ex post facto researcher studies the extremes, the subjects who rank highest and lowest on the dimension of interest.
Systematically forming groups based on differences in preexisting characteristics is a critical feature of an ex post facto study, but it also prevents such a study from being classified as a true experiment.
True experiments, by contrast, use random assignment of subjects to create treatment groups in which any preexisting differences in people are distributed evenly across all the treatment groups.
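
Random assignment, the element that ex post facto studies lack, is easy to sketch in Python. The function name, the subject ID numbers, and the seed are assumptions made for this illustration; the seed is fixed only so that the example is reproducible.

```python
import random

def randomly_assign(subjects, n_groups, seed=None):
    """Shuffle the subjects, then deal them into n_groups treatment groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    # Dealing every n-th subject keeps group sizes as equal as possible.
    return [shuffled[i::n_groups] for i in range(n_groups)]

# Hypothetical subject ID numbers 1 through 20, split into two conditions.
groups = randomly_assign(range(1, 21), n_groups=2, seed=42)
for number, members in enumerate(groups, start=1):
    print(f"Group {number}: {sorted(members)}")
```

Because chance alone decides who lands in each group, preexisting characteristics have no systematic link to the treatment. In an ex post facto study the groups are instead defined by those very characteristics.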
Franklin, Janoff-Bulman, and Roberts (1990) were interested in studying the potential effects of divorce on attitudes and beliefs of children. Using an ex post facto design, they assessed whether college-age children of divorced parents held different beliefs about themselves and other people than did college students who came from intact families.
A strong word of caution:
All ex post facto studies are low in internal validity because there is always the possibility that other differences between the groups of subjects were the true cause of the effects.
The ex post facto approach enables us to explore many dimensions that we could not or would not choose to study experimentally.
The investigators studied cancer patients who had been identified as either repressors or non-repressors. Repressors are individuals who minimize negative emotional experiences.

Nonequivalent Groups Design

Nonequivalent Groups Design:
- The researcher compares the effects of different treatment conditions on preexisting groups of participants.
- The researcher cannot exert control over who gets each treatment because random assignment is not possible.
- In the lighting example comparing two companies, it would be a good idea to measure productivity levels in both companies before the study to rule out prior productivity levels as a plausible alternative explanation for the results.

Longitudinal Design

Longitudinal Designs:
Psychologists also use quasi-experiments to measure the behaviors of the same subjects at different points in time and look to see how things have changed. Here the specific question is often the influence of time on behaviors, rather than how different behaviors are related, as we saw in the cross-lagged panel design.
*Regression, in psychoanalytic theory, is a way of escaping the reality of a stressful situation by reverting to more childlike patterns of behavior.
Longitudinal studies can take place over periods of months, years, or even decades.
In one interesting longitudinal study, Wallerstein and Lewis (2004) were able to track a sample of children from divorced families for 25 years.

Cross-Sectional Studies

Instead of tracking the same group over a long span of time, subjects who are already at different stages are compared at a single point in time using a cross-sectional study.

Using different groups of subjects runs the risk that people in these groups might differ in other characteristics that could influence the behaviors you want to investigate.
On the other hand, a cross-sectional study will require more subjects; the more groups to be compared, the more subjects needed.
A cross-sectional design is similar to an ex post facto design.

Pretest/Posttest Design

Pretest/Posttest Design:
- Measure people’s level of behavior before and after the event and compare these levels.
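  For example, a researcher wants to know whether her training program improves students’ admissions test scores. Optimally, she would like to be able to give the training to only half the students so she could compare admissions test scores of students who received the training with scores of those who did not.
  When that is not possible, she decides to use a pretest/posttest design instead.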
Using this design, how confident can she be that her training program caused the improvement?
In a pretest/posttest design, particularly if the study extends beyond a single, brief research session, there are simply too many other things that can influence improvement.
Practice effects (also called pretest sensitization) cannot be ruled out.
It is clear that a pretest/posttest design lacks internal validity.
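
The before-and-after comparison itself is simple arithmetic, as the sketch below shows. The scores are hypothetical, and the point of the example is the limitation: the mean gain describes how much scores changed, not why they changed.

```python
# Hypothetical admissions test scores for the same five students.
pretest = [48, 52, 55, 60, 47]
posttest = [53, 55, 61, 62, 50]

# Gain score for each student: posttest minus pretest.
gains = [post - pre for pre, post in zip(pretest, posttest)]
mean_gain = sum(gains) / len(gains)

print(f"individual gains: {gains}")
print(f"mean gain: {mean_gain:.1f} points")
# Without a comparison group, this gain could reflect the training,
# practice with the test, or anything else that happened between the two tests.
```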
The pretest/posttest design has often been used to test the effects of foreseeable real-world events, such as attitude changes after a series of public service announcements airs on television or ticket sales before and after renovations are made to a concert hall.
When there is a long time between the pretest and the posttest, the researcher needs to be particularly aware that the event being assessed might not be the only cause of differences before and after the event.
Subjects could listen to the 45-minute tape, and the researcher could measure self-esteem again. If all other potential influences had been carefully controlled during the sessions, we would be somewhat more confident that an increase in self-esteem after the treatment was really attributable to the tape, rather than other outside influences.
When all four groups are included in the design, it is called a Solomon 4-group design (Campbell & Stanley, 1966).
Even so, it has been widely used in situations, particularly outside the laboratory, where a comparison group is impossible or unethical.
Despite these limitations, quasi-experimental approaches remain very important adjuncts to experimentation. The designs are summarized in Table 5.2 (Quasi-Experimental Designs).
Quasi-experimental designs can be extremely useful for showing relationships and for predicting behavioral differences among people.