Leary, Chapter 6: Descriptive Research

Types of Descriptive Research

Descriptive research comprises several approaches that psychologists and other behavioral scientists use to describe the characteristics, thoughts, feelings, and behaviors of groups. Three commonly distinguished kinds are survey research, demographic research, and epidemiological research. Survey research is by far the most common type and aims to describe people’s attitudes, lifestyles, behaviors, and problems. Its data can be gathered via questionnaires, interviews, or observational techniques, and surveys may be conducted face-to-face, by phone, by mail, or over the Web. Importantly, a survey is not merely a questionnaire: a questionnaire is a data-collection instrument, whereas a survey is the broader research design, which may use questionnaires, interviews, or observations. Surveys typically involve a cross-sectional design, in which a single group or a cross-section of the population is studied at one point in time.

In many survey studies, respondents provide information about themselves by completing a questionnaire or answering interview questions (see Chapter 4 for questionnaires vs interviews). Some surveys are administrative or organizational and assess attitudes or experiences across different groups. Another key feature is the cross-sectional design, which provides a snapshot of characteristics for a population or groups within it; however, researchers may study changes over time using other survey designs, as described below.

Cross-sectional Survey Design, Successive Independent Samples, and Longitudinal (Panel) Surveys
A cross-sectional survey design surveys a single sample of respondents at one point in time. When that sample includes different groups (e.g., age cohorts, geographic regions), researchers can compare the groups to infer differences in attitudes or behaviors. A successive independent samples design draws two or more samples from the population at different times and asks each sample the same questions. Although the samples consist of different individuals, conclusions about changes in attitudes or behavior over time can be drawn if the samples are comparable and selected in the same way. For example, since 1939 the Gallup Organization has asked successive independent random samples of Americans whether they attended a church or synagogue service in the last seven days; Table 6.1 shows the percentage attending weekly over several decades, a result indicating remarkable stability across large spans of time. The validity of this design hinges on the samples being comparable and drawn in precisely the same manner each time, so that observed changes reflect true population changes rather than sample differences.

A familiar problem in interpreting successive independent samples is that observed changes may reflect differences in the composition of later samples rather than real changes in the population. An oft-cited example is changes in standardized test scores over time. Scores in a given grade may appear to rise or fall over several years, but if the samples differ (e.g., more test-takers from states that require all students to take the test, many of whom did not take college-prep courses), the mean can shift for reasons unrelated to school quality. Figure 6.2 (ACT scores, 1998–2004) illustrates this issue: a decline in mean scores from 2002 to 2004 may partly reflect a larger, more diverse, or differently prepared group taking the test in those years. Therefore, interpreting changes over time in successive independent samples requires caution and careful attention to sample comparability.
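
The composition problem can be made concrete with a small sketch. The subgroup sizes and means below are hypothetical values chosen only for illustration (they are not ACT data). Each subgroup's mean stays fixed across the two years, yet the overall mean falls because the mix of test-takers changes.

```python
def overall_mean(groups):
    """Weighted mean of subgroup means; groups is a list of (n, mean) pairs."""
    total_n = sum(n for n, _ in groups)
    return sum(n * m for n, m in groups) / total_n

# Hypothetical numbers: (number of test-takers, subgroup mean score)
year_1 = [(900, 22.0), (100, 18.0)]   # mostly college-prep students
year_2 = [(900, 22.0), (400, 18.0)]   # more non-college-prep students tested

mean_1 = overall_mean(year_1)
mean_2 = overall_mean(year_2)

# The subgroup means are identical in both years; only the mix differs.
print(f"Year 1 overall mean: {mean_1:.2f}")
print(f"Year 2 overall mean: {mean_2:.2f}")
```

In this sketch the drop in the overall mean says nothing about how well either subgroup performed, which is why sample comparability has to be checked before a change is interpreted.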

In a longitudinal or panel survey design, the same group of respondents is questioned more than once. This design is powerful for studying true changes in individuals’ attitudes or behaviors over time. However, attrition poses a major problem: participants may move, die, or simply refuse to participate in follow-ups, which makes the follow-up sample no longer equivalent to the initial sample. If attrition is non-random, observed changes may reflect changes in who remains in the study rather than true changes in the population.

Internet Surveys: Opportunities and Limitations
With more people online, researchers increasingly collect data via Internet surveys (e-surveys). Advantages include low cost, automated data entry, access to hard-to-reach populations, and the convenience for respondents of answering when they choose. Disadvantages include reduced control over sample selection (sampling bias), uncertainty about who actually completes the survey, and the possibility of respondents answering more than once. Internet surveys are still maturing, and researchers are developing methods to address these issues (Anderson & Kanuka, 2003).

Demographic Research

Demographic research describes patterns of basic life events and experiences such as birth, marriage, divorce, employment, migration, and death. It addresses questions like why people have the number of children they do, the socioeconomic predictors of death rates, reasons for moving, and predictors of divorce. Although demographers and sociologists typically conduct most demographic research, psychologists and other behavioral scientists sometimes contribute to demography when they are interested in the psychological processes that underlie major life events. For example, a psychologist might explore demographic variables that predict differences in family size, marriage patterns, or divorce rates among groups. Demographic research is also used to forecast societal changes requiring government attention or new programs, as illustrated by the case study on population growth.

In the case study on Predicting Population Growth, Olshansky, Goldman, Zheng, and Rowe (2009) used demographic data to forecast future longevity and population composition in the United States. They predicted that the total population would rise from just over 310 million to between 411 and 418 million by 2050, with the 65+ population increasing from about 40 million to over 100 million and the 85+ population rising from under 6 million to about 30 million. Their models suggested that previous government projections had underestimated elderly population growth, highlighting implications for social security, Medicare, taxes, gerontology and geriatric psychology needs, and housing and aging-in-place services.

Epidemiological Research

Epidemiological research studies the occurrence of disease and death in groups of people. It is primarily conducted by medical and public health researchers, but it also informs health psychology because many illnesses and injuries are linked to behavior and lifestyle, and because descriptive epidemiology documents the distribution and prevalence of psychological disorders. Epidemiology distinguishes between prevalence (the proportion of a population with a disease or disorder at a given time) and incidence (the rate of new cases over a specified period). Descriptive epidemiology describes how frequently problems occur in populations and helps identify groups at risk and target interventions. For example, National Institute of Mental Health (NIMH) data from 2004 show, among other findings, that 32,439 people died by suicide in the United States in that year, with higher rates among white men over 65 and a substantial number of suicides among 15- to 24-year-olds. Such descriptive epidemiological data inform future research and guide mental health program targeting.
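
The difference between prevalence and incidence can be sketched with a short calculation. All counts below are hypothetical, and incidence is simplified here to the proportion of at-risk people who become new cases during one year; the function names are illustrative only.

```python
def prevalence(existing_cases, population):
    """Proportion of the population that has the condition at a given time."""
    return existing_cases / population

def incidence(new_cases, population_at_risk):
    """Proportion of at-risk people who become new cases during the period."""
    return new_cases / population_at_risk

# Hypothetical community (not data from the chapter)
population = 10_000
existing_cases = 500          # people who already have the disorder
new_cases_this_year = 190     # people who develop it during the year

point_prevalence = prevalence(existing_cases, population)
# People who already have the disorder cannot become new cases,
# so they are excluded from the at-risk denominator.
annual_incidence = incidence(new_cases_this_year, population - existing_cases)

print(f"Prevalence: {point_prevalence * 1000:.0f} per 1,000")
print(f"Incidence:  {annual_incidence * 1000:.0f} per 1,000 at risk per year")
```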

Describing and Presenting Data

Descriptive data must be presented accurately, concisely, and in an easily understandable form. Descriptive statistics can be numerical or graphical. The goal is to summarize and convey the results without distorting the underlying data. Three criteria govern useful descriptions: accuracy (faithfulness to the data), conciseness (clear and digestible), and understandability (readers can grasp the main message).

Numerical versus graphical descriptions: Researchers present data with numerical summaries (percentages, means, etc.) and/or graphical representations (histograms, bar graphs, frequency polygons). Both types have advantages and limitations. Researchers choose the most informative and straightforward representation given the data and the audience.

Frequency Distributions

A frequency distribution summarizes raw data by the number of scores falling into each category. Two common forms are simple and grouped frequency distributions.

Simple frequency distribution: Lists each possible score and the frequency of occurrences. Example: data from 168 university students on the number of friends they have, with a range from 1 to 40 and a most frequent value of 7.
Grouped frequency distribution: Used when there are many possible scores. The data are grouped into class intervals of equal size (e.g., 1–5, 6–10, 11–15, etc.). Three key features apply: (1) class intervals are mutually exclusive, (2) they cover all possible scores, and (3) all intervals have the same size. Relative frequency is the proportion of scores in each class interval, calculated as f/N, where f is the class frequency and N is the total number of scores. For the 1–5 interval in Table 6.3, the relative frequency is 31/168 ≈ .185, or 18.5%.
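
A grouped frequency distribution can be sketched in a few lines of code. The scores below are hypothetical (they are not the chapter's 168 responses), and the helper name `grouped_frequencies` is illustrative. The sketch assumes integer scores no smaller than the first interval's lower limit, and it builds equal-width, mutually exclusive intervals that cover every score.

```python
from collections import Counter

def grouped_frequencies(scores, width, start=1):
    """Return ((low, high), f, relative_f) rows for equal-width class intervals."""
    n = len(scores)
    # Each score falls into exactly one interval, identified by its index.
    counts = Counter((s - start) // width for s in scores)
    rows = []
    for k in range(max(counts) + 1):
        low = start + k * width
        high = low + width - 1
        f = counts.get(k, 0)          # empty intervals get a frequency of 0
        rows.append(((low, high), f, f / n))
    return rows

# Hypothetical number-of-friends scores
scores = [1, 3, 4, 6, 7, 7, 7, 9, 10, 12, 13, 15, 18, 22, 40]
table = grouped_frequencies(scores, width=5)

for (low, high), f, rel in table:
    print(f"{low:>2}-{high:<2}  f={f:<2}  relative f={rel:.3f}")

# The relative frequency quoted in the text for the 1-5 interval
print(round(31 / 168, 3))
```

Because every score is counted once, the frequencies sum to N and the relative frequencies sum to 1.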

Frequency distributions can be depicted graphically as histograms, bar graphs, or frequency polygons. Histograms are used for interval or ratio data (continuous on the x-axis) and have touching bars to indicate continuity. Bar graphs are used for nominal or ordinal data and have gaps between bars to indicate discontinuity. Frequency polygons connect the class-interval frequencies with a line, typically used for interval/ratio data.

Measures of Central Tendency

Frequency distributions can also be summarized with measures of central tendency: the mean, the median, and the mode.

The mean (average) is calculated as
\bar{y} = \frac{1}{n} \sum_{i=1}^{n} y_i
The mean is the most commonly used measure but can be misleading if the data are skewed or contain outliers. For example, in Table 6.2 the mean is 12.2, but most respondents reported fewer friends than that, illustrating how the mean can be pulled by a few high outliers (e.g., 39–40 friends).
The median is the middle score when data are rank-ordered; it is the value below which 50% of the scores fall. When n is odd, it is the middle score; when n is even, the median is the average of the two middle scores. The median is less affected by extreme scores than the mean.
The mode is the most frequent score. If all scores are different, there is no mode. Some distributions have more than one mode (bimodal, multimodal).
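
The three measures can be compared on a small set of scores. The data below are hypothetical and were chosen to include two high outliers, so the sketch shows the mean being pulled upward while the median and mode stay with the bulk of the scores.

```python
import statistics

# Hypothetical number-of-friends scores with two high outliers (39 and 40)
friends = [3, 5, 6, 7, 7, 7, 8, 9, 10, 39, 40]

mean = statistics.mean(friends)
median = statistics.median(friends)      # middle score of the rank-ordered data
modes = statistics.multimode(friends)    # list of the most frequent score(s)

print(f"mean = {mean:.1f}, median = {median}, mode(s) = {modes}")

# With an even number of scores, the median is the average of the two middle scores.
even_median = statistics.median([2, 4, 6, 8])
print(even_median)
```

Here most respondents report fewer friends than the mean suggests, which is the pattern the text describes for skewed data.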

Presenting Means in Tables and Graphs

When reporting many means (e.g., across samples, groups, and variables), a table or a graph often presents the data more clearly than prose. For example, Löckenhoff et al. (2009) reported means for nine dimensions across 26 countries (234 means) in a table to reveal patterns. Graphs with means and error bars can effectively show trends, such as reading test scores by homework time across two age groups.

Error Bars and Confidence Intervals

Graphs of means sometimes include error bars, typically representing 95% confidence intervals (CIs). A CI describes the range around the sample mean within which the true population mean would fall with a specified probability if the study were repeated many times. If the same study were repeated 100 times, the true population mean would fall within the 95% CIs in about 95 of the samples. Narrow CIs indicate more precise estimates of the population mean, while wide CIs indicate more uncertainty. In Figure 6.7 (weight gain by gender), the bars show mean weight gain with I-shaped error bars representing the 95% CI for each gender.
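
The endpoints of such an error bar can be computed as the sample mean plus or minus a critical value times the standard error. The sketch below is only an approximation under stated assumptions: the weight-gain scores are hypothetical, the function name `mean_ci` is illustrative, and a normal (z) critical value is used for simplicity, whereas a t critical value is usually preferred for small samples.

```python
import math
import statistics

def mean_ci(scores, confidence=0.95):
    """Approximate confidence interval for a mean using a normal critical value."""
    n = len(scores)
    m = statistics.mean(scores)
    se = statistics.stdev(scores) / math.sqrt(n)   # standard error of the mean
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return m - z * se, m + z * se

# Hypothetical weight-gain scores
gains = [2.0, 3.5, 1.0, 4.0, 2.5, 3.0, 5.0, 1.5, 2.5, 3.0]
low, high = mean_ci(gains)
print(f"mean = {statistics.mean(gains):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

A larger sample or less variable scores would shrink the standard error and therefore narrow the interval.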

Measures of Variability

Beyond the average, variability indicates how spread out the scores are around the mean. The range is the difference between the largest and smallest scores but is often insufficient because it relies on only two extreme values. The variance and its square root, the standard deviation, incorporate all scores and provide a more informative measure of dispersion.

Variance (s^2) is approximately the average squared deviation from the mean: the sum of the squared deviations divided by n - 1, calculated as
s^2 = \frac{\sum_{i=1}^{n} (y_i - \bar{y})^2}{n-1}.
This is expressed in squared units and is mainly useful for statistical analyses and comparisons.
The standard deviation (s) is the square root of the variance and is expressed in the same units as the data, making it easier to interpret. If the variance is 1.47, then
s = \sqrt{s^2} = \sqrt{1.47} \approx 1.21.
The normal distribution (Gaussian) is a bell-shaped curve in which most scores cluster near the mean and fewer scores appear toward the tails. When data are approximately normal, about 68% of scores fall within one standard deviation of the mean, about 95% within two standard deviations, and about 99.7% within three standard deviations (the 68-95-99.7 rule). For an IQ test with a mean of 100 and a standard deviation of 15, 68% lie between 85 and 115, and 95% between 70 and 130.
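
The 68-95-99.7 rule can be checked numerically for the IQ example. This is a minimal sketch using the normal distribution from Python's standard library; the helper name `share_within` is illustrative.

```python
from statistics import NormalDist

# IQ example from the text: mean 100, standard deviation 15
iq = NormalDist(mu=100, sigma=15)

def share_within(dist, k):
    """Proportion of a normal distribution lying within k SDs of its mean."""
    return dist.cdf(dist.mean + k * dist.stdev) - dist.cdf(dist.mean - k * dist.stdev)

for k in (1, 2, 3):
    low = iq.mean - k * iq.stdev
    high = iq.mean + k * iq.stdev
    print(f"within {k} SD ({low:.0f} to {high:.0f}): {share_within(iq, k):.4f}")
```
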
Z-scores (standard scores) describe how far a score is from the mean in standard deviation units. The formula is
z = \frac{y_i - \bar{y}}{s}.
A z-score of -1.00 means the score is one standard deviation below the mean; a z-score of +2.90 means the score is about 3 standard deviations above the mean. Z-scores can be used to identify outliers (e.g., z < -3 or z > +3).
Standardization converts a distribution to a standard form with a mean of 0 and a standard deviation of 1, facilitating comparisons across different scales.
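
Standardization and outlier screening can be sketched as follows. The scores are hypothetical, the function name `z_scores` is illustrative, and the cutoff of 3 standard deviations follows the rule of thumb mentioned above.

```python
import statistics

def z_scores(scores):
    """Convert raw scores to z-scores using the sample mean and standard deviation."""
    m = statistics.mean(scores)
    s = statistics.stdev(scores)
    return [(y - m) / s for y in scores]

# Hypothetical scores: nineteen values near 10 and one extreme value (40)
scores = [8, 9, 9, 10, 10, 10, 10, 11, 11, 12,
          9, 10, 11, 10, 9, 10, 11, 10, 10, 40]
z = z_scores(scores)

# After standardization the distribution has a mean of 0 and an SD of 1.
print(round(statistics.mean(z), 6), round(statistics.stdev(z), 6))

# Flag scores more than 3 standard deviations from the mean.
outliers = [y for y, zi in zip(scores, z) if abs(zi) > 3]
print(outliers)
```

One design note: in very small samples no score can reach a z of 3, because an extreme score also inflates the standard deviation, so the cutoff is most useful with larger samples.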

Developing Your Research Skills: Calculating Variance and Standard Deviation by Hand

Although computers handle most statistics, some learners benefit from manual calculations. Organize the sample in two columns: one for the raw scores (y_i) and one for the squared scores (y_i^2). The steps are:
1) Square each score and sum the squared scores to obtain \sum y_i^2.
2) Sum the raw scores to obtain \sum y_i, then square that sum to obtain (\sum y_i)^2.
3) Apply the formula
s^2 = \frac{\sum (y_i - \bar{y})^2}{n-1} = \frac{\sum y_i^2 - \frac{(\sum y_i)^2}{n}}{n-1}.
From the example in the chapter, the data yield a variance of 1.47 and a standard deviation of about 1.21. The Z-score example introduces the relationship between raw scores, the mean, the standard deviation, and standardization.
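
The hand calculation can be mirrored in code to confirm that the definitional and computational formulas agree. The six scores below are an illustrative set chosen because they reproduce the variance of 1.47 and standard deviation of about 1.21 quoted above; whether they are the chapter's own scores is an assumption.

```python
import math

def variance_definitional(scores):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(scores)
    mean = sum(scores) / n
    return sum((y - mean) ** 2 for y in scores) / (n - 1)

def variance_computational(scores):
    """Hand-calculation form: (sum of y^2 - (sum of y)^2 / n) / (n - 1)."""
    n = len(scores)
    sum_y = sum(scores)                      # sum of the raw-score column
    sum_y_sq = sum(y ** 2 for y in scores)   # sum of the squared-score column
    return (sum_y_sq - sum_y ** 2 / n) / (n - 1)

# Illustrative scores (assumed, see the note above)
scores = [4, 1, 2, 2, 4, 3]

var_def = variance_definitional(scores)
var_comp = variance_computational(scores)
sd = math.sqrt(var_def)

print(f"variance = {var_def:.2f} (definitional), {var_comp:.2f} (computational)")
print(f"standard deviation = {sd:.2f}")
```

The computational form avoids computing each deviation separately, which is why it is convenient for work by hand.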

Pathological Video-Game Use: A Descriptive Study Example

Gentile (2009) used a stratified random online sample of 588 boys and 590 girls (ages 8–18) to study video-game use and pathology indicators. The sample was large enough to yield a margin of error of ±3% at the 95% confidence level. Overall, 88% of respondents played video games at least occasionally. Boys played more hours per week on average (M = 16.4, SD = 14.1) than girls (M = 9.2, SD = 10.2). Adolescents tended to play fewer sessions per week as they aged, but longer per session, so total weekly use remained relatively stable. Among identified “pathological gamers,” the average was 24.6 hours per week (SD = 16). A table (Table 6.4) lists 11 symptoms of pathological use and the percentage of boys and girls who endorsed each symptom. Examples include: “Need to spend more time or money on video games to feel the same excitement” (boys 12%, girls 3%), “Spent too much money on video games” (13% vs 4%), “Play to escape problems” (29% vs 19%), and “Skip doing homework to play” (29% vs 15%). Figure 6.13 shows average hours per week by age, illustrating how use changes with age and highlighting gender differences in total usage. The report notes considerations about sampling, measurement, and interpretation, including how to present descriptive data in tables or graphs and how to calculate and interpret variability.
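
The reported margin of error can be roughly checked from the sample size. This sketch assumes simple random sampling and the most conservative proportion (p = 0.5); the study used a stratified sample, so the calculation is only an approximation and not necessarily the method behind the reported figure.

```python
import math
from statistics import NormalDist

def margin_of_error(n, p=0.5, confidence=0.95):
    """Approximate margin of error for a proportion under simple random sampling."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    return z * math.sqrt(p * (1 - p) / n)

n = 588 + 590            # boys plus girls in the sample
moe = margin_of_error(n)
print(f"n = {n}, margin of error = ±{moe * 100:.1f} percentage points")
```

Under these assumptions the result rounds to the ±3% stated for the study.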

Summary of Key Concepts

1) Descriptive research aims to describe characteristics or behaviors of a population in a systematic and accurate fashion.
2) Survey research is the most common descriptive method; it uses questionnaires or interviews and often employs cross-sectional designs, though other survey designs exist.
3) Demographic research describes patterns of life events (birth, marriage, divorce, employment, migration, death) and can be used to forecast societal changes; psychologists may contribute to understanding the processes behind demographic patterns.
4) Epidemiological research studies disease and health-related behaviors, focusing on prevalence and incidence of conditions and the distribution of risk factors across groups.
5) Descriptive data must be accurate, concise, and understandable, using numerical or graphical descriptions.
6) Data can be summarized using numerical methods (means, medians, modes, percentages) or graphical methods (histograms, bar graphs, frequency polygons).
7) Frequency distributions start with raw data and can be simple or grouped; grouped distributions use class intervals of equal size and must be mutually exclusive and cover all scores.
8) Histograms are used for interval/ratio data with touching bars; bar graphs are used for nominal/ordinal data with separated bars; frequency polygons connect the frequencies with a line.
9) Measures of central tendency include the mean, median, and mode; the mean is most common but can be distorted by skew/outliers.
10) The presentation of means may be in tables or graphs; when there are many means, a table can clarify patterns.
11) Error bars (often representing 95% confidence intervals) convey the precision of mean estimates.
12) Measures of variability (range, variance, standard deviation) describe how spread out the data are; the standard deviation is generally more interpretable than the range.
13) The normal distribution shows most scores near the mean; about 68% lie within 1 SD, about 95% within 2 SDs, and about 99.7% within 3 SDs.
14) Z-scores describe how far a score is from the mean in SD units and help identify outliers and standardize comparisons.

Key Terms (selected)

bar graph
class interval
confidence interval
cross-sectional design
demographic research
descriptive research
epidemiological research
frequency distribution
frequency polygon
grouped frequency distribution
histogram
Internet surveys
longitudinal (panel) survey design
mean
measures of central tendency
measures of variability
median
mode
negatively skewed distribution
normal distribution
numerical method
outlier
panel survey design
positively skewed distribution
range
relative frequency
simple frequency distribution
standard deviation
variance
z-score
successive independent samples survey design
qualitative terms related to correlation (to be addressed in Correlational Research)

Questions for Review (selected topics)

1. How does descriptive research differ from other research strategies such as correlational, experimental, and quasi-experimental designs?
2. What is the most common type of survey research design?
3. A successive independent samples survey design is used to examine changes in attitudes or behaviors over time, but results from such designs are often difficult to interpret. Describe the design and discuss why it is sometimes hard to draw clear conclusions.
4. How does