Behavioral Science Comprehensive Final

CHAPTER 1

Various ways of acquiring knowledge
- Intuition
  - The ability to know something instinctively rather than through conscious reasoning or systematic observation
  - Accept unquestioningly what your own personal judgment or a single story (anecdote) about one person’s experience tells you
  - Problem - numerous cognitive and motivational biases affect our perceptions, and we may draw erroneous conclusions about cause and effect
  - Illusory correlation occurs when we focus on two events that stand out and occur together - likely to occur when we are highly motivated to believe in the causal relationship
- Authority
- Scientific method (approach) is the best
  - Requires more evidence than anecdotes and illusory correlations before conclusions can be drawn
  - Rejects the notion that one can accept on faith the statements of any authority; more evidence is needed to draw conclusions
  - Empiricism ~ use of objective, verifiable observations to answer questions and draw conclusions (idea that knowledge comes from observations)
  - Data play a central role
  - Report to other scientists who will follow up on findings by conducting research that replicates and extends observations
  - Good scientific ideas are testable, which means they can be supported or falsified by data
  - Peer-review ~ the process of judging the scientific merit of research through review by other scientists with the expertise to evaluate the research
Goals of science
- Description
  - Researchers are often interested in describing the ways in which events are systematically related to one another
- Prediction
  - Once events have been shown to be related to one another, predictions can be made and it becomes possible to make other, follow-on, predictions
- Determining causes
  - To know how to change behavior, we need to know causes
  - Test cause and effect
  - 3 types of evidence:
    - Temporal order in which cause precedes the effect
    - Covariation of cause and effect: when cause is present, effect occurs; when cause is not present, effect does not occur
    - Eliminating alternative explanations: nothing other than a causal variable could be responsible for the observed effect
- Explaining behavior
  - Understand why a behavior occurs
  - Under what conditions does it occur
Characteristics of true science
- Falsifiable - can be proven false
- Peer review - reviewed by someone else
- Empirical
What is pseudoscience
- The use of seemingly scientific terms and demonstrations to substantiate claims that have no basis in scientific research
- Example - facilitated communication
- Creates false hopes and makes promises that will not be fulfilled and techniques can be dangerous
Basic vs. applied research
- Basic - addresses fundamental questions about behavior
- Applied - addresses questions that have immediate practical implications

CHAPTER 2

Differences between hypotheses and predictions
- Hypothesis
  - A statement of the way in which variables are predicted to be related
  - A study can be designed to test it
  - A tentative idea/question waiting for evidence to support or refute it
  - No direction
- Prediction
  - A statement of the expected outcome of a research investigation
  - Follows directly from a hypothesis, is directly testable, and includes specific variables and methodologies
  - Assertion regarding a direction within a study
5 primary sources of ideas
- Common sense
  - Simple facts we learn throughout life
  - Things we all believe to be true
  - sayings/phrases that are common
- Practical problems
  - Tangible problems seen in society
- Observation
  - Events or the world
- Theories
  - Organize and explain facts or description of behavior
  - Generate new knowledge
- Past research
  - Become familiar with past research
Anatomy of a research paper
- Abstract
  - Overall summary of the research report
  - Includes hypothesis, procedure, and results info
- Intro/Lit. review
  - Explains the problem under investigation and specific hypotheses being tested
  - Review of past peer-reviewed work
- Methods
  - Describes in detail exact procedures used in the study
- Results
  - Findings are presented, usually in three ways: narrative form (writing), numerical form (stats), and tables/graphs
- Conclusion/Discussion
  - Why researcher thinks the results occurred

CHAPTER 3

Understand the basics of:
- Milgram’s experiment
  - Obedience to authority - fake electric shock to someone who was actually a part of the experiment
  - Milgram wanted to look at why Nazi Germany happened (why so many people went along with doing atrocious things)
- Zimbardo
  - Stanford Prison Experiment/Experience
  - Wanted to know if people would ‘turn bad’ even if they knew it was fake - does environment/social situations affect people
  - The ‘guards’ became different people
- Tuskegee
  - Tuskegee Syphilis Study
  - Withheld syphilis treatment from a group of Black men until they died, then autopsied their bodies
  - Government and hospitals hid that this was happening
  - Started in 1932, people found out about it in 1972
Active vs. passive deception
- Active - deliberately lying to participants
  - Must be justified to advance science
  - Must tell participants the truth after (debriefing) and give them any care that may be necessary
- Passive - withholding key elements, but disclosing in part
  - Must tell participants the truth after (debriefing)
What is the Belmont Report + its history
- Published in 1979 by the National Commission for the Protection of Human Subjects of Biomedical and Behavioral Research
- An important foundational document guiding ethical research with human subjects
- It includes three basic principles: beneficence, autonomy, and justice
Main principles
- Beneficence
  - Research should confer benefits and risks must be minimal
  - Can’t replicate old science unless adding/changing something
  - Informed consent document outlining the risks and benefits is necessary
  - Conduct a risk-benefit analysis
- Autonomy (respect for persons)
  - Participants are treated as autonomous
  - Informed consent - participants must have all the information that might influence their decision on whether to participate
  - Participants have the right to withdraw, to have their own data pulled, and to be debriefed
- Justice
  - There must be fairness in receiving the benefits of research as well as bearing the burdens of accepting risks
  - Directly targeting the Tuskegee Syphilis Study
  - There should be no bias in selection and interpretation of data unless there is scientific merit or basis for excluding certain groups
IRB
- What is it
  - Institutional Review Board
  - Every college/university in the US receiving federal funding must have an IRB
- Its purpose
  - Responsible for review of research conducted within the institution
  - Only assess if things are done ethically, they are NOT testing for scientific merit (that would be done through peer-review)
- Who is on it
  - 5 or more individuals, at least one must be from outside of the institution

CHAPTER 4

Categories of variables
- Situational
  - Characteristics of the environment
  - Examples - adverse childhood experiences, socioeconomic status, poverty level
- Response
  - How the individual reacts
  - Examples - brain activity, immune response
- Participant/Subject
  - Things that are characteristic of the person
  - Examples - race, gender, IQ, weight, baseline immune function
- Mediating
  - Psychological processes that influence response
  - Example - positive or negative outlook
What is an operational definition
- Definition of a concept that specifies the method used to measure or manipulate the concept
- objectify/operationalize an abstract concept - to make it concrete
- Examples:
  - Wong-Baker FACES pain rating scale for kids
  - PHQ-9 and GAD-7, Beck’s Depression Inventory
- Benefits to operationally defining a variable:
  - Forces scientists to discuss abstract concepts in concrete terms - can result in realization that the variable is too vague to study
  - Can help researchers communicate their ideas with others - forces them to agree on what terms mean in the context of the research
Possible relationships between variables
- Positive - increase and increase
- Negative - increase and decrease
- None - flat line
  - Unrelated variables vary independently of one another
- Curvilinear - increases in one variable are accompanied by systematic increases and decreases in the other
  - Sometimes referred to as a nonmonotonic function
Nonexperimental (correlational) vs. experimental research
- Nonexperimental (correlational)
  - Use of measurement of variables to determine whether variables are related to one another
  - Just measure, don’t manipulate
  - Assess relationships
  - Cannot say cause and effect
  - Third-variable problem
- Experimental method
  - A method of determining whether variables are related, in which the researcher manipulates the IV and controls all other variables either by randomization or by direct experimental control
  - Direct manipulation and control of IV, observe DV
  - Reduces ambiguity and uncertainty in interpretation of results
  - Can look at cause and effect
  - Attempts to eliminate influence of confounding third variables
IV vs. DV
- IV
  - The manipulation that has multiple levels
  - What is controlled/changed
- DV
  - What is affected by the IV
  - What is being tested/looked at
Elements needed to establish causation
- Temporal precedence
  - Cause precedes the effect
  - A always comes before B
- Covariation of cause and effect
  - When cause is present, effect occurs; when cause is not present, effect does not occur
  - B is present if and ONLY if A is
- Eliminating alternative explanations
  - Nothing other than a causal variable could be responsible for the observed effect
  - A may not cause B, but C does
What is validity
- Refers to the extent to which, given everything that is known, a conclusion is reasonably accurate
Types of validity
- Internal
  - Accuracy of conclusions drawn about cause and effect
  - Looking at other factors to see how controlled the experiment was
- External
  - Extent to which a study’s findings can accurately be generalized to other populations and settings
  - How diverse it is
- Construct
  - Extent to which the measurement or manipulation of a variable accurately represents the theoretical variable being studied
  - Adequacy of the operational definition
- Conclusion (statistical conclusion)
  - Accuracy of the conclusions drawn from the results of a research investigation
  - Accuracy of the stats
- Face
  - The degree to which a measurement device appears to accurately measure a variable
  - Does it look like it tests what it’s supposed to

CHAPTER 5

Define reliability of a measure
- The degree to which a measure is consistent
- Measurement that is free from measurement error
- Most likely achieved when researchers use careful measurement procedures
- Can be increased by making multiple measures
- Types:
  - Test-retest reliability ~ measuring the same individuals at two points in time
  - Split half reliability ~ consistency of the items; testing the test against itself
  - Item total ~ provides info about each individual item; items that do not correlate with the total score on the measure are actually measuring a different variable
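Test-retest and split-half reliability are both usually expressed as a correlation between two sets of scores. The sketch below is one minimal way to compute them; the questionnaire data, the odd/even item split, and the function names are hypothetical illustrations, not part of the course material.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(ss_x * ss_y)

def retest_reliability(time1_totals, time2_totals):
    """Test-retest: correlate the same people's scores at two points in time."""
    return pearson_r(time1_totals, time2_totals)

def split_half_reliability(rows):
    """Split-half: correlate totals on odd-numbered items with even-numbered items."""
    odd_totals = [sum(row[0::2]) for row in rows]   # items 1, 3, 5
    even_totals = [sum(row[1::2]) for row in rows]  # items 2, 4, 6
    return pearson_r(odd_totals, even_totals)

# Hypothetical data: each row is one respondent's answers to a 6-item scale (1-5).
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 2, 1, 1],
]

print(round(split_half_reliability(responses), 2))
```

A value close to 1.00 suggests the two halves (or the two testing occasions) rank people consistently; a value near 0.00 suggests mostly measurement error.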
Ways to establish construct validity
- Discriminant (divergent)
  - An assessment of the construct validity of a measure by means of examining the extent to which scores on the measure are not related to scores on conceptually unrelated measures
- Concurrent
  - The construct validity of a measure is assessed by examining whether groups of people differ on the measure in expected ways
- Predictive
  - The construct validity of a measure is assessed by examining the ability of the measure to predict a future behavior or outcome
- Content
  - An indicator of construct validity of a measure in which the content of the measure is compared to the universe of content that defines the construct
- Face
  - Seeing if the test looks like it tests what it is supposed to
Reactivity
- A problem of measurement in which the measure changes the behavior being observed
- Awareness of being measured changes people’s behavior
  - They may not want to be honest or might try to ‘help’ the study
Scales of measurement
- Nominal
  - Categorical variables, no numerical/quantitative properties
  - categories/groups simply differ from one another and are assigned names
  - Can NOT do interval measures (no averages)
- Ordinal
  - Rank order levels of variable being studied
  - Categories can be ordered from first to last
  - Space between 1 and 2 is not equal to space between 2 and 3
  - No interval measures (no averages)
- Interval
  - Numeric properties
  - Equal intervals
  - No true zero, so cannot form ratios
- Ratio
  - Numeric properties
  - Does have an absolute zero, it is possible to make true claims about ratios

CHAPTER 6

Compare quantitative and qualitative methods
- Quantitative
  - N = number
  - Variables can be counted
  - Numerical form
  - Focuses on specific behaviors
  - Large populations/samples
- Qualitative
  - L = language
  - The story behind the numbers
  - Record discussions/interviews and transcribe them later
  - Focus on themes that emerged
  - Small populations
Describe naturalistic observation and discuss methodological issues
- Naturalistic observation
  - Field work, field observation, ethnography
  - Real-world settings (don’t necessarily ask for consent)
  - Observations in a natural setting over a period of time to collect data
  - Goal is to describe and understand behavior, not testing a hypothesis
  - Primarily qualitative data
- Issues
  - Participation - observer may lose the objectivity necessary to conduct scientific observation; remaining objective may be difficult when the researcher already belongs to the group being studied or is a dissatisfied former member of the group
  - Concealment - less reactive, but may be an invasion of privacy
Describe systematic observation and discuss methodological issues
- Systematic observation
  - Careful observation of one or more specific behaviors in a particular setting
  - Lab setting
  - There is a particular goal/question
  - Full informed consent is needed
- Issues
  - Equipment - it is becoming more common to use video and audio recording equipment
  - Reactivity - if they know they are being observed, may affect what happens
  - Reliability - refers to the degree to which a measurement reflects a true score rather than measurement error
  - Sampling - for many research questions, samples of behavior taken over an extended period provide more accurate and useful data than single, short observations
Case study
- In depth examination of one person/group/unit/setting
- Naturalistic observation is sometimes called a case study, but these do not necessarily have to be naturalistic observation
- A psychobiography is a case study that examines the life of one individual
Describe archival research and its sources
- Archival research
  - Pulling past research
  - Typically used when describing a setting or doing a case study on a setting
  - Desired records may be difficult to obtain
  - We can never be entirely sure of the accuracy of info collected by someone else
- Sources
  - Statistical records - collected by many public and private organizations
  - Survey archives - consist of data from surveys that are stored digitally and available to researchers who wish to analyze them
  - Written, audio, and video records - diaries, books, ethnographies, speeches, tweets, Instagram and Facebook posts, magazine articles, movies, podcasts, internet search trends, etc.

CHAPTER 7

Population sampling types
- Probability sampling ~ each member of the population has a specifiable probability of being chosen
  - Simple random sampling ~ every member of the population has an equal probability of being selected for the sample
  - Stratified random sampling ~ the population is divided into subgroups (strata), and random sampling techniques are used to select sample members from each stratum
  - Cluster sampling ~ researcher can identify “clusters” of individuals and then sample from these clusters; after clusters are chosen, ALL individuals in each cluster are included in the sample
- Nonprobability sampling ~ the probability of any particular member of the population being chosen is unknown
  - Convenience (haphazard) sampling ~ selecting subjects because they are easy to obtain, usually on the basis of availability, and not with regard to having a representative sample of the population
  - Purposive sampling ~ researcher makes a judgment regarding selection of an individual for the sample; purpose is to obtain a sample of people who meet some predetermined criteria
  - Snowball sampling ~ one or more current research participants recruit others to become part of the sample
  - Quota sampling ~ chooses a sample that reflects the numerical composition of various subgroups in the population
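The three probability sampling techniques above can be sketched in a few lines. This is only an illustration under stated assumptions: the student population, the class-year strata, the course-section clusters, and the function names are all made up for the example.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has an equal chance of being selected."""
    return rng.sample(population, n)

def stratified_random_sample(population, stratum_of, n_per_stratum, rng):
    """Divide the population into strata, then sample randomly within each."""
    strata = {}
    for member in population:
        strata.setdefault(stratum_of(member), []).append(member)
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    """Randomly choose whole clusters and include ALL members of each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 20 students tagged with a class year.
students = [(i, "first-year" if i < 10 else "senior") for i in range(20)]
# Hypothetical clusters: four course sections of five students each.
sections = {f"Section {k}": students[5 * k: 5 * k + 5] for k in range(4)}

print(simple_random_sample(students, 5, rng))
print(stratified_random_sample(students, lambda s: s[1], 3, rng))
print(cluster_sample(sections, 2, rng))
```

The nonprobability techniques (convenience, purposive, snowball, quota) are not sketched because they depend on researcher judgment or availability rather than a random draw.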
What surveys measure
- Attitudes and beliefs
  - Focus on the ways people evaluate and think about issues
  - Have to be careful with wording and topic sensitivity and clearly define the purpose
- Behaviors
  - Can include past behaviors or intended future behaviors
  - Clearly define the purpose
  - Avoid polarizing words/phrases, sensitive topics, and mundane behaviors
- Facts and demographics
  - Ask people to indicate things they know about themselves or their situation
  - Don’t ask anything you don’t need
  - Be inclusive and use respectful terms
  - Always leave a space for ‘other’
Constructing questions
- Wording
  - Don’t use unfamiliar terminology; define anything possibly confusing
  - Use grammatical sentence structure and avoid typos
  - Avoid overloaded phrases or compound sentences
  - Don’t use negative wording
  - Should be relatively simple and straightforward
- Response options
  - Forced choice (close-ended) - limited number of response alternatives and one must be picked
  - Likert scales
  - Visual scales
  - Open-ended (qualitative) - free to answer in any way
General ethical issues
- They can be anonymous
  - No link to the identity of the individual
- They can be confidential
  - There is a traceable identity
  - Informed consent must be signed
- CANNOT BE BOTH
Types of survey administration and their issues
- Paper and pencil
  - Can distribute to large groups at one time
  - Have a captive audience of individuals who are more likely to complete a questionnaire once they start it
  - If researcher is present, people can ask questions if necessary
- Mail
  - Can be mailed to home or business addresses
  - Very inexpensive
  - Potential for low response rate
  - People may become distracted and forget to mail it back
  - No one is present to help if people do not understand questions
- Online
  - Very easy to design using online survey software services
  - Open and closed-ended questions can be included
  - Responses are immediately available to researcher
  - May result in higher response rates
- Phone
  - Less expensive than face-to-face
  - Allow efficient data collection because no need for travel
  - Can be a live interview (conducted using a CATI system)
  - Can use interactive voice response (IVR) technology: questions are pre-recorded and responses are entered straight into the computer
- Face-to-face
  - Requires that the interviewer and respondent meet
  - Tend to be expensive and time-consuming
- Focus group
  - Individuals usually have particular knowledge of or interest in the topic
  - Usually open-ended questions asked of the whole group
  - Group interaction is possible
  - Interviewer must be skilled to facilitate communication or deal with any problems
  - Some people may try to dominate the conversation
  - Time-consuming and costly
  - Provides a great deal of information
Interviewer bias
- Intentional or unintentional influence exerted by an interviewer in such a way that the actual or interpreted behavior of respondents is consistent with interviewer’s expectations
Panel study
- Research in which the same sample of subjects is studied at two or more points in time, usually to assess changes that occur over time
- Consists of a set of individuals who have volunteered to be research participants for multiple studies over time

CHAPTER 8

Posttest-only vs. pretest-posttest design
- Posttest-only
  - Between-subjects design - participants are only in one group/level
  - Use randomization to assign participants to a level - must achieve equivalent groups to eliminate any potential selection differences
  - Minimum of two levels
  - Simple, efficient, clean cut, people come in only one time
- Pretest-posttest
  - Between-subjects design
  - Can look at change over time
  - Pretest is given before the experimental manipulation is introduced
  - Adherence and attrition (mortality) are concerns
  - Reactivity may be a concern when taking the same test twice - better to spread the tests out
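Random assignment to levels of the IV can be sketched as "shuffle, then deal out". The participant IDs, condition names, and function name below are hypothetical; note that randomization makes groups equivalent on average, not in every single sample.

```python
import random

def randomly_assign(participants, conditions, rng):
    """Shuffle participants, then deal them out to conditions in turn."""
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {condition: [] for condition in conditions}
    for index, person in enumerate(shuffled):
        groups[conditions[index % len(conditions)]].append(person)
    return groups

rng = random.Random(7)  # fixed seed so the sketch is repeatable
participants = [f"P{n:02d}" for n in range(1, 13)]  # 12 made-up participant IDs
groups = randomly_assign(participants, ["control", "treatment"], rng)

for condition, members in groups.items():
    print(condition, members)
```

In a posttest-only design each person would then be measured once; in a pretest-posttest design the same assignment would be followed by a pretest before the manipulation.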
How a confounding variable influences internal validity
- A confounding variable is a variable that varies along with the independent variable. Confounding occurs when the effects of the IV and an uncontrolled variable are intertwined, so you cannot determine which variable is responsible for the observed effect on the dependent variable; this uncertainty lowers internal validity
When to use a repeated-measures design
- When comparisons need to be made within the same participants
When to use a matched pairs design
- When we want to enforce a balance between important participant characteristics that may influence the outcome
How to counterbalance
- Randomize the order in which every condition is presented across the group of participants
- Complete counterbalancing (all k! possible orders of k conditions) or a Latin square (a limited set of orders)
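Counterbalancing with every possible order, or with a Latin square, can be sketched as follows. The condition labels and function names are hypothetical, and the Latin square here is the simple cyclic kind: each condition appears once in each ordinal position, but it does not balance which condition immediately precedes which.

```python
from itertools import permutations

def complete_counterbalancing(conditions):
    """All possible orders: k conditions give k! (k factorial) orders."""
    return [list(order) for order in permutations(conditions)]

def latin_square(conditions):
    """Cyclic Latin square: k orders, each condition once per position."""
    k = len(conditions)
    return [[conditions[(row + col) % k] for col in range(k)] for row in range(k)]

conditions = ["A", "B", "C"]  # made-up condition labels

for order in complete_counterbalancing(conditions):
    print("complete:", order)

for order in latin_square(conditions):
    print("latin square:", order)
```

Participants would then be assigned at random to one of the listed orders. With three conditions the difference is small (6 orders vs. 3), but with five conditions complete counterbalancing needs 120 orders while a Latin square needs 5.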

CHAPTER 9

Population sampling types
- Probability sampling
  - Each member of the population has a specifiable chance of being included in the sample (an equal chance in simple random sampling)
  - The truest form of random sampling
- Nonprobability sampling
  - The chance of any particular member of the population being chosen is unknown
  - Samples are NOT RANDOM - some strategy is used
  - Non-sampling bias/error - the sample misses people you would get normally (if you had a full list of everyone in the population)
Straightforward vs. staged manipulations of IVs
- Straightforward
  - Manipulate the IV with relative simplicity by presenting written, verbal, or visual material to the participants
- Staged
  - Creating a scenario/experience - called event manipulation
  - Frequently use a confederate/accomplice
  - Examples - Asch conformity experiment, Simons + Levin “Door” Study
Types of DVs
- Self-report
- Behavioral measures
- Physiological measures
Sensitivity of a DV
- The DV should be sensitive enough to detect differences between groups
- Issue of sensitivity is particularly important when measuring human performance
Floor vs. ceiling effects
- Ceiling effect - the IV appears to have no effect on dependent measure only because participants quickly reach the maximum performance level
- Floor effect - when a task is so difficult that hardly anyone can perform well
How to control participant and experimenter expectations
- Participant expectations
  - Reactivity - use physiological measures
  - Demand characteristics - use blinding, deception
- Experimenter expectations
  - Expectancy effects - use automated procedures, experimenters should be well trained
Pilot studies and manipulation checks
- Pilot studies
  - Mini studies to practice main studies
  - Can check typos and that instructions are clear
  - Can question these participants about their experience
  - Can use think-aloud protocol
  - Allows experimenters to be more comfortable
- Manipulation checks
  - Make sure that the manipulation was noticed and had its intended effect
  - Might serve as a demand characteristic
  - May prefer to use in a pilot study

CHAPTER 10

Definition of a factorial design
- At least 2 IVs investigated simultaneously
  - Typically 2 or 3, each with levels
- Simplest is a 2x2
How to understand and write out proper factorial design
- Written as the number of levels of each IV separated by “x” (for example, 2x2 or 2x3)
- How many numbers appear tells you how many IVs there are and how many main effects are possible
- The value of each number tells you how many levels that IV has
- Multiplying the numbers gives the number of conditions (2x2=4 conditions)
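Reading factorial notation is mechanical enough to sketch in code: count the numbers for the IVs, read each number for the levels, and multiply for the conditions. The function names and the caffeine/sleep factors below are hypothetical examples.

```python
from itertools import product
from math import prod

def describe_design(notation):
    """Turn notation such as '2x3' into IV count, levels, and condition count."""
    levels = [int(part) for part in notation.lower().split("x")]
    return {
        "n_ivs": len(levels),           # also the number of possible main effects
        "levels": levels,               # levels in each IV
        "n_conditions": prod(levels),   # multiply the numbers together
    }

def list_conditions(factors):
    """List every combination of levels, one dict per condition."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]

print(describe_design("2x2"))
print(describe_design("2x3"))

# Hypothetical 2x2: caffeine (none vs. 200 mg) by sleep (4 h vs. 8 h).
for condition in list_conditions({"caffeine": ["none", "200 mg"], "sleep": ["4 h", "8 h"]}):
    print(condition)
```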
What is a main effect
- Associated with one factor, while ignoring the other(s)
- Looking at only one variable
- The overall relationship between the IV and DV
What is an interaction
- Combined effect of the factors on the DV
- The effect of one IV on the DV changes, depending on the level of another IV
- Cannot be obtained in a simple experimental design in which only one IV is manipulated
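Main effects and an interaction can be read off the cell means of a 2x2 design. The sketch below uses made-up cell means (test score by caffeine and sleep) and a hypothetical helper name; it is purely descriptive, and whether an effect is statistically significant would still require an F test.

```python
def describe_2x2(cells, a_levels, b_levels):
    """Summarize a 2x2 table of cell means keyed by (level of A, level of B)."""
    a1, a2 = a_levels
    b1, b2 = b_levels
    # Marginal means: average over the levels of the other factor (ignore it).
    marginal_a = {a: (cells[(a, b1)] + cells[(a, b2)]) / 2 for a in a_levels}
    marginal_b = {b: (cells[(a1, b)] + cells[(a2, b)]) / 2 for b in b_levels}
    # Effect of A separately at each level of B.
    effect_of_a_at = {b: cells[(a2, b)] - cells[(a1, b)] for b in b_levels}
    return {
        "main_effect_a": marginal_a[a2] - marginal_a[a1],
        "main_effect_b": marginal_b[b2] - marginal_b[b1],
        "effect_of_a_at_each_b": effect_of_a_at,
        # Nonzero means the effect of A changes depending on the level of B.
        "interaction_contrast": effect_of_a_at[b2] - effect_of_a_at[b1],
    }

# Hypothetical mean test scores: A = caffeine (no/yes), B = sleep (low/high).
cell_means = {
    ("no", "low"): 60, ("yes", "low"): 80,
    ("no", "high"): 85, ("yes", "high"): 85,
}

summary = describe_2x2(cell_means, ("no", "yes"), ("low", "high"))
for name, value in summary.items():
    print(name, value)
```

In these made-up numbers caffeine raises scores by 20 points when sleep is low and by 0 points when sleep is high, which is the pattern an interaction describes.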

CHAPTER 11

Single case experimental designs and reasons to use
- May also be called small-N designs
- Reversal or withdrawal design
  - Also called an ABA design (baseline A -> treatment B -> baseline A)
  - Taking an intervention away should send the person back to baseline
  - To establish causality, have to remove the treatment to make sure that is what is helping - in ABAB, accommodation can be given back once causality is established
- Multiple baseline design
  - A reversal of some behaviors may be impossible or unethical
  - Across subjects
    - Treatment is introduced at different times to different subjects to determine that the treatment was effective
    - Still working with any one person at a time
    - Instead of removing the treatment to go back to baseline, see if scores improve for others also
  - Across behaviors
    - Same subject, but the same treatment is applied to several different behaviors to determine its effectiveness on each
    - At different times, the same manipulation is applied to each of the behaviors
    - Demonstrating that each behavior increased when the reward system was applied would be evidence for the effectiveness of the manipulation
  - Across situations
    - The same behavior is measured in different settings
    - A manipulation is introduced at a different time in each setting, with the expectation that a change in the behavior in each situation will occur only after the manipulation
Program evaluation
- 5 steps
  - Needs assessment - do a detailed needs interview
  - Program theory assessment - look at empirical literature or organizational comparison
  - Process evaluation - design and propose a process
  - Outcome evaluation - see if program is working
  - Efficiency assessment - return on investment (time, energy, money, etc.)
Primary threats to internal validity
- R. SMITH
  - R - regression toward the mean
    - Upon retesting, extreme scores tend to move closer to the mean
    - The problem is rooted in the reliability of the measure
    - Also occurs when we try to explain events in the “real world”
    - Problems can be eliminated by the use of an appropriate control group
  - S - selection
    - How people are picked
    - In single case, only have one person
    - Cohort problems
  - M - maturation
    - People grow, age, and change over time
  - I - instrumentation/instrument decay
    - The scale being used could be faulty or break
    - Sometimes, the basic characteristics of the measuring instrument change over time
    - Over time, an observer may gain skill, become fatigued, or change the standards on which observations are based
    - Human error or tech error
  - T - testing
    - Upon repeated testing, may have fatigue or practice effects
  - H - history
    - Everyone has their own life events, community events, media events
    - Refers to any event that occurs between first and second measurements but is not part of the manipulation
    - Can be caused by virtually any confounding event that occurs at the same time as the experimental manipulation
Describe the following research designs:
- Cross-sectional
  - Persons of different ages measured at same point in time
  - Much more common than longitudinal, primarily because less expensive and yields results immediately
  - Researcher must infer that differences among age groups are due to the development of age - a difference may reflect developmental age changes or may result from cohort effects
- Longitudinal
  - Same group is observed at different times as they age
  - Best way to study how scores on a variable at one age are related to another variable at a later age
  - Over the course of the study, people may move, die, or lose interest
- Sequential
  - Combination of cross-sectional and longitudinal
  - Begins with cross-sectional, then individuals are studied longitudinally
  - Takes fewer years and less effort than a longitudinal study, and researcher reaps immediate rewards because data on the different age groups are available in the first year of the study
Cohort effect
- In developmental research using a cross-sectional approach, differences among age groups attributed to social, cultural, economic, or political differences rather than to the effect of age
- Most likely to be a problem when the researcher is examining age effects across a wide range of ages

CHAPTERS 12 + 13

Descriptive vs. inferential stats
- Descriptive - supply basic info about sample and test random assignment
  - Allow researchers to make precise statements about the data
  - Uses measures of central tendency
- Inferential - allow us to draw causal conclusions
  - See if IV caused DV
  - Determine if results match what would happen if we were to conduct the experiment again and again with multiple samples - in essence, whether we can infer that the difference in the sample means reflects a true difference in population means
  - Assume that if groups are equivalent, any differences in DV must be due to effect of the IV
  - Random or chance error will be responsible for some of the difference in the means, even if IV had no effect on DV
Scales of measurement
- Nominal
  - Numbers reflect categories
  - Example: yes=1, no=2
- Ordinal
  - Numbers reflect a rank where distance from 1 to 2 does not equal the distance from 2 to 3
  - Example: first, second, and third place times in a race
- Interval
  - Scores where distances are the same, but no true zero
  - Example: temperature
- Ratio
  - Scores where distances are the same and there is a true zero
  - Example: grades
Types of descriptive stats
- Measures of central tendency
  - Mean - average
    - For SCALE (ratio or interval) only, because actual values of the numbers are used in calculating the statistic
  - Median - score where 50% appear above and 50% appear below
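    - Appropriate for ORDINAL data (and can also be used with interval and ratio data), because it takes into account only the rank order of the scores
  - Mode - most frequent score
    - The only measure of central tendency appropriate for NOMINAL data (can also be used with the other scales)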
- Correlation coefficients
  - How two scale variables are related to each other
  - Scatterplots - plot x (horizontal) against y (vertical)
  - Range -1.00 to +1.00, 0.00 means no relationship
  - A negative sign (-) means an inverse relationship (opposite directions); a positive sign (+) means a direct relationship (same direction)
  - Strongest closer to +/- 1.00, weakest closer to 0.00
  - +/- does not affect strength
  - Line of best fit is y=mx+b
  - Regression
    - Allows us to make predictions using the equation (y=mx+b) of the line
    - Used to predict a person’s score on one variable when that person’s score on another variable (or set of variables) is already known
    - Essentially “prediction equations”
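The descriptive statistics above can be tried out with Python's standard `statistics` module, and the line of best fit (y=mx+b) can be computed with the usual least-squares formulas. All data below are made up, and `fit_line` and `predict` are hypothetical helper names; the study-hours data are perfectly linear only to keep the arithmetic easy to check.

```python
from statistics import mean, median, mode

favorite_colors = ["blue", "red", "blue", "green", "blue"]  # nominal
race_places = [1, 2, 3, 4, 5, 6, 7]                         # ordinal
exam_scores = [70, 80, 85, 90, 100]                         # treated as scale

print(mode(favorite_colors))  # most frequent category
print(median(race_places))    # middle rank
print(mean(exam_scores))      # arithmetic average

def fit_line(xs, ys):
    """Least-squares slope (m) and intercept (b) for y = m*x + b."""
    mean_x, mean_y = mean(xs), mean(ys)
    m = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    b = mean_y - m * mean_x
    return m, b

def predict(x, m, b):
    """Use the regression line as a prediction equation."""
    return m * x + b

hours_studied = [1, 2, 3, 4, 5]    # made-up predictor scores
quiz_scores = [52, 54, 56, 58, 60]  # made-up outcome scores

m, b = fit_line(hours_studied, quiz_scores)
print(m, b)              # slope 2, intercept 50 for these data
print(predict(6, m, b))  # predicted quiz score for 6 hours of study
```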
F-test (ANOVA) vs. t tests vs. chi squares
- F-test (ANOVA)
  - ANOVA = analysis of variance
  - If IV is nominal (three or more groups) and DV is interval/ratio, use F test
  - If there are 3+ levels of the IV, must use F test
  - If a factorial design, use F test
  - A ratio of two types of variance
    - Systematic variance ~ deviation of group means from the grand mean, or the mean score of all individuals in all groups
    - Error variance ~ deviation of individual scores in each group from their respective group means
  - The larger the F ratio is, the more likely it is that results are significant
- t test
  - Think t for two -> two levels of IV
  - If IV is nominal (two groups) and DV is interval/ratio, use t test
  - The value of t increases as the difference between obtained sample means increases
- Chi squares
  - If IV is nominal and DV is nominal, use chi-square
- Pearson correlation is used if IV is interval/ratio and DV is interval/ratio
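
To make the "ratio of two types of variance" idea concrete, here is a rough sketch that computes F and t by hand. The function names and the small groups of scores are hypothetical, and the code covers only the simple case of independent groups (one-way ANOVA and a pooled-variance t test).

```python
from statistics import mean

def f_ratio(groups):
    """One-way ANOVA F: systematic (between-group) variance / error (within-group) variance."""
    all_scores = [score for group in groups for score in group]
    grand_mean = mean(all_scores)
    k = len(groups)            # number of levels of the IV
    n_total = len(all_scores)  # total number of participants

    # Systematic variance: deviation of group means from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Error variance: deviation of individual scores from their own group mean.
    ss_within = sum((score - mean(g)) ** 2 for g in groups for score in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within

def t_independent(a, b):
    """Independent-groups t test statistic using the pooled variance."""
    ss_a = sum((score - mean(a)) ** 2 for score in a)
    ss_b = sum((score - mean(b)) ** 2 for score in b)
    pooled_var = (ss_a + ss_b) / (len(a) + len(b) - 2)
    standard_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / standard_error

# Hypothetical DV scores for three levels of a nominal IV.
low, medium, high = [4, 5, 6], [6, 7, 8], [9, 10, 11]

f_three_groups = f_ratio([low, medium, high])
f_two_groups = f_ratio([low, medium])
t_two_groups = t_independent(low, medium)

print(f"F (three groups) = {f_three_groups:.2f}")
print(f"F (two groups) = {f_two_groups:.2f}, t squared = {t_two_groups ** 2:.2f}")
```

For these scores F is 19 with three groups. With only two groups, F equals t squared (both are 6 here), which is why the t test can be viewed as the two-level special case of the F test.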
Type I vs type II errors
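- Type I - null hypothesis is rejected when it is actually true (false positive)
- Type II - null hypothesis is not rejected when it is actually false (false negative)
- Either error can occur just by chance, because of sampling variability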
- Research should be designed so that the probability of a Type II error (called beta) is relatively low
  - Related to significance (alpha) level, sample size, and effect size
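
A small simulation can show how these error rates behave. The sketch below is illustrative only: it assumes normally distributed scores with a known standard deviation of 1 and uses a two-sided z test in place of a t test to keep the code short. The function names, group sizes, and effect sizes are all hypothetical choices.

```python
import random
from statistics import NormalDist, mean

def rejects_null(group_a, group_b, sigma=1.0, alpha=0.05):
    """Two-sided z test on two group means (simplifying assumption: sigma is known)."""
    standard_error = sigma * (1 / len(group_a) + 1 / len(group_b)) ** 0.5
    z = (mean(group_a) - mean(group_b)) / standard_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value < alpha

def rejection_rate(true_difference, n_per_group, n_sims=2000, seed=0):
    """Proportion of simulated experiments in which the null is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        group_a = [rng.gauss(true_difference, 1.0) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        rejections += rejects_null(group_a, group_b)
    return rejections / n_sims

# Null is true (no real difference): every rejection is a Type I error.
type1_rate = rejection_rate(true_difference=0.0, n_per_group=20)

# Null is false: every failure to reject is a Type II error (beta).
beta_small_sample = 1 - rejection_rate(true_difference=0.5, n_per_group=20)
beta_large_sample = 1 - rejection_rate(true_difference=0.5, n_per_group=80)
beta_large_effect = 1 - rejection_rate(true_difference=1.0, n_per_group=20)

print(f"Type I rate: {type1_rate:.3f}")
print(f"beta, small sample: {beta_small_sample:.3f}")
print(f"beta, larger sample: {beta_large_sample:.3f}")
print(f"beta, larger effect: {beta_large_effect:.3f}")
```

The Type I rate should land near the alpha level of .05, while beta should drop when either the sample size or the effect size increases.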
Know p values and the purpose of effect sizes
- If p < .05, reject the null hypothesis; the results support the research hypothesis
  - This applies to both the t test and the F test
  - A significant F test does not tell where the difference lies - additional post hoc tests are needed for this
- Effect size
  - You are most likely to obtain significant results when the effect size is large - that is, when differences between groups are large and variability of scores within groups is small
  - If the effect size is large, a Type II error is unlikely
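
One common effect size for a two-group comparison is Cohen's d, the difference between the group means divided by the pooled standard deviation. The notes above do not name a specific index, so treat d here as one possible choice; the function name and scores are hypothetical.

```python
from statistics import mean, variance

def cohens_d(a, b):
    """Cohen's d: mean difference divided by the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / pooled_var ** 0.5

# Both comparisons have the same 3-point difference between group means.
# Only the variability of scores within the groups differs.
d_low_variability = cohens_d([9, 10, 11], [6, 7, 8])
d_high_variability = cohens_d([4, 10, 16], [1, 7, 13])

print(f"d with low within-group variability:  {d_low_variability:.2f}")
print(f"d with high within-group variability: {d_high_variability:.2f}")
```

The mean difference is identical in both comparisons, but d is 3.0 when scores within each group are tightly clustered and 0.5 when they are spread out. This matches the point that effect size is large when group differences are large and within-group variability is small.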

CHAPTER 14

What is external validity
- The extent to which results can be generalized to other populations and settings
- It is addressed in the Discussion section of a research paper
Issues created by generalizing research results to other populations and cultures
- Sex and gender
  - Sex - biological classification assigned at birth
  - Gender - sociocultural classification
  - Gender identity - person’s personal and psychological experience with a particular gender
  - Past research usually focused just on sex, now it is expanded to also look at gender and gender identity
  - To replicate research, we need to look at all 3
  - Researchers should not exclude sex or gender categories when recruiting or studying participants if they have a research question for which they would like to generalize the findings to all humans
  - It is important to note that real gender gaps remain in the behavioral sciences, and representation among researchers is critical to the success of the science
- Race, ethnicity, and culture
  - Race - social categorization based upon appearance
  - Ethnicity - common cultural background and descent
  - Culture - values, beliefs, language, behaviors, customs
  - Research cannot focus on only one race, ethnicity, or culture if the findings are meant to generalize
Potential problems with using each of the following as research participants:
- College students
  - The subjects tend to be young and to possess the characteristics of emerging adults: a sense of self-identity that is still developing, social and political attitudes that are in a state of flux, a high need for peer approval, and peer relationships that often change
  - College students also possess characteristics associated with academic success
  - Students, as a group, are more homogeneous than nonstudent samples
- Volunteers
  - Volunteers tend to be more highly educated, of a higher socioeconomic status, more in need of approval, and more social
  - It seems that different kinds of people volunteer for different types of experiments
- Online participants
  - Although online samples can be more diverse than the typical college student sample, they are still not representative and so there are still generalization issues - internet users represent a unique demographic
  - The Pew data indicate that internet use is associated with living in an urban/suburban area, being a high school graduate or higher, being under 65 years of age, and having a higher income
Potential problem of generalizing and possible solutions
- Influences of the people conducting the study
  - Main goal is to ensure that any influence the experimenter has on subjects is constant throughout the experiment
- Effects of a pretest
  - Pretesting may limit the ability to generalize to populations that did not receive a pretest
  - Simply taking the pretest may cause subjects to behave differently than they would without the pretest
- Differences between a field study and a laboratory study
  - Research conducted in a laboratory setting allows the experimenter to study the impact of IVs under highly controlled conditions
  - A field experiment is a “real-life” alternative to the artificiality of a laboratory
Importance of replications
- A way of overcoming any problems of generalization that occur in a single study
Types of replications
- Exact
  - An attempt to precisely replicate the procedures of a study to see whether the same results are obtained
  - A researcher who obtains an unexpected finding will frequently attempt a replication to make sure that the finding is reliable
  - If starting your own work on a problem, you may try to replicate a crucial study to make sure that you understand the procedures and can obtain the same results
  - Often occur when a researcher builds on the findings of a prior study
  - When you replicate the original research findings using very similar procedures, your confidence in the external validity of the original findings is increased
  - A single failure to replicate does not reveal much, though; it is unrealistic to assume, on the basis of a single failure to replicate, that the previous research is necessarily invalid
- Conceptual
  - The use of different procedures to replicate a research finding
  - Researchers attempt to understand the relationships among abstract conceptual variables by using new, or different, operational definitions of those variables
  - Even more important than exact replications in furthering our understanding of behavior
  - In most research, a key goal is to discover whether there exists a relationship between conceptual variables
  - The same IV is operationalized in a different way, and the DV may be measured in a different way, too
  - Extremely important in the social sciences because the variables used are complex and can be operationalized in many ways
  - Sometimes the conceptual replication may involve an alternative stimulus or an alternative dependent measure
  - When conceptual replications produce similar results, our confidence in the generalizability of relationships between variables is greatly increased
Narrative literature review vs. meta-analysis
- Narrative lit review
  - A reviewer reads a number of studies that address a particular topic and then writes a paper that summarizes and evaluates the literature
  - Literature review provides info that:
    - Summarizes what has been found
    - Tells the reader which findings are strongly supported and which are only weakly supported in the literature
    - Points out inconsistent findings and areas in which research is lacking
    - Discusses future directions for research
  - The conclusions in a narrative literature review are based on the reviewer’s subjective impressions
- Meta-analysis
  - The researcher combines the actual results of a number of studies
  - The analysis consists of a set of statistical procedures that employ effect sizes to compare a given finding across many different studies
  - A method for determining the reliability of a finding by examining the results from many different studies
  - Focus on effect size
    - Allows comparisons of the effect sizes in different types of studies to allow tests of hypotheses
- Both provide valuable info and are often complementary
  - A meta-analysis allows statistical, quantitative conclusions, whereas a narrative review identifies trends in the literature and directions for future study - a more qualitative approach
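
As a very rough sketch of how a meta-analysis combines results, the code below averages effect sizes across studies while giving larger studies more weight. The study values are made up for illustration, and weighting by sample size is a simplification: published meta-analyses typically weight each study by the inverse of its effect size variance.

```python
def weighted_mean_effect(studies):
    """Sample-size-weighted mean effect size across (effect_size, n) pairs."""
    total_n = sum(n for _, n in studies)
    return sum(effect_size * n for effect_size, n in studies) / total_n

# Hypothetical studies of the same finding: (effect size d, sample size).
studies = [(0.20, 40), (0.50, 60), (0.60, 100)]

combined_effect = weighted_mean_effect(studies)
unweighted_effect = sum(d for d, _ in studies) / len(studies)

print(f"Weighted mean effect size:   {combined_effect:.2f}")
print(f"Unweighted mean effect size: {unweighted_effect:.2f}")
```

Here the weighted mean (0.49) is higher than the simple average (about 0.43) because the largest study also reported the largest effect. This is the kind of quantitative conclusion a narrative review cannot provide.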