Scientific Method
Observations lead to research questions.
Variables in Research
Definition of Variables: Anything that can change or take on different values.
Situational Variables: Environmental or contextual factors.
Response Variables: Outcomes measured.
Participant Variables: Characteristics of individuals (e.g., age, gender).
Independent Variable (IV): Manipulated by the researcher.
Dependent Variable (DV): Measured to assess the effect of the IV.
Experimental Design
Purpose: Manipulating the IV and measuring the DV establishes cause-and-effect relationships.
Types of Experimental Designs:
Between-Subjects Design: Different participants in different conditions.
Within-Subjects Design: Same participants in all conditions.
Statistical Analysis
Types of Statistics:
Descriptive Statistics:
Summarizing data (e.g., mean, median, mode).
Inferential Statistics:
Making inferences about populations (e.g., hypothesis testing).
Probability Distributions and Z-scores: Tools for standardizing data.
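As a rough illustration of how z-scores standardize data, the sketch below converts raw scores to z-scores with z = (score - mean) / SD. The scores are made up for illustration, and the population SD is assumed; a sample SD would give slightly different values.

```python
from statistics import mean, pstdev

def z_scores(scores):
    """Return the z-score of each value: (score - mean) / standard deviation."""
    m = mean(scores)
    sd = pstdev(scores)  # population SD (assumption); statistics.stdev gives the sample SD
    return [(x - m) / sd for x in scores]

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical raw scores (mean 5, SD 2)
z = z_scores(scores)
print(z)
```

A z-score of 0 sits at the mean; positive values are above the mean and negative values below it, in SD units.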
Hypothesis Testing
Null Hypothesis (H0): Assumes no effect or relationship.
Alternative Hypothesis (H1): Proposes a difference or effect.
Errors in Hypothesis Testing:
Type I Error: Incorrectly rejecting the null hypothesis.
Type II Error: Failing to reject a false null hypothesis.
Correlation vs Causation
Correlations do not imply causation.
Validity Threats
Internal Validity Threats:
Address confounding variables and randomization.
External Validity:
Generalizability across contexts and populations.
Experiments Purpose: Establish cause and effect.
Requirements for Determining Cause:
Temporal Precedence: The cause must precede the effect in time.
Covariation: The effect occurs when the cause is present and does not occur when the cause is absent.
Elimination of Alternatives: No other plausible explanations should remain for the relationship.
Researcher controls the amount of time participants spend with dogs to study effects.
Internal validity refers to the degree to which conclusions can be drawn about causal relationships based on data.
Essential to ensure that variations in the dependent variable (DV) are solely due to the IV and not other factors.
Confounds: Occur when an extraneous variable varies systematically with the IV.
This correlation threatens internal validity, potentially causing misinterpretations of the DV's changes.
Constancy: Keeping key extraneous variables consistent across all experimental conditions.
Randomisation: Randomly assigning participants to conditions is essential so that participant variables (e.g., gender, age) are equally represented across conditions.
Between-Subjects Design:
Different participants are required in each condition.
Equal chance assignment ensures participant variables are evenly distributed.
Matching: Participants are matched on relevant characteristics to maintain control.
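The equal-chance assignment described for between-subjects designs can be sketched as a shuffle followed by dealing participants into conditions. The participant IDs, condition names, and the `seed` parameter are hypothetical and only serve the illustration.

```python
import random

def randomly_assign(participants, conditions, seed=None):
    """Shuffle participants, then deal them into conditions in turn."""
    rng = random.Random(seed)  # a seed makes the allocation reproducible
    shuffled = list(participants)
    rng.shuffle(shuffled)
    groups = {c: [] for c in conditions}
    for i, person in enumerate(shuffled):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

participants = ["P1", "P2", "P3", "P4", "P5", "P6", "P7", "P8"]  # hypothetical IDs
groups = randomly_assign(participants, ["control", "treatment"], seed=1)
print(groups)
```

Dealing in turn keeps group sizes equal, while the shuffle gives every participant the same chance of landing in either condition.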
Within-Subjects Design:
Same participants undergo all conditions but must account for order effects.
Order Effects: Changes in performance due to learning, fatigue, or other factors.
Counterbalancing: Presenting conditions in various orders to mitigate bias from order effects.
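One possible way to picture counterbalancing is to list every order of the conditions and rotate participants through those orders. This is a sketch of full counterbalancing only; the condition labels and participant IDs are hypothetical.

```python
from itertools import permutations

def full_counterbalance(conditions):
    """Return every possible presentation order of the conditions."""
    return [list(order) for order in permutations(conditions)]

def assign_orders(participants, conditions):
    """Cycle participants through the orders so each order is used equally often."""
    orders = full_counterbalance(conditions)
    return {p: orders[i % len(orders)] for i, p in enumerate(participants)}

conditions = ["A", "B", "C"]  # hypothetical condition labels
orders = full_counterbalance(conditions)
schedule = assign_orders(["P1", "P2", "P3", "P4", "P5", "P6"], conditions)
for person, order in schedule.items():
    print(person, order)
```

With three conditions there are six orders, so six participants (or a multiple of six) are needed for each condition to appear equally often in each position.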
Quasi-Experiments: Used when researchers lack control over condition assignment or manipulation of the causal variable.
Example: Measuring happiness before and after introducing dogs in a nursing home context.
Important to recognize challenges to internal validity when interpreting results.
Selection Bias: Differences in groups before the experiment.
History: External events occurring between measurements.
Maturation: Changes in participants over time.
Testing: Impact of repeated measures on assessment.
Instrumentation: Variability in measurement tools over time.
Mortality: Participant dropout rates.
Expectancy Effects: Researcher biases influencing results.
Definition: The degree to which research results generalize across different contexts or populations.
Replication is crucial to validate findings in varied settings.
Historical Context: Previous non-Indigenous research methods have often been culturally insensitive towards Indigenous communities (Gower, 2012).
Setting the Scene: Misguided policies result from mistreatment and lack of understanding of Indigenous cultures.
Ethics Development: Established guidelines are necessary to safeguard Indigenous participants during research projects.
Updated ethical guidelines focusing on improving engagement with Aboriginal and Torres Strait Islander Peoples.
Emphasizes respect for rights, promoting partnerships, collaboration, and equitable responsibility.
Spirit and Integrity: Connection between past and future generations; respectful behaviors and genuine intentions.
Cultural Continuity: Maintains identity through relationships and cultural practices.
Equity: Fair partnerships respecting Aboriginal culture.
Reciprocity: Mutual benefits and shared responsibilities in research.
Respect: Acknowledging cultural heritage and ensuring informed consent.
Responsibility: Caring for community welfare and avoiding harm.
Participation must be voluntary, well-communicated, and culturally understood.
Provides clear pathways for ethical review in Indigenous research.
Considers the cultural context and prioritizes Indigenous engagement.
Forms of engagement: Inform, Consult, Involve, Collaborate, and Empower Indigenous communities.
Indigenous Data Governance: Ensures Indigenous control over data, legality, and benefit distribution.
Ganma: Knowledge sharing, symbolizing two cultures mingling.
Yarning: An informal dialogue fostering personal connection and understanding.
Four types: Social, Collaborative, Research Topic, and Therapeutic Yarning.
Dadirri: Represents deep listening in the research conversation.
Key Components of True Experiments:
Manipulation of Independent Variable (IV):
Changes in the dependent variable (DV) must be caused by different levels or conditions of the IV.
Random Assignment:
Participants are randomly allocated to groups/conditions to balance out differences, ensuring fair comparisons.
This helps control for extraneous variables which might bias results.
Control of Extraneous Variables:
Extraneous Variables: Variables other than the IV that can affect the DV.
True experiments aim to eliminate or control these variables to avoid confounding results.
Temporal Precedence:
The IV must precede the DV.
Example: Sleep must occur before measuring alertness.
Co-Variation:
Changes in the DV must correlate with changes in the IV.
Example: Good sleep leads to increased alertness; lack of sleep leads to no change in alertness.
Elimination of Alternative Explanations:
Must rule out other variables that could explain the effect (e.g., nutrition affecting alertness).
Control strategies include random assignment, holding variables constant, and counterbalancing.
Between-Subjects Design:
Different participants in each condition.
Random allocation helps control for bias.
Within-Subjects Design:
Same participants experience all conditions.
Counterbalancing is used to control for order effects.
T-Test:
For comparing two conditions/groups.
Independent Samples T-Test: Different participants in each condition.
Paired Samples T-Test: Same participants in each condition.
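To make the difference between the two t-tests concrete, the sketch below computes both t statistics by hand. The data are invented, and the independent-samples version assumes equal variances (pooled variance); in practice the p-value would come from software such as Jamovi.

```python
from math import sqrt
from statistics import mean, stdev, variance

def independent_t(a, b):
    """Independent samples t (pooled variance): different participants per group."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, na + nb - 2  # (t, df)

def paired_t(a, b):
    """Paired samples t: the same participants measured in both conditions."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1  # (t, df)

group_a = [5, 6, 7, 8, 9]      # hypothetical scores, group A
group_b = [3, 4, 5, 6, 7]      # hypothetical scores, group B
t_ind, df_ind = independent_t(group_a, group_b)

before = [10, 12, 9, 11, 13]   # hypothetical scores, condition 1
after = [8, 9, 8, 9, 11]       # same participants, condition 2
t_pair, df_pair = paired_t(before, after)
print(t_ind, df_ind, t_pair, df_pair)
```

The paired test works on difference scores, which removes stable differences between participants from the error term.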
ANOVA (Analysis of Variance):
Used for comparing three or more groups.
Types of ANOVA: One-way between-subjects or repeated measures ANOVA, chosen according to the design and the number of IVs and groups.
F-Ratio: Compares between-groups variability to within-groups variability. A significant F indicates that differences exist between group means.
Formulate null and alternative hypotheses (no difference vs. at least one difference).
Determine alpha level (commonly 0.05).
Calculate degrees of freedom:
Between Groups: K - 1 (where K = number of groups).
Within Groups: N - K (N = total participants).
Compute sum of squares (SS) for total, between, and within groups.
Calculate mean squares (MS) and then the F-statistic (MS between / MS within).
Compare calculated F to critical F from F-distribution table to determine significance.
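The steps above can be sketched for a one-way between-subjects ANOVA as follows. The three groups and their scores are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean

def one_way_anova(groups):
    """Compute SS, df, MS and F for a one-way between-subjects ANOVA."""
    all_scores = [x for g in groups for x in g]
    grand = mean(all_scores)
    k, n = len(groups), len(all_scores)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return {
        "ss_between": ss_between, "ss_within": ss_within,
        "ss_total": ss_between + ss_within,
        "df_between": df_between, "df_within": df_within,
        "F": ms_between / ms_within,
    }

# Hypothetical scores for three groups (group means 2, 3 and 7)
groups = [[1, 2, 3], [2, 3, 4], [6, 7, 8]]
result = one_way_anova(groups)
print(result)
```

Here SS between = 42 and SS within = 6, so F = 21 / 1 = 21 on 2 and 6 degrees of freedom, which exceeds the tabled critical value of 5.14 at alpha = .05.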
Normality: The dependent variable should be normally distributed.
Homogeneity of Variances: Variance among groups should be roughly equal. Tested with Levene's test.
Independence of Observations: Each participant's score should be independent of others.
Using ANOVA to compare the effectiveness of three levels of treatment for reducing depression symptoms:
Placebo, low dose, high dose groups.
Measure DV: Reduction in depression scores, assess via ANOVA.
Lecture Overview
Revision on calculating the sum of squares (SS) and aspects of the ANOVA summary table.
One-way between-subjects ANOVA using the Jamovi platform, with practice in tutorials.
Follow-Up Testing
Follow-up testing occurs after a significant ANOVA result to identify which means differ.
Types of follow-up tests depend on pre-set hypotheses:
Planned Comparison (a priori): Hypotheses defined before the experiment, e.g., CBT is expected to lower depression compared to placebo.
Unplanned Comparison (post hoc): Conducted after significant results are found.
Assumptions for ANOVA
Normality: Scores within each group should be normally distributed, verified through tests (e.g., Shapiro-Wilk).
Homogeneity of Variances: Equal variances across groups, checked using Levene's test.
Data Measurement: Continuous scale required for dependent variable.
Independence of Observations: Each participant contributes one score only.
ANOVA Calculations
Total variability calculated as: SS_{total} = SS_{between} + SS_{within}
Calculating degrees of freedom:
df_{between} = k - 1 (where k = the number of groups)
df_{within} = N - k (total participants - number of groups)
Effect Size
eta-squared (η²): Proportion of variance explained by the independent variable (multiply by 100 for a percentage). Calculated as:
η² = SS_{between} / SS_{total}
Reporting effect size is important to show how strong the effect is beyond mere significance.
Example of Follow-Up Tests
If ANOVA reveals significant results, plan for:
Contrasts (a priori): Hypotheses-driven comparisons.
Pairwise Comparisons (post hoc): Control Type I error using methods like Bonferroni adjustments.
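A minimal sketch of a Bonferroni adjustment, assuming all possible pairwise comparisons are run: the overall alpha is divided by the number of comparisons. The function names are illustrative only.

```python
def n_pairwise(k):
    """Number of pairwise comparisons among k groups: k(k - 1) / 2."""
    return k * (k - 1) // 2

def bonferroni_alpha(alpha, n_comparisons):
    """Per-comparison alpha that keeps the familywise Type I error at alpha."""
    return alpha / n_comparisons

comparisons = n_pairwise(3)                  # three groups give three pairs
adjusted = bonferroni_alpha(0.05, comparisons)
print(comparisons, round(adjusted, 4))
```

With three groups, each pairwise comparison would be tested at roughly .0167 rather than .05.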
One-Way Repeated Measures ANOVA
Concept: Involves repeated measures on the same participants across 3 or more conditions, violating the assumption of independence of observations from One-Way Between Subjects ANOVA, as each participant is measured multiple times.
Calculating Variability: Involves calculating sums of squares and error differently to account for participant overlap.
Null Hypothesis: No differences in mean scores across conditions, indicating identical condition scores.
F Statistic
Formula: F = \frac{ \text{Mean Squares of Effect} }{ \text{Mean Squares of Error} }
Mean Squares of Effect: Represents the experimental effect across conditions.
Mean Squares of Error: Represents differences expected if no effect from the independent variable exists, including unsystematic differences.
Assumptions of Repeated Measures ANOVA
Normality: Distributions within each condition should be normally distributed. Robust to violations with a larger sample size (n > 30).
Interval/Ratio Level Data: Dependent variable should be measured at an interval or ratio level (continuous scale).
Sphericity: Equality of variances of differences between conditions. Assessed using Mauchly's Test.
A significant Mauchly's test (p < 0.05) indicates a violation of sphericity.
Assessing Normality
Use Shapiro-Wilk test. A significant p-value (< 0.05) indicates a violation of normality.
Corrections for Violations
If sphericity is violated, apply corrections like Greenhouse-Geisser or Huynh-Feldt.
Degrees of Freedom: Report with decimal points when corrections are applied.
Variance Analysis Structure
SS Total: Total variability in data.
Between Subjects Variance: Variance between participants.
Within Subjects Variance: Variance within participant responses across different conditions.
Calculation Steps
SS Total: Calculate the squares of the differences between each data point and grand mean.
SS Between: Calculate based on the differences between participant means and grand mean.
SS Effect: Based on the differences between condition means and grand mean.
SS Error: Remaining variance calculated by rearranging total variance equation.
Degrees of Freedom Calculation
Between: N - 1
Effect: K - 1
Error: (N - 1)(K - 1)
Total: (N \times K) - 1
Final Reporting
Compute mean squares for error and effect.
Determine the F statistic from mean squares.
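The calculation steps and degrees of freedom above can be sketched for a small repeated measures data set. Rows are participants and columns are conditions; the scores are hypothetical and no sphericity correction is applied.

```python
from statistics import mean

def repeated_measures_anova(data):
    """data: one row per participant, one column per condition."""
    n, k = len(data), len(data[0])
    all_scores = [x for row in data for x in row]
    grand = mean(all_scores)
    ss_total = sum((x - grand) ** 2 for x in all_scores)
    # Between-subjects: participant means vs grand mean
    ss_between = k * sum((mean(row) - grand) ** 2 for row in data)
    # Effect: condition means vs grand mean
    ss_effect = n * sum((mean(col) - grand) ** 2 for col in zip(*data))
    # Error: what remains after rearranging the total variance equation
    ss_error = ss_total - ss_between - ss_effect
    df_effect, df_error = k - 1, (n - 1) * (k - 1)
    f = (ss_effect / df_effect) / (ss_error / df_error)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_effect": ss_effect, "ss_error": ss_error,
            "df_effect": df_effect, "df_error": df_error, "F": f}

data = [[1, 4, 4],   # hypothetical participant 1 across three conditions
        [2, 3, 7],   # participant 2
        [3, 5, 7]]   # participant 3
result = repeated_measures_anova(data)
print(result)
```

In this example SS total (34) splits into SS between (6), SS effect (24) and SS error (4), giving F = 12 on 2 and 4 degrees of freedom.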
Repeated Measures ANOVA Recap:
Used when one group of participants completes every condition, providing repeated measures of the dependent variable.
Assumptions include normality, interval or ratio level dependent variable, and sphericity.
Normality: Assessed via descriptives and Shapiro-Wilk statistic.
Sphericity: Assessed via Mauchly's test; if violated, Greenhouse-Geisser or Huynh-Feldt correction is applied.
ANOVA Details:
Total variance (SS total) is broken into between-subjects (SS between) and within-subjects (SS within) variance.
Repeated measures ANOVA focuses on variance within participants (SS within), further partitioned into SS effect and SS error.
Calculations involve determining the sums of squares, degrees of freedom, mean squares, and the F ratio to assess the significance of the experimental effect.
Follow-Up Tests for Repeated Measures ANOVA:
Selection depends on specific hypotheses and whether comparisons are orthogonal.
Types include complex contrasts (a priori) and pairwise comparisons (a priori or post hoc).
A priori tests: Complex contrasts (orthogonal or non-orthogonal) and Bonferroni pairwise comparisons.
Post hoc tests: Tukey's HSD (for equal group sizes) or Bonferroni (if group sizes vary).
Measures the relationship between two naturally occurring variables (no manipulation).
Establishes a relationship when two variables covary.
No correlation: No relationship between variables.
Positive correlation: As one variable increases, the other increases (incline upwards).
Negative correlation: As one variable increases, the other decreases (decline downwards).
Covariation:
Positive covariance: High scores on variable X are met with high scores on variable Y, and vice versa.
Negative covariance: High scores on X are met with low scores on Y, and vice versa.
Pearson’s Correlation Coefficient (r):
Value between -1 and 1 indicating the strength of the correlation.
Close to 1 (positive) or -1 (negative) indicates a strong relationship.
0 to ±0.29: Weak correlation.
±0.30 to ±0.49: Moderate correlation.
±0.50 to ±1: Strong correlation.
No leading zeros (report 0.30 as .30).
Indicate direction with a negative symbol or no symbol for positive.
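As an illustration, Pearson's r can be computed from the deviations of X and Y around their means, then labelled and formatted using the conventions above. The data and helper names are hypothetical.

```python
from math import sqrt
from statistics import mean

def pearson_r(x, y):
    """r = sum of cross-products / sqrt(SS_x * SS_y)."""
    mx, my = mean(x), mean(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cross / sqrt(ss_x * ss_y)

def strength(r):
    """Label strength using the cut-offs given in these notes."""
    size = abs(r)
    if size < 0.30:
        return "weak"
    if size < 0.50:
        return "moderate"
    return "strong"

def format_r(r):
    """Report r to two decimals without a leading zero."""
    return f"{r:.2f}".replace("0.", ".", 1)

x = [1, 2, 3, 4, 5]  # hypothetical scores on variable X
y = [2, 4, 5, 4, 5]  # hypothetical scores on variable Y
r = pearson_r(x, y)
print(format_r(r), strength(r))
```

For these scores r is about .77, a strong positive correlation.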
Closer data points indicate a stronger correlation.
No correlation: Randomly scattered points.
Positive correlation: Gathering in a positive direction.
Negative correlation: Gathering in a negative direction.
Outliers: Data points that don't fit the trend can affect the correlation value.
Outliers can distort the correlation, weakening it or, in some cases, inflating it.
Correlational research cannot explain why two variables are related, because no variable is manipulated.
Correlation does not equal causation. A correlation between A and B has three possible explanations:
Variable A causes variable B.
Variable B causes variable A.
A third variable causes the relationship between A and B.
R-squared (r²) is the correlation coefficient r multiplied by itself.
Indicates how much variability in one variable is explained by the other.
Interpretation:
0.01 to 0.09: Small effect size.
0.09 to 0.25: Medium effect size.
Above 0.25: Large effect size.
Related pairs: Both X and Y scores for each participant.
Interval or ratio scale: Continuous data.
Normality: Scores for each variable should be normally distributed.
Linearity and homoscedasticity: Assessed via scatter plot.
Curvilinear relationships violate linearity.
Fanning or coning violates homoscedasticity.
Spearman’s rho correlation: Ranks data and conducts Pearson’s correlation on rankings.
Kendall’s tau: For small data sets with tied scores.
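A short sketch of the idea behind Spearman's rho: rank each variable, then run Pearson's correlation on the ranks. The ranking helper assumes there are no tied scores, and the data are made up.

```python
from math import sqrt

def simple_ranks(values):
    """Rank from 1 (lowest) upward; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def pearson_r(x, y):
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cross = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cross / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_rho(x, y):
    """Pearson's correlation computed on the ranks of x and y."""
    return pearson_r(simple_ranks(x), simple_ranks(y))

x = [10, 20, 30, 40, 50]   # hypothetical scores
y = [1, 4, 9, 16, 100]     # rises with x, but not in a straight line
rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because y always increases with x, the ranks agree perfectly and rho is 1, while Pearson's r on the raw scores is lower because the relationship is not linear.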
Be cautious when interpreting correlations based on a restricted data range.
Cannot generalize the correlation beyond the assessed data subset.
Predicts one variable from another.
Linear model (straight line) to predict outcome variable values based on a predictor variable.
Outcome variable: Dependent variable.
Predictor variable: Independent variable.
Equation: Y = B0 + B1X
Y: Predicted score on the outcome variable.
X: Score on the predictor variable.
B0: Intercept (the predicted value of Y when X is 0, i.e., where the regression line crosses the Y-axis).
B1: Slope (incline of the regression line; positive or negative).
R value: Same as Pearson’s r in simple linear regression.
R-squared value: Coefficient of determination (how much variance in the outcome is accounted for by the predictor).
Adjusted R-squared: Estimate of R-squared in the population, adjusted for the number of predictors.
Tests whether the model is significantly better at predicting the outcome than using the mean.
Intercept: Value of Y when X is 0.
Predictor: Tests the significance of the slope; the variable is a significant predictor if its p-value is below the alpha level.
Uphill: Positive regression (positive slope).
Downward: Negative regression (negative slope).
Aims to find the line that minimizes deviations between the line and data points.
Finds the line of best fit by minimizing the sum of the squares of the differences (residuals) between predicted and observed values.
Assessed using R-squared and the significance of the ANOVA statistic.
Total Sum of Squares (SST): Total variability in the data set.
Model Sum of Squares (SSM): Variation between the mean and the regression line.
Sum of Squares Residual (SSR): Error of the regression as a model.
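The least squares idea and the three sums of squares can be sketched for a single predictor as follows. The data are hypothetical, and the dictionary keys simply mirror the labels used in these notes.

```python
from statistics import mean

def least_squares(x, y):
    """Fit Y = B0 + B1X by least squares and partition the variability."""
    x_bar, y_bar = mean(x), mean(y)
    b1 = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
          / sum((xi - x_bar) ** 2 for xi in x))
    b0 = y_bar - b1 * x_bar
    predicted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                    # total
    ssm = sum((pi - y_bar) ** 2 for pi in predicted)            # model
    ssr = sum((yi - pi) ** 2 for yi, pi in zip(y, predicted))   # residual
    return {"b0": b0, "b1": b1, "sst": sst, "ssm": ssm, "ssr": ssr,
            "r2": ssm / sst}

x = [1, 2, 3, 4, 5]  # hypothetical predictor scores
y = [3, 5, 4, 6, 7]  # hypothetical outcome scores
fit = least_squares(x, y)
print(fit)
```

For these data the line is roughly Y = 2.3 + 0.9X, with SST = 10, SSM = 8.1 and SSR = 1.9, so R-squared is about .81.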
The formulas regarding calculating R, and degrees of freedom, T value, are all listed above.
With multiple predictors, we can look to our standardised estimate to be able to compare the relative contribution of each predictor.
Simple Linear Regression
Predicts one variable from another (outcome from predictor).
Builds upon correlation analysis by predicting outcomes.
Uses a linear model (straight line) to predict values of the outcome variable from one or more predictor variables.
Regression Equation: Y = B0 + B1X
Y: Predicted score on the outcome variable.
B0: Y intercept.
B1: Slope.
X: Predictor variable.
Method of Least Squares: Calculates the regression equation, providing the line of best fit.
Line of Best Fit: The line that lies as close as possible to the data points in a scatter plot, keeping the residuals as small as possible.
Determining Significance
Assess if the line of best fit is significantly better than using the mean score.
Coefficient of Determination (R^2):
Indicates how much variance in the outcome is accounted for by the predictor variable.
Higher R^2 means a better fit.
Example: R^2 = 0.82 means 82% of the variance in exam scores is accounted for by time spent
ANOVA Statistic (F statistic):
Tests if the regression model is significantly better at predicting the outcome than having no model at all.
A significant F statistic indicates a significant line of best fit.
Interpreting Regression Output
Coefficient of Determination:
Found under model fit measures.
R: Pearson's correlation between variables.
R^2: Coefficient of determination (percentage of variance in the outcome accounted for by the predictor).
Omnibus ANOVA Test:
F statistic.
Indicates whether the model is significantly better than no model.
Write-up: F(df1, df2) = F value, p < 0.001.
Model Coefficients Table:
Y intercept (B0): Value of the outcome when the predictor is zero.
Slope (B1): Direction and significance of the predictor.
Positive slope: Positive predictor.
Negative slope: Negative predictor.
Significance (p-value): Indicates if the slope is significantly different from zero.
Magnitude of Coefficient: For every one unit increase in the predictor variable, the outcome variable changes by this value.
Regression Equation
Y = B0 + B1X
Y: Predicted score on the outcome variable.
X: Score on the predictor variable.
B0: Y intercept (estimate of the intercept).
B1: Estimate of the slope (from the predictor variable).
Simple Linear Regression Example (Self-Compassion)
Outcome Variable: Self-compassion (understanding and patient towards oneself).
Predictors:
Insomnia (how much sleep interferes with daily functioning).
Well-being (daily life filled with positive interests).
Bivariate Correlations
Self-compassion and Well-being: Significant, moderate, positive correlation.
As self-compassion increases, well-being increases.
Variance shared: 16%.
Self-compassion and Insomnia: Significant, moderate, negative correlation.
As insomnia increases, self-compassion decreases.
Variance shared: 9.6%.
Regression Output Interpretation
Model Fit Measures:
Correlation (R) and R-squared (R^2).
ANOVA Test:
Tests if the model is significant.
Write up: F(df1, df2) = F value, p < 0.001.
Model Coefficients:
Predictor significance (p < 0.001).
Estimate: Positive or negative slope.
T statistic: T(df) = T value, p < 0.001.
Interpretation of Estimates:
Intercept: When X is zero, Y is the intercept value.
Slope: For every one unit increase in X, Y increases/decreases by the slope value.
Regression Equations (Examples)
Predicting compassion based on well-being: Compassion = B0 + B1 * WellBeing
Predicting compassion based on insomnia: Compassion = B0 + B1 * Insomnia (where B1 is negative)
Important: Regression predictions do not equal causation.
Assumptions of Regression
Normally Distributed Errors:
Assessed via QQ plot of residuals.
Assumption met when residuals fall on the solid black line.
Snaking indicates non-normality.
Linearity:
Assessed using residual plot.
Want an equal distribution of residuals above and below the zero line.
Homoscedasticity:
Equal variance of residuals across the graph.
No major fanning or coning.
Independent Errors:
Durbin-Watson statistic between 1 and 3.
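For reference, the Durbin-Watson statistic can be computed from the residuals as the sum of squared differences between consecutive residuals divided by the sum of squared residuals. The residuals below are hypothetical, and the 1 to 3 range is the rule of thumb given in these notes.

```python
def durbin_watson(residuals):
    """Sum of squared successive differences / sum of squared residuals."""
    numerator = sum((residuals[i] - residuals[i - 1]) ** 2
                    for i in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

def errors_look_independent(dw):
    """Rule of thumb from these notes: a value between 1 and 3."""
    return 1 < dw < 3

residuals = [1, 0, -1, 0, 1, 0, -1]  # hypothetical regression residuals
dw = durbin_watson(residuals)
print(dw, errors_look_independent(dw))
```

Values near 2 suggest uncorrelated errors; values close to 0 or 4 suggest that neighbouring residuals are related.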
Jamovi Implementation
Analyses -> Regression -> Linear Regression.
Dependent Variable: Outcome.
Covariates: Predictors.
Model Fit: R, R^2, Adjusted R^2.
Model Coefficients: Confidence intervals, standardized estimates, ANOVA test.
Assumption Checks: Autocorrelation (Durbin-Watson), QQ plot of residuals, residual plots.
Chi-Squared Analysis
Used to determine the relationship between categorical variables.
Analyzes differences between frequencies.
Categorical Data Analysis
Previously: Categorical IVs and continuous DVs (t-tests, ANOVAs).
Continuous variables: correlations and regressions.
Now: Relationship between two categorical variables.
Frequencies
Assess frequency as numbers in categories.
Example: Pet Ownership and Survival
Organized in a frequency table.
Chi-Squared Statistic
Compared to critical table of distributions.
Determines if differences between categories are significant.
Research Question
Is there a relationship between having a dog and survival after a heart attack?
Variables:
Dog ownership (yes/no).
Survival status (survived/did not survive).
No specific dependent variable here.
Chi-Square Test
Tests for a relationship between two categorical variables.
Pearson's chi-squared test.
Compares observed frequencies to expected frequencies.
Null Hypothesis
Variables have no relationship.
Rejecting the null: There is a relationship (association) between the two variables.
Assumptions of Chi-Square
Independence:
Each person falls into only one category.
Expected Frequencies:
Should be greater than 5 for each cell.
Must be assessed when expected frequencies are calculated.
Calculating Expected Frequencies
Also called model frequency.
Equation: (Row Total * Column Total) / Total Number of Scores (N).
Steps:
Calculate totals for rows and columns using observed frequencies.
Calculate N (add row totals or column totals).
Calculate expected values for each cell using the equation.
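The steps above can be sketched for a 2x2 frequency table. The counts are invented for illustration (rows: dog ownership yes/no; columns: survived/did not survive) and are not data from any study.

```python
def expected_frequencies(observed):
    """Expected count per cell: (row total * column total) / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]

observed = [[40, 10],   # hypothetical: dog owners (survived, did not survive)
            [30, 20]]   # hypothetical: non-owners
expected = expected_frequencies(observed)
assumption_met = all(cell > 5 for row in expected for cell in row)
print(expected, assumption_met)
```

Here every expected frequency (35 or 15) is above 5, so the expected frequencies assumption would be met.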
Calculating Chi-Squared Statistic
Calculate the squared deviation between observed and expected frequencies for each category.
Divide by the expected frequency.
Sum all of the calculations to obtain a single chi-squared statistic.
Equation: \chi^2 = \sum \frac{(Observed - Expected)^2}{Expected}
Comparing Chi-Squared to Critical Value
Degrees of freedom = (number of rows - 1) * (number of columns - 1)
For 2x2 table, degrees of freedom = 1.
Compare obtained \chi^2 to critical value (e.g., 3.84 for alpha = 0.05 with 1 degree of freedom).
If obtained \chi^2 > critical value, reject the null hypothesis.
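A worked sketch of the chi-squared calculation and the comparison with the critical value. The observed counts are hypothetical, and the expected counts are those given by (row total * column total) / N for the same table.

```python
def chi_squared(observed, expected):
    """Sum of (observed - expected)^2 / expected over every cell."""
    return sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(observed, expected)
               for o, e in zip(obs_row, exp_row))

def degrees_of_freedom(table):
    """(number of rows - 1) * (number of columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)

observed = [[40, 10], [30, 20]]   # hypothetical 2x2 counts
expected = [[35, 15], [35, 15]]   # from (row total * column total) / N
chi2 = chi_squared(observed, expected)
df = degrees_of_freedom(observed)
critical = 3.84                   # critical value for df = 1, alpha = 0.05
reject_null = chi2 > critical
print(round(chi2, 2), df, reject_null)
```

The statistic is about 4.76 with 1 degree of freedom, which exceeds 3.84, so the null hypothesis of no relationship would be rejected for these made-up counts.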
Jamovi Implementation
Analyses -> Frequencies -> Independent Samples (Chi-Square Test of Association).
Rows: One variable (e.g., Ownership).
Columns: Other variable (e.g., Survival).
Counts: Include frequency count.
Cells: Click on observed and expected.
Output: Provides observed/expected frequencies, percentages, chi-square statistic, degrees of freedom, and p-value.
Write-Up
Significant relationship between categories.
Chi-squared statement: \chi^2 (df, N = sample size) = chi-squared value, p = p-value.
Interpret differences using percentage frequencies.
Example: Dog owners had a better survival rate than those who did not have a dog.
Regression revision:
Simple Linear Regression Revision: Using screen time as a predictor for depression, anxiety, and stress.
Descriptive statistics including bivariate correlations: direction, strength, and significance of
Example: Significant moderate positive correlation between screen time and stress; weak positive non-significant correlation between screen time and anxiety; moderate, positive, significant relationship between screen time and depression.
Goodness of Fit and Regression Variables:
R-squared values indicating the proportion of variance in the outcome accounted for by the predictor.
Model significance assessed to determine if screen time is a significant predictor.
Exponents in Jamovi output: Scientific notation for very small or large numbers; convert to a decimal by shifting the decimal point by the number of places given in the exponent.
Regression Equations: Predicting depression, anxiety, and stress based on screen time, interpreting the coefficients and significance
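For the exponent notation mentioned above, the conversion can be checked with a few lines of code. The example values are hypothetical, and the helper name is illustrative.

```python
def to_decimal(text, places=10):
    """Convert scientific notation such as '2.5e-4' to a plain decimal string."""
    value = float(text)
    # Fixed-point formatting, then trim trailing zeros and any dangling point
    return f"{value:.{places}f}".rstrip("0").rstrip(".")

print(to_decimal("2.5e-4"))  # decimal point moves 4 places to the left
print(to_decimal("1.2e3"))   # decimal point moves 3 places to the right
```

A negative exponent means a very small number (shift left); a positive exponent means a large number (shift right).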
Assumptions of Regression:
Normally distributed errors, linearity, homoscedasticity, and independent errors.
Assessment methods include QQ plot for normality, scatter plot for linearity and homoscedasticity, and Durbin-Watson statistic for independence of errors.
Note: Testing or reporting assumptions not required for the data analysis report Part B.
Used when assumptions of linear models are violated.
Each parametric test has a non-parametric equivalent (e.g., Kruskal-Wallis test for one-way between-subjects ANOVA).
Depends on research question, data collection methods (continuous vs. categorical variables), and what is being compared.
Assumptions of parametric tests: Normal distribution of the dependent variable, equality of variance across groups, independence of data points, and a continuous dependent variable.
Reason for testing assumptions: To draw accurate inferences from data.
What to do if assumptions are violated:
Accept the robustness of a test
Transform the data to fit the test
Change the test to fit the data (Non-Parametric Testing).
Non-Parametric Testing:
Ranks data instead of testing based on the distribution. This solves issues of non-normal distributions and outliers, but loses information about the degree of difference between scores.
Ranking data example: Ages of 5 people ranked from lowest to highest. Tied ranks: assign the average of the ranks the tied scores would have occupied. Underlying theory: assess ranked values rather than the original data.
Non-parametric tests make fewer assumptions, have versions that answer the same questions as parametric tests, and suit non-normally distributed data or small sample sizes.
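A small sketch of ranking with tied scores, where tied values share the average of the ranks they would have occupied. The five ages are hypothetical.

```python
def average_ranks(values):
    """Rank from lowest to highest; ties receive the average of their ranks."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v) + 1   # lowest rank this value would occupy
        count = ordered.count(v)       # how many scores share this value
        ranks.append(first + (count - 1) / 2)
    return ranks

ages = [23, 31, 23, 45, 38]  # hypothetical ages of 5 people
ranks = average_ranks(ages)
print(ranks)
```

The two people aged 23 would have taken ranks 1 and 2, so each receives 1.5; the ranks still sum to 15, as they would without ties.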
Between-subjects designs:
Two groups: Mann-Whitney test
Three or more levels: Kruskal-Wallis test
Repeated measures designs:
Two conditions: Wilcoxon signed-rank test
More than 2 levels: Friedman's ANOVA
Conducting non-parametric tests, with a focus on Jamovi, including examples.
Research example: Effectiveness of treatments in treating depression (no treatment, CBT, antidepressants, CBT and AD).
Violation of normality: Conduct non-parametric ANOVA (Kruskal-Wallis test).
Step 1: Rank the scores from lowest to highest on the