Multivariate Analysis Study notes
Multivariate Data Analysis: Eurovision Example
- Eurovision Song Contest as a case study for multivariate data analysis.
- Objective: Determine if Eurovision judging is fair or if underlying patterns exist related to geopolitical relationships.
- Data: Matrix of countries scoring other countries' acts.
- Technique: Cluster analysis used to generate a dendrogram, revealing non-random patterns.
- Findings:
- Nordic/Scandinavian block.
- Greece and Cyprus cluster together.
- UK and Ireland cluster together.
- Bosnia and Turkey cluster.
- Conclusion: Voting is biased towards friendly nations rather than based purely on the quality of the acts.
- Representation: Multivariate data analysis can be represented in 2D or 3D mapping, showing clusters (e.g., Eastern, Nordic/Baltic, Western European blocks).
- Australia's chances: Australia is unlikely to win due to these geopolitical biases.
Course Overview: Techniques and Data
- First lecture: Focuses on the definition, collection, analysis, and exploration of multivariate data.
- Second lecture: Covers Principal Components Analysis and Factor Analysis.
- Other techniques:
- Cluster Analysis.
- Non-metric multi-dimensional scaling.
- MANOVA (multivariate, analysis of variance) - similar to ANOVA but with multiple variables.
References
- Quinn and Keogh (old or new edition).
- Chapters accessible online through the library.
- Examples using soil samples and biodiversity.
- Data available for download.
- Data chapter for free.
Course goals
- Understand major concepts such as ordination, principal components, factor analysis, non-metric multidimensional scaling, cluster analysis, MANOVA, and PerMANOVA.
*Learn to make decisions about data and its interpretation.
*To understand and interpret scientific papers using these techniques.
Multivariate Techniques: Key Themes
- Linear combinations of variables: A recurring concept.
- Distance/dissimilarity/similarity measures: Used repeatedly.
- Transformation of data: Note that its use differs from univariate transformation.
- Standardization.
Types of Multivariate Techniques
- Focus on ordination and clustering.
- Brief coverage of regression (multiple regression).
- Classification (some overlap with clustering).
Multivariate Data: Definition
- Multiple response variables.
- Variables are not necessarily independent; they interrelate.
- Examples:
- Biodiversity of a park.
- Physical properties of an environment (soil/water chemistry).
- Specialized approaches are needed because data often do not conform to traditional statistical assumptions.
Why use Multivariate Analysis
- Avoid conducting multiple ANOVAs on interrelated variables.
- Problem: Type I errors (false positives) increase with multiple tests.
- Multivariate statistics consider the interactions of variables together.
Types of Questions Addressed
- Change in community composition.
- Differences in water quality.
- Changes in habitat characteristics.
- Organismal traits (phenotypic traits, diet effects).
Examples of Outputs
- Ordinations from principal components.
- Clustering diagrams.
- Non-metric multidimensional scaling plots.
Habitat Complexity: An Example
- Habitat complexity is a multivariate term, comprising multiple interrelated variables.
- Measurement:
- Canopy cover.
- Shrub cover.
- Number of stems.
- Diversity of trees/shrubs.
- Number of logs.
- Percentage of dirt cover.
- Soil nutrients.
- Multivariate analysis can compare sites based on these variables.
Data Analysis
- Multivariate analysis summarizes existing data by measuring many variables.
- Simplifies data into fewer derived variables (e.g., principal components).
- Reduces Type I errors and reveals patterns.
- Applicable to soil characteristics, habitat, organismal traits, water quality, etc.
Good Practices
- Just because you can measure it, does not mean you should.
- Skills measured:
- Passing accuracy.
- Kicking strength (ball speed).
- Speed on angle tracks.
- Overall strength.
- Principal components analysis results:
- Skill Level: accuracy, technique, ball controle, etc..,
- Athleticism Level: Physical strength and speed.