Multivariate Analysis Study Notes

Multivariate Data Analysis: Eurovision Example

  • Eurovision Song Contest as a case study for multivariate data analysis.

  • Objective: Determine if Eurovision judging is fair or if underlying patterns exist related to geopolitical relationships.

  • Data: Matrix of countries scoring other countries' acts.

  • Technique: Cluster analysis used to generate a dendrogram, revealing non-random patterns.

  • Findings:

    • Nordic/Scandinavian block.

    • Greece and Cyprus cluster together.

    • UK and Ireland cluster together.

    • Bosnia and Turkey cluster together.

  • Conclusion: Voting is biased towards friendly nations rather than based purely on the quality of the acts.

  • Representation: The results of a multivariate analysis can be displayed as a 2D or 3D map in which clusters are visible (e.g., Eastern, Nordic/Baltic, and Western European blocks).

  • Australia's chances: Australia is unlikely to win due to these geopolitical biases.
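
A minimal sketch of how a dendrogram's merge order can be derived from a scoring matrix. The points below are invented for illustration and are not real contest data; the dissimilarity measure (12 minus the average of the two countries' mutual points) and the choice of average linkage are assumptions, not necessarily the method used in the lecture.

```python
from itertools import combinations

# Hypothetical points awarded (row = voting country, column = receiving country).
# These numbers are invented for illustration; they are not real contest data.
countries = ["Sweden", "Norway", "Greece", "Cyprus", "UK", "Ireland"]
points = [
    [0, 12, 2, 1, 4, 3],
    [12, 0, 1, 2, 3, 4],
    [2, 1, 0, 12, 3, 2],
    [1, 2, 12, 0, 2, 3],
    [4, 3, 2, 1, 0, 10],
    [3, 4, 1, 2, 10, 0],
]

def dissimilarity(i, j):
    """Turn mutual points into a distance: high mutual scores give a small distance."""
    return 12 - (points[i][j] + points[j][i]) / 2

def average_linkage(a, b):
    """Mean pairwise dissimilarity between the members of clusters a and b."""
    return sum(dissimilarity(i, j) for i in a for j in b) / (len(a) * len(b))

def agglomerate(n):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(i,) for i in range(n)]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, average_linkage(a, b)))
    return merges

merges = agglomerate(len(countries))
for a, b, height in merges:
    names_a = "+".join(countries[i] for i in a)
    names_b = "+".join(countries[i] for i in b)
    print(f"merge {names_a} with {names_b} at height {height:.2f}")
```

Each merge corresponds to one join in the dendrogram, and the height is where that join would be drawn. With these invented scores, the three pairs that trade high points merge first, which is the kind of non-random pattern the dendrogram is meant to reveal.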

Course Overview: Techniques and Data

  • First lecture: Focuses on the definition, collection, analysis, and exploration of multivariate data.

  • Second lecture: Covers Principal Components Analysis and Factor Analysis.

  • Other techniques:

    • Cluster Analysis.

    • Non-metric multi-dimensional scaling.

    • MANOVA (multivariate analysis of variance): similar to ANOVA but with multiple response variables.

References

  • Quinn and Keough (old or new edition).

    • Chapters accessible online through the library.

    • Examples using soil samples and biodiversity.

    • Data available for download.

  • The data chapter is available for free.

Course Goals

  • Understand major concepts such as ordination, principal components, factor analysis, non-metric multidimensional scaling, cluster analysis, MANOVA, and PerMANOVA.

  • Learn to make decisions about data and their interpretation.

  • Understand and interpret scientific papers that use these techniques.

Multivariate Techniques: Key Themes

  • Linear combinations of variables: A recurring concept.

  • Distance/dissimilarity/similarity measures: Used repeatedly.

  • Transformation of data: Its purpose differs from that of transformation in univariate analysis.

  • Standardization.
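
A minimal sketch of two of these themes working together: standardization (here z-scores) and a distance measure (here Euclidean distance). The site names and measurements are hypothetical, and z-scoring is only one of several possible standardizations.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical site measurements: (canopy cover in %, soil nitrogen in mg/kg).
sites = {
    "A": (80.0, 1.2),
    "B": (78.0, 3.0),
    "C": (40.0, 1.3),
}

def standardize(rows):
    """Convert each column to z-scores: subtract the column mean, divide by its sample SD."""
    columns = list(zip(*rows))
    centers = [mean(col) for col in columns]
    spreads = [stdev(col) for col in columns]
    return [
        tuple((x - m) / s for x, m, s in zip(row, centers, spreads))
        for row in rows
    ]

def euclidean(p, q):
    """Straight-line distance between two observations."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

names = list(sites)
raw = [sites[n] for n in names]
scaled = standardize(raw)

raw_ab = euclidean(raw[0], raw[1])        # A vs B on the original scales
raw_ac = euclidean(raw[0], raw[2])        # A vs C on the original scales
z_ab = euclidean(scaled[0], scaled[1])    # A vs B after standardization
z_ac = euclidean(scaled[0], scaled[2])    # A vs C after standardization

print(f"raw:          A-B = {raw_ab:.2f}, A-C = {raw_ac:.2f}")
print(f"standardized: A-B = {z_ab:.2f}, A-C = {z_ac:.2f}")
```

On the raw scale, canopy cover dominates the distance simply because its numbers are larger. After standardization, a large difference in soil nitrogen counts about as much as a large difference in canopy cover.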

Types of Multivariate Techniques

  • Focus on ordination and clustering.

  • Brief coverage of regression (multiple regression).

  • Classification (some overlap with clustering).

Multivariate Data: Definition

  • Multiple response variables.

  • Variables are not necessarily independent; they interrelate.

  • Examples:

    • Biodiversity of a park.

    • Physical properties of an environment (soil/water chemistry).

  • Specialized approaches are needed because data often do not conform to traditional statistical assumptions.

Why Use Multivariate Analysis

  • Avoid conducting multiple ANOVAs on interrelated variables.

    • Problem: The probability of making at least one Type I error (false positive) increases with the number of tests.

  • Multivariate statistics consider the interactions of variables together.
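
A minimal sketch of how quickly false positives accumulate across separate tests. It assumes the tests are independent, which interrelated variables are not, so the numbers are an illustration of the trend rather than exact rates; the function name is hypothetical.

```python
def familywise_error_rate(alpha, n_tests):
    """Probability of at least one false positive across n independent tests."""
    return 1 - (1 - alpha) ** n_tests

# Significance level of 0.05 per test, for an increasing number of tests.
for k in (1, 5, 10, 20):
    print(f"{k:2d} tests: {familywise_error_rate(0.05, k):.3f}")
```

Under this assumption, ten separate tests at a significance level of 0.05 give roughly a 40% chance of at least one false positive.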

Types of Questions Addressed

  • Change in community composition.

  • Differences in water quality.

  • Changes in habitat characteristics.

  • Organismal traits (phenotypic traits, diet effects).

Examples of Outputs

  • Ordinations from principal components.

  • Clustering diagrams.

  • Non-metric multidimensional scaling plots.

Habitat Complexity: An Example

  • Habitat complexity is a multivariate term, comprising multiple interrelated variables.

  • Measurement:

    • Canopy cover.

    • Shrub cover.

    • Number of stems.

    • Diversity of trees/shrubs.

    • Number of logs.

    • Percentage of dirt cover.

    • Soil nutrients.

  • Multivariate analysis can compare sites based on these variables.

Data Analysis

  • Multivariate analysis summarizes data sets in which many variables have been measured on each sample.

  • Simplifies data into fewer derived variables (e.g., principal components).

  • Reduces the risk of Type I errors that comes with running many separate univariate tests, and reveals patterns across variables.

  • Applicable to soil characteristics, habitat, organismal traits, water quality, etc.
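
A minimal sketch of how several measured variables can be collapsed into one derived variable, the first principal component. The site data are hypothetical, and power iteration on the covariance matrix is used here only to keep the example dependency-free; in practice a statistics package would be used, and variables on very different scales are often standardized first.

```python
from math import sqrt
from statistics import mean

# Hypothetical measurements at five sites: (canopy cover %, shrub cover %, log count).
data = [
    (80.0, 60.0, 9.0),
    (70.0, 55.0, 8.0),
    (50.0, 35.0, 5.0),
    (30.0, 20.0, 3.0),
    (20.0, 10.0, 1.0),
]

def center(rows):
    """Subtract each column's mean so every variable is centered on zero."""
    means = [mean(col) for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered_rows):
    """Sample covariance matrix of already-centered data."""
    n, p = len(centered_rows), len(centered_rows[0])
    return [
        [sum(r[i] * r[j] for r in centered_rows) / (n - 1) for j in range(p)]
        for i in range(p)
    ]

def first_component(cov_matrix, iterations=200):
    """Approximate the leading eigenvector of cov_matrix by power iteration."""
    v = [1.0] * len(cov_matrix)
    for _ in range(iterations):
        w = [sum(c * x for c, x in zip(row, v)) for row in cov_matrix]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

centered = center(data)
cov = covariance_matrix(centered)
loadings = first_component(cov)

# Each site's PC1 score is a linear combination of its centered variables.
scores = [sum(weight * x for weight, x in zip(loadings, row)) for row in centered]

# Share of total variance captured by PC1: its eigenvalue over the trace of cov.
cov_times_loadings = [sum(c * x for c, x in zip(row, loadings)) for row in cov]
eigenvalue = sum(weight * y for weight, y in zip(loadings, cov_times_loadings))
explained = eigenvalue / sum(cov[i][i] for i in range(len(cov)))

print("loadings:", [round(weight, 3) for weight in loadings])
print("PC1 scores:", [round(s, 1) for s in scores])
print(f"variance explained by PC1: {explained:.1%}")
```

Because the three invented variables rise and fall together, a single component captures almost all of the variation, and the five sites can be compared on one axis instead of three.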

Good Practices

  • Just because you can measure something does not mean you should.

Brazil Football Example

  • Skills measured:

    • Passing accuracy.

    • Kicking strength (ball speed).

    • Speed on angle tracks.

    • Overall strength.

  • Principal components analysis results:

    • Skill level: accuracy, technique, ball control, etc.

    • Athleticism level: physical strength and speed.