What is sequencing depth
number of times that a given nucleotide in the genome has been read in an experiment
The total number of sequence reads generated for a sample during 16S rRNA sequencing or shotgun sequencing
Related term, sampling depth: the standardised number of reads per sample used during downstream analyses
What happens when sampling depth is uneven
artificial inflation of diversity
Rare taxa more likely to be found with higher sampling depth
Full range of species present in the sample is rarely saturated
Why do we normalise the feature table
To eliminate bias due to differences in sampling (sequencing) depth between samples
Biases do not reflect true differences in the underlying biology but exist due to variations in sample collection, library preparation, and/or sequencing
Effective normalisation enables accurate comparisons of statistics from different measurements
What happens when there is low sampling depth
Loss of rare taxa
List all normalisation methods
Centre log ratio (CLR)
Total sum Scaling (TSS)
Rarefaction
What is Centre log ratio
Normalisation + transformation
Normalisation = removes per-sample technical effects (e.g. sampling depth)
Transformation = reshapes skewed data so that the assumptions behind model fits are met
Applies a centred log ratio transformation
Advantage of CLR
Suitable for the study of correlation coefficients and subsequent multivariate data analyses
Log-ratios are often preferable as they are resilient to matrix effects.
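Worked example for CLR. This is a minimal sketch rather than a reference implementation: each count is log-transformed and the sample's mean log value is subtracted. The pseudocount of 1 is an assumption added here so that zero counts can be logged; the function name and the counts are made up for illustration.

```python
import math

def clr(counts, pseudocount=1.0):
    """Centred log-ratio for one sample.

    pseudocount is an assumption of this sketch: log(0) is undefined,
    so a small value is added to every count first.
    """
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

sample = [120, 30, 0, 50]  # made-up ASV counts for one sample
clr_values = clr(sample)
print(clr_values)
```

The CLR values of a sample always sum to zero, so each value reads as "above or below the sample's geometric mean".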
What is TSS
The most common normalisation technique
Transforms the feature table into proportions (relative abundance)
Divides feature read counts (the number of reads that cluster within the same ASV) by the total number of reads in each sample
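Worked example for TSS. A minimal sketch of the division described above, with a hypothetical function name and made-up ASV counts: each feature count is divided by the sample's total read count.

```python
def tss_normalise(counts):
    """Convert one sample's feature counts into relative abundances."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sample has no reads")
    return {feature: n / total for feature, n in counts.items()}

sample = {"ASV1": 50, "ASV2": 30, "ASV3": 20}  # made-up counts
proportions = tss_normalise(sample)
print(proportions)  # {'ASV1': 0.5, 'ASV2': 0.3, 'ASV3': 0.2}
```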
What is rarefaction
Randomly subsampling each sample down to the lowest read depth of any sample, or down to the same pre-defined number of reads, so that all samples have the same sampling depth
Advantage of rarefaction
Low-depth samples contain a higher proportion of contaminants (rRNA not from the intended sample); samples that do not reach the defined sampling depth are removed
Disadvantage of rarefaction
this technique can throw out a lot of data, so it is not always appropriate
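Worked example for rarefaction. A rough sketch only: reads are drawn at random without replacement until the chosen depth is reached, and a sample with fewer reads than that depth is dropped. The function name, the depth of 100 and the counts are all made up for illustration.

```python
import random

def rarefy(counts, depth, seed=0):
    """Subsample one sample to `depth` reads without replacement.

    Returns None when the sample has fewer than `depth` reads,
    i.e. the sample is discarded.
    """
    if sum(counts.values()) < depth:
        return None
    # One entry per read, labelled with the feature it belongs to.
    pool = [feature for feature, n in counts.items() for _ in range(n)]
    picked = random.Random(seed).sample(pool, depth)
    rarefied = {feature: 0 for feature in counts}
    for feature in picked:
        rarefied[feature] += 1
    return rarefied

deep = {"ASV1": 700, "ASV2": 250, "ASV3": 50}  # 1000 reads
shallow = {"ASV1": 40, "ASV2": 10}             # 50 reads

rarefied_deep = rarefy(deep, depth=100)
rarefied_shallow = rarefy(shallow, depth=100)
print(rarefied_deep)     # counts now sum to 100
print(rarefied_shallow)  # None: sample dropped
```

The deep sample keeps only 100 of its 1000 reads, which illustrates how much data the technique can discard.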
What is Alpha diversity
A measure of how diverse a single sample is, usually taking into account the number of different species observed.
What are the components of Alpha diversity
Richness and Evenness
What is Richness
estimates the number of different species present in a sample
It only takes into account the absence and presence of taxa
What is used to measure species richness
Chao1
Observed features
Faith PD
What is Chao1
Chao1 is an indicator of species richness
Estimates the total number of species in a sample
What 3 factors does Chao1 take into account
The number of species
The number of singleton taxa
The number of doubleton taxa.
What is Observed features
Indicator of species richness
The number of taxa (ASVs) observed
What is the difference between Chao1 and Observed features
Chao1 considers more than the observed ASVs: it gives more weight to low-abundance taxa, using the singletons and doubletons to estimate the number of missing (unobserved) taxa
Observed features only counts the ASVs that were actually observed.
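Worked example comparing the two metrics. This sketch assumes the classic Chao1 formula, S_obs + F1^2 / (2 * F2), where F1 is the number of singletons and F2 the number of doubletons. When there are no doubletons it falls back to the bias-corrected form, S_obs + F1 * (F1 - 1) / 2. Function names and counts are made up for illustration.

```python
def observed_features(counts):
    """Number of features (ASVs) with at least one read."""
    return sum(1 for c in counts if c > 0)

def chao1(counts):
    """Chao1 richness estimate from observed, singleton and doubleton taxa."""
    s_obs = observed_features(counts)
    f1 = sum(1 for c in counts if c == 1)  # singletons
    f2 = sum(1 for c in counts if c == 2)  # doubletons
    if f2 > 0:
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form, used here because F2 = 0 would divide by zero.
    return s_obs + f1 * (f1 - 1) / 2

counts = [10, 5, 1, 1, 1, 2, 2, 8, 0]  # made-up ASV counts for one sample
print(observed_features(counts))  # 8
print(chao1(counts))              # 10.25
```

Here Chao1 is higher than the observed count because the three singletons and two doubletons suggest that some taxa were missed.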
What is Faith PD
Phylogenetic diversity (PD) is a measure of diversity, based on phylogeny
defined as the sum of the branch lengths of a phylogenetic tree connecting all species covered by a sample
Most widely used phylogenetic metric
Measures species richness
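Worked example for Faith PD. A minimal sketch under stated assumptions: the tree is a hypothetical toy tree stored as a dictionary mapping each node to its parent and branch length, and the branches summed are those on the paths from each observed tip to the root (a common convention, assumed here).

```python
def faith_pd(tree, present_tips):
    """Sum the branch lengths connecting the observed tips to the root.

    tree maps node -> (parent, branch_length); the root has parent None.
    Each branch is counted once, even if several tips share it.
    """
    visited = set()
    total = 0.0
    for tip in present_tips:
        node = tip
        while node is not None and node not in visited:
            visited.add(node)
            parent, length = tree[node]
            total += length
            node = parent
    return total

# Toy tree: A and B share the internal node X; C branches from the root.
tree = {
    "root": (None, 0.0),
    "X": ("root", 1.0),
    "A": ("X", 2.0),
    "B": ("X", 3.0),
    "C": ("root", 4.0),
}

pd_ab = faith_pd(tree, ["A", "B"])
pd_abc = faith_pd(tree, ["A", "B", "C"])
print(pd_ab)   # 6.0
print(pd_abc)  # 10.0
```

Adding the distantly related tip C increases PD by its whole branch, while the branch to X shared by A and B is only counted once.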
What is Evenness
A measure of how equally abundant the different species that make up the richness are.
What can be used to measure Evenness
Pielou’s evenness
Shannon Diversity Index
Simpson Index
What is Pielou’s Evenness
Provides information about how equally species abundances are distributed in each sample
Evenness is high if all species have similar abundances
A low J (evenness score) indicates that one or a few species dominate the community
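Worked example for Pielou's evenness. This sketch assumes the usual definition J = H / ln(S), where H is the Shannon index computed with the natural log and S is the number of observed species. Function names and counts are made up for illustration.

```python
import math

def shannon(counts):
    """Shannon index H (natural log) for one sample."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def pielou(counts):
    """Pielou's evenness J = H / ln(S); undefined for a single species."""
    s = sum(1 for c in counts if c > 0)
    if s < 2:
        raise ValueError("evenness needs at least two species")
    return shannon(counts) / math.log(s)

even = [25, 25, 25, 25]   # all species equally abundant
dominated = [97, 1, 1, 1]  # one species dominates

j_even = pielou(even)
j_dominated = pielou(dominated)
print(j_even)       # close to 1
print(j_dominated)  # close to 0
```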
What is Shannon Diversity
Provides information on both species richness and evenness
Takes into account the relative abundance of each species
Considers the rarity and commonness of species in a community
Assumes all species are represented in a sample and that they are randomly sampled
Values of H are usually between 1.5 and 3.5 (the units depend on the log base: nats for the natural log, bits for log base 2)
A value of H = 0 indicates a community that only has one species
Generally does not exceed 5
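Worked example for the Shannon index. A minimal sketch assuming the standard formula H = -sum(p_i * log(p_i)), where p_i is the relative abundance of species i. The `base` parameter is an illustrative addition showing how the units change with the log base; the counts are made up.

```python
import math

def shannon(counts, base=math.e):
    """Shannon index H; base e gives nats, base 2 gives bits."""
    total = sum(counts)
    h_nats = -sum((c / total) * math.log(c / total) for c in counts if c > 0)
    return h_nats / math.log(base)

single = [100]           # one species only
even = [25, 25, 25, 25]  # four equally abundant species

h_single = shannon(single)
h_even_nats = shannon(even)
h_even_bits = shannon(even, base=2)
print(h_single)     # zero: a one-species community
print(h_even_nats)  # about 1.386, i.e. ln(4)
print(h_even_bits)  # about 2 bits
```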
What is Simpson Index
Provides information on both species richness and evenness
A dominance index: it gives more weight to common or dominant species.
A few rare species with only a few representatives have little effect on the index
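Worked example for the Simpson index. This sketch assumes the form D = sum(p_i^2) for dominance and reports 1 - D as diversity; other variants of the index exist. The function name and counts are made up, and the second community simply adds two very rare species to the first.

```python
def simpson_diversity(counts):
    """Simpson diversity 1 - D, where D = sum of squared relative abundances."""
    total = sum(counts)
    dominance = sum((c / total) ** 2 for c in counts)
    return 1.0 - dominance

two_common = [500, 500]
with_rare = [500, 500, 1, 1]  # same community plus two rare species

d_two = simpson_diversity(two_common)
d_rare = simpson_diversity(with_rare)
print(d_two)   # 0.5
print(d_rare)  # only slightly above 0.5
```

Richness doubles from two to four species, yet the index barely moves, because the rare species contribute almost nothing to the sum of squared abundances.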