comparative genomics
the study of differences and similarities in genome structure and organization in different organisms
combines genome biology and evolutionary biology
what does analyzing genome differences within a phylogenetic framework allow us to do?
we can study how genomes change over time and how those changes relate to observed biological diversity
hypotheses generated with a robust and independent phylogenetic framework are
more likely to provide better explanations for the observed traits
mutation, in the case of genomes
anything that generates variation at the genome level
why do genomes evolve?
mutation generates variation, which natural selection and drift/sorting/luck then act on
mutation process examples
polyploidy
segmental/chromosomal duplications
chromosomal rearrangements
retroviral insertions
transposition
endosymbiosis
change/effect of polyploidy
whole genome duplication
change/effect of segmental/chromosomal duplications
other large-scale duplications, gene shuffling, gene order alterations
change/effect of chromosomal rearrangements
mutagenesis, gene shuffling
change/effect of retroviral insertions
changes in genome size, shuffling, mutagenesis, altered gene expression
change/effect of endosymbiosis
genome reduction, EGT
long term (evolutionary) consequences are determined by
multiple strongly interdependent factors, such as population size, type of reproduction, phenotypic manifestation of the mutations, habitats (stable vs variable), etc
factors may not be generators of variation but instead
create conditions that are conducive to and modulate genome evolution
genome evolution factors (but not generators of variation) examples
radical lifestyle/environmental changes
composition of the ecosystem
exposure to pathogens
becoming a parasite or endosymbiont
acquiring endosymbionts
“nothing”
radical lifestyle/environmental changes
increased exposure to environmental stressors (heat, radiation) can lead to increased rates of mutation and rearrangements and/or alter nucleotide composition
changes in population structure and/or population size affect probability of fixation of rare chromosomal changes, including large duplications
composition of the ecosystem
e.g. invasive spp. increase competition pressure; the arrival of predators does the same
exposure to pathogens
affects population structure (e.g. bottlenecks); favours fixation of low-frequency genomic changes; microorganisms can introduce active TEs into the genome
becoming a parasite or endosymbiont
massive gene loss (due to pressure for a small genome and/or redundancy of gene function) and/or increased rates of mutation, resulting in genome reduction and overall sequence divergence
acquiring endosymbionts
induces massive acquisition and replacement of genes; affects genome size and gene complement
in a stable environment with a continuous supply of resources, basal rates of mutation result in
genome change over time (e.g. base substitutions, DNA damage, recombination)
what traits can we look at?
genome architecture, base composition (related to codon usage and amino acid content)—but there are many characteristics
genome architecture
number and shape of chromosomes; distribution of non-coding DNA; order of the genes
base composition
usually expressed as %GC or %AT
most eukaryotic genomes are at around 50% GC but due to multiple factors the composition can become strongly biased
many intracellular parasites exhibit strong AT biases, probably because they have poor DNA repair efficiency, resulting in high rates of certain mutations, especially those that turn C→A
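A minimal sketch of how %GC and %AT can be computed from a sequence. The function name `gc_percent` and the example sequence are made up for illustration; ambiguous characters such as N are simply ignored, which is one possible convention rather than a standard.

```python
def gc_percent(sequence: str) -> float:
    """Return %GC of a DNA sequence, counting only A, C, G and T."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at  # ambiguous bases (e.g. N) are left out of the total
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return 100.0 * gc / total


if __name__ == "__main__":
    example = "ATGCGCATTA"  # hypothetical sequence, illustration only
    gc = gc_percent(example)
    print(f"%GC = {gc:.1f}, %AT = {100.0 - gc:.1f}")
```

Because %AT is just 100 minus %GC, reporting either value conveys the same information.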
complications with nucleotide composition
it is tightly linked to codon usage and the amino acid content of the encoded proteins
therefore, these 3 parameters are strongly interdependent, and this means that figuring out what exactly causes a particular bias is usually very difficult
what does “shared genes” mean?
homologs, ideally orthologs
minimal gene set/core eukaryotic genes
a given organism’s gene catalog that consists of a group of genes for components (proteins, RNAs) that are essential for the basic functions of the cell
e.g. ribosome components, DNA/RNA pols, other DNA/RNA synthesis and processing proteins, components of basic cellular mechanisms, etc.
we can expect to find the genes encoding these components in any eukaryotic organism
generalized gene set
for genes involved in central metabolism and other cellular activities that are more or less ubiquitous
e.g. synthesis of macromolecules such as nucleotides and AAs, intracellular transport, etc.
found in most euk species
specialized gene set
for genes involved in specific pathways and systems that are restricted to certain groups of organisms
e.g. photosynthesis, assimilation of nutrients present in certain habitats, cellular structures involved in specific activities such as feeding, mobility, etc
rules for the layers of gene sets and distributions across lineages
most organisms will have genes from all layers
most genes in the core layer can be found in all organisms
most genes in the generalized layer will be found in most free-living organisms
genes in the specialized layer tend to show restricted taxonomic distribution
genes in a specific functional role that are found in different organisms are not necessarily homologous
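A rough sketch of how the three layers could be told apart from taxonomic distribution alone. Everything here is an assumption for illustration: the family names, the species labels, and especially the 0.7 cutoff for "most species". Requiring presence in every species for the core layer is also a simplification, since the rule above only says most core genes are found in all organisms. The families are assumed to be groups of homologs already; presence in the same functional role does not by itself establish homology.

```python
from typing import Dict, Set


def classify_layer(n_with_gene: int, n_total: int,
                   generalized_cutoff: float = 0.7) -> str:
    """Assign a gene family to a layer from the fraction of species that have it.

    generalized_cutoff is a hypothetical threshold, not a published value.
    """
    if not 0 < n_with_gene <= n_total:
        raise ValueError("need 0 < n_with_gene <= n_total")
    fraction = n_with_gene / n_total
    if fraction == 1.0:
        return "core"
    if fraction >= generalized_cutoff:
        return "generalized"
    return "specialized"


if __name__ == "__main__":
    # Hypothetical presence data: gene family -> species that carry it.
    presence: Dict[str, Set[str]] = {
        "ribosomal_protein": {"sp1", "sp2", "sp3", "sp4"},
        "amino_acid_synthesis": {"sp1", "sp2", "sp3"},
        "photosystem": {"sp4"},
    }
    all_species = set().union(*presence.values())
    for family, species in presence.items():
        print(family, classify_layer(len(species), len(all_species)))
```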
EGT (endosymbiotic gene transfer)
like proks, euk genomes can take up foreign genes from various sources and through various mechanisms
this contribution has been significant, bringing thousands of genes of bacterial ancestry into euk genomes (e.g. from mitochondria and plastids)
LGT (lateral gene transfer) and euks
its extent is uncertain and it is sporadic compared to proks, but it still constitutes a significant source of metabolic innovation that has enabled many organisms to colonize new environments
how can we count shared genes?
use specialized tools for classifying genes into orthologous groups, e.g. OrthoFinder
gene complement and counting shared genes
can be done at different evolutionary distances
with closely related species, complement will be very similar
as distance becomes larger, the number of genes in common becomes smaller
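A small sketch of the counting step, assuming orthologous groups have already been produced by a tool such as OrthoFinder and loaded into a dictionary. The loading step is omitted, and the dictionary layout, the orthogroup IDs and the species labels are all hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def shared_counts(orthogroups: Dict[str, Set[str]]) -> Dict[Tuple[str, str], int]:
    """Count, for every pair of species, the orthogroups that contain both."""
    species = sorted(set().union(*orthogroups.values()))
    counts = {}
    for a, b in combinations(species, 2):
        counts[(a, b)] = sum(
            1 for members in orthogroups.values() if a in members and b in members
        )
    return counts


if __name__ == "__main__":
    # Hypothetical data: orthogroup ID -> species with at least one member gene.
    toy = {
        "OG1": {"A", "B", "C"},
        "OG2": {"A", "B"},
        "OG3": {"A"},
        "OG4": {"B", "C"},
    }
    for pair, n in shared_counts(toy).items():
        print(pair, n)
```

With real data, pairs of closely related species would be expected to give higher counts than distant pairs, in line with the cards above.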
phylogenetic framework
allows us to infer rates of expansion and contraction (i.e. gains and losses) of genes and gene families that occurred on individual lineages
data and parameters considered for phylogenetic frameworks
number of genes in existing species
topology of the tree (branching order)
length of the branches (amount of evolution sustained by each lineage)
absolute time (calibrated with external data such as fossils)
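A deliberately simplified sketch of how these inputs combine into a per-lineage rate: net change in gene family size divided by branch length. It assumes the ancestral gene counts are already known, whereas in practice they have to be inferred from the counts in existing species and the tree. The `Branch` class, its field names and the numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Branch:
    name: str
    parent_count: int   # gene family size at the ancestral node (assumed known)
    child_count: int    # gene family size at the descendant node or species
    length: float       # branch length: amount of evolution or calibrated time


def net_change_rate(branch: Branch) -> float:
    """Net gains (positive) or losses (negative) per unit of branch length."""
    if branch.length <= 0:
        raise ValueError("branch length must be positive")
    return (branch.child_count - branch.parent_count) / branch.length


if __name__ == "__main__":
    # Hypothetical lineages, for illustration only.
    for b in (Branch("lineage_X", 10, 16, 3.0), Branch("lineage_Y", 10, 7, 1.5)):
        print(b.name, net_change_rate(b))
```

A net rate hides cases where gains and losses cancel out on the same branch, so it is a lower bound on the actual amount of turnover.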
we can also analyze what ______ are represented in shared genes
functions