Genetic Markers in Wildlife Conservation Research

Genetic Polymorphism

  • Presence of two or more variant forms of a specific DNA sequence/same gene in the genome.
  • Can occur among different individuals, within a population, or amongst different populations.
  • Results from changes in DNA content/point mutation, variations in DNA fragment size/length, or copy number variations of the same gene/locus.

Types of Genetic Polymorphism

  • Variability in gene content
  • Single-nucleotide polymorphisms (SNPs):
    • Variation of a single nucleotide at a specific position in the nuclear genome.
    • Results from a substitution mutation that changes a single base pair.
  • Insertion/deletion (Indel):
    • Presence or absence of a specific nucleotide sequence or fragment in an individual’s nuclear genome.
    • Single-nucleotide indels are sometimes referred to as SNPs.
  • Variable number of tandem repeats:
    • Consist of sequence units ≥2 base pairs (bp) in length that are repeated adjacent to each other.
    • Involve as few as two copies or many thousands of copies.
    • Organized in a head-to-tail orientation.
    • Classified based on the size of each repeat unit (e.g., minisatellites and microsatellites).
  • Variability in gene/loci copies:
    • Presence of more than one copy of the gene/locus (e.g., some immune genes from the major histocompatibility complex (MHC)).

Genetic Markers

  • Informative DNA sequence (gene or non-coding sequence) that:
    • Aids in the identification of individuals/species.
    • Resolves taxonomic uncertainties.
    • Assesses diversity in populations.
    • Associates with a particular trait.
  • Provides wildlife managers information to protect biodiversity by identifying conservation units, such as:
    • Evolutionarily Significant Units (ESUs).
    • Management Units (MUs).
    • Family groups.
  • Helps to understand evolutionary changes that occur over time within and across species and populations.
  • Informs conservation actions by clarifying how the following affect survival and extinction risk:
    • Loss of genetic diversity.
    • Inbreeding.
    • Changes in population demographics/dynamics.
  • Assists in informing breeding and reintroduction programs.
  • DNA polymorphisms help identify variable genetic sequences (termed genetic markers) that distinguish individuals within populations, populations from one another, and/or different species.

Classification of Genetic Markers According to Parental Source

  • Uniparental:
    • Mitochondrial (mtDNA): maternally inherited, extra-nuclear.
    • Chloroplast (cpDNA): maternally inherited (but not in all plant species), extra-nuclear.
    • Single Y-chromosome: nuclear, part of sex chromosomes, paternally inherited.
  • Biparental:
    • Nuclear DNA: maternally and paternally inherited.

Classification According to Evolutionary Constraints or Forces

  • Neutral genetic markers:
    • Tell us about population demographics (e.g., gene flow, migration or dispersal) and the evolution of species; microsatellites are an example.
    • Not influenced by selection (positive/negative).
    • Natural selection does not act upon these.
    • Not influenced by environment.
  • Adaptive genetic markers:
    • Tell us about the adaptive evolutionary history and potential of a population or a species (e.g., immune genes).
    • Help us to understand how species/populations respond or cope with environmental challenges, including diseases.
    • Are influenced by selection.

Nuclear DNA Markers (Biparental)

  • Variable number tandem repeats (VNTRs): characterised by a high degree of length polymorphism; they help to understand:
    • recent historic events
    • genetic diversity
    • population dynamics and demographics
    • gene flow
    • inbreeding
    • pedigree
  • Microsatellites: tandemly repeated motifs of 1–6 bases that can repeat about 5–100 times at each locus; they are more or less randomly dispersed throughout the genome and frequently appear in transcription units.
  • Minisatellites: tandemly repeated motifs of 8–100 bases that can repeat from two to several hundred times at each locus. Minisatellites are interspersed but often clustered in telomeric regions.
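
The microsatellite definition above (a motif of 1–6 bases repeated in tandem at least about 5 times) can be turned into a simple sequence scan. The following is a minimal sketch, not a production repeat finder: the function name find_microsatellites, its parameters, and the toy sequence are all made up for illustration, and the thresholds simply mirror the numbers quoted in the notes.

```python
import re

def find_microsatellites(sequence, min_repeats=5, max_motif_len=6):
    """Return (start, motif, repeat_count) for each perfect tandem-repeat run."""
    # The motif group is lazy so the shortest repeating unit is reported
    # (e.g. "CA" rather than "CACA"); \1 then requires further exact copies.
    pattern = re.compile(
        r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif_len, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        repeat_count = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeat_count))
    return hits

# Made-up sequence: a (CA)6 run flanked by non-repetitive bases.
toy_sequence = "TTG" + "CA" * 6 + "GGT"
print(find_microsatellites(toy_sequence))  # [(3, 'CA', 6)]
```

Only perfect (uninterrupted) repeats are detected here; real loci often contain interrupted or compound repeats, which this sketch ignores.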

Microsatellite Markers

  • Microsatellites are co-dominant markers with bi-allelic or multi-allelic presentation in an individual or a population, respectively.
  • Are highly polymorphic.
  • Can easily be amplified by PCR.
  • Highly versatile markers for molecular fingerprinting.
  • Are specific to species.
  • Have specific positions in their genomes.
  • There are different approaches to identify species-specific microsatellites (including genome sequences and genomic libraries).
  • Once species-specific microsatellites are identified for a species, a set or panel of informative loci can be widely used for that species.
  • Do not code for proteins but may be linked to coding sequences.

Microsatellite Alleles

  • Microsatellites are neutral markers and mainly occur in non-coding DNA.
  • How microsatellites are named:
    • 1-3 letters of the scientific species name.
    • Chromosome that contains the microsatellite.
    • A simple consecutive number uniquely identifying a particular locus within that chromosome.
    • Alleles are annotated using their length or size.
    • E.g., MML2S3, where MML corresponds to Macaca mulatta (common name, rhesus macaque), 2 corresponds to the chromosome number, and S3 is the unique locus identifier. The alleles in this case are 28 and 38.
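
To make the naming scheme concrete, the sketch below splits a locus name such as MML2S3 into its three parts. It assumes a strict layout of 1–3 letters, a numeric chromosome, then "S" plus a serial number; that layout is inferred from the single example in these notes, and the names LocusName and parse_locus_name are hypothetical.

```python
import re
from typing import NamedTuple

class LocusName(NamedTuple):
    species_code: str   # 1-3 letters from the scientific name
    chromosome: int     # chromosome carrying the locus
    locus_id: str       # serial identifier within that chromosome

# Assumed layout: letters, chromosome number, then "S" + serial number.
_LOCUS_PATTERN = re.compile(r"^([A-Za-z]{1,3})(\d+)(S\d+)$")

def parse_locus_name(name):
    """Split a microsatellite locus name into its components."""
    match = _LOCUS_PATTERN.match(name)
    if match is None:
        raise ValueError(f"unrecognised locus name: {name!r}")
    species_code, chromosome, locus_id = match.groups()
    return LocusName(species_code.upper(), int(chromosome), locus_id)

locus = parse_locus_name("MML2S3")
print(locus.species_code, locus.chromosome, locus.locus_id)  # MML 2 S3
```

Names that use sex chromosomes (X, Y) or other conventions would need a different pattern.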

Frequencies of Microsatellite Genotypes

  • Microsatellites are useful to assess the level of heterozygosity and allelic diversity at the population level.
  • Frequencies range from 0 to 1.

For example, of 103 individuals:

  • Genotype 28/28: 68 individuals, Frequency = 68/103 = 0.66
  • Genotype 38/28: 31 individuals, Frequency = 31/103 = 0.30
  • Genotype 38/38: 4 individuals, Frequency = 4/103 = 0.04
  • Total = 1.00
  • Total number of copies of ’28’ = (2 × 68) + 31 = 167
  • Total number of copies of ’38’ = 31 + (2 × 4) = 39
  • Overall total = 167 + 39 = 206 alleles
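
The counting above can be reproduced in a few lines. This sketch assumes 68 homozygotes 28/28, 31 heterozygotes 28/38 and 4 homozygotes 38/38, which is the assignment consistent with the allele counts of 167 and 39; the function names are illustrative only. It also reports allele frequencies and observed heterozygosity, which follow directly from the same counts.

```python
# Genotype counts for one microsatellite locus (103 diploid individuals).
genotype_counts = {("28", "28"): 68, ("28", "38"): 31, ("38", "38"): 4}

def genotype_frequencies(counts):
    """Frequency of each genotype = individuals with it / all individuals."""
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}

def allele_counts(counts):
    """Each individual contributes both of its alleles to the tally."""
    tally = {}
    for genotype, n in counts.items():
        for allele in genotype:
            tally[allele] = tally.get(allele, 0) + n
    return tally

def observed_heterozygosity(counts):
    """Proportion of individuals carrying two different alleles."""
    total = sum(counts.values())
    heterozygotes = sum(n for (a, b), n in counts.items() if a != b)
    return heterozygotes / total

alleles = allele_counts(genotype_counts)
total_alleles = sum(alleles.values())
print(alleles)                                            # {'28': 167, '38': 39}
print(total_alleles)                                      # 206
print({a: round(n / total_alleles, 2) for a, n in alleles.items()})
print(round(observed_heterozygosity(genotype_counts), 2))  # 0.3
```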

Single Nucleotide Polymorphisms (SNPs)

  • Single substitution at a particular position/site.
  • Occur roughly every 300–1000 bp, giving millions of SNPs per genome.
  • Can be studied at a single locus, multiple loci, a genome region or the entire genome.
  • SNP discovery is now typically done by whole-genome sequencing/resequencing.
  • Identified SNPs can be placed on an oligonucleotide array or microchip (SNP chip) to assess genome-wide SNP variation at scales of 50,000–100,000 SNPs.
  • SNPs can be used for genotyping and genome-wide scale analyses.

How to Genotype SNPs

  • Sequence the whole genome
    • SNPs show up as heterozygous sites within an individual and as differences when multiple individuals are compared
  • “Resequence” the genome
    • “Light” sequencing; align to a reference genome
  • Targeted methods
    • PCR, SNP chips, sequence capture
  • Reduced-representation sequencing
    • A method for sequencing a repeatable, small portion of the genome in multiple individuals
    • Becoming very popular in wildlife studies
    • Performs best with a reference genome, but is doable without one

Genome Sequencing and Resequencing

  • To assemble a reference genome, a high quality and quantity of sequencing is required; often multiple technologies are used.
  • In resequencing, we are looking for variations among individuals and have the benefit of the reference to compare to; less sequencing effort is required.
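
As a toy illustration of comparing resequenced individuals against a reference, the sketch below reports every position where a sample differs from the reference. It assumes the sequences are already aligned, equal in length, and reduced to a single base per site (no indels, no heterozygous calls); the function name and the sequences are invented for the example, and real variant callers work from read alignments with quality information.

```python
def find_variant_sites(reference, samples):
    """Return {position: {sample_name: base}} for sites differing from the reference.

    Positions are 1-based. Sequences must be pre-aligned and of equal length.
    """
    variants = {}
    for name, sequence in samples.items():
        if len(sequence) != len(reference):
            raise ValueError(f"{name} is not the same length as the reference")
        for position, (ref_base, base) in enumerate(zip(reference, sequence), start=1):
            if base != ref_base:
                variants.setdefault(position, {})[name] = base
    return variants

# Made-up data: three individuals aligned to an 8 bp reference.
reference = "ACGTTGCA"
samples = {"ind1": "ACGTTGCA", "ind2": "ACATTGCA", "ind3": "ACATTGCT"}
print(find_variant_sites(reference, samples))
# {3: {'ind2': 'A', 'ind3': 'A'}, 8: {'ind3': 'T'}}
```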

Reduced-Representation Sequencing (RRS)

  • Also sometimes called restriction site-associated DNA sequencing (“RADseq”) or genotyping by sequencing (“GBS”).
    1. Cut up the DNA using a restriction enzyme (“restriction digestion”).
    2. Select fragments of a target size range (“size selection”).
    3. Sequence those fragments.
    4. Align fragments to each other or to a reference genome and discard poor or ambiguous matches (“alignment and filtering”).
    5. Look for sequence differences (SNPs) among individuals or populations.
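
Steps 1 and 2 can be mimicked in silico. The sketch below assumes the enzyme EcoRI, which recognises GAATTC and cuts after the first G; the enzyme choice, the function names, the size window and the toy sequence are illustrative assumptions, not part of any specific RRS protocol.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut `sequence` at every `site`, `cut_offset` bases into the site.

    Defaults model EcoRI (G^AATTC). Returns the fragments in order.
    """
    fragments = []
    start = 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        position = sequence.find(site, position + 1)
    fragments.append(sequence[start:])  # tail after the last cut
    return fragments

def size_select(fragments, min_len, max_len):
    """Keep only fragments whose length falls inside the target window."""
    return [f for f in fragments if min_len <= len(f) <= max_len]

# Made-up 26 bp sequence containing two EcoRI sites.
toy_genome = "AAAA" + "GAATTC" + "C" * 8 + "GAATTC" + "TT"
fragments = digest(toy_genome)
print([len(f) for f in fragments])       # [5, 14, 7]
print(size_select(fragments, 10, 20))    # ['AATTCCCCCCCCCG']
```

Because the same enzyme cuts at the same sites in every individual, the selected fragments come from a repeatable subset of the genome, which is what makes the later comparison among individuals possible.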

Frequencies of SNP Genotypes

For example, of 103 individuals:

  • Genotype A/A: 68 individuals, Frequency = 68/103 = 0.66
  • Genotype A/B: 31 individuals, Frequency = 31/103 = 0.30
  • Genotype B/B: 4 individuals, Frequency = 4/103 = 0.04
  • Total = 1.00
  • Total number of copies of ’A’ = (2 × 68) + 31 = 167
  • Total number of copies of ’B’ = 31 + (2 × 4) = 39
  • Overall total = 167 + 39 = 206 alleles
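
In practice SNP genotypes usually arrive as one call per individual rather than as a summary table. The sketch below tallies such calls, assuming they are written as strings like "A/B" and assuming 68 A/A, 31 A/B and 4 B/B individuals (the assignment consistent with 167 copies of A and 39 of B). The function name summarise_calls and the call format are hypothetical.

```python
from collections import Counter

def summarise_calls(calls):
    """Tally genotypes and alleles from per-individual calls such as "A/B"."""
    genotype_tally = Counter()
    allele_tally = Counter()
    for call in calls:
        alleles = sorted(call.split("/"))   # so "B/A" and "A/B" are pooled
        genotype_tally["/".join(alleles)] += 1
        allele_tally.update(alleles)        # both alleles of a diploid call
    total_alleles = sum(allele_tally.values())
    allele_freqs = {a: n / total_alleles for a, n in allele_tally.items()}
    return genotype_tally, allele_tally, allele_freqs

# One call per individual, 103 individuals in total.
calls = ["A/A"] * 68 + ["A/B"] * 31 + ["B/B"] * 4
genotypes, alleles, freqs = summarise_calls(calls)
print(dict(genotypes))   # {'A/A': 68, 'A/B': 31, 'B/B': 4}
print(dict(alleles))     # {'A': 167, 'B': 39}
print({a: round(f, 2) for a, f in freqs.items()})
```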