L7: Emerging technology and genomic databases--genotyping

International HapMap Project

  • a catalog of common human genetic variants: SNPs or single-base-pair indels

    • occur in both coding and non-coding regions

    • distributed within and across populations

    • may contribute to disease or other phenotypes through small effects on

      • function of protein (if in coding region)

      • regulation of gene expression (if in non-coding region)

  • Hap stands for haplotype

    • a combination of alleles located close together on the same chromosome that tend to be inherited together

    • crossing over (recombination) may separate alleles → new allele combinations, more variation

    • linkage disequilibrium (LD)

      • the phenomenon that variants close together segregate together, so their genotypes are correlated

      • decays with genetic distance as the chance of recombination increases

      • recombination hotspot: little or no LD between alleles

      • haplotype block: region of high LD between alleles
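
The LD described above is usually summarised with D, D′ and r². A minimal sketch of how these could be computed for two biallelic SNPs from phased haplotypes follows. The function name `ld_stats` and the 0/1 allele coding are assumptions made for illustration, not something the notes specify.

```python
def ld_stats(haplotypes):
    """Compute D, |D'| and r^2 for two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per chromosome, where a and b
    are 0/1 codes for the allele carried at SNP 1 and SNP 2 (assumed coding).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # freq of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n            # freq of allele 1 at SNP 2
    p_ab = sum(1 for h in haplotypes if h == (1, 1)) / n  # freq of the 1-1 haplotype

    # D: observed haplotype frequency minus the frequency expected under independence
    d = p_ab - p_a * p_b

    # D' rescales D by its maximum possible magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two SNPs
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Alleles always travel together (as inside a haplotype block)
block = [(1, 1)] * 5 + [(0, 0)] * 5
# All four haplotypes equally frequent (as across a recombination hotspot)
hotspot = [(1, 1), (1, 0), (0, 1), (0, 0)]

print(ld_stats(block))
print(ld_stats(hotspot))
```

In this toy input the first pair of SNPs behaves like a haplotype block (r² of 1) and the second like SNPs separated by a hotspot (r² of 0).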

Types of genetic variation

  • by nature

    • single nucleotide variation

    • insertion, deletion

    • copy number variation

    • structural variation

  • by minor allele frequency (MAF) in the population

    • mutation or rare DNA variant: MAF < 1%

    • polymorphism: MAF ≥ 1%

    • common polymorphism: MAF ≥ 5%
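
The MAF categories above can be applied mechanically once genotypes are counted. The sketch below assumes genotypes are coded as the number of alternate alleles per individual (0, 1 or 2), and it treats a MAF of exactly 1% as a polymorphism; both choices, and the function names, are illustrative assumptions.

```python
def minor_allele_frequency(genotypes):
    """genotypes: list of alternate-allele counts per diploid individual (0, 1 or 2)."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    # The minor allele is whichever of the two alleles is less frequent
    return min(alt_freq, 1 - alt_freq)


def classify_by_maf(maf):
    """Map a MAF to the categories used in these notes (boundary handling assumed)."""
    if maf < 0.01:
        return "rare variant"
    if maf < 0.05:
        return "polymorphism"
    return "common polymorphism"


# 100 individuals, 20 heterozygotes: 20 alternate alleles out of 200 chromosomes
genotypes = [0] * 80 + [1] * 20
maf = minor_allele_frequency(genotypes)
print(maf, classify_by_maf(maf))
```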

What the HapMap Project studies

  • population-specific variants

  • allele frequencies

  • linkage disequilibrium pattern

  • haplotype information

  • tag SNPs

    • SNPs chosen to represent other SNPs in the same region that are in high LD with them

    • they let genetic variation be surveyed across populations efficiently, without genotyping every SNP
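
One simple way to picture tag SNP selection is a greedy pass over pairwise r² values: repeatedly pick the SNP that represents the most not-yet-covered SNPs. This is only an illustrative heuristic, not necessarily the procedure used by HapMap; the SNP names, the r² values and the 0.8 threshold below are all assumptions.

```python
def select_tag_snps(snps, r2, threshold=0.8):
    """Greedy tag SNP selection.

    snps: list of SNP identifiers.
    r2: dict mapping frozenset({snp_a, snp_b}) to their pairwise r^2;
        missing pairs are treated as r^2 = 0.
    """
    def tagged_by(snp, remaining):
        # A SNP tags itself and every remaining SNP in high LD with it
        return {
            other for other in remaining
            if other == snp or r2.get(frozenset((snp, other)), 0.0) >= threshold
        }

    remaining = set(snps)
    tags = []
    while remaining:
        # Iterate in sorted order so ties are broken deterministically
        best = max(sorted(remaining), key=lambda s: len(tagged_by(s, remaining)))
        tags.append(best)
        remaining -= tagged_by(best, remaining)
    return tags


# Hypothetical region: snp1, snp2 and snp3 sit in one LD block, snp4 stands alone
snps = ["snp1", "snp2", "snp3", "snp4"]
r2 = {
    frozenset(("snp1", "snp2")): 0.90,
    frozenset(("snp2", "snp3")): 0.85,
    frozenset(("snp1", "snp3")): 0.50,
}
print(select_tag_snps(snps, r2))
```

Here two genotyped SNPs stand in for four, which is the saving that tag SNPs are meant to provide.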

Importance of HapMap

  • LD patterns: reduce the time needed to genotype and analyse SNPs

  • population-specific LD map: select tag SNPs for genotyping, reducing the number of SNPs genotyped and tested

  • accelerated GWAS by enabling the design of high-density SNP arrays

Genome-wide association study (GWAS)

  • Aim

    • identify disease or trait associations indirectly, through genotyped markers in linkage disequilibrium with the causal variants

  • Linkage / association studies

    • when a SNP is close to the disease gene → it is likely to be transmitted with the disease: linked, or in linkage disequilibrium

    • markers can therefore be used, via indirect association, to identify mutations or variants that cause the disease or affect another trait

    • the “associated” SNPs are physically linked to, and transmitted with, the functional DNA variant that may lead to disease → they can help us locate that variant

  • compare the frequency of genetic markers between patients and controls

    • → markers significantly more frequent in patients than in controls = “associated with the disease”
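
The case-control comparison above can be sketched for a single SNP as a chi-square test on a 2×2 table of allele counts. This is a minimal illustration rather than a full GWAS pipeline: the counts are made up, the function name is hypothetical, all four cells are assumed non-zero, and a real study would also correct for testing many SNPs.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele-count table.

    Returns (chi2, p_value, odds_ratio). Assumes all four counts are non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            # Count expected if allele frequency were the same in both groups
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Survival function of the chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Made-up counts: the alternate allele is seen on 60 of 100 patient
# chromosomes but only 40 of 100 control chromosomes
chi2, p, odds = allelic_association(60, 40, 40, 60)
print(chi2, p, odds)
```

A small p-value with an odds ratio above 1 is what "more frequent in patients than in controls" looks like numerically. The associated marker may still only be tagging the causal variant through LD.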

GTEx Project

  • Genotype-Tissue Expression (GTEx) Project

  • Aim: To study how genetic variation influences gene expression across different human tissues.

  • Samples collected from deceased donors across multiple tissues.

  • Samples are analysed by genotyping and RNA sequencing, followed by analysis of gene expression patterns.

  • Findings: Revealed tissue-specific gene expression patterns and how genetic variants impact gene regulation.
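
One common way to quantify how a variant relates to expression in a given tissue is to regress expression level on allele dosage, an eQTL-style analysis. The notes do not state which method was used, so the sketch below is only an assumption-laden illustration: the data are invented, the function name is hypothetical, and real analyses adjust for covariates and test significance.

```python
def expression_vs_dosage(dosages, expression):
    """Least-squares fit of expression = intercept + slope * dosage.

    dosages: alternate-allele counts per donor (0, 1 or 2).
    expression: expression level of one gene in one tissue, per donor.
    Assumes at least two distinct dosage values are present.
    """
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx                    # change in expression per extra allele copy
    intercept = mean_y - slope * mean_x  # expected expression at dosage 0
    return slope, intercept


# Invented data for one gene measured in two tissues across six donors
dosages = [0, 0, 1, 1, 2, 2]
tissue_a = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]  # expression rises with each allele copy
tissue_b = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0]  # no relationship in this tissue

print(expression_vs_dosage(dosages, tissue_a))
print(expression_vs_dosage(dosages, tissue_b))
```

A non-zero slope in one tissue and a flat line in another is the kind of tissue-specific effect the findings above describe.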