a catalog of common human genetic variants: SNPs or single-base-pair indels
occur in both coding and non-coding regions
distributed within and across populations
may contribute to disease or other phenotypes through small effects on
function of protein (if in coding region)
regulation of gene expression (if in non-coding region)
Hap stands for haplotype
a combination of alleles located close together on the same chromosome that tend to be inherited together
crossover may separate alleles → more variation
linkage disequilibrium (LD)
the phenomenon in which variants that are physically close segregate together, so their genotypes are correlated
decays with genetic distance as the chance of recombination increases
recombination hotspot: little or no LD between alleles
haplotype block: region of high LD between alleles
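The strength of LD between two biallelic loci is commonly summarised by D, D' and r². A minimal sketch, assuming the haplotype and allele frequencies are already known (the function name and inputs are illustrative, not taken from these notes):

```python
def ld_stats(p_ab, p_a, p_b):
    """Return (D, D', r^2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first locus
    p_b:  frequency of allele B at the second locus
    """
    # D: observed haplotype frequency minus the frequency expected
    # if the two loci were independent
    d = p_ab - p_a * p_b

    # D' rescales |D| by its maximum possible value for these allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0

    # r^2: squared correlation between the alleles at the two loci
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


# Complete LD: allele A is always found together with allele B
complete = ld_stats(p_ab=0.5, p_a=0.5, p_b=0.5)
# No LD: the haplotype frequency equals the product of allele frequencies
independent = ld_stats(p_ab=0.25, p_a=0.5, p_b=0.5)
```

Within a haplotype block, pairs of SNPs would be expected to give r² close to 1; pairs separated by a recombination hotspot would be expected to give r² close to 0.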
by nature
single nucleotide variation
insertion, deletion
copy number variation
structural variation
by minor allele frequency (MAF) in the population
mutation or rare DNA variation: MAF<1%
polymorphism: MAF ≥1%
common polymorphism: MAF ≥5%
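The MAF thresholds above can be written as a small classifier. This is a sketch only: the function names are illustrative, a biallelic site is assumed, and a MAF of exactly 1% is assumed to count as a polymorphism.

```python
def minor_allele_frequency(allele_counts):
    """MAF for a biallelic site, given a dict of allele -> observed count."""
    if len(allele_counts) != 2:
        raise ValueError("this sketch assumes exactly two alleles")
    total = sum(allele_counts.values())
    return min(allele_counts.values()) / total


def classify_by_maf(maf):
    """Map a minor allele frequency to the categories used in these notes."""
    if not 0.0 <= maf <= 0.5:
        raise ValueError("a minor allele frequency lies between 0 and 0.5")
    if maf < 0.01:
        return "mutation or rare variant"
    if maf < 0.05:
        return "polymorphism"
    return "common polymorphism"


# Toy counts (made up for illustration): 90 copies of A, 10 copies of G
maf = minor_allele_frequency({"A": 90, "G": 10})
category = classify_by_maf(maf)  # 10 / 100 = 0.10, so a common polymorphism
```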
population-specific variants
allele frequencies
linkage disequilibrium pattern
haplotype information
tag SNPs
single nucleotide polymorphisms (SNPs) used to represent other SNPs in a region.
They help identify genetic variations across populations efficiently.
LD patterns: reduce the time needed to genotype and analyse SNPs
population-specific LD map: select tag SNPs for genotyping and reduce the number of SNPs genotyped and tested
accelerate GWAS by guiding the design of high-density SNP arrays
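Tag SNP selection from an LD map can be sketched as a greedy procedure: repeatedly pick the SNP that covers the most not-yet-covered SNPs at a chosen r² threshold. The threshold of 0.8, the SNP names and the r² values below are assumptions for illustration. This is one simple strategy, not necessarily the one used to design any particular array.

```python
def select_tag_snps(snps, r2, threshold=0.8):
    """Greedy tag SNP selection.

    snps: list of SNP identifiers
    r2:   dict mapping (snp_a, snp_b) -> pairwise r^2; missing pairs count as 0
    Returns a dict mapping each chosen tag SNP to the SNPs it represents.
    """
    def pair_r2(a, b):
        if a == b:
            return 1.0  # a SNP always tags itself
        return r2.get((a, b), r2.get((b, a), 0.0))

    untagged = list(snps)
    tags = {}
    while untagged:
        # Choose the untagged SNP in high LD with the most other untagged SNPs
        best = max(
            untagged,
            key=lambda s: sum(pair_r2(s, t) >= threshold for t in untagged),
        )
        covered = [t for t in untagged if pair_r2(best, t) >= threshold]
        tags[best] = covered
        untagged = [t for t in untagged if t not in covered]
    return tags


# Hypothetical LD map: snp1-3 form one block, snp4-5 another
toy_r2 = {
    ("snp1", "snp2"): 0.95,
    ("snp1", "snp3"): 0.90,
    ("snp2", "snp3"): 0.85,
    ("snp3", "snp4"): 0.10,
    ("snp4", "snp5"): 0.82,
}
tags = select_tag_snps(["snp1", "snp2", "snp3", "snp4", "snp5"], toy_r2)
# Two tag SNPs (snp1 and snp4) stand in for all five SNPs
```

In this toy case only two of the five SNPs need to be genotyped, which is the saving the LD map is meant to provide.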
Aim
Identifying associations through indirect tagging of causal variants in linkage disequilibrium.
Linkage / association studies
when an SNP is close to the disease gene → it is likely to be transmitted with the disease: linked, or in linkage disequilibrium
markers can be used to identify mutations or variants that cause the disease or affect other traits via indirect association
the “associated” SNPs are physically linked to, and transmitted with, the functional DNA variant that may lead to disease → they can help us locate that variant
compare the frequency of genetic markers between patients and controls
→ markers more frequent in patients than in controls = “associated with the disease”
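The patient-versus-control comparison can be illustrated with a 2×2 allele-count table, a chi-square test (1 degree of freedom) and an odds ratio. This is a minimal sketch with made-up counts; the function name is illustrative, all four cells are assumed to be non-zero, and a real GWAS would also correct for multiple testing and population structure.

```python
import math


def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test and odds ratio for one marker.

    Each argument is an allele count (two alleles per individual).
    Assumes every count is greater than zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_sums)

    # Pearson chi-square: sum of (observed - expected)^2 / expected
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected

    # Upper-tail probability of a chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))

    # Odds of carrying the alternate allele in patients relative to controls
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio


# Toy counts: 100 patients and 100 controls, i.e. 200 alleles per group
chi2, p_value, odds_ratio = allelic_association(60, 140, 40, 160)
```

With these toy counts the alternate allele is more frequent in patients (30%) than in controls (20%), giving an odds ratio above 1.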
Genotype-Tissue Expression (GTEx) Project
Aim: To study how genetic variation influences gene expression across different human tissues.
Samples collected from deceased donors across multiple tissues.
Donors are genotyped and tissue samples are RNA-sequenced; gene expression patterns are then analysed in relation to genotype.
Findings: Revealed tissue-specific gene expression patterns and how genetic variants impact gene regulation.
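One simple way to picture a tissue-specific effect of a variant on expression is to regress expression level on allele dosage (0, 1 or 2 copies of the alternate allele) separately in each tissue. The notes do not specify the analysis method, so the approach, the tissue names and all numbers below are assumptions for illustration only.

```python
def expression_slope(dosages, expression):
    """Least-squares slope and intercept of expression on allele dosage."""
    n = len(dosages)
    mean_x = sum(dosages) / n
    mean_y = sum(expression) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx  # change in expression per extra copy of the allele
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data: six donors, genotype dosage at one hypothetical variant
dosages = [0, 0, 1, 1, 2, 2]
# Expression of one hypothetical gene in two tissues (arbitrary units)
expression_by_tissue = {
    "tissue_A": [1.0, 1.2, 2.0, 2.2, 3.0, 3.2],
    "tissue_B": [2.0, 2.1, 2.0, 2.1, 2.0, 2.1],
}

slopes = {
    tissue: expression_slope(dosages, values)[0]
    for tissue, values in expression_by_tissue.items()
}
# A clear slope in tissue_A and essentially none in tissue_B would suggest
# that the variant affects expression in a tissue-specific way.
```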