Classes 15-17
DNA polymorphisms
sequence differences
anonymous: don't affect nature or amounts of any proteins (or ncRNA) in the body
most don't influence phenotype
can serve as a DNA marker
Single nucleotide polymorphisms (SNPs)
particular base positions in the genome where alternative letters of the DNA alphabet distinguish some people from others
most common type of genetic variant
inherited co-dominantly (bi-allelic)
two alleles for each SNP locus
can be heterozygous or homozygous
on average, one occurs every 1,000 base pairs (1 kb) in any pairwise comparison
most don't influence phenotype (they lie outside coding sequences)
Deletion-insertion polymorphisms (DIPs or InDels)
short insertions or deletions of genetic material
second most common form of genetic variation
occur roughly every 10 kb
range from 1 to hundreds of base pairs
Simple sequence repeats (SSRs or microsatellites)
loci at which sequences of one or more bases are repeated in tandem
different alleles have different numbers of repeat units
most common repeating units are one-, two-, or three-base sequences
meaning the repeat unit is one, two, or three nucleotides long (e.g., AAAA..., CACACA..., CAGCAGCAG...)
make up about 3% of total genomic DNA; found about once every 30 kb
in non-coding regions, have no effect
in coding regions, remember trinucleotide repeat diseases (slipped mispairing)?
highly polymorphic, often with over 10 alleles at a single locus
new alleles arise more often than at SNP loci, but rarely enough that alleles stay stable over a few generations, so SSRs can serve as DNA markers
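Since SSR alleles differ in the number of repeat units, an allele can be described by counting the tandem copies. A minimal Python sketch of that idea, assuming made-up sequences (the flanking DNA, the CA repeat unit, and the function name are all hypothetical, not from class):

```python
import re

def longest_tandem_run(sequence: str, unit: str) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    # (?:unit)+ matches one or more consecutive copies of the repeat unit
    runs = re.findall(f"(?:{re.escape(unit)})+", sequence)
    return max((len(run) // len(unit) for run in runs), default=0)

# Hypothetical alleles at one SSR locus: same flanking DNA, different CA counts
allele_1 = "GGTT" + "CA" * 5 + "TTGA"
allele_2 = "GGTT" + "CA" * 8 + "TTGA"

print(longest_tandem_run(allele_1, "CA"))  # 5 repeat units
print(longest_tandem_run(allele_2, "CA"))  # 8 repeat units
```

A person carrying both of these alleles would be heterozygous (5/8) at this locus.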
Copy number variants (CNVs)
DNA length polymorphisms involving far more nucleotides than SSRs and DIPs do
variable number of copies of large blocks of genetic material up to 1 Mb in length
highly polymorphic, but stable
99% of alleles are inherited (not derived from a new mutation)
Pairwise comparison
comparing two genomes side by side
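A pairwise comparison can be sketched in Python as lining up two sequences and listing the positions where they differ. This assumes the two sequences are already aligned and the same length; the sequences and function names are invented for illustration:

```python
def pairwise_differences(seq_a, seq_b):
    """Return the 0-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [i for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

def differences_per_kb(seq_a, seq_b):
    """Scale the number of differences to a rate per 1,000 bases."""
    return 1000 * len(pairwise_differences(seq_a, seq_b)) / len(seq_a)

# Hypothetical 20-base stretch from two people; they differ at one position
person_1 = "ACGTACGTACGTACGTACGT"
person_2 = "ACGTACGAACGTACGTACGT"

print(pairwise_differences(person_1, person_2))  # [7]
print(differences_per_kb(person_1, person_2))    # 50.0
```

The toy rate is far above the roughly 1 per kb quoted for SNPs because the example is only 20 bases long.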
DNA fingerprint/profile
genotype of 13 unlinked, polymorphic SSR loci
unique to any one person (except identical twins)
any one person only has two alleles for any given locus
main point: it's statistically very unlikely that two people have exactly the same allelic combination at multiple loci by chance (they would have to be related somehow)
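The "highly unlikely" claim can be made concrete with the product rule: because the loci are unlinked, the genotype frequencies at each locus multiply. The Python sketch below assumes Hardy-Weinberg genotype frequencies (p*p for a homozygote, 2*p*q for a heterozygote), which is an added assumption, and uses invented allele frequencies of 0.1:

```python
from math import prod

def genotype_frequency(freq_1, freq_2, heterozygous):
    """Expected genotype frequency at one locus, assuming Hardy-Weinberg proportions."""
    return 2 * freq_1 * freq_2 if heterozygous else freq_1 * freq_1

def random_match_probability(profile):
    """Multiply per-locus genotype frequencies (valid only for unlinked loci)."""
    return prod(genotype_frequency(f1, f2, het) for f1, f2, het in profile)

# Hypothetical profile: heterozygous at all 13 loci, every allele at frequency 0.1
profile = [(0.1, 0.1, True)] * 13

print(random_match_probability(profile))  # on the order of 1e-22
```

Even with each single-locus genotype carried by 2% of people, the chance of matching at all 13 loci by coincidence is vanishingly small.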
Polymerase chain reaction (PCR)
amplifies a target region of DNA
requires only the smallest amounts of DNA
uses two oligonucleotides, each 16-30 bases long, as primers
primers define the beginning and end of the target region
one oligonucleotide is complementary to one strand at one end while the other is complementary to the opposite strand at the other end
for DNA fingerprinting, the primers for the 13 SSR loci are labeled with fluorescent dyes of different colors
products are separated by gel electrophoresis
can identify allelic variants for each locus based on the colors and sizes of the products
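How the two primers bracket the target can be sketched as an "in silico PCR" in Python: find where the forward primer matches the top strand, find where the reverse primer pairs with it further along, and report the product between them. The template and primers are made up, and the primers are only 7 bases long to keep the example readable (real ones are 16-30):

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def pcr_product(template, forward_primer, reverse_primer):
    """Return the amplified top-strand segment, or None if a primer has no site."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the top strand where its reverse complement appears
    site = reverse_complement(reverse_primer)
    end = template.find(site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(site)]

# Hypothetical SSR locus: flanks + primer sites around a (CA)4 repeat
template = "TTTT" + "ACGGTCA" + "CACACACA" + "GGATTCG" + "AAAA"
product = pcr_product(template, "ACGGTCA", "CGAATCC")

print(product, len(product))  # product length is 22 bases
```

An allele with more CA repeats would give a longer product, which is what the size separation on the gel detects.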
Haplotype blocks
segments of DNA with particular sets of linked SNP alleles that tend to travel together from one generation to another, because they are flanked by recombination hotspots
DNA within blocks contains NO hotspots for crossing over
Genetic genealogy
the basis of this kind of genetic analysis is that relatives share haplotype blocks
more closely related = more haplotype blocks shared and longer uninterrupted shared DNA segments
Genetic relatedness
estimated by the fraction of autosomal DNA shared
each parent has two alleles of each SNP; each child inherits a random one of those two alleles (from each parent)
means that the child will share half of their DNA with each parent, and with each sibling (on average)
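The "half on average" figure for siblings can be checked with a small Python simulation. It is a simplification: every locus is treated as inherited independently (ignoring haplotype blocks), and the function names and locus count are invented. A child always carries exactly one of each parent's two alleles at every locus, so parent-child sharing is exactly one half; sibling sharing only averages one half:

```python
import random

def make_child(rng, n_loci):
    """At each locus, record which maternal allele (0 or 1) and which
    paternal allele (0 or 1) the child happened to inherit."""
    return [(rng.randrange(2), rng.randrange(2)) for _ in range(n_loci)]

def fraction_shared(sib_a, sib_b):
    """Fraction of alleles two siblings inherited in common from their parents."""
    shared = sum((ma == mb) + (pa == pb)
                 for (ma, pa), (mb, pb) in zip(sib_a, sib_b))
    return shared / (2 * len(sib_a))

rng = random.Random(0)  # fixed seed so the run is repeatable
sib_1 = make_child(rng, 10_000)
sib_2 = make_child(rng, 10_000)

print(fraction_shared(sib_1, sib_2))  # close to 0.5
```

With fewer loci (or real, linked chromosomes) the sibling value wanders further from 0.5, which is why it is only an average.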
Nucleic acid hybridization
the ability of complementary single strands of DNA or RNA to come together to form double-stranded molecules
most stable when every nucleotide in the primer or probe matches the template
if there's a mismatch, the duplex is less stable (so in experiments, researchers can weed out the imperfect hybrids by taking advantage of the fact that only perfect matches can withstand a particular temperature)
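The perfect-match versus mismatch idea can be sketched in Python by pairing a probe with its target antiparallel and counting the positions that fail to base-pair. This is only a toy model (real duplex stability depends on temperature, salt, and sequence), and the sequences are invented:

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def count_mismatches(probe, target):
    """Count non-pairing positions; both strands are written 5'->3'."""
    if len(probe) != len(target):
        raise ValueError("strands must be the same length")
    # Read the target backwards so each probe base faces its pairing partner
    return sum(COMPLEMENT[p] != t for p, t in zip(probe, reversed(target)))

target = "ATGCCGTT"
perfect_probe = "AACGGCAT"     # exact reverse complement of the target
mismatched_probe = "AACGACAT"  # one base changed

print(count_mismatches(perfect_probe, target))     # 0
print(count_mismatches(mismatched_probe, target))  # 1
```

In the toy model, only the probe with 0 mismatches would count as surviving the stringent temperature.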
Anonymous loci
polymorphisms that don't affect phenotype
serve as molecular markers for specific regions of the genome
Allele-specific oligonucleotides (ASOs)
short 20-40 base oligonucleotides that hybridize under the right conditions to only one of the two alleles at a SNP locus
attached to a solid support, like a silicon chip
turn genomic DNA into a probe by fragmenting it and denaturing it into single strands
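Genotyping with ASOs can be sketched in Python as asking which allele's oligonucleotide finds a perfectly complementary stretch in the sample's single-stranded fragments. Everything here is hypothetical: the SNP (A or G), its surrounding sequence, and the 11-base ASOs (real ones are 20-40 bases). Only perfect matches are counted as hybridizing:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def call_genotype(fragments, aso_by_allele):
    """Return the alleles whose ASO perfectly pairs with some fragment."""
    return sorted(
        allele for allele, aso in aso_by_allele.items()
        if any(reverse_complement(aso) in fragment for fragment in fragments)
    )

# Hypothetical SNP site: ...CCTGA[A/G]TCGGA...
site_a = "CCTGAATCGGA"
site_g = "CCTGAGTCGGA"
asos = {"A": reverse_complement(site_a), "G": reverse_complement(site_g)}

heterozygote = ["TT" + site_a + "GG", "TT" + site_g + "GG"]
homozygote = ["TT" + site_a + "GG", "TT" + site_a + "GG"]

print(call_genotype(heterozygote, asos))  # ['A', 'G']
print(call_genotype(homozygote, asos))    # ['A']
```

On a real chip the readout would be a signal at each ASO's position rather than a string search.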
DNA microarray
provides information about degrees of relatedness by tracking millions of polymorphisms
Positional cloning
strategy to identify defects causing hereditary diseases
get information about the location of a disease gene by finding the polymorphic loci (known) that the mutation (unknown) is genetically linked with
maps genes more precisely (as compared to gene mapping)
limitation = phase problem
Uninformative cross (phase problem)
don't know the allele configuration
cannot tell which allele a child got from which parent
can't perform linkage analysis
Informative cross
CAN tell which allele the child got from which parent
at least one parent is doubly heterozygous
Allelic heterogeneity
genetic diseases caused by a variety of different mutations in the SAME gene
ex: Cystic Fibrosis
Compound/trans-heterozygotes
one copy of the chromosome has a different mutation than the other copy, BUT the disease still occurs because the two genome copies fail to complement
Locus heterogeneity
genetic diseases caused by mutations in one of two or more DIFFERENT genes
ex: deafness
Complex/quantitative traits
many different genes influence the trait to different extents
no single gene determines the trait
High-throughput/massively parallel sequencing
allows for millions of individual DNA molecules to be sequenced simultaneously
Different from Sanger in that...
DNA molecules are anchored in place when synthesized by DNA polymerase
timing of base addition is controlled to see what base is added
3'-OH group is blocked (protected); the block can be removed when the next dNTP needs to be added (does not stop synthesis permanently)
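The three differences above can be pictured with a toy Python simulation: many anchored templates are extended in parallel, exactly one blocked nucleotide is added per cycle, the base is recorded, and then the block is removed so the next cycle can proceed. The positions, templates, and function name are invented, and each template is written in the order the polymerase reads it:

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def run_cycles(anchored_templates, n_cycles):
    """Read one base per cycle from every anchored template at the same time."""
    reads = {position: "" for position in anchored_templates}
    for cycle in range(n_cycles):
        for position, template in anchored_templates.items():
            if cycle < len(template):
                # One blocked nucleotide is incorporated and identified,
                # then unblocked before the next cycle begins
                reads[position] += COMPLEMENT[template[cycle]]
    return reads

# Hypothetical molecules anchored at two spots on the surface
templates = {(0, 0): "TACG", (0, 1): "GGCA"}

print(run_cycles(templates, 3))  # {(0, 0): 'ATG', (0, 1): 'CCG'}
```

A real run tracks millions of positions at once; two are enough to show that every read grows by one base per cycle.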
Whole-genome sequencing (in general)
goal: to directly find the DNA alteration that is the disease allele (as opposed to looking for a marker, then sequencing candidate genes)
assumptions:
disease alleles are rare in the population
pedigree insight/knowledge of inheritance pattern
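The two assumptions can be read as filters on the long list of variants that sequencing turns up. A minimal Python sketch, assuming a recessive disease, a 1% population-frequency cutoff, and an invented variant table (none of these specifics come from class):

```python
def candidate_disease_variants(variants, max_population_freq=0.01):
    """Keep variants that are rare AND fit a recessive pattern in the family."""
    candidates = []
    for variant in variants:
        rare = variant["population_freq"] <= max_population_freq
        # Recessive pattern: affected relatives carry 2 copies, unaffected carry 0 or 1
        affected_fit = all(n == 2 for n in variant["affected_copies"])
        unaffected_fit = all(n < 2 for n in variant["unaffected_copies"])
        if rare and affected_fit and unaffected_fit:
            candidates.append(variant["name"])
    return candidates

# Hypothetical variants found in one family
variants = [
    {"name": "var1", "population_freq": 0.30,
     "affected_copies": [2, 2], "unaffected_copies": [1, 1]},   # too common
    {"name": "var2", "population_freq": 0.001,
     "affected_copies": [2, 1], "unaffected_copies": [1, 0]},   # pattern does not fit
    {"name": "var3", "population_freq": 0.0005,
     "affected_copies": [2, 2], "unaffected_copies": [1, 1]},   # rare and fits
]

print(candidate_disease_variants(variants))  # ['var3']
```

A dominant disease would need a different pattern check (affected relatives carrying at least one copy, unaffected carrying none).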