Lecture 1: Overview and History of DNA Typing
OVERVIEW AND HISTORY OF DNA TYPING
Brief History of DNA Typing:
1985: Alec Jeffreys, a genetic researcher at the University of Leicester, developed DNA profiling along with Peter Gill and Dave Werrett of the Forensic Science Service (FSS)
Developed a method to retrieve DNA from dried blood stains and a preferential extraction method to separate sperm from vaginal cells
Found certain regions of DNA contain repetitive sequences that are variable in number between individuals: DNA fingerprinting or DNA typing
Variable Number of Tandem Repeats (VNTRs):
Minisatellites (8-100bp repeats) often found at the ends of chromosomes
Microsatellites (2-7bp repeats) dispersed throughout the genome
The Pitchfork Murder Case:
A kitchen porter was arrested and confessed to murder
Jeffreys conducted DNA identification on evidence, and the kitchen porter did not “match” (exonerated from crime)
Investigation of ~5k local males eventually found only one “match”: Colin Pitchfork
His DNA profile matched with the semen from both murders
Convicted and sentenced to life in prison in 1988
Basic Principles of DNA Testing:
The DNA profile of every person is unique
The genome is the same in every cell
The DNA profile remains the same throughout life
Your DNA is inherited ½ from mother and ½ from father
DNA typing must be performed efficiently and reproducibly (must hold up in court)
Current standard DNA tests do not look at genes (little to no information about geographical ancestry, predisposition to disease, or phenotype is obtained)
We probe subsets of genetic variation called short tandem repeats in order to differentiate between individuals
Applications for DNA Testing:
Involves generation of DNA profiles usually with the same core STR markers and then matching to reference sample
Crime solving - matching suspect with evidence
Paternity testing - who is the father
Missing persons investigations - whose remains
Immigration testing - are two people related
Disaster victims - after an airplane crash
Soldiers in war - who is the unknown soldier
Developing convicted felons’ databases - cases solved
DNA Testing as a Reference:
A DNA profile by itself is fairly useless because it has no context
DNA analysis for identity only works by comparison — you need a reference sample:
Forensic case: Crime scene evidence compared to suspects
Paternity case: Child compared to alleged father
Mass disaster ID: Victim’s remains compared to biological evidence
Armed forces ID: Soldier’s remains compared to direct reference sample
Three Possible Outcomes of Evidence Examination:
Exclusion (no match): The genotype comparison shows profile differences that can only be explained by the two samples originating from different sources
Non-exclusion (fail to exclude): Statistical evaluation of the significance of the match is usually reported with the match report
Inconclusive result: This finding might be reported if two analysts remain in disagreement after review and discussion of the data and it is felt that insufficient information exists to support any conclusion
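A deliberately simplified Python sketch of the comparison logic described above. It assumes complete, single-source profiles stored as dictionaries that map a locus name to a pair of alleles. The function name `compare_profiles`, the example genotypes, and the rule that "no shared loci" counts as inconclusive are illustrative assumptions; real casework interpretation also weighs allele drop-out, mixtures, and match statistics, none of which are modelled here.

```python
def compare_profiles(evidence, reference):
    """Compare two DNA profiles locus by locus.

    Each profile is a dict such as {"TH01": (6, 9.3)}; allele order is ignored.
    """
    shared_loci = [locus for locus in evidence if locus in reference]
    if not shared_loci:
        # Assumption for this sketch: nothing to compare means no conclusion.
        return "inconclusive"
    for locus in shared_loci:
        if sorted(evidence[locus]) != sorted(reference[locus]):
            # A single unexplained difference is treated as an exclusion.
            return "exclusion"
    return "non-exclusion"


# Hypothetical genotypes, for illustration only.
evidence = {"TH01": (6, 9.3), "D16S539": (11, 12)}
suspect_a = {"TH01": (9.3, 6), "D16S539": (11, 12)}
suspect_b = {"TH01": (7, 9.3), "D16S539": (11, 12)}

print(compare_profiles(evidence, suspect_a))
print(compare_profiles(evidence, suspect_b))
```

A non-exclusion from a function like this would still need the statistical evaluation mentioned above before it is reported.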
BASICS OF DNA BIOLOGY AND GENETICS
Nuclear DNA in the Cell and Jargon:
A specific region of DNA is called a locus
Alternative forms of a locus are called alleles
If the two alleles at a locus are identical, the genotype is called homozygous; if they differ, heterozygous
A genotype is the characterization of alleles at a locus
A DNA profile is the genotypes obtained at multiple loci
DNA Structure and Composition:
DNA molecule includes sugar backbone, phosphate groups and four nucleotide bases
Two DNA strands form H-bonds to make a double helix
Phosphodiester bonds form between adjacent nucleotides
Formed between the 3’-OH group of one nucleotide and the phosphate group of another
The DNA double helix denatures at elevated temperatures (above 95 degrees Celsius) or with chemical treatments
Restriction enzymes:
Recognize a specific sequence of DNA (4-8bp)
Produce a double-stranded cut (sticky or blunt)
Direct detection of DNA copy number variation: cut DNA into pieces and analyze fragments by gel electrophoresis
Can also detect sequence variation by the presence/absence of a restriction cut
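The fragment-length idea above can be sketched in a few lines of Python. This is a toy model under stated assumptions: a linear DNA molecule, a single enzyme, and a cut position given as an offset into the recognition site. The function name `digest`, the made-up 10 bp repeat unit, and the choice of the EcoRI site GAATTC (cut after the G) are illustrative and not taken from the lecture.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the cut position counted from the start of the
    recognition site (1 for EcoRI, which cuts G^AATTC).
    """
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)

    fragments = []
    previous = 0
    for position in cut_positions:
        fragments.append(position - previous)
        previous = position
    fragments.append(len(sequence) - previous)
    return fragments


# Two hypothetical alleles: identical flanking sites, different repeat counts.
repeat_unit = "AGGTCCTGAC"
allele_short = "TT" + "GAATTC" + repeat_unit * 3 + "GAATTC" + "TT"
allele_long = "TT" + "GAATTC" + repeat_unit * 7 + "GAATTC" + "TT"

print(digest(allele_short))
print(digest(allele_long))
```

The middle fragment differs by 40 bp between the two alleles (four extra 10 bp repeats), which is the kind of length difference gel electrophoresis resolves.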
Human Chromosome Nomenclature:
Variation in chromosome size and G-banding results in a karyotype
TH01:
Tyrosine Hydroxylase gene, intron 01
D16S539:
D: DNA
16: Chromosome 16
S: Single copy sequence
539: 539th locus described on chromosome 16
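As a small illustration of the naming scheme above, the following Python sketch splits a D-number into its parts. The function name `parse_d_number` is hypothetical, and the sketch only handles the single-copy ("S") form described in these notes; accepting X and Y as chromosome labels is an added assumption.

```python
import re


def parse_d_number(name):
    """Split a D-number such as 'D16S539' into chromosome and locus number."""
    match = re.fullmatch(r"D(\d+|X|Y)S(\d+)", name)
    if match is None:
        raise ValueError(f"not a single-copy D-number: {name!r}")
    chromosome, locus_number = match.groups()
    return {
        "chromosome": chromosome,
        "sequence_type": "single copy",
        "locus_number": int(locus_number),
    }


print(parse_d_number("D16S539"))
```

Gene-based names such as TH01 do not follow this pattern and are rejected by the sketch.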
Characteristics of DNA for Forensic Applications:
Each person’s DNA is the same in every cell
An individual’s DNA profile remains the same throughout life
Half of an individual’s DNA is inherited from the mother and half from the father
Each person has a unique DNA profile
Population variation:
~99.7% of the 6.4 billion bp are the same between people
The remaining ~0.3% is still ~10 million differences per haploid genome copy (~3.2 billion bp) and makes us unique
Types of DNA Polymorphisms:
Sequence polymorphism
Length polymorphism
DNA markers are detectable variants of DNA which are useful when polymorphic
Genetic Variability:
In DNA typing, multiple markers are examined to create a DNA profile
The higher the number of markers examined, the greater the chance that two unrelated individuals will have different DNA profiles
If each locus is inherited independently from one another, then the DNA profile frequency can be calculated using the product rule by multiplying each genotype frequency together
More alleles = a higher power to distinguish between individuals
Full CODIS DNA profiles can sometimes be obtained from as little as 50 pg of DNA
Assuming a DNA extraction is 100% efficient from the biological sample collected:
There is ~6.6 pg of DNA in a diploid cell
50pg DNA/(6.6pg DNA/diploid cell) = ~ 8 diploid cells
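The two calculations in this section (the product rule and the cell-count estimate) can be written as a short Python sketch. The genotype frequencies below are invented for illustration, and the function names are hypothetical; the only values taken from the notes are 6.6 pg per diploid cell and the 50 pg input.

```python
from math import prod

PG_PER_DIPLOID_CELL = 6.6  # value quoted in the notes


def profile_frequency(genotype_frequencies):
    """Product rule: multiply per-locus genotype frequencies.

    Only valid if the loci are inherited independently.
    """
    return prod(genotype_frequencies)


def diploid_cells_needed(dna_pg, pg_per_cell=PG_PER_DIPLOID_CELL):
    """Number of diploid cells that contain dna_pg of DNA (100% extraction efficiency)."""
    return dna_pg / pg_per_cell


# Hypothetical genotype frequencies at four independent loci.
example_frequencies = [0.10, 0.05, 0.08, 0.12]
frequency = profile_frequency(example_frequencies)

print(f"Profile frequency: {frequency:.2e} (about 1 in {1 / frequency:,.0f})")
print(f"Cells needed for 50 pg: {diploid_cells_needed(50):.1f}")
```

Adding more loci multiplies in more factors below 1, which is why the profile frequency drops so quickly as markers are added.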
HISTORICAL METHODS IN DNA TYPING
Types of Technology and Markers:
Non-DNA based:
Blood group testing
Forensic protein profiling
DNA-based:
RFLP (length) multi-locus markers and single-locus markers
PCR based (sequence): Reverse dot blot
PCR based (length): AFLPs, silver-stained STRs, fluorescently detected STRs
Mitochondrial DNA sequencing
Blood Group Typing:
Advantages:
Rapid, simple tests
Only test available for many years
Limitation:
Poor power of discrimination (~1 in 10) with so few alleles
Basics of RFLP-Based DNA Testing:
Detect:
A variable number of tandem repeats (VNTRs)
Repeat arrays of ~0.5-20 kb in total length
~30,000 in human genome
Process:
Cut DNA with restriction enzymes
Separate fragments differing in length by gel electrophoresis
Detect length-based differences (polymorphisms) in DNA fragments of interest with a radioactive probe
Strip membrane and re-probe as necessary
Multi-probe results are more complicated to interpret:
The multi-locus “DNA fingerprints” developed by Jeffreys were unique to the individual (high power of discrimination between samples)
Mixtures from individuals in a single sample were impossible to interpret
Single-Locus Probe RFLP:
Repeated probing of the same membrane to yield a series of autoradiographs
Advantages:
Excellent powers of discrimination (~1 in millions or greater with four loci)
Large number of alleles at each locus which facilitates mixed-sample analysis
Limitations:
Limited sensitivity (>50 ng to 500 ng required)
Time-consuming process that cannot be automated (days to weeks)
Not suitable with degraded DNA samples due to high molecular weight needed
The need for binning introduces complications and sometimes difficulties of interpretation
Limited number of validated loci (4 to 6 loci commonly used) which meant that these VNTRs were of limited value in distinguishing between siblings
Invention of PCR:
1983: Kary Mullis invented PCR (awarded the Nobel Prize in Chemistry in 1993)
1985: First publication of PCR by Cetus Corporation appears in Science
Amplify specific DNA loci from small amounts of starting material
Advantages:
Very small amounts of DNA template may be used, even the DNA from a single cell
DNA degraded to fragments only a few hundred base pairs in length can serve as effective templates for amplification
Large numbers of copies of specific DNA sequences can be amplified simultaneously with multiplex PCR
Contaminant DNA, such as from fungal and bacterial sources, will not amplify because human-specific primers are used
Commercial kits are now available for easy PCR reaction setup and amplification
Disadvantages:
The target DNA template may not amplify due to the presence of PCR inhibitors in the extracted DNA
Amplification may fail due to sequence mutations in the primer-binding region, AKA a “null allele”
Contamination from other human DNA sources besides the forensic evidence at hand or previously amplified DNA samples is possible
Amplified Fragment Length Polymorphisms (D1S80):
AFLP/AMP-FLP = amplified fragment length polymorphism
PCR products (400-800 bp) separated on a polyacrylamide gel and detected with silver staining
Advantages:
Improved sensitivity compared to RFLP because it uses PCR
Many alleles which facilitates mixed-sample analysis
Discrete allele calling possible using allelic ladder, which also simplifies statistical interpretation
Limitations:
Large allele range making it difficult to multiplex with other loci and giving rise to preferential amplification of smaller alleles
Poor power of discrimination as a single locus (~1 in 50)
Allele dropout seen with highly degraded DNA
Gel separation and silver-stain detection not amenable to automation or high-throughput sample processing
DQA1 Reverse Dot Blot Tests:
First method used allele specific probes to find sequence polymorphisms in a dot-blot format
Most common locus was HLA-DQA1 (integral membrane protein associated with immune response)
Most commonly, 8 alleles identified (1.1, 1.2, 1.3, 2, 3, 4.1, 4.2, 4.3)
Commercial kit could not distinguish between 4.2 and 4.3 alleles
The “C” control dot tests whether a sample is above the stochastic threshold
The “S” dot is another control sample in the test
Advantages:
Fast, simple method (compared to RFLP)
Capable of analyzing small or degraded samples because it uses PCR
No instrumentation needed after PCR
Limitations:
Poor power of discrimination (~1 in 1000) with six loci developed
Mixture interpretation difficult with limited number of alleles per locus
Short Tandem Repeat Markers:
The efforts of the Human Genome project have increased knowledge regarding the human genome, and hence there are many more STR loci available now than there were 25 years ago when the 13 CODIS core loci were originally selected
More than 20,000 tetranucleotide STR loci have been characterized in the human genome
STR sequences account for approximately 3% of the total human genome
The FBI has selected 13 (now 20) core STR loci that must be run in all DNA tests in order to provide a common currency with DNA profiles
Cause of STRs:
A commonly observed replication error is “replication slippage”, which occurs at repetitive sequences when the new strand mis-pairs with the template strand (or vice versa)
The microsatellite polymorphism is mainly caused by this mechanism
If the mutation occurs in a coding region, it could produce abnormal proteins, leading to diseases
Silver-Stained STRs:
Double-bands for each allele due to separation and detection of forward and reverse strands from each PCR product
Advantages:
Sensitive due to PCR
Relatively rapid process (a day or two)
Works well with degraded DNA samples since shorter fragments can be analyzed
A lower start-up cost compared to fluorescent STRs
Limitations:
Because only a single “colour” channel is available, multiplex amplification and detection is limited to 3-4 loci
Both strands of DNA are detected, leading to double bands with some loci that can complicate interpretation
Fluorescent STRs:
PCR primers anneal to unique sequences bracketing the variable STR repeat region
The size of the PCR product generated depends on the number of repeats in the allele
Advantages:
Sensitive due to PCR
Relatively rapid process (a few hours to a day or two)
Works fairly well with degraded samples since shorter fragments of DNA can be analyzed
Multiplex PCR amplification and multi-colour fluorophore labeling and detection enables examining 15+ loci simultaneously. This provides powers of discrimination ~ 1 in billions or greater
Standardized sets of core loci are widely used with availability of commercial STR kits
Automated detection enables high-throughput sample processing
The potential number of loci is very large, which is important if siblings or other relatives are involved
Limitations:
Less discrimination power per locus compared to VNTRs due to a smaller number of alleles and less heterozygosity per locus
The possibility of contamination from stray DNA is increased because of the PCR amplification process
Expensive equipment required for detection
Stutter products and unbalanced peak heights may occur and make the interpretation of mixtures more difficult
Data interpretation must account for artifacts such as dye blobs, electrophoretic spikes, etc.
Lecture 2: Sample Collection and Storage, DNA Extraction
SAMPLE COLLECTION AND PRESERVATION
Evidence Collection and Preservation:
The O.J. Simpson case illustrates the need for careful collection, documentation, continuity of evidence, and validation of techniques
Defense attacked the way evidence was collected and preserved
Pamphlet: “What Every Law Enforcement Officer Should Know About DNA Evidence”
Prioritize sample collection
Collect evidence samples as well as elimination and known (reference) samples
Protocols are in place to guard against errors still made in the field and in processing
Best results with >100 cells, but DNA profiles can be recovered from as little as a single cell
Sources of Biological Evidence:
Blood
Semen
Saliva
Urine
Hair
Teeth
Bone
Tissue
The Grim Sleeper (1985-88, 2002-2007, California):
11 women murdered in California over a span of 15 years; DNA profiles generated from evidence linked to one individual
Lonnie David Franklin Jr. was identified as a suspect using familial DNA testing
The DNA profile from the victims' evidence was similar, but not identical, to another profile (Christopher Franklin, his son) in the California DNA database
Law enforcement collected a discarded pizza slice (among other items) from Franklin, and DNA testing confirmed that his profile matched the profile generated from crime scene evidence
Evidence Collection and Preservation Basics:
Avoid contamination from collector (sneezing, coughing, not using clean gloves for each piece of evidence collected)
Separately package evidence
Air dry “wet” samples prior to packaging in paper. DO NOT use plastic (to avoid bacterial degradation of DNA or growth of mold)
All samples must be carefully labeled and sealed
Stains on unmovable surfaces: Swab with distilled water, air dry, place in envelope and take reference swab as a negative control
Unknown DNA Sample from Evidence:
Cotton swabs commonly used to collect biological material from bloodstains or semen from sexual assault victims
Cellulase can break down cotton swab fibers and release sperm cells that stick to cotton swab
Cellular material can be collected from clothing using adhesive tape which can be placed directly into a DNA extraction tube
The amount of DNA needed has decreased dramatically in the past decade due to sensitivity of the PCR process
Reference DNA Sample from Suspect:
To perform the Q-K comparative DNA test, a reference sample must be taken
Blood samples may be collected (but not rapid or painless)
Easier to collect a buccal swab from the inside of an individual’s mouth, which scrapes off some cheek cells (less invasive than drawing blood)
Swab must be dried before storing and shipping to lab to avoid mold and bacterial growth
Storage of DNA Samples:
DNA can be stored as non-extracted tissue, or fully extracted DNA
DNA molecules are best stored dry (to prevent base hydrolysis) and cold (to protect from DNA-digesting enzymes, DNases)
Extracted DNA molecules can be stored:
Short-term in a fridge (4 degrees Celsius) or freezer (-20 degrees Celsius)
Long-term in an ultralow freezer (-80 degrees Celsius)
Sample Characterization:
Increasingly, cases come to court in which the presence of cellular material of a person is not disputed but the activity that caused the deposition is
The debate centers on how their DNA got there
Presumptive tests are performed to indicate whether biological fluids are present on an item of evidence
Presumptive testing enables a sampling area to be selected from a piece of evidence
Presumptive tests should be:
Simple
Inexpensive
Safe
Only use a small amount of material
Non-destructive
Presumptive tests to identify sample source:
Blood stains: Serological methods (detect hemoglobin) or luminol (emits light when oxidized in a reaction catalyzed by the iron in hemoglobin)
Semen stains: Serological methods (detect acid phosphatase) or direct observation of sperm (microscope)
Saliva stains: Serological methods (detect amylase)
RNA TESTING OF FLUID SAMPLES
Body Fluid Identification with RNA Testing:
Each cell type in the human body has a unique pattern of gene expression that is manifested by the presence and relative abundance of specific mRNA species, the molecular intermediate between DNA and expressed protein
Some mRNA are selectively expressed in cells that collectively comprise a particular body fluid
Modified extraction protocols could simultaneously isolate RNA and DNA from the same sample for different tests
Advantages of RNA testing over conventional methods:
High sensitivity due to the possibility of PCR amplification
High specificity due to the pattern of gene expression unique to each tissue
Simultaneous DNA isolation without loss of material, if necessary
mRNA markers have been identified for the most forensically relevant body fluids based on functional differences between the cells and tissues involved
“Housekeeping” transcripts can be used as a positive RNA control
Amplification of mRNA is detected via:
Real time PCR (probe based)
Capillary Electrophoresis (fluorescent-labelled amplicon) (primer based)
Co-Analysis System for Personal Identification:
Arrangement of loci of the co-analysis system (PP16 STR markers in blue, green, and yellow dye channels)
mRNA markers in the red dye channel
Results for STR and mRNA co-analysis:
STR profiling results (PP16, top three rows) for identity
Sample characterization indicates fluid type
DNA EXTRACTION METHODS
DNA Extraction:
DNA is negatively charged
DNA is tightly complexed with positively charged histone proteins
Other molecules co-purify with DNA depending on the isolation method
The general steps in any DNA extraction are:
Lyse cells and release DNA
Separate DNA from cellular material
Isolate DNA so that it can be used for downstream STR typing
Store DNA at -20 Celsius or -80 Celsius to prevent nuclease activity
The goal is to maximize yield while minimizing contaminants and inhibitors
The extraction process is where DNA is most susceptible to cross contamination in the lab
Common PCR inhibitors:
Iron from RBCs
Ethanol
Minerals from bone
Jean and textile dyes
Melanin from hair
Proteins
Nucleases degrade DNA; they require magnesium and have optimal activity at ~37 degrees Celsius
Hydrolytic cleavage of DNA is increased by high temperatures and high humidity. This can add “nicks” in the DNA template, which interfere with primer annealing or extension by Taq polymerase
DNA Extraction Methods:
Organic (phenol-chloroform)
Solid-phase extraction methods:
Qiagen (silica bind/wash/release with vacuum filtration or centrifugation)
DNA IQ and PrepFiler (silica bind/wash/release with magnetic bead capture)
Chelex
FTA paper
Differential extraction:
Separation of non-sperm and sperm fractions based on absence or presence of DTT that breaks open the sperm cell coating
Lysing Cells and Releasing DNA:
Dissolve tissue in buffer to break cells and nuclear membranes, disrupt DNA-protein complexes, bind multivalent ions
Urea disrupts hydrogen and hydrophobic bonds holding the cell membrane together
SDS is an ionic detergent that binds proteins and unfolds them to disrupt DNA-protein complexes
EDTA is a chelating agent that binds multivalent ions which are required for nuclease digestion of DNA
Add Proteinase-K to digest proteins by breaking the peptide bonds to produce smaller peptides
Proteinase-K activity is stimulated in the presence of detergents (SDS)
Enzymatic activity has an optimum temperature of ~65 degrees celsius for tertiary and quaternary protein structures and ~37 degrees celsius for primary and secondary structures
At these high temperatures, phosphodiester bonds in DNA can be broken and it should not be exposed to 65 degrees celsius for extended periods
Separating DNA from cellular material (digested proteins)
Organic solvents remove proteins from nucleic acids
Solid support to bind DNA
Magnetic beads to bind DNA
Traditionally, organic solvents removed proteins by denaturing and precipitating them, while DNA remains soluble in the aqueous phase
DNA extracted via organic solvents is generally high molecular weight and double stranded, making it suitable for RFLP analysis or PCR
Other approaches take advantage of the ability of silica-oxide coated materials to bind DNA
Qiagen Extractions:
Nucleic acids bind to a porous silicon-oxide coated membrane in the presence of chaotropic salts and ethanol
Washing steps keep proteins and cellular debris in solution so they can flow through the column and be discarded
The DNA is bound to the column on the membrane
Under low salt conditions the DNA is eluted (released) into solution
Promega Extractions:
Magnetic beads coated in silicon-oxide bind DNA
Adding resin to the lysate allows extraction to be performed in one tube. Various solutions are used to wash the magnetic beads while a magnet holds the DNA-bound beads in place
The amount of resin used is limiting to the amount of DNA captured
The sample is incubated at 65 degrees Celsius in an elution buffer (TE) which releases DNA from the beads into solution
Chelex Extractions:
5% chelating resin is added directly to a sample of blood or semen
Magnesium ions are drawn to and bound by resin
Cells are lysed by boiling for several minutes
A pellet of cellular debris and Chelex resin is formed via centrifugation, and the ssDNA is recovered in the supernatant
FTA Paper Extractions:
Blood added to FTA paper and left to dry
Cellulose-based paper that lyses cells, binds white blood cells to paper’s matrix, protects DNA from nuclease activity, and deters bacterial growth
The paper is stable at room temperature for several years
Washing a punch removes heme and other PCR inhibitors
A clean punch can be added directly to PCR without quantification
It is possible to perform chelex or solid support extractions on punches
Differential Extractions:
Separate epithelial cells from sperm cells in sexual assault cases to facilitate STR interpretation
A modified organic extraction that breaks open female epithelial cells
Sperm cells are then lysed with DTT, which breaks the protein disulfide bridges of the sperm nuclear membrane
Not useful for azoospermic semen (e.g., after a vasectomy, when there are no measurable levels of sperm)
Lecture 3: DNA Quantitation
Purpose of DNA quantitation:
All sources of DNA are extracted when biological evidence from a crime scene is processed to isolate the DNA present
Non-human DNA such as bacterial, fungal, plant, or animal material may also be present in the total DNA recovered from the sample along with the relevant human DNA of interest
For this reason, the DNA Advisory Board Standard 9.3 and FBI Quality Assurance Standard 9.4 require human-specific DNA quantitation so that appropriate levels of human DNA can be included in the subsequent PCR amplification
Higher quality data saves time and money!
The Importance of Quantitation:
Estimate extraction method efficiency:
Methods yield different amounts and quality of DNA
Multiplex STR typing works best with a narrow range of human DNA:
Typically 0.5 to 2.0 ng of input DNA for commercial STR kits
It is especially important to dilute samples to a known concentration for use in PCR
Conserves DNA
Reduces the possibility of introducing inhibitors which may be in the extracted solution
Too much DNA in multiplex PCR:
Off-scale peaks
Split peaks
Locus-to-locus imbalance
Too little DNA in multiplex PCR:
Heterozygote peak imbalance
Allele drop-out
Locus-to-locus imbalance
Normalization: The process of obtaining a DNA concentration suitable for STR analysis
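Normalization comes down to simple concentration arithmetic, sketched below in Python. The function names, the 1.0 ng default target, and the example concentrations are assumptions chosen for illustration (the target sits inside the 0.5 to 2.0 ng range quoted above); the dilution step uses the standard C1·V1 = C2·V2 relationship.

```python
def extract_volume_ul(concentration_ng_per_ul, target_ng=1.0):
    """Volume of extract (uL) that delivers target_ng of DNA into the PCR."""
    if concentration_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / concentration_ng_per_ul


def dilution_plan(stock_ng_per_ul, target_ng_per_ul, final_volume_ul):
    """Return (extract_ul, diluent_ul) using C1*V1 = C2*V2."""
    if target_ng_per_ul > stock_ng_per_ul:
        raise ValueError("cannot dilute to a higher concentration")
    extract_ul = target_ng_per_ul * final_volume_ul / stock_ng_per_ul
    return extract_ul, final_volume_ul - extract_ul


# Hypothetical extract quantified at 0.25 ng/uL: volume needed for 1 ng input.
print(f"{extract_volume_ul(0.25):.1f} uL of extract")

# Hypothetical concentrated extract (5 ng/uL) diluted to 0.1 ng/uL in 50 uL.
extract_ul, diluent_ul = dilution_plan(5.0, 0.1, 50.0)
print(f"{extract_ul:.1f} uL extract + {diluent_ul:.1f} uL diluent")
```

Diluting a concentrated extract also dilutes any co-extracted inhibitors, which is one of the benefits listed above.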
Slot Blot Method:
A primate specific oligonucleotide probe (40bp) binds to satellite sequence D17Z1
DNA is bound to a nylon membrane and then probed
The intensity of the signal from the probed DNA is compared to standards prepared via serial dilution
Comparison is visual (subjective) but digital capture emerged in the early 2000s
The test normally consumes 5 uL of DNA
The test takes several hours, 30 samples per test, detects ssDNA and dsDNA down to 150 pg
UV Absorbance via Nanodrop/Spectrophotometer:
Nucleic acids such as DNA and RNA absorb UV light at a maximal wavelength of 260 nm
There is a linear relationship between the absorption of light and the concentration of nucleic acid in the cuvette
We can use the Beer-Lambert equation to determine the concentration of DNA in a sample
Poor limit of detection (~3.5 ng/uL)
Not specific for human DNA and does not specify the type of DNA
Contaminants such as proteins and phenol give false signals
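A minimal Python sketch of the Beer-Lambert calculation mentioned above (A = ε·c·l). It assumes the commonly used conversion that an A260 of 1.0 at a 1 cm path length corresponds to roughly 50 ng/uL of dsDNA; that factor, the function names, and the example readings are not from the lecture and should be checked against the instrument documentation.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # assumed conversion factor for dsDNA, 1 cm path


def dsdna_concentration(a260, path_length_cm=1.0, dilution_factor=1.0):
    """Estimate dsDNA concentration (ng/uL) from absorbance at 260 nm."""
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 / path_length_cm * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def a260_a280_ratio(a260, a280):
    """Purity indicator: protein contamination tends to lower this ratio."""
    return a260 / a280


# Hypothetical readings for illustration.
print(f"{dsdna_concentration(0.2):.1f} ng/uL")
print(f"A260/A280 = {a260_a280_ratio(0.2, 0.11):.2f}")
```

The result is total nucleic acid signal, so it cannot distinguish human DNA from bacterial DNA or RNA, which is the limitation noted above.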
PicoGreen Intercalating Dye Assays:
Certain molecules bind dsDNA
The inside of dsDNA molecules provides a hydrophobic environment that allows PicoGreen or other molecules to fluoresce differently than when they are in aqueous solution
These molecules are excited by light and fluoresce in proportion to the amount of DNA, where stronger emissions indicate more DNA
PicoGreen is automatable with 80 samples and 16 calibration samples in less than 30 minutes
Uses a standard curve to convert fluorescence signal into the amount of DNA present in an unknown sample
Not human specific and only dsDNA specific
PCR and Quantitative PCR:
In PCR the products are analyzed after cycling is completed
In qPCR the products are monitored as the PCR is occurring
Once per thermal cycle, fluorescence is measured and recorded as a normalized reporter signal (Rn)
The PCR is monitored during the exponential phase where the first significant increase in amount of PCR product correlates to the initial amount of target template
Cycle Threshold (Ct): The number of cycles required for the fluorescent signal to exceed background levels (baseline noise)
When PCR is close to 100% efficiency, doubling of amplicons occurs during each cycle
Ct values are inversely related to the amount of target nucleic acid in the sample (more template means fewer cycles to reach the threshold)
Plotting Ct vs log[DNA] should result in a linear relationship with a negative slope
Low Ct = greater amount of nucleic acid
Two main types of qPCR assays:
Fluorogenic 5’ nuclease TaqMan assay with two primers and a fluorescent probe (very specific)
Intercalating dye SYBR Green with two primers (less specific)
Both assays quantify DNA based on crossing a cycle threshold Ct relative to the # of PCR cycles
Reflect both quantity and quality (“amplifiability”) of DNA for subsequent STR typing processes
Both assays use a fluorescent reporter whose signal increases in direct proportion to the amount of PCR product in the reaction
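The Ct vs log[DNA] relationship described above is what a qPCR standard curve captures. The Python sketch below fits that line by ordinary least squares and reads an unknown off it. The standard quantities and Ct values are invented, the function names are hypothetical, and the efficiency formula 10^(-1/slope) - 1 is the conventional one rather than something stated in these notes.

```python
import math


def fit_standard_curve(quantities_ng, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities_ng]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amplification_efficiency(slope):
    """Efficiency from the slope; 1.0 corresponds to perfect doubling per cycle."""
    return 10 ** (-1 / slope) - 1


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting quantity (ng)."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of a DNA standard.
standards_ng = [50, 5, 0.5, 0.05]
standards_ct = [22.0, 25.32, 28.64, 31.96]

slope, intercept = fit_standard_curve(standards_ng, standards_ct)
print(f"slope = {slope:.2f}, efficiency = {amplification_efficiency(slope):.1%}")
print(f"unknown at Ct 27.0 is about {quantity_from_ct(27.0, slope, intercept):.2f} ng")
```

A slope near -3.3 means each 10-fold dilution costs about 3.3 cycles, which is what doubling per cycle predicts.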
TaqMan Assay:
During PCR the TaqMan MGB probe anneals specifically to a complementary sequence between the forward and reverse primer sites
The minor groove binder at the 3’ end of the probe enables the use of shorter probes that still have higher melting temperatures
The probe is designed to have a higher Tm than the primers so that it remains hybridized during polymerization
When the probe is intact, the proximity of the reporter dye to the quencher dye results in suppression of the reporter fluorescence primarily by Förster-type energy transfer
Energy from the reporter is absorbed by the quencher but re-emitted as heat rather than light
AmpliTaq Gold DNA polymerase cleaves only probes that are hybridized to the target
Cleavage separates the reporter dye from the quencher dye, which increases the fluorescent signal
SYBR Green:
Reaction chemistry:
SYBR Green dye fluoresces when bound to dsDNA
Denaturation:
When DNA is denatured, SYBR green dye is released, and fluorescence is reduced
Polymerization:
PCR products are amplified
Polymerization complete:
The Dye binds to the dsDNA product resulting in a net increase in fluorescence
Quantitative PCR Advantages:
The availability of commercial qPCR kits
Higher throughput and reduced user intervention
Automated set up and analysis using the standard curve
High sensitivity
Large dynamic range ~30 pg to ~30 ng
Quantitative PCR Limitations:
Subject to inhibition, but an internal PCR control (IPC) can help detect it
Below 100 pg, qPCR results are subject to variability and uncertainty
In highly degraded samples, assays that amplify short target sequences will detect and measure more DNA than assays that amplify long target sequences (nicks between primers)
Accurate quantitation assumes that each unknown sample is amplified at the same efficiency as the calibrant samples in the dilution series
qPCR for Human Quantitation:
Designed to simultaneously quantify the total amount of amplifiable human DNA and human male DNA in a sample. The results help determine:
Whether the sample contains enough human DNA and/or human male DNA to proceed with STR analysis
The relative quantities of human male and female DNA in a sample, which can assist in the selection of the applicable STR chemistry (Identifiler vs Y-Filer)
Whether PCR inhibitors are present in a sample that may require additional purification before proceeding to STR analysis
Quantifiler Duo DNA Quantification Kit:
The target-specific assays consist of:
Two primers to amplify human DNA + One TaqMan probe labeled with VIC dye for detecting the amplified human target sequence
Two primers to amplify human male DNA + One TaqMan probe labeled with FAM dye for detecting the human male amplified target sequence
The Internal PCR Control assay:
Two primers to amplify a synthetic sequence not found in nature + One TaqMan probe labelled with NED dye for detecting the IPC DNA
Polymerization and strand displacement
Probe cleavage (release of reporter dye)
Fluorescence occurs when reporter dye and quencher dye are no longer in close proximity
Completion of polymerization
Provides the quantity of human and male DNA in biological samples. From these values, one can calculate the ratio of male and female DNA in a mixture using the following equation:
Male DNA : Female DNA Ratio = Male DNA/Male DNA : (Human DNA - Male DNA)/Male DNA
All quantities in the above equation are ng/μL
For example, assuming the Male DNA concentration = 2 ng/μL and the Human DNA concentration = 8 ng/μL, the Male DNA:Female DNA ratio is 2/2:(8-2)/2 = 1:3
This ratio indicates the extent of the mixture and is useful in determining whether to proceed with autosomal STR or Y-STR analysis (other quantification kits are also available)
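The ratio equation above translates directly into a few lines of Python. The function name `male_female_ratio` and the error handling are illustrative assumptions; the arithmetic is the equation from the notes, with female DNA taken as total human DNA minus male DNA.

```python
def male_female_ratio(human_ng_per_ul, male_ng_per_ul):
    """Return the Male:Female DNA ratio as (1.0, female_part).

    Both inputs are concentrations in ng/uL from the same quantification run.
    """
    if male_ng_per_ul <= 0:
        raise ValueError("no male DNA detected; ratio is undefined")
    if human_ng_per_ul < male_ng_per_ul:
        # Measurement noise can produce this; the sketch refuses to guess.
        raise ValueError("male DNA exceeds total human DNA")
    female_ng_per_ul = human_ng_per_ul - male_ng_per_ul
    return male_ng_per_ul / male_ng_per_ul, female_ng_per_ul / male_ng_per_ul


male_part, female_part = male_female_ratio(human_ng_per_ul=8.0, male_ng_per_ul=2.0)
print(f"Male:Female = {male_part:g}:{female_part:g}")
```

With the worked example values (2 ng/μL male, 8 ng/μL human) this reproduces the 1:3 ratio.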
Passive Reference Dyes:
Passive reference dyes (ROX) used to normalize well-to-well fluorescence signal differences
variation in the optical paths between wells
minor differences in volumes due to pipetting errors
Wells in the center of the thermal cycler block have a shorter light path and they emit higher fluorescence signals compared to wells on the perimeter of the thermal cycler block that have a longer light path
During qPCR data processing, the sample fluorescence for each well is corrected using differences in the fluorescence of the passive reference to normalize the reporter signals
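A minimal Python sketch of this normalization, assuming the conventional definition in which the normalized reporter signal (Rn) is the reporter fluorescence divided by the passive reference (ROX) fluorescence in the same well. The function name and the raw fluorescence values are made up for illustration.

```python
def normalized_reporter(reporter_signal, passive_reference_signal):
    """Rn: reporter fluorescence divided by passive reference fluorescence."""
    if passive_reference_signal <= 0:
        raise ValueError("passive reference signal must be positive")
    return reporter_signal / passive_reference_signal


# Hypothetical wells with the same chemistry but different optical paths:
# the center well reads brighter overall than the perimeter well.
center_well = normalized_reporter(reporter_signal=1200.0, passive_reference_signal=600.0)
edge_well = normalized_reporter(reporter_signal=1000.0, passive_reference_signal=500.0)

print(center_well, edge_well)
```

Because the optical difference scales the reporter and the reference together, it cancels in the ratio and the two wells report the same Rn.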
Microsatellites (STRs):
Millions of microsatellites in the human genome (~3%); roughly one every 10,000 bp
Some microsatellite regions (or STRs) useful for identification because # of repetitive motifs varies between individuals!
Types of Short Tandem Repeats:
Requires size-based DNA separation to resolve different alleles from one another
Dinucleotide (CA)(CA)(CA)(CA)
Trinucleotide (GCC)(GCC)(GCC)
Tetranucleotide (AATG)(AATG)(AATG)
Pentanucleotide (AGAAA)(AGAAA)
Hexanucleotide (AGTACA)(AGTACA)
Currently, tetranucleotides most commonly used in forensics for technical reasons to be discussed
Short tandem repeat (STR) = microsatellite = simple sequence repeat (SSR)
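To make the repeat classes above concrete, this Python sketch counts the longest uninterrupted run of a given motif in a sequence. The function name `longest_tandem_run` and the flanking sequences are hypothetical; real STR alleles can contain interrupted or compound repeats, which this simple pattern match does not handle.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    runs = re.findall(f"(?:{re.escape(motif)})+", sequence)
    return max((len(run) // len(motif) for run in runs), default=0)


# Hypothetical allele: seven AATG repeats between made-up flanking sequence.
allele = "GGTC" + "AATG" * 7 + "CCTA"

print(longest_tandem_run(allele, "AATG"))
print(longest_tandem_run("CACACACA", "CA"))
```

Two alleles at the same locus differ in this count, so their PCR products differ in length by a multiple of the motif size.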
STR CRITERIA FOR FORENSIC APPLICATIONS
High discriminating power (i.e., high # of alleles)
High percentage of heterozygotes (e.g., HE > 70%) that amplify similarly well within/between individuals
Separate chromosomal locations to avoid linkage of loci (linkage would invalidate statistics of ‘product rule’ used)
Narrow allele range (100 - 400 bp) to amplify well in degraded DNA samples
Work in combination with other micros (multiplexing)
(Relatively) low mutation rate (i.e., frequencies do not change over time)
Low level of biological artifacts (i.e., stutter – more later!)
STR ALLELE NOMENCLATURE
Nomenclature developed so labs can communicate with one another
Define motifs using the first repeat you come across on the 5’ strand
Depending on strand you arrive with very different names and repeats
If STR is in protein coding region, coding strand should be used (e.g., TPOX)
Name according to placement near a gene or chr number
Microvariants (alleles with a partial repeat) named by # of complete repeats and then the number of nucleotides in the partial repeat
e.g., TPOX allele 9.3 is (AATG)9 (ATG)
Standards set by International Society of Forensic Genetics (ISFG)
STRS IN FORENSIC APPLICATIONS
Small product sizes are generally compatible with degraded DNA and PCR enables recovery of profile from small amounts of biological material
Multiplex amplification with fluorescence detection enables high power of discrimination in one test
Commercially available in easy-to-use kit formats
Uniform set of core STR loci provides capability for national and international sharing of criminal DNA profiles
THE CODIS STR LOCI
13 core STRs selected in 1997 to form basis of the National DNA Database in the United States known as CODIS (Combined DNA Index System)
As of January 1, 2017, the FBI required an additional 7 STR loci for uploading DNA profiles to the National DNA Index System (NDIS).
COMMERCIAL STR KITS
STR kits vary based on:
STR loci amplified
Fluorescent dye combinations
DNA-strand labelled
Allelic ladder included in the kit
Primer sequence utilized for PCR amplification
MULTIPLEX PCR: REACTION SETUP
DNA sample is added (~1 ng total based on DNA quantitation) – 10 µL possible
PCR primers and other reaction chemicals from an STR typing kit are added – 15 µL
WHAT IS IN A STR TYPING KIT
Kit Components:
Primer mix
PCR buffer (MgCl2 and dNTPs)
DNA polymerase (e.g., AmpliTaq Gold)
Allelic ladder
Positive control
Common kits used:
Profiler Plus/COfiler (Applied Biosystems)
Identifiler (Applied Biosystems)
PowerPlex 16 (Promega)
Primer mix contains fluorescently labeled oligonucleotides used to amplify specific STRs in the human genome
Applied Biosystems has not published their primer sequences
PowerPlex 16 (Promega), which amplifies 16 genomic regions, contains 32 PCR primers
ALLELIC LADDERS
Mix common alleles together to create a reference for scoring alleles in unknown samples
Generated using same primers used for sample amplification!
Different genetic analyzer platforms and running conditions can lead to different mobility; therefore, running the ladder on the same instrument under the same conditions is crucial
Measurement (genotype determination) is performed by comparing allele size (relative to an internal size standard) to a commercially provided STR kit allelic ladder with calibrated repeat numbers (sized according to the same internal size standard)
An internal size standard is run with each sample and external standard to correlate sizes
STR TYPING
Alleles are designated by comparing sized sample peaks to the allelic ladder “bins” (+/- 0.5 bp) of the same dye colour
Internal size standard in all samples enables comparison!
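The bin comparison described above can be sketched as follows. The ladder sizes are hypothetical values for one locus in one dye colour, and the function name and the "OL" (off-ladder) label are illustrative choices, not the behaviour of any particular genotyping software.

```python
def call_allele(peak_size_bp, ladder, window_bp=0.5):
    """Return the ladder allele whose +/- window bin contains the peak, else 'OL'."""
    for allele, ladder_size_bp in ladder.items():
        if abs(peak_size_bp - ladder_size_bp) <= window_bp:
            return allele
    return "OL"  # off-ladder: no bin matched


# Hypothetical ladder: allele name -> size (bp), sized with the same internal standard
ladder = {"8": 170.10, "9": 174.15, "10": 178.20}

print(call_allele(174.32, ladder))  # 9
print(call_allele(176.10, ladder))  # OL
```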
ALLELIC OVERLAP: SOLUTIONS
Multiplexing originally designed to avoid overlap of alleles, but as more loci were included in multiplexes, it was more difficult to avoid overlap
Use more dyes when possible (e.g., 4 vs 5)
Add non-nucleotide linkers to change mobility of PCR product (e.g., hexaethyleneoxide; HEO, product runs 2.5bp slower per unit), which allows continued use of validated primers (important advantage!)
Redesign primers and amplify varying amounts of flanking sequence
AmpFlSTR Identifiler Kit Innovations
COfiler Kit
Identifiler Kit
PowerPlex 16 Kit Innovations
Altering primers = be aware of potential null alleles
PowerPlex 1.1 kit
PowerPlex 16 kit
SEX DETERMINATION: AMELOGENIN
Sample donor sex critically important for context (e.g., sexual assault)
The amelogenin gene encodes a tooth enamel protein (copies on both X and Y)
Primers flank a 6bp deletion in intron 1 of the gene
PCR amplicon size varies based on the kit used
X amplicon acts as internal positive control for reaction
Peak height of amelogenin amplification can be used to determine relative amounts of female:male DNA in a mixed DNA sample
Example: X = 21,000 RFU and Y = 7,000 RFU
Female (X,X) RFU = 21,000 - 7,000 = 14,000 (the male contributes an X peak roughly equal to his Y peak)
Female (X) RFU = 14,000 / 2 = 7,000
Therefore, female:male DNA ratio = 7,000:7,000 OR 1:1
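The amelogenin example can be written as a short calculation. This is a sketch that assumes X and Y amplify equally well, so the male's X peak equals his Y peak; the function name is illustrative.

```python
def female_to_male_ratio(x_rfu, y_rfu):
    """Estimate the female:male DNA ratio from amelogenin X and Y peak heights."""
    if y_rfu <= 0:
        raise ValueError("no Y peak: ratio cannot be estimated")
    if x_rfu < y_rfu:
        raise ValueError("X peak smaller than Y peak: check for X allele dropout")
    female_xx_rfu = x_rfu - y_rfu        # remove the male's single X contribution
    female_per_x_rfu = female_xx_rfu / 2  # a female carries two X copies
    return female_per_x_rfu / y_rfu


ratio = female_to_male_ratio(x_rfu=21000, y_rfu=7000)
print(f"female:male = {ratio:g}:1")  # female:male = 1:1
```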
Technical issues: preferential amplification of X (smaller amplicon)?
Problem 1: rare deletion of gene on Y-chr (6/30,000 males) yields no Y-amplification and a false result of female (Y allele dropout)
Problem 2: rare mutations in primer binding sites (3/7,000 males) yields no X-chr amplification (X allele dropout)
Lecture 4: DNA Amplification (PCR)
THE POLYMERASE CHAIN REACTION
History and Application of PCR:
First described by Mullis in 1985 (Cetus Corp)
Revolutionized molecular biology, awarded with the Nobel Prize in 1993
Allows a DNA template to be quickly and reliably amplified to greater than 10^9 copies starting with small amounts of DNA
Previous methods extremely laborious to get enough DNA template for visualization
Advantages of PCR:
Very small amounts of DNA template may be used
DNA degraded to fragments only a few hundred base pairs in length can serve as effective templates for amplification
Large numbers of copies can be amplified simultaneously
Contaminant DNA, such as fungal and bacterial sources, will not amplify because of human-specific primers used
Commercial kits available
Disadvantages of PCR:
The target DNA template may not amplify due to the presence of PCR inhibitors
Amplification may fail due to sequence changes in primer-binding region of the genomic DNA template
Contamination from other human DNA sources besides the forensic evidence or previously amplified samples is possible without careful laboratory technique and validated protocols
PCR Cycles:
Three distinct events:
Denature template
Primer annealing
DNA synthesis by a thermostable polymerase
Denaturation of the template:
Heat reaction to 95-98 degrees Celsius
DNA becomes single stranded, to which primers can anneal
Primer annealing:
Primers hybridize to complementary ssDNA as reaction cools to 45-65 degrees Celsius, dependent on the base composition of primers
A much greater concentration of primers than target DNA is used, so primers preferentially hybridize to their complementary sequences
Primers are “consumed” (used up) in the cycle
DNA synthesis:
Extension of primers by thermostable DNA polymerase at 72 degrees Celsius
Time required to copy template depends on length of PCR product (~ 1 min/kb)
The cycles are repeated ~28-30 times to give rise to millions of copies of the target DNA sequence
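A minimal sketch of why ~28-30 cycles yields so many copies: with perfect doubling, each cycle multiplies the target by 2. The `efficiency` parameter is an added assumption (1.0 means perfect doubling); real reactions fall short of this and eventually plateau.

```python
def theoretical_copies(starting_copies, cycles, efficiency=1.0):
    """Copies after `cycles` rounds of PCR, assuming constant per-cycle efficiency."""
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return starting_copies * (1 + efficiency) ** cycles


for n in (28, 30):
    print(n, f"{theoretical_copies(1, n):.3g}")
# 28 cycles -> about 2.68e+08 copies per starting molecule
# 30 cycles -> about 1.07e+09 copies per starting molecule
```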
PCR Raw Materials:
Template DNA
Thermostable DNA polymerase
Oligonucleotide primers
Deoxynucleotide triphosphates (dNTPs)
Magnesium ions
Reaction buffer
Optional additives
Template DNA:
0.5-10 ng of template DNA in the majority of PCR reactions
Assuming 3.2 X 10^9 bp (one haploid genome) weighs ~ 3.3 X 10^-12 g (3.3 pg)
Then, 0.5-10 ng of genomic DNA corresponds to ~150-3000 copies of each desired region
A diploid cell contains 6.6 pg DNA, therefore 0.5 ng nuDNA = 500 pg = 76 diploid cells = 152 copies of each locus
10 ng nuDNA = 10,000 pg = 1515 diploid cells = 3030 copies of each locus
Plasmid DNA, mtDNA, and chloroplast DNA are much smaller genomes and therefore less mass of template DNA is required to provide the same number of copies
Stochastic sampling occurs with low template amounts, where unequal sampling of the two alleles present in a heterozygous individual happens by chance
Results in allele imbalance, allele dropout, or complete locus dropout
Not an accurate reflection of the original DNA sample
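The template copy-number arithmetic in this section can be reproduced with a short calculation. This sketch uses the 6.6 pg per diploid cell figure from the notes; the function name is illustrative.

```python
PG_PER_DIPLOID_CELL = 6.6  # from the notes: two haploid genomes of ~3.3 pg each


def template_copies(mass_ng):
    """Return (diploid cell equivalents, copies of each locus) for a DNA mass in ng."""
    mass_pg = mass_ng * 1000
    cells = round(mass_pg / PG_PER_DIPLOID_CELL)
    return cells, 2 * cells  # two copies of each autosomal locus per diploid cell


for mass in (0.5, 10):
    cells, copies = template_copies(mass)
    print(f"{mass} ng -> {cells} cells -> {copies} copies")
# 0.5 ng -> 76 cells -> 152 copies
# 10 ng -> 1515 cells -> 3030 copies
```

At the low end, each allele of a heterozygote is represented by only a few dozen molecules, which is why chance sampling effects become visible.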
Thermostable DNA Polymerase:
Originally Taq DNA polymerase isolated from hot spring bacterium Thermus aquaticus
Taq is used because of its thermostability
Taq concentration of 0.05 units/ul in each reaction
Taq can incorporate errors
Other polymerases considered in specific situations based on error rate or efficiency
An example is AmpliTaq Gold
Oligonucleotide Primers:
Must be specific to their target region, possess similar annealing temperatures, and not interact significantly with themselves or each other (hairpin structures, primer-dimers)
Programs: Oligo, Primer3, Primer Express
Reverse primers are the reverse complement of the target sequence
Reasons for limiting the size of a primer:
We want to be specific
We want to work within an optimal temperature range
Maximum copy number of amplified target sequence is 1E+12, so we need more copies of primer than amplified product
Trade-off between maximizing desired product produced and minimizing off-target products
Consider primer design that maximizes non-template-dependent “A” addition to PCR products
See lecture slides for calculation samples
Magnesium Ions:
[Mg 2+] is a crucial factor affecting the performance of Taq and stringency of primer annealing
They allow the template and primers to bind one another by alleviating the repulsion of negative charges
Cofactor in enzymatic reaction of DNA polymerization and without adequate free magnesium, Taq is inactive
Mg2+ neutralizes repulsion between negatively charged DNA strands (templates and primers)
Reaction components, including chelating agents present in sample (EDTA) or proteins can reduce the amount of free magnesium
Low [Mg 2+]: Repulsion of backbones of primer and template will be stronger, therefore melting will occur at lower temperatures
High [Mg 2+]: Neutralizes repulsion of primer and template backbones, therefore melting will occur at higher temperatures
Overall, there is a trade-off between decreasing primer binding to non-desirable regions, while allowing primers to bind to the desired targets
Very small differences in [Mg 2+] can result in changes to quality and quantity of amplified product
Optimal [Mg 2+] will be locus specific, but 1.5 mM concentration is standard and should never exceed 3.0 mM
Optional Additives:
Bovine Serum Albumin (BSA) binds potential PCR inhibitors such as protein and phenol
BSA can act as a chelating agent when it is in excess
Solutions to PCR Inhibitors:
Dilute template DNA, which also dilutes the PCR inhibitor
Add more Taq DNA polymerase to overcome the inhibitor, where some Taq binds inhibiting molecules and the remaining Taq amplifies the template
Add optional additives (BSA) to minimize inhibition
Purify/wash the template (spin column, ethanol precipitation) to remove inhibitors
STANDARD PROCEDURES IN THE PCR CYCLE
Denaturing Time and Temperature:
Generally, there is a long initial denaturing step at 94 degrees Celsius (5-20 mins) to completely dissociate the template strands
30 second denaturing steps are used within the cycles as templates are much shorter
Annealing Time and Temperature:
Small molecules are more likely to form ionic bonds with the targeted annealing site
Annealing time is short to allow primers to bind in the correct position while limiting the chances of primer mis-binding
Annealing time is often kept stable at 1 minute through PCR cycles
Optimizing annealing temperature is the first and most important step in optimizing a PCR reaction
Primers with more hydrogen bonds (based on length or GC content) require higher temperatures to melt off the template
The optimal annealing temperature is high enough to melt mis-bound primers off non-target sites while being low enough to allow sequence-specific primer binding
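To illustrate how length and GC content raise the melting temperature, here is a sketch using the Wallace rule of thumb (2 degrees per A/T, 4 degrees per G/C). This rule is not given in these notes and is only a rough estimate for short oligonucleotides; the primer sequences are made up for the example.

```python
def wallace_tm(primer):
    """Rough melting temperature (degrees C): 2*(A+T) + 4*(G+C). Rule of thumb only."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G, T")
    at = seq.count("A") + seq.count("T")  # A-T pairs: two hydrogen bonds
    gc = seq.count("G") + seq.count("C")  # G-C pairs: three hydrogen bonds
    return 2 * at + 4 * gc


print(wallace_tm("AGCTAGCTAGCTAGCTAGCT"))  # 60 (20-mer, 50% GC)
print(wallace_tm("GGCCGGCCGGCCGGCCGGCC"))  # 80 (20-mer, 100% GC)
```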
Extension Time and Temperature:
The extension temperature of Taq DNA polymerase is 72 degrees Celsius
Taq synthesizes 1000bp per minute
PCR products under 500bp do not require much time to complete synthesis, 30 seconds is enough
Generally, extension runs 1 minute per cycle
A final extension step at 60 degrees Celsius runs for 45 minutes to ensure that all products are fully extended
Additional adenosine is added to the end of PCR products to make scoring easier
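The per-cycle extension time follows directly from the synthesis rate quoted above. A minimal sketch, assuming a constant rate of 1000 bp per minute; the function name is illustrative.

```python
def extension_seconds(amplicon_bp, rate_bp_per_min=1000):
    """Minimum time for the polymerase to copy an amplicon at a constant rate."""
    if amplicon_bp <= 0 or rate_bp_per_min <= 0:
        raise ValueError("amplicon length and rate must be positive")
    return 60 * amplicon_bp / rate_bp_per_min


print(extension_seconds(500))  # 30.0 seconds
print(extension_seconds(400))  # 24.0 seconds, upper end of the STR size range
```

The standard 1 minute per cycle therefore leaves a comfortable margin for STR amplicons.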
Cycle Number:
The PCR cycle number is generally around 30 because amplification plateaus after this point
Reagents become a limiting factor in PCR amplification
Exponential amplification reaches its plateau around 30 cycles (1E+11 to 1E+12 copies)
The cycle number is only increased when starting template amounts are low
The tradeoff of using too many cycles is that undesired regions may be amplified and contaminants may become present
Touchdown PCR:
A ~50 cycle run at incrementally lower temperatures
The initial temperature is higher than the primer melting temperature and gradually declines
The first primer-template hybridization events are stringent, keeping undesired amplification low
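A touchdown program can be pictured as a list of annealing temperatures, one per cycle. The sketch below is hypothetical: the starting temperature, step size, and floor are illustrative values, not a validated protocol.

```python
def touchdown_schedule(start_c, final_c, step_c, total_cycles):
    """Annealing temperature per cycle: drop by step_c each cycle until final_c."""
    if step_c <= 0 or start_c < final_c:
        raise ValueError("need a positive step and start_c >= final_c")
    temps = []
    temp = start_c
    for _ in range(total_cycles):
        temps.append(temp)
        temp = max(final_c, temp - step_c)  # never go below the final temperature
    return temps


# Hypothetical run: start above the primer Tm, settle at 55 C, 50 cycles total
schedule = touchdown_schedule(start_c=65, final_c=55, step_c=1, total_cycles=50)
print(schedule[:12])  # [65, 64, 63, 62, 61, 60, 59, 58, 57, 56, 55, 55]
```

The early, stringent cycles enrich the correct product, which then out-competes non-specific targets once the temperature drops.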
Hot Start PCR:
Taq polymerase is added at a higher temperature to minimize the effects of mis-priming
Combats primers binding to non-target regions at room temperature, causing contamination of the PCR product
Hot Start Taq:
Modified AmpliTaq Gold polymerase is only active at high temperatures
The Taq polymerase must be “shocked” into activity at 95 degrees Celsius
Eliminates the accumulation of non-specific product
PCR Controls:
Controls provide perspective to interpret experimental results and guide troubleshooting
Negative control:
Contents of the reaction mix and no DNA sample
No product formation should be observed other than primer-dimers
Product = contamination
Positive control:
Includes a DNA sample with a known profile
If the sample fails to amplify, there is something wrong with the reaction mix
The Identifiler STR genotyping kit includes positive control DNA 9947A
Primer control:
Specific uses for human forensic biology labs
Amplifying and sequencing mtDNA hypervariable regions
This control determines if a problem arose due to the primers used
Can signal degradation of primers used
Troubleshooting Scenario Practice:
Positive and primer-control amplified, negative control did not amplify. Most samples amplified but some did not:
PCR inhibitors present
Positive and negative control did not amplify but primer control did:
Degraded primers
Positive, negative, and primer-control did not amplify:
Problems in reaction mix
Positive, negative, and primer control amplified:
Contamination present in samples
Avoid changing multiple conditions at once. Rule out the source of a problem with one troubleshooting run at a time.
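The practice scenarios above can be captured as a simple lookup table. This sketch only encodes the four scenarios listed in these notes; any other combination of control results is reported as not covered rather than guessed.

```python
# Key: (positive control amplified, negative control amplified, primer control amplified)
DIAGNOSES = {
    (True, False, True): "Controls OK; if some samples failed, PCR inhibitors present",
    (False, False, True): "Degraded primers",
    (False, False, False): "Problems in reaction mix",
    (True, True, True): "Contamination present",
}


def diagnose(positive, negative, primer):
    """Return the likely problem for a pattern of control results."""
    return DIAGNOSES.get((positive, negative, primer),
                         "Not covered by the practice scenarios")


print(diagnose(positive=False, negative=False, primer=False))  # Problems in reaction mix
print(diagnose(positive=True, negative=True, primer=True))     # Contamination present
```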
Multiplex PCR:
Amplifying many loci in one reaction
Difficult to avoid primer-dimer formation if sequences are similar
Primers must have similar annealing temperatures
Primer and magnesium ion concentrations are adjusted accordingly
Amplicons should not overlap in size if they are not labeled with different dyes
Week 6: DNA Separation & Detection
ELECTROPHORESIS
Goal: Separate DNA (STRs) by size; DNA migrates in the electric field because of its negative charge.
STR typing requires:
Spatial resolution (separate STRs and loci)
Spectral resolution (separate fluorescent dyes)
Sizing precision (run-run consistency; we use allelic ladders)
Larger DNA molecules interact more frequently with the gel and are thus retarded in their migration through the gel.
Gel Electrophoresis System
An electric field is applied such that the negatively charged DNA molecules migrate away from a negative (cathode) towards a positive charge (anode).
DNA moves towards positive charge and separates based on size; with smaller DNA molecules traveling faster than larger DNA molecules.
DNA molecules can be loaded into the wells of slab gels that consist of a microporous matrix through which the DNA must pass.
The amount of sample loaded into the gel is dependent on the width of the tooth in a comb and the depth the comb sits in the gel.
Gel is submerged in a tank filled with electrophoresis buffer (e.g., Tris-acetate-EDTA) and DNA samples are loaded into the top of the well.
The number of samples you can run on a gel is dictated by the size of the gel and the number of the wells you have.
Loading Dyes And Gels
Samples are mixed with a loading dye
Tracking dye (e.g., bromophenol blue) monitors migration.
Sucrose increases the density and viscosity of the sample, pulling the DNA down into the well.
There are two types of gels used in molecular biology: agarose and polyacrylamide gels (PAGE), which differ based on the pore size.
Agarose = larger pore size, which resolves larger DNA molecules
PAGE = smaller pore size, which resolves smaller DNA molecules (<1000 bp)
Therefore, PCR-amplified STR alleles (100-400 bp) are better resolved by PAGE.
However, PAGE is time consuming and not easily automated, and acrylamide is a neurotoxin.
Native vs Denaturing Conditions
DNA can run through the pores of a gel as:
ds DNA in native/non-denaturing conditions, or as ssDNA under denaturing conditions
Generally, better resolution between closely sized DNA molecules occurs under denaturing conditions because a ssDNA molecule is more flexible as it moves through the matrix.
Chemicals such as formamide or urea can be added to the DNA sample, where they interact with DNA and interfere with the formation of hydrogen bonds between complementary ssDNA molecules
Capillary Electrophoresis (CE)
The general components of CE include sample injection, separation, and detection of STR alleles.
Caps are made of glass or fused silica with a diameter of ~50 µm (length 36-80 cm).
Caps contain viscous polymer rather than gel matrix to separate DNA. New polymer added for each run.
Longer capillary = higher resolution.
*Sample tray moves automatically beneath the cathode end of the capillary to deliver each sample in succession.
Capillary Electrophoresis Advantages:
Fully automated, no need to individually load samples, don’t need to pour gels (electrokinetic injection of samples)
Only a subset of the amped sample used, can be re-tested.
Faster: higher voltages due to enhanced heat dissipation (300 V/cm vs 10 V/cm)
Electronic output: no need for gel pictures nor scanning gel.
No lane tracking, enclosed caps mean no lane bleed-through, less cross-contamination
Capillary Electrophoresis Disadvantages:
Each cap can only process one sample at a time, sequential injections (so 4, 16, or 96 caps).
Expensive to buy and maintain.
Salts, unwanted DNA can out-compete PCR products for sample injection.
Electrokinetic Injection
Voltage is applied to a liquid sample to introduce DNA into the capillary.
DNA molecules are negatively charged, and positive voltage draws DNA into the capillary.
CE injected DNA is extremely sensitive to contaminating small molecules with a negative charge (eg; chloride from PCR) that “out-compete” the larger DNA molecules (the solution is PCR purification).
FLUORESCENT DETECTION
Most platforms are based on fluorescence detection that excites dye molecules and then detects light emitted.
Fluorescent dye is attached to the primer, unlike DNA sequencing (FL-ddNTPs)
Dyes are incorporated into amplicon via PCR. Two or more dyes can be separated using optical filters.
DNA is visualized with a charge-coupled device (CCD)
How it works:
Unlabeled DNA can be fluorescently labeled in three ways:
Intercalating dye (e.g., SYBR Green) inserts between base pairs of double-stranded DNA
Fluorescent dNTPs are incorporated into both strands of the PCR product
Fluorescent dye-labeled primer is incorporated, so only one strand of the PCR product is labeled
Fluorescence
Argon ion lasers (488 or 514.5 nm) excite fluorophore (dye) attached to amplified DNA.
Fluorophore absorbs laser photon energy and emits light at lower energy (longer wavelength (𝜆))
Filters are used to collect only emitted light at a specific 𝜆.
Filters detect multiple fluorophores at once using fluorophore separation algorithms called a matrix.
The CCD collects and amplifies the signal from the fluorophore and converts it to an electronic signal (relative fluorescence units; RFU)
Fluorescent Dyes
Promega PowerPlex 16 (4C matrix)
Fluorescein (blue), JOE (green), TAMRA (yellow; black visual), ROX (red; internal size standard)
Promega PowerPlex Fusion (5C matrix)
Fluorescein (blue), JOE (green), TMR-ET (yellow; black visual), CXR-ET (red), WEN (orange; ISS)
Identifiler and Identifiler Plus (5C matrix)
6FAM (blue), VIC (green), NED (yellow; black visual), PET (red), LIZ (orange; ISS)
Dyes used depend on the system you have, the filters your instrument has, and the software you have.
SPECTRAL CALIBRATION
Filters set to detect emission spectra of each dye; the spectra overlap to a degree.
Spectral overlap is removed by applying a matrix: samples labeled with a single dye are used to create a calibration file that shows the spectral overlap between different dyes.
If matrix is not optimized, you can observe “pull-up” (e.g., green peaks showing up under blue peaks, or vice versa).
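The idea behind the matrix correction can be sketched for just two dyes. The overlap fractions and signal values below are hypothetical, and real instruments use a larger matrix (one row and column per dye); this only illustrates how a calibrated matrix removes pull-up.

```python
def unmix_two_dyes(observed, matrix):
    """Solve observed = matrix * true for a 2-dye system (Cramer's rule).

    matrix[i][j] is the fraction of dye j's signal seen in filter i,
    as measured from single-dye calibration samples.
    """
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular; dyes cannot be separated")
    blue = (d * observed[0] - b * observed[1]) / det
    green = (a * observed[1] - c * observed[0]) / det
    return blue, green


# Hypothetical calibration: 10% of blue dye leaks into the green filter,
# 20% of green dye leaks into the blue filter
matrix = ((1.0, 0.2), (0.1, 1.0))
# A pure blue peak of 1000 RFU appears as 1000 (blue filter) and 100 (green filter)
blue, green = unmix_two_dyes(observed=(1000.0, 100.0), matrix=matrix)
print(round(blue), round(green))
```

After correction the apparent green signal under the blue peak disappears; with a poorly calibrated matrix, part of it would remain as pull-up.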
STR Data: ABI Prism 310
Red-labeled peaks are from the internal sizing standard GS500-ROX.
SAMPLE PREPARATION & INJECTION
DNA from PCR is prepped in the following way (1:10 dilution):
1 µL PCR product (or allelic ladder, once per plate).
8.7 µL of deionized formamide (denatures DNA and dilutes salts)
0.3 µL of internal lane standard (e.g., GS500-ROX).
Heat for 2 minutes at 95 °C, then place on ice (“snap-cooling”). Load plate onto genetic analyzer.
DNA is injected into the cap via electrokinetic injection, which begins with pre-injection of samples (e.g., 15 kV for 5 sec)
Injection is a competitive process (sample vs salts), and the amount of DNA injected is inversely proportional to the ionic strength of the sample.
DILUTION OF DNA SAMPLES
Following PCR, a small portion of the sample is transferred for analysis and diluted in formamide.
This aliquot of the sample is mixed with a molecule size marker (termed an internal size standard) that permits calibration of sizing measurements.
Sample plates spun down via centrifuge:
Sample plates are spun to remove bubbles that would interfere with the injection (loading) process onto the capillary electrophoresis instrument.
ABI 3130xl: Data Collection
Data collection is performed on an Applied Biosystems (ABI) 3130xl capillary electrophoresis instrument.
SAMPLE PROCESSING SUMMARY
Replace capillary, refill syringe w polymer solution, fill buffer vials → performed only once per batch of ~96 samples
Prepare samples (denature, cool, and mix with size standard)
Prepare sample sheet and injection list → allelic ladder every tenth injection.
Automated sample injection, Electrophoresis and Data collection.
Size DNA fragments → GeneScan software
Genotype STR alleles → Genotyper software
Perform Data analysis → manually inspect the data
ELECTROPHORESIS and DETECTION steps are simultaneous.
STR GENOTYPES VS STR PROFILES
An STR genotype is the allele (homozygote) or alleles (heterozygote) present for a particular locus.
An STR profile is produced by combining all the STR genotypes (CODIS 13/20).
Individuals will differ from one another in terms of their STR profile, but not necessarily at a single STR genotype.
Steps involved in STR Genotyping:
Data collection → colour separation → peak identification → peak sizing → comparison to allelic ladder → genotype assignment to alleles → peak editing to remove artifact calls → data review by analyst → confirmation of results by second analyst
DATA COLLECTION
Four dimensions of data are collected by the CCD detector:
capillary position (x-axis)
wavelength of light across spectrum (y-axis)
intensity of light at specific wavelengths (z-axis)
time (t-axis)
The analysis software then synthesizes an electropherogram of the STR PCR products for each sample by connecting thousands of CCD frames
PEAK DETECTION THRESHOLD
Thresholds are set to separate signal from noise – in other words, are we confident that a peak is real?
Signal peak height is measured in relative fluorescence units (RFUs) that are related to the amount of DNA present in the sample loaded onto instrument
Detection thresholds typically vary from 50 RFU to 200 RFU
Below the Analytical Threshold (50 RFUs; limit of detection, LOD): peak not considered reliable
Between the two thresholds: peak reliable, but only used for exclusions
Above the Interpretation (stochastic) Threshold (150 RFUs; limit of quantitation, LOQ): peak reliable, can be used for inclusions
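The two-threshold scheme can be expressed as a small classifier. This sketch uses the 50 and 150 RFU example values from these notes as defaults; treating a peak exactly at a threshold as passing it is an assumption, and each lab sets its own validated thresholds.

```python
def classify_peak(height_rfu, analytical_rfu=50, stochastic_rfu=150):
    """Classify a peak by height against the analytical and stochastic thresholds."""
    if height_rfu < analytical_rfu:
        return "not reliable"
    if height_rfu < stochastic_rfu:
        return "reliable, exclusions only"
    return "reliable, can be used for inclusions"


for height in (30, 90, 400):
    print(height, "RFU:", classify_peak(height))
# 30 RFU: not reliable
# 90 RFU: reliable, exclusions only
# 400 RFU: reliable, can be used for inclusions
```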
SIZING ALGORITHM AND INTERNAL STANDARDS
How accurate are peak sizes that fall near the edge of the region defined by the internal size standard?
Local Southern Method
DNA Fragment peaks are sized based on the curve produced from the points on the internal standard
OVERVIEW OF STR TYPING
COMPARISON OF ALLELIC LADDER TO SAMPLES TO CONVERT SIZE INTO ALLELE REPEAT NUMBER
MICROVARIANT “OFF-LADDER” ALLELES
Defined as alleles that have a form of sequence variation compared to more commonly observed alleles
Do not size the same as common alleles (“off-ladder”)
Alleles with partial repeat units are designated by the number of full repeats and then a decimal point followed by the number of bases in the partial repeat
Example: TH01 9.3 allele: [TCAT]4 -CAT [TCAT]5
Microvariants are common and sequence variation can occur in flanking regions as well!
Microvariant Allele calculations
Relative size difference between the sample alleles and ladder alleles can be used to determine whether a microvariant exists
Sample allele 25 vs ladder allele 25 = -0.12 nt
Off-ladder allele vs ladder allele 28 = +0.87 nt
Relative peak shift = |+0.87 - (-0.12)| = 0.99 nt
Therefore, off-ladder allele is 1 nt larger than ladder allele 28 and is designated 28.1
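The microvariant calculation can be sketched as follows. Here each offset is taken to be the sample peak size minus the size of the ladder allele it is compared with, and the off-ladder allele is assumed to be larger than its nearest ladder allele; the function names are illustrative.

```python
def relative_peak_shift(known_allele_offset_nt, off_ladder_offset_nt):
    """Shift of the off-ladder peak after correcting for the run's sizing offset."""
    return off_ladder_offset_nt - known_allele_offset_nt


def designate(nearest_ladder_allele, shift_nt):
    """Name the allele as <complete repeats>.<extra nucleotides>."""
    extra_nt = round(shift_nt)
    if extra_nt < 0:
        raise ValueError("sketch only handles alleles larger than the ladder allele")
    if extra_nt == 0:
        return str(nearest_ladder_allele)
    return f"{nearest_ladder_allele}.{extra_nt}"


shift = relative_peak_shift(known_allele_offset_nt=-0.12, off_ladder_offset_nt=0.87)
print(f"{shift:.2f} nt")     # 0.99 nt
print(designate(28, shift))  # 28.1
```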
FACTORS AFFECTING GENOTYPING
Matrix file
Internal size standard
Allelic ladder sample
Degraded DNA
Mixture DNA
NON-ALLELIC PEAKS
Not all data represents alleles from the sample!
Non-allelic peaks may be:
PCR artifacts (e.g. stutter, non-template dependent nucleotide addition, and non-specific amplification products),
analytical artifacts (e.g. spikes and raised baseline),
instrumental limitations (e.g. incomplete spectral separation resulting in pull-up or bleed-through),
or may be introduced into the process (e.g. disassociated primer dyes resulting in a dye blob)
STUTTER PRODUCTS
Peaks that show up one repeat unit less than the true allele due to strand slippage during DNA synthesis
E.g., Y-STR (single source) should have one peak; used CTT repeat to exaggerate the results!
Each successive stutter product is less intense (allele > repeat-1 > repeat-2 > repeat-3)
Stutter less pronounced with larger repeat unit sizes (di- > tri- > tetra- > penta-nucleotides)
Stutter Product Information
Repeat unit bulges out when strand slippage occurs during replication
Forward stutter is RARE
Typically <2% of allele in tetranucleotide repeat STR loci
Reverse stutter
Typically 5-15% of allele in tetranucleotide repeat STR loci
Repeat unit deletion
Caused by slippage on the copied (bottom) strand
Repeat unit insertion
Caused by slippage of the copying (top) strand
STUTTER PEAK IMPLICATIONS
Difficult to discern stutter from alleles when numerous sources (mixtures) contribute to DNA profile because stutter has same size as potential alleles; especially problematic if you have a minor contributor of DNA
DNA labs normally quantify stutter in % relative to taller allele peak height
What might tip you off that you have a minor contributor and not just stutter (hint: remember you are using 16 loci)?
STUTTER TRENDS AND PRINCIPLES
Quantity of stutter depends on locus as well as PCR conditions and polymerase used
Quantity of stutter is greater for longer alleles within a locus
Typically, less than 15% of corresponding allele peak height
Quantity of stutter is less if the sequence of core repeats is interrupted (e.g. compound repeat)
Stutter amount increases when amplifying low levels of DNA template due to stochastic effects
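Stutter is usually expressed as a percentage of the parent allele's peak height. The sketch below computes that percentage and applies a flat 15% cut-off taken from the "typically less than 15%" figure above; real labs use locus-specific, validated stutter filters, so the single threshold is a simplification, and the peak heights are hypothetical.

```python
def stutter_percent(stutter_rfu, parent_rfu):
    """Height of the peak in stutter position as a percentage of the parent allele."""
    if parent_rfu <= 0:
        raise ValueError("parent allele peak height must be positive")
    return 100 * stutter_rfu / parent_rfu


def is_probable_stutter(candidate_rfu, parent_rfu, max_percent=15):
    """True if a peak one repeat shorter than the parent is within the stutter range."""
    return stutter_percent(candidate_rfu, parent_rfu) <= max_percent


# Hypothetical peaks one repeat unit below a 1500 RFU allele
print(stutter_percent(120, 1500), is_probable_stutter(120, 1500))  # 8.0 True
print(stutter_percent(450, 1500), is_probable_stutter(450, 1500))  # 30.0 False
```

A peak in stutter position that exceeds the expected range, as in the second case, is a hint that it may be a true allele from a minor contributor.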
STUTTER FORMATION SOLUTIONS
Longer (bp) repeat motifs (not a greater number of repeats) have less stutter; pentanucleotides (e.g., Penta A-G) normally have around 1% stutter (two pentas are used in PowerPlex 16)
STRs with imperfect (compound) repeats have less stutter than STRs with simple repeats
STRs with interrupted motifs (e.g., TH01 allele 9.3) have less stutter than STRs with un-interrupted motifs
Thermocycling conditions can also influence stutter intensity (lowering annealing and extension temperature)
Use Taq DNA polymerases that extend template faster; therefore, less time for loop formation (slippage)
NON-TEMPLATE ADDITION
Taq DNA polymerase will often add an extra nucleotide to the end of a PCR product; most often an “A” (termed “adenylation”, or “A+”)
If forward primer is dye-labelled, adenylation is influenced by sequence of the 5’-end of reverse primer (e.g., “G” can be put at the end of a primer to promote non-template addition)
A+ enhanced with extension “soak” at end of PCR cycle (15-60 min @ 60 or 72 °C) to give polymerase more time
Excess amounts of DNA template in the PCR can result in incomplete adenylation (not enough polymerase at end)
Allelic ladder and sample PCR products must be in the same form for easier interpretation by the analyst!
NON-TEMPLATE ADDITION SOLUTIONS
It is best if there is not a mixture of “+/- A” peaks to improve the likelihood of a correct genotype call
TRI-ALLELIC PATTERNS
Three alleles can be observed at a locus in a single-source DNA that are not always a result of DNA sample mixtures
Result from extra chromosome fragments being present in a sample that produce an additional PCR product
Likely to occur within a 15-locus STR profile about once every 1,000 samples and their distribution is not even
How would you discern a 3-allele pattern from a mixture or a chromosomal duplication?
NULL ALLELES
Allele is present in the DNA sample but fails to be amplified due to a nucleotide change in a primer binding site
Mutations in primer binding sites can result in primers not annealing and no product for that allele
Flanking regions where primers anneal are not as subject to mutation as the STR repeats, so null alleles should not occur often
Null alleles are a problem because a heterozygous sample appears falsely as a homozygote
Null Alleles and Sequence Variation
Sequence variation within or around STR regions affects PCR amplification efficiency differently!
Null alleles are rare events because mutation rate in flanking sequences is low
Null Allele Primer Concordance Studies
Null Allele Solutions
Primer redesign
Drop the locus from the STR multiplex
Degenerate primer: include a primer with the base change so multiple primers are used to amplify the template
Re-amplify sample with lower annealing temperature (reduces primer annealing stringency)
Reduce match stringency (e.g., moderate stringency of 25/26 alleles) to account for variation between labs when searching a national database
Although null alleles occur, as long as the same primers and conditions are used to amplify Q and K samples, the resulting profile should be the same!
DECIPHERING ARTIFACTS FROM THE TRUE ALLELES
Manual inspections of allele calls can be subjective/based on personal bias, need two reads of each gel and genotypes must agree prior to report