Forensic Bio Unit 3

Fluorescence Labeling

-              Fluorescent dyes contain a fluorophore, which is the component of the dye that actually fluoresces

-              The fluorescence is caused when a certain wavelength of light hits the fluorophore and excites it

-              The fluorophore will excite at one wavelength and emit at a separate wavelength

-              The difference in the excitation wavelength and the emission wavelength of a dye is called the STOKES SHIFT
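
As a worked example of the Stokes shift, the sketch below subtracts the excitation wavelength from the emission wavelength. The function name and the two wavelengths are illustrative assumptions, not values taken from these notes.

```python
def stokes_shift_nm(excitation_nm: float, emission_nm: float) -> float:
    """Return the Stokes shift: emission wavelength minus excitation wavelength."""
    if emission_nm < excitation_nm:
        raise ValueError("emission is expected at a longer wavelength than excitation")
    return emission_nm - excitation_nm


# Illustrative wavelengths only (assumed for this example).
print(stokes_shift_nm(494, 521))
```

A larger shift makes it easier for an optical filter to pass the emitted light while blocking the excitation light.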

 

Dyes Used to Fluorescently Label PCR

-              Different dyes can be used in the same analysis since optical filters can be used to separate out different dye colors

-              The number of dyes used depends on need (or kit used)

-              Generally, the more dyes, the more loci (locations in the DNA that can be analyzed)

-              Dyes used generally fall within the 400-650nm range

-              Some kits used in forensics use up to 6 dyes

 Singleplex vs Multiplex

-              In PCR it is possible to use more than one primer set

-              As long as the regions of interest are not overlapping, you can amplify more than one area

-              The components will be the same, you just need to have a primer set for each region of interest

 

Multiplex Kits

-              Current multiplex kits in forensics can amplify over 20 loci at the same time

-              From this single PCR reaction, enough information can be generated to differentiate between every individual in the world (except for identical twins)

-              This saves a large amount of time compared with setting up and running over 20 separate reactions for each sample

Summary

-              Electrophoresis separates DNA based on size only

-              Similarly sized PCR products (amplicons), even if from entirely different locations, cannot be separated by electrophoresis without labeling

-              Fluorescently labeled products allow for DNA fragments of the same size to be distinguished from one another
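
The point of the summary can be sketched in a few lines: grouping fragments by size alone merges two different loci, while grouping by dye and size keeps them apart. The locus names, dye colors, and sizes below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical amplicons: (locus name, dye color, size in bp). All values are made up.
amplicons = [
    ("Locus A", "blue", 150),
    ("Locus B", "green", 150),
    ("Locus C", "blue", 210),
]


def group_by_size(fragments):
    """Size-only view: fragments of equal length land in the same bin."""
    bins = defaultdict(list)
    for locus, _dye, size in fragments:
        bins[size].append(locus)
    return dict(bins)


def group_by_dye_and_size(fragments):
    """Dye-aware view: each (dye, size) pair gets its own bin."""
    bins = defaultdict(list)
    for locus, dye, size in fragments:
        bins[(dye, size)].append(locus)
    return dict(bins)


print(group_by_size(amplicons))          # Locus A and Locus B share the 150 bp bin
print(group_by_dye_and_size(amplicons))  # every locus has its own bin
```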

 Two Different Types of VNTR Assays

-              RFLP – based VNTRs

-              PCR-based VNTRs

o   Both types use markers that are minisatellites (8-200bp repeats)

 

RFLP-based VNTRs

-              This type of analysis was not based on PCR – requires a much larger amount of starting material for an analysis

-              Multiple VNTR loci can be used

o   Cannot be multiplexed (combined into one reaction)

-              This method is based on restriction fragment length polymorphisms

 

Variable Number Tandem Repeats

-              In this case, the repeat unit is 31 base pairs long

-              There are 20 repeats total

-              This would be an example of one allele

o   Each person has two alleles (one from each parent)

-              The number of repeats can vary widely (typically 10-30)

-              Differences are based on length (ex. 30 repeats is longer than 28 repeats)
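
The length arithmetic behind these bullets can be sketched as follows. The `flanking_bp` parameter is an assumption added for illustration (the notes give only the repeat unit and repeat count), so it defaults to zero and the result is the length of the repeat region alone.

```python
def allele_length_bp(repeat_count: int, repeat_unit_bp: int, flanking_bp: int = 0) -> int:
    """Length of an allele: repeats times unit length, plus any flanking sequence."""
    return repeat_count * repeat_unit_bp + flanking_bp


# The example from the notes: a 31 bp repeat unit repeated 20 times.
print(allele_length_bp(20, 31))

# Two alleles that differ by two repeats differ in length by two repeat units.
print(allele_length_bp(30, 31) - allele_length_bp(28, 31))
```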

 

Restriction Fragment Length Polymorphism

-              RFLP- based VNTR profiling utilizes restriction fragment length polymorphisms (RFLPs)

-              RFLPs are based on restriction endonucleases, which are enzymes that cut DNA at a certain place

-              Each enzyme has what is known as a recognition sequence: it recognizes a certain nucleotide sequence and cuts at that sequence
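
A minimal sketch of how a restriction enzyme produces fragments, using EcoRI (recognition sequence GAATTC, cut after the first G) as the example enzyme. The DNA string is made up, only one strand is modeled, and the function name is hypothetical.

```python
def digest(sequence: str, site: str, cut_offset: int) -> list[str]:
    """Cut `sequence` at every occurrence of `site`.

    `cut_offset` is how many bases into the recognition site the cut falls.
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


dna = "TTGAATTCAAAAGAATTCGG"  # made-up sequence with two GAATTC sites
fragments = digest(dna, "GAATTC", 1)
print(fragments)
print([len(f) for f in fragments])
```

If a VNTR sits between two recognition sites, more repeats mean a longer fragment between the cuts, which is the length difference RFLP analysis detects.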

 

Detecting Size Differences for VNTRs

-              Since we are detecting differences in length, you can perform electrophoresis

-              Once you have the different-sized fragments, you can compare your crime scene sample to those of any possible suspects

 

Disadvantages of VNTRs with RFLP

-              Does not perform well with degraded samples

-              Does not perform well with samples in limited quantity

-              Since both these characteristics are common among forensic samples, a different type of method is needed

 

PCR-Based VNTRs

-              VNTR analysis that uses PCR to amplify the regions is called amplified fragment length polymorphism (AFLP) analysis

-              If the overall size of the locus is under 1000bp, then AFLPs can be used

-              Similar to the RFLP method, AFLPs look at differences in the number of repeats by amplifying with primers in the conserved flanking regions

-              Primers are designed to anneal in the conserved region, the number of repeats is what varies

 

PCR-Based VNTRs

-              Primers are designed so the length of the flanking regions is always the same

-              This is showing an example of a heterozygous individual

 

Advantages of the PCR-based VNTRs

-              The use of PCR makes it so less initial DNA is required

-              By multiplexing multiple loci, it is possible to get a higher discriminatory power (a greater level of individualization)

-              Many of the VNTR loci are upwards of 1000bp in length, which is less than ideal for forensic samples

o   The need to reduce this size leads to our next topic, STRs, which is how forensic testing is currently performed.

Minisatellites vs Microsatellites

-              Minisatellites and microsatellites are both examples of tandem repeats (Adjacent regions of repeated units)

-              The main difference between the two types is the length of the repeat

o   Minisatellites are typically 8-200bp repeat units

o   Microsatellites are typically 2-7bp repeat units

 

Microsatellites

-              Shorter repeat units mean smaller overall allele sizes

-              Smaller allele sizes are better for forensic applications

-              Why?

o   Better for degraded DNA

o   Easier to multiplex (increases discriminatory power)

 

Characteristics of STRs

-              The microsatellites used in forensic analysis are short tandem repeats (STRs)

-              These are based on nuclear DNA, so there are two copies for each locus (location)

-              Alleles can be homozygous or heterozygous

-              There is a large number of STRs in the human genome (estimated to be over 100,000)

-              Each STR is characterized by the core repeat and the flanking region

-              The core repeat consists of the tandemly repeated regions

-              The flanking region consists of a conserved area on each side (this is where the primers anneal)

 

Repeat Unit Length

-              The repeat unit lengths for STRs range from dimeric (2bp) through trimeric (3bp), tetrameric (4bp), pentameric (5bp), and hexameric (6bp) to heptameric (7bp)

-              Dimeric and trimeric repeats have issues with stutter

-              Pentameric, hexameric, and heptameric are less abundant

-              Tetrameric is the length used in all core loci used in forensics (USA)

o   A few pentameric STR loci are utilized internationally

 

Repeat Unit Length

-              The tetrameric unit length is commonly found in the human genome

-              The tetrameric length is highly polymorphic – a crucial trait for STR analysis

o   Polymorphic means that an allele appears in multiple forms, or that there is a lot of variety for that locus

 

Repeat Unit Sequences

-              Not each type of repeat is the same

-              The differences in types are based on the sequence of the repeats

-              The core STR loci include each of these types

o   Simple

o   Compound

o   Complex

 

Simple STR Repeats

-              Simple repeats consist of tandem repeats with identical repeating units

-              EX: repeat at D5 is (AGAT)n, where n is the number of repeats

-              This allele number would be 10; (AGAT)10
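
To illustrate how an allele number follows from a simple repeat, the sketch below counts the longest run of back-to-back copies of a motif. The flanking bases in the example sequence are made up, and the function name is hypothetical.

```python
def count_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the longest run of back-to-back copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        run = 0
        pos = start
        # Keep stepping one motif length at a time while the motif repeats.
        while sequence.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


# Made-up flanking sequence around ten copies of AGAT, i.e. (AGAT)10.
allele = "CCGT" + "AGAT" * 10 + "TTGA"
print(count_tandem_repeats(allele, "AGAT"))
```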

 

Compound STR Repeats

-              Compound repeats consist of more than one type of repeat

-              EX: D8 – [TCTA]n [TCTG]n [TCTA]n

-              This allele number is 14; [TCTA]2 [TCTG]1 [TCTA]11
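
The allele number of a compound repeat is the sum of the repeat counts of its blocks. A minimal sketch, assuming the bracket notation used above and treating a block with no number after it as one repeat (the function name is hypothetical):

```python
import re


def allele_number(bracket_notation: str) -> int:
    """Sum the repeat counts in notation such as '[TCTA]2 [TCTG][TCTA]11'."""
    blocks = re.findall(r"\[([ACGT]+)\]\s*(\d*)", bracket_notation)
    if not blocks:
        raise ValueError("no repeat blocks found")
    # A block written without a count is taken to occur once.
    return sum(int(count) if count else 1 for _motif, count in blocks)


print(allele_number("[TCTA]2 [TCTG][TCTA]11"))  # 2 + 1 + 11
```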

 

Complex STR Repeats

-              Complex repeats consist of several clusters of different tandem repeats with intervening sequences

 

How is the STR Repeat Named?

-              The tetranucleotide repeat motif is named based on the top strand

o   There are historical exceptions to this naming convention

-              This would be referred to as a TCAT repeat

 

Characteristics of the Core Loci

-              In the US, there are currently 20 loci (and a sex marker) that are part of the core CODIS loci

-              The increase to 20 loci took effect in 2017

o   From 1998 until 2017, the original core CODIS set included only 13 loci

-              CODIS stands for Combined DNA Index System

 

Why Would you Expand the Core Loci Set?

-              The move to increase the number of core CODIS loci was made primarily to address the following items:

o   Decrease chance of adventitious match (a match due to random chance)

o   Be more like international databases

o   Increase discriminatory power

 

How are the Core CODIS loci named?

-              The core CODIS loci are named based on either 1) the chromosome they are found on or 2) the name of the gene they are a part of

-              EX: D5S818 is located on chromosome 5

-              Other loci (ex. FGA, CSF1PO) are genes, but refer specifically to the introns of those genes (regions not coding for the protein)
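
The naming rule can be sketched as a small parser: names of the D-number form carry the chromosome right after the "D", while gene-based names carry no chromosome at all. The pattern below is an assumption about the name format made for illustration, and the function name is hypothetical.

```python
import re


def chromosome_from_locus(name: str):
    """Return the chromosome encoded in a name like 'D5S818', or None for gene-based names."""
    match = re.fullmatch(r"D(\d{1,2}|X|Y)S(\d+)", name)
    return match.group(1) if match else None


for locus in ("D5S818", "FGA", "CSF1PO"):
    print(locus, chromosome_from_locus(locus))
```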

 

What Makes a Good Core CODIS Marker?

-              The goal of these markers is to lead to the individualization of a sample

-              What are some characteristics they could possess?

o   Polymorphic

o   Consistency in flanking regions among all individuals (primer locations)

o   Smaller sizes are preferred (Degraded DNA)

o   Not linked to other loci (generally on separate chromosomes; if on the same chromosome, far enough away not to be linked)

o   Few amplification artifacts (ex: stutter)

 

Forensic STR Analysis

-              The current STRs used in forensic investigation are amplified using fluorescently labeled primers

-              The different amplicons are separated by capillary electrophoresis

-              The different color dyes can be separated by the optical filters

 

Internal Lane Standard

-              The internal lane standard (ILS) is added to each sample

-              The ILS consists of DNA fragments of known size, making it possible to size your fragments of unknown size

-              This is the same function as the PCR marker for slab gel electrophoresis

 

An Electropherogram

-              The data collection process generates an electropherogram

-              An electropherogram is a display of the peaks representative of the different fragments

-              An electropherogram is separated into color channels and shows the relative amount of each fragment as an RFU (relative fluorescence unit) value

-              The allele number is the number of repeats; the overall size is the length of the fragment in base pairs

 

Determining the Allele

-              The internal lane standard is used to determine the size of the fragments represented by each peak

-              How can we determine what the allele is? An allelic ladder

-              An allelic ladder is a sample that is made up of the majority of known alleles at a certain locus

-              PCR markers used in slab gel electrophoresis are sometimes called “ladders”
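
Once a fragment has been sized against the internal lane standard, calling the allele amounts to finding the ladder rung it lines up with. The sketch below assumes a made-up ladder for a tetrameric locus and a matching window of 0.5 bp; both are illustrative assumptions, not values from these notes.

```python
def call_allele(fragment_bp: float, ladder: dict, tolerance_bp: float = 0.5):
    """Return the ladder allele whose size is within tolerance of the fragment, else None."""
    best = None
    best_diff = tolerance_bp
    for allele, ladder_bp in ladder.items():
        diff = abs(fragment_bp - ladder_bp)
        if diff <= best_diff:
            best, best_diff = allele, diff
    return best


# Hypothetical ladder: allele number -> fragment size in bp (4 bp between rungs).
ladder = {8: 140.0, 9: 144.0, 10: 148.0, 11: 152.0, 12: 156.0}

print(call_allele(148.2, ladder))  # lines up with the allele 10 rung
print(call_allele(150.0, ladder))  # falls between rungs, so no call is made
```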

 

Allelic Ladder

-              In this figure, each locus is separated

-              The peaks represent the common alleles for that particular locus

 

Determining Genotype

-              The genotype of an individual is the number of repeats for both alleles at that locus

 

STR Profile

-              A STR profile consists of the genotype for all the loci used in a certain kit

-              For example, a STR profile used for uploading to the CODIS database will include the genotype for all 20 core CODIS loci

 

Example of STR Profile

-              This electropherogram shows an STR profile using the Identifiler kit

-              This kit amplifies 15 loci plus a sex marker

-              Prior to 2017, this kit could amplify the core CODIS loci

-              Since all 13 core loci are represented, this was a complete STR profile

 

Interpretation of STR Profiling Results

-              There are three common conclusions at the end of STR analysis

o   Inclusion: the genotypes of the two compared STR profiles are identical

o   Exclusion: the two genotypes differ, indicating that the profiles originated from different sources

o   Inconclusive: indicates that there is not enough info to support a conclusion of either inclusion or exclusion (common for partial profiles)

-              In the case of a match, a statistical weight can be assigned to the likelihood of such a match
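
The three conclusions can be sketched as a simple comparison of two profiles. This is a simplified illustration, not a laboratory protocol: the `min_shared_loci` cut-off for calling a comparison inconclusive is an assumed parameter, and the genotypes are made up.

```python
def compare_profiles(questioned: dict, known: dict, min_shared_loci: int = 20) -> str:
    """Compare two STR profiles of the form {locus: (allele, allele)}.

    Only loci typed in both profiles are compared. `min_shared_loci` is an
    assumed cut-off below which an all-matching comparison is inconclusive.
    """
    shared = sorted(set(questioned) & set(known))
    for locus in shared:
        # Allele order within a genotype does not matter.
        if sorted(questioned[locus]) != sorted(known[locus]):
            return "exclusion"
    if len(shared) < min_shared_loci:
        return "inconclusive"
    return "inclusion"


# Made-up genotypes at three loci, so the cut-off is lowered to 3 for the demo.
scene = {"D5S818": (11, 12), "FGA": (22, 24), "CSF1PO": (10, 10)}
suspect_a = {"D5S818": (12, 11), "FGA": (22, 24), "CSF1PO": (10, 10)}
suspect_b = {"D5S818": (11, 13), "FGA": (22, 24), "CSF1PO": (10, 10)}
partial = {"FGA": (22, 24)}

print(compare_profiles(scene, suspect_a, min_shared_loci=3))
print(compare_profiles(scene, suspect_b, min_shared_loci=3))
print(compare_profiles(partial, suspect_a, min_shared_loci=3))
```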

Factors Affecting Genotyping Results

-              There are numerous types of factors that alter the interpretation of an STR profile

o   Genetic-related factors

o   Amplification-related factors

o   Electrophoresis-related factors

 

Mutations

-              The loci used to evaluate STRs are selected in part because of low mutation frequencies

-              A mutation is just a change in the DNA that is brought about by a rare event

o   Can cause the changing of a single base pair

o   Can result in duplication/deletion of large section of DNA or entire chromosome

-              Despite tests to ensure low mutation rates, some mutations can occur in STRs and alter the interpretation of STR profiles

 

Chromosomal and Gene Duplications

-              In some cases, duplication of one of the two chromosomes in a pair can lead to three copies of that chromosome, and so to three alleles at a locus

o   Remember you are diploid – got one set of DNA from your mother, one set from father

-              This condition is called trisomy and is associated with many genetic diseases

-              At certain loci, three alleles will appear

 

Tri-Alleles

-              When there are tri-alleles, interpretation can be altered

-              If only one locus shows the tri-allele, then it is probably from a single source

 

Point Mutations

-              Point mutations involve the change of the nucleotide sequence at a singular point

-              This is particularly problematic when the point mutation occurs in the primer binding site

-              A change in the nucleotide sequence at the primer binding site can lead to a failure of the amplification of that allele

-              When an allele that should be present does not amplify, this is called a null allele

-              If there is a mutation that makes it so the primer cannot bind, it is possible that one of the alleles would not amplify

-              If there is a mutation in the primer site of the 18 repeat allele, then it will not amplify

 

Amplification-Related Artifacts

-              There are several artifacts that can be introduced during the amplification process

-              The two we will discuss in this class are

o   Stutter

o   Non-template Adenylation

 

What is stutter?

-              During the extension phase of PCR, a portion of the DNA strand may “slip” forward or backward

-              This slip leads to a product that is one repeat shorter (more common) or longer (less common) than the true allele

 

Why Stutter is Problematic

-              Interpretation is difficult because a stutter peak falls at a position where a true allele could be
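
One common way to handle this is to flag a small peak that sits one repeat below a much taller peak. The sketch below covers only that case (the more common, one-repeat-shorter stutter); the 15% cut-off and the peak heights are assumptions chosen for illustration.

```python
def flag_stutter(peaks: dict, max_ratio: float = 0.15) -> set:
    """Flag alleles that may be stutter of the allele one repeat longer.

    `peaks` maps allele number -> peak height in RFU at a single locus.
    Allele n is flagged when allele n + 1 is present and
    height(n) / height(n + 1) is at most `max_ratio` (an assumed cut-off).
    """
    flagged = set()
    for allele, height in peaks.items():
        parent_height = peaks.get(allele + 1)
        if parent_height and height / parent_height <= max_ratio:
            flagged.add(allele)
    return flagged


peaks = {11: 120, 12: 1500, 14: 1400}  # made-up peak heights
print(flag_stutter(peaks))
```

A flagged peak is only a candidate: in a mixture, a minor contributor's true allele can sit in the same position, which is exactly why stutter complicates interpretation.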

 

Non-Template Adenylation

-              During PCR amplification, Taq Polymerase generally adds an extra adenine (the “A” base) to the 3’ end of the amplicon

-              This addition is referred to as a non-template addition – the addition of a base that is not determined by the sequence of the template strand

-              Most multiplex kits are designed to account for this addition, but occasionally some forms without the extra A will be present (typically from too much template DNA)

o   The forms without the addition are one base (an “A”) shorter

 

Electrophoretic Based Artifacts

-              There are several artifacts that can be introduced during the electrophoresis step of analysis

-              These artifacts include:

o   Pull-up peaks

o   Spikes

 

Pull-Up Peaks

-              A pull-up peak is when a minor peak of one color is “pulled up” from a major peak in another color

-              This is the result of the sample being overloaded or a bad spectral calibration

-              If the pull-up peak corresponds to the position of an allele in another color channel, then the interpretation of the DNA profile may not be accurate

 

Spikes

-              Spikes are very sharp peaks (narrower than a true allele peak) that are present in all the color panels

-              Spikes are caused by either air bubbles or changes in the voltage

-              If spikes occur, the sample needs to be re-run

-              The spike will be present at approximately the same height in each color channel

 

Genotyping Challenging Forensic Samples

-              Numerous factors unrelated to genetic, amplification, or electrophoretic characteristics can also impact DNA analysis

-              Many of these types of factors are a result of the environment the DNA sample is collected from and are unavoidable

-              These factors include

o   Degraded DNA

o   Low copy number DNA testing

o   Mixtures

 

Degraded DNA

-              DNA degradation is the breaking down of large DNA molecules into smaller fragments

-              This breakdown is brought about by environmental factors such as high heat and humidity

-              The normal size range of STRs is between 100-500bp

-              Alleles that are larger (closer to 500bp) are more likely to be degraded than the smaller alleles

-              In a degraded sample, larger DNA is less likely to be amplified since it is degraded

-              These alleles will “drop out”

 

Low Copy Number DNA Testing

-              Low copy number (LCN) DNA is a sample with a very low amount of DNA (less than 100 picograms)

-              LCN DNA is often found in instances of touch DNA samples

-              Samples in which there is a low amount of DNA can be amplified by increasing the cycle numbers

-              Increasing the cycle number allows for the amplification of smaller amounts, but it also introduces other artifacts

-              The other artifacts introduced include allele dropout, heterozygote peak imbalance, and increased stutter product, which makes interpretation more difficult

-              Since the samples are low in DNA, re-amplification to confirm the presence of true alleles is often not possible

 

Mixtures

-              A mixture is a sample that includes DNA from two or more contributors

-              In some cases, you know that one of the contributors is the victim

-              In other cases, it is unknown how many contributors may be present

 

Mixture Interpretation

-              Mixture interpretation is the interpretation of DNA profiles that contain mixtures

-              The field of DNA analysis is still searching for the best approach to interpreting mixtures

-              Mixture interpretation can be made more complicated by the different types of artifacts previously discussed

-              There are a number of factors that are indicative of the presence of a mixture

o   Severe heterozygote imbalance

o   Increased stutter

o   Presence of three or more alleles per locus

 

Heterozygote Imbalance

-              Heterozygote imbalance is when the two alleles of a heterozygous individual at a certain locus are not approximately equal in peak height

-              Some difference in height is expected, but if the ratio of the shorter peak to the taller peak is less than 60%, then it can be an issue
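
The 60% check can be written as a ratio of the shorter peak to the taller peak. The function names and peak heights below are illustrative assumptions.

```python
def peak_height_ratio(height_a: float, height_b: float) -> float:
    """Return the shorter peak height divided by the taller one (between 0 and 1)."""
    low, high = sorted((height_a, height_b))
    if high <= 0:
        raise ValueError("peak heights must be positive")
    return low / high


def is_imbalanced(height_a: float, height_b: float, threshold: float = 0.60) -> bool:
    """True when the ratio falls below the threshold (60% in these notes)."""
    return peak_height_ratio(height_a, height_b) < threshold


print(is_imbalanced(1400, 1500))  # nearly equal peaks
print(is_imbalanced(500, 1500))   # one peak is a third of the other
```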

 

The Number of Alleles per Locus

-              For a single source profile (one contributor), the maximum number of alleles expected is two (heterozygous for that locus)

-              It is also possible that only one allele will be shown (homozygous for that locus)

-              If there are more than two alleles at more than one locus, then it is probable that the profile is a mixture involving two or more contributors

o   Three alleles at a single locus is likely a tri-allele
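
These rules can be sketched as a count of loci that show more than two alleles. The second function follows from the two-allele maximum per person: the largest allele count at any locus, divided by two and rounded up, gives a lower bound on the number of contributors. Locus names and alleles are made up, and the function names are hypothetical.

```python
import math


def assess_allele_counts(profile: dict) -> str:
    """Classify a profile of the form {locus: alleles observed} by its allele counts."""
    loci_over_two = [locus for locus, alleles in profile.items() if len(set(alleles)) > 2]
    if not loci_over_two:
        return "consistent with a single source"
    if len(loci_over_two) == 1:
        return "possible tri-allele"
    return "probable mixture"


def minimum_contributors(profile: dict) -> int:
    """Lower bound on contributors, since each person adds at most two alleles per locus."""
    most_alleles = max(len(set(alleles)) for alleles in profile.values())
    return max(1, math.ceil(most_alleles / 2))


single = {"Locus 1": [11, 12], "Locus 2": [9], "Locus 3": [14, 16]}
tri = {"Locus 1": [11, 12, 13], "Locus 2": [9], "Locus 3": [14, 16]}
mixed = {"Locus 1": [11, 12, 13], "Locus 2": [9, 10, 11, 12], "Locus 3": [14, 16]}

for profile in (single, tri, mixed):
    print(assess_allele_counts(profile), minimum_contributors(profile))
```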

 

A Mixture Profile

-              Mixtures are messy

-              If you have more than two contributors, it becomes really hard to determine individual contributors

-              In two-person mixtures, you can possibly determine individual profiles if at least one of the following is true

o   There is a victim profile

o   There is a major and a minor contributor

 

Summary of Factors Impacting STR Genotyping

-              Genetic

o   Tri-alleles

o   Null alleles

-              Amplification

o   Stutter

o   Non-template adenylation

-              Electrophoresis

o   Spikes

o   Pull-up

-              Sample quality

o   Degraded DNA

o   Low copy number DNA

o   Mixtures

When Mitochondrial DNA Profiling is Used

-              In cases where there are samples that contain little or no nuclear DNA (nDNA), mitochondrial DNA (mtDNA) can be used

o   Some samples do not contain any nDNA (ex. Hair shafts)

o   In other cases, the nDNA that was present may have been degraded (ex. Mass disaster cases)

 The Mitochondrial Genome

-              The first mitochondrial genome was sequenced by Fred Sanger’s laboratory in 1981 at Cambridge University

o   The Cambridge reference sequence (CRS)

-              Due to errors in the original sequence, a revised Cambridge reference sequence (rCRS) was published in 1999

-              The rCRS is used as the point of comparison for all mitochondrial DNA forensic samples

-              The mitochondrial genome encodes for a total of 37 genes

-              There are no introns in the mitochondrial genome

-              The control region (aka the D-loop) is hypervariable and can therefore be used for forensic purposes

 

The Hypervariable Regions

-              There are a total of 3 hypervariable (HV) regions:

o   HVI

o   HVII

o   HVIII

-              HVI and HVII are used for forensic purposes since they are the most polymorphic

 

Heteroplasmy

-              Heteroplasmy is when an individual carries more than one mtDNA haplotype (think of haplotype as a genotype)

-              It is possible that the individual carries one haplotype in one type of tissue, such as hair, and another haplotype in another, such as skin cells

-              Two types of heteroplasmy that exist:

o   Sequence heteroplasmy

o   Length heteroplasmy

 

Sequence Heteroplasmy

-              Sequence heteroplasmy is defined as the presence of two different nucleotides at a single position

-              Represented as overlapping peaks in an electropherogram

 

Length Heteroplasmy

-              Length heteroplasmy is typically due to differences in the length of the “C-stretch” between two mtDNA haplotypes

o   A “C-stretch” is just numerous cytosines (the “C” base) in a row

 

mtDNA Sample Processing

-              Many of the same procedures used to extract and quantify nuclear DNA can be used for mtDNA

-              PCR steps are similar, with the following exceptions:

o   Different primers are used

o   A higher number of PCR cycles is generally used

§  Makes the reaction more sensitive, but also increases the likelihood of contamination – use controls to monitor contamination

 

DNA Sequencing of mtDNA Samples

-              The common DNA sequencing technique for mtDNA samples is the chain termination method

-              A sequencing reaction contains the following:

o   Template DNA

o   Primers

o   DNA polymerase

o   Cofactors

o   dNTPs

o   ddNTPs – the same as dNTPs but missing a hydroxyl (OH) group; these are fluorescently labeled

§  The absence of the OH group prevents the chain from growing any further, terminating growth

 

dNTP present: Chain can grow

-              Typically, a dNTP is incorporated and the chain can continue growing

-              In this example, a third base has been added and more could be as well

-              The presence of the OH group allows for more bases to be added

 

ddNTP present: Chain cannot grow

-              When a ddNTP is incorporated, the chain can no longer grow

 

How ddNTPs are Visualized

-              The ratio of dNTPs (which allow chain growth) and ddNTPs (which terminate growth) varies in a reaction

-              At the end of a sequencing reaction, you end up with lots of fragments that are different lengths

-              The different lengths vary by one nucleotide to include the shortest possible fragment, the longest possible fragment, and everything in between
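
The idea that a ladder of fragments, each one nucleotide longer than the last, spells out the sequence can be sketched as follows. The strand is made up, and the model is idealized: it assumes exactly one terminated fragment of every possible length, each tagged with the base of its labeled ddNTP.

```python
def termination_fragments(new_strand: str) -> list[tuple[int, str]]:
    """Return (length, terminal base) for every possible terminated fragment."""
    return [(n, new_strand[n - 1]) for n in range(1, len(new_strand) + 1)]


def read_sequence(fragments) -> str:
    """Order fragments by length and read off the labeled terminal base of each."""
    return "".join(base for _length, base in sorted(fragments))


strand = "ACGTTGCA"  # made-up newly synthesized strand
fragments = termination_fragments(strand)
mixed_up = list(reversed(fragments))  # fragments come out of the reaction unordered

print(read_sequence(mixed_up))
```

Sorting by length stands in for the separation step: shorter fragments are detected first, so reading the dye color of each successive fragment gives the bases in order.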

 

Cycle Sequencing

-              The chain termination method is carried out by cycle sequencing

-              Cycle sequencing uses thermal cycling (just like PCR) to generate a single stranded template for the chain-termination sequencing reactions

-              The three steps of thermal cycling are the same as PCR:

o   Denaturing (double to single stranded)

o   Annealing (primer attached to single stranded template)

o   Extension (DNA polymerase adds dNTPs to form new strand; the addition of a ddNTP terminates the growth)

 

After Cycle Sequencing

-              Following cycle sequencing, the different length fragments can be separated using capillary electrophoresis

-              The only differences between mtDNA capillary electrophoresis and nDNA capillary electrophoresis (STRs) are:

o   POP-6 is used as the matrix

o   A longer capillary tube is used

-              Modifications to capillary electrophoresis process allow for better resolution to the single base level

-              The mitochondrial DNA profile should be sequenced in both directions (forward and reverse)

-              If there is enough sample available, samples should be sequenced twice

 

How Sequences are Reported

-              For mitochondrial sequence analysis, every sequence is compared to the rCRS

-              The data for sequence reporting consists of the nucleotide position and the base that differs for the mitochondrial DNA profile
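
Reporting against a reference can be sketched as listing each position where the sample differs, together with the differing base. The short reference string below is a made-up stand-in, not the real rCRS, and the sketch handles substitutions only (the two sequences must already be aligned and of equal length).

```python
def differences_from_reference(reference: str, sample: str, start_position: int = 1):
    """List (position, reference base, sample base) for each substitution."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned and of equal length")
    return [
        (start_position + i, ref_base, sample_base)
        for i, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


reference = "ACCTGATCA"  # made-up stand-in for a stretch of reference sequence
sample = "ACTTGATCG"

for position, ref_base, sample_base in differences_from_reference(reference, sample):
    print(position, ref_base, "->", sample_base)
```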

 

How Results are Reported

-              Three possibilities: exclusion, cannot exclude, inconclusive

-              Exclusion: if the questioned and known sequences are different, samples can be excluded

o   At least two differences need to be reported, since mtDNA has a higher mutation rate

-              Cannot Exclude: if the questioned and known sequences are the same, they cannot be excluded

-              Inconclusive Result: if the questioned and known sequences differ by only a single nucleotide, the result is inconclusive
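
The three outcomes map directly onto the number of nucleotide differences between the questioned and known sequences. A minimal sketch of that rule as stated above (the function name is hypothetical):

```python
def interpret_mtdna(difference_count: int) -> str:
    """Map the number of differences between two mtDNA sequences to a conclusion."""
    if difference_count < 0:
        raise ValueError("difference count cannot be negative")
    if difference_count == 0:
        return "cannot exclude"
    if difference_count == 1:
        return "inconclusive"
    return "exclusion"  # two or more differences


for count in (0, 1, 2):
    print(count, interpret_mtdna(count))
```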

 

 

 

Y-STRs

 

Y-Chromosome Inheritance

-              The Y-chromosome is only present in males

-              It has a mode of inheritance known as patrilineage, where a father passes it on to all his male offspring

-              The Y-chromosome contains ~59 million base pairs

-              It encodes for 50-60 genes

 

Why Use Y-STRs

-              In many ways, Y-STRs are inferior to STRs

-              The same Y-STR profile is shared by all male relatives through patrilineage inheritance

-              When calculating the statistical weight of the evidence, the Y-STR loci sit on the same chromosome and are inherited together, so their frequencies cannot be multiplied across loci as they are for other STRs

 

Why Use Y-STRs

-              In the case of sexual assaults with a female victim and a male perpetrator, often there is a mixture of DNA evidence

-              This mixture is generally made up of a large amount of female DNA and only a small portion of male DNA

-              Y-STRs can be used in such circumstances as the female DNA will not interfere

-              Y-STRs are also useful for sexual assault cases involving multiple male perpetrators

 

The Core Y-STR Loci

-              The set of Y-STR loci in use is constantly expanding

-              The initial core Y-STR loci included a total of 9 loci

-              This number has increased based on the different kits used and new information obtained

-              Current kits can multiplex and amplify over 20 different Y-STR loci

-              The same methods used to isolate, amplify, and separate STRs can be used for Y-STRs (just with different primers)

 

Multiple Y-STR Loci

-              Since there is only one Y-chromosome, there is an expectation that only one allele should be present

-              For multilocal Y-STR loci, there is a duplication that leads to two alleles being present (there is still only one Y-chromosome)

-              If two alleles are present for a Y-STR, it is referred to as bilocal

 

Interpretation of Results

-              There are three possible determinations when comparing an unknown and a known Y-STR genotype

o   Exclusion: Y-STRs are different and could not have originated from the same source

o   Inconclusive: there is insufficient data to make a determination on the origin of the source (partial profile)

o   Failure to Exclude: Y-STRs from the unknown and known profile are the same and therefore could have originated from the same source

 

Future of Y-STRs: Rapidly Mutating Y-STR

-              There is a subset of Y-STRs that are referred to as rapidly mutating Y-STRs (RM Y-STRs)

-              The average Y-STR mutation rate is 0.0001; for RM Y-STRs it is 0.01

-              Currently, the Y-STR loci cannot differentiate between patrilineage relatives

-              The RM Y-STRs, since they mutate at a higher rate, open the possibility of differentiating between patrilineage relatives
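
A short calculation shows why the higher rate matters. If each locus mutates independently with the same rate per father-to-son transmission, the chance that at least one of n loci differs between father and son is 1 - (1 - rate)^n. The independence assumption, the reading of the rates as per locus per generation, and the choice of 13 loci are assumptions made for this example.

```python
def prob_at_least_one_mutation(rate_per_locus: float, n_loci: int) -> float:
    """Chance that at least one of `n_loci` independent loci mutates in one generation."""
    return 1 - (1 - rate_per_locus) ** n_loci


N_LOCI = 13  # assumed panel size, for illustration only

standard = prob_at_least_one_mutation(0.0001, N_LOCI)
rapid = prob_at_least_one_mutation(0.01, N_LOCI)

print(f"standard Y-STRs: {standard:.4f}")
print(f"RM Y-STRs:       {rapid:.4f}")
```

Under these assumptions a father and son almost never differ at standard Y-STRs, while a panel of rapidly mutating loci separates them in a noticeable fraction of cases.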