Sequencing and genomes

21 Terms

1

Sanger sequencing requirements

Developed in 1977; still used today

  • dNTP:ddNTP ratio = 100:1

  • Single primer used

  • Template = PCR product or plasmid
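
A rough illustration of why the 100:1 ratio matters: if we assume the polymerase picks a dNTP or ddNTP purely in proportion to their concentrations (a simplification, real incorporation rates differ), each position has about a 1 in 101 chance of terminating the strand. The function names below are hypothetical.

```python
def termination_probability(dntp_to_ddntp: float = 100.0) -> float:
    """Chance that a given position receives a ddNTP instead of a dNTP.

    Assumes incorporation is proportional to concentration only.
    """
    return 1.0 / (dntp_to_ddntp + 1.0)


def fraction_still_extending(n_bases: int, p_stop: float) -> float:
    """Fraction of strands that have added n_bases without terminating."""
    return (1.0 - p_stop) ** n_bases


p = termination_probability(100.0)
for n in (50, 100, 500):
    # Longer fragments become progressively rarer, but are still produced.
    print(n, round(fraction_still_extending(n, p), 3))
```

Under this assumption a higher ddNTP share would give mostly short fragments, and a lower share would give too few terminated strands at each position.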

2

What are ddNTPs

→ chain terminating nucleotide bases

  • The 3’ OH group is replaced by an H atom, so once a ddNTP is added, DNA polymerase is unable to add another nucleotide onto the strand.

  • Blocks further polymerisation

3

Sanger sequencing 1977-1980s

  1. A DNA template, primer, DNA polymerase, normal nucleotides (dNTPs), and a small amount of one ddNTP (A, T, C, or G) are mixed, with a radioactive label; 4 separate sequencing reactions are run, one per ddNTP

  2. DNA polymerase adds nucleotides until it incorporates a ddNTP, which stops extension

  3. Produces lots of radioactively labelled fragments of different lengths, each ending in that reaction's ddNTP

  4. Size separation by gel electrophoresis, with the 4 reactions run in separate lanes

  5. Photographic film detects the radioactive DNA strands

  • manual process, takes time
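
The four-reaction logic can be sketched in a few lines. This is a toy model, not a simulation of real chemistry: it assumes the template is written 3'→5' so the new strand reads left to right, and that every possible termination position is represented among the fragments. `sanger_read` is a hypothetical name.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def sanger_read(template: str) -> str:
    """Reconstruct the newly synthesised strand from four ddNTP reactions."""
    new_strand = "".join(COMPLEMENT[b] for b in template)

    # One reaction per ddNTP: fragments end wherever that base occurs.
    lanes = {
        base: [i + 1 for i, b in enumerate(new_strand) if b == base]
        for base in "ACGT"
    }

    # Gel electrophoresis: order all bands by fragment length (shortest first),
    # then read off which lane each band sits in.
    bands = sorted(
        (length, base) for base, lengths in lanes.items() for length in lengths
    )
    return "".join(base for _, base in bands)


print(sanger_read("TACG"))  # ATGC
```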

4

Sanger 1990s-present

  1. Sequencing uses a mixture of fluorescently labelled ddNTPs and normal dNTPs; each base has a different colour

  2. Chain termination PCR produces a mixture of DNA strands of different lengths

  3. Size separation by gel electrophoresis in a capillary; a laser excites the fluorescent labels so each fragment's final base can be detected

  • automated

5

What always comes before sequencing?

PCR

Amplifies small bits of DNA into large numbers
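
A quick sketch of the amplification arithmetic, assuming ideal PCR in which every cycle exactly doubles the number of copies (real reactions are less efficient and eventually plateau). `pcr_copies` is a hypothetical helper.

```python
def pcr_copies(starting_copies: int, cycles: int) -> int:
    """Copies after a number of cycles, assuming perfect doubling each cycle."""
    return starting_copies * 2 ** cycles


print(pcr_copies(1, 30))  # 1073741824
```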

6

Human genome

→ complete set of genetic info present in a cell/organism

Genome size = total no. of DNA base pairs in one copy of a haploid genome

Human genome = 3.2 billion bp

7

Human genome project timeline

1977- Sanger sequencing

1989- UK research council and USA jointly funded the project

1990- project begins, international team of researchers

2003- completed (almost) using Sanger sequencing

2023- fully completed genome

Cost = approx. 3 billion US dollars

Wellcome Trust Sanger Institute

8

HGP aims

  • identify and map all genes in the human DNA

  • Determine sequences

  • Store info in databases

  • Discover efficient technologies for data analysis

  • Sequence other genomes of medical importance e.g. mouse/yeast

9

Sequence content of human genomes

Transposons:

  • LINEs = long interspersed nuclear elements

-Encode reverse transcriptase

  • SINEs = short…

-Don’t encode reverse transcriptase; borrow it from LINEs

Microsatellites (STRs)

  • Repeats of 1-6 bp, typically repeated 5-50 times

Introns

  • non-coding DNA segments

10

Transposons

→ ‘jumping genes’ → try to make more of themselves

  • can replicate and insert into other parts of the genome

  • Discovered by Barbara McClintock

Two types:

  • retrotransposons = transpose via an RNA intermediate; the most common mobile elements in eukaryotes

  • DNA transposons = no RNA intermediate, ‘cut and paste’; the most common mobile elements in bacteria
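
The difference between the two types can be pictured with a toy genome held as a list. The sketch assumes retrotransposition behaves as "copy and paste" (the original stays put and a new copy is inserted), which is the usual shorthand but is not stated on the card; the function names are hypothetical.

```python
def cut_and_paste(genome: list[str], element: str, new_index: int) -> list[str]:
    """DNA transposon: the element is excised and reinserted elsewhere."""
    moved = list(genome)
    moved.remove(element)
    moved.insert(new_index, element)
    return moved


def copy_and_paste(genome: list[str], element: str, new_index: int) -> list[str]:
    """Retrotransposon (assumed model): a new copy is inserted, original stays."""
    copied = list(genome)
    copied.insert(new_index, element)
    return copied


genome = ["geneA", "TE", "geneB", "geneC"]
print(cut_and_paste(genome, "TE", 3))   # copy number stays at 1
print(copy_and_paste(genome, "TE", 3))  # copy number rises to 2
```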

11

Retrotransposons

→ transpose via an RNA intermediate, using reverse transcriptase

  • after transcription, the RNA is converted back to DNA by reverse transcriptase and inserted into another part of the genome

  • LINEs encode reverse transcriptase, SINEs don’t

  • Alu is a SINE making up approx. 11% of the human genome; it uses LINE-encoded reverse transcriptase because it doesn’t encode its own

12

Micro/minisatellites

Microsatellites = STRs, repeat units of 1-6 bp, repeated 5-50 times

Dinucleotide repeat = ATATAT

Trinucleotide repeat = ATGATGATG

Minisatellites = longer repeat units of 10-60 bp, also repeated 5-50 times
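
As a minimal sketch, microsatellites matching the definition above (unit of 1-6 bp, at least 5 tandem repeats) can be found with a regular-expression backreference. `find_strs` and its defaults are hypothetical, and the scan only handles uppercase A/C/G/T.

```python
import re


def find_strs(seq: str, min_unit: int = 1, max_unit: int = 6, min_repeats: int = 5):
    """Return (start, repeat_unit, repeat_count) for each tandem repeat found."""
    # Lazy unit match so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of that unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


print(find_strs("GGATATATATATCC"))  # [(2, 'AT', 5)]
```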

13

HGP outcomes

  • approx. 22,300 protein coding genes

  • Similar number to mouse; Drosophila = approx. 15,000

  • Many protein-coding genes in humans produce more than one type of protein via alternative splicing

  • Humans have more transcription factors and control elements

14

Prokaryotic genomes

Operons = cluster of genes with a single promoter

No introns

Non-coding DNA makes up only approx. 12% of the genome

Nonessential prokaryotic genes are commonly encoded on extrachromosomal plasmids

15

Illumina sequencing

  1. Library generation: adaptors (synthetic DNA) are added to the ends of the fragments with ligase

  2. Attach DNA to the surface of a flow cell

  3. Bridge PCR produces clusters after multiple cycles

  4. Add the 4 fluorescently labelled reversible terminator nucleotides (one per base); one is incorporated per cycle

  5. A picture is taken of the flow cell after each addition, using a powerful camera and microscope
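
Steps 4 and 5 can be sketched for a single cluster as a loop: one labelled base is incorporated per cycle, an image records its colour, and the read is decoded from the colours. The colour assignments and the name `sequence_by_synthesis` are assumptions for illustration, and the template is assumed to be written in the order it is copied.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical dye colours; the real assignments depend on the instrument.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}
COLOUR_TO_BASE = {colour: base for base, colour in DYE.items()}


def sequence_by_synthesis(template: str, cycles: int) -> str:
    """Read one cluster: one reversible terminator incorporated per cycle."""
    images = []
    for cycle in range(min(cycles, len(template))):
        base = COMPLEMENT[template[cycle]]  # complementary terminator is added
        images.append(DYE[base])            # camera records the cluster colour
        # The terminator and dye are then removed before the next cycle.
    return "".join(COLOUR_TO_BASE[colour] for colour in images)


print(sequence_by_synthesis("TACGGA", 4))  # ATGC
```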

16

Comparison of Sanger vs Illumina sequencing

Sanger

  • PCR or cloning amplification

  • Gel electrophoresis and fluorescence detection

  • 1000 bp sequences

  • High accuracy

  • Low throughput (1 million bp/day)

Illumina

  • Bridge PCR on flow cell amplification

  • Fluorescence detection on powerful camera/microscope

  • 100-200 bp sequences

  • Lower accuracy for a single read, but repeated reads of the same region improve it

  • Extremely high throughput
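
To put the Sanger throughput figure in context, here is a back-of-the-envelope calculation using the 3.2 billion bp genome size from the human genome card. It assumes a single machine, one pass over the genome, and no overlap between reads; `days_to_sequence` is a hypothetical helper.

```python
def days_to_sequence(genome_bp: int, bp_per_day: int) -> float:
    """Days needed to read every base once at a constant throughput."""
    return genome_bp / bp_per_day


days = days_to_sequence(3_200_000_000, 1_000_000)
print(days)                  # 3200.0
print(round(days / 365, 1))  # 8.8
```

Roughly nine years for a single pass helps explain why the far higher throughput of Illumina matters for whole-genome work.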

17

3rd generation sequencing

  • long-read sequencing - up to 10,000bp

  • Currently under active development

  • PacBio - a large machine

  • Oxford Nanopore Technologies - MinION, a handheld device

  • Used to fully complete the human genome in 2023

18

Chromosomal differences

Down’s syndrome = extra chromosome 21

Klinefelter syndrome = XXY (male); symptoms often only appear later in life

Turner’s syndrome = female lacking the second X chromosome

19

1,000 genomes project

  • launched in Jan 2008

  • International research effort to establish the most detailed catalogue of human genetic variation by 2020

  • 2500 unrelated humans sequenced

20

Resequencing with 2nd/3rd gen

→ to identify genetic variants

  • whole genome sequencing= sequence the genome from genomic DNA

  • Exome/transcriptome sequencing = sequence only the expressed genes, from cDNA (made from mRNA)

  • Spotting differences → align the short sequences to the reference genome to identify differences

  • Identify genetic variants = alleles, SNPs (single nucleotide differences) etc.
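
The "align, then spot differences" idea can be sketched with a naive aligner that slides a read along the reference and keeps the position with the fewest mismatches. Real aligners use indexed, gap-aware algorithms; this toy version ignores indels, and both function names are hypothetical.

```python
def best_alignment(reference: str, read: str) -> tuple[int, int]:
    """Return (position, mismatches) for the best ungapped placement of read."""
    best_pos, best_mismatches = 0, len(read) + 1
    for pos in range(len(reference) - len(read) + 1):
        window = reference[pos:pos + len(read)]
        mismatches = sum(1 for a, b in zip(window, read) if a != b)
        if mismatches < best_mismatches:
            best_pos, best_mismatches = pos, mismatches
    return best_pos, best_mismatches


def call_differences(reference: str, read: str) -> list[tuple[int, str, str]]:
    """List (reference_position, reference_base, read_base) for each mismatch."""
    pos, _ = best_alignment(reference, read)
    return [
        (pos + i, reference[pos + i], base)
        for i, base in enumerate(read)
        if reference[pos + i] != base
    ]


reference = "GATTACAGGCTTAACG"
read = "ACATGCTT"  # matches positions 4-11 except for one base
print(call_differences(reference, read))  # [(7, 'G', 'T')]
```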

21

Common genetic variants

SNPs: single nucleotide polymorphisms

  • Substitution of a single nucleotide at a specific position

Indels:

  • insertions and deletions, INDELs

Most genetic variants = no phenotypic effect

  • intergenic region = silent

  • Non-coding regions may be silent or affect gene expression, e.g. promoters

Copy number variants (CNVs)

  • variations in which part of the genome is either deleted or repeated
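
A small sketch of how the SNP and indel categories could be told apart from a pair of alleles. It assumes variants are given as reference and alternative allele strings (a convention borrowed from variant-call files, not from the card); `classify_variant` is a hypothetical name, and CNVs are not covered because they cannot be judged from short alleles alone.

```python
def classify_variant(ref: str, alt: str) -> str:
    """Classify a variant from its reference and alternative alleles."""
    if len(ref) == 1 and len(alt) == 1 and ref != alt:
        return "SNP"        # one nucleotide substituted for another
    if len(alt) > len(ref):
        return "insertion"  # extra bases present in the sample
    if len(alt) < len(ref):
        return "deletion"   # bases missing from the sample
    return "other"          # identical or multi-base substitution


for ref, alt in [("A", "G"), ("A", "ATG"), ("ATG", "A")]:
    print(ref, alt, classify_variant(ref, alt))
```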