400 lec 1b & 2a

Last updated 7:48 PM on 3/13/26

43 Terms

1
New cards

large scale sequencing

  • random fragmentation (cut DNA and add tags to build a genomic library)

  • sequence fragments

  • assemble (resequence) → de novo

  • gap filling/error check

  • redundancy = more accurate
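
The "redundancy" point can be put as a number: average coverage is roughly (number of reads × read length) / genome size. A minimal sketch; the function name and the example numbers are made up for illustration, not from the lecture.

```python
def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is expected to be sequenced."""
    return num_reads * read_length / genome_size


# Hypothetical run: 1,000,000 reads of 150 bp over a 5 Mb genome
print(mean_coverage(1_000_000, 150, 5_000_000))  # 30.0
```

Higher coverage means each position is read more times, which is what makes error checking and gap filling possible.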

2
New cards

what is required for DNA sequencing

DNA

primer

nucleotides

DNA polymerase

nucleotide analogs (terminators) ddNTPs

3
New cards

de novo assembly

build a genome without a reference

4
New cards

redundancy

reduces sequencing errors

improves assembly

helps resolve repeats
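
Why redundancy reduces errors: if several reads cover the same position, a majority vote usually outvotes a single bad base call. A rough sketch, assuming the reads are already aligned and equal in length (real assemblers do much more); the reads are made up.

```python
from collections import Counter


def consensus(aligned_reads: list[str]) -> str:
    """Majority base at each column of equal-length, pre-aligned reads."""
    columns = zip(*aligned_reads)
    return "".join(Counter(col).most_common(1)[0][0] for col in columns)


reads = [
    "ACGTTAGC",
    "ACGTTAGC",
    "ACCTTAGC",  # error at the 3rd base
    "ACGTTAGC",
    "ACGTAAGC",  # error at the 5th base
]
print(consensus(reads))  # ACGTTAGC
```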

5
New cards

DNA sequencing - SANGER (requirements)

  • interrupt DNA synthesis (chain termination)

  • use mixture of ddNTPs and dNTPs

  • highly accurate

  • need DNA template, primer, nucleotides, and DNA polymerase

  • nucleotide analogs (terminators) stop the reaction at each base

6
New cards

dNTPs

normal nucleotides

- have 3’ OH group

7
New cards

ddNTPs

chain terminators (nucleotide analogs)

- lack 3’ OH group (have H instead)

8
New cards

DNA sequencing - SANGER (mechanism)

  • ddNTPs lack a 3′-OH group

  • Once incorporated → chain termination

  • Produces fragments that differ in length by exactly one base
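
The mechanism above can be sketched as a toy simulation: every position where a ddNTP could be incorporated gives one fragment, and sorting the fragments by length reads out the new strand. This ignores the primer and assumes a termination happens at every position; the template is invented.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template: str) -> list[tuple[int, str]]:
    """Return (fragment_length, terminal ddNTP) for each termination point.

    The template is written 3'->5', so the new strand grows left to right.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template)
    return [(i + 1, base) for i, base in enumerate(new_strand)]


def read_sequence(fragments) -> str:
    """Sort fragments by length (like a gel/capillary) and read the end bases."""
    return "".join(base for _, base in sorted(fragments))


fragments = sanger_fragments("TACGGA")
# Reverse to mimic an unordered mixture; sorting by length restores the order.
print(read_sequence(reversed(fragments)))  # ATGCCT
```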

9
New cards

ddNTP vs dNTP

ddNTP = terminates DNA synthesis because it lacks a 3′-OH group

dNTP = allows continued DNA chain elongation because it HAS a 3′-OH group

10
New cards

sanger pros and cons

Result

  • A mixture of DNA fragments of varying lengths

  • Fragment length identifies base position

Strengths

  • Extremely low error rate (<0.1%)

  • Gold standard for sequence confirmation

  • Ideal for single-gene studies

Limitations

  • Low throughput

  • Short read lengths (<800 bp)

**good for confirming sequences

11
New cards

NGS Next gen sequencing

  • Extremely high throughput (~1.8 trillion bases / 72 hrs)

  • Produces >1 billion reads

  • Read length: ~150 bp (up to ~100 kb with Nanopore)

  • Higher error rate (~0.26–15%)

  • Used for:

    • Whole-genome resequencing

    • Population genomics

    • Metagenomics

***SCALE AND DISCOVERY

12
New cards

Oxford Nanopore (MinION)

  • Very long reads (up to ~100 kb)

  • Higher error rate than other methods

  • Useful for:

    • Genome assembly

    • Structural variants

    • Scaffolding

    • For nucleic acids (DNA or RNA)

***Contrast: Sanger = ACCURACY & CONFIRMATION

13
New cards

de Bruijn Graphs (Short Reads)

  • Reads are broken into k-mers

  • Overlapping k-mers form a graph

  • Paths through graph = assembled sequence

Efficient but sensitive to sequencing errors
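
A small sketch of the idea, using the common convention that nodes are (k-1)-mers and each k-mer is an edge between its prefix and suffix. The reads, the value of k, and the simple walk (which only follows unbranched paths) are illustrative assumptions.

```python
def de_bruijn(reads: list[str], k: int) -> dict[str, set[str]]:
    """Map each (k-1)-mer prefix to the set of (k-1)-mer suffixes that follow it."""
    graph: dict[str, set[str]] = {}
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph.setdefault(kmer[:-1], set()).add(kmer[1:])
    return graph


def walk(graph: dict[str, set[str]], start: str) -> str:
    """Follow the path from start while each node has exactly one successor."""
    sequence, node, seen = start, start, {start}
    while node in graph and len(graph[node]) == 1:
        node = next(iter(graph[node]))
        if node in seen:  # stop on a cycle (e.g., a repeat)
            break
        seen.add(node)
        sequence += node[-1]
    return sequence


graph = de_bruijn(["ACGT", "CGTA"], k=3)
print(walk(graph, "AC"))  # ACGTA
```

A single wrong base creates up to k false k-mers, which is why this approach is sensitive to sequencing errors.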

14
New cards

stages of genome assembly

  1. Reads – raw sequencing output

  2. Contigs – continuous assembled sequences

  3. Scaffolds – ordered contigs with gaps (“NNNN”)

  4. Chromosome-level assembly – near-complete genome

Reference genome – official, annotated version used by the community

15
New cards

contig N50

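60 kbp

  • The shortest contig length such that 50% of the assembled genome is contained in contigs of that size or larger

  • i.e., sort contigs longest to shortest and add up lengths; N50 is the length of the contig that brings the running total to 50% of the total assembly length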
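
The definition can be checked with a short calculation. A minimal sketch; the contig lengths are made-up values in kbp.

```python
def n50(contig_lengths: list[int]) -> int:
    """Length of the contig at which the running total reaches half the assembly."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


# Total = 290, half = 145; 80 + 70 = 150 reaches half, so N50 = 70
print(n50([80, 70, 50, 40, 30, 20]))  # 70
```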

16
New cards

BUSCO

  • Benchmarking Universal Single Copy Orthologs

  • Measures genome completeness

>95% score = high-quality assembly

17
New cards

genome annotation tools

  • Phylosift

  • Blast2GO

  • Use annotated reference genomes as a guide

  • Comparative genomics (evolutionary conservation)

  • Transcript (RNA-seq) evidence + Protein homology

**provides a roadmap for the genome

18
New cards

databases

  • Primary databases: raw data archives (e.g., sequence reads)

  • Secondary databases: curated, interpreted data

  • Relational databases: link information across databases

  • Local databases: built from your own NGS data (next gen sequencing)

19
New cards

genetic database landmarks

  • 1965: Atlas of Protein Sequences (Dayhoff)

  • Protein Information Resource (PIR)

  • EMBL (1982)

  • Human Genome Project

  • GenBank + DDBJ + EMBL collaboration

NCBI Entrez → integrated database access

20
New cards

GenBank file anatomy

  1. Header – metadata, organism, accession number

  2. Features – genes, CDS, regulatory elements

  3. Nucleotide sequence
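
The three parts can be pulled apart by looking for the FEATURES and ORIGIN keywords and the closing "//" line. A rough sketch only; the record below is an invented demo, not a real accession, and real files are better read with a dedicated parser.

```python
def split_genbank(record: str) -> dict[str, list[str]]:
    """Split a GenBank flat-file record into header, features, and sequence lines."""
    sections: dict[str, list[str]] = {"header": [], "features": [], "sequence": []}
    current = "header"
    for line in record.splitlines():
        if line.startswith("FEATURES"):
            current = "features"
        elif line.startswith("ORIGIN"):
            current = "sequence"
        elif line.startswith("//"):
            break  # end of record
        else:
            sections[current].append(line)
    return sections


def extract_sequence(sequence_lines: list[str]) -> str:
    """Drop the position numbers and spaces, keeping only the bases."""
    return "".join(ch for line in sequence_lines for ch in line if ch.isalpha())


# Hypothetical record for illustration
record = """\
LOCUS       DEMO0001          12 bp    DNA     linear
DEFINITION  Hypothetical demo record.
ACCESSION   DEMO0001
FEATURES             Location/Qualifiers
     CDS             1..12
ORIGIN
        1 atggcctaat ga
//
"""
parts = split_genbank(record)
print(len(parts["header"]), extract_sequence(parts["sequence"]))  # 3 atggcctaatga
```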

21
New cards

FASTA Format

  • > header line

  • Sequence lines below

Simple, widely used for alignment and analysis
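
Because the format is just a ">" header followed by sequence lines, a reader fits in a few lines. A minimal sketch with invented records; it keys records by the full header text.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Return {header: sequence}, joining sequence lines under each '>' header."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


example = ">seq1 hypothetical example\nACGT\nTTGA\n>seq2\nGGCC\n"
print(parse_fasta(example))
# {'seq1 hypothetical example': 'ACGTTTGA', 'seq2': 'GGCC'}
```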

22
New cards

database errors

sources

  • PCR introduces mutations

  • PCR amplifies the wrong organism

  • Gene families → homology confusion

Incorrect taxonomic assignment

23
New cards

PCR

lab technique that makes millions of copies of specific DNA segments (like a photocopier)

24
New cards

FASTA N

any nucleotide

25
New cards

FASTA R

purine

A

G

26
New cards

FASTA Y

pyrimidines

C

T/U

27
New cards

FASTA -

gap of indeterminate length

28
New cards

FASTA *

translation stop

**in amino acid (protein) sequences
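
The nucleotide codes from the cards above (N, R, Y) can be treated as sets of allowed bases. A small sketch covering only those codes plus A/C/G/T; the function name is made up.

```python
# Only the codes from these cards; the full IUPAC table has more.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG",    # purine
    "Y": "CT",    # pyrimidine
    "N": "ACGT",  # any nucleotide
}


def matches(pattern: str, sequence: str) -> bool:
    """True if every base in sequence is allowed by the code at that position."""
    if len(pattern) != len(sequence):
        return False
    return all(base in IUPAC[code] for code, base in zip(pattern, sequence))


print(matches("RYN", "ACT"))  # True
print(matches("RYN", "CCT"))  # False (C is not a purine)
```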

29
New cards

linear model of progressive evolution

 (old view): evolution was once thought of as a straight line from “simple → complex.”

30
New cards

linear model modern view

 genomes change in a branching, tree-like pattern, reflecting common ancestry and divergence over time

**evolution is represented with phylogenetic trees, not ladders

31
New cards

newick nomenclature

  • A text-based format for representing phylogenetic trees.

  • Uses parentheses and commas to show branching relationships.

Common in computational biology and tree-building software.
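
For example, "((human,chimp),mouse);" says human and chimp group together, with mouse outside. A minimal sketch of reading that into nested lists; it assumes names only (no branch lengths or quoted labels) and is not a full Newick parser.

```python
def parse_newick(text: str):
    """Parse a simple Newick string into nested lists of leaf names."""
    s = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # skip "("
            children = [parse_node()]
            while s[pos] == ",":
                pos += 1  # skip ","
                children.append(parse_node())
            pos += 1  # skip ")"
            # Skip an optional internal node label
            while pos < len(s) and s[pos] not in ",()":
                pos += 1
            return children
        start = pos
        while pos < len(s) and s[pos] not in ",()":
            pos += 1
        return s[start:pos]

    return parse_node()


print(parse_newick("((human,chimp),mouse);"))  # [['human', 'chimp'], 'mouse']
```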

32
New cards

dichotomy

  • Modern phylogenetics focuses on relationships and ancestry, not ranking organisms

33
New cards

types of phylo trees

  • cladogram

  • phylogram

  • dendrogram

34
New cards

cladogram

  • Shows branching order only.

  • Branch lengths have no meaning.

Emphasizes shared ancestry.

35
New cards
36
New cards

phylogram

  • Branch lengths are proportional to the amount of evolutionary change.

  • Reflects genetic distance

37
New cards

dendrogram (ultrametric tree)

  • All tips are the same distance from the root.

  • Assumes a molecular clock (equal rates of evolution)
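
The "all tips are the same distance from the root" property is easy to test. A sketch under assumptions: trees are written as (name, branch_length, children) tuples, and both example trees are invented.

```python
def tip_distances(node, distance: float = 0.0) -> dict[str, float]:
    """Root-to-tip distance for every tip of a (name, length, children) tree."""
    name, length, children = node
    distance += length
    if not children:
        return {name: distance}
    result: dict[str, float] = {}
    for child in children:
        result.update(tip_distances(child, distance))
    return result


def is_ultrametric(tree, tolerance: float = 1e-9) -> bool:
    """True if all tips are (nearly) the same distance from the root."""
    distances = list(tip_distances(tree).values())
    return max(distances) - min(distances) <= tolerance


clock_tree = ("root", 0.0, [
    ("AB", 1.0, [("A", 2.0, []), ("B", 2.0, [])]),
    ("C", 3.0, []),
])
phylogram = ("root", 0.0, [
    ("AB", 1.0, [("A", 2.0, []), ("B", 3.5, [])]),
    ("C", 3.0, []),
])
print(is_ultrametric(clock_tree))  # True
print(is_ultrametric(phylogram))   # False
```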

38
New cards

homology

  • Similarity due to shared ancestry.

  • Example: shared genes between cats and whales despite different functions.

  • Function ≠ homology; ancestry matters more than what the gene currently does.

39
New cards

homoplasy

  • Similarity not due to shared ancestry.

  • Arises from convergent evolution or reversals.

40
New cards

gene dupe & homology

  • Gene duplication complicates homology because multiple copies exist.

  • After duplication, copies can evolve independently and take on new functions.

  • orthologs

  • paralogs

  • xenologs

Example: Human Hemoglobin (Hb) Gene Family

  • Originated through gene duplication events.

  • Different Hb genes are paralogs.

Functional diversification allows different hemoglobins to act at different developmental stages (e.g., fetal vs adult Hb).

41
New cards

ortholog

gene copy

  • Genes in different species that diverged via speciation.

Usually retain similar function.

42
New cards

paralog

gene copy

  • Genes related by duplication within a genome.

Often evolve new or specialized functions.

43
New cards

xenolog

gene copy

  • Genes acquired via horizontal gene transfer.
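
One way to keep the three terms straight: look at the event where two gene copies last shared an ancestor. A simplified sketch, assuming a gene tree whose internal nodes are already labeled with the event; the tree and gene names are a toy version of the hemoglobin example, not real data.

```python
RELATION = {
    "speciation": "orthologs",
    "duplication": "paralogs",
    "transfer": "xenologs",  # horizontal gene transfer
}


def leaves(node) -> set[str]:
    """All gene names under a node; a leaf is just a string."""
    if isinstance(node, str):
        return {node}
    _, children = node
    return set().union(*(leaves(child) for child in children))


def relation(node, gene_a: str, gene_b: str) -> str:
    """Classify two genes by the event at their last common ancestor node."""
    event, children = node
    for child in children:
        if {gene_a, gene_b} <= leaves(child):
            return relation(child, gene_a, gene_b)
    return RELATION[event]


# Toy tree: one duplication (alpha vs beta), then speciation in each copy
globin_tree = ("duplication", [
    ("speciation", ["human_HBA", "mouse_Hba"]),
    ("speciation", ["human_HBB", "mouse_Hbb"]),
])
print(relation(globin_tree, "human_HBA", "mouse_Hba"))  # orthologs
print(relation(globin_tree, "human_HBA", "human_HBB"))  # paralogs
```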
