8 - Genomes

18 Terms

1

Difference between genomics and genetics

Genomics

  • larger scale and complexity of data/datasets

  • closely related to computer/technology and bioinformatics

2

C0t analysis / reassociation kinetics

  • Estimates the complexity and repetitive content of DNA → see how quickly the strands anneal

  • extract DNA → heat the DNA and fragment it into single strands → let it cool and measure reassociation → plot % ssDNA vs. log C0t

  • x-axis: log10 C0t; y-axis: % single-stranded DNA

  • more “bumps” with smaller dropoffs = less repetitive DNA; a bigger dropoff with fewer bumps = more repetitive DNA
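
The shape of the curve can be sketched with the ideal second-order reassociation model, in which the fraction of DNA still single-stranded is 1 / (1 + k·C0t). This is only a minimal sketch: the rate constants and C0t values below are made-up illustrative numbers, not measurements, and real genomes are mixtures of several kinetic classes.

```python
def fraction_single_stranded(c0t, k):
    """Fraction of DNA still single-stranded under ideal second-order kinetics."""
    return 1.0 / (1.0 + k * c0t)


def c0t_half(k):
    """C0t value at which half of the DNA has reassociated (equals 1/k)."""
    return 1.0 / k


# Hypothetical rate constants: repeats find partners quickly, unique DNA slowly.
K_REPETITIVE = 100.0
K_UNIQUE = 0.01

for c0t in (0.001, 0.01, 0.1, 1.0, 10.0, 100.0, 1000.0):
    rep = 100 * fraction_single_stranded(c0t, K_REPETITIVE)
    uniq = 100 * fraction_single_stranded(c0t, K_UNIQUE)
    print(f"C0t={c0t:>8}: repetitive {rep:5.1f}% ss, unique {uniq:5.1f}% ss")
```

In this sketch the repetitive class drops off at low C0t and the unique class only at high C0t, which is why the position of each dropoff on the log C0t axis says something about copy number.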

3

Genetic Map

  • based on recombination frequency

  • measure relative positions of genes + affected by recombination hotspots

  • made by looking at crossing over

  • “How often trains switch tracks between stations”
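
As a rough illustration of how recombination frequency becomes a map distance, the sketch below uses the common convention that 1% recombinant offspring corresponds to about 1 centimorgan (cM). The offspring counts are hypothetical, and the conversion is only a reasonable approximation for genes that are fairly close together.

```python
def map_distance_cm(recombinants, total_offspring):
    """Approximate genetic distance in cM from a count of recombinant offspring."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinants <= total_offspring:
        raise ValueError("recombinants must be between 0 and total_offspring")
    return 100.0 * recombinants / total_offspring


# Hypothetical cross: 120 recombinant offspring out of 1000 scored.
print(map_distance_cm(120, 1000), "cM")
```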

4

Physical Mapping (STS, contig, BAC, fingerprinting, FPC, YAC, two ways to close gaps (library and PCR))

  • ordered collection of clones from a genomic library → find the order of, and physical distance (in base pairs) between, DNA markers

  • Involves:

    • BAC - bacterial artificial chromosome, cloning vector used for physical maps

    • STS - sequence-tagged site, a marker that hybridizes to only one location in the genome

    • contig - set of overlapping DNA clones

    • Fingerprint - unique pattern of restriction fragments used to orient and order clones

    • FPC - fingerprinted contigs, contigs made with fingerprinting

    • YAC - yeast artificial chromosome, used to potentially fill in gaps, since some sequences are toxic in BACs (and vice versa)

    • Filling in gaps in the physical map

      • 1) long-range PCR

      • 2) find clones in genomic libraries that hybridize to the ends of contigs

5

Physical Map - X-value and coverage

  • the X-value denotes how many times over the sequence has been sampled (redundancy)

  • ex: 3X coverage means each base is covered about 3 times

  • Coverage = 1 - e^(-x)

  • e^(-x) = fraction not covered
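
The coverage formula on this card can be checked numerically. The snippet below is a small sketch that evaluates 1 - e^(-x) for a few X-values; the particular X-values are arbitrary examples.

```python
import math


def fraction_covered(x):
    """Expected fraction of the genome covered at X-fold redundancy."""
    return 1.0 - math.exp(-x)


def fraction_missed(x):
    """Expected fraction of the genome not covered at X-fold redundancy."""
    return math.exp(-x)


for x in (1, 3, 5, 8):
    print(f"{x}X: {fraction_covered(x):.4f} covered, {fraction_missed(x):.4f} not covered")
```

At 3X this gives roughly 95% covered and 5% missed, which is why gaps remain even when each base is sampled about 3 times on average.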

6

Reference genome and why it isn’t representative

  • complete sequence of an organism's chromosomes

  • artificial because

    • involves many different people’s genomes

    • haploid

7

Sanger era (golden path, finishing, scaffold, shotgun seq) + issues of Sanger

  • Human genome mainly used Sanger sequencing

  • picked the “golden path” through the given physical map

  • BAC clones sequenced with shotgun sequencing (randomly chosen sub-clones of each BAC)

  • aligned sequences were grouped into contigs → single consensus sequence

  • finishing - filling in the gaps between contigs, often with PCR

  • scaffolds - contigs connected logically (order and orientation) but not by contiguous sequence

  • Issues

    • 1) needed to create a physical map first

    • 2) had to subclone BACs to sequence them

8

Short-read era (advantages + disadvantages of short reads, N50)

  • used PCR instead of cloning DNA

  • SOLiD, Illumina, pyrosequencing

  • Advantage → cheap, high coverage

  • Disadvantage → short reads (100-150 bp) and the genome has many repeats, making assembly difficult

  • N50 → measure of the quality and contiguity of a genome assembly

    • add up the total size of all contigs

    • sort contigs from largest to smallest and add them up until the running sum reaches half of the total size

    • the length of the last contig added to reach the 50% threshold is the N50
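
The N50 steps above can be sketched as a short function. The contig lengths in the example are hypothetical, and sorting from largest to smallest before accumulating is the usual convention, stated here as an assumption since the card does not spell it out.

```python
def n50(contig_lengths):
    """Length of the contig at which the running sum reaches half the assembly size."""
    if not contig_lengths:
        raise ValueError("need at least one contig")
    ordered = sorted(contig_lengths, reverse=True)  # largest contigs first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical assembly: total 290, half is 145, reached by the second contig.
print(n50([80, 70, 50, 40, 30, 20]))
```

A larger N50 means more of the assembly sits in long contigs, so it is usually read as a sign of better contiguity.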

9

Long-read era

  • PacBio and Oxford Nanopore

  • very long reads with high error rates → long reads are still better for genome assembly

10

Genome facts

  • largest genome → amoeba, 670 Gb

  • Pufferfish have more genes than humans but 1/10th the genome size

  • S. cerevisiae has only a few hundred introns in 6.3k genes; another fungus, C. neoformans, has 5.3 introns per gene

  • C. elegans has trans and cis splicing

  • most common repeat in pine is 5% of the genome; most common repeat in corn is 75% of the genome

  • introns of animal genomes are huge, but not those of plants or fungi

  • some highly conserved sequences in mammals don't code for protein

11

Tandem Repeats - micro/minisatellite, replication slippage

  • occur next to each other

  • microsatellite - a couple of nt long

  • minisatellite - 10-60 bp long

  • long tandem repeats - duplications or inversions

  • satellites grow and shrink via replication slippage - DNA pol slips, causing an indel

12

Interspersed repeats - gene conversion

  • distributed all over the genome

  • can copy to new locations in the genome and disrupt normal genes

  • “junk DNA”

  • combat gene conversion

  • interspersed repeats find their way into introns and make genes look dissimilar, preventing homologous recombination and gene conversion

13

RNA transposons (LTR, non-LTR (LINE and SINE))

  • also known as retrotransposons; common in eukaryotic genomes, 42% of the human genome

  • get their name from going through an RNA intermediate, from RNA → DNA

  • RNA transposon types

    • 1) LTR (long terminal repeats)

      • have long terminal repeats and encode enzymes allowing self-replication and integration into the genome

      • LTRs are a couple hundred nucleotides long and the enzyme-coding region is about 5 kb; look similar to retroviruses

    • 2) Non-LTR

      • include LINEs (long interspersed nuclear elements), 21% of genome

      • LINE1 - 6 kb long, RT activity

      • many LINEs are truncated because RT falls off before completing the transcript → explains nonfunctionality

      • another type → SINE, 100-700 nt long

      • in the human genome, SINEs = Alu elements → 300 nt, about 1 million copies present, about 10% of the genome

      • SINEs originate from small RNAs transcribed by RNA pol III

14

DNA transposons - Helitrons and Polintons

  • don't use an RNA intermediate

  • two types

    • Helitrons - eukaryotic DNA transposons, make copies via rolling circle

    • Polintons - 15-20 kb long and related to viruses

15

Pseudogene - retropseudogenes/processed, non-processed, unitary, pseudo-pseudogene

  • look like broken protein-coding genes, 4 major classes

  • 1) Processed pseudogenes (retropseudogenes)

    • mRNA reverse transcribed to DNA and inserted into the genome

    • contain a poly-A tail and have no introns

    • often formed from highly expressed genes

    • can create a higher BLASTX score than regular DNA

  • 2) Non-processed

    • DNA pseudogenes

    • result of incomplete duplication

    • retain the original sequence and look like real genes

  • 3) Unitary

    • gene became nonfunctional without duplication

    • ex: GULO, why we need vitamin C

  • 4) Pseudo-pseudogenes

    • genes seem broken by a nonsense mutation but may actually function

    • stop codons are read through

16

Variation - SNP, SNV, SV, CNV

  • humans vary 1 bp/1000 bp (human/chimp = 1/100)

  • replication errors 1/100,000 bp (30,000 errors per haploid genome)

  • SNPs (single nucleotide polymorphisms)

    • present in 1% or more of the population

    • often used as markers for GWAS

    • SNV → similar but can be rare

    • classified by where they occur in the genome (ex: non-coding, coding, indel)

  • SV (structural variant)

    • genetic differences that make larger changes to the genome

    • ex: indel, inversion, translocation

    • CNV (copy number variant) → region of a chromosome where there is a difference in the number of repeats

      • some associated diseases include Fragile X, Dup15q

    • SVs are harder to assess than SNPs
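
The per-genome numbers on this card follow from simple arithmetic. The sketch below assumes a haploid genome of about 3 billion bp (the size given on the human genome trivia card) and applies each "1 per N bp" rate to it. The human-human and human-chimp totals are derived here from those rates and are not stated in the notes.

```python
HAPLOID_GENOME_BP = 3_000_000_000  # assumed: ~3 billion bp haploid human genome


def expected_sites(genome_bp, one_in):
    """Expected number of sites for a rate of 1 per `one_in` bp."""
    return genome_bp // one_in


replication_errors = expected_sites(HAPLOID_GENOME_BP, 100_000)
human_human_differences = expected_sites(HAPLOID_GENOME_BP, 1_000)
human_chimp_differences = expected_sites(HAPLOID_GENOME_BP, 100)

print(f"replication errors per haploid genome: {replication_errors:,}")
print(f"differences between two humans: {human_human_differences:,}")
print(f"differences between human and chimp: {human_chimp_differences:,}")
```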

17

Human genome trivia

  • Size - 3 billion bp

    • Longest chromosomes: 1 - 249 Mbp, 2 - 242 Mbp, 3 - 198 Mbp

    • Shortest chromosomes: 21 - 47 Mbp, 22 - 51 Mbp, Y - 57 Mbp

  • Genes

    • 20K protein-coding genes

    • 1% of genome corresponds to protein-coding DNA

    • unknown number of RNA genes

  • Repeats - >50% of the genome is repetitive

    • 13% SINE (11% Alu)

    • 20% LINE (17% LINE1)

    • 8% LTR transposons

    • 3% DNA elements

    • 3% SSR

    • 3% Duplications

  • variation - 1bp/1000bp

18

Metagenomics - 16S rDNA, rarefaction curve, OTU

  • Environmental sequencing

  • look at 16S rDNA, part of small ribosomal subunit

  • rarefaction curve → unique OTUs (operational taxonomic units/unique sequences) on the y-axis vs. # of sequences on the x-axis

    • tells us how many different species are present
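
A rarefaction curve can be sketched by counting how many distinct OTUs have been seen as more sequences are sampled. The OTU labels and counts below are hypothetical, and this minimal version uses a single random ordering of the reads, whereas real analyses typically average over many random subsamples.

```python
import random


def rarefaction_curve(otu_labels, step=10, seed=0):
    """Return (sequences sampled, unique OTUs seen) points for one random ordering."""
    rng = random.Random(seed)
    shuffled = list(otu_labels)
    rng.shuffle(shuffled)
    seen = set()
    points = []
    for i, otu in enumerate(shuffled, start=1):
        seen.add(otu)
        if i % step == 0 or i == len(shuffled):
            points.append((i, len(seen)))
    return points


# Hypothetical sample: 100 16S reads assigned to 4 OTUs of unequal abundance.
reads = ["OTU_A"] * 50 + ["OTU_B"] * 30 + ["OTU_C"] * 15 + ["OTU_D"] * 5
for n_sequences, n_otus in rarefaction_curve(reads):
    print(n_sequences, n_otus)
```

When the curve flattens out, sampling more sequences is no longer turning up new OTUs, which suggests most of the species present have been found.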
