148 final (Set 4)

0.0(0)

Studied by 0 people

Call Kai

Learn

Practice Test

Spaced Repetition

Match

Flashcards

Knowt Play

Card Sorting

1/62

There's no tags or description

Looks like no tags are added yet.

Last updated 8:52 AM on 5/12/26

Name	Mastery	Learn	Test	Matching	Spaced	Call with Kai

No analytics yet

Send a link to your students to track their progress

63 Terms

New cards

CAST

type V-K CRISPR associated transposon, targets but doesn’t cut so transposon can hop into a specific site. efficient and promising

targeted insertion of DNA is hard bc u need host cell repair machinery

CAST integrates DNA into target site at 80% success rate w NO selection

type V-K crispr-cas ALWAYS encoded within a tn7 like transposon

tns have their roles, tniQ the “communicator” between Cas9 and tns

Cas system only has tracrRNA and crRNA

kinda similar to dCas9-effector fusions

New cards

tn7 type transposon

has TnsA-E

tnsA/B = transposon
tnsCDE = accessory transposition genes
cargo genes next to them, flanked by IRs (RE/LE)

New cards

RNA guided DNA transposition / experiment example

TnsB/C and TniQ recognize IRs in Tn7 like transposon and excise it

tns/transposon complex binds to cas12K/crRNA/tracrRNA complex which targets transposon for insertion at site specified by crRNA

experiment:

pHelper has the crRNA, tnsB/C/TniQ
pDonor has just kanR and repeats
pHelper goes and snatches kan R through process above, expectation was that if cas12k and crRNA direct Tn insertion they’d see the kanR at the crRNA site.

<p>TnsB/C and TniQ recognize IRs in Tn7 like transposon and excise it</p><p>tns/transposon complex binds to cas12K/crRNA/tracrRNA complex which targets transposon for insertion at site specified by crRNA</p><p></p><p>experiment:</p><ul><li><p>pHelper has the crRNA, tnsB/C/TniQ</p></li><li><p>pDonor has just kanR and repeats</p></li><li><p>pHelper goes and snatches kan R through process above, expectation was that if cas12k and crRNA direct Tn insertion they’d see the kanR at the crRNA site.</p></li></ul><p></p>

New cards

CAST advantages

programmability of CRISPR + insertion capability of a transposon, don’t need to worry about homologous recombination machinery

transposition is more efficient + only requires transposase enzymes —> can happen in bacteria with low HR efficiency, doesn’t require a template homologous to target gene.

New cards

how we can determine which microbes are present in a certain envrionment

culture dependent (isolate microbe, study in lab)

culture independent (PCR based & 16s rRNA/metagenomics/transcriptomics)

16s rRNA profiling

New cards

why are microbes hard to culture sometimes

might have specific growth requirements

might grow super slow (esp relative to other microbes in sample)

bacteria might need presence of other organisms to grow

New cards

16s rRNA profiling

culture independent

9 variable regions and conserved regions

used to ID organisms

PCR amplify 16s gene using universal primers, compare to known 16s sequences, if they’re 97% similar —> same species

community analysis gives you:

species richness (distinct) & abundance (quantity)

<p>culture independent</p><p>9 variable regions and conserved regions</p><p>used to ID organisms</p><p>PCR amplify 16s gene using universal primers, compare to known 16s sequences, if they’re 97% similar —> same species</p><p></p><p>community analysis gives you:</p><ul><li><p>species richness (distinct) & abundance (quantity)</p></li></ul><p></p>

New cards

Human Microbiome Project

goals

1) see how many microbial species there are in healthy adults using 16s profiling and metagenomics.

2) see dynamic changes that occur in microbiome of host under:

pregnancy/preterm birth, inflammatory bowel disease, stressors in folk w prediabetes

big pic: how does microbiome impact human health / disease, can we make a reference microbiome for typical individual

major conclusions:

there is NO REFERENCE MICROBIOME for healthy adults
- major variability
one body site in individual is more similar to same site on someone else than different site on same person
- tongue to someone elses tongue > tongue to same persons foot
Higher inter individual variation at species / strain level
- ex: streptococcus main in oral, but theres a ton of types of them

New cards

GI Tract microbiome

GI tract has most diverse microbiome (gut has 10²/g, 300-500 species present), commensal relationship (they benefit, neutral for us)
colon is STRICTLY ANOXIC (no o2)
- anoxia promotes growth of obligate anaerobes like clostridium (firmicute) and bacteroides (bacterioidetes)
- ^ ferment products of complex carbohydrates catabolism
- phylum level diversity is low

big 3: firmicutes, bacteroidetes, proteobacteria

FB dominate in abundance

Roles:

develop early immune system
digestion
pathogen protection
influence how we process therapeutic drugs

New cards

factors about human microbiome that make research had

New cards

Auto-brewery syndrome / gut fermentation syndrome

consumed carbs are converted to alcohol by gut microbiota

candida / saccharomyces (fungal) & klebsiella pnumoniae (bacterial)

New cards

Bacteroides (GI microbiome)

obligate anaerobes fermentative species

Saccharolytic —> ferment sugars / proteins to acetate/succinate

bacteroides thetaiotaomicron:

prominent in large intestine
specializes in degradation of complex polysaccharides
increases diversity of plant polymers that can be degraded by human digestive tract

<p>obligate anaerobes fermentative species</p><p>Saccharolytic —> ferment <strong>sugars / proteins</strong> to <strong>acetate/succinate</strong></p><p></p><p>bacteroides thetaiotaomicron: </p><ul><li><p>prominent in <strong>large intestine</strong></p></li><li><p>specializes in <strong>degradation of complex polysaccharides</strong></p></li><li><p>increases diversity of plant polymers that can be degraded by human digestive tract</p></li></ul><p></p>

New cards

germ free mice

introduce specific microbes into them to see composition and effects

monocolonization, defined cocktail, complex

New cards

backhed et all (fat microbiome transfer)

gut microbiota is an important environmental factor affecting energy harvest from diet and energy storage in host.

<p><strong>gut microbiota</strong> is an important environmental factor affecting <strong>energy harvest from diet and energy storage in host.</strong></p>

New cards

diet influences microbiome composition

western diet = high in fat w sucrose, malodextrin, corn syrup

^ cuts species diversity, body fat is higher

GF mice with western fed donor microbes vs cho donor microbes —> ate same amount but CHO donors only has 23%. body fat increase whereas 47% for western.

New cards

Challenges in native gut microbiome genetic analysis

1) delivery

DNA has to make it in efficiently without getting degraded by gut, other bacteria, chemicals, immune responses, washout
bc of all this, use conjugation
MetaEdit, megRNA decides where delivery happens

2) persistence of engineered bacteria

can be transferred then lost

<p>1) delivery</p><ul><li><p>DNA has to make it in efficiently without getting degraded by gut, other bacteria, chemicals, immune responses, washout</p></li><li><p>bc of all this, use <strong>conjugation</strong> </p></li><li><p>MetaEdit, megRNA decides where delivery happens</p></li></ul><p></p><p>2) persistence of engineered bacteria</p><ul><li><p>can be transferred then lost</p></li></ul><p></p>

New cards

2 CAST experiments

GF mice given 6 commensal species (pic)

4% OV got the DNA precisely integrated but got lost in 10 days

they were only successsfully able to edit bacteroids thetaiotaomicron
selected for b thetawhatever that got the transposon by addiont 3 genes in transposon that give bacterium a growth advantage then gave them inulin (hella hard to digest, most cant)

BACTRINS (therapeutic e.coli donor strain that can be fed to mice infected w shiga toxin

BACTRINS uses CAST to deliver transposon into shiga toxin gene —> not virulent yay

<p>GF mice given 6 commensal species (pic)</p><p>4% OV got the DNA precisely integrated but got lost in 10 days</p><ul><li><p>they were only successsfully able to edit bacteroids thetaiotaomicron</p></li><li><p>selected for b thetawhatever that got the transposon by addiont 3 genes in transposon that give bacterium a growth advantage then gave them inulin (hella hard to digest, most cant)</p></li></ul><p></p><p>BACTRINS (therapeutic e.coli donor strain that can be fed to mice infected w shiga toxin</p><ul><li><p>BACTRINS uses CAST to deliver transposon into shiga toxin gene —> not virulent yay</p></li></ul><p></p>

New cards

homology

similarity due to shared evolutionary history, common ancestral character

descended with divergence

bat/mouse forelimbs

New cards

Analogy

relationship between 2 characters that descended convergently from unrelated ancestors

“parallel evolution”

birds and insects both evolving wings

but their evolution as forelimbs is homologous

<p>relationship between 2 characters that descended convergently from unrelated ancestors</p><p>“parallel evolution”</p><p></p><p>birds and insects both evolving wings</p><ul><li><p>but their evolution as forelimbs is homologous</p></li></ul><p></p>

New cards

gene/protein homology key concepts

1) G/Ps that have similar sequences are likely homologs but NOT ALWAYS (short sequences can be similar)

2) closely related homologs are likely to share some functionality but not always

enzymes that act on slightly diff substrates

3) in understanding function of uncharacterized protein, look for similar sequences, try to infer homology, can help predict function of protein

New cards

orthologs vs paralogs

orthologs are best for making species trees and good predictors of conserved function (shared functionality)

paralogs are likely to diverge in function

gene duplication is a major driver of genome evolution

<p><strong>orthologs are best for making species trees and good predictors of conserved function </strong>(shared functionality)</p><p></p><p><strong>paralogs are likely to diverge in function</strong></p><ul><li><p>gene duplication is a major driver of genome evolution</p></li><li><p></p></li></ul><p></p>

New cards

genetic complementation across domains (e.coli / yeast)

e.coli vs yeast example:

hella homologs, but functional equivalence?
experiment: rescue lethality of gene KOs in yeast by adding e.coli homolog
over half were functional, found that heme biosynthesis pathway is transferable from e.coli to yeast

New cards

Xenologs

relationship of any 2 homologs whose history since their common ancestor involves an interspecies (horizontal) transfer of genes

found when species and gene tree not matching

New cards

info u get from protein sequences

has 20 amino acids, order and composition = structure which = function

compare proteins to learn things boom sometimes more helpful than looking at whole protein

New cards

protein domain structural vs bioinformatics definition

structural:

protein domain = sequence that can fold independently and fulfill a minimal function

domain architecture = arrangement of domains in a protein

protein complex = group of proteins that come together to perform a biological function

bioinformatics:

protein domain: sequence matching a statistical model that corresponds to known or predicted domain

domain architecture: set of domains comprising a protein

protein complexes: not directly genome encoded

New cards

protein annotation vs domain annotation

protein:

find full length orthologs among proteins of known function
danger —> overannotation and assuming ohmology = shared function

domain:

finding which known protein domains are present in the sequence

New cards

Protein function prediction reasons?

1) work with protein sequence bc DNA evolves faster than protein

2) groups of amino acids have similar chemical properties, we can compare proteins by similarity not just identity

New cards

BLAST

heuristic approach, approximates smith waterman algorithm for local sequence alignment

guaranteed to find optimal local alignment between 2 sequences, but BLAST isn’t
but its way faster!

bit score = log2 scaled and normalized raw score

higher bit score = higher similarity

Expected value (e-value) = # of hits of equal or better quality that could be found by chance

lower values more significant
varies based on size of database you’re looking at

<p>heuristic approach, approximates smith waterman algorithm for local sequence alignment</p><ul><li><p>guaranteed to find optimal local alignment between 2 sequences, but BLAST isn’t</p></li><li><p>but its way faster!</p></li></ul><p></p><p><strong>bit score</strong> = log2 scaled and normalized raw score</p><ul><li><p><strong>higher bit score = higher similarity</strong></p></li></ul><p></p><p><strong>Expected value (e-value) = </strong># of hits of equal or better quality that could be found by chance</p><ul><li><p><strong>lower values more significant</strong></p></li><li><p>varies based on size of database you’re looking at</p></li></ul><p></p>

New cards

global vs local alignment

New cards

protein BLAST problems

limited at finding distant homology

doesnt take into account conservation / functional importance of individual amino acids

based on local similarity, can lead to annotation errors & propagation

New cards

BLOSUM62

amino acid substitution matrix (pBLAST default)

compares amino acid substitutions among proteins no more than 62% identical

sees if swaps are significant or not

can help find homologs of moderate distance (62%)

New cards

Profile search methods

take into account conservation of individual amino acids residues in a group of homologs, calculate position specific scoring matrices (PSSMs)

PSSMs useful for IDing more distant homologs
tools: PSI-BLAST, Profile hidden markov models (PFam)

New cards

Multiple sequence alignment

used to calculate a statistical model that corresponds to a known domain or predicted domain

PSSM: for every position in alignment, make custom scoring matrix

conserved positions: high positive score for conserved AA, negative for not conserved AA

New cards

Annotating protein domains (Pfam and InterPro)

Pfam = collection of statistical models of protein domains

InterPro = meta collection of protein models including multi-domain ones

protein domain annotation can be used to gather homologs

Pfam answers "does my protein contain domain X?" and InterPro answers "what is my protein, based on everything we know about all its parts?

New cards

AlphaFold

predicts lowest energy folding state for protein from structure alone

uses the fact that residues in proximity evolve together + 3d info

Two proteins can have completely different sequences but still fold into the same 3D shape —> reveals they're related in ways sequence comparison alone would miss.

New cards

Protein language models

sequence diversity analysis

databases learn protein sequence patterns then can use that knowledge to predict sequences

New cards

Metagenomics

direct genetic analysis of genomes contained within an environmental sample

culture independent approach

what is there, what are they doing

2 types:

Marker gene (amplicon)

PCR amplify and sequence a conserved marker gene from community to see what is there
deeper in on one sample, cheaper and faster

genome-resolved (shotgun)

sequence everything, can use larger samples

New cards

marker gene vs amplicon

New cards

Marker-based / amplicon metagenomics

Amplicon-based “metagenomics” is useful for “What is there?” questions; and for “How does the microbial community change in response to …....?” question

1) ID microorganisms that have a conserved marker gene —> PCR amplify it

^ primers don’t always catch everybody tho

2) sequence using illumina / NGS

3) compare to known database

commonly used marker genes:

16s rRNA (bacteria / archaea)
ITS (fungi)
18s rRNA gene (euks)
COX1 (animals)
single copy highly conserved protein coding genes

<p>Amplicon-based “metagenomics” is useful for “What is there?” questions; and for “How does the microbial community change in response to …....?” question</p><p></p><p>1) ID microorganisms that have a conserved marker gene —> PCR amplify it</p><p>^ primers don’t always catch everybody tho</p><p>2) sequence using illumina / NGS</p><p>3) compare to known database</p><p></p><p>commonly used marker genes:</p><ul><li><p>16s rRNA (bacteria / archaea)</p></li><li><p>ITS (fungi)</p></li><li><p>18s rRNA gene (euks)</p></li><li><p>COX1 (animals)</p></li><li><p>single copy highly conserved protein coding genes</p></li></ul><p></p>

New cards

16s RNA / ITS

prokaryotes:

conserved/variable, interacts with RBS, vertically inherited, in all prokaryotes

use universal primers to amplify

eukaryotes:

not enough variable regions to do 16s —> ITS instead

iTS primers use conserved regions in rRNA genes to amplify non-coding regions between them

New cards

16s rRNA based marker gene approach to characterize microbiomes

advantages:

cheap and fast
readily done even with low amounts of DNA (bc PCR)
hella samples can be processed/multiplexed (good for temporal studies / large surveys of community structure)

disadvantages:

link between 16s rRNA sequence and functional capacity of microbe isn’t great (don’t have genome)
many sources of bias (DNA not represented or chimeric)
16s primers work on most but not all bacteria

New cards

16s profiling experimental setup

1) isolate DNA from all environmental samples

2) PCR amplify 16s rRNA genes frome each community, sequence amplicons w/ illumina

bruh what

New cards

operational taxonomic units (OTU)

unit to classify groups of similar organisms

clustering performed on a group of sequences

OTUs are defined by user (operational)

ex: 97% identity of 16s rDNA gene for OTU is human made stat

OTU clustering: grouping together closely related sequencing reads from a sample

OTU is just sequences you think are the same, then cross referencing actually identifies it

higher resolution attempts to make the threshold higher than 97%

amplicon sequence variants (ASVs)
exact sequence variants (ESVs)
zero radius OTUs (zOTUs)

<p>unit to classify groups of similar organisms</p><p>clustering performed on a group of sequences</p><p>OTUs are defined by user (operational)</p><p>ex: 97% identity of 16s rDNA gene for OTU is human made stat</p><p></p><p>OTU clustering: grouping together closely related sequencing reads from a sample</p><p></p><p><span>OTU is just sequences you think are the same, then cross referencing actually identifies it</span></p><p></p><p><span>higher resolution attempts to make the threshold higher than 97%</span></p><ul><li><p>amplicon sequence variants (ASVs)</p></li><li><p>exact sequence variants (ESVs)</p></li><li><p>zero radius OTUs (zOTUs)</p></li></ul><p></p>

New cards

Earth Microbiome Project

New cards

What can 16s rRNA based metagenomics tell you

microbial abundance profiles at diff taxonomic levels

alpha diversity measures micorbial community composition

shannon index auuuughhghhghghghgh
measures for a single sample
- richness and evenness

Beta diversity: measures microbial community differences

quantitative (otu abundance, same organisms in same amount?)
qualitative (otu yes/no, share same organisms?
displayed as PCA (principal component analysis plots) pic

<p>microbial abundance profiles at diff taxonomic levels</p><p><strong>alpha diversity</strong> measures micorbial community composition</p><ul><li><p>shannon index auuuughhghhghghghgh</p></li><li><p>measures for a single sample</p><ul><li><p>richness and evenness</p></li></ul></li></ul><p></p><p>Beta diversity: measures microbial community differences</p><ul><li><p>quantitative (otu abundance, same organisms in same amount?)</p></li><li><p>qualitative (otu yes/no, share same organisms?</p></li><li><p>displayed as PCA (principal component analysis plots) pic</p></li></ul><p></p><p></p>

New cards

assembling whole genomes from metagenome challenges

New cards

Binning in metagenome sequence assembly

can be based on read abundances, GC content, tetranucleotide frequency

2 types:

1) bin reads then assemble: bin reads, assemble into contigs, genomes

puzzle pieces together then make the picture

2) assemble into contigs, then bin

put pieces together in squares, then put them together to make the picture

<p>can be based on read abundances, GC content, tetranucleotide frequency</p><p></p><p>2 types:</p><p>1) bin reads then assemble: bin reads, assemble into contigs, genomes</p><ul><li><p>puzzle pieces together then make the picture</p></li></ul><p></p><p>2) assemble into contigs, then bin</p><ul><li><p>put pieces together in squares, then put them together to make the picture</p></li></ul><p></p><p></p>

New cards

metagenome assembled genomes (MAGs)

what u get after the binning, “draft genome”

annotation of MAGs

1) structural, ID features in genome
2) functional, assign putative functions to features via sequence similarity, homology, protein domains, predicted protein structures

assessing quality of MAGs:

in a MAG, look for single copy genes (SGCs) that are universally present in related bacteria
% completeness: estimated by % of marker genes present in MAGs
contamination = estimated by # of instances of more than one copy in marker gene

high quality MAG has >90% completeness and <10% contamination

SCGs used to assess genome completion/contamination oare orthologs & homologs

<p>what u get after the binning, “draft genome”</p><p></p><p><strong>annotation of MAGs</strong></p><ul><li><p>1) structural, ID features in genome</p></li><li><p>2) functional, assign putative functions to features via sequence similarity, homology, protein domains, predicted protein structures</p></li></ul><p></p><p><strong>assessing quality of MAGs</strong>:</p><ul><li><p>in a MAG, look for <strong>single copy genes (SGCs)</strong> that are universally present in related bacteria</p></li><li><p><strong>% completeness:</strong> estimated by % of marker genes present in MAGs</p></li><li><p><strong>contamination</strong> = estimated by # of instances of more than one copy in marker gene</p></li></ul><p></p><p>high quality MAG has >90% completeness and <10% contamination</p><p></p><p>SCGs used to assess genome completion/contamination oare orthologs & homologs</p><p></p>

New cards

expanded tree of life (CPR)

Candidate Phyla Radiation (CPR)

bacteria (many novel phyla):

• small genomes (~1,000 genes)

• small cells

• Incomplete biosynthetic

capabilities (ex. Amino acids)

• May be obligate symbionts or

parasites of other organisms

New cards

genome resolved metagenomics summary

New cards

Functional Metagenomics

microbes r so important but we cant culture all of them

Functional metagenomics is the identification of new compounds or genes with desired activities from environmental DNA

need to express DNA in e.coli (heterologous host)

new gene functions from metagenomic DNA can be IDd by selections (of e.coli) or screens

selections: antibiotic resistance genes
screens: new phosphates

New cards

discovery of new antibiotics from metagenomic DNA

New cards

IDing new functionally tested genes from metagenomic DNA

New cards

spread of antibiotic resistance

New cards

Func genomic application 1: identifying antibiotic resistance genes from natural envrionments

q: Have pathogens and soil bacteria recently exchanged resistance genes (if yes then sequences should be similar)

approach: func metagenomic using DNA soil from microbes

found that some resistance genes from soil were 100% identical at nucleotide level to genes from pathogens

evidence of HGT of resistance genes between soil bacteria / pathogens found

<p>q: Have pathogens and soil bacteria recently exchanged resistance genes (if yes then sequences should be similar)</p><p>approach: func metagenomic using DNA soil from microbes</p><p></p><p>found that some resistance genes from soil were 100% identical at nucleotide level to genes from pathogens</p><ul><li><p>evidence of HGT of resistance genes between soil bacteria / pathogens found</p></li></ul><p></p>

New cards

Func genomic application 2:

q: are novel phosphatases that escaped detection

approach: func metagenomics using DNA from soil microbes

experimental approach + Add an indicator compound to agar plates that makes phosphatase-positive colonies a different color

BCIP turns phosphatase including ones blue

21 e.coli colonies with phosphatase activity found —> next step BLAST to gather putative homologs, protein domain annotation, structure modeling.

find proteins with similar sequences

<p>q: are novel phosphatases that escaped detection</p><p>approach: func metagenomics using DNA from soil microbes</p><p></p><p>experimental approach + Add an indicator compound to agar plates that makes phosphatase-positive colonies a different color</p><ul><li><p>BCIP turns phosphatase including ones blue</p></li></ul><p></p><p>21 e.coli colonies with phosphatase activity found —> next step BLAST to gather putative homologs, protein domain annotation, structure modeling.</p><ul><li><p>find proteins with similar sequences</p></li></ul><p></p>

New cards

common functional metagenomics challenges

New cards

Tn-Seq

forward genomics tool

importance of thousands of genes assayed simultaneously in one pot
all tn strains grown together in same culture and compete
important genes for fitness IDed through NGS

genes providing benefits under certain conditions get DEPLETED during tn seq

<p>forward genomics tool</p><ul><li><p>importance of thousands of genes assayed simultaneously in one pot</p></li><li><p>all tn strains grown together in same culture and compete</p></li><li><p>important genes for fitness IDed through NGS</p></li></ul><p></p><p>genes providing benefits under certain conditions get DEPLETED during tn seq</p><p></p>

New cards

Essential genes

required for growth

conserved across species (usually)

10-20% genes are essential typically

New cards

CRISPRi for functional genomics

The mechanism:

You design a guide RNA targeting the promoter or early coding region of a gene you want to silence
dCas9 + guide RNA bind that location
dCas9 just sits there — physically blocking RNA polymerase from transcribing the gene
No transcription = no mRNA = no protein = gene effectively silenced

repressor domain called KRAB in euks

New cards

CRISPR cas in bacteria

1) adaptive immune system in prokarotes:

infection, bacteria gives phage protospacer to CRISPR array
CRISPR array transcribed into RNA
RNA —> crRNA (1 per spacer) —> cas + crRNA
cas patrols cell using this, if it recognizes dna match, creates a ds break in phage DNA —> phage is done for

2) tracrRNA + crRNA —> sgRNA + Cas = systsm with 2 parts that can cut at a specific location in a genome

targeted mutation via ds break at a specific sequence
BUT NHEJ lethal in bacteria

3) community editing of microbes using CRISPR guided transposons

CAST to guide where they go?

New cards

Diff functional genomics approaches based on CRISPR

1) ds break —> loss of function mutation in target gene

2) fusion of transcriptional activator = overexpression of target gene

3) fusion of transcriptional repression = repression of target gene

New cards

performing genetics research on uncultivated microbes