Why do we compare biological sequences? 🧬
Because similar sequences often have similar structure and function,
letting us predict what genes do and how mutations affect them 🧠
What is the difference between similarity and homology? ⚠️
Similarity is a measured sequence match, while homology is an inferred evolutionary relationship (you never “measure” homology directly).
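To make the "measured" part concrete, here is a minimal sketch of one common similarity measure, percent identity over two already-aligned sequences. The function name `percent_identity` and the choice to count gap columns as mismatches are assumptions for illustration, not a standard definition.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of alignment columns holding the same residue.

    Assumes a and b are already aligned (equal length, '-' for gaps).
    Gap columns are counted as mismatches in this sketch.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


print(percent_identity("ATGCCGTA", "ATGACGTA"))  # 87.5
```

The number is only a measurement. Whether 87.5% identity reflects shared ancestry is a separate inference.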
What is genome annotation? 🧩
The process of finding genes,
regulatory regions,
and functional elements in raw DNA sequence
and giving them biological meaning.
Why is gene finding easier in prokaryotes than eukaryotes? 🦠➡️🧬
Prokaryotes have dense genes and no introns,
while eukaryotes have introns, large gaps, and complex gene structures.
What makes gene prediction difficult in eukaryotic genomes? 🚧
Introns interrupt genes,
genes are spread far apart,
and exon lengths are not always multiples of three, so codons can be split across exon boundaries.
What are the four main approaches used to find genes? 🔍
ORF detection,
pattern matching,
codon usage bias,
and homology searching (used together, not alone).
What is an ORF (Open Reading Frame)? 📖
A stretch of DNA that starts with ATG,
and ends with an in-frame stop codon, with no stop codon in between.
How does ORF detection work? 🔁
DNA is translated in all six reading frames
and long uninterrupted ORFs are searched for.
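The six-frame scan can be sketched as follows. This is a simplified illustration, not a production gene finder: it walks codons instead of producing protein translations, assumes plain A/C/G/T input, and the names `find_orfs` and `min_codons` are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_codons: int = 3):
    """Scan all six reading frames for ATG...stop stretches.

    Returns (strand, frame, start, end, orf_sequence) tuples.
    start/end are 0-based, end-exclusive, and relative to the strand
    being scanned (the reverse complement for strand '-').
    """
    seq = seq.upper()
    results = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    # count codons before the stop, then close the ORF
                    if (i - start) // 3 >= min_codons:
                        results.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None
    return results


print(find_orfs("CCATGAAATTTGGGTAACC"))
# [('+', 2, 2, 17, 'ATGAAATTTGGGTAA')]
```

Raising `min_codons` is how a scan like this would favor long ORFs over short chance matches.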
Why are long ORFs more likely to be real genes? 📏
Long ORFs are unlikely to occur by chance,
making them good gene candidates.
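A back-of-the-envelope calculation shows why. Assuming random DNA with equal, independent base frequencies (a simplification), 3 of the 64 codons are stops, so the chance that a run of n codons contains no stop is (61/64)^n. The function name `prob_no_stop` is hypothetical.

```python
def prob_no_stop(n_codons: int) -> float:
    """Chance that n random codons contain no stop codon.

    Assumes uniform, independent bases: 61 of 64 codons are non-stop.
    """
    return (61 / 64) ** n_codons


for n in (10, 50, 100, 300):
    print(n, f"{prob_no_stop(n):.2e}")
```

Under these assumptions the probability drops below 1% by 100 codons, so a stop-free stretch of several hundred codons is very unlikely to be random.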
Why does ORF detection work best in prokaryotes and cDNA? ✅
Because they don’t contain introns that break up coding regions.
What is a major limitation of ORF detection? ❌
It performs poorly on eukaryotic genomic DNA
because introns disrupt reading frames.
What is pattern matching used for in gene finding? 🧠
To locate signals like promoters,
start sites,
splice sites,
and polyadenylation signals.
What is the Shine–Dalgarno sequence and where is it found? 🦠
A ribosome binding site in prokaryotic mRNA, just upstream of the start codon, with consensus AGGAGG.
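As a rough sketch of pattern matching for this signal, a regular expression can look for the consensus followed by a short spacer and an ATG. The 5 to 10 nucleotide spacer range and the name `find_sd_starts` are assumptions for illustration. Real sites often deviate from the exact consensus, so exact matching will miss many of them.

```python
import re

# Consensus, then an assumed 5-10 nt spacer, then a start codon.
SD_PATTERN = re.compile(r"AGGAGG[ACGT]{5,10}?(ATG)")


def find_sd_starts(dna: str):
    """Return (sd_index, atg_index) pairs, 0-based, for exact consensus hits."""
    return [(m.start(), m.start(1)) for m in SD_PATTERN.finditer(dna.upper())]


print(find_sd_starts("TTAGGAGGTTACAAATGGCT"))  # [(2, 14)]
```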
What is the Kozak sequence and where is it found? 🧬
A eukaryotic translation start signal surrounding the AUG start codon in mRNA, with consensus gccRccAUGG (R is a purine, A or G).
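The consensus can be turned into a simple search pattern. This sketch matches the full consensus exactly, which is stricter than real start sites usually are. The name `find_kozak_starts` is hypothetical, and RNA input is handled by converting U to T.

```python
import re

# gccRccAUGG written for DNA: R (purine) becomes [AG], U becomes T.
KOZAK_PATTERN = re.compile(r"GCC[AG]CC(ATG)G")


def find_kozak_starts(seq: str):
    """Return 0-based indices of ATG codons sitting in an exact consensus match."""
    dna = seq.upper().replace("U", "T")
    return [m.start(1) for m in KOZAK_PATTERN.finditer(dna)]


print(find_kozak_starts("TTGCCACCATGGCG"))  # [8]
print(find_kozak_starts("TTGCCTCCATGGCG"))  # [] because T is not a purine
```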
What does the polyadenylation signal do? ✂️
It marks the 3′ end of a transcript: the AAUAAA signal directs cleavage about 10 to 30 nucleotides downstream, where the poly-A tail is added.
Why does pattern matching not give complete gene models? ⚠️
It only identifies approximate locations, not full exon–intron structures.
What is codon usage bias? 🔄
Organisms prefer certain codons over others to code for the same amino acid.
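A minimal sketch of how this preference can be measured: count the codons in a coding sequence, then compare the share of each codon within one synonymous group. The toy sequence and the names `codon_counts` and `relative_usage` are made up for illustration. The six leucine codons are from the standard genetic code.

```python
from collections import Counter

LEUCINE_CODONS = ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]


def codon_counts(cds: str) -> Counter:
    """Count in-frame codons, ignoring any trailing partial codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


def relative_usage(counts: Counter, synonymous: list) -> dict:
    """Fraction of each codon among a group of synonymous codons."""
    total = sum(counts[c] for c in synonymous)
    return {c: (counts[c] / total if total else 0.0) for c in synonymous}


counts = codon_counts("ATGCTGCTGCTGTTACTGTAA")  # toy sequence
usage = relative_usage(counts, LEUCINE_CODONS)
print(usage["CTG"], usage["TTA"])  # 0.8 0.2
```

In this toy sequence CTG supplies four of the five leucines. A real analysis would use many genes and every amino acid.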
Why does codon usage bias exist? ⚙️
Because some codons match abundant tRNAs and are translated faster and more efficiently.
Why is codon bias strongest in highly expressed genes? 🚀
These genes benefit most from fast and efficient translation.
Why is codon usage bias only supporting evidence for a gene? ⚠️
It suggests coding potential but cannot prove a gene exists on its own.
What is homology searching? 🔗
Comparing a sequence to known sequences in databases to find similar genes or proteins.
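The idea of searching a database can be illustrated with a toy ranking by shared k-mers (short words). This is not how BLAST or any real tool scores hits. The database entries, the names `kmers` and `rank_hits`, and the word size are all invented for this sketch.

```python
def kmers(seq: str, k: int = 3) -> set:
    """All overlapping words of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def rank_hits(query: str, database: dict, k: int = 3):
    """Rank database entries by the number of k-mers shared with the query."""
    q = kmers(query, k)
    scored = [(name, len(q & kmers(seq, k))) for name, seq in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical protein fragments, for illustration only.
database = {
    "unrelated": "PPPPQQQQWW",
    "kinase_like": "MKVLAAGIVG",
}
print(rank_hits("MKVLAAGLVG", database))
# [('kinase_like', 5), ('unrelated', 0)]
```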
Why is homology searching powerful? 💪
It can identify known genes, protein families, and functional domains.
What is a limitation of homology searching? 🚫
It can only find genes that resemble sequences already in databases, so truly novel genes are missed.
Why are protein alignments often better than DNA for homology searches? 🧠
Protein sequences are more conserved over evolution than DNA (synonymous codon changes leave the protein unchanged) and better reflect function.
What is a reference genome? 🗺️
A standard genome used as a framework to compare and interpret new sequencing data.
What is re-sequencing? 🔄
Aligning new sequencing reads to a reference genome instead of assembling from scratch.
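As a very simplified picture of read alignment, the sketch below places each read wherever it matches the reference exactly. Real aligners tolerate mismatches and gaps and use indexed data structures. The name `map_reads` and the example sequences are hypothetical.

```python
def map_reads(reference: str, reads: list) -> dict:
    """Map each read to every 0-based position where it matches exactly."""
    positions = {}
    for read in reads:
        hits = []
        start = reference.find(read)
        while start != -1:
            hits.append(start)
            start = reference.find(read, start + 1)  # allow overlapping hits
        positions[read] = hits
    return positions


print(map_reads("ACGTACGTTT", ["ACGT", "GTTT", "CCCC"]))
# {'ACGT': [0, 4], 'GTTT': [6], 'CCCC': []}
```

A read with several hits, such as ACGT here, shows why repeated regions are hard to map uniquely.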
What do differences between a sample and the reference genome represent? 🧬
Genetic variants such as SNPs, insertions, or deletions.
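Given a sample already aligned to the reference, listing the differences is straightforward. The sketch below assumes a pairwise alignment with `-` for gaps and reports each differing column separately, so a multi-base insertion or deletion shows up as several single-base events. The name `call_variants` is hypothetical.

```python
def call_variants(ref_aln: str, sample_aln: str):
    """List differences between an aligned reference and sample.

    Returns (kind, ref_position, ref_base, sample_base) tuples.
    ref_position is 1-based in the ungapped reference. For an insertion
    it is the reference base just before the inserted base.
    """
    if len(ref_aln) != len(sample_aln):
        raise ValueError("aligned sequences must have equal length")
    variants = []
    ref_pos = 0
    for r, s in zip(ref_aln, sample_aln):
        if r != "-":
            ref_pos += 1
        if r == s:
            continue
        if r == "-":
            variants.append(("insertion", ref_pos, "-", s))
        elif s == "-":
            variants.append(("deletion", ref_pos, r, "-"))
        else:
            variants.append(("SNP", ref_pos, r, s))
    return variants


print(call_variants("ACGT-ACGT", "ACCTTAC-T"))
# [('SNP', 3, 'G', 'C'), ('insertion', 4, '-', 'T'), ('deletion', 7, 'G', '-')]
```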
Why do almost all diseases have a genetic component? 🧠
Genes influence how the body develops and responds, even when environment also plays a role.
What is a monogenic (Mendelian) disease? 🧬
A disease caused by a mutation in a single gene.
What is a complex (multifactorial) disease? 🌍
A disease caused by many genes plus environmental factors.
What is linkage analysis? 🧬➡️🧬
A family-based method that finds disease genes by tracking linked genetic markers.
Why does linkage analysis work well for rare Mendelian diseases? 👨👩👧👦
Because the disease follows clear inheritance patterns in families.
What is a limitation of linkage analysis? ⚠️
It has low resolution and identifies large genomic regions.
What is GWAS (Genome-Wide Association Study)? 🌐
A population-based method that links SNPs to disease traits.
Why is GWAS good for complex diseases? 🧠
It can detect small genetic effects across many individuals.
What is a major limitation of GWAS? ⚠️
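It is prone to false positives because very many SNPs are tested at once, and an associated SNP may only tag a nearby causal variant, so association does not prove causation.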
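One way to see the false-positive problem is to test a single SNP and then apply a multiple-testing correction. The sketch below runs a chi-square test on a 2x2 table of allele counts (cases vs controls) and compares the p-value with a Bonferroni threshold. The counts, the figure of one million tests, and the function names are assumptions for illustration. The code assumes no row or column total is zero.

```python
import math


def allele_chi2(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) on a 2x2 allele count table.

    Returns (chi2, p_value). Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    rows = [sum(row) for row in table]
    cols = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance level that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


chi2, p = allele_chi2(60, 40, 40, 60)         # made-up allele counts
threshold = bonferroni_threshold(0.05, 1_000_000)  # assumed number of SNPs tested
print(chi2, p < 0.05, p < threshold)  # 8.0 True False
```

This SNP looks significant in isolation but fails the corrected threshold, which is the kind of result that would be a false positive if tested naively across a whole genome.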
What is the key difference between linkage analysis and GWAS? 🔍
Linkage analysis tracks markers through families to find genes for rare Mendelian diseases, while GWAS compares individuals across populations to study complex diseases.
Why has exome sequencing transformed rare disease genetics? ⚡
It sequences the protein-coding regions (exons) of all genes at once, allowing rapid identification of disease-causing mutations.
Why is functional validation still needed after sequencing? 🧪
Because finding a mutation does not automatically prove it causes disease.