bioinformatics lecture 5 (kinda important)

40 Terms

1

Why do we compare biological sequences? 🧬

Because similar sequences often share similar structure and function, which lets us predict what genes do and how mutations might affect them 🧠

2

What is the difference between similarity and homology? ⚠️

Similarity is a measured sequence match, while homology is an inferred evolutionary relationship (you never “measure” homology directly).
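
To make "similarity is measured" concrete, here is a minimal sketch that computes percent identity between two sequences. It assumes the sequences are already aligned (same length, `-` for gaps); the function name `percent_identity` is just illustrative. The number it returns is a measurement, and whether the two sequences are homologous remains an inference drawn from it.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes a and b are already aligned: equal length, '-' marks a gap.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    # Ignore columns where both sequences have a gap.
    columns = [(x, y) for x, y in zip(a.upper(), b.upper())
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


print(percent_identity("ACGT", "ACGA"))
```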

3

What is genome annotation? 🧩

The process of finding genes, regulatory regions, and other functional elements in raw DNA sequence and giving them biological meaning.

4

Why is gene finding easier in prokaryotes than eukaryotes? 🦠➡️🧬

Prokaryotic genes are densely packed and almost never contain introns, while eukaryotes have introns, large intergenic gaps, and complex gene structures.

5

What makes gene prediction difficult in eukaryotic genomes? 🚧

Introns interrupt genes, genes are spread far apart, and exon lengths are not always multiples of three, so the reading frame can shift from one exon to the next.

6

What are the four main approaches used to find genes? 🔍

ORF detection, pattern matching, codon usage bias, and homology searching (used together, not alone).

7

What is an ORF (Open Reading Frame)? 📖

A stretch of DNA that starts with ATG, ends with a stop codon (TAA, TAG, or TGA), and has no in-frame stop codon in between.

8

How does ORF detection work? 🔁

DNA is translated in all six reading frames (three on each strand), and long uninterrupted ORFs are searched for.
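
The six-frame scan can be sketched roughly as follows. This is a simplified illustration, not a real gene finder: it assumes plain A/C/G/T input, takes the first ATG in each frame as the start, and the names `find_orfs`, `orfs_in_frame`, and `min_codons` are made up for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def orfs_in_frame(dna: str, frame: int):
    """Yield (start, end) for each ATG...stop ORF in one frame (end is exclusive)."""
    start = None
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if start is None and codon == "ATG":
            start = i                      # open an ORF at the first ATG
        elif start is not None and codon in STOP_CODONS:
            yield start, i + 3             # close it at the first in-frame stop
            start = None


def find_orfs(dna: str, min_codons: int = 3):
    """Scan three frames on each strand; return (strand, frame, orf_sequence)."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for frame in range(3):
            for start, end in orfs_in_frame(seq, frame):
                # Length is counted in codons, stop codon included.
                if (end - start) // 3 >= min_codons:
                    hits.append((strand, frame, seq[start:end]))
    return hits


print(find_orfs("CCATGAAATTTTAGGG"))
```

In practice `min_codons` would be set much higher (often 100 or more) so that only long ORFs are reported.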

9

Why are long ORFs more likely to be real genes? 📏

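Long ORFs are unlikely to occur by chance, because in random sequence a stop codon turns up about once every 21 codons (3 of the 64 codons are stops), making them good gene candidates.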
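
A quick back-of-the-envelope check of this idea, assuming random DNA with equal base frequencies (a simplification, since real genomes have skewed composition): 3 of the 64 codons are stops, so the chance that n consecutive codons contain no stop is (61/64)^n. The function name `prob_no_stop` is illustrative.

```python
def prob_no_stop(n_codons: int, p_stop: float = 3 / 64) -> float:
    """Probability that n random codons in a row contain no stop codon.

    Assumes independent codons and equal base frequencies (p_stop = 3/64).
    """
    return (1 - p_stop) ** n_codons


for n in (50, 100, 300):
    print(n, f"{prob_no_stop(n):.2e}")
```

Under these assumptions the probability is already below 1% at 100 codons and keeps shrinking geometrically, which is why length alone is useful evidence.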

10

Why does ORF detection work best in prokaryotes and cDNA? ✅

Because they don’t contain introns that break up coding regions.

11

What is a major limitation of ORF detection? ❌

It performs poorly on eukaryotic genomic DNA because introns disrupt reading frames.

12

What is pattern matching used for in gene finding? 🧠

To locate signals like promoters, start sites, splice sites, and polyadenylation signals.

13

What is the Shine–Dalgarno sequence and where is it found? 🦠

A ribosome binding site in prokaryotes with consensus AGGAGG.
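
As an example of pattern matching, the sketch below looks for the AGGAGG consensus followed by a short spacer and then an ATG start codon. The 5 to 10 nucleotide spacer is an assumption made for this example (real spacing varies, and real sites often match the consensus only partially), and `find_sd_starts` is a hypothetical name.

```python
import re

# AGGAGG, then an assumed spacer of 5-10 nt, then ATG.
# The lookahead lets overlapping candidates be reported.
SD_PATTERN = re.compile(r"(?=(AGGAGG)([ACGT]{5,10}?)(ATG))")


def find_sd_starts(dna: str):
    """Return (sd_position, start_codon_position) pairs, 0-based."""
    return [(m.start(1), m.start(3)) for m in SD_PATTERN.finditer(dna.upper())]


print(find_sd_starts("TTAGGAGGAACAGCTATGAAA"))
```

A hit like this only says "a start site may be near here"; it says nothing about where the gene ends.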

14

What is the Kozak sequence and where is it found? 🧬

A eukaryotic translation start signal with consensus gccRccAUGG.
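
In the consensus, R stands for a purine (A or G), and the two positions usually treated as most informative are the R at -3 and the G at +4 (counting the A of AUG as +1). The sketch below checks just those two positions on a DNA sequence. The strong/adequate/weak labels are a common shorthand used here as an assumption, and `kozak_strength` is an illustrative name.

```python
def kozak_strength(dna: str, atg_index: int) -> str:
    """Classify the context around an ATG using positions -3 and +4 only.

    atg_index is the 0-based index of the A in ATG.
    """
    dna = dna.upper()
    if dna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG at the given index")
    minus3 = dna[atg_index - 3] if atg_index >= 3 else ""
    plus4 = dna[atg_index + 3] if atg_index + 3 < len(dna) else ""
    # One point for a purine at -3, one point for G at +4.
    score = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ["weak", "adequate", "strong"][score]


print(kozak_strength("GCCACCATGG", 6))
```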

15

What does the polyadenylation signal do? ✂️

It marks the 3′ end of a transcript: the AAUAAA signal directs cleavage of the pre-mRNA about 10 to 30 nucleotides downstream, and the poly-A tail is added at that cut site.

16

Why does pattern matching not give complete gene models? ⚠️

It only identifies approximate locations, not full exon–intron structures.

17

What is codon usage bias? 🔄

Organisms prefer certain codons over others to code for the same amino acid.
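
Codon preference can be measured by counting, for each amino acid, how often each of its synonymous codons is used. A minimal sketch, assuming an in-frame coding sequence and the standard genetic code (the function name `codon_usage` is illustrative):

```python
from collections import Counter, defaultdict

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_usage(cds: str) -> dict:
    """For each amino acid, return the fraction of times each codon encodes it."""
    cds = cds.upper()
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    per_aa = defaultdict(dict)
    for codon, n in counts.items():
        aa = CODON_TABLE.get(codon)      # skip codons with ambiguous bases
        if aa is not None:
            per_aa[aa][codon] = n
    return {
        aa: {codon: n / sum(codons.values()) for codon, n in codons.items()}
        for aa, codons in per_aa.items()
    }


print(codon_usage("CTGCTGCTGCTA"))
```

Comparing these fractions for a candidate region against those of known genes from the same organism gives a rough measure of coding potential.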

18

Why does codon usage bias exist? ⚙️

Because some codons match abundant tRNAs and are translated faster and more efficiently.

19

Why is codon bias strongest in highly expressed genes? 🚀

These genes benefit most from fast and efficient translation.

20

Why is codon usage bias only supporting evidence for a gene? ⚠️

It suggests coding potential but cannot prove a gene exists on its own.

21

What is homology searching? 🔗

Comparing a sequence to known sequences in databases to find similar genes or proteins.

22

Why is homology searching powerful? 💪

It can identify known genes, protein families, and functional domains.

23

What is a limitation of homology searching? 🚫

It can only find genes that are already represented in databases.

24

Why are protein alignments often better than DNA for homology searches? 🧠

Proteins are more conserved over evolution and better reflect function.

25

What is a reference genome? 🗺️

A standard genome used as a framework to compare and interpret new sequencing data.

26

What is re-sequencing? 🔄

Aligning new sequencing reads to a reference genome instead of assembling from scratch.

27

What do differences between a sample and the reference genome represent? 🧬

Genetic variants such as SNPs, insertions, or deletions.
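
A toy illustration of reading variants off an alignment: given a reference and a sample that are already aligned (equal length, `-` for gaps), each mismatching column is reported as a SNP, insertion, or deletion. Real variant callers work from many overlapping reads with quality scores; this sketch, with the made-up name `call_variants`, only shows the basic comparison.

```python
def call_variants(reference: str, sample: str):
    """Compare two aligned sequences column by column.

    Returns (reference_position, kind, detail) tuples with 1-based positions
    in the ungapped reference. Insertions are reported at the reference
    position they follow.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    variants = []
    ref_pos = 0
    for r, s in zip(reference.upper(), sample.upper()):
        if r != "-":
            ref_pos += 1                 # advance only on real reference bases
        if r == s:
            continue
        if r == "-":
            variants.append((ref_pos, "insertion", s))
        elif s == "-":
            variants.append((ref_pos, "deletion", r))
        else:
            variants.append((ref_pos, "SNP", f"{r}>{s}"))
    return variants


print(call_variants("ACG-TAC", "ATGGT-C"))
```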

28

Why do almost all diseases have a genetic component? 🧠

Genes influence how the body develops and responds, even when environment also plays a role.

29

What is a monogenic (Mendelian) disease? 🧬

A disease caused by a mutation in a single gene.

30

What is a complex (multifactorial) disease? 🌍

A disease caused by many genes plus environmental factors.

31

What is linkage analysis? 🧬➡️🧬

A family-based method that finds disease genes by tracking linked genetic markers.

32

Why does linkage analysis work well for rare Mendelian diseases? 👨‍👩‍👧‍👦

Because the disease follows clear inheritance patterns in families.

33

What is a limitation of linkage analysis? ⚠️

It has low resolution: it narrows the search only to a large genomic region that can still contain many candidate genes.

34

What is GWAS (Genome-Wide Association Study)? 🌐

A population-based method that links SNPs to disease traits.
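
At its simplest, the test behind a single SNP association compares allele counts in cases and controls. The sketch below runs a Pearson chi-square test (1 degree of freedom) on a 2x2 table of allele counts. It is one basic way to test association, shown as an assumption and not necessarily the method from the lecture, and it ignores things real studies must handle (population structure, covariates, genotype-level models). The name `allelic_chi_square` is illustrative.

```python
import math


def allelic_chi_square(case_alt: int, case_ref: int,
                       control_alt: int, control_ref: int):
    """Pearson chi-square test on a 2x2 allele-count table.

    Returns (chi2, p_value). Assumes every row and column total is non-zero.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


chi2, p = allelic_chi_square(60, 40, 40, 60)
print(f"chi2={chi2:.2f}, p={p:.4f}")
```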

35

Why is GWAS good for complex diseases? 🧠

It can detect small genetic effects across many individuals.

36

What is a major limitation of GWAS? ⚠️

It is prone to false positives and does not prove causation.
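
The false-positive problem comes largely from the sheer number of SNPs tested. A small arithmetic sketch, assuming about one million independent tests (a commonly quoted round figure, used here as an assumption) and a Bonferroni correction; both function names are illustrative.

```python
def expected_false_positives(n_tests: int, alpha: float) -> float:
    """Expected number of null SNPs that pass the threshold purely by chance."""
    return n_tests * alpha


def bonferroni_threshold(n_tests: int, family_alpha: float = 0.05) -> float:
    """Per-test p-value cutoff that keeps the family-wise error rate at family_alpha."""
    return family_alpha / n_tests


n_snps = 1_000_000  # assumed number of independent tests
print(expected_false_positives(n_snps, 0.05))
print(bonferroni_threshold(n_snps))
```

With an uncorrected cutoff of 0.05, about 50,000 SNPs would look significant by chance alone. Dividing 0.05 by one million gives the familiar genome-wide threshold of 5e-8.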

37

What is the key difference between linkage analysis and GWAS? 🔍

Linkage analysis tracks markers through families to find genes for rare Mendelian diseases, while GWAS compares individuals across a population to study complex diseases.

38

Why has exome sequencing transformed rare disease genetics? ⚡

It sequences the protein-coding regions (exons) of nearly all genes at once, allowing rapid identification of disease-causing mutations.

39

Why is functional validation still needed after sequencing? 🧪

Because finding a mutation does not automatically prove it causes disease.

40