Bioinformatics lecture 4


50 Terms

1

What is dynamic programming in sequence alignment?

An exact method that fills a scoring matrix so that every possible alignment is implicitly considered, which guarantees the optimal alignment score.

2

Why are dynamic programming alignment algorithms computationally expensive?

Because every cell of the scoring matrix must be filled, so time and memory grow with the product of the two sequence lengths.

3

What type of alignment does Needleman–Wunsch perform?

Global alignment that aligns sequences end-to-end.
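
As an illustration, the matrix fill behind global alignment can be sketched as below. The function name `needleman_wunsch_score` and the scoring values (match +1, mismatch -1, linear gap -2) are assumptions chosen for this example, not values from the lecture. The sketch returns only the optimal score and leaves out the traceback that recovers the alignment itself.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diagonal, up, left)
    # End-to-end alignment: the answer is in the bottom-right cell.
    return score[-1][-1]
```

Because the whole of both sequences must be aligned, the score is read from the final cell rather than from the best cell anywhere in the matrix.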

4

What type of alignment does Smith–Waterman perform?

Local alignment that finds the best matching region(s) within sequences.

5

What key modification distinguishes Smith–Waterman from Needleman–Wunsch?

Negative scores are reset to zero, and traceback starts at the highest-scoring cell and stops at zero, allowing detection of local similarity.
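
A minimal sketch of the zero-reset idea follows. The function name `smith_waterman_score` and the scoring values (match +2, mismatch -1, linear gap -2) are assumptions for illustration only. It returns the best local score and omits the traceback.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b."""
    best = 0
    previous = [0] * (len(b) + 1)  # first row is all zeros, not gap penalties
    for i in range(1, len(a) + 1):
        current = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            # The extra 0 lets an alignment restart instead of going negative.
            current[j] = max(
                0,
                previous[j - 1] + substitution,
                previous[j] + gap,
                current[j - 1] + gap,
            )
            best = max(best, current[j])
        previous = current
    # The local score is the maximum over the whole matrix.
    return best
```

For example, `smith_waterman_score("TTTACGTTT", "GGGACGGGG")` scores only the shared `ACG` region, while two sequences with nothing in common score 0.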

6

How is sequence alignment represented in dynamic programming algorithms?

As a path through a scoring matrix where different paths correspond to different alignments.

7

Why can multiple optimal alignments exist in dynamic programming?

Because different paths through the matrix can yield the same maximum score.
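
To make the idea of tied paths concrete, the sketch below counts how many distinct paths reach the optimal global score. The function name `count_optimal_alignments` and the scoring values are assumptions for this example. Each path through the matrix is counted separately, so the number can be large for long sequences.

```python
def count_optimal_alignments(a, b, match=1, mismatch=-1, gap=-2):
    """Return (optimal global score, number of matrix paths achieving it)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    ways = [[0] * cols for _ in range(rows)]
    ways[0][0] = 1
    # Along the borders there is exactly one path: all gaps.
    for i in range(1, rows):
        score[i][0], ways[i][0] = i * gap, 1
    for j in range(1, cols):
        score[0][j], ways[0][j] = j * gap, 1
    for i in range(1, rows):
        for j in range(1, cols):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + substitution, ways[i - 1][j - 1]),
                (score[i - 1][j] + gap, ways[i - 1][j]),
                (score[i][j - 1] + gap, ways[i][j - 1]),
            ]
            best = max(s for s, _ in candidates)
            score[i][j] = best
            # Every predecessor that ties for the best score contributes its paths.
            ways[i][j] = sum(w for s, w in candidates if s == best)
    return score[-1][-1], ways[-1][-1]
```

With these settings, aligning `"A"` against `"AA"` gives two equally good alignments, because the single `A` can pair with either `A` in the longer sequence.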

8

What components are required to score alignments in dynamic programming?

Match/mismatch scores or a substitution matrix, plus gap penalties.

9

Why are dynamic programming algorithms unsuitable for large database searches?

They are too slow to apply exhaustively to millions of sequences.

10

Why are heuristic alignment algorithms needed?

To enable fast approximate searches across large databases.

11

What is a heuristic alignment algorithm?

A fast alignment method that sacrifices guaranteed optimality for speed.

12

What is BLAST?

Basic Local Alignment Search Tool used for fast similarity searches in sequence databases.

13

What types of data are BWA and Bowtie designed for?

Short reads from next-generation sequencing.

14

What is the core idea behind BLAST’s speed?

It searches only promising regions instead of performing full dynamic programming everywhere.

15

What is the first step of BLAST?

The query sequence is broken into short words.

16

What happens after BLAST finds exact word matches?

Matches (seeds) are extended in both directions into longer alignments, allowing mismatches and gaps.
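
The word and extension steps can be sketched as a toy seed-and-extend routine. This is not BLAST's actual implementation: the helper names `find_seeds` and `extend_ungapped`, the word length of 3, the +1/-1 scores, and the drop-off threshold `x_drop` are all assumptions for illustration, and the extension shown here is ungapped only.

```python
def find_seeds(query, subject, w=3):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    return [(i, j)
            for i in range(len(query) - w + 1)
            for j in index.get(query[i:i + w], [])]


def extend_ungapped(query, subject, i, j, w=3, match=1, mismatch=-1, x_drop=2):
    """Extend the seed at (i, j) in both directions without gaps.

    Returns (score, start, end), where query[start:end] is the extended region.
    """
    def walk(qi, sj, step, start_score):
        best = running = start_score
        best_pos = qi - step  # last position already inside the alignment
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            running += match if query[qi] == subject[sj] else mismatch
            if running > best:
                best, best_pos = running, qi
            elif best - running >= x_drop:
                break  # score has fallen too far below the best seen
            qi += step
            sj += step
        return best, best_pos

    best, right = walk(i + w, j + w, +1, w * match)
    best, left = walk(i - 1, j - 1, -1, best)
    return best, left, right + 1
```

Only the diagonals that contain a seed are examined, which is where the speed-up over filling a full dynamic programming matrix comes from.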

17

Why are low-complexity regions filtered before BLAST searches?

Because repetitive regions can generate false similarity.

18

What is the raw score in BLAST?

The sum of match scores minus penalties for mismatches and gaps (insertions and deletions).
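
As a worked illustration, a raw score can be computed from an alignment that has already been written out with `-` for gaps. The function name `raw_score` and the numeric values (match +2, mismatch -3, gap opening -5, gap extension -2) are assumptions for this sketch. It also assumes that a gap of length k costs one opening penalty plus k extension penalties.

```python
def raw_score(aligned_query, aligned_subject, match=2, mismatch=-3,
              gap_open=-5, gap_extend=-2):
    """Sum match, mismatch, and gap scores over the columns of an alignment."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    score = 0
    gap_in = None  # which sequence the current run of gaps is in, if any
    for q, s in zip(aligned_query, aligned_subject):
        if q == "-" or s == "-":
            side = "query" if q == "-" else "subject"
            # A new gap pays the opening cost once; every gap column pays extension.
            score += gap_extend if side == gap_in else gap_open + gap_extend
            gap_in = side
        else:
            score += match if q == s else mismatch
            gap_in = None
    return score
```

For example, `raw_score("AC-GT", "ACTGT")` gives four matches (+8) and one gap of length 1 (-7), for a raw score of 1.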

19

What does a higher raw BLAST score indicate?

A better alignment.

20

What is a bit score in BLAST?

A raw score normalised for the scoring system used (substitution matrix and gap penalties), so it is independent of database size.

21

Why is the bit score useful?

It allows comparison of alignment strength across different searches.

22

What is query coverage in BLAST?

The percentage of the query sequence included in the alignment.
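
A small sketch of how coverage could be computed from aligned query intervals is shown below. The function name `query_coverage` and the input format (1-based inclusive start and end positions on the query) are assumptions for illustration. Overlapping intervals are counted once.

```python
def query_coverage(query_length, aligned_ranges):
    """Percentage of query positions covered by at least one aligned range.

    aligned_ranges is a list of (start, end) pairs, 1-based and inclusive.
    """
    covered = 0
    furthest = 0  # highest query position counted so far
    for start, end in sorted(aligned_ranges):
        start = max(start, furthest + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            furthest = end
    return 100.0 * covered / query_length
```

For example, ranges 1-50 and 41-60 on a 100-residue query cover 60 positions, so the coverage is 60%.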

23

What is the E-value in BLAST?

The number of hits with a score at least this high that would be expected by chance in a database of this size.
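
The relationship between raw score, bit score, and E-value can be sketched with the standard Karlin-Altschul formulas: bit score S' = (λS - ln K) / ln 2, and E = m × n × 2^(-S'), where m is the query length and n is the database length. The function names below are assumptions for this example, and λ and K depend on the scoring system, so any values passed in are illustrative only. Real searches also use corrected (effective) lengths, which this sketch ignores.

```python
import math


def bit_score(raw, lam, k):
    """Convert a raw score to a bit score using scoring-system parameters lam and k."""
    return (lam * raw - math.log(k)) / math.log(2)


def e_value(bits, query_length, database_length):
    """Expected number of chance hits scoring at least `bits` in this search space."""
    return query_length * database_length * 2.0 ** (-bits)
```

The bit score itself does not involve the database size, while the E-value for the same bit score grows in proportion to the size of the search space.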

24

How should E-values be interpreted?

Smaller values indicate more significant matches.

25

What is the rule of thumb for E-values?

An E-value much less than 1 indicates meaningful similarity.

26

Why must BLAST results be interpreted using multiple metrics together?

Because score, coverage, and E-value each describe different aspects of alignment quality.

27

What does BQE stand for in BLAST interpretation?

Bit score (strength), Query coverage (extent), E-value (significance).

28

What is a multiple sequence alignment (MSA)?

An alignment of more than two sequences simultaneously.

29

Why are MSAs more informative than pairwise alignments?

They reveal conserved regions and shared evolutionary constraints.

30

What types of regions are identified using MSAs?

Conserved regions, functional domains, and evolutionary relationships.

31

Why should MSAs include both closely and distantly related sequences?

Close sequences provide signal; distant sequences provide variation.

32

Why are sequences that are too similar problematic in MSA?

They add redundancy and little new information.

33

Why are sequences that are too different problematic in MSA?

They make alignments unreliable.

34

What is the core strategy used by Clustal Omega?

Progressive alignment.

35

What are the steps of progressive multiple sequence alignment?

Compute pairwise similarities, build a guide tree, then align sequences step-by-step.
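
The first two steps can be sketched as below. This is a toy version, not Clustal Omega's actual method: the helper names `kmer_distance` and `guide_tree`, the use of shared 2-mers as the similarity measure, and average-linkage joining are all assumptions for illustration. The third step, aligning sequences and profiles in the order given by the tree, is left out.

```python
from itertools import combinations


def kmer_distance(a, b, k=2):
    """Distance in [0, 1]: one minus the Jaccard similarity of the k-mer sets."""
    ka = {a[i:i + k] for i in range(len(a) - k + 1)}
    kb = {b[i:i + k] for i in range(len(b) - k + 1)}
    union = ka | kb
    return 1.0 - len(ka & kb) / len(union) if union else 0.0


def guide_tree(seqs, k=2):
    """seqs maps name -> sequence. Returns the join order as nested tuples."""
    # Each cluster is keyed by its (sub)tree and stores its member names.
    clusters = {name: [name] for name in seqs}

    def average_distance(x, y):
        pairs = [(m, n) for m in clusters[x] for n in clusters[y]]
        return sum(kmer_distance(seqs[m], seqs[n], k) for m, n in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters (average linkage).
        x, y = min(combinations(clusters, 2), key=lambda pair: average_distance(*pair))
        clusters[(x, y)] = clusters.pop(x) + clusters.pop(y)
    return next(iter(clusters))
```

The most similar sequences are joined first, so they would also be aligned first, and more distant sequences are added later.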

36

What is a major limitation of progressive alignment?

Early alignment errors are locked in and cannot be corrected.

37

Why are MSAs not guaranteed to be optimal?

Because they rely on heuristic, stepwise alignment rather than exhaustive searching.

38

What defines a conserved region in an alignment?

Regions with many matches, few gaps, and similar residues across sequences.
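
One simple way to quantify this per column is sketched below. The function name `column_conservation` is an assumption, and the measure is deliberately basic: it reports the fraction of sequences sharing the most common residue and the fraction of gaps. It does not account for chemically similar residues, which would need a substitution matrix.

```python
from collections import Counter


def column_conservation(alignment):
    """Return (top_residue_fraction, gap_fraction) for each alignment column.

    alignment is a list of equal-length strings that use '-' for gaps.
    """
    n = len(alignment)
    result = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        gap_fraction = (n - len(residues)) / n
        # Count of the most frequent residue, ignoring gaps.
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        result.append((top / n, gap_fraction))
    return result
```

Columns with a high first value and a low second value would be candidates for a conserved region.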

39

Why are conserved regions biologically important?

They are under strong selection and often correspond to functional or structural roles.

40

What types of protein features are commonly conserved?

Active sites, binding regions, and structural cores.

41

What defines a variable region in an alignment?

Regions with many mismatches, many gaps, and variable lengths.

42

Why are variable regions less constrained evolutionarily?

Changes in these regions usually do not disrupt function.

43

What protein features are often found in variable regions?

Loops, linkers, surface regions, and regulatory or spacer regions.

44

How do alignments relate to biological meaning beyond scores?

Clusters of matches indicate conserved regions, while scattered matches indicate variability.

45

What is a consensus sequence?

A sequence made up of the most common residue at each position in an alignment.
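
A minimal sketch of a majority-rule consensus is shown below. The function name `consensus` is an assumption, gaps are treated as just another symbol, and ties go to whichever symbol appears first in the column. Real tools often apply thresholds or ambiguity codes instead.

```python
from collections import Counter


def consensus(alignment):
    """Return the most common symbol in each column of an alignment.

    alignment is a list of equal-length strings.
    """
    result = []
    for column in zip(*alignment):
        # most_common(1) returns the symbol with the highest count;
        # ties go to the symbol seen first in the column.
        result.append(Counter(column).most_common(1)[0][0])
    return "".join(result)
```

For example, `consensus(["ACGT", "ACGA", "TCGA"])` takes `A` in the first column (2 of 3) and `A` in the last column (2 of 3).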

46

Why is a consensus sequence not the same as an ancestral sequence?

It reflects frequency, not evolutionary history.

47

What factors influence a consensus sequence?

Species composition and phylogenetic bias.

48

Why can consensus sequences be misleading?

If one group dominates the alignment, the consensus reflects that group rather than true conservation.
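
The effect of a dominant group can be illustrated by counting residues two ways: once per sequence, and once per group. The function name `weighted_consensus`, the group labels, and the equal-weight-per-group scheme are assumptions made for this example and are not taken from the lecture.

```python
from collections import Counter, defaultdict


def weighted_consensus(alignment, groups):
    """Consensus in which each group has a total weight of 1.

    groups[i] is the group label of alignment[i]; a group's weight is
    split equally among its member sequences.
    """
    group_sizes = Counter(groups)
    result = []
    for column in zip(*alignment):
        weights = defaultdict(float)
        for residue, group in zip(column, groups):
            weights[residue] += 1.0 / group_sizes[group]
        result.append(max(weights, key=weights.get))
    return "".join(result)


# Four sequences from one group and one each from two others (made-up data).
alignment = ["ACGT", "ACGT", "ACGT", "ACGT", "ACCT", "ACCT"]
per_sequence = weighted_consensus(alignment, [0, 1, 2, 3, 4, 5])
per_group = weighted_consensus(alignment, ["m", "m", "m", "m", "f", "b"])
```

Counting per sequence, the third column follows the four-member group (`G`). Counting per group, two of the three groups have `C`, so the consensus changes.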

49

What are consensus sequences commonly used for?

Motif discovery, functional annotation, and profile-based models.

50