What is dynamic programming in sequence alignment?
An exact method that fills a scoring matrix of subproblems, implicitly considering every possible alignment, and so guarantees the optimal alignment score for the chosen scoring scheme.
Why are dynamic programming alignment algorithms computationally expensive?
Because every cell of the scoring matrix must be filled, so time and memory grow with the product of the two sequence lengths.
What type of alignment does Needleman–Wunsch perform?
Global alignment that aligns sequences end-to-end.
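A minimal sketch of the global (end-to-end) recurrence, assuming a simple match/mismatch score and a linear gap penalty. The function name and the default scores are illustrative choices, not values from this deck.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # First column and row: a prefix aligned against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    # The global score is read from the bottom-right cell: both sequences fully consumed.
    return score[-1][-1]
```

For example, `needleman_wunsch_score("ACGT", "AGT")` pays for one gap because the alignment must span both sequences from end to end.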
What type of alignment does Smith–Waterman perform?
Local alignment that finds the best matching region(s) within sequences.
What key modification distinguishes Smith–Waterman from Needleman–Wunsch?
Negative cell scores are reset to zero and traceback starts from the highest-scoring cell, allowing detection of local similarity.
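A minimal sketch of the local variant, assuming the same kind of match/mismatch and linear gap scores (the defaults are illustrative). The only changes from the global recurrence are the zero floor in each cell and taking the best cell anywhere in the matrix.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between a and b."""
    best = 0
    prev = [0] * (len(b) + 1)  # row 0 is all zeros: a local alignment can start anywhere
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 option resets a negative running score, ending the local alignment.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Only two rows are kept because this sketch returns the score alone; recovering the aligned region would need the full matrix and a traceback.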
How is sequence alignment represented in dynamic programming algorithms?
As a path through a scoring matrix where different paths correspond to different alignments.
Why can multiple optimal alignments exist in dynamic programming?
Because different paths through the matrix can yield the same maximum score.
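One way to see tied paths is to count them while filling the matrix. The sketch below assumes a global alignment with linear gaps and illustrative scores, and treats each distinct path through the matrix as a distinct alignment.

```python
def count_optimal_global_alignments(a, b, match=1, mismatch=-1, gap=-1):
    """Return (optimal score, number of matrix paths that reach it)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    ways = [[0] * cols for _ in range(rows)]
    ways[0][0] = 1
    for i in range(1, rows):
        score[i][0], ways[i][0] = i * gap, 1
    for j in range(1, cols):
        score[0][j], ways[0][j] = j * gap, 1
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            candidates = [
                (score[i - 1][j - 1] + sub, ways[i - 1][j - 1]),  # diagonal
                (score[i - 1][j] + gap, ways[i - 1][j]),          # gap in b
                (score[i][j - 1] + gap, ways[i][j - 1]),          # gap in a
            ]
            best = max(s for s, _ in candidates)
            score[i][j] = best
            # Every predecessor that ties for the best score contributes its paths.
            ways[i][j] = sum(w for s, w in candidates if s == best)
    return score[-1][-1], ways[-1][-1]
```

With these scores, `"AA"` against `"A"` has two equally good alignments, because the single gap can sit at either position.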
What components are required to score alignments in dynamic programming?
A scoring scheme: match/mismatch scores or a substitution matrix (such as BLOSUM or PAM), plus gap penalties.
Why are dynamic programming algorithms unsuitable for large database searches?
They are too slow to apply exhaustively to millions of sequences.
Why are heuristic alignment algorithms needed?
To enable fast approximate searches across large databases.
What is a heuristic alignment algorithm?
A fast alignment method that sacrifices guaranteed optimality for speed.
What is BLAST?
Basic Local Alignment Search Tool used for fast similarity searches in sequence databases.
What types of data are BWA and Bowtie designed for?
Short reads from next-generation sequencing.
What is the core idea behind BLAST’s speed?
It searches only promising regions instead of performing full dynamic programming everywhere.
What is the first step of BLAST?
The query sequence is broken into short words.
What happens after BLAST finds exact word matches?
Matches are extended into longer alignments allowing mismatches and gaps.
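The word-then-extend idea can be sketched as follows. This is a simplified illustration, not BLAST's actual implementation: it uses exact word hits only, extends without gaps, and stops with a hypothetical `x_drop` cutoff. All function names, scores, and the word size are assumptions.

```python
def words(seq, k):
    """Return (position, word) for every overlapping word of length k."""
    return [(i, seq[i:i + k]) for i in range(len(seq) - k + 1)]

def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) for every exact word shared by both sequences."""
    index = {}
    for j, w in words(subject, k):
        index.setdefault(w, []).append(j)
    return [(i, j) for i, w in words(query, k) for j in index.get(w, [])]

def extend_ungapped(query, subject, i, j, k=3, match=1, mismatch=-2, x_drop=3):
    """Extend a seed in both directions without gaps.

    Returns (score, query_start, query_end) with query_end exclusive.
    """
    def grow(qi, sj, step):
        best = run = best_len = length = 0
        while 0 <= qi < len(query) and 0 <= sj < len(subject):
            run += match if query[qi] == subject[sj] else mismatch
            length += 1
            if run > best:
                best, best_len = run, length
            elif best - run >= x_drop:
                break  # score fell too far below the best seen: stop extending
            qi += step
            sj += step
        return best, best_len

    right_score, right_len = grow(i + k, j + k, +1)
    left_score, left_len = grow(i - 1, j - 1, -1)
    score = k * match + left_score + right_score
    return score, i - left_len, i + k + right_len
```

For the query `"CCGATTACAGG"` and subject `"TTGATTACATT"`, the seed at `(2, 2)` extends to cover `GATTACA`. A gapped extension step, which the card above mentions, is left out of this sketch.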
Why are low-complexity regions filtered before BLAST searches?
Because repetitive regions can generate false similarity.
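As a rough illustration of what "low complexity" means, the sketch below flags windows with low Shannon entropy. This is a stand-in technique chosen for brevity, not the filter BLAST itself applies; the window size and threshold are arbitrary assumptions.

```python
import math
from collections import Counter

def shannon_entropy(window):
    """Entropy in bits of the residue composition of a window."""
    counts = Counter(window)
    n = len(window)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def low_complexity_windows(seq, size=12, threshold=1.0):
    """Return start positions of windows whose entropy is below the threshold."""
    return [
        i for i in range(len(seq) - size + 1)
        if shannon_entropy(seq[i:i + size]) < threshold
    ]
```

A run such as `AAAAAAAA` has entropy 0 and is flagged everywhere, while `ACGTACGT` uses all four bases evenly and is not.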
What is the raw score in BLAST?
The sum of match scores minus the penalties for mismatches and gaps (insertions and deletions).
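A small sketch of summing a raw score column by column over an already-aligned pair. It assumes an affine gap cost (a gap of length L costs one opening penalty plus L extension penalties); the function name and all score values are illustrative assumptions.

```python
def raw_score(aligned_query, aligned_subject,
              match=2, mismatch=-3, gap_open=-5, gap_extend=-2):
    """Sum the raw score of two gapped, equal-length alignment rows ('-' marks a gap)."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned rows must have the same length")
    total = 0
    prev_gap = None  # which row held a gap in the previous column, if any
    for x, y in zip(aligned_query, aligned_subject):
        if x == "-" or y == "-":
            row = "query" if x == "-" else "subject"
            # Continuing a gap in the same row costs only the extension penalty.
            total += gap_extend if prev_gap == row else gap_open + gap_extend
            prev_gap = row
        else:
            total += match if x == y else mismatch
            prev_gap = None
    return total
```

For instance, `raw_score("ACGTTA", "AC--TA")` adds four matches (8) and one gap of length two (-9), giving -1.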
What does a higher raw BLAST score indicate?
A better alignment.
What is a bit score in BLAST?
A raw score normalised by the statistical parameters of the scoring system, so it does not depend on the scoring matrix or database size.
Why is the bit score useful?
It allows comparison of alignment strength across different searches.
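The normalisation can be sketched with the standard Karlin-Altschul relation, bit score = (λ·S − ln K) / ln 2, where S is the raw score. The default λ and K below are values commonly quoted for BLOSUM62 with gapped alignments and are used here only as illustrative assumptions; real searches report their own parameters.

```python
import math

def bit_score(raw, lam=0.267, k=0.041):
    """Convert a raw score to a bit score using the scoring system's lambda and K."""
    return (lam * raw - math.log(k)) / math.log(2)
```

Because λ and K absorb the scale of the scoring system, bit scores from searches that used different matrices can be compared directly.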
What is query coverage in BLAST?
The percentage of the query sequence included in the alignment.
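A minimal sketch of the calculation, assuming the aligned segments are given as 1-based inclusive query ranges and that overlapping segments should be counted once. The function name and input format are hypothetical.

```python
def query_coverage(query_length, aligned_ranges):
    """Percentage of query positions covered by at least one aligned range."""
    covered = 0
    last_end = 0  # highest query position counted so far
    for start, end in sorted(aligned_ranges):
        start = max(start, last_end + 1)  # skip positions already counted
        if end >= start:
            covered += end - start + 1
            last_end = end
    return 100.0 * covered / query_length
```

For a 100-residue query with segments `(1, 50)` and `(40, 80)`, positions 40 to 50 are counted once and the coverage is 80%.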
What is the E-value in BLAST?
The number of hits with an equal or better score expected to occur by chance in a database of that size.
How should E-values be interpreted?
Smaller values indicate more significant matches.
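The relationship between score, search space, and E-value can be sketched with the standard formula E = m · n · 2^(−bit score), where m is the query length and n the database length. Using plain lengths rather than the effective lengths a real search would report is a simplifying assumption, as are the names below.

```python
def e_value(bit_score, query_length, database_length):
    """Expected number of chance hits scoring at least bit_score in this search space."""
    return query_length * database_length * 2.0 ** (-bit_score)
```

Each extra bit of score halves the E-value, while doubling the database doubles it, which is why the same alignment looks less significant in a larger database.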
What is the rule of thumb for E-values?
An E-value much less than 1 indicates meaningful similarity.
Why must BLAST results be interpreted using multiple metrics together?
Because score, coverage, and E-value each describe different aspects of alignment quality.
What does BQE stand for in BLAST interpretation?
Bit score (strength), Query coverage (extent), E-value (significance).
What is a multiple sequence alignment (MSA)?
An alignment of more than two sequences simultaneously.
Why are MSAs more informative than pairwise alignments?
They reveal conserved regions and shared evolutionary constraints.
What types of regions are identified using MSAs?
Conserved regions, functional domains, and evolutionary relationships.
Why should MSAs include both closely and distantly related sequences?
Close sequences provide signal; distant sequences provide variation.
Why are sequences that are too similar problematic in MSA?
They add redundancy and little new information.
Why are sequences that are too different problematic in MSA?
They make alignments unreliable.
What is the core strategy used by Clustal Omega?
Progressive alignment.
What are the steps of progressive multiple sequence alignment?
Compute pairwise similarities, build a guide tree, then align sequences step-by-step.
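The first two steps can be sketched as below. This is a toy illustration rather than Clustal Omega's own procedure: the distance is a simple k-mer set comparison, the tree is built by average-linkage clustering, and every name and parameter is an assumption. Sequences are assumed to be at least `k` residues long.

```python
from itertools import combinations

def kmer_set(seq, k=2):
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def kmer_distance(a, b, k=2):
    """1 minus the fraction of k-mers shared by the two sequences (Jaccard distance)."""
    ka, kb = kmer_set(a, k), kmer_set(b, k)
    return 1.0 - len(ka & kb) / len(ka | kb)

def guide_tree(seqs, k=2):
    """seqs maps name -> sequence. Returns nested tuples giving the merge order."""
    clusters = [(name, [name]) for name in seqs]  # (subtree, member names)

    def avg_dist(c1, c2):
        pairs = [(x, y) for x in c1[1] for y in c2[1]]
        return sum(kmer_distance(seqs[x], seqs[y], k) for x, y in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two closest clusters; their sequences would be aligned first.
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]))
        merged = ((clusters[i][0], clusters[j][0]), clusters[i][1] + clusters[j][1])
        clusters = [c for n, c in enumerate(clusters) if n not in (i, j)] + [merged]
    return clusters[0][0]
```

The third step, aligning sequences and then profiles in the order the tree dictates, is omitted here. The innermost pair in the returned tuples is the pair that would be aligned first.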
What is a major limitation of progressive alignment?
Early alignment errors are locked in and cannot be corrected.
Why are MSAs not guaranteed to be optimal?
Because they rely on heuristic, stepwise alignment rather than exhaustive searching.
What defines a conserved region in an alignment?
Regions with many matches, few gaps, and similar residues across sequences.
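A simple per-column measure can make this concrete. The sketch below scores each column by the fraction of sequences sharing its most common residue, with gaps counting against conservation. It uses identity only, so it ignores the "similar residues" part of the definition, and the function name is an assumption.

```python
from collections import Counter

def column_conservation(alignment):
    """Return, for each column, the fraction of rows holding the most common residue."""
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        top = Counter(residues).most_common(1)[0][1] if residues else 0
        scores.append(top / len(column))
    return scores
```

Runs of columns scoring near 1.0 are candidate conserved regions; columns with low scores or many gaps are variable.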
Why are conserved regions biologically important?
They are under strong selection and often correspond to functional or structural roles.
What types of protein features are commonly conserved?
Active sites, binding regions, and structural cores.
What defines a variable region in an alignment?
Regions with many mismatches, many gaps, and variable lengths.
Why are variable regions less constrained evolutionarily?
Changes in these regions usually do not disrupt function.
What protein features are often found in variable regions?
Loops, linkers, surface regions, and regulatory or spacer regions.
How do alignments relate to biological meaning beyond scores?
Clusters of matches indicate conserved regions, while scattered matches indicate variability.
What is a consensus sequence?
The most common residue at each position in an alignment.
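A minimal sketch of building a majority-rule consensus from equal-length aligned rows. The function name is an assumption, gaps are treated like any other symbol, and ties are not handled specially.

```python
from collections import Counter

def consensus(alignment):
    """Return the most common symbol in each column of the alignment."""
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*alignment)
    )
```

For the rows `ACGT`, `ACGA`, and `TCGA`, the consensus is `ACGA`: each position simply takes the majority symbol.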
Why is a consensus sequence not the same as an ancestral sequence?
It reflects frequency, not evolutionary history.
What factors influence a consensus sequence?
Which sequences are included in the alignment: species composition and phylogenetic sampling bias.
Why can consensus sequences be misleading?
If one group dominates the alignment, the consensus reflects that group rather than true conservation.
What are consensus sequences commonly used for?
Motif discovery, functional annotation, and profile-based models.