bioinformatics final exam condensed focusing on slides he told us to focus on

102 Terms

1
New cards

Heuristic search

A rapid, approximate search strategy that favors speed over guaranteed optimal alignment.

2
New cards

Why BLAST uses heuristics

Allows extremely fast identification of local similarities in huge databases.

3
New cards

Seed-and-extend method

BLAST finds short matching “seeds” then extends them outward to form alignments.
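
A minimal sketch of the seed-and-extend idea, assuming exact k-mer seeds and extension that stops at the first mismatch. Real BLAST scores neighborhood words with a substitution matrix and stops extension with a score drop-off rule, so the function names and the stopping rule here are illustrative simplifications, not BLAST's actual implementation.

```python
def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where a k-mer matches exactly."""
    # Index every k-mer of the subject by its start position.
    index = {}
    for j in range(len(subject) - k + 1):
        index.setdefault(subject[j:j + k], []).append(j)
    seeds = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            seeds.append((i, j))
    return seeds


def extend_seed(query, subject, i, j, k=3):
    """Extend an exact k-mer match left and right while characters still match.

    Returns (query_start, subject_start, length) of the ungapped match.
    """
    start_q, start_s = i, j
    while start_q > 0 and start_s > 0 and query[start_q - 1] == subject[start_s - 1]:
        start_q -= 1
        start_s -= 1
    end_q, end_s = i + k, j + k
    while end_q < len(query) and end_s < len(subject) and query[end_q] == subject[end_s]:
        end_q += 1
        end_s += 1
    return start_q, start_s, end_q - start_q
```

With the toy sequences "GGACGTACC" and "TTACGTAGG", the seed at (2, 2) extends to the five-letter match "ACGTA".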

4
New cards

Local alignment

Focuses on matching the most similar regions between sequences, not the entire length.
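
One way to make local alignment concrete is the Smith-Waterman recurrence, which the card does not name and which BLAST only approximates. The sketch below assumes a simple match/mismatch score and a linear gap penalty; the scoring values are arbitrary illustrations.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between a and b with a linear gap penalty."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets an alignment restart anywhere, which is what makes it local.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

Because cell scores are clamped at zero, poorly matching flanks do not drag down the score of a well-matching core region.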

5
New cards

BLAST purpose

Finds homologous sequences and identifies evolutionary or functional relationships.

6
New cards

E-value meaning

Expected number of hits scoring at least this well that would arise by chance in a database of this size; a lower E-value means a more significant hit.
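
The E-value can be sketched with the Karlin-Altschul formula E = K * m * n * exp(-lambda * S), where m is the query length, n the database length, and S the raw alignment score. The default K and lambda below are placeholder values chosen for illustration; BLAST reports the actual values for the scoring system used in each search.

```python
import math


def expected_hits(score, query_len, db_len, k=0.041, lam=0.267):
    """E-value from a raw score: E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * score)


def bit_score(score, k=0.041, lam=0.267):
    """Normalized bit score, so that E = m * n * 2 ** (-bits)."""
    return (lam * score - math.log(k)) / math.log(2)
```

Two consequences follow directly from the formula: E falls exponentially as the score rises, and E doubles when the database doubles in size, so the same alignment is less significant in a larger database.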

7
New cards

Importance of E-value

Used to distinguish biologically meaningful matches from random noise.

8
New cards

EXPASY definition

A bioinformatics resource portal (run by the SIB Swiss Institute of Bioinformatics) linking major protein tools, including UniProt, PROSITE, and PDB.

9
New cards

UniProt importance

Primary resource for protein sequence, structure, function, variants, and domain information.

10
New cards

Swiss-Prot vs TrEMBL

Swiss-Prot is manually reviewed; TrEMBL is computationally annotated.

11
New cards

Prosite definition

Database of protein domains, motifs, and profiles used for identifying protein families.

12
New cards

Functional domains

Conserved protein regions responsible for specific biochemical functions.

13
New cards

PDB definition

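Archive of experimentally determined 3D structures of proteins and nucleic acids (X-ray crystallography, NMR, cryo-EM). Predicted models such as AlphaFold's are computed rather than experimental; the RCSB PDB site displays them alongside the experimental entries.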

14
New cards

When PDB is used

To examine active sites, domain organization, and ligand interactions.

15
New cards

BLASTN purpose

Compares nucleotide sequences to identify genes, family members, or splice variants.

16
New cards

BLASTX purpose

Translates a nucleotide query in all six reading frames and compares the translations against a protein database to detect potential coding regions.
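
The translation step can be sketched as a six-frame translation: three frames on the given strand and three on its reverse complement. This sketch assumes the standard genetic code and plain A/C/G/T input; the function names are hypothetical, and the database search that BLASTX performs afterwards is not shown.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons; unknown codons become 'X', stops become '*'."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


def six_frame_translate(dna):
    """Return translations for frames +1..+3 and -1..-3."""
    dna = dna.upper()
    reverse = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse[offset:])
    return frames
```

For the toy input "ATGGCCTAA", frame +1 reads "MA*" (Met, Ala, stop), while the other frames give unrelated peptides; a long stretch without "*" in one frame is the kind of signal that suggests a coding region.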

17
New cards

BLASTP purpose

Compares protein sequences to find orthologs, paralogs, and conserved domains.

18
New cards

Importance of annotation

Makes BLAST results interpretable by showing coding regions, domains, and features.

19
New cards

Alternative splicing in BLAST

Seen when family members align strongly but have different exon structures.

20
New cards

Systems biology definition

An interdisciplinary approach that models entire biological systems rather than single components.

21
New cards

Holistic property

System behavior emerges from interactions, not isolated parts.

22
New cards

Interdisciplinary nature

Integrates biology, math, computing, engineering, and statistics.

23
New cards

Emergent properties

New functions or behaviors that arise from interacting components.

24
New cards

Positive feedback

Amplifies system output and can push systems toward new states.

25
New cards

Negative feedback

Stabilizes systems and maintains homeostasis.

26
New cards

Chaos definition

System behavior highly sensitive to initial conditions; long-term outcomes unpredictable.
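
Sensitivity to initial conditions can be illustrated with the logistic map x -> r * x * (1 - x), a classic toy population model that is not mentioned in the source. The sketch assumes r = 4, a value in the chaotic regime, and follows two trajectories whose starting points differ by one part in a billion.

```python
def logistic_map(x0, r=4.0, steps=60):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs


a = logistic_map(0.2)
b = logistic_map(0.2 + 1e-9)
# Distance between the two trajectories at each step.
gaps = [abs(x - y) for x, y in zip(a, b)]
```

The gap starts at about 1e-9 and grows roughly exponentially until the two trajectories are unrelated, even though the rule itself is fully deterministic.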

27
New cards

Attractors

Stable system states toward which dynamic systems tend to move.

28
New cards

Tipping point

Critical threshold where a small input causes a large system transition.

29
New cards

Why modeling is needed

Systems are too complex for intuition alone; require mathematical simulation.

30
New cards

Data integration importance

Combining omics datasets gives a more complete picture of system behavior than any single data type can.

31
New cards

Genomics role

Provides the blueprint of potential cellular functions.

32
New cards

Transcriptomics role

Shows which genes are actively expressed.

33
New cards

Proteomics role

Shows functional proteins actually produced.

34
New cards

Metabolomics role

Reveals biochemical activity and pathway flux.

35
New cards

GLP-1 pathway modeling

Predicts metabolic effects, glucose levels, and drug response profiles.

36
New cards

Modeling dosage effects

Helps determine therapeutic windows and toxicity boundaries.

37
New cards

Metagenomics definition

Sequencing DNA directly from the environment to study entire microbial communities.

38
New cards

Metagenomics advantage

Reveals diversity beyond what can be cultured.

39
New cards

Properties of complex systems

Self-regulating, modular, feedback-driven, capable of emergent behavior.

40
New cards

Evolution definition

Change in genetic composition of populations over generations.

41
New cards

Descent with modification

Organisms inherit traits but accumulate changes over time.

42
New cards

Mutation’s role

Ultimate source of new alleles, and therefore of all new genetic variation in evolution.

43
New cards

Gene flow

Migration of individuals moves alleles between populations.

44
New cards

Genetic drift

Random fluctuations in allele frequency; strongest in small populations.
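
Drift can be sketched with a Wright-Fisher style simulation, a standard textbook model that the card does not name. The sketch assumes a diploid population of constant size with no selection, mutation, or migration, so every change in allele frequency comes from random sampling alone.

```python
import random


def wright_fisher(pop_size, p0=0.5, generations=100, seed=0):
    """Allele frequency over time in a diploid population of pop_size individuals."""
    rng = random.Random(seed)
    copies = 2 * pop_size  # number of gene copies sampled each generation
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each gene copy in the next generation carries the allele with probability p.
        count = sum(rng.random() < p for _ in range(copies))
        p = count / copies
        trajectory.append(p)
    return trajectory
```

Comparing `wright_fisher(10)` with `wright_fisher(1000)` typically shows much larger swings in the small population, often ending in fixation or loss of the allele, which is the sense in which drift is strongest in small populations.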

45
New cards

Natural selection

Differential reproductive success based on heritable traits.

46
New cards

Gradualism

Slow, continuous evolutionary change.

47
New cards

Punctuated equilibrium

Rapid evolutionary bursts followed by long periods of stasis.

48
New cards

Stasis

Species remain relatively unchanged over long time periods.

49
New cards

Allopatric speciation

Formation of new species in geographically isolated populations.

50
New cards

Peripatric speciation

Small peripheral populations evolve quickly due to strong drift.

51
New cards

Sympatric speciation

New species form without geographic isolation.

52
New cards

Phylogenetic tree

Diagram showing evolutionary relationships based on shared ancestry.

53
New cards

Cladogram

Shows branching order based on shared derived traits.

54
New cards

Distance method steps

Align sequences → count differences → cluster closest taxa → build tree.
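
The first three steps above can be sketched as follows, assuming the sequences are already aligned and using the simple proportion of differing sites (p-distance) as the distance. The taxon names and sequences are made-up toy data; a full method such as UPGMA or neighbor joining (neither is named in the source) would merge the closest pair and repeat until the tree is complete.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of aligned positions that differ (columns with a gap are skipped)."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    return sum(x != y for x, y in pairs) / len(pairs)


def closest_pair(aligned):
    """Return the pair of taxa with the smallest distance, plus all distances."""
    distances = {
        (s, t): p_distance(aligned[s], aligned[t])
        for s, t in combinations(aligned, 2)
    }
    pair = min(distances, key=distances.get)
    return pair, distances


toy = {
    "human": "ACGTACGT",
    "chimp": "ACGTACGA",
    "mouse": "ACTTGCGA",
}
pair, distances = closest_pair(toy)
```

Here human and chimp differ at 1 of 8 sites (0.125), so they are clustered first; mouse joins afterwards.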

55
New cards

Parsimony definition

Tree requiring the fewest evolutionary changes is preferred.

56
New cards

Why parsimony works

Assumes simplest explanation is most likely correct.

57
New cards

Informative site

Position with at least two different character states, each shared by at least two taxa, so that it favors some tree topologies over others.
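
A common working criterion for a parsimony-informative site is that the column has at least two character states that each occur in at least two taxa. The sketch below applies that criterion to a toy alignment; the label names are arbitrary choices made here.

```python
from collections import Counter


def classify_sites(alignment):
    """Label each column 'invariant', 'informative', or 'variable-uninformative'."""
    labels = []
    for column in zip(*alignment):
        counts = Counter(column)
        # Number of character states that appear in two or more taxa.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if len(counts) == 1:
            labels.append("invariant")
        elif shared_states >= 2:
            labels.append("informative")
        else:
            labels.append("variable-uninformative")
    return labels


toy_alignment = ["AAG", "AAG", "AGA", "ACA"]
site_labels = classify_sites(toy_alignment)
```

In the toy alignment the first column is identical in all four taxa, the second has changes unique to single taxa, and only the third (G, G, A, A) splits the taxa into two groups of two.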

58
New cards

Non-informative site

Position that is identical across all taxa, or whose variant states each occur in only one taxon, so every tree requires the same number of changes there.

59
New cards

Molecular clock concept

Genetic differences accumulate at roughly constant rates over time.

60
New cards

Calibrating a molecular clock

Use fossils or known divergence times to calculate the substitution rate, then apply that rate to date other divergences.
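
As a worked sketch of calibration, assume a strict clock: two lineages that split T years ago have each accumulated changes for T years, so the pairwise distance d is about 2 * rate * T. The function names and all numbers below are invented for illustration.

```python
def substitution_rate(distance, calibration_time):
    """Rate per site per unit time from a pair with a known divergence time."""
    return distance / (2 * calibration_time)


def estimate_divergence_time(distance, rate):
    """Divergence time for another pair, assuming the same rate applies."""
    return distance / (2 * rate)


# Calibration pair: 0.02 substitutions per site, fossil-dated split 10 Myr ago.
rate = substitution_rate(0.02, 10.0)
# Undated pair with 0.05 substitutions per site.
estimated_time = estimate_divergence_time(0.05, rate)
```

With these numbers the rate is 0.001 substitutions per site per million years, and the undated pair is estimated to have diverged about 25 million years ago.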

61
New cards

Molecular clock limitations

Rates are not constant; different genes evolve differently; selection pressures exist.

62
New cards

Metagenomics significance

Reveals unculturable organisms and novel pathways.

63
New cards

Protein conservation reason

Functional constraints limit acceptable mutations.

64
New cards

RNA secondary structure evolution

Includes compensatory mutations and stability-driven changes.

65
New cards

DNA → RNA → protein

The fundamental flow of biological information.

66
New cards

Localization signals

Codes directing proteins to specific cellular compartments.

67
New cards

Coding as layers

Multiple “languages” inside a cell: genetic, regulatory, structural, signaling.

68
New cards

Synthetic biology definition

Engineers biological systems using standardized parts and design principles.

69
New cards

Goal of synthetic biology

Make biology programmable and predictable like engineering.

70
New cards

BioBrick concept

Standardized DNA parts designed for modular assembly.

71
New cards

Base vector

A plasmid backbone enabling construction of new BioBrick-compatible vectors.

72
New cards

BioBrick assembly methods

Standard assembly using four restriction enzymes (EcoRI, XbaI, SpeI, PstI) or 3A (three-antibiotic) assembly.

73
New cards

Verification of assembly

Antibiotic selection, colony PCR, sequencing.

74
New cards

Oncolytic virus definition

Engineered virus that infects, replicates in, and kills tumor cells selectively.

75
New cards

Tumor-selective replication

Achieved via deleted virulence genes or tumor-specific promoters.

76
New cards

Therapeutic transgenes

Boost immune activation or promote tumor destruction.

77
New cards

Safety mechanisms

Deletion of immune-evasion genes, use of suicide switches.

78
New cards

Targeting modifications

Surface protein engineering to bind only tumor cell receptors.

79
New cards

Glycosylation enzyme needed

Glycosyltransferase.

80
New cards

Cell fate pathway

Regulatory program determining cell decisions: division, differentiation, apoptosis.

81
New cards

Structural bioinformatics

Computational analysis of protein structures and interactions.

82
New cards

Orphan receptor definition

Receptor whose natural ligand has not been identified; usually recognized as a receptor through sequence homology to known receptors.

83
New cards

Role of AI in biology

Analyzes massive datasets and learns patterns inaccessible to manual methods.

84
New cards

Artificial intelligence definition

Systems that learn patterns from data to make predictions or decisions.

85
New cards

Machine learning types

Supervised, unsupervised, and reinforcement learning.

86
New cards

Deep learning

AI using multilayer neural networks to learn complex features automatically.

87
New cards

Neural network architecture

Input layer → hidden layers → output layer.

88
New cards

Backpropagation

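Computes the gradient of the prediction error with respect to each weight by applying the chain rule backward through the layers; the gradients are then used to update the weights.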
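
A minimal sketch of backpropagation on a hypothetical network with one input, one tanh hidden unit, and one linear output, trained with a squared-error loss. The architecture, learning rate, and starting weights are arbitrary choices made for illustration.

```python
import math


def forward(x, w1, w2):
    """Return the hidden activation and the network output."""
    h = math.tanh(w1 * x)
    y = w2 * h
    return h, y


def loss(x, target, w1, w2):
    """Squared-error loss for a single example."""
    return 0.5 * (forward(x, w1, w2)[1] - target) ** 2


def backward(x, target, w1, w2):
    """Gradients of the loss with respect to w1 and w2 via the chain rule."""
    h, y = forward(x, w1, w2)
    dloss_dy = y - target
    grad_w2 = dloss_dy * h
    dloss_dh = dloss_dy * w2
    grad_w1 = dloss_dh * (1 - h * h) * x  # derivative of tanh is 1 - tanh^2
    return grad_w1, grad_w2


def train(x, target, w1=0.5, w2=0.5, lr=0.1, steps=200):
    """Plain gradient descent using the backpropagated gradients."""
    for _ in range(steps):
        g1, g2 = backward(x, target, w1, w2)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2
```

The error signal at the output is passed backward one layer at a time, and each weight moves a small step against its own gradient.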

89
New cards

RNN definition

Neural network for sequential data with memory of previous inputs.

90
New cards

LSTM definition

RNN variant designed to learn long-range dependencies.

91
New cards

AI in bioinformatics

Predicts splicing, motifs, gene expression, classification, and structural patterns.

92
New cards

Classification model

Assigns inputs to predefined categories (e.g., SignalP).

93
New cards

SignalP example

Classifies proteins as containing or lacking signal peptides.

94
New cards

PCA purpose

Reduces dimensionality and reveals major patterns in high-dimensional data.

95
New cards

Clustering definition

Groups data based on similarity without labels.

96
New cards

Linear regression

Predicts continuous numerical outcomes from input variables.
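
For a single input variable, the fit can be sketched with the ordinary least-squares formulas: slope = covariance(x, y) / variance(x), and the intercept is chosen so the line passes through the means. The data points are made up.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one input variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Predicted continuous outcome for a new input."""
    return slope * x + intercept


slope, intercept = fit_line([0, 1, 2, 3], [1, 3, 5, 7])
```

The toy points lie exactly on y = 2x + 1, so the fit recovers a slope of 2 and an intercept of 1.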

97
New cards

Logistic regression

Predicts the probability of a binary outcome, such as disease vs healthy, which is then thresholded into a class.

98
New cards

Survival analysis

Models time until an event like relapse or death occurs.
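
One widely used tool for this is the Kaplan-Meier estimator, which the card does not name. The sketch below assumes each subject has a follow-up time and a flag saying whether the event was observed (True) or the subject was censored (False); the data are invented.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each time an event was observed."""
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        observed = sum(1 for ti, e in zip(times, events) if ti == t and e)
        leaving = sum(1 for ti in times if ti == t)
        if observed:
            # Multiply by the fraction of at-risk subjects who survive this time point.
            survival *= 1 - observed / at_risk
            curve.append((t, survival))
        # Both events and censored subjects leave the risk set after time t.
        at_risk -= leaving
    return curve


curve = kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False])
```

Censored subjects still count as at risk until they drop out, which is how the method uses incomplete follow-up instead of discarding it. For the toy data the estimated survival falls to about 0.8, 0.6, and 0.3 at times 2, 3, and 5.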

99
New cards

Supervised vs unsupervised

Supervised uses labeled data; unsupervised identifies structure without labels.

100
New cards

AlphaFold significance

AI system predicting highly accurate protein 3D structures.