bioinformatics quiz #1

Last updated 1:01 AM on 10/17/25

49 Terms


origins of bioinformatics

the earliest foundations (1950–1970) focused primarily on protein sequence analysis, before DNA could be easily sequenced


COMPROTEIN

the first known bioinformatics software (early 1960s), developed by Margaret Dayhoff, designed to assemble whole protein sequences (de novo) from small Edman peptide fragments.


paradigm shift (1970–1980)

bioinformatics began shifting its focus from protein analysis to DNA analysis after the deciphering of the genetic code and the invention of efficient sequencing methods (Sanger)


Needleman-Wunsch (1970)

developed the first dynamic programming algorithm for performing pairwise protein sequence alignments
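
The dynamic programming idea behind this card can be sketched as a small global aligner. This is a minimal illustration, not the 1970 formulation itself: the scoring values (match = 1, mismatch = -1, gap = -1) and the function name `needleman_wunsch` are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (score, aligned_a, aligned_b).

    The scoring values are illustrative assumptions, not published parameters.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap

    def sub(i, j):
        # Substitution score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the table: each cell takes the best of diagonal, up, and left moves.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

Calling `needleman_wunsch("ACGT", "AGT")` aligns the two strings end to end and inserts a gap where the shorter sequence lacks a residue.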


homology: orthology

defined by Walter M. Fitch (1970) as homology resulting from a speciation event


Dayhoff/PAM matrix

the first probabilistic model of amino acid substitutions (Point Accepted Mutations, PAMs), developed by Dayhoff in 1978, which uses probability to measure evolutionary change


De Novo Sequencing

the determination of a full-genome sequence without using a known template or reference sequence


massively parallel / multiplexing

Massively parallel involves multiple processors working simultaneously. Multiplexing combines multiple inputs/samples into a single sequencing run.


overfitting

occurs when a model built on training data shows high accuracy (e.g., classifying stream condition using OTUs) but significantly decreased accuracy when applied to separate validation data, indicating the model is too specific to the initial dataset features


sanger (dideoxy) read length

long reads (~600–1000 bp)


sanger (dideoxy) throughput/cost

low throughput, typically for single samples


sanger (dideoxy) accuracy/output

reads suffer quality loss at the beginning and end of the sequence


sanger (dideoxy) key feature/mechanism

based on chain-terminating dideoxynucleotides (ddNTPs)


illumina (MiSeq) read length

short reads (100–300 bp)


illumina (MiSeq) throughput/cost

high/massively parallel throughput


illumina (MiSeq) accuracy/output

high accuracy


illumina (MiSeq) key feature/mechanism

uses Sequencing by Synthesis / Bridge Amplification where fragments attached to a flow cell are amplified into clusters


Oxford Nanopore (MinION) read length

ultra-long reads (up to 1,000,000 bp/millions of bases)


Oxford Nanopore (MinION) throughput/cost

high throughput, portable, USB-powered


Oxford Nanopore (MinION) accuracy/output

moderate error rate compared to other platforms


Oxford Nanopore (MinION) key feature/mechanism

DNA strand passes through a nanopore; changes in electrical current are decoded into the DNA sequence (called basecalling).


FASTQ file

a file format that incorporates both the nucleotide sequence and the associated quality scores
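
To make the format concrete, here is a minimal reader sketch. It assumes the common layout of four lines per record (an `@` header, the sequence, a `+` separator, and a quality string of the same length) with no line wrapping; the function name `read_fastq` and the two example reads are made up for illustration.

```python
from io import StringIO


def read_fastq(handle):
    """Yield (read_id, sequence, quality_string) tuples.

    Assumes four lines per record with no wrapped sequence lines.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored here
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        yield header[1:], seq, qual


# Two hypothetical reads, each with one quality character per base.
example = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"
records = list(read_fastq(StringIO(example)))
```

Each quality character encodes the score for the base at the same position in the sequence line.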


phred score (Q)

a measure of base-call quality. A Phred score of 20 (Q20) corresponds to a 1% probability of error per base, meaning 99% accuracy in the base call. Q30 corresponds to 99.9% accuracy
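
The relationship between Q and error probability is Q = -10 · log10(P), so P = 10^(-Q/10). The sketch below works through that arithmetic; the function names are illustrative, and the decoder assumes the Phred+33 ASCII offset commonly used in FASTQ files.

```python
import math


def phred_to_error_prob(q):
    """Error probability for a Phred score: P = 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score for an error probability: Q = -10 * log10(P)."""
    return -10 * math.log10(p)


def decode_phred33(quality_string):
    """Convert FASTQ quality characters to scores, assuming a +33 ASCII offset."""
    return [ord(ch) - 33 for ch in quality_string]


for q in (10, 20, 30):
    accuracy = 1 - phred_to_error_prob(q)
    print(f"Q{q}: accuracy {accuracy:.3%}")
```

Each step of 10 in Q lowers the error probability tenfold, which is why Q30 is ten times more accurate than Q20.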


coverage

the average number of reads that align to, or "cover," known reference bases. 50X genome coverage is recommended for robust taxonomic work
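
As a rough worked example, average coverage can be estimated as total sequenced bases divided by reference length. The sketch below assumes every base of every read aligns to the reference, and the 5 Mbp genome and 250 bp read length are hypothetical figures picked only to make the arithmetic easy.

```python
import math


def mean_coverage(read_lengths, reference_length):
    """Average depth: total read bases divided by reference length."""
    return sum(read_lengths) / reference_length


def reads_needed(target_coverage, read_length, reference_length):
    """Number of equal-length reads needed to reach a target average depth."""
    return math.ceil(target_coverage * reference_length / read_length)


# Hypothetical case: 5 Mbp genome, 250 bp reads, 50X target.
n_reads = reads_needed(50, 250, 5_000_000)
print(n_reads)
```

Real coverage is uneven across a genome, so this average is a planning estimate rather than a guarantee for every position.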


single-end reads

sequence in one direction of the fragment


paired-end reads

report sequences from both directions of a DNA fragment, which is valuable for assembly


3 domains of life

archaea, bacteria, and eukarya


why might we move to two domains?

based on evidence that eukaryotes evolved from within the domain Archaea rather than as a separate lineage, suggesting there are only 2 primary domains: Bacteria and Archaea, with Eukarya branching inside of Archaea


membrane bond differences

The cell membranes of Bacteria and Eukarya contain Ester linkages in their phospholipids; Archaea use Ether bonds in their membrane lipids.


what is the cell wall made of?

Peptidoglycan: The polymer forming a mesh-like layer outside the bacterial cell membrane, providing protection against osmotic pressure (preventing cell bursting). It is the target of many antibiotics.

Archaea do not possess peptidoglycan


porins

channels found in the outer membrane of Gram-negative bacteria, facilitating the exchange of nutrients


chemoautotrophy

(metabolic diversity) relies on inorganic compounds for carbon and energy and is utilized only by prokaryotes (bacteria and archaea)


Classic Biological Species Concept

a species is a group of organisms that can interbreed naturally and produce viable, fertile offspring, and are reproductively isolated from other groups

meaningless for microbes primarily because they reproduce asexually.


Horizontal Gene Transfer (HGT)

Mechanisms for new gene acquisition in bacteria, including Transformation (naked DNA uptake), Transduction (DNA transfer via viruses/phage), and Conjugation (DNA transfer via cell-to-cell contact).


chemotaxis

(bacterial movement) describes movement toward chemical attractants or away from repellents


The Species Problem for Microbes

long-standing difficulty in defining what constitutes a microbial species

based on 16S rRNA gene similarity, organisms are often grouped by operational definitions.


Operational Taxonomic Unit (OTU)

Typically defined by sequences being 97% or 99% similar
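
One way to picture a similarity threshold is a greedy clustering sketch like the one below. It is a simplification under stated assumptions: sequences are already aligned and of equal length, identity is counted position by position, and each sequence joins the first existing centroid that meets the threshold. The function names are illustrative and do not reproduce any particular OTU-picking tool.

```python
def percent_identity(a, b):
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def greedy_otus(seqs, threshold=97.0):
    """Assign each sequence to the first centroid at or above the threshold.

    Returns a list of OTU indices, one per input sequence.
    """
    centroids = []
    assignment = []
    for s in seqs:
        for idx, c in enumerate(centroids):
            if percent_identity(s, c) >= threshold:
                assignment.append(idx)
                break
        else:
            # No centroid was similar enough, so this sequence starts a new OTU.
            centroids.append(s)
            assignment.append(len(centroids) - 1)
    return assignment


base = "ACGT" * 25              # 100 bp made-up sequence
near = "NN" + base[2:]          # 98% identical to base
far = "NNNNN" + base[5:]        # 95% identical to base
```

With these made-up sequences, a 97% threshold groups `base` and `near` together, while a 99% threshold separates all three.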


Amplicon Sequence Variant (ASV)

sequences that are 100% identical and function as unique identifiers for taxa


16S rRNA Gene Sequencing

This marker gene is commonly used for taxonomic diversity studies (metabarcoding). The V3-V4 hypervariable region (approx. 464 bp) is typically targeted in short-read sequencing (Illumina MiSeq). Full-length 16S sequencing (V1-V9 regions, approx. 1465 bp) is possible with long-read platforms like MinION.


Long Read Advantage for 16S rRNA Gene Sequencing

Sequencing the near full-length 16S rRNA gene (MinION) generally provides significantly higher taxonomic resolution at the species level compared to short-read sequencing that analyzes only partial regions (Illumina MiSeq)


Phyla Dominance in Streams

Proteobacteria dominated both water and sediment stream samples in the Maryland study


Sediment microbial communities

proved much better at predicting ecological condition (BIBI scores) than water column samples


Alpha Diversity

Measures diversity within a single sample

Measures include species richness and evenness
Indices often used: Shannon, Simpson, Chao1, ACE
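
Richness and two of the indices named above can be computed from a vector of per-taxon counts, as in this minimal sketch. The counts are invented for illustration, Shannon is computed with the natural logarithm, and Simpson is shown in the 1 - Σp² form, which is one of several conventions in use.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H' = -sum(p * ln p) over observed taxa."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Simpson diversity as 1 - sum(p^2); other forms of the index exist."""
    total = sum(counts)
    return 1 - sum((c / total) ** 2 for c in counts)


even_sample = [25, 25, 25, 25]    # four taxa, perfectly even
skewed_sample = [97, 1, 1, 1]     # same richness, one dominant taxon
```

Both samples have the same richness, but the skewed sample scores lower on Shannon and Simpson because its evenness is lower.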


species richness

number of phylotypes


species evenness

relative abundance/distribution of individuals per phylotype


beta diversity

measures the difference or change in diversity of species between communities (e.g., between two different environments)

a high beta diversity measure indicates low similarity between the two communities
Indices include Jaccard, Bray-Curtis, Euclidean, and UniFrac distances
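
Two of these distances are simple enough to sketch directly from count vectors. The example assumes both samples list the same taxa in the same order; Jaccard is computed on presence/absence and Bray-Curtis on abundances. The sample counts are invented for illustration.

```python
def jaccard_distance(a, b):
    """1 - (shared taxa / taxa present in either sample), using presence/absence."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    union = present_a | present_b
    if not union:
        return 0.0  # two empty samples are treated as identical
    return 1 - len(present_a & present_b) / len(union)


def bray_curtis(a, b):
    """Sum of absolute count differences divided by the sum of all counts."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0


site_1 = [6, 4, 0]   # counts for three taxa at one site
site_2 = [3, 4, 3]   # counts for the same three taxa at another site
```

Both functions return 0 for identical communities and 1 for communities with no taxa in common, so higher values mean lower similarity.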


quorum sensing (QS)

a system of cell-to-cell communication where bacteria regulate specific group behaviors in response to population density

QS uses small signal molecules called autoinducers
Regulated behaviors include biofilm formation, bioluminescence, and virulence factor production


Microbial Resilience / Dysbiosis (AKP)

As corals undergo stress, their microbial community becomes destabilized, often resulting in increased variability (or dispersion) in microbial community composition. This pattern of increased variability in stressed individuals is sometimes related to the Anna Karenina Principle (AKP)


Anna Karenina Principle (AKP)

healthy microbiomes are similar across individuals, while disease-associated microbiomes are often unique and vary from person to person