bioinformatics quiz #1

49 Terms

1

origins of bioinformatics

the earliest foundations (1950–1970) focused primarily on protein sequence analysis, before DNA could be easily sequenced

2

COMPROTEIN

the first known bioinformatics software (early 1960s), developed by Margaret Dayhoff, designed to assemble whole protein sequences (de novo) from small Edman peptide fragments.

3

paradigm shift (1970–1980)

bioinformatics began shifting its focus from protein analysis to DNA analysis after the deciphering of the genetic code and the invention of efficient sequencing methods (Sanger)

4

Needleman-Wunsch (1970)

developed the first dynamic programming algorithm for performing global pairwise protein sequence alignments
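
As a rough illustration of the dynamic programming idea, the sketch below fills a global alignment score matrix. The scoring values (match +1, mismatch -1, linear gap -1) and the function name are assumptions chosen for the example, not the scheme from the 1970 paper.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Return the optimal global alignment score of sequences a and b.

    Scoring values are illustrative assumptions (simple linear gap penalty).
    """
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # First column and row: aligning a prefix against nothing costs only gaps.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two residues, or open a gap in either sequence.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap
            left = score[i][j - 1] + gap
            score[i][j] = max(diag, up, left)

    # The bottom-right cell holds the score of the best end-to-end alignment.
    return score[-1][-1]
```

Recovering the alignment itself would need a traceback from the bottom-right cell, which is omitted here to keep the sketch short.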

5

homology: orthology

defined by Walter M. Fitch (1970) as homology resulting from a speciation event

6

dayhoff/pam matrix

Margaret Dayhoff and colleagues developed the first probabilistic model of amino acid substitutions (point accepted mutations, PAMs) in 1978, using probability to measure evolutionary change

7

De Novo Sequencing

the determination of a full-genome sequence without using a known template or reference sequence

8

massively parallel / multiplexing

Massively parallel means many units working simultaneously: multiple processors in computing, or many DNA fragments being sequenced at once in a sequencing run. Multiplexing combines multiple inputs/samples into a single sequencing run.

9

overfitting

occurs when a model built on training data shows high accuracy (e.g., classifying stream condition using OTUs) but significantly decreased accuracy when applied to separate validation data, indicating the model is too specific to the initial dataset features

10

sanger (dideoxy) read length

long reads (~600–1000 bp)

11

sanger (dideoxy) throughput/cost

low throughput, typically for single samples

12

sanger (dideoxy) accuracy/output

reads suffer quality loss at the beginning and end of the sequence

13

sanger (dideoxy) key feature/mechanism

based on chain-terminating dideoxynucleotides (ddNTPs)

14

illumina (MiSeq) read length

short reads (100–300 bp)

15

illumina (MiSeq) throughput/cost

high/massively parallel throughput

16

illumina (MiSeq) accuracy/output

high accuracy

17

illumina (MiSeq) key feature/mechanism

uses Sequencing by Synthesis / Bridge Amplification where fragments attached to a flow cell are amplified into clusters

18

Oxford Nanopore (MinION) read length

ultra-long reads (up to 1,000,000 bp/millions of bases)

19

Oxford Nanopore (MinION) throughput/cost

high throughput, portable, USB-powered

20

Oxford Nanopore (MinION) accuracy/output

moderate error rate, generally higher than that of Sanger or Illumina sequencing

21

Oxford Nanopore (MinION) key feature/mechanism

DNA strand passes through a nanopore; changes in electrical current are decoded into the DNA sequence (called basecalling).

22

FASTQ file

a file format that incorporates both the nucleotide sequence and the associated quality scores
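
A minimal sketch of reading FASTQ records may make the format concrete. It assumes the common four-line layout (header starting with `@`, sequence, `+` separator, quality string) and Phred+33 quality encoding; the function name `parse_fastq` is made up for this example.

```python
def parse_fastq(text):
    """Yield (read_id, sequence, phred_scores) from four-line FASTQ records.

    Assumes Phred+33 encoding: quality score = ASCII code of the character - 33.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError("malformed FASTQ record starting at line %d" % (i + 1))
        if len(seq) != len(qual):
            raise ValueError("sequence and quality lengths differ for " + header)
        # One quality score per base, decoded from the quality string.
        scores = [ord(ch) - 33 for ch in qual]
        yield header[1:], seq, scores


example = "@read1\nACGT\n+\nII5!\n"
records = list(parse_fastq(example))
```

Real files are usually large and often gzip-compressed, so an established parser is a better choice in practice than this string-based sketch.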

23

phred score (Q)

a measure of base-call quality, defined as Q = -10 × log10(P), where P is the probability that the base call is wrong. A Phred score of 20 (Q20) corresponds to a 1% error probability per base, meaning 99% accuracy in the base call. Q30 corresponds to 99.9% accuracy
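
The relationship between Q and error probability follows the standard Phred definition, Q = -10 × log10(P). A small sketch of the conversion in both directions (function names are illustrative):

```python
import math


def phred_to_error_prob(q):
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p):
    """Phred score corresponding to error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


q20_error = phred_to_error_prob(20)  # about 0.01, i.e. 99% accuracy
q30_error = phred_to_error_prob(30)  # about 0.001, i.e. 99.9% accuracy
```

Each 10-point increase in Q reduces the error probability tenfold.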

24

coverage

the average number of reads that align to, or "cover," known reference bases. 50X genome coverage is recommended for robust taxonomic work
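
A common back-of-the-envelope estimate of average coverage is (number of reads × read length) / genome size. The sketch below uses that estimate; the read count, read length, and genome size are made-up numbers for illustration, and it assumes all reads map to the genome.

```python
import math


def mean_coverage(num_reads, read_length, genome_size):
    """Estimated average coverage: total sequenced bases / genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target average coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


# Hypothetical example: a 5 Mb bacterial genome sequenced with 250 bp reads.
cov = mean_coverage(1_000_000, 250, 5_000_000)
needed = reads_needed(50, 250, 5_000_000)
```

Because this is an average, individual regions can still be covered far less (or far more) than the estimate suggests.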

25

single-end reads

the DNA fragment is sequenced from one end only, in a single direction

26

paired-end reads

report sequences from both directions of a DNA fragment, which is valuable for assembly

27

3 domains of life

archaea, bacteria, and eukarya

28

why might we move to two domains?

based on the discovery that eukaryotes evolved from within the domain Archaea rather than as a separate lineage, which suggests that there are only 2 primary domains: Bacteria and Archaea, with Eukarya branching inside of Archaea

29

membrane bond differences

The cell membranes of Bacteria and Eukarya contain Ester linkages in their phospholipids; Archaea use Ether bonds in their membrane lipids.

30

what is the cell wall made of?

Peptidoglycan: The polymer forming a mesh-like layer outside the bacterial cell membrane, providing protection against osmotic pressure (preventing cell bursting). It is the target of many antibiotics.

  • Archaea do not possess peptidoglycan

31

porins

channels found in the outer membrane of Gram-negative bacteria, facilitating the exchange of nutrients

32

chemoautotrophy

(metabolic diversity) relies on inorganic compounds for carbon and energy and is utilized only by prokaryotes (bacteria and archaea)

33

Classic Biological Species Concept

a species is a group of organisms that can interbreed naturally and produce viable, fertile offspring, and are reproductively isolated from other groups

  • meaningless for microbes primarily because they reproduce asexually.

34

Horizontal Gene Transfer (HGT)

Mechanisms for new gene acquisition in bacteria, including Transformation (naked DNA uptake), Transduction (DNA transfer via viruses/phage), and Conjugation (DNA transfer via cell-to-cell contact).

35

chemotaxis

(bacterial movement) describes movement toward chemical attractants or away from repellents

36

The Species Problem for Microbes

long-standing difficulty in defining what constitutes a microbial species

  • organisms are often grouped by operational definitions based on 16S rRNA gene similarity.

37

Operational Taxonomic Unit (OTU)

a cluster of sequences grouped by similarity, typically defined by sequences being at least 97% or 99% similar
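
To make the threshold idea concrete, the sketch below computes percent identity between two sequences and checks it against an OTU cutoff. It assumes the sequences are already aligned and of equal length; real OTU pipelines use dedicated alignment and clustering tools, and the function names here are illustrative.

```python
def percent_identity(a, b):
    """Percent of positions that match between two pre-aligned, equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def same_otu(a, b, threshold=97.0):
    """True if the two sequences meet the OTU similarity threshold (in percent)."""
    return percent_identity(a, b) >= threshold


# Two toy 100-base sequences that differ at the last two positions (98% identical).
seq_a = "A" * 100
seq_b = "A" * 98 + "CC"
```

With these toy sequences, the pair falls in the same OTU at a 97% threshold but not at 99%.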

38

Amplicon Sequence Variant (ASV)

sequences that are 100% identical and function as unique identifiers for taxa

39

16S rRNA Gene Sequencing

This marker gene is commonly used for taxonomic diversity studies (metabarcoding). The V3-V4 hypervariable region (approx. 464 bp) is typically targeted in short-read sequencing (Illumina MiSeq). Full-length 16S sequencing (V1-V9 regions, approx. 1465 bp) is possible with long-read platforms like MinION.

40

Long Read Advantage for 16S rRNA Gene Sequencing

Sequencing the near full-length 16S rRNA gene (MinION) generally provides significantly higher taxonomic resolution at the species level compared to short-read sequencing that analyzes only partial regions (Illumina MiSeq)

41

Phyla Dominance in Streams

Proteobacteria dominated both water and sediment stream samples in the Maryland study

42

Sediment microbial communities

proved much better at predicting ecological condition (BIBI scores) than water column samples

43

Alpha Diversity

Measures diversity within a single sample

  • Measures include species richness and evenness

  • Indices often used: Shannon, Simpson, Chao1, ACE
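
As a sketch of how richness and evenness feed into these indices, the functions below compute richness, Shannon (natural log), Simpson, and Pielou's evenness from a list of per-taxon counts. Simpson's index has several conventions; the Gini-Simpson form (1 - Σp²) is assumed here, and the counts are invented for illustration.

```python
import math


def richness(counts):
    """Number of taxa observed at least once."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Shannon index H = -sum(p * ln p) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def simpson(counts):
    """Gini-Simpson index 1 - sum(p^2); higher means more diverse."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)


def pielou_evenness(counts):
    """Shannon index divided by its maximum, ln(richness); 1.0 is perfectly even."""
    s = richness(counts)
    return shannon(counts) / math.log(s) if s > 1 else 0.0


even_sample = [10, 10, 10, 10]   # four taxa, equally abundant
skewed_sample = [37, 1, 1, 1]    # same richness, one dominant taxon
```

Both toy samples have the same richness (4), but the skewed sample scores lower on Shannon, Simpson, and evenness.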

44

species richness

number of phylotypes

45

species evenness

relative abundance/distribution of individuals per phylotype

46

beta diversity

measures the difference or change in diversity of species between communities (e.g., between two different environments)

  • a High Beta diversity measure indicates low similarity between the two communities

  • Indices include Jaccard, Bray-Curtis, Euclidean, and UniFrac distances
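
Two of these measures are simple enough to sketch directly. The code below computes Bray-Curtis dissimilarity from abundance counts and Jaccard distance from presence/absence, assuming both samples list the same taxa in the same order; the sample counts are invented for illustration. UniFrac is not shown because it also needs a phylogenetic tree.

```python
def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / sum(u_i + v_i). 0 = identical, 1 = no overlap."""
    total = sum(u) + sum(v)
    if total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


def jaccard_distance(u, v):
    """Jaccard distance on presence/absence: 1 - shared taxa / taxa present in either sample."""
    present_u = {i for i, a in enumerate(u) if a > 0}
    present_v = {i for i, b in enumerate(v) if b > 0}
    union = present_u | present_v
    if not union:
        return 0.0
    return 1.0 - len(present_u & present_v) / len(union)


# Hypothetical counts for the same three taxa in two environments.
stream_water = [6, 4, 0]
stream_sediment = [4, 6, 0]
```

Bray-Curtis responds to differences in abundance, while Jaccard only asks whether each taxon is present, so the two can disagree for the same pair of samples.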

47

quorum sensing (QS)

a system of cell-to-cell communication where bacteria regulate specific group behaviors in response to population density

  • QS uses small signal molecules called autoinducers

  • Regulated behaviors include biofilm formation, bioluminescence, and virulence factor production

48

Microbial Resilience / Dysbiosis (AKP)

As corals undergo stress, their microbial community becomes destabilized, often resulting in increased variability (or dispersion) in microbial community composition. This pattern of increased variability in stressed individuals is sometimes described as the Anna Karenina Principle (AKP)

49

Anna Karenina Principle (AKP)

healthy microbiomes are similar across individuals, while disease-associated microbiomes are often unique and vary from person to person