Vocabulary-style flashcards covering biological domains, genomics, sequence analysis, algorithmic complexity, and dynamic programming based on the COMPSCI 260 Unit 1 Midterm.
Three Domains of Life
The three primary classifications of biological organisms which, in alphabetical order, are Archaea, Bacteria, and Eukarya.
Human Genome Size
Approximately $3 \times 10^9$ base pairs in scientific notation.
SARS-CoV-2 Genome Size
Approximately $3 \times 10^4$ nucleotides in scientific notation.
Transcription
The biological process in which DNA is transcribed into mRNA, carried out primarily by the agent RNA polymerase.
Translation
The biological process in which mRNA is translated into protein, carried out primarily by the agent known as the ribosome.
Replication
The biological process where DNA is used to create more DNA, primarily involving the agent DNA polymerase.
Stop Codon
A specific nucleotide triplet that signals the molecular machinery to cease the growth of a polypeptide chain.
ORF-finder
A computational procedure used to exhaustively scan the genome of a simple organism for subsequences that might correspond to protein-coding genes.
False Positive (ORF-finder)
An output where the code identifies a subsequence as a gene when it is not actually a protein-coding gene.
False Negative (ORF-finder)
A real gene that is excluded by the ORF-finder code due to specific filtering parameters.
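The false-positive and false-negative trade-off can be seen in a minimal ORF-finder sketch. This is not the course's implementation: the function name `find_orfs`, the `min_codons` filter, the forward-strand-only scan, and the use of ATG as the only start codon are all assumptions made for illustration. The stop codons are the three from the standard genetic code.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code stop codons


def find_orfs(genome, min_codons=0):
    """Return (start, end) index pairs for ATG...stop ORFs on the forward strand.

    `end` is exclusive and includes the stop codon. `min_codons` is a
    hypothetical length filter: raising it removes short spurious ORFs
    (fewer false positives) but may also discard short real genes
    (more false negatives).
    """
    orfs = []
    for frame in range(3):  # scan each of the three reading frames
        start = None
        for i in range(frame, len(genome) - 2, 3):
            codon = genome[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open an ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                n_codons = (i + 3 - start) // 3
                if n_codons >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close the ORF and keep scanning
    return orfs


genome = "CCATGAAATAGCC"
print(find_orfs(genome))                # the short ORF is reported
print(find_orfs(genome, min_codons=4))  # the same ORF is filtered out
```

A stricter `min_codons` value is one example of the "specific filtering parameters" that can turn a real but short gene into a false negative.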
Reads
The resulting sequences produced when fragments of DNA from an organism are processed by a sequencing instrument.
Read Mapping
The process of aligning reads to an existing complete reference genome by finding exact substring matches.
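Exact-match read mapping as described above can be sketched with plain substring search. The function name `map_read` and the toy sequences are illustrative assumptions; real read mappers use indexed data structures rather than repeated scanning.

```python
def map_read(reference, read):
    """Return every 0-based position where `read` occurs exactly in `reference`."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping occurrences are also reported.
        start = reference.find(read, start + 1)
    return positions


reference = "ACGTACGTTACG"
print(map_read(reference, "ACG"))  # [0, 4, 9]
```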
Genome Assembly
The algorithmic task of stitching DNA reads together into longer sequences to determine an organism's genome sequence.
Contigs
Longer sequences of consecutive nucleotides output by an assembly algorithm after identifying and stitching overlapping reads.
Variable X (Sequencing)
The total length of all the reads that are sequenced by a sequencing instrument.
Variable Y (Assembly)
The total length of all the contigs that are output by a genome assembly algorithm.
Y vs X Relationship
In genome assembly, the value of Y is less than the value of X because the algorithm stitches overlapping reads together.
GenBank
A gene database that contains genetic sequences from a wide variety of organisms.
DNA Base Pairing
The principle where specific nucleotides hybridize to form double-stranded DNA; for example, G complements C and A complements T.
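The pairing rule can be expressed as a small lookup. The sketch below assumes uppercase sequences over A, C, G, T only; the helper names `complement` and `reverse_complement` are illustrative.

```python
# Translation table implementing A<->T and C<->G pairing.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(seq):
    """Return the base-by-base complement of `seq`."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq):
    """Return the opposite strand read in its own 5' to 3' direction."""
    return complement(seq)[::-1]


print(complement("ATGC"))            # TACG
print(reverse_complement("ATGC"))    # GCAT
print(reverse_complement("GAATTC"))  # GAATTC (equal to its own reverse complement)
```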
5’ Overhang
A structural feature of a DNA fragment where one strand extends beyond the other at the 5’ end, often referred to as a sticky end.
Recursive Function
A function that solves a problem by calling itself with smaller instances of the same problem.
Asymptotic Running Time
The measurement of how an algorithm's execution time grows as the input size increases toward infinity.
Bogosort
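A sorting algorithm that repeatedly shuffles its input until it happens to be sorted; its expected running time grows factorially, making it asymptotically slower than the common sorting methods.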
Merge Sort
A sorting algorithm known for being asymptotically faster than Bubble, Insertion, or Selection sort.
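As a reference point, here is a minimal merge sort sketch in its textbook form (the function name `merge_sort` is illustrative). It splits the list in half, sorts each half recursively, and merges the two sorted halves in linear time.

```python
def merge_sort(items):
    """Return a new sorted list containing the elements of `items`."""
    if len(items) <= 1:
        return list(items)  # base case: already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # at most one of these two
    merged.extend(right[j:])  # still has elements left
    return merged


print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))  # [1, 2, 2, 3, 4, 5, 6, 7]
```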
Recognition Site
A specific sequence of DNA that a restriction enzyme recognizes and binds to for the purpose of cutting the DNA.
Restriction Enzyme Symbol: r
In a recognition site, this lowercase letter represents that the enzyme accepts either A or G at that position.
Restriction Enzyme Symbol: y
In a recognition site, this lowercase letter represents that the enzyme accepts either C or T at that position.
Restriction Enzyme Symbol: w
In a recognition site, this lowercase letter represents that the enzyme accepts either A or T at that position.
Restriction Enzyme Symbol: s
In a recognition site, this lowercase letter represents that the enzyme accepts either C or G at that position.
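The four ambiguity symbols above map naturally onto regular-expression character classes. The sketch below is one possible way to search for a recognition site; the names `AMBIGUITY`, `site_to_pattern`, and `find_sites`, and the example site string, are assumptions chosen for illustration.

```python
import re

# Lowercase ambiguity symbols and the bases each one accepts.
AMBIGUITY = {"r": "[AG]", "y": "[CT]", "w": "[AT]", "s": "[CG]"}


def site_to_pattern(site):
    """Translate a recognition site such as 'GrCGyC' into a regex string."""
    return "".join(AMBIGUITY.get(ch, re.escape(ch)) for ch in site)


def find_sites(dna, site):
    """Return the 0-based start of every (possibly overlapping) match."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=" + site_to_pattern(site) + ")")
    return [m.start() for m in pattern.finditer(dna)]


print(site_to_pattern("GrCGyC"))                 # G[AG]CG[CT]C
print(find_sites("TTGACGTCAAGGCGCC", "GrCGyC"))  # [2, 10]
```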
$\Theta(n \log n)$
The asymptotic running time for efficient sorting algorithms like Merge Sort.
$\Theta(n^2)$
The asymptotic running time for algorithms like Bubble Sort, Insertion Sort, and Selection Sort in their average or worst cases.
$\Theta(n^3)$
A cubic asymptotic running time; if the problem size increases by a factor of 10, the running time increases by a factor of 1000.
$\Theta(2^n)$
An exponential asymptotic running time where the time required doubles with each incremental increase in problem size.
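The scaling factors quoted for cubic and exponential time follow from a short calculation. As a simplifying assumption, treat the running time as exactly $c \cdot n^3$ or $c \cdot 2^n$ for some constant $c$, ignoring lower-order terms:

```latex
\[
\frac{c\,(10n)^3}{c\,n^3} = 10^3 = 1000,
\qquad
\frac{c\,2^{\,n+1}}{c\,2^{\,n}} = 2 .
\]
```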
Elementary Multiplication Operations
The basic numeric operations required to compute the product of matrices, where the number of operations depends on the dimensions of the matrices.
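To make the dependence on dimensions concrete: under the standard row-by-column algorithm, multiplying a $p \times q$ matrix by a $q \times r$ matrix uses $p \cdot q \cdot r$ elementary multiplications. The sketch below assumes that algorithm; the function name `mult_count` and the example dimensions are illustrative.

```python
def mult_count(p, q, r):
    """Scalar multiplications needed to multiply a p x q matrix by a q x r matrix."""
    return p * q * r


# Example: A is 10 x 100, B is 100 x 5, C is 5 x 50.
# (A B) C: first A*B (10 x 5 result), then that result times C.
left_first = mult_count(10, 100, 5) + mult_count(10, 5, 50)
# A (B C): first B*C (100 x 50 result), then A times that result.
right_first = mult_count(100, 5, 50) + mult_count(10, 100, 50)

print(left_first)   # 7500
print(right_first)  # 75000
```

The two parenthesizations give the same product but differ tenfold in cost, which is why the order of multiplication matters.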
Fibonacci Numbers
An infinite sequence of integers defined by the recurrence relation $f_i = f_{i-1} + f_{i-2}$ for $i \geq 3$, starting with $f_1 = 1$ and $f_2 = 1$.
Mathematical Induction
A proof method used to prove a claim for all positive integers, consisting of a base case and an inductive step.
Base Case (Induction)
The first component of an induction proof that demonstrates the truth of the claim for the smallest possible value (e.g., n=1).
Inductive Step
The second component of an induction proof, which demonstrates that if the claim holds for n, it must also hold for n+1.
Overlapping Subproblems
A scenario in recursive algorithms where the same sub-calculations are performed multiple times, leading to slow performance.
Memoization
A strategy used to speed up recursive functions by storing the results of subproblems in a lookup table to avoid redundant calculations.
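The Fibonacci numbers are a convenient illustration of overlapping subproblems and memoization. The sketch below uses the convention that the first two Fibonacci numbers are both 1; the function names and the dictionary used as the lookup table are illustrative choices.

```python
def fib_naive(i):
    """Plain recursion: recomputes the same subproblems many times."""
    if i <= 2:
        return 1
    return fib_naive(i - 1) + fib_naive(i - 2)


def fib_memo(i, table=None):
    """Memoized recursion: each subproblem is computed once and then looked up."""
    if table is None:
        table = {}  # lookup table mapping index -> Fibonacci number
    if i <= 2:
        return 1
    if i not in table:
        table[i] = fib_memo(i - 1, table) + fib_memo(i - 2, table)
    return table[i]


print(fib_naive(10))  # 55
print(fib_memo(50))   # 12586269025
```

Calling `fib_naive(50)` would take impractically long because the number of recursive calls grows exponentially, while `fib_memo(50)` makes only a linear number of calls.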
Dynamic Programming
An algorithmic technique that solves a problem by building a table of solutions for smaller subproblems before solving larger ones.
DP Table
A table, often designated as V, used in dynamic programming to store the solutions to subproblems parameterized by indices.
Recursive Update Equation
The formula at the heart of a dynamic programming algorithm used to calculate a cell's value from previously computed cells.
Base Cases (Dynamic Programming)
The initial cells in a DP table that must be filled manually before the recursive update equation can be applied to the rest of the table.
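The pieces of a dynamic programming algorithm (table, base cases, recursive update equation) can be seen together in a bottom-up Fibonacci sketch. The table name `V` follows the card above; the function name `fib_dp` and the choice to leave `V[0]` unused are assumptions for illustration.

```python
def fib_dp(n):
    """Return the n-th Fibonacci number (n >= 1) by filling a DP table bottom-up."""
    V = [0] * (n + 1)  # V[i] will hold the i-th Fibonacci number; V[0] is unused

    # Base cases: filled in directly.
    V[1] = 1
    if n >= 2:
        V[2] = 1

    # Recursive update equation: each cell depends on the two cells before it.
    for i in range(3, n + 1):
        V[i] = V[i - 1] + V[i - 2]
    return V[n]


print([fib_dp(i) for i in range(1, 11)])  # [1, 1, 2, 3, 5, 8, 13, 21, 34, 55]
```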
TRANSMOGRIFY Function
A recursive function described in the text that splits an input list into thirds (left, middle, right) and processes them.
ZAP Function
A function used within TRANSMOGRIFY that processes three lists and returns a final transmogrified result.
Constant Time ($\Theta(1)$)
An execution time that remains the same regardless of the size of the input data.
Linear Time ($\Theta(n)$)
An execution time that grows in direct proportion to the size of the input data.
Central Dogma
The conceptual framework representing the flow of genetic information from DNA to RNA to Protein.
Pseudo-code
A simplified, high-level description of a computer program's logic that resembles code but is intended for human reading rather than machine execution.