Biostats

Last updated 3:43 PM on 3/20/26

165 Terms

1
New cards

Multiple Sequence Alignment

2
New cards
3
New cards

Multiple Sequence Alignment

combines both optimal (global/local) and heuristic alignment; cannot compare DNA to protein due to different scoring matrices

4
New cards

Profile Alignment

profile is created by taking a finished alignment and counting the frequency of every letter and gap at each location; progressively aligns all sequences pairwise, starting with the most similar
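A minimal sketch of the counting step described above, assuming the finished alignment is a list of equal-length strings with `-` for gaps. The function name `build_profile` and the toy alignment are illustrative, not from the source.

```python
from collections import Counter

def build_profile(alignment):
    """Return one dict per column: symbol (including '-') -> frequency."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        # Count every letter and gap seen in this column.
        counts = Counter(row[col] for row in alignment)
        profile.append({sym: n / len(alignment) for sym, n in counts.items()})
    return profile

# Hypothetical toy alignment of four sequences.
toy_alignment = ["ACG-T", "ACGAT", "A-GAT", "TCGAT"]
profile = build_profile(toy_alignment)
```

Each entry of `profile` describes one alignment column, so a new sequence can be scored against column frequencies instead of against a single sequence.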

5
New cards

ClustalW

cluster alignment weighted; progressive alignment strategy; neighbor joining guide tree; lower accuracy; medium speed; use for small datasets with similar sequences

6
New cards

T-Coffee

tree-based consistency objective function for alignment evaluation; consistency-based alignment strategy; neighbor-joining and consistency weights guide tree; medium accuracy; lower speed; use for small datasets

7
New cards

MUSCLE

multiple sequence comparison by log-expectation; progressive alignment followed by iterative refinement; UPGMA guide tree; higher accuracy; higher speed; use for medium-large datasets

8
New cards

MAFFT

multiple alignment using fast fourier transform; progressive and iterative refinement alignment strategy; UPGMA/NJ guide tree; highest accuracy; highest speed; use for large datasets

9
New cards
10
New cards

Molecular Evolution

11
New cards
12
New cards

Mutation Types

13
New cards

Single Base Substitutions

AKA point mutations; a single base is replaced by another

14
New cards

Transition

same class of nucleotide; purine to purine or pyrimidine to pyrimidine

15
New cards

Transversion

different class of nucleotide; purine to pyrimidine or pyrimidine to purine
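The transition/transversion distinction can be sketched as a small classifier. This assumes DNA bases only (A, C, G, T); the function name is illustrative.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def classify_substitution(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid:
        raise ValueError("expected DNA bases A, C, G or T")
    if ref == alt:
        raise ValueError("not a substitution: bases are identical")
    # Same class (purine<->purine or pyrimidine<->pyrimidine) is a transition.
    if {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES:
        return "transition"
    return "transversion"
```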

16
New cards

Synonymous

encodes for the same amino acid

17
New cards

Silent Mutation

the new nucleotide alters the codon but does not alter the amino acid for which it encodes

18
New cards

Nonsynonymous

encodes for a different amino acid

19
New cards

Missense Mutation

the new nucleotide alters the codon to produce an altered amino acid in the protein product (ex

20
New cards

Nonsense Mutation

the new nucleotide changes a codon that specified an amino acid to a stop codon; translation of the mRNA transcribed from this mutant gene will stop prematurely
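The silent, missense and nonsense cases above can be told apart by translating the codon before and after the change. This is a sketch that assumes the standard genetic code and DNA codons (T rather than U); `classify_point_mutation` and its 0-based `position` argument are illustrative names.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def classify_point_mutation(codon, position, new_base):
    """Classify a single-base change inside one codon (position is 0, 1 or 2)."""
    codon = codon.upper()
    mutant = codon[:position] + new_base.upper() + codon[position + 1:]
    before, after = CODON_TABLE[codon], CODON_TABLE[mutant]
    if before == "*":
        raise ValueError("expected a codon that specifies an amino acid")
    if after == before:
        return "silent"      # synonymous: same amino acid
    if after == "*":
        return "nonsense"    # amino acid codon became a stop codon
    return "missense"        # nonsynonymous: different amino acid
```

For example, `classify_point_mutation("GAG", 1, "T")` compares GAG (Glu) with GTG (Val).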

21
New cards

Indels

the insertion or deletion of base pairs; changes the reading frame unless the number of bases added or removed is a multiple of three

22
New cards

Frameshift

change in the reading frame
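A short sketch of why an indel shifts the reading frame: codons are read in groups of three, so only an indel whose length is a multiple of three leaves the downstream codons intact. The sequences and function names below are hypothetical.

```python
def is_frameshift(indel_length):
    """An indel shifts the frame unless its length is a multiple of three."""
    return indel_length % 3 != 0

def codons(seq):
    """Split a sequence into complete codons, dropping any trailing bases."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]

reference = "ATGGCCTTTAAA"
one_base_insertion = reference[:3] + "C" + reference[3:]      # shifts the frame
three_base_insertion = reference[:3] + "CCC" + reference[3:]  # frame preserved
```

Comparing `codons(reference)` with `codons(one_base_insertion)` shows every codon after the insertion changing, while the three-base insertion only adds one codon.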

23
New cards

Genome Rearrangements

large scale chromosome structure changes; can alter phenotype by 1) destroying gene function, 2) change in expression via influence of different promoters and enhancers, or 3) creating hybrid genes

24
New cards

Deletion and Duplication

occurs on the same chromosome

25
New cards

Inversion (Reversal)

occurs on the same chromosome

26
New cards

Translocation

occurs between different chromosomes; usually between paternal and maternal

27
New cards

Homolog

a gene related to other genes by evolutionary descent from a common ancestral DNA sequence

28
New cards

Identity

29
New cards

(number of identical residues) / (number of residues and gaps in the alignment) x 100
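The identity formula can be sketched directly, assuming two already-aligned strings of equal length with `-` for gaps, so the denominator is the number of alignment columns. The function name and sequences are illustrative.

```python
def percent_identity(a, b):
    """Identical residues divided by alignment length (residues and gaps), times 100."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    # A gap never counts as an identical residue.
    identical = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100 * identical / len(a)

# Hypothetical aligned pair: 6 identical columns out of 8.
identity = percent_identity("ACGT-ACG", "ACCTTACG")
```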

30
New cards

Similarity

some amino acid substitutions have similar side chains, leading to a smaller effect in the final protein

31
New cards

(number of similar residues) / (number of residues and gaps in the alignment) x 100
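A sketch of the similarity formula under two labeled assumptions: identical residues are counted as similar, and "similar side chains" is approximated by the hypothetical groups in `SIMILAR_GROUPS` (real tools usually decide similarity from a substitution matrix instead).

```python
# Hypothetical side-chain groups, for illustration only.
SIMILAR_GROUPS = [set("AVLIM"), set("FYW"), set("KRH"), set("DE"), set("STNQ")]

def are_similar(x, y):
    """True for identical residues or residues sharing a side-chain group."""
    return x == y or any(x in group and y in group for group in SIMILAR_GROUPS)

def percent_similarity(a, b):
    """Similar residues divided by alignment length (residues and gaps), times 100."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    similar = sum(
        1 for x, y in zip(a, b)
        if x != "-" and y != "-" and are_similar(x, y)
    )
    return 100 * similar / len(a)

# Hypothetical aligned pair: 6 similar columns out of 8.
similarity = percent_similarity("KLDE-FGA", "RLEQTYGA")
```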

32
New cards

Point Accepted Mutation (PAM)

quantifies the rate at which amino acids change over evolutionary time; assumes constant rate of change for amino acids

33
New cards

Constant Rate

mutations occur at a relatively steady pace over time

34
New cards

Independence

each amino acid position mutates independently of its neighbor

35
New cards

Natural Selection

only count "accepted" mutations that don't break down the protein's function and are passed down

36
New cards

Matrices

PAM matrices are a series; as the number increases, the evolutionary distance grows

37
New cards

PAM #

number of accepted point mutations per 100 amino acids

38
New cards

PAM 1

very conserved; observable mutation; small-scale evolution

39
New cards

PAM 250

same amino acid mutation repeatedly; not observable but extrapolated; has error associated with it; large-scale evolution
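The extrapolation from PAM 1 to PAM 250 amounts to multiplying the PAM 1 mutation probability matrix by itself 250 times. The sketch below shows that step on a toy 3-letter alphabet; the 0.99 / 0.005 probabilities are made up for illustration and are not real PAM values.

```python
def mat_mul(x, y):
    """Multiply two square matrices given as lists of rows."""
    size = len(x)
    return [
        [sum(x[i][k] * y[k][j] for k in range(size)) for j in range(size)]
        for i in range(size)
    ]

def mat_power(matrix, exponent):
    """Raise a square matrix to a non-negative integer power."""
    size = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(exponent):
        result = mat_mul(result, matrix)
    return result

# Toy "PAM 1": each letter stays unchanged with probability 0.99 per step.
toy_pam1 = [
    [0.99, 0.005, 0.005],
    [0.005, 0.99, 0.005],
    [0.005, 0.005, 0.99],
]
toy_pam250 = mat_power(toy_pam1, 250)
```

In the toy result the diagonal falls far below 0.99, which reflects repeated mutations at the same position accumulating over long evolutionary distances.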

40
New cards

Block Substitution Matrices (BLOSUM)

based on observed alignments; aligned sequences from functional domains (blocks) of proteins; look at domains (blocks) rather than looking at entire sequence

41
New cards

Blocks

represents highly conserved regions that have survived natural selection

42
New cards

Matrices

the number in a BLOSUM matrix is the percent-identity threshold used to cluster the block sequences when building it (e.g., BLOSUM62 clusters sequences that are at least 62% identical)

43
New cards

Lower #

distant relatives; BLOSUM45 used for very divergent sequences

44
New cards

Higher #

close relatives; BLOSUM80 used for very similar sequences

45
New cards

Similarity Score

not all amino acid matches produce the same similarity score; add the scores of all aligned pairs to get the alignment's score; the higher the better
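A sketch of summing per-position scores, assuming an ungapped alignment and a tiny hypothetical score table. The values in `TOY_SCORES` are illustrative only and should not be read as entries of any published matrix.

```python
# Hypothetical scores keyed by alphabetically sorted residue pairs.
TOY_SCORES = {
    ("W", "W"): 11,
    ("L", "L"): 4,
    ("K", "K"): 5,
    ("I", "L"): 2,
    ("K", "R"): 2,
    ("A", "W"): -3,
}

def pair_score(x, y):
    """Look up a symmetric substitution score for one aligned pair."""
    return TOY_SCORES[tuple(sorted((x, y)))]

def alignment_score(a, b):
    """Sum the scores of all aligned pairs in an ungapped alignment."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have the same length")
    return sum(pair_score(x, y) for x, y in zip(a, b))

identical_score = alignment_score("WLK", "WLK")
similar_score = alignment_score("WLK", "WIR")
```

Even among exact matches the toy table scores W/W higher than L/L, mirroring the point that not all matches contribute equally.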

46
New cards

Ortholog

a gene present in different species that evolved from a common ancestral gene by speciation; retain the same/similar function in the course of evolution; speciation to give two separate species

47
New cards

Paralog

one gene of a set of genes that underwent a duplication event in a common ancestor; evolve new functions (can be related to the original function); gene duplication and divergence

48
New cards
49
New cards
50
New cards

Phylogenetic Trees

51
New cards
52
New cards

Phylogenetics

method of classification of organisms based upon their evolutionary history

53
New cards

Phylogenetic Tree

shows the evolutionary relationships among various species or other entities that likely have a common ancestor; multiple trees possible showing multiple plausible evolutionary scenarios

54
New cards

Gene-Specific Phylogenies

different genes may show different phylogenetic histories; can avoid this by using multiple genes and many single-gene analyses then concatenating them

55
New cards

Neutral Marker

genes under similar positive selection regimes in different taxa can evolve convergently, which can confuse phylogenetic analysis

56
New cards

Connected Graph

graph containing at least one path between any two nodes

57
New cards

Tree

type of connected graph in which there is exactly one path between every two nodes
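The two graph definitions above can be checked in code. This sketch assumes an undirected graph given as a non-empty list of node labels and a list of edges without duplicates; it relies on the standard fact that a connected graph has exactly one path between every two nodes when it has one fewer edge than nodes.

```python
from collections import deque

def is_connected(nodes, edges):
    """True if at least one path exists between any two nodes."""
    neighbors = {node: set() for node in nodes}
    for u, v in edges:
        neighbors[u].add(v)
        neighbors[v].add(u)
    start = nodes[0]
    seen = {start}
    queue = deque([start])
    while queue:  # breadth-first search from an arbitrary start node
        current = queue.popleft()
        for nxt in neighbors[current]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return len(seen) == len(neighbors)

def is_tree(nodes, edges):
    """True if the graph is connected with exactly one path between nodes."""
    return is_connected(nodes, edges) and len(edges) == len(nodes) - 1
```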

58
New cards

Rooted Tree

shows evolutionary history of the taxa; single unique node which is the ancestor of all other nodes; directed tree which shows change over time; best done by using an outgroup

59
New cards

Outgroup

a species or molecule that is known to be more distantly related than everything else in the tree

60
New cards

Ingroup

taxa being analyzed to view relationships

61
New cards

Unrooted Tree

shows evolutionary relationships between the taxa; can't make any statement about the direction of evolution, only the closeness of relationships

62
New cards

Nodes

common ancestor; rotating a tree at a node does not change the relationships between the taxa, only the way those relationships are visualized; each node called an operational taxonomic unit

63
New cards

Branches

evolutionary lineages

64
New cards

Tips/Leaves

the most recent taxa in the analysis

65
New cards

Cladogram

branch lengths do not represent time; branching is determined by distinguishing characteristics which identify a particular clade

66
New cards

Phylogram

explicitly represents number of character changes through its branch lengths; indicates the amount of evolutionary time separating taxa

67
New cards

Distance-Based Methods

calculate the genetic distance between pairs of taxa and construct a tree based on these distances

68
New cards

Unweighted Pair Group Method with Arithmetic Mean (UPGMA)

determination of phylogenetic relationships is explicitly non-historical; simply based on similarity/dissimilarity; assumes an ultrametric tree in which the distances from the root to every branch tip are equal

69
New cards

Steps

70
New cards

(1) create tree by first selecting the most closely related sequences and insert a node to represent their common ancestor

71
New cards

(2) then replace the selected sequences by a set containing both and replace the distances from the pair to the others by the average distances

72
New cards

(3) repeat
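The three steps above can be sketched as follows. Assumptions: distances are supplied as a dict keyed by pairs of taxon names, the tree is returned as nested tuples, and the "average distance" is weighted by cluster size (the usual UPGMA rule). All names and the toy distances are illustrative.

```python
from itertools import combinations

def upgma(names, distances):
    """Return (tree, heights): a nested-tuple tree and the height of each merge."""
    d = {frozenset(pair): float(value) for pair, value in distances.items()}
    size = {name: 1 for name in names}  # cluster -> number of leaves it contains
    heights = []
    while len(size) > 1:
        # Step 1: select the closest pair and join them under a new node.
        a, b = min(combinations(size, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        heights.append((merged, d[frozenset((a, b))] / 2))
        # Step 2: distance from the new cluster to each other cluster is the
        # size-weighted average of the distances from its two members.
        for other in size:
            if other != a and other != b:
                d[frozenset((merged, other))] = (
                    size[a] * d[frozenset((a, other))]
                    + size[b] * d[frozenset((b, other))]
                ) / (size[a] + size[b])
        size[merged] = size[a] + size[b]
        del size[a], size[b]
        # Step 3: repeat until one cluster remains.
    return next(iter(size)), heights

# Hypothetical ultrametric distances for four taxa.
toy_distances = {
    ("A", "B"): 2, ("A", "C"): 6, ("B", "C"): 6,
    ("A", "D"): 10, ("B", "D"): 10, ("C", "D"): 10,
}
tree, heights = upgma(["A", "B", "C", "D"], toy_distances)
```

Each merge height is half the distance between the two joined clusters, which is what makes every tip equally far from the root.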

73
New cards

Neighbor-Joining

clustering creates an additive unrooted tree using pairwise distances; all the taxa do not diverge from a most common ancestor; does not assume that all sequences have the same rate of substitution; fast and often used as a starting point in phylogenetic analyses

74
New cards

Steps

75
New cards

(1) determine the pairwise distances between all the sequences

76
New cards

(2) identify the two sequences closest to each other based on their distances

77
New cards

(3) combine these two sequences into a single node

78
New cards

(4) update the distances between this new node and the other sequences

79
New cards

(5) repeat until all sequences are joined into a single tree
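A sketch of the steps above. One labeled assumption: instead of picking the pair with the smallest raw distance, it uses the standard neighbor-joining criterion, which corrects each pairwise distance by how far both taxa are from everything else. Function names, the nested-tuple output and the example distances are illustrative.

```python
from itertools import combinations

def neighbor_joining(names, distances):
    """Return an unrooted tree as (subtree, subtree, connecting_branch_length).

    Internal nodes are tuples of (child, branch_length) pairs.
    """
    d = {frozenset(pair): float(value) for pair, value in distances.items()}

    def dist(x, y):
        return d[frozenset((x, y))]

    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        total = {x: sum(dist(x, y) for y in nodes if y != x) for x in nodes}
        # Step 2: choose the pair minimizing the corrected distance.
        a, b = min(
            combinations(nodes, 2),
            key=lambda p: (n - 2) * dist(*p) - total[p[0]] - total[p[1]],
        )
        # Step 3: join them at a new node and compute their branch lengths.
        len_a = dist(a, b) / 2 + (total[a] - total[b]) / (2 * (n - 2))
        len_b = dist(a, b) - len_a
        new = ((a, len_a), (b, len_b))
        # Step 4: update distances from the new node to the remaining nodes.
        for other in nodes:
            if other != a and other != b:
                d[frozenset((new, other))] = (
                    dist(a, other) + dist(b, other) - dist(a, b)
                ) / 2
        nodes = [x for x in nodes if x != a and x != b] + [new]
    # Step 5: the last two nodes are connected by a single branch.
    left, right = nodes
    return left, right, dist(left, right)

# Hypothetical additive distances for five taxa.
example = {
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
}
nj_tree = neighbor_joining(["a", "b", "c", "d", "e"], example)
```

Unlike UPGMA, the two branches leading to a joined pair can have different lengths, so equal substitution rates are not assumed.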

80
New cards

Strengths

81
New cards

Weaknesses

82
New cards

Cladistic Methods

consider the various possible trees and choose the best possible tree; tree selection criteria vary depending on the approach; slower than neighbor joining, but usually more accurate

83
New cards

Maximum Parsimony

finds the tree that requires the fewest evolutionary changes to explain the observed data
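A sketch of scoring candidate trees by parsimony. It uses Fitch's algorithm to count the minimum number of changes per site on a rooted binary tree, which is one common way to compute the score and is an assumption here. Trees are nested tuples of taxon names, and the alignment is hypothetical.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site."""
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        return common, left_changes + right_changes
    # No shared state: at least one change is needed at this node.
    return left_set | right_set, left_changes + right_changes + 1

def parsimony_length(tree, alignment):
    """Total number of changes the tree needs across all alignment sites."""
    n_sites = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_sites)
    )

# Hypothetical aligned sequences and two candidate topologies.
alignment = {"A": "AAG", "B": "AAG", "C": "CTG", "D": "CTA"}
candidates = [(("A", "B"), ("C", "D")), (("A", "C"), ("B", "D"))]
best_tree = min(candidates, key=lambda t: parsimony_length(t, alignment))
```

Here only two trees are compared; a real search would have to consider many more topologies, which is why these methods are slower than distance-based ones.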

84
New cards

Strengths

85
New cards

Weaknesses

86
New cards

Maximum Likelihood

finds the tree that has the highest probability of producing the observed data given a specific model of evolution

87
New cards

Strengths

88
New cards

Weaknesses

89
New cards

Newick Format

field standard for representing trees in computer-readable form; allows us to see the connections between nodes
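A minimal sketch of writing a tree in Newick form, assuming the tree is held as nested tuples of taxon names and that only the topology is written (no branch lengths or internal labels). The function name is illustrative.

```python
def to_newick(tree):
    """Serialize a nested-tuple tree as a Newick string (topology only)."""
    def render(node):
        if isinstance(node, str):
            return node  # a leaf is written as its name
        # An internal node is its children, comma-separated, in parentheses.
        return "(" + ",".join(render(child) for child in node) + ")"
    return render(tree) + ";"  # a Newick tree ends with a semicolon

newick = to_newick((("A", "B"), ("C", "D")))
```

The nesting of parentheses mirrors the nodes of the tree, so taxa that share a parenthesized group share a common ancestor node.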

90
New cards
91
New cards
92
New cards

Tree Accuracy and Validation

93
New cards

Methods for Testing Accuracy

94
New cards

Purpose of Bootstrapping

95
New cards
96
New cards
97
New cards

DNA Sequencing

98
New cards
99
New cards

First Generation

100
New cards

Sanger-Sequencing

AKA chain termination method; AKA dideoxynucleotide sequencing; method for determining the nucleotide sequence of DNA; DNA template from PCR required