BIOINFORMATICS

Last updated 12:48 PM on 5/10/26
103 Terms

1

Margaret O. Dayhoff

compiled and edited the Atlas of Protein Sequence and Structure, a collection of amino acid sequences, and developed computer software for comparing sequences and detecting distantly related ones

2

EMBL

European Molecular Biology Laboratory; established its nucleotide sequence data library in 1980

3

NCBI

National Center for Biotechnology Information; established in the USA in 1988 and became a primary databank and provider of molecular biology information

4

Bioinformatics

combination of biology and informatics

5

In silico

analysis by computer

6

Human Genome Project

spurred the rapid rise of bioinformatics as a formal discipline

7

NCBI

creates automated systems for storing and analyzing knowledge about molecular biology, biochemistry and genetics

8

Identity

extent to which two sequences are the same

9

Alignment

lining up two or more sequences to search for maximal regions of identity or similarity

10

Local alignment

alignment of some portion of two sequences

11

Multiple sequence alignment

alignment of three or more sequences arranged with gaps so common residues align together

12

Optimal alignment

alignment of two sequences with the highest possible degree of identity (the best score under the chosen scoring scheme)

13

Conservation

sequence changes that maintain the properties of the original sequence

14

Similarity

extent to which sequences are related, expressed as percent identity and/or conservation

15

Algorithm

fixed set of commands in a computer program

16

Domain

discrete portion of a protein or DNA sequence

17

Motif

highly conserved short region in protein domains

18

Gap

space introduced in alignment to compensate for insertions or deletions

19

Homology

similarity attributed to descent from a common ancestor

20

Orthology

homology in different species due to a common ancestral gene

21

Paralogy

homology within the same species resulting from gene duplication

22

Query

sequence presented for comparison with all other sequences in a selected database

23

Annotation

description of functional structures such as introns or exons in DNA

24

Interface

point of meeting between a computer and an external entity

25

GenBank

genetic sequence database sponsored by the National Institutes of Health

26

PubMed

search service sponsored by the National Library of Medicine providing access to literature citations in Medline and related databases

27

SwissProt

manually curated protein sequence database maintained by the Swiss Institute of Bioinformatics together with EMBL-EBI; now the reviewed section of UniProtKB

28

International Union of Pure and Applied Chemistry and International Union of Biochemistry and Molecular Biology

organizations that defined the IUB universal nomenclature for mixed (ambiguous) bases

29

R

purine (A or G)

30

Y

pyrimidine (C or T)

31

M

A and C

32

K

G and T

33

S

C and G

34

W

A and T

35

H

A, C or T (not G)

36

B

C, G or T (not A)

37

V

A, C or G (not T)

38

D

A, G or T (not C)

39

N

A, C, G, T or any

40

X or ?

unknown A or C or G or T

41

O or -

deletion
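
The single-letter mixed-base codes on the cards above can be collected into a lookup table. The sketch below is only an illustration: the names `IUPAC`, `to_regex` and `matches` are hypothetical, and only the standard letter codes are included (not X, ?, O or -). It expands a degenerate pattern into a regular expression and tests a concrete sequence against it.

```python
import re

# Mixed-base codes as listed on the cards (R = A or G, Y = C or T, ...).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

def to_regex(pattern: str) -> str:
    """Turn a degenerate pattern such as 'GAYR' into a regex such as 'GA[CT][AG]'."""
    parts = []
    for code in pattern.upper():
        bases = IUPAC[code]
        parts.append(bases if len(bases) == 1 else f"[{bases}]")
    return "".join(parts)

def matches(pattern: str, sequence: str) -> bool:
    """True if the whole sequence is one possible reading of the pattern."""
    return re.fullmatch(to_regex(pattern), sequence.upper()) is not None

print(to_regex("GAYR"))
print(matches("GAYR", "GACA"))
```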

42

BLAST

Basic Local Alignment Search Tool

43

BLAST

used for homology searches

44

BLAST

searches GenBank maintained by NCBI

45

BLAST

searches for regions of local similarity between protein and nucleotide sequences

46

E-value

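number of matches to the query sequence with an equal or better score that would be expected by chance in a database of that size; the lower the value, the more significant the match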
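
The E-value is usually read as an expected count of chance hits. A minimal sketch, assuming the standard bit-score relation E = m * n * 2^(-S'), where m is the query length, n the database length and S' the bit score. The function name is hypothetical, and BLAST itself uses effective lengths that this sketch ignores.

```python
def expected_hits(bit_score: float, query_len: int, db_len: int) -> float:
    """Expected number of chance hits scoring at least `bit_score`.

    Assumes E = m * n * 2**(-S') with raw (not effective) lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)

# A higher bit score or a smaller search space both lower the E-value.
print(expected_hits(bit_score=40, query_len=40, db_len=1_000_000))
print(expected_hits(bit_score=60, query_len=40, db_len=1_000_000))
```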

47

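Very low E-values (e.g., 10^-12)

associated with highly significant matches that are very unlikely to occur by chance; perfect or near-perfect matches typically score this low or lower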

48

Mis-primes

caused by a primer having multiple potential binding sites in the template

49

Off-target products

amplification products from unintended regions, caused by a primer having multiple potential binding sites
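
One rough way to screen for multiple potential binding sites is to count where a primer sequence occurs in a template on either strand. The sketch below assumes exact matches only, which is a simplification (real primers can also anneal with mismatches), and the function name `binding_sites` is hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def binding_sites(primer: str, template: str) -> list[tuple[str, int]]:
    """List (strand, start) for every exact occurrence of the primer.

    '+' hits are the primer sequence itself in the given template string;
    '-' hits are its reverse complement, i.e. sites on the opposite strand.
    """
    primer, template = primer.upper(), template.upper()
    reverse_complement = primer.translate(COMPLEMENT)[::-1]
    sites = []
    for strand, probe in (("+", primer), ("-", reverse_complement)):
        start = template.find(probe)
        while start != -1:
            sites.append((strand, start))
            start = template.find(probe, start + 1)
    return sites

# More than one site for a primer hints at possible mis-priming.
print(binding_sites("ACG", "TTACGGGCGTACG"))
```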

50

GenBank

international nucleotide sequence database and repository of NCBI

51

ENA

international nucleotide sequence database and repository of EMBL-EBI

52

DDBJ

nucleotide sequence database in Japan

53

UniProt

protein database with sequence and functional annotation

54

Ensembl

vertebrate and eukaryotic genomes database

55

Ensembl genomes

genome-scale data for bacteria, protists, fungi, plants and invertebrate metazoa

56

InterPro

functional analysis database for protein sequences

57

Pfam

manually curated collection of protein domain families

58

FASTA

most widely used format in bioinformatics

59

.fasta or .fa

file extensions for FASTA

60

FASTA

format in which each record begins with a header line starting with the greater-than symbol >, followed by the sequence
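
Because every FASTA record starts with a `>` header line, the format is easy to read by hand. A minimal sketch, assuming plain text input with no comment lines; the function name `parse_fasta` and the example records are made up for illustration.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Map each header (without the leading '>') to its joined sequence lines."""
    records: dict[str, list[str]] = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}

example = ">seq1 demo\nACGT\nTTGA\n>seq2\nGGCC\n"
print(parse_fasta(example))
```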

61

GenBank file

starts with a LOCUS line, followed by annotation and then the sequence itself

62

ORIGIN

beginning of sequence in GenBank format

63

//

ending of GenBank sequence
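
The ORIGIN and // markers are enough to pull the sequence out of a GenBank flat file. The sketch below assumes a single record and ignores all annotation; the function name `genbank_sequence` and the tiny example record are hypothetical.

```python
def genbank_sequence(text: str) -> str:
    """Return the bases found between the ORIGIN line and the // terminator."""
    bases = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
        elif line.startswith("//"):
            break  # end of the record
        elif in_origin:
            # Sequence lines carry position numbers and spaces; keep letters only.
            bases.extend(ch for ch in line if ch.isalpha())
    return "".join(bases).upper()

record = (
    "LOCUS       DEMO        12 bp    DNA\n"
    "ORIGIN\n"
    "        1 ccagagtcca gc\n"
    "//\n"
)
print(genbank_sequence(record))
```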

64

EMBL file

used by European Molecular Biology Laboratory

65

ID

identifier marking beginning of EMBL file

66

SQ

start of sequence in EMBL format

67

.aln

typical file extension for CLUSTAL

68

CLUSTAL

multiple sequence alignment format used for phylogenetic algorithms

69

Dashes -

indicate gaps (insertions or deletions) in CLUSTAL

70

.nex or .nxs

typical file extensions for NEXUS

71

NEXUS

begins with the token "#NEXUS", followed by blocks containing data and commands

72

.phy or .ph

typical file extensions for PHYLIP

73

Sequence alignment

arranging DNA, RNA or protein sequences to identify regions of similarity

74

Reference sequence

known sequence

75

Query sequence

unknown sequence

76

Global alignment

uses Needleman-Wunsch algorithm

77

Global alignment

assumes sequences are similar over entire length
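
Global alignment with Needleman-Wunsch fills a score table over the full length of both sequences. The sketch below computes only the optimal score, not the alignment itself, and assumes a simple scoring scheme (match +1, mismatch -1, linear gap -1) chosen for illustration; the function name is hypothetical.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (end gaps are penalized)."""
    # Row 0: aligning an empty prefix of `a` against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # column 0: a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]  # bottom-right cell covers both sequences end to end

print(needleman_wunsch_score("GATTACA", "GATTACA"))
print(needleman_wunsch_score("ACGT", "ACT"))
```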

78

Local alignment

based on Smith-Waterman

79

Local alignment

finds local regions with highest similarity
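
Smith-Waterman differs from the global method in two ways: scores are never allowed to drop below zero, and the answer is the best cell anywhere in the table. A minimal score-only sketch, assuming match +2, mismatch -1 and linear gap -1 as illustrative parameters; the function name is hypothetical.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The 0 lets a new local alignment start at any cell.
            cell = max(0, diagonal, prev[j] + gap, curr[j - 1] + gap)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best

# Only the shared "ACG" core contributes to the score.
print(smith_waterman_score("TTTACGTTT", "GGACGGG"))
```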

80

Pairwise sequence alignment

used by BLAST

81

Dot matrix

old method of producing pairwise alignments

82

Dynamic programming algorithm

advanced method of producing pairwise alignments

83

Word or K-tuple method

advanced method used in FASTA and BLAST

84

Dot plots

another term for dot matrix
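
A dot matrix places one sequence along each axis and marks every cell where the two residues are identical, so similar regions show up as diagonal runs. The text-based sketch below uses no window or threshold filtering, and the function name `dot_plot` is hypothetical.

```python
def dot_plot(a: str, b: str) -> str:
    """Render a dot matrix: rows follow `a`, columns follow `b`, '*' marks identity."""
    rows = ["  " + b]  # header row with the column sequence
    for x in a:
        rows.append(x + " " + "".join("*" if x == y else "." for y in b))
    return "\n".join(rows)

# Identical sequences give an unbroken main diagonal.
print(dot_plot("GATTACA", "GATTACA"))
```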

85

Richard Bellman

introduced the dynamic programming method in the 1950s

86

Word or K-tuple method

identifies short non-overlapping subsequences of query sequence
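
The word idea can be sketched as follows: cut the query into short non-overlapping words of length k, then look up where each word occurs in a subject sequence. This only illustrates the card's description; the seeding, scoring and extension steps of the actual FASTA and BLAST programs are not reproduced, and both function names are hypothetical.

```python
def query_words(query: str, k: int) -> list[str]:
    """Split the query into consecutive non-overlapping words of length k."""
    return [query[i:i + k] for i in range(0, len(query) - k + 1, k)]

def word_hits(query: str, subject: str, k: int = 3) -> list[tuple[str, int, int]]:
    """List (word, query_position, subject_position) for every exact word match."""
    hits = []
    for q_pos in range(0, len(query) - k + 1, k):
        word = query[q_pos:q_pos + k]
        s_pos = subject.find(word)
        while s_pos != -1:
            hits.append((word, q_pos, s_pos))
            s_pos = subject.find(word, s_pos + 1)
    return hits

print(query_words("ACGTAC", 3))
print(word_hits("ACGTAC", "TTACGACG"))
```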

87

BLASTp

compares amino acid query sequence against protein database

88

BLASTn

compares nucleotide query sequence against nucleotide database

89

BLASTx

searches six frame translation products of nucleotide sequence against protein database

90

tBLASTn

searches protein sequence against translated nucleotide sequence database

91

tBLASTx

compares six frame translations of nucleotide query against six frame translations of database
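
A six-frame translation reads a nucleotide sequence in three offsets on the forward strand and three on the reverse complement. The sketch below assumes the standard genetic code and unambiguous A/C/G/T input (anything else becomes X); the frame labels and function names are hypothetical, and trailing incomplete codons are dropped.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]

def translate(seq: str) -> str:
    """Translate complete codons only; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq: str) -> dict[str, str]:
    """Return translations for frames +1, +2, +3 and -1, -2, -3."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames

print(six_frames("ATGGCC"))
```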

92

Mega BLAST

optimized for rapidly aligning long, highly similar DNA sequences (roughly 95% identity or more)

93

PSI BLAST

position specific iterated BLAST

94

PHI BLAST

pattern hit initiated BLAST

95

Nucleotide BLAST

option clicked first when performing BLAST procedure

96

CCAGAGTCCAGCTGCTGCTCATACTACTGATACTGCTGGG

example sequence used for BLAST practice

97

BLASTn

program selected under program selection category during procedure

98

E-value less than 1.0

indicates potentially significant alignments; the closer the E-value is to 0, the more significant the match

99

Percent identity

percentage of aligned positions that are identical between the query and subject sequences
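
Percent identity can be computed directly from two already-aligned sequences. The sketch below assumes both strings have the same length, that gaps are written as `-`, and that the denominator is the total number of alignment columns (tools differ on this choice); the function name is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns where both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = len(aligned_a)
    if columns == 0:
        return 0.0
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")  # a gap never counts as a match
    return 100.0 * identical / columns

print(percent_identity("ACGT", "ACCT"))
```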

100

Accession number

unique identifier assigned to a sequence record when it is submitted to a database