BIOINFORMATICS

Last updated 12:48 PM on 5/10/26

103 Terms

New cards

Margaret O. Dayhoff

compiled and edited the Atlas of Protein Sequence and Structure, a collection of amino acid sequences, and developed computer software for comparing sequences and detecting distantly related ones

New cards

EMBL

established data library in 1980

New cards

NCBI

established in the USA in 1988; became the primary databank and provider of molecular biology information

New cards

Bioinformatics

combination of biology and informatics

New cards

In silico

analysis by computer

New cards

Human Genome Project

spurred the rapid rise of bioinformatics as a formal discipline

New cards

NCBI

creates automated systems for storing and analyzing knowledge about molecular biology, biochemistry and genetics

New cards

Identity

extent to which two sequences are the same

New cards

Alignment

lining up two or more sequences to search for maximal regions of identity or similarity

New cards

Local alignment

alignment of some portion of two sequences

New cards

Multiple sequence alignment

alignment of three or more sequences arranged with gaps so common residues align together

New cards

Optimal alignment

alignment of two sequences with the highest possible score, i.e. the best degree of identity or similarity

New cards

Conservation

sequence changes that maintain the properties of the original sequence

New cards

Similarity

relatedness of sequences, expressed as percent identity or conservation

New cards

Algorithm

fixed set of commands in a computer program

New cards

Domain

discrete portion of a protein or DNA sequence

New cards

Motif

highly conserved short region in protein domains

New cards

Gap

space introduced in alignment to compensate for insertions or deletions

New cards

Homology

similarity attributed to descent from a common ancestor

New cards

Orthology

homology in different species due to a common ancestral gene

New cards

Paralogy

homology within the same species resulting from gene duplication

New cards

Query

sequence presented for comparison with all other sequences in a selected database

New cards

Annotation

description of functional structures such as introns or exons in DNA

New cards

Interface

point of meeting between a computer and an external entity

New cards

GenBank

genetic sequence database sponsored by the National Institutes of Health

New cards

PubMed

search service sponsored by the National Library of Medicine providing access to literature citations in Medline and related databases

New cards

SwissProt

manually curated protein sequence database maintained by the SIB Swiss Institute of Bioinformatics and EMBL-EBI, now part of UniProt

New cards

International Union of Pure and Applied Chemistry and International Union of Biochemistry and Molecular Biology

organizations that established the IUPAC-IUB universal nomenclature for mixed bases

New cards

R

purine or A and G

New cards

Y

pyrimidine or C and T

New cards

M

A and C

New cards

K

G and T

New cards

S

C and G

New cards

W

A and T

New cards

H

A, C, T or not G

New cards

B

C, G, T or not A

New cards

V

A, C, G or not T

New cards

D

A, G, T or not C

New cards

N

A, C, G, T or any
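
The base sets on the mixed-base cards above correspond to the standard IUPAC one-letter ambiguity codes. A minimal Python sketch of that lookup follows; the names IUPAC, matches, and expand are illustrative choices, not part of any library.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}

def matches(code: str, base: str) -> bool:
    """Return True if the concrete base is allowed by the ambiguity code."""
    return base.upper() in IUPAC[code.upper()]

def expand(pattern: str) -> list:
    """List every concrete sequence a mixed-base pattern can stand for."""
    choices = (IUPAC[c] for c in pattern.upper())
    return ["".join(combo) for combo in product(*choices)]
```

For example, expand("AR") lists the two sequences a primer written as AR could represent.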

New cards

X or ?

unknown A or C or G or T

New cards

O or -

deletion

New cards

BLAST

Basic Local Alignment Search Tool

New cards

BLAST

used for homology searches

New cards

BLAST

searches GenBank maintained by NCBI

New cards

BLAST

searches for regions of local similarity between protein or nucleotide sequences

New cards

E-value

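expected number of matches with an equal or better score that would occur by chance in a database of that size; the lower the value, the more significant the match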
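
As a rough illustration, BLAST statistics relate the E-value to a normalized bit score S' by E = m × n × 2^(-S'), where m is the query length and n is the database length. The sketch below assumes raw lengths; BLAST itself uses adjusted "effective" lengths, so real reports will differ slightly. The function name is hypothetical.

```python
def expect_value(bit_score: float, query_len: int, db_len: int) -> float:
    """Approximate E-value from a bit score: E = m * n * 2**(-S').

    Assumption: raw lengths are used instead of BLAST's effective lengths.
    """
    return query_len * db_len * 2.0 ** (-bit_score)
```

Each additional bit of score halves the number of chance matches expected.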

New cards

Very low E-values 10^-12

associated with perfect or near-perfect matches that are very unlikely to occur by chance

New cards

Mis-primes

caused by multiple potential binding sites

New cards

Off-target products

caused by multiple potential binding sites

New cards

GenBank

international nucleotide sequence database and repository of NCBI

New cards

ENA

international nucleotide sequence database and repository of EMBL-EBI

New cards

DDBJ

nucleotide sequence database in Japan

New cards

UniProt

protein database with sequence and functional annotation

New cards

Ensembl

vertebrate and eukaryotic genomes database

New cards

Ensembl genomes

genome-scale data for bacteria, protists, fungi, plants and invertebrate metazoa

New cards

InterPro

functional analysis database for protein sequences

New cards

Pfam

manually curated collection of protein domain families

New cards

FASTA

most widely used sequence file format in bioinformatics

New cards

.fasta or .fa

file extensions for FASTA

New cards

FASTA

format beginning with greater than symbol >
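
A minimal sketch of reading this format in Python, assuming each record is one header line starting with > followed by one or more sequence lines. The function name and the example record are made up for illustration.

```python
def parse_fasta(text: str) -> list:
    """Return a list of (header, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is not None:
            chunks.append(line)  # sequence may wrap over several lines
    if header is not None:
        records.append((header, "".join(chunks)))
    return records

# Hypothetical two-record input.
example = ">seq1 hypothetical record\nCCAGAGTCCA\nGCTGCT\n>seq2\nACGT\n"
```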

New cards

GenBank file

starts with a LOCUS line, followed by annotation and then the sequence itself

New cards

ORIGIN

beginning of sequence in GenBank format

New cards

//

end of the sequence in GenBank format
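
In a GenBank flat file the sequence sits between the ORIGIN line and the terminating // line, with position numbers and spaces mixed in. A minimal sketch of pulling it out follows; the function name and the tiny record are hypothetical, and real files carry many more annotation lines.

```python
def genbank_sequence(text: str) -> str:
    """Extract the sequence found between ORIGIN and // in a GenBank record."""
    pieces = []
    in_origin = False
    for line in text.splitlines():
        if line.startswith("ORIGIN"):
            in_origin = True
            continue
        if line.startswith("//"):
            break  # end of record
        if in_origin:
            # Drop position numbers and spaces, keep only the letters.
            pieces.append("".join(ch for ch in line if ch.isalpha()))
    return "".join(pieces).upper()

# Hypothetical minimal record.
record = "LOCUS       TEST 12 bp\nORIGIN\n        1 ccagagtcca gc\n//\n"
```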

New cards

EMBL file

used by European Molecular Biology Laboratory

New cards

ID

identifier marking the beginning of an EMBL file

New cards

SQ

start of the sequence in EMBL format

New cards

.aln

typical file extension for CLUSTAL

New cards

CLUSTAL

multiple sequence alignment format used for phylogenetic algorithms

New cards

Dashes -

indicate deletions in CLUSTAL

New cards

.nex or .nxs

typical file extensions for NEXUS

New cards

NEXUS

begins with the token “#NEXUS” followed by blocks containing data and commands

New cards

.phy or .ph

typical file extensions for PHYLIP

New cards

Sequence alignment

arranging DNA, RNA or protein sequences to identify regions of similarity

New cards

Reference sequence

known sequence

New cards

Query sequence

unknown sequence

New cards

Global alignment

uses Needleman-Wunsch algorithm

New cards

Global alignment

assumes sequences are similar over entire length
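
A minimal sketch of the Needleman-Wunsch scoring step in Python, computing only the optimal global score without the traceback. The scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration.

```python
def needleman_wunsch_score(a: str, b: str, match: int = 1,
                           mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b (linear gap penalty)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against nothing costs one gap per residue.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap in b
            left = score[i][j - 1] + gap    # gap in a
            score[i][j] = max(diag, up, left)
    # Global alignment: the answer covers both sequences end to end.
    return score[-1][-1]
```

Because every residue must be aligned or gapped, sequences of very different lengths are penalized heavily, which is why the method suits sequences that are similar over their entire length.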

New cards

Local alignment

based on Smith-Waterman

New cards

Local alignment

finds local regions with highest similarity
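
A minimal sketch of Smith-Waterman scoring in Python. It differs from the global method in two ways: cell scores are never allowed to drop below zero, and the result is the best cell anywhere in the matrix. The scoring values (match +2, mismatch -1, gap -1) are assumptions for illustration.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -1) -> int:
    """Best local alignment score between any substrings of a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Flooring at 0 lets a new local alignment start at any cell.
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```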

New cards

Pairwise sequence alignment

used by BLAST

New cards

Dot matrix

old method of producing pairwise alignments
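
A minimal sketch of a dot matrix in Python: one sequence runs down the rows, the other across the columns, and a cell is marked where the residues are identical. Runs of marks along a diagonal indicate regions of similarity. The function names are illustrative.

```python
def dot_matrix(a: str, b: str) -> list:
    """Boolean grid: cell [i][j] is True when a[i] equals b[j]."""
    return [[x == y for y in b] for x in a]

def render(a: str, b: str) -> str:
    """Text picture of the dot matrix, '*' for a match and '.' otherwise."""
    rows = ["  " + " ".join(b)]
    for residue, row in zip(a, dot_matrix(a, b)):
        rows.append(residue + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(rows)
```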

New cards

Dynamic programming algorithm

advanced method of producing pairwise alignments

New cards

Word or K-tuple method

advanced method used in FASTA and BLAST

New cards

Dot plots

another term for dot matrix

New cards

Richard Bellman

introduced the dynamic programming method in the 1950s

New cards

Word or K-tuple method

identifies short non-overlapping subsequences (words) of the query sequence, which are then matched against database sequences
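
A minimal sketch of the word idea in Python, assuming exact word matches only. The query is cut into words of length k (non-overlapping by default, as on the card), the subject is indexed by every k-letter word it contains, and shared words give seed positions. The function names are hypothetical; FASTA and BLAST add scoring and extension steps that are not shown.

```python
from collections import defaultdict

def split_words(seq: str, k: int, step: int = None) -> list:
    """Return (position, word) pairs; step=k gives non-overlapping words."""
    step = k if step is None else step
    return [(i, seq[i:i + k]) for i in range(0, len(seq) - k + 1, step)]

def seed_hits(query: str, subject: str, k: int) -> list:
    """Return (query_pos, subject_pos) pairs where a query word occurs in subject."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)  # every overlapping subject word
    return [(i, j) for i, word in split_words(query, k) for j in index.get(word, [])]
```

Passing step=1 to split_words yields overlapping words instead.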

New cards

BLASTp

compares amino acid query sequence against protein database

New cards

BLASTn

compares nucleotide query sequence against nucleotide database

New cards

BLASTx

compares the six-frame translation products of a nucleotide query sequence against a protein database

New cards

tBLASTn

compares a protein query sequence against a nucleotide sequence database translated in all six reading frames

New cards

tBLASTx

compares six frame translations of nucleotide query against six frame translations of database
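
The "six frames" used by BLASTx, tBLASTn and tBLASTx are the three reading frames of the forward strand plus the three of the reverse complement. A minimal Python sketch follows, assuming the standard genetic code and DNA input containing only A, C, G and T; the function names and frame labels are illustrative.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def reverse_complement(seq: str) -> str:
    """Complement each base, then reverse the strand."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]

def translate(seq: str) -> str:
    """Translate complete codons; '*' is stop, 'X' an unrecognized codon."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X")
                   for i in range(0, len(seq) - 2, 3))

def six_frames(seq: str) -> dict:
    """Translations of frames +1..+3 (forward) and -1..-3 (reverse strand)."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames
```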

New cards

Mega BLAST

optimized for rapidly aligning highly similar nucleotide sequences, including long DNA sequences

New cards

PSI BLAST

position specific iterated BLAST

New cards

PHI BLAST

pattern hit initiated BLAST

New cards

Nucleotide BLAST

option clicked first when performing BLAST procedure

New cards

CCAGAGTCCAGCTGCTGCTCATACTACTGATACTGCTGGG

example sequence used for BLAST practice

New cards

BLASTn

program selected under program selection category during procedure

New cards

E-value less than 1.0

indicates significant alignments

New cards

Percent identity

percentage of aligned positions at which the query and subject sequences are identical

100
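
A minimal sketch of the percent identity calculation in Python, assuming two already-aligned sequences of equal length with - for gaps. The denominator here is the number of alignment columns; other conventions exist, so treat this as one reasonable choice. The function name is illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both rows."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns that are gaps in both rows.
    columns = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * identical / len(columns)
```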

New cards

Accession number

unique identifier assigned to a sequence record when it is entered into a database