BIOINFORMATICS
is a combination of information technology and molecular biology
BIOINFORMATICS
It is being used largely in the field of human genome research
BIOINFORMATICS
is also used to store and organize, in software, the different discoveries made in sequences
BIOINFORMATICS
It can also be used in understanding diseases, finding new molecular targets, drug discovery, etc.
BIOINFORMATICS
The study of how information is represented and transmitted in biological systems, starting at the molecular level
BIOINFORMATICS
is the merger of biology with information technology
COMPUTATIONAL BIOLOGY
Bioinformatics dedicated specifically to handling sequence information is a form of ____?
BIOINFORMATICS
also used to store and organize large amounts of data in databases such as those used in clinical sequence analysis
BIOINFORMATICS
used due to vast amount of data arising from the sequence discovery.
BIOINFORMATICS
the science of computer technology and developing computer databases to facilitate biological research.
Standard expression of sequence data
is important for the clear communication and organized storage of sequence data
Interpretation of sequence variants
Used in epidemiology to speciate organisms or to find homologies within or between species
Identification of new sequences
Useful for test and primer design
Uses of Sequence Information:
Pneumocystis jirovecii or Pneumocystis carinii
was first thought to be a protozoan present in sputum, but its sequence does not align with protozoan sequences; it matches the sequence of a fungus
NCBI
Commonly used database
SEQUENCE INFORMATION
includes the principles, practical aspects, and structural analysis
Polymorphic or heterozygous sequences
are written as consensus sequences with proportional representation of the polymorphic bases
International Union of Pure and Applied Chemistry (IUPAC) and the International Union of Biochemistry and Molecular Biology (IUB)
have assigned a universal nomenclature for mixed, degenerate, or wobble bases
Consensus Sequences
if there is a mutation in the heterogeneous sequences, there may be more than 1 base or mix bases at the same position in the sequence.
A, G
symbol: R
bases: ??
Mnemonic: PURINE
C, T
symbol: Y
bases: ??
Mnemonic: PYRIMIDINE
G, T
symbol: K
bases: ??
Mnemonic: KETO
A, C
symbol: M
bases: ??
Mnemonic: AMINO
C, G
symbol: S
bases: ??
Mnemonic: 3 H BONDS
A, T
symbol: W
bases: ??
Mnemonic: 2 H BONDS
A, C, T
symbol: H
bases: ??
Mnemonic: NOT G
C, G, T
symbol: B
bases: ??
Mnemonic: NOT A
A, C, G
symbol: V
bases: ??
Mnemonic: NOT T
A, G, T
symbol: D
bases: ??
Mnemonic: NOT C
A, C, G, T
symbol: N
bases: ??
Mnemonic: ANY
UNKNOWN
symbol: X, ?
bases: ??
Mnemonic: A or C or G or T
DELETION
symbol: O, -
bases: ??
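The symbol cards above can be turned into a small lookup table. The following is a minimal sketch, not a standard library API: the names `IUPAC_CODES` and `expand_degenerate` are made up for illustration, and the unknown (X, ?) and deletion (O, -) symbols are left out because they do not map to a fixed set of bases.

```python
from itertools import product

# IUPAC ambiguity symbols and the bases each one stands for,
# taken from the cards above (R = purine, Y = pyrimidine, and so on).
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "K": "GT", "M": "AC",
    "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT",
    "N": "ACGT",
}


def expand_degenerate(sequence):
    """Return every plain A/C/G/T sequence that a degenerate sequence can represent."""
    choices = [IUPAC_CODES[symbol] for symbol in sequence.upper()]
    return ["".join(combo) for combo in product(*choices)]


# "ARY" means: A, then A or G, then C or T.
print(expand_degenerate("ARY"))
```

A sequence with many mixed bases expands quickly (each N multiplies the count by four), so this is only practical for short primers or probes.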
Basic Local Alignment Search Tool
BLAST
GENE SEQUENCE
FASTA format =
ARRANGED
GenBank =
Basic Local Alignment Search Tool
System used for homology searches
Basic Local Alignment Search Tool
searches GenBank at the National Center for Biotechnology Information (NCBI)
Basic Local Alignment Search Tool
Also useful in epidemiology. You can confirm that bacteria belong to the same genus through their DNA sequences.
Basic Local Alignment Search Tool
uses GenBank which is also a database for all DNA sequences that were discovered.
Basic Local Alignment Search Tool
is a tool used to align two sequences
Basic Local Alignment Search Tool
Comparing gene and protein sequences against others in public databases
Basic Local Alignment Search Tool
is a set of sequence comparison algorithms used to search databases for optimal local alignments to a query
Basic Local Alignment Search Tool
It breaks the query and database sequences into fragments and seeks matches between them
Basic Local Alignment Search Tool
is a computer algorithm that is available for use online at the National Center for Biotechnology Information (NCBI) website and many other sites
Local Alignment
finding similarities within a specific region of a DNA sequence.
Global Alignment
finding similarities from one end to another end, whether they are matching or mismatching.
Basic Local Alignment Search Tool
is the most widely used program in bioinformatics
FASTA, GENBANK FORMAT
Input sequences in either of these 2 formats
HTML, plain text, and XML formatting
BLAST output can be delivered in a variety of formats. These formats include ___?
Expect value (E)
is a parameter that describes the number of hits one can "expect" to see by chance when searching a database of a particular size
OUTPUT
shows all the records matching the query
Most of the time, it is in HTML format
mismatching
the higher the background noise, the higher the _____ sequence
matching
The lower the E, the lower the background noise, the higher the ___ sequence?
match
E value = 10^-12 = ?
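To see why a lower E value means less background noise, the expected number of chance hits can be estimated from an alignment's bit score. This sketch assumes the bit-score form of the Karlin-Altschul relationship, E = m x n x 2^(-S'), where m is the query length and n is the total database length; the function name and the example numbers are hypothetical, and real BLAST reports use adjusted (effective) lengths.

```python
def expected_chance_hits(bit_score, query_length, database_length):
    """Estimate E, the number of hits expected by chance, from a bit score.

    Assumes E = m * n * 2**(-S'), with m = query_length and
    n = database_length (total residues in the database).
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Hypothetical search: a 200-base query against a database of one billion bases.
for score in (20, 40, 80):
    e_value = expected_chance_hits(score, 200, 1_000_000_000)
    print(f"bit score {score}: E = {e_value:.3g}")
```

Higher scores drive E toward zero, and a larger database raises E for the same score, which is why E is described relative to a database of a particular size.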
Nucleotide BLAST
DNA sequences
Protein BLAST
amino acid sequences (which are themselves derived from the information in the DNA)
High Scoring Segment Pair (HSP)
local alignment used for aligning two DNA sequences without a graph
High Scoring Segment Pair (HSP)
Matches, mismatches, and gaps each have a score.
match
= +2
mismatch
= -2
gap
= 0
HSP
The higher the ___, the higher the amount of match.
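The scoring cards above (match = +2, mismatch = -2, gap = 0) can be applied to an alignment that has already been written out. This is a minimal sketch with a hypothetical function name; it assumes both sequences are the same length and that "-" marks a gap.

```python
MATCH, MISMATCH, GAP = 2, -2, 0  # values from the cards above


def score_alignment(seq_a, seq_b):
    """Sum the per-column scores of two already-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            score += GAP        # a gap in either sequence
        elif base_a == base_b:
            score += MATCH      # identical bases
        else:
            score += MISMATCH   # different bases
    return score


# Four matches, one mismatch, one gap: 4 * 2 - 2 + 0
print(score_alignment("ACGT-A", "ACCTGA"))  # prints 6
```

More matching columns give a higher total, which is the relationship the HSP card asks about.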
EMBL
GenBank
DDBJ (DNA Data Bank of Japan)
PRIMARY BIOLOGICAL DATABASE OF NUCLEIC ACID?
PIR
MIPS
SWISS-PROT
TrEMBL
NRL-3D
PRIMARY BIOLOGICAL DATABASE OF PROTEIN?
PRIMARY BIOLOGICAL DATABASE
Also known as Archival Database
GenBank
best for nucleic acids; you can also find protein sequences here.
FASTA
stands for "fast-all" or "FastA"
FASTA
It was developed by W. R. Pearson and D. J. Lipman, and the algorithm can be accessed from the EBI site
FASTA
It was the first database similarity search tool developed, preceding the development of BLAST
FASTA
The alignment in diagonals is then refined
FASTA
Finds regions of similarity by first breaking the sequence into short subsequences, then searching for diagonals with highest density of words that match
FASTA
It is fast but is not guaranteed to find the best alignment; it may miss matches
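The word-and-diagonal idea described in the FASTA cards can be sketched as follows. This is only an illustration of the first step (counting matching words per diagonal), not the actual FASTA program; the function names and the word size `k = 3` are assumptions.

```python
from collections import Counter, defaultdict


def word_positions(sequence, k):
    """Map every length-k word in the sequence to its start positions."""
    index = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        index[sequence[start:start + k]].append(start)
    return index


def diagonal_hits(query, subject, k=3):
    """Count matching words on each diagonal (subject position minus query position)."""
    index = word_positions(subject, k)
    diagonals = Counter()
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            diagonals[j - i] += 1
    return diagonals


def best_diagonal(query, subject, k=3):
    """Return (diagonal, word_count) for the densest diagonal, or None if no word matches."""
    hits = diagonal_hits(query, subject, k)
    if not hits:
        return None
    return hits.most_common(1)[0]


# The query sits in the subject at an offset of 2, so diagonal 2 collects the most words.
print(best_diagonal("ACGTAC", "TTACGTACTT"))  # prints (2, 4)
```

Because only exact words are counted, a similar region that shares no exact word of length `k` with the query goes unnoticed, which is one way a word-based search can miss matches.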
FASTA
gives better results for nucleotide sequences than for protein sequences
FastP
is for protein sequences
FASTX and FASTY
compare a DNA query to a protein database.
TFASTA
compares a protein query to a DNA database.
FASTA format
is a text-based format that represents either a nucleotide sequence or a protein sequence, in which each base or amino acid is represented by a single-letter code.
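A FASTA file is plain text: each record starts with a ">" header line, followed by one or more lines of single-letter codes. The reader below is a minimal sketch with a hypothetical function name and made-up example records; it does not validate the sequence characters.

```python
def parse_fasta(text):
    """Return a dict mapping each header (without '>') to its full sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            header = line[1:]             # start a new record
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(chunks) for name, chunks in records.items()}


# Hypothetical example: one nucleotide record and one protein record.
example = """>example_dna made-up nucleotide record
ACGTACGT
ACGT
>example_protein made-up protein record
MKTAYIAK
"""
for name, sequence in parse_fasta(example).items():
    print(name, len(sequence))
```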
FASTA
can be used for both Local and Global Alignment
FASTA, BLAST
to infer relationships between sequences,
to identify members of gene families,
as a search tool for matching sequences
FASTA GRAPH
a simple technique. You find matches, mismatches, and gaps by scoring each position, then trace back to find the local similarities (or even the global similarities).
LOCAL ALIGNMENT
write only the parts of the DNA sequence that are similar or matching.
GLOBAL ALIGNMENT
write both matching and mismatching from end to end of the DNA sequence.
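The "score, then trace back" approach to local alignment can be sketched with the Smith-Waterman dynamic programming method. This is an illustrative sketch, not the method any particular tool uses: the function name is hypothetical, and the default scores simply reuse the +2 / -2 / 0 values from the scoring cards (pass a negative `gap` to penalize gaps, as most real scoring schemes do).

```python
def local_align(seq_a, seq_b, match=2, mismatch=-2, gap=0):
    """Smith-Waterman sketch: return (aligned_a, aligned_b, score) for the best local alignment."""
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    table = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Scoring: each cell holds the best score of an alignment ending there (never below 0).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            table[i][j] = max(0,
                              table[i - 1][j - 1] + sub,   # align two bases
                              table[i - 1][j] + gap,       # gap in seq_b
                              table[i][j - 1] + gap)       # gap in seq_a
            if table[i][j] > best:
                best, best_pos = table[i][j], (i, j)

    # Traceback: walk back from the best cell until the score drops to 0.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and table[i][j] > 0:
        sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if table[i][j] == table[i - 1][j - 1] + sub:
            out_a.append(seq_a[i - 1]); out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif table[i][j] == table[i - 1][j] + gap:
            out_a.append(seq_a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(seq_b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), best


# Only the shared region is reported; the unmatched ends are left out.
print(local_align("TTACGTTT", "GGACGTCC"))  # prints ('ACGT', 'ACGT', 8)
```

A global alignment would differ in two ways: scores are not floored at 0, and the traceback runs from the last cell to the first so that both sequences are written out from end to end.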
GENBANK FILE FORMAT
Genetic sequence database sponsored by NIH in USA
PubMed
searching tool for journals
SWISS-PROT FILE FORMAT
Protein database sponsored by Medical Research Group of UK (Europe)
Basic Local Alignment Search Tool (BLAST)
Gene Recognition and Assembly Internet Link (GRAIL)
FAST-All derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms (FASTA)
Phred
Polyphred
Phragment Assembly Program (Phrap)
The Institute for Genomic Research (TIGR Assembler)
Factura
SeqScape
Assign
Matchmaker
SOFTWARE PROGRAMS USED TO ANALYZE AND APPLY SEQUENCE DATA
Basic Local Alignment Search Tool
Compares an input sequence with all sequences in a selected database
Gene Recognition and Assembly Internet Link (GRAIL)
Finds gene-coding regions in DNA sequences
FAST-All derived from FAST-P (protein) and FAST-N (nucleotide) search algorithms (FASTA)
Rapid alignment of pairs of sequences by sequence patterns rather than individual nucleotides
Phred
Reads bases from original trace data and recalls the bases, assigning quality values to each base
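The quality value assigned to each base is commonly expressed on the Phred scale, Q = -10 x log10(P), where P is the estimated probability that the base call is wrong. The card does not state this formula, so treat the sketch below (with hypothetical function names) as background illustration rather than a description of the program's internals.

```python
import math


def phred_quality(error_probability):
    """Convert a base-call error probability into a Phred quality value."""
    return -10.0 * math.log10(error_probability)


def error_probability(quality):
    """Convert a Phred quality value back into an error probability."""
    return 10.0 ** (-quality / 10.0)


# Each step of 10 in quality means a tenfold drop in the chance of a wrong base call.
for q in (10, 20, 30, 40):
    print(f"Q{q}: error probability {error_probability(q):.4f}")
```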
Polyphred
Identifies single nucleotide polymorphisms (SNPs) among the traces and assigns a rank indicating how well the trace at a site matches the expected pattern for an SNP
Phragment Assembly Program (Phrap)
Uses user-supplied and internally computed data quality information to improve accuracy of assembly in the presence of repeats
The Institute for Genomic Research (TIGR Assembler)
Assembly tool developed by TIGR to build a consensus sequence from smaller-sequence fragments
Factura
Identifies sequence features such as flanking vector sequences, restriction sites, and ambiguities.
SeqScape
Mutation and SNP detection and analysis, pathogen subtyping, allele identification, and sequence confirmation
Assign
Allele identification software for haplotyping
New primer or probe
Query the primer or probe sequence to confirm that it belongs to the correct species and is not duplicated in multiple places in a genome
misprimes and off-target products
Primers and probes with multiple potential binding sites will produce ??
query
We first have to ___ the primer to confirm whether similar DNA exists and whether the primer could anneal to it.
PRIMER DESIGN
Also used to check the size of the amplicons
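The two primer checks above (a single binding site, and the size of the amplicon) can be illustrated on a short template. This is a toy sketch under strong assumptions: it looks only for exact matches within one template string, and all function names and sequences are made up. A real specificity check would query the primer against a sequence database such as GenBank.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(sequence.upper()))


def find_sites(template, primer):
    """Return every 0-based start position where the primer matches exactly."""
    sites = []
    start = template.find(primer)
    while start != -1:
        sites.append(start)
        start = template.find(primer, start + 1)  # allow overlapping matches
    return sites


def amplicon_size(template, forward, reverse):
    """Return the amplicon length, or None if either primer lacks a single, usable site."""
    forward_sites = find_sites(template, forward)
    # The reverse primer anneals to the opposite strand, so search for its reverse complement.
    reverse_sites = find_sites(template, reverse_complement(reverse))
    if len(forward_sites) != 1 or len(reverse_sites) != 1:
        return None  # no site, or multiple sites (risk of misprimes and off-target products)
    start, end = forward_sites[0], reverse_sites[0] + len(reverse)
    if reverse_sites[0] < start:
        return None  # primers do not face each other on this template
    return end - start


# Made-up template: forward primer binds at position 2, reverse primer near the end.
template = "GGATCGATTTTTTCCGGAGG"
print(amplicon_size(template, "ATCGA", "TCCGG"))  # prints 16
```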