Which journal maintains an authoritative, yearly updated list of molecular biology databases?
The Nucleic Acids Research (NAR) journal.
What is the key characteristic of a primary database?
It contains raw, experimental results and the initial experimental interpretation → GenBank for nucleotide sequences
What is the key characteristic of a secondary database?
It contains data derived from primary resources, such as collections of protein families or conserved domains → PFAM, SCOP
Nucleotide Sequence: UniGene
Protein Sequence: RefSeq, CCDS
Primary or secondary?
Primary
Collection of conserved protein sequence motifs: PFAM, CDD
Protein families: GPCRDB, CAZY
Conserved structural domains: SCOP, CATH, Superfamily
Primary or secondary?
Secondary
What are the three major international nucleotide sequence databases coordinated by the INSDC?
GenBank (NCBI, US)
EMBL-EBI European Nucleotide Archive (Europe)
DDBJ (DNA Databank of Japan)
What are the different types of nucleotide data stored in different databases?
Raw genomic sequences (chromosomal DNA)
cDNAs
Expressed Sequence Tag (EST) libraries
Sequence-Tagged Sites (STSs)
Genome Survey Sequences (GSSs)
High Throughput Genomic Sequence (HTGS)
Whole Genome Shotgun projects
What is the Reference Sequence (RefSeq) collection?
It provides the best representative sequence of each transcript or protein produced by a gene. There may be hundreds of GenBank entries corresponding to a gene, but only one RefSeq gene entry.
The RefSeq database aims to provide a comprehensive set of sequences. Which of the following is not a characteristic of RefSeq?
a. Non-redundant
b. Well annotated
c. Contains raw, unprocessed data
d. Provides representative sequences
c. Contains raw, unprocessed data. (RefSeq is curated and processed, unlike primary databases like GenBank).
What type of record has this starting accession code?
NM_
Transcript products (mature mRNA)
What type of record has this starting accession code?
NP_
Protein products
What type of record has this starting accession code?
NR_
Non-coding transcripts (structural RNAs, pseudogenes…)
What type of record has this starting accession code?
NC_
Complete genomic molecules (genomes, chromosomes, organelles, plasmids)
What type of record has this starting accession code?
NW_ NT_
Incomplete genomic assemblies. Contigs
What type of record has this starting accession code?
NZ_###
Collection of whole genome shotgun sequence data for a project (###). Unfinished.
What type of record has this starting accession code?
XM_
Automated model mRNA provided by genome annotation
What type of record has this starting accession code?
XP_, YP_, ZP_
Protein products
What type of record has this starting accession code?
XR_
Non-coding transcripts
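The accession-prefix cards above can be practiced as a small lookup table. The following is a minimal Python sketch, not an NCBI tool: the dictionary `REFSEQ_PREFIXES` and the function `refseq_record_type` are hypothetical names introduced here, the descriptions are copied from the cards, and the accession strings in the usage lines are illustrative inputs only.

```python
# Prefix -> record type, as listed on the cards above (assumed complete only
# for the prefixes covered in this deck).
REFSEQ_PREFIXES = {
    "NM_": "Transcript product (mature mRNA)",
    "NP_": "Protein product",
    "NR_": "Non-coding transcript",
    "NC_": "Complete genomic molecule",
    "NW_": "Incomplete genomic assembly (contig)",
    "NT_": "Incomplete genomic assembly (contig)",
    "NZ_": "Whole genome shotgun project data (unfinished)",
    "XM_": "Model mRNA from automated genome annotation",
    "XP_": "Protein product",
    "YP_": "Protein product",
    "ZP_": "Protein product",
    "XR_": "Non-coding transcript",
}


def refseq_record_type(accession: str) -> str:
    """Return the record type for an accession based on its first three characters."""
    prefix = accession.strip().upper()[:3]
    return REFSEQ_PREFIXES.get(prefix, "Unknown or non-RefSeq accession")


if __name__ == "__main__":
    # Illustrative inputs; only the prefix is inspected.
    for acc in ["NM_000546.6", "NP_000537.3", "NC_000001.11", "AB123456"]:
        print(acc, "->", refseq_record_type(acc))
```

Only the prefix is checked, so the sketch does not validate that an accession actually exists in RefSeq.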
Which NCBI resource provides integrated access to genes and genomes, connecting sequence, mapping, expression, and homology data from worldwide databases?
The NCBI Gene database
What is the primary purpose of the UniGene project?
To provide an organized, gene-oriented view of the transcriptome by automatically clustering Expressed Sequence Tags (ESTs) into non-redundant sets.
In the UniGene database, what does a cluster containing tens of thousands of ESTs most likely represent?
A highly expressed gene.
What does the Consensus Coding Sequence (CCDS) project try to identify?
A core set of human and mouse protein-coding regions that are consistently annotated and of high quality.
Which four groups collaborate in the CCDS?
EBI
NCBI
Wellcome Trust Sanger Institute
University of California Santa Cruz (UCSC)
What does the NCBI Genome explorer do?
Organizes information on genomes including sequences, maps, chromosomes, assemblies and annotations
Summarizes the available sequencing projects of any given organism
Allows a simple visualization of the genome content