What is bioinformatics?
-interdisciplinary field that combines biology, computer science, and mathematics
-used to analyse and interpret biological data (primarily sequence data: DNA, RNA, protein)
What are biological sequence databases?
NIG - DNA Data Bank of Japan, NCBI - GenBank (USA), EMBL-EBI - European Nucleotide Archive (Europe)
What are the different sequence formats?
-for DNA/RNA sequences: 4 nucleotide bases, written ACGT or acgt (U replaces T in RNA)
-IUPAC codes for nucleotides can be used for ambiguity in the sequence (e.g. N = any base)
-for protein sequences: 20 amino acids, written as single-letter codes
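A minimal sketch of how the nucleotide alphabet and IUPAC ambiguity codes could be checked in Python. The names `IUPAC_NUCLEOTIDES`, `is_valid_nucleotide_sequence`, and `expand_code` are made up for illustration and do not come from any library.

```python
# IUPAC nucleotide codes: each symbol maps to the bases it can stand for.
IUPAC_NUCLEOTIDES = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "U",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def is_valid_nucleotide_sequence(seq):
    """Return True if seq is non-empty and uses only IUPAC nucleotide codes.

    Upper and lower case are both accepted (ACGT or acgt).
    """
    return len(seq) > 0 and all(ch in IUPAC_NUCLEOTIDES for ch in seq.upper())


def expand_code(code):
    """Return the bases that a single IUPAC code can represent."""
    return IUPAC_NUCLEOTIDES[code.upper()]


print(is_valid_nucleotide_sequence("acgtN"))  # lower case and N are allowed
print(is_valid_nucleotide_sequence("ACGJ"))   # J is not a nucleotide code
print(expand_code("R"))                       # purine: A or G
```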
What is FASTA format?
-can have multiple sequences in one file
-each sequence has a single-line description starting with ">"; the sequence starts on the next line
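The FASTA layout above can be read with a few lines of Python. This is only a sketch: `parse_fasta` is a hypothetical helper and the two records in `example` are invented, not real database entries.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (description, sequence) tuples."""
    records = []
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new description line closes the previous record, if any.
            if header is not None:
                records.append((header, "".join(chunks)))
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines may wrap, so collect and join them.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Invented example data, for illustration only.
example = """>seq1 hypothetical example
ACGTAC
GTTA
>seq2 second hypothetical example
MKV
"""

records = parse_fasta(example)
for description, sequence in records:
    print(description, len(sequence))
```

Real-world files are usually better handled by an established parser; the sketch only shows the role of the ">" line.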
What is GenBank format?
-can have multiple annotation lines before the sequence
-can have multiple sequences in one file
-record starts with LOCUS; line before the sequence: ORIGIN; end of sequence: //
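A rough sketch of pulling the sequence out of GenBank-style text, assuming each record begins with a LOCUS line, the sequence follows an ORIGIN line, and the record ends with a `//` line (as in standard GenBank flat files). `genbank_sequences` is a hypothetical helper and the record in `example` is invented.

```python
def genbank_sequences(text):
    """Return a list of (locus_name, sequence) tuples from GenBank-style text."""
    records = []
    locus = None
    in_sequence = False
    chunks = []
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            parts = line.split()
            locus = parts[1] if len(parts) > 1 else ""
            chunks = []
            in_sequence = False
        elif line.startswith("ORIGIN"):
            in_sequence = True  # sequence lines start after this line
        elif line.startswith("//"):
            if locus is not None:
                records.append((locus, "".join(chunks)))
            locus = None
            in_sequence = False
        elif in_sequence:
            # Sequence lines carry a position number and spaces; keep letters only.
            chunks.append("".join(ch for ch in line if ch.isalpha()))
    return records


# Invented record, for illustration only.
example = """LOCUS       EXAMPLE1       12 bp    DNA
DEFINITION  hypothetical record for illustration.
ORIGIN
        1 acgtacgtac gt
//
"""

print(genbank_sequences(example))
```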
What is EMBL format?
-can have multiple annotation lines before the sequence
-can have multiple sequences in one file
-record starts with ID; line before the sequence: SQ; end of sequence: //
What does homologous mean?
genes, proteins, or other biological sequences that are evolutionarily related through descent from a common ancestral sequence (2 types: orthologues and paralogues)
What are orthologues?
-homologous genes found in different species that diverged from a common ancestral gene through speciation
-typically retain similar functions across species and are often involved in fundamental biological processes
What are paralogues?
-homologous genes within the same species that arose from gene duplication events
-similar in sequence, but may have diverged over time
What are sequence alignments?
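-alignments are scored with a substitution matrix
-used to score amino acid alignments in programs like BLAST
-positive score = conserved; typical of conservative substitutions (amino acids with similar properties)
-negative score = non-conserved; typical of radical substitutions (amino acids with dissimilar properties)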
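To make the scoring idea concrete, here is a small sketch that sums substitution scores over an ungapped alignment of two equal-length protein sequences. The values in `TOY_SCORES` are invented for illustration and are not taken from a published matrix; `pair_score` and `alignment_score` are hypothetical helper names.

```python
# Toy substitution scores (invented values, NOT a real published matrix).
TOY_SCORES = {
    ("L", "L"): 4,
    ("I", "I"): 4,
    ("D", "D"): 6,
    ("L", "I"): 2,   # conservative: both hydrophobic, positive score
    ("L", "D"): -4,  # radical: hydrophobic vs acidic, negative score
    ("I", "D"): -3,  # radical: hydrophobic vs acidic, negative score
}


def pair_score(a, b):
    """Look up the score for aligning residue a with residue b (symmetric)."""
    if (a, b) in TOY_SCORES:
        return TOY_SCORES[(a, b)]
    return TOY_SCORES[(b, a)]


def alignment_score(seq1, seq2):
    """Sum the substitution scores over an ungapped, equal-length alignment."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length (no gaps handled)")
    return sum(pair_score(a, b) for a, b in zip(seq1, seq2))


print(alignment_score("LID", "IID"))  # one conservative substitution
print(alignment_score("LLD", "DLD"))  # one radical substitution
```

Programs like BLAST also handle gaps and use full 20 x 20 matrices; the sketch leaves both out to keep the positive/negative scoring visible.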