bioinformatics midterm

67 Terms

1

Sequence Analysis

The goal of sequence analysis is to collect, compare, and understand biological data. A common task is to identify an unknown sequence.

2

BLAST (Basic Local Alignment Search Tool)

A program used to compare a given sequence against a database to find similar sequences.

3

E-value (Expected Value)

Indicates the number of hits one can 'expect' to see by chance when searching a database of a particular size. A low E-value suggests a more significant match.
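A rough sketch of why database size and score both matter: in its bit-score form, the E-value is E = m * n * 2^(-S'), where m is the query length, n is the total database length, and S' is the bit score. The function name `expected_hits` and the example numbers are illustrative assumptions, not values from the course.

```python
def expected_hits(bit_score, query_len, db_len):
    """E-value from a bit score: E = m * n * 2**(-S')."""
    return query_len * db_len * 2.0 ** (-bit_score)

# Same query and database, two different bit scores.
strong = expected_hits(bit_score=50, query_len=100, db_len=1_000_000)
weak = expected_hits(bit_score=20, query_len=100, db_len=1_000_000)

print(f"high score -> E = {strong:.2e}")  # far below 1: unlikely by chance
print(f"low score  -> E = {weak:.2f}")    # dozens of chance hits expected
```

Doubling `db_len` doubles the E-value, which is why the same alignment looks less significant in a larger database.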

4

Low E-value in BLAST results

Suggests a more significant match (one less likely to have occurred by chance) when searching for a similar sequence.

5

What do a low E-value and a high score indicate?

A likely hit.

6

Example Application of BLAST

An unknown HIV sequence was identified as an HIV-1 N434 retrovirus strain from Venezuela by using BLAST. The result showed a 100% query cover, a 0.0 E-value, and 100% identity.
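A minimal sketch of how the two percentages in a result like this can be computed from a gapped pairwise alignment. It assumes identity is counted over all alignment columns and query cover over the full query length; the function names and toy sequences are hypothetical, and BLAST's own reporting may differ in detail.

```python
def percent_identity(aligned_query, aligned_subject):
    """Percent of alignment columns where both sequences carry the same residue."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        q == s and q != "-" for q, s in zip(aligned_query, aligned_subject)
    )
    return 100.0 * matches / len(aligned_query)

def query_cover(aligned_query, query_length):
    """Percent of the query's residues that take part in the alignment."""
    covered = sum(ch != "-" for ch in aligned_query)
    return 100.0 * covered / query_length

print(percent_identity("ACGT", "ACGT"))  # every column matches
print(query_cover("ACGT", 4))            # the whole query is aligned
```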

7

Primary Structure

The sequence of amino acid residues.

8

Secondary Structure

Local folding of the polypeptide backbone into structures such as alpha-helices and beta-sheets.

9

Tertiary Structure

The overall three-dimensional shape of a single polypeptide chain.

10

Quaternary Structure

The arrangement of multiple polypeptide subunits assembled into a single complex.

11

X-ray Crystallography

A method for determining protein structure from the pattern of X-rays diffracted by a protein crystal.

12

Nuclear Magnetic Resonance (NMR)

A method for determining protein structure from the magnetic properties of atomic nuclei, typically with the protein in solution.

13

Cryo-Electron Microscopy (cryoEM)

A method for determining protein structure by imaging rapidly frozen samples with an electron microscope.

14

AlphaFold

An AI program that can predict protein structures with high accuracy. It uses an input sequence and searches genetic and structure databases to generate a 3D structure.

15

Bioinformatic Drug Design

Aims to create therapies by targeting specific proteins.

16

HIV-1 Protease Inhibition

The drug ritonavir inhibits the HIV-1 protease, which prevents the virus from cleaving its polyproteins and producing mature, infectious virus particles.

17

Types of RNA

Major types include messenger RNA (mRNA), transfer RNA (tRNA), and ribosomal RNA (rRNA). Other types like siRNA, miRNA, and lncRNA are also important.

18

RNA Structure

RNA is typically single-stranded but can have double-stranded regions. It contains the sugar ribose and the base Uracil (U) instead of Thymine (T).

19

Functions of RNA

Involved in translation, regulation of gene expression, and can act as enzymes (ribozymes) or regulatory elements (riboswitches).

20

RNA Life Cycle

Includes transcription, transport, and degradation by nucleases.

21

Techniques for RNA analysis

Hybridization, Northern blotting, microarrays, RNA-Seq.

22

Hybridization

The principle that Uracil (U) pairs with Adenine (A) and Cytosine (C) pairs with Guanine (G) is used to detect specific sequences.
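A small sketch of the pairing rule in use: build the probe that base-pairs with a target RNA, then look for places in a sample where a probe can bind. It assumes perfect Watson-Crick pairing only; the function names and sequences are made up for illustration.

```python
PAIR = {"A": "U", "U": "A", "C": "G", "G": "C"}

def complementary_probe(target_rna):
    """Return the RNA strand (written 5'->3') that pairs with the target."""
    return "".join(PAIR[base] for base in reversed(target_rna))

def binding_sites(probe, sample_rna):
    """Start positions in the sample where the probe can pair perfectly."""
    site = complementary_probe(probe)  # the sequence the probe recognizes
    width = len(site)
    return [
        i for i in range(len(sample_rna) - width + 1)
        if sample_rna[i:i + width] == site
    ]

probe = complementary_probe("AUGC")
print(probe)                             # GCAU
print(binding_sites(probe, "GGAUGCAA"))  # [2]
```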

23

Northern Blotting 

A technique to detect specific RNA sequences in a sample.

24

Microarrays

Used to measure the expression levels of large numbers of genes simultaneously.

25

RNA-Seq

A sequencing technique used to reveal the presence and quantity of RNA in a biological sample at a given moment in time.

26

Bioinformatic tools for RNA

Rfam, RNAanalyzer, RNAfold, Riboswitch Finder.

27

Rfam

A database containing a collection of RNA families, represented by sequence alignments, consensus secondary structures, and covariance models.

28

RNAanalyzer

A web-based tool for analyzing regulatory RNA elements and secondary structures from an RNA sequence.

29

RNAfold

A web server that predicts the secondary structure of single-stranded RNA sequences based on minimum free energy.
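To get a feel for secondary structure prediction, here is a sketch of the Nussinov algorithm, which maximizes the number of base pairs. This is a simpler stand-in, not the minimum-free-energy method the server uses; the `min_loop` value and the example sequence are assumptions.

```python
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def max_base_pairs(seq, min_loop=3):
    """Maximum number of nested base pairs (Nussinov dynamic programming)."""
    n = len(seq)
    if n == 0:
        return 0
    # best[i][j] = most pairs that can form within seq[i..j]
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving >= min_loop bases between
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]

print(max_base_pairs("GGGAAAUCC"))  # 3: a three-pair stem closing an AAA loop
```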

30

Riboswitch Finder

A tool to search RNA/DNA sequences for known riboswitches.

31

Sequencing Methods

Techniques like Sanger and Next-Generation Sequencing (NGS) produce short sequence reads.

32

Assembly

These short reads must be assembled by finding overlapping parts; this process is difficult for highly repetitive sequences.
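A toy sketch of overlap-based merging, and of why repeats cause trouble. It assumes error-free reads on the same strand; the function names, the `min_len` cutoff, and the reads are illustrative only.

```python
def all_overlaps(a, b, min_len=2):
    """Every length at which a suffix of read a equals a prefix of read b."""
    longest = min(len(a), len(b))
    return [
        size for size in range(longest, min_len - 1, -1)
        if a.endswith(b[:size])
    ]

def merge(a, b, min_len=2):
    """Join two reads at their longest overlap, or return None if none exists."""
    sizes = all_overlaps(a, b, min_len)
    return a + b[sizes[0]:] if sizes else None

print(merge("ATGCGT", "CGTTAA"))         # ATGCGTTAA
print(all_overlaps("ATATAT", "ATATGG"))  # [4, 2]: the repeat allows two joins
```

With the repetitive reads, both overlap lengths are consistent with the data, so the assembler cannot tell how many copies of the repeat the true sequence contains.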

33

Tools for Assembly 

BLAST can be used to compare a new sequence to an already determined one to aid in assembly, especially for matches with very low E-values (for example, below 1e-50).

34

Genome Annotation

The process of identifying the locations of genes and other biological features on a nucleotide sequence.

35

Annotation Pipelines

NCBI provides pipelines for prokaryotic (PGAP), eukaryotic (EGAP), and viral (VADR) genomes.

36

Approaches for genome annotation

Analyzing RNA sequencing data to identify transcribed regions.
Finding promoter sequences using databases like TRANSFAC.
Ab initio methods that predict genes based on sequence characteristics.
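The ab initio idea can be sketched with the simplest sequence characteristic there is: an open reading frame that starts with ATG and ends at a stop codon. Real gene finders use far richer models; this sketch scans the forward strand only, and `find_orfs`, `min_codons`, and the test sequence are assumptions for illustration.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (start, end) pairs for ATG...stop ORFs on the forward strand."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                # keep the ORF only if it has enough codons before the stop
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None
    return orfs

dna = "CCATGAAATTTGGGTAACC"
for start, end in find_orfs(dna):
    print(start, end, dna[start:end])  # 2 17 ATGAAATTTGGGTAA
```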

37

Human Genome Project (HGP)

A major international research effort to determine the sequence of the human genome and identify the genes that it contains.

38

ENCODE Project

The Encyclopedia of DNA Elements project aimed to systematically map regions of transcription, transcription factor association, chromatin structure, and histone modification.

39

To how much of the genome did the ENCODE project assign a biochemical function?

About 80%

40

Composition of the Human Genome

Only about 2-3% of the human genome consists of protein-coding genes; the majority is composed of introns (26%), repetitive elements like LINEs (20%) and SINEs (13%), and other non-coding DNA.

41

Intergenic

The DNA that lies between genes.

42

LTR (long terminal repeat)

Another class of repetitive DNA: repeated sequences found at both ends of retrotransposons and integrated retroviral DNA.

43

What percentage of the human genome is made up of interspersed elements (LINEs and SINEs)?

About 33% (LINEs 20% + SINEs 13%)

44

Hidden Markov Models (HMMs)

An HMM is a statistical model used to describe observable events that depend on underlying, unobservable 'hidden' states.

45

HMMs

Hidden Markov Models are used in bioinformatics for tasks like gene prediction, sequence alignment, and protein secondary structure prediction.
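A minimal sketch of decoding hidden states with the Viterbi algorithm. The two-state model (GC-rich versus AT-rich regions) and all of its probabilities are invented for illustration and are not from the course.

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most probable hidden-state path for an observed sequence (log space)."""
    # best[t][s] = log-probability of the best path that ends in state s at step t
    best = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}]
    back = [{}]
    for t in range(1, len(obs)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + math.log(trans_p[p][s])) for p in states),
                key=lambda item: item[1],
            )
            best[t][s] = score + math.log(emit_p[s][obs[t]])
            back[t][s] = prev
    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(obs) - 1, 0, -1):
        path.append(back[t][path[-1]])
    return path[::-1]

STATES = ("GC_rich", "AT_rich")
START = {"GC_rich": 0.5, "AT_rich": 0.5}
TRANS = {
    "GC_rich": {"GC_rich": 0.9, "AT_rich": 0.1},
    "AT_rich": {"GC_rich": 0.1, "AT_rich": 0.9},
}
EMIT = {
    "GC_rich": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "AT_rich": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}

print(viterbi("GCGCATAT", STATES, START, TRANS, EMIT))
```

The observed bases are the visible events; the region type at each position is the hidden state the model infers.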

46

HMMER

A tool for biosequence analysis using profile Hidden Markov Models.

47

Transmembrane proteins

Proteins that span the membrane; about 30% of a cell's proteins are found in membranes.

48

How long are transmembrane domains and what are they made of?

About 20 amino acids long.

Made of an alpha helix of hydrophobic amino acids.
The R groups stick out from the helix (for example, isoleucine or phenylalanine).
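These two properties (about 20 residues, hydrophobic) are what a sliding-window hydropathy scan looks for. The sketch below uses the Kyte-Doolittle hydropathy scale; the window of 20, the 1.6 cutoff, the function name, and the test protein are assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values (higher = more hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_windows(protein, window=20, threshold=1.6):
    """Return (start, mean hydropathy) for each window at or above the cutoff."""
    hits = []
    for i in range(len(protein) - window + 1):
        segment = protein[i:i + window]
        mean = sum(KD[aa] for aa in segment) / window
        if mean >= threshold:
            hits.append((i, round(mean, 2)))
    return hits

# A made-up protein: 20 hydrophobic residues flanked by charged ones.
protein = "KDKD" + "IF" * 10 + "KDKD"
best_start, best_mean = max(hydrophobic_windows(protein), key=lambda h: h[1])
print(best_start, best_mean)  # the strongest window starts at the I/F stretch
```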

49

TRANSFAC

A database of transcription factors and their DNA binding sites, used to find where transcription factors bind.

50

What percentage of genes encode transcription factors?

About 10%

51

Do transcription factors always enhance expression at their binding sites?

No, sometimes they repress it.

52

Metabolomics

The large-scale study of small molecules, or metabolites, within cells, biofluids, tissues, or organisms.

53

KEGG

Kyoto Encyclopedia of Genes and Genomes, a database resource for understanding high-level functions and utilities of biological systems, containing graphical maps of metabolic pathways.

54

Flux Balance Analysis (FBA)

A method to calculate the flow of metabolites through a metabolic network and predict growth rates using a stoichiometric matrix and linear programming.
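A toy sketch of the setup: a stoichiometric matrix S, the steady-state condition S * v = 0, flux bounds, and a biomass flux to maximize. To stay dependency-free, a brute-force search over integer fluxes stands in for linear programming, so this only works for tiny networks. The four-reaction network and its bounds are invented for illustration.

```python
from itertools import product

# Columns: uptake (-> A), convert (A -> B), biomass (B ->), waste (A ->)
REACTIONS = ("uptake", "convert", "biomass", "waste")
S = [
    [1, -1, 0, -1],  # metabolite A: made by uptake, used by convert and waste
    [0, 1, -1, 0],   # metabolite B: made by convert, used by biomass
]
UPPER = {"uptake": 10, "convert": 6, "biomass": 10, "waste": 10}

def steady_state(fluxes):
    """True when every metabolite is produced as fast as it is consumed."""
    return all(sum(c * v for c, v in zip(row, fluxes)) == 0 for row in S)

def best_flux():
    """Flux vector with the largest biomass flux among steady-state candidates."""
    grids = [range(UPPER[name] + 1) for name in REACTIONS]
    feasible = (v for v in product(*grids) if steady_state(v))
    return max(feasible, key=lambda v: v[REACTIONS.index("biomass")])

solution = dict(zip(REACTIONS, best_flux()))
print(solution)  # biomass is capped at 6 by the convert reaction's bound
```

Here the predicted growth rate is limited by the conversion step rather than by nutrient uptake, which is the kind of bottleneck the method is used to find.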

55

Elementary Mode Analysis (EMA)

A computational method that identifies all minimal, feasible metabolic pathways within a network.

56

METATOOL

A tool for metabolic modeling that computes elementary modes of a metabolic network.

57

CellNetAnalyzer

A MATLAB-based tool for metabolic modeling that analyzes the structure and function of metabolic and signaling networks.

58

COBRA Toolbox

A tool for metabolic modeling based on constraint-based reconstruction and analysis, including flux balance analysis.

59

Cytoscape

A tool for visualizing molecular interaction networks, including metabolic pathways from databases like KEGG.

60

Systems Biology

The study of complex interactions between components of a biological system to understand the system as a whole.

61

Boolean Models

Logic-based systems that use binary variables and logic operators (AND, OR, NOT) to represent biological processes like gene regulation.

They don't need detailed kinetic data.
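A small sketch of such a model, updated synchronously (every node is recomputed from the previous state at once). The five-node network and its rules are hypothetical, chosen only to show AND, OR, and NOT in use.

```python
def step(state):
    """Compute the next state of every node from the current state."""
    return {
        # Inputs are held constant.
        "signal_a": state["signal_a"],
        "signal_b": state["signal_b"],
        "repressor": state["repressor"],
        # kinase is ON if either signal is present (OR).
        "kinase": state["signal_a"] or state["signal_b"],
        # gene is ON if the kinase is active AND the repressor is absent (NOT).
        "gene": state["kinase"] and not state["repressor"],
    }

def simulate(state, steps):
    """Return the list of states visited, starting with the initial one."""
    history = [state]
    for _ in range(steps):
        state = step(state)
        history.append(state)
    return history

start = {"signal_a": True, "signal_b": False, "repressor": False,
         "kinase": False, "gene": False}
for t, s in enumerate(simulate(start, 2)):
    print(t, "kinase:", s["kinase"], "gene:", s["gene"])
```

Each node is simply ON or OFF, so the only inputs needed are the wiring and the logic rules, with no rate constants.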

62

CellDesigner

A program used to add components and their interconnections when building a biological model.

Simulates the network dynamics using Boolean commands.

The model is then compared to experimental data for validation.

63

MAPK/ERK Pathway 

A chain of proteins that communicates a signal from a cell surface receptor to the DNA in the nucleus, often involving phosphorylation

64

G-protein coupled receptors (GPCR)

A class of cell-surface receptors that signal through G proteins and play a role in signaling pathways, including those involved in heart failure.

65

Receptor Tyrosine Kinases (RTK)

A class of cell-surface receptors with tyrosine kinase activity that are involved in signaling pathways, including those related to heart failure.

66

Ordinary Differential Equations (ODEs)

Mathematical equations that describe how quantities such as concentrations change over time; used for quantitative modeling when experimental data on concentration changes over time are available.
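A minimal sketch of an ODE model: a molecule made at a constant rate and degraded in proportion to its concentration, dC/dt = k_syn - k_deg * C, integrated with the simple Euler method. The equation, rate constants, and step size are illustrative assumptions; real models would be fitted to measured time courses.

```python
def simulate_concentration(c0, k_syn, k_deg, dt, steps):
    """Euler integration of dC/dt = k_syn - k_deg * C."""
    series = [c0]
    c = c0
    for _ in range(steps):
        c += dt * (k_syn - k_deg * c)  # small step along the current slope
        series.append(c)
    return series

series = simulate_concentration(c0=0.0, k_syn=2.0, k_deg=0.5, dt=0.01, steps=5000)
print(round(series[-1], 3))  # approaches the steady state k_syn / k_deg
```

Setting dC/dt to zero gives the steady state C = k_syn / k_deg, which the simulated curve levels off at.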

67

Quantitative Modeling

Modeling that uses mathematical equations to represent biological processes, as opposed to simpler, semiquantitative approaches.