Bioinformatics Lecture 1

45 Terms

1

What is bioinformatics?

The use of computational tools to store, analyse, and interpret biological data.

2

Why is biological data considered discrete while reality is continuous?

Biological systems are continuous in reality, but are represented as discrete units (DNA bases, RNA reads, protein sequences) so computers can process them.

3

Why is biological data computationally challenging?

Biological data is highly variable, is built from a small set of building blocks (DNA: A/T/C/G; proteins: 20 amino acids), and represents 3D biology as 2D data, all of which add unpredictability.

4

List the main branches of bioinformatics.

Applied omics (genomics, proteomics), data analytics and visualisation, machine learning, computational biology, and data integration.

5

What is the central dogma of molecular biology?

The flow of genetic information from DNA to RNA to protein.
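As a rough illustration of that flow, the sketch below transcribes a toy coding-strand DNA sequence into mRNA and translates it into a peptide. The function names and the four-entry codon table are assumptions made for this example; a full codon table has 64 entries.

```python
# Minimal sketch: DNA -> RNA -> protein for a toy coding-strand sequence.
# Only the codons used in the example are listed (assumption for brevity).
CODON_TABLE = {"AUG": "M", "GCU": "A", "UGG": "W", "UAA": "*"}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching the coding strand (T replaced by U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate codon by codon until a stop codon ('*') is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCTTGGTAA")   # "AUGGCUUGGUAA"
protein = translate(mrna)           # "MAW"
```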

6

What are the exceptions to the central dogma?

RNA can be reverse-transcribed into DNA (as in retroviruses), and RNA can be replicated directly from RNA in RNA viruses.

7

What type of data does genomics analyse and for what purpose?

DNA data, used to identify genes and mutations.

8

What type of data does transcriptomics analyse and for what purpose?

RNA data, used to measure gene expression.

9

What type of data does proteomics analyse and for what purpose?

Protein data, used to study protein function and interactions.

10

What does the phenome describe?

Observable traits and biological outcomes.

11

Give examples of biological data used in bioinformatics.

DNA sequences, RNA expression levels, protein sequences and structures, pathways, and phenotypic data.

12

List real-world applications of bioinformatics mentioned in the lecture.

Disease gene discovery, cancer genomics, and drug target identification.

13

In which direction does DNA synthesis occur and why does this matter for sequencing?

DNA synthesis occurs 5′ to 3′, which sequencing methods rely on to build and read DNA strands.

14

What is the core principle of Sanger (dideoxy) sequencing?

Incorporation of a ddNTP terminates DNA synthesis because ddNTPs lack the 3′-OH group needed to attach the next nucleotide, creating fragments of different lengths.

15

What components are required for Sanger sequencing?

DNA template, primer, DNA polymerase, dNTPs, and ddNTPs.

16

What happens when a ddNTP is incorporated during Sanger sequencing?

DNA chain elongation stops, producing a terminated fragment.

17

How is the DNA sequence read in Sanger sequencing?

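Fragments are separated by size by electrophoresis and read from smallest to largest (bottom to top of the gel); the terminating base of each fragment gives the next base in the sequence.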
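The read-out logic can be sketched in a few lines. This is a simplified model, not a lab protocol: it assumes one terminated fragment for every position of the newly synthesised strand, ignores the primer, and uses hypothetical function names.

```python
import random


def sanger_fragments(new_strand: str) -> list[str]:
    """Every prefix of the synthesised strand is a possible terminated fragment."""
    fragments = [new_strand[:i] for i in range(1, len(new_strand) + 1)]
    random.shuffle(fragments)  # fragments are mixed together in the reaction
    return fragments


def read_sequence(fragments: list[str]) -> str:
    """Sort by length (the electrophoresis step) and read each terminal base."""
    return "".join(fragment[-1] for fragment in sorted(fragments, key=len))


strand = "GATTACA"
recovered = read_sequence(sanger_fragments(strand))
```

Sorting by length stands in for the size separation, and the last base of each fragment stands in for the labelled ddNTP that ended it.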

18

What technological improvements modernised Sanger sequencing?

Fluorescent dyes, single-reaction sequencing, automated detection, and capillary electrophoresis.

19

What are the advantages of Sanger sequencing?

Long read length (~1000 bp) and high accuracy.

20

What are the limitations of Sanger sequencing?

It is slow, expensive, and low throughput.

21

What was the goal of the Human Genome Project?

To create a reference human genome.

22

What are the two competing sequencing strategies of the Human Genome Project?

Top-down (hierarchical, public) and whole-genome shotgun (Celera, private).

23

What is the top-down (hierarchical) sequencing strategy?

A physical genome map is built first using BACs, then each BAC is shotgun sequenced.

24

What is whole-genome shotgun sequencing?

The entire genome is randomly fragmented, sequenced at once, and assembled computationally.

25

What are BACs and why are they used?

Bacterial Artificial Chromosomes with large inserts (100–300 kb) used for stable cloning and genome mapping.

26

How are BACs mapped using restriction enzymes?

Restriction enzymes cut DNA into fragments, producing patterns used to create a restriction map.
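To make the idea of a fragment pattern concrete, the sketch below digests a toy linear sequence at every EcoRI site (GAATTC, cut between G and A) and reports the fragment lengths. The function name, the toy sequence, and the `cut_offset` parameter are assumptions for illustration only.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list[int]:
    """Return fragment lengths after cutting a linear molecule at every site.

    cut_offset is the number of bases of the site left on the upstream fragment.
    """
    cut_positions = []
    start = dna.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = dna.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(dna)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Toy insert with two EcoRI sites (hypothetical sequence).
dna = "AAA" + "GAATTC" + "CCCC" + "GAATTC" + "TT"
fragment_lengths = digest(dna)  # [4, 10, 7]
```

Clones whose digests share fragment sizes are likely to overlap, which is how such patterns help order them into a map.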

27

What is a Sequence-Tagged Site (STS)?

A short DNA segment with a known, unique sequence that can be amplified by PCR and used as a landmark to map genome locations.

28

What is the typical size and spacing of STS markers?

200–500 bp in length, spaced approximately every 100 kb along the genome.

29

What is the Golden Path in genome sequencing?

A minimally overlapping set of BACs (a tiling path) that efficiently covers the entire genome.

30

What is a sequencing read?

A short stretch of DNA sequence obtained from a single fragment in a sequencing run.

31

What is a contig?

A continuous DNA sequence formed by overlapping reads.

32

What is a scaffold?

An ordered collection of contigs separated by gaps.

33

What is sequencing coverage?

The average number of times each base in the genome is sequenced.
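Average coverage is commonly estimated as (number of reads × read length) / genome size. A minimal sketch with made-up numbers; the function and parameter names are hypothetical:

```python
def coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return num_reads * read_length / genome_size


# 30,000 reads of 1,000 bp over a 3 Mb genome (illustrative values only).
depth = coverage(num_reads=30_000, read_length=1_000, genome_size=3_000_000)  # 10.0
```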

34

What is a gap in genome assembly?

A region of the genome with no sequencing reads.

35

What are the main steps in genome assembly?

Sequence reads, identify overlaps, build contigs, link contigs using paired-end data, and form scaffolds.
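The overlap and contig-building steps can be sketched with a simple greedy merge. This is one illustrative approach under strong assumptions (error-free reads, same strand, no repeats); it does not cover the paired-end scaffolding step, and the function names are hypothetical.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of a that equals a prefix of b."""
    for size in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:size]):
            return size
    return 0


def greedy_assemble(reads: list[str], min_len: int = 3) -> list[str]:
    """Repeatedly merge the pair with the largest overlap; leftovers are contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            return contigs  # no remaining overlaps: each entry is a contig
        merged = contigs[best_i] + contigs[best_j][best_size:]
        contigs = [c for k, c in enumerate(contigs) if k not in (best_i, best_j)]
        contigs.append(merged)


reads = ["ATGGCGT", "GCGTGCA", "TGCAATC"]
contigs = greedy_assemble(reads)  # ["ATGGCGTGCAATC"]
```

Reads that share no overlap with the others would remain as separate contigs, which is where paired-end information is needed to order them.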

36

How does sequencing coverage affect genome assembly?

Higher coverage increases confidence, while low coverage leads to gaps.
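One common way to quantify this, assuming reads land uniformly at random (a Poisson approximation that is not stated in the lecture), is that the expected fraction of bases left unsequenced at coverage c is about e^(-c). A small sketch with a hypothetical function name:

```python
import math


def uncovered_fraction(coverage: float) -> float:
    """Expected fraction of bases with zero reads, under a Poisson model."""
    return math.exp(-coverage)


# Illustrative comparison of low and high coverage.
low = uncovered_fraction(2)    # roughly 13.5% of bases expected to be missed
high = uncovered_fraction(10)  # well under 0.01% of bases expected to be missed
```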

37

What is paired-end sequencing?

Sequencing both ends of a DNA fragment with a known distance between them.

38

Why is paired-end sequencing useful for assembly?

It helps link contigs and span gaps to form scaffolds.

39

How does top-down sequencing compare to shotgun sequencing in physical mapping?

Top-down uses a physical map; shotgun does not.

40

How does top-down sequencing compare to shotgun sequencing in assembly difficulty?

Top-down is easier to assemble; shotgun is more difficult.

41

How does top-down sequencing compare to shotgun sequencing in speed?

Top-down is slower; shotgun is faster.

42

How do repetitive DNA regions affect shotgun sequencing?

They make assembly more difficult because repeats are hard to place correctly.

43

What is the basic idea behind De Bruijn graph assembly?

Reads are broken into k-mers (substrings of length k), which are linked into a graph by their overlaps; the genome is reconstructed by finding a path through that graph.
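A minimal sketch of the idea, assuming error-free reads and a genome with no repeated (k-1)-mers: nodes are (k-1)-mers, each distinct k-mer is an edge, and an Eulerian path through the graph spells the sequence. The function names are hypothetical, and real assemblers also track k-mer multiplicity and handle errors, which this sketch does not.

```python
from collections import defaultdict


def de_bruijn(reads: list[str], k: int) -> dict[str, list[str]]:
    """Nodes are (k-1)-mers; each distinct k-mer adds one edge prefix -> suffix."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return dict(graph)


def eulerian_path(graph: dict[str, list[str]]) -> list[str]:
    """Hierholzer's algorithm; assumes the graph has an Eulerian path."""
    edges = {node: list(targets) for node, targets in graph.items()}
    indegree = defaultdict(int)
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    unbalanced = [n for n in edges if len(edges[n]) - indegree[n] == 1]
    start = unbalanced[0] if unbalanced else next(iter(edges))
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if edges.get(node):
            stack.append(edges[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]


def spell(path: list[str]) -> str:
    """Join consecutive nodes, adding one new base per step."""
    return path[0] + "".join(node[-1] for node in path[1:])


reads = ["ATGGCGT", "GCGTGCA"]
assembled = spell(eulerian_path(de_bruijn(reads, k=4)))  # "ATGGCGTGCA"
```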

44

What is the trade-off when choosing k-mer size?

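Shorter k-mers increase overlap between reads, but if too short they occur at many places in the genome and create ambiguity; longer k-mers are more specific but need higher coverage and are more sensitive to sequencing errors.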
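The ambiguity side of the trade-off can be seen by counting how many k-mers occur more than once in a sequence as k changes. The toy sequence and function name below are assumptions for illustration.

```python
from collections import Counter


def repeated_kmers(sequence: str, k: int) -> int:
    """Number of distinct k-mers that occur more than once in the sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return sum(1 for n in counts.values() if n > 1)


# Toy sequence containing the 5-mer "ATGGC" twice (hypothetical example).
genome = "ATGGCGTGCAATGGCTTA"
ambiguous_at_5 = repeated_kmers(genome, 5)  # 1: "ATGGC" is repeated
ambiguous_at_6 = repeated_kmers(genome, 6)  # 0: every 6-mer is unique
```

Each repeated k-mer is a point where a graph-based assembler has more than one way to continue, so a k just long enough to span the repeat removes the ambiguity in this toy case.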

45