bioinformatics lecture 2

36 Terms

1

What is Next-Generation Sequencing (NGS)?

Large-scale (massively parallel) sequencing in which millions of DNA fragments are sequenced simultaneously.

2

Why did NGS replace Sanger sequencing for genome-scale projects?

It provides far higher throughput at much lower cost per base, enabling whole-genome and population-scale sequencing.

3

How does NGS connect to genome assembly concepts from Lecture 1?

NGS produces short reads, which makes genome assembly computationally challenging.

4

How did sequencing cost trends compare to Moore’s Law?

Sequencing costs fell far faster than Moore’s Law would predict. Moore’s Law says that transistor counts (and so, roughly, computing power) double about every two years.
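
As a rough illustration of the benchmark, the sketch below computes what a cost would be if it only halved every two years, which is the Moore's Law pace. The starting cost and the function name are hypothetical, chosen purely for the arithmetic; no real sequencing prices are used.

```python
def moores_law_cost(start_cost, years, halving_period=2.0):
    """Cost expected if it merely halved every `halving_period` years."""
    return start_cost * 0.5 ** (years / halving_period)

# Hypothetical starting cost of 1,000,000 (arbitrary units), tracked over 10 years.
print(moores_law_cost(1_000_000, 10))  # 31250.0, i.e. a 32-fold drop per decade
```

A cost curve that falls faster than Moore's Law drops by more than this 32-fold factor over the same ten years.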

5

What biological advances became possible due to cheap sequencing?

Whole-genome sequencing, population genomics, and personalised medicine.

6

What tradeoff do most NGS technologies make compared to Sanger sequencing?

They trade long read length for massive throughput and speed.

7

How do Roche 454 and Illumina sequencing differ at a high level?

Roche 454 produces longer reads but lower throughput, while Illumina produces short reads with very high throughput.

8

What core features are shared by most NGS technologies?

Random DNA fragmentation, short reads (100–300 bp), amplification, sequencing-by-synthesis, and no gel electrophoresis.

9

Why is amplification required in NGS?

A single DNA molecule produces a signal too weak to detect reliably.

10

What are the main steps of Illumina sample preparation and amplification?

DNA fragmentation, adaptor ligation, size selection, and bridge amplification on a flow cell.

11

What is a flow cell in Illumina sequencing?

A surface where adaptor-ligated DNA binds and undergoes bridge amplification to form clusters.

12

What is a cluster in Illumina sequencing?

Thousands of identical copies of a single DNA fragment in one location on the flow cell.

13

What is bridge amplification?

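Solid-phase amplification on the flow cell: each bound fragment bends over to hybridise to a neighbouring surface oligo, forming a "bridge" that is then copied. Repeated rounds build a cluster of identical copies.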

14

What is reversible terminator technology in Illumina sequencing?

Modified nucleotides with fluorescent labels and 3′ blocking groups that allow only one base to be added per cycle.

15

Why does Illumina sequencing enforce one base incorporation per cycle?

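So that each cycle yields a single, unambiguous fluorescent signal per cluster. This keeps base calls accurate and keeps the strands in a cluster in step.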

16

Describe one full Illumina sequencing-by-synthesis cycle.

Base incorporation, laser excitation, colour detection, cleavage of dye and blocker, then repetition.
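
The cycle above can be sketched as a toy loop. This is a simplified model, not real base-calling software: the function name is hypothetical, the template is assumed to be written in the direction the polymerase reads it, and chemistry and imaging are reduced to comments.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def sequence_by_synthesis(template, n_cycles):
    """Toy model: one complementary base is read out per cycle."""
    read = []
    for cycle in range(min(n_cycles, len(template))):
        # 1. Incorporation: exactly one labelled, 3'-blocked nucleotide is added.
        base = COMPLEMENT[template[cycle]]
        # 2-3. Laser excitation and colour detection identify the added base.
        read.append(base)
        # 4. Dye and blocker are cleaved, so the next cycle can add one more base.
    return "".join(read)

print(sequence_by_synthesis("TACGGA", 4))  # ATGC
```

The number of cycles sets the read length, which is why longer reads need more cycles.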

17

How have Illumina read lengths changed over time?

They increased from about 35 bp to as long as 300 bp.

18

Why does read quality decrease toward the end of Illumina reads?

Because of phasing: the strands within a cluster progressively fall out of synchrony as cycles accumulate, so the signal becomes mixed.

19

What is phasing in Illumina sequencing?

The loss of synchronisation within a cluster, causing mixed signals and reduced accuracy.
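
A toy model can show why phasing hurts late cycles most. It assumes, purely for illustration, that each strand has a fixed and independent chance `p_slip` of falling out of step in every cycle; the value 0.005 is hypothetical, not a measured rate.

```python
def in_phase_fraction(cycles, p_slip=0.005):
    """Fraction of strands in a cluster still in step after `cycles` cycles."""
    return (1.0 - p_slip) ** cycles

# Compare an early cycle count with a late one (hypothetical slip rate).
for n in (35, 150, 300):
    print(n, round(in_phase_fraction(n), 3))
```

Under this assumption the in-phase fraction shrinks geometrically, so the signal from a cluster gets more mixed with every cycle.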

20

What is paired-end sequencing?

Sequencing both ends of the same DNA fragment.
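
A minimal sketch of what "both ends" means for the resulting reads, assuming the usual Illumina layout in which the second read comes from the opposite strand. The function names and the example fragment are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def paired_end_reads(fragment, read_len):
    """Read 1 from the start of the fragment, read 2 from the far end."""
    read1 = fragment[:read_len]
    read2 = reverse_complement(fragment[-read_len:])  # opposite strand
    return read1, read2

print(paired_end_reads("AACCGGTTAC", 3))  # ('AAC', 'GTA')
```

The two reads point towards each other, and the unsequenced middle of the fragment sets the distance between them.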

21

Why are paired-end reads useful?

They improve genome assembly, read alignment, and variant detection.

22

What is FastQC used for?

Assessing sequencing quality such as base quality, GC content, and adapter contamination.
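
The base qualities that such a report summarises are Phred scores stored as characters in the FASTQ quality line. The sketch below decodes them, assuming the Phred+33 encoding; it is not FastQC's own code, and the function names and example string are hypothetical.

```python
def phred_scores(quality_line, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_line]

def error_probability(q):
    """Phred definition: Q = -10 * log10(P_error), so P_error = 10^(-Q/10)."""
    return 10 ** (-q / 10)

print(phred_scores("II5#"))   # [40, 40, 20, 2]
print(error_probability(20))  # 0.01
```

A score of 20 therefore means a 1 in 100 chance that the base call is wrong, and 40 means 1 in 10,000.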

23

What is GC content and why is it useful?

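The percentage of bases in a sequence that are G or C. Gene-rich regions and exons tend to be more GC-rich than introns, so it can help with gene finding; in read QC, an unexpected GC distribution can point to contamination or bias.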
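
A minimal sketch of the calculation (the function name is hypothetical, and bases other than G and C, including N, simply count towards the length):

```python
def gc_content(seq):
    """Percentage of bases in `seq` that are G or C."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

print(gc_content("ATGC"))  # 50.0
```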

24

What types of sequencing applications commonly use NGS?

Whole-genome sequencing, exome sequencing, and RNA sequencing (RNA-seq).

25

What major impacts has NGS had on genomics?

Increased data volume, reduced cost, routine resequencing, and population-scale studies.

26

What cost milestones were highlighted for genome sequencing?

The $1000 genome and the emerging $100 genome.

27

What are major technical challenges of NGS?

Short reads, higher error rates, phasing, and uneven sequencing depth.
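
For the depth point, the average depth is simply total sequenced bases divided by genome size; "uneven" means individual regions sit well above or below that average. The sketch below shows the average only, with hypothetical round numbers rather than figures from the lecture.

```python
def mean_depth(n_reads, read_len, genome_size):
    """Average sequencing depth = total bases sequenced / genome size."""
    return n_reads * read_len / genome_size

# Hypothetical run: 600 million reads of 150 bp over a 3 Gb genome.
print(mean_depth(600_000_000, 150, 3_000_000_000))  # 30.0
```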

28

Why do short reads make genome assembly difficult?

They provide limited context, especially in repetitive regions.
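
The ambiguity can be seen by counting where a read matches a sequence exactly. This is a toy exact-match search, not a real aligner or assembler, and the reference string is a made-up example containing a short "AC" repeat.

```python
def match_positions(reference, read):
    """Return every start position where `read` matches `reference` exactly."""
    positions = []
    start = reference.find(read)
    while start != -1:
        positions.append(start)
        start = reference.find(read, start + 1)  # allow overlapping matches
    return positions

reference = "TTGACACACACGGA"                 # hypothetical sequence with a repeat
print(match_positions(reference, "ACAC"))  # [3, 5, 7]  read inside the repeat
print(match_positions(reference, "CGGA"))  # [10]       read with unique context
```

A read that lies entirely inside the repeat fits in several places, so it cannot say how many repeat copies there are or how the flanking sequences connect.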

29

Which genome regions are particularly difficult to assemble using NGS?

Repetitive sequences, regions of extreme GC content (such as GC-poor regions), telomeres, and centromeres.

30

What type of sequencing error is most associated with Illumina platforms?

Substitution errors.
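
A substitution swaps one base for another without changing the read length, so it shows up as a mismatch at a single position. A minimal sketch, assuming the read is already aligned to a reference segment of the same length (the function name and sequences are hypothetical):

```python
def substitutions(reference, read):
    """List (position, reference_base, read_base) for every mismatch."""
    if len(reference) != len(read):
        raise ValueError("sequences must have equal length")
    return [(i, r, q) for i, (r, q) in enumerate(zip(reference, read)) if r != q]

print(substitutions("ACGTAC", "ACCTAC"))  # [(2, 'G', 'C')]
```

A mismatch found this way could be either a sequencing error or a true variant; telling them apart needs more than one read.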

31

What key shift did NGS create in bioinformatics workflows?

The main challenge moved from data generation to data analysis.

32

What is a reference genome?

A standard genome sequence used for alignment, annotation, and variant calling.

33

Which human genome builds were listed and which is current?

hg19/GRCh37 is the older build; hg38/GRCh38 is the current standard.

34

Which databases provide access to reference genomes?

UCSC Genome Browser, Ensembl, and NCBI.

35

Compare Sanger sequencing and NGS in terms of speed, cost, read length, accuracy, and throughput.

Sanger is slow, expensive per base, long-read, highly accurate, and low-throughput; NGS is fast, cheap per base, short-read, less accurate per read, and massively high-throughput.

36