A comprehensive set of practice questions and answers drawn from the lecture notes on genomics, sequencing technologies, annotations, evolutionary genomics, and bioinformatics workflows.
What is genomics?
The scientific study of biological processes from the perspective of the whole genome.
How identical are humans at the DNA level?
99.9% identical.
Why genomics? How does gene expression explain the way our bodies develop?
Expression of the genome differs among cell types, leading to development of diverse tissues and functions.
How does genomics relate to cancer?
Genomics helps understand the cause of cancer by studying mutations and genome changes that lead healthy cells to become cancerous.
What causes mutations?
Most often, accidental errors in genome replication during cell division; DNA damage from mutagens such as UV light or chemicals can also introduce them.
What is the central dogma of molecular biology?
DNA is transcribed into RNA, which is translated into a protein.
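The two steps of the central dogma can be sketched in a few lines of code. This is a minimal illustration, assuming the input is the coding (sense) strand and using a deliberately tiny codon table; the names `CODON_TABLE`, `transcribe`, and `translate` are hypothetical helpers, not part of any library.

```python
# Partial codon table: only the codons needed for the example below.
CODON_TABLE = {"AUG": "M", "GCC": "A", "UGG": "W", "UAA": "*"}


def transcribe(coding_dna: str) -> str:
    """Return the mRNA matching a coding-strand DNA sequence (T becomes U)."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> str:
    """Translate an mRNA codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon ends translation
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("ATGGCCTGGTAA")
protein = translate(mrna)
print(mrna, protein)  # AUGGCCUGGUAA MAW
```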
Do all cells in a multicellular organism express the same genes?
No; gene expression is regulated so different cells express different sets of genes.
What is gene regulation?
The process of controlling which genes in a cell’s DNA are expressed to produce functional products.
At which steps can gene expression be regulated?
Chromatin accessibility, transcription, RNA processing, RNA stability, translation, and protein activity.
What are the three subdivisions of genomics?
Structural Genomics, Functional Genomics, Evolutionary Genomics.
What does Structural Genomics focus on?
Sequencing of whole genomes and annotation; provides a parts list of the organism’s genetic toolkit.
What measurements are important in Structural Genomics?
Genome size, genome structure, gene numbers, coding vs non-coding DNA portions.
What is Whole-Genome Sequencing (WGS)?
Sequencing the entire genome by fragmenting DNA, sequencing pieces, and assembling them by overlaps.
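Assembly by overlaps can be illustrated with a toy example that joins two reads where the end of one matches the start of the other. This is only a sketch: `overlap_length`, `merge_reads`, and the `min_overlap` threshold are hypothetical, and real assemblers use graph-based methods that tolerate sequencing errors.

```python
def overlap_length(left: str, right: str, min_overlap: int = 3) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    longest_possible = min(len(left), len(right))
    for size in range(longest_possible, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def merge_reads(left: str, right: str, min_overlap: int = 3):
    """Join two reads into one contig, or return None if they do not overlap."""
    size = overlap_length(left, right, min_overlap)
    if size == 0:
        return None
    return left + right[size:]


contig = merge_reads("ATGGCGTAC", "GTACCTTAG")
print(contig)  # ATGGCGTACCTTAG
```

Two reads taken from different copies of a repeat would overlap just as well as true neighbours, which is why repetitive DNA makes this kind of placement ambiguous.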
Why is 30-40x coverage used in WGS?
So that each base is read many times: errors in individual reads are outvoted in the consensus sequence, and few regions are left uncovered during assembly.
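Mean coverage is commonly estimated as (number of reads × read length) / genome size. The sketch below applies that arithmetic; the 3 Gb genome size and 150 bp read length are illustrative assumptions, and the function names are hypothetical.

```python
import math


def mean_coverage(num_reads: int, read_length: int, genome_size: int) -> float:
    """Average number of times each base is sequenced."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage: float, read_length: int, genome_size: int) -> int:
    """Number of reads required to reach a target mean coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


GENOME_SIZE = 3_000_000_000  # assumed genome size, roughly human scale
READ_LENGTH = 150            # assumed short-read length in bp

n_reads = reads_needed(30, READ_LENGTH, GENOME_SIZE)
print(n_reads)                                           # 600000000
print(mean_coverage(n_reads, READ_LENGTH, GENOME_SIZE))  # 30.0
```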
What is a major obstacle in WGS assembly?
Repetitive DNA sequences that can’t be placed uniquely, complicating contig assembly.
Describe the Clone-by-clone sequencing approach.
The chromosome is broken into overlapping clones, which are ordered linearly using existing genetic maps; each clone is then sequenced separately.
What is a genetic map?
A map that measures distance between genes based on recombination frequency (cross-over during meiosis).
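As a worked example, recombination frequency is the fraction of offspring that are recombinant, and for closely linked genes 1% recombination is conventionally treated as 1 centimorgan (cM). The offspring counts below are made up for illustration, and `recombination_frequency` is a hypothetical helper.

```python
def recombination_frequency(recombinant_offspring: int, total_offspring: int) -> float:
    """Fraction of offspring carrying a recombinant combination of alleles."""
    return recombinant_offspring / total_offspring


# Illustrative counts only: 120 recombinants among 1000 offspring.
rf = recombination_frequency(120, 1000)
map_distance_cm = rf * 100  # 1% recombination is about 1 cM for short distances

print(f"RF = {rf:.2f}, distance = {map_distance_cm:.0f} cM")  # RF = 0.12, distance = 12 cM
```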
What was the goal of the Human Genome Project (HGP)?
To sequence a reference human genome, produce high-resolution genetic and physical maps, and develop new DNA technologies.
When did the HGP begin and how many countries were involved?
1990; scientists from 20 institutions in six countries (China, France, Germany, Japan, UK, USA).
What was the scale of the HGP outputs?
About 2.7 x 10^9 bp sequenced; >1.4 million SNPs identified; ~20,000–23,000 protein-coding genes; only ~1-2% of the DNA codes for proteins.
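What is the FASTQ file format used for in Next-Generation Sequencing (NGS)?
Storing sequencing reads together with their quality scores; each entry has four lines: a read identifier starting with "@", the sequence, a "+" separator line, and a quality string with one character per base.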
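A single FASTQ entry can be read with a few lines of code. This is a minimal sketch, assuming the standard four-line record layout and the common Phred+33 quality encoding; the record itself is invented, and `parse_fastq_record` is a hypothetical helper rather than a library function.

```python
RECORD = """@read1
GATTACA
+
IIIIH5#
"""


def parse_fastq_record(text: str, offset: int = 33):
    """Split one four-line FASTQ record into (read_id, sequence, phred_scores)."""
    header, sequence, separator, quality = text.strip().split("\n")
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a valid FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    # Each quality character encodes one Phred score as ASCII code minus offset.
    scores = [ord(char) - offset for char in quality]
    return header[1:], sequence, scores


read_id, sequence, scores = parse_fastq_record(RECORD)
print(read_id, sequence, scores)  # read1 GATTACA [40, 40, 40, 40, 39, 20, 2]
```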
What is SRA in NCBI?
Sequence Read Archive, an international public archive for next-generation sequencing data.
What is Functional Genomics?
Uses genomic sequences to study gene and protein function and expression on a global scale, focusing on transcription, translation, and interactions.
What is gene annotation?
Describes the biochemical, cellular, and biological function of each gene product encoded by the genome.
What are experimental approaches to structural annotation?
Using cDNA and ESTs to identify transcribed sequences and annotate exons/introns and alternative splicing by comparing to the genome.
What are computational approaches to structural annotation?
Predict gene structure by identifying open reading frames (ORFs) and by comparing sequences with related species; less accurate than experimental data, but helpful for guiding experimental design.
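ORF identification can be sketched as a scan for a start codon followed in the same frame by a stop codon. The example below is a simplified illustration, assuming only the forward strand, the standard start codon ATG, and a hypothetical `min_codons` cutoff; real gene predictors also model splicing, both strands, and codon statistics.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 2):
    """Return (start, end, sequence) for forward-strand ORFs in all three frames."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first ATG
            elif start is not None and codon in STOP_CODONS:
                # Keep the ORF only if it has enough codons before the stop.
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None
    return orfs


orfs = find_orfs("CCATGGCTTGGTAAGG")
print(orfs)  # [(2, 14, 'ATGGCTTGGTAA')]
```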
What is Evolutionary Genomics?
The comparison of genomes within and between species to understand evolution and variation over time.
What is the difference between interspecific and intraspecific comparisons?
Interspecific compares across species; intraspecific compares within populations of a single species.
How similar are humans to chimpanzees, and when did they diverge?
About 98.9% identical; divergence occurred roughly 6 million years ago.
What broad insight has Evolutionary Genomics revealed about genes across species?
Many genes are shared across phylogenetically distant species, supporting that all life is related (Tree of Life).
Name some key catalogs of human genetic variation.
The 1000 Genomes Project, HapMap, dbSNP, COSMIC.
What does the term annotation typically include for a gene?
Biochemical function, cellular role, and biological context of gene products.
What distinguishes Genomics from standard biology in terms of scope and data?
Genomics studies all genes across the genome (global, high-throughput), which produces large data sets and requires computational analysis; traditional biology is targeted and often lower throughput.
Name a few sequencing platforms and their approximate release years.
Sanger ABI 3730xl (2002); PacBio RSII (2010); Ion Torrent (2010); ABI SOLiD 5500xl (2010); Illumina MiSeq (2011); Oxford Nanopore MinION (2014).
What are the main steps in the bioinformatics workflow for NGS DNA data?
Base calling, alignment, variant calling, filtering/annotation, and identifying causal variants.
What are the main output formats at each step of the NGS workflow?
FASTQ (reads), SAM/BAM (alignments), VCF (variants).
What is a causal variant in the NGS workflow?
The most promising candidate variant(s) thought to cause the phenotype after filtering and annotation.
What are SNPs, SNVs, indels, and SVs in genetic variation?
SNPs: common single-nucleotide variants in populations; SNVs: less characterized single-nucleotide variants; indels: small insertions/deletions; SVs: large structural variations.
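The size-based categories can be told apart from the reference and alternate alleles alone, as in the sketch below. The 50 bp cutoff between indels and SVs is a commonly used convention that is assumed here, not taken from the notes, and `classify_variant` is a hypothetical helper. Whether a single-nucleotide change counts as a SNP or an SNV depends on its population frequency, which the alleles by themselves cannot show.

```python
def classify_variant(ref: str, alt: str, sv_threshold: int = 50) -> str:
    """Classify a variant by comparing reference and alternate allele lengths."""
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    size_change = abs(len(ref) - len(alt))
    if size_change >= sv_threshold:  # assumed cutoff for structural variants
        return "SV"
    if size_change > 0:
        return "indel"
    return "other"  # e.g. a multi-nucleotide substitution of equal length


examples = [("A", "G"), ("AT", "A"), ("A", "A" + "C" * 60)]
labels = [classify_variant(ref, alt) for ref, alt in examples]
print(labels)  # ['SNV', 'indel', 'SV']
```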
Why are public variation catalogs important?
They provide reference data for population genetics, disease association, and variant interpretation (e.g., 1000 Genomes, HapMap, dbSNP, COSMIC).
What is the role of UCSC Genome Browser in genomics?
A web-based platform to visualize genomes, genes, transcripts, and annotations interactively.
What is the IGV (Integrative Genomics Viewer)?
A software tool for visualizing sequencing alignments and variant data to interpret results.
What does a Sanger sequencing electropherogram show?
Peaks corresponding to nucleotide bases; used to determine the DNA sequence in Sanger sequencing.
How are sequencing data quality and content assessed in practice?
Using tools like FastQC to evaluate base quality, GC content, sequence length distribution, adapter content, etc.
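Two of these metrics are simple enough to compute by hand, which helps when interpreting a QC report. The sketch below is not how FastQC is implemented; it assumes Phred+33 quality strings, uses invented reads, and the helper names `gc_content` and `mean_quality` are hypothetical.

```python
def gc_content(sequence: str) -> float:
    """Fraction of bases in the read that are G or C."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / len(sequence)


def mean_quality(quality_string: str, offset: int = 33) -> float:
    """Mean Phred score of a read, assuming Phred+33 encoding."""
    return sum(ord(char) - offset for char in quality_string) / len(quality_string)


# Invented (sequence, quality) pairs for illustration.
reads = [("GGCCAATT", "IIIIIIII"), ("ATATATGC", "55555555")]
summary = [(gc_content(seq), mean_quality(qual)) for seq, qual in reads]
print(summary)  # [(0.5, 40.0), (0.25, 20.0)]
```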