Genome Content and Organisation Notes

Genes and Genomes: An Overview

Genome: Entire DNA content, including genes and intergenic regions.
Transcriptome: Entire set of RNA transcripts (chiefly mRNAs) present in a cell.
Proteome: Entire set of proteins encoded by a genome.

The Human Genome

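Size: 3.3 x 10^9 base pairs per haploid genome (over 3 billion).
Chromosomes: 24 distinct types (22 autosomes plus the X and Y sex chromosomes); a diploid cell carries 46.
Protein-coding exons: Roughly 2% of the genome.
Number of genes: Approximately 20,000 protein-coding genes.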

Mitochondrial Genome

Size: 16,569 base pairs.
Structure: One circular chromosome.
Copies: Several per mitochondrion; roughly 1,000-10,000 per cell, depending on cell type.
Genes: 37 (13 protein-coding, 2 ribosomal RNA, 22 transfer RNAs).

Genome Size and Genetic Complexity

There is no simple correlation between genome size (C-value) and genetic complexity (gene number).
C-value paradox: The amount of DNA in a haploid genome does not correlate with the complexity of the organism.
Gene density and the proportion of coding vs. non-coding DNA differ between genomes and account for much of this mismatch.
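
As a rough illustration of gene density, the sketch below compares the two human genomes described in these notes, using only the sizes and gene counts quoted above. The helper name gene_density and the choice of genes per megabase as the unit are assumptions made for this example.

```python
# Sizes and gene counts are the figures quoted in these notes.
GENOMES = {
    "human nuclear": {"size_bp": 3.3e9, "genes": 20_000},
    "human mitochondrial": {"size_bp": 16_569, "genes": 37},
}


def gene_density(genes, size_bp):
    """Return genes per megabase (1 Mb = 1,000,000 bp)."""
    return genes / (size_bp / 1e6)


densities = {
    name: gene_density(g["genes"], g["size_bp"]) for name, g in GENOMES.items()
}

for name, density in densities.items():
    print(f"{name}: {density:.1f} genes per Mb")
```

On these figures the nuclear genome carries about 6 genes per Mb, while the compact mitochondrial genome carries over 2,000 per Mb, so size alone says little about gene content.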

Genome Content Percentages

Exons: 2%
Introns: 24%
Large duplications: 5%
Simple repeats: 3%
Retrotransposons: 45%
Other intergenic DNA: 21%
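
To make these percentages concrete, the sketch below checks that they sum to 100% and converts each to an approximate amount of DNA, assuming the 3.3 x 10^9 bp genome size quoted earlier. The variable names are illustrative only.

```python
GENOME_SIZE_BP = 3.3e9  # haploid human genome size quoted in these notes

# Percentages as listed in these notes.
COMPOSITION_PERCENT = {
    "Exons": 2,
    "Introns": 24,
    "Large duplications": 5,
    "Simple repeats": 3,
    "Retrotransposons": 45,
    "Other intergenic DNA": 21,
}

# Sanity check: the categories should account for the whole genome.
total_percent = sum(COMPOSITION_PERCENT.values())

# Convert each percentage into an approximate number of base pairs.
base_pairs = {
    name: GENOME_SIZE_BP * pct / 100 for name, pct in COMPOSITION_PERCENT.items()
}

print(f"Total: {total_percent}%")
for name, bp in base_pairs.items():
    print(f"{name:<22}{bp / 1e6:>8.0f} Mb")
```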

Repetitive vs. Non-Repetitive DNA

Repetitive DNA: Sequences (5-5000 bp) present in multiple copies, usually non-coding.
- Examples: centromeres, telomeres.
Non-repetitive DNA: Unique sequences, one copy per haploid genome.
- Includes >98% of protein-coding genes, as well as gene clusters.

Non-Repetitive DNA

Protein-coding genes and genes encoding certain RNAs.
Gene clusters/families: Sets of related genes with similar sequences and functions (e.g., globin genes).
Pseudogenes: Non-functional gene sequences similar to functional genes.

Gene Families

Arise through gene duplication during evolution.
Members are related at the sequence level and perform similar functions.
Example: Globin family (myoglobin vs. hemoglobin).

Pseudogenes

A component of non-repetitive DNA.
Sequences similar to functional genes that do not produce functional products.
Types: processed (from reverse transcription of an mRNA and reinsertion into the genome) and non-processed (from gene duplication or disabling mutation).

Repetitive DNA Details

Present in multiple copies.
Located in specific regions (centromeres, telomeres).
Often lack a protein-coding function, although some repeats play structural roles.

Types of Repetitive DNA

Highly repetitive: Satellite DNA, tandem repeats.
Middle repetitive: Interspersed retrotransposons (SINEs, LINEs), multiple copy genes.

Repetition Frequency (f)

Number of times a DNA sequence appears in a haploid genome.
Different types of repetitive DNA have different f values.
Highly repetitive: >10^6 (simple sequence).
Middle repetitive: 10^3-10^5.
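
The thresholds above can be written as a small classifier. This is a minimal sketch: the function name classify_by_frequency is hypothetical, f = 1 is treated as non-repetitive following the earlier definition of one copy per haploid genome, and values that fall between the ranges given in these notes are returned as "unclassified" rather than guessed.

```python
def classify_by_frequency(f):
    """Classify a sequence by its repetition frequency f (copies per haploid genome).

    Thresholds follow these notes; values the notes do not cover
    return "unclassified".
    """
    if f < 1:
        raise ValueError("f must be at least 1")
    if f == 1:
        return "non-repetitive"
    if f > 1e6:
        return "highly repetitive"
    if 1e3 <= f <= 1e5:
        return "middle repetitive"
    return "unclassified"


for copies in (1, 5_000, 3_000_000):
    print(copies, classify_by_frequency(copies))
```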

Repetitive DNA: Satellite DNA

Satellite: e.g., human alphoid (alpha satellite) DNA repeats at centromeres.
Minisatellite: e.g., human telomeres (TTAGGG repeats).
Microsatellite: Can occur within genes; e.g., expanded CAG repeats in Huntington's disease.
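
Tandem repeats such as CAG or TTAGGG can be counted with a simple pattern search. The sketch below finds the longest uninterrupted run of a motif; the function name longest_tandem_run and the toy sequence are made up for illustration and are not real gene sequence.

```python
import re


def longest_tandem_run(sequence, motif):
    """Return the largest number of back-to-back copies of motif in sequence."""
    pattern = re.compile(f"(?:{re.escape(motif.upper())})+")
    runs = [
        len(match.group(0)) // len(motif)
        for match in pattern.finditer(sequence.upper())
    ]
    return max(runs, default=0)


# Made-up toy sequence: five CAG copies, an interruption, then two more.
toy = "ATG" + "CAG" * 5 + "CCG" + "CAG" * 2 + "TAA"
print(longest_tandem_run(toy, "CAG"))  # → 5
```

Only the longest run is reported, because an interruption such as CCG splits the repeat into separate tracts.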

Repetitive DNA: Interspersed Genome-Wide Repeats

LINEs (Long Interspersed Nuclear Elements): e.g., human LINE-1.
SINEs (Short Interspersed Nuclear Elements): e.g., human Alu sequence (about 10% of the human genome).

Summary of Genome Organisation

Cells contain nuclear and mitochondrial genomes.
Genome size does not correlate well with genetic complexity.
Genomes are classified into repetitive and non-repetitive sequences.
Repetitive sequences make up roughly half or more of the genome in higher mammals.
Repetitive sequences (e.g., centromeres, telomeres) have structural roles in genome organisation.