CH8.1 - Genomes and Genomics

  • Genome Definition: All of the hereditary information that specifies an organism.

    • Haploids: The genome is the organism's complete set of hereditary information.

    • Diploids: The genome is one complete haploid complement (one copy of each chromosome).

  • Examples of Genomes:

    • Viral Genome (SFP10 Bacteriophage):

      • Double-stranded DNA.

      • Size: About 158 kilobases (157,950 base pairs).

      • Haploid (one copy of each gene).

    • E. coli Genome:

      • Size: 4.6 megabases (4,639,221 base pairs).

      • Haploid (one copy of each gene).

    • Human Genome:

      • Diploid (2n).

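      • Karyotype: An image of the full chromosome set, prepared from cells arrested in metaphase.

      • Two copies of each autosome (chromosomes 1-22).

      • Two sex chromosomes (XX in females, XY in males).

      • Genome: One copy of each of the 22 autosomes plus both sex chromosomes, X and Y (24 distinct chromosomes).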

  • Increase in Genome Sequencing:

    • Exponential increase over the past 25 years.

    • Data from NCBI (National Center for Biotechnology Information).

    • 1980s: Few genomes sequenced.

    • The increase spans viral, prokaryotic, and eukaryotic genomes.

  • Landmark Genome Sequencing Events:

    • First genome sequenced: Bacteriophage (small genome).

    • First cellular organism: Haemophilus influenzae (bacterium, 1995).

    • First eukaryotic organism: Saccharomyces cerevisiae (yeast, 1996).

    • Human genome: Completed in 2003.

      • Took 13 years and billions of dollars (commonly cited estimates are around $3 billion).

      • Now: A human genome can be sequenced in less than 24 hours for around $1,000.

  • Genome Annotation:

    • Locating genes and other critical sequences, such as regulatory sequences.

    • Assigning putative functions.

  • Computational Approaches:

    • Identifying genes and regulatory sequences by comparison with previously studied genomes.

    • Algorithms to identify gene-like features:

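      • Start codon (typically ATG).

      • Shine-Dalgarno sequence (bacterial ribosome-binding site just upstream of the start codon).

      • Open reading frame: A run of in-frame codons ending in a stop codon.

      • Consensus promoter sequences (bacterial):

        • -10 consensus (TATAAT).

        • -35 consensus (TTGACA).

      • Intrinsic terminator (inverted repeat whose bases pair into a hairpin, followed by a run of T's).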
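
The gene-like features listed above can be searched for with simple string scans. The following is a minimal sketch, not a real gene finder: it only looks at one strand, requires exact motif matches, and uses a toy sequence invented for illustration. The names `find_motif`, `find_orfs`, `has_shine_dalgarno`, the `AGGAGG` Shine-Dalgarno core, the 20-base upstream window, and the -35 sequence TTGACA (the commonly cited bacterial consensus) are assumptions, not taken from the notes.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
SD_CORE = "AGGAGG"  # assumption: full Shine-Dalgarno consensus; real sites often match only part of it


def find_motif(dna, motif):
    """Return every 0-based position where `motif` occurs exactly in `dna`."""
    hits, i = [], dna.find(motif)
    while i != -1:
        hits.append(i)
        i = dna.find(motif, i + 1)
    return hits


def find_orfs(dna, min_codons=3):
    """Return (start, end) slices for ATG...stop open reading frames on the given strand."""
    orfs = []
    for frame in range(3):
        i = frame
        while i + 3 <= len(dna):
            if dna[i:i + 3] == "ATG":
                # Walk codon by codon until the first in-frame stop codon.
                for j in range(i + 3, len(dna) - 2, 3):
                    if dna[j:j + 3] in STOP_CODONS:
                        if (j - i) // 3 >= min_codons:  # codons before the stop, ATG included
                            orfs.append((i, j + 3))
                        i = j  # resume scanning after this stop codon
                        break
            i += 3
    return sorted(orfs)


def has_shine_dalgarno(dna, orf_start, window=20):
    """True if the SD core appears within `window` bases upstream of the start codon."""
    return SD_CORE in dna[max(0, orf_start - window):orf_start]


# Toy sequence (hypothetical): -35 box, 17-base spacer, -10 box, SD site, then a short ORF.
seq = ("TTGACA" + "C" * 17 + "TATAAT" + "CCCCCC" + "AGGAGG"
       + "CCCCCCC" + "ATGAAACCCGGGTAA" + "CC")

for start, end in find_orfs(seq):
    print(start, end, seq[start:end], has_shine_dalgarno(seq, start))
print("-35 hits:", find_motif(seq, "TTGACA"), "-10 hits:", find_motif(seq, "TATAAT"))
```

Real annotation tools score partial matches to each consensus and scan both strands; exact matching is used here only to keep the idea visible.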

  • Comparative Genomics:

    • Using previously studied genomes to assign gene functions.

    • Looking for genes with homology (sequence similarity).

    • Homologs: Genes whose sequence similarity reflects shared ancestry; homology often suggests a similar function.

    • Orthologs: Homologs in different species, separated by speciation, that typically perform the same function.
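
Sequence similarity between candidate homologs is often summarized as percent identity. A minimal sketch, assuming the two sequences have already been aligned to equal length with `-` marking gaps (the function name `percent_identity` and the example sequences are hypothetical):

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of gap-free alignment columns where the two sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue  # skip columns containing a gap
        compared += 1
        matches += (x == y)
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned gene fragments from two species.
print(percent_identity("ATGCCGTA", "ATGACGTA"))
```

High identity alone does not prove shared function; it only flags a candidate homolog worth examining further.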

  • Synteny:

    • Conservation of gene order on the chromosome in different species.

    • Example: Homologous genes in humans and mice that appear in a similar order along the chromosomes.
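
Conserved gene order can be detected by walking along one chromosome and checking whether each gene's homolog sits next to the previous one on the other chromosome. The sketch below assumes a one-to-one homolog mapping; the gene names and the function `syntenic_blocks` are invented for illustration.

```python
def syntenic_blocks(order_a, order_b, homologs, min_len=2):
    """Return runs of genes from species A whose homologs are adjacent, in order, in species B."""
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    # Position in B of each A gene's homolog (None when no homolog is placed in B).
    mapped = [pos_b.get(homologs.get(gene)) for gene in order_a]

    blocks, current = [], []
    prev, step = None, 0  # step: +1 same orientation, -1 inverted, 0 not yet known
    for gene, pos in zip(order_a, mapped):
        adjacent = pos is not None and prev is not None and abs(pos - prev) == 1
        if adjacent and step in (0, pos - prev):
            current.append(gene)
            step = pos - prev
        else:
            if len(current) >= min_len:
                blocks.append(current)
            current = [gene] if pos is not None else []
            step = 0
        prev = pos
    if len(current) >= min_len:
        blocks.append(current)
    return blocks


# Hypothetical gene orders on one human and one mouse chromosome.
human_order = ["h1", "h2", "h3", "h4", "h5", "h6"]
mouse_order = ["m1", "m2", "m3", "m9", "m6", "m5"]
homologs = {"h1": "m1", "h2": "m2", "h3": "m3", "h5": "m5", "h6": "m6"}

print(syntenic_blocks(human_order, mouse_order, homologs))
```

The second block in this toy data is inverted in the mouse order, which the sketch still counts as conserved order.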

  • Unassigned Functions:

    • Some genes still have no assigned function, leaving room for discovery.

  • Types of Sequences in Genomes:

    • Coding sequences: Direct the synthesis of proteins (only about 1.5% of the human genome).

    • Non-coding sequences: Much more abundant than coding sequences.

      • Transposons and retrotransposons: ~45% of the human genome.

        • SINEs (Short Interspersed Nuclear Elements).

        • LINEs (Long Interspersed Nuclear Elements).

      • Additional repetitive sequences.

      • Miscellaneous unique sequences.

      • Introns: Non-coding sequences that interrupt the exons of a gene; present in pre-mRNA and removed by splicing during mRNA processing.

  • Variation in Human Genomes:

    • Genotypic differences (nucleotide level) may or may not produce phenotypic differences.

    • Single Nucleotide Polymorphism (SNP):

      • A position where a single nucleotide differs between individuals.

      • Approximately one SNP per 1,000 base pairs in the human genome.

      • Example: Among four individuals, one has T at a given position and the other three have C.

    • Haplotype:

      • Group of SNPs or other genetic variants located close together on a chromosome.

      • Tend to be inherited together because recombination rarely separates them.

    • Tag SNP:

      • A particular SNP, in or bordering a haplotype, chosen to represent it.

      • Knowing its allele indicates what the rest of the haplotype sequence looks like.

    • Haplotypes can serve as markers for particular populations.
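
SNPs and haplotypes can be illustrated with a small alignment of the same region from several individuals. This is a minimal sketch with made-up sequences: `find_snps` reports alignment columns with more than one base, and `haplotypes` reads off each individual's alleles at those columns. Both function names and the data are hypothetical.

```python
from collections import Counter


def find_snps(sequences):
    """Return {position: Counter of bases} for columns where individuals differ."""
    seqs = list(sequences.values())
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must have the same length")
    snps = {}
    for pos in range(len(seqs[0])):
        alleles = Counter(s[pos] for s in seqs)
        if len(alleles) > 1:
            snps[pos] = alleles
    return snps


def haplotypes(sequences, positions):
    """Return each individual's alleles at the given SNP positions as a string."""
    return {name: "".join(seq[p] for p in positions) for name, seq in sequences.items()}


# Four hypothetical individuals; position 4 mirrors the example (one T, three C).
individuals = {
    "ind1": "AAGCCTA",
    "ind2": "ATGCTTA",
    "ind3": "AAGCCTA",
    "ind4": "AAGCCTA",
}

snps = find_snps(individuals)
print(sorted(snps))
print(haplotypes(individuals, sorted(snps)))
```

In this toy data the allele at position 1 predicts the allele at position 4, which is the sense in which one SNP can stand in for the rest of a haplotype.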

  • Human vs. Chimpanzee Genomes:

    • Total genomic difference: About 4%.

    • Differences include SNPs, chromosomal insertions, duplications, and other rearrangements.

    • Duplications can come from transposons.

    • Determining if differences occurred in humans or chimpanzees:

      • Compare both sequences to a more distantly related outgroup (e.g., orangutan).

      • Example: The sequence of gene X differs between humans and chimps; the orangutan sequence matches the chimp sequence, so the change occurred in the human lineage.
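
The outgroup reasoning above is a small decision rule and can be written out directly. A minimal sketch, assuming three pre-aligned sequences of equal length; the sequences and the function names `lineage_of_change` and `polarize` are hypothetical.

```python
def lineage_of_change(human, chimp, outgroup):
    """Decide which lineage changed at one site, using the outgroup as the ancestral state."""
    if human == chimp:
        return "no difference"
    if outgroup == chimp:
        return "human lineage"       # chimp kept the ancestral base
    if outgroup == human:
        return "chimpanzee lineage"  # human kept the ancestral base
    return "undetermined"            # outgroup matches neither


def polarize(human_seq, chimp_seq, outgroup_seq):
    """Return (position, lineage) for every site where human and chimp differ."""
    if not (len(human_seq) == len(chimp_seq) == len(outgroup_seq)):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for pos, (h, c, o) in enumerate(zip(human_seq, chimp_seq, outgroup_seq)):
        if h != c:
            calls.append((pos, lineage_of_change(h, c, o)))
    return calls


# Hypothetical aligned fragments of gene X.
print(polarize("ACGTTA", "ACGCTG", "ACGCTA"))
```

When the outgroup matches neither species, the site cannot be assigned from these three sequences alone, so the sketch reports it as undetermined.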

  • Molecular Basis of Human Genetic Disease:

    • Linkage Analysis:

      • Mapping the location of a disease gene relative to known genetic polymorphisms (SNPs, deletions).

      • Example: Early-onset Alzheimer's disease linked to the PS1 gene on chromosome 14.

      • Process:

        • Look at pedigrees of affected families.

        • Collect DNA from affected and unaffected individuals.

        • Follow which SNPs or haplotypes the disease segregates with.

        • Result: Genetic markers on chromosome 14, including the D14S43 marker, segregated with early-onset Alzheimer's disease.

      • Researchers then focused on the region of chromosome 14 around D14S43 and found 19 expressed genes.

      • Mutations in S182 (later named PS1) were present in affected individuals but not in unaffected individuals.

      • Overall approach: Compare the genomes of affected and unaffected individuals and track which SNPs and other genetic markers segregate with the disease.
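
The co-segregation step can be sketched as a tally of which marker alleles are carried by affected versus unaffected family members. This is a deliberately simplified illustration, assuming a dominant, fully penetrant condition and a single family; real linkage analysis is statistical and accounts for recombination. The family data, marker names, and function names are all hypothetical.

```python
def cosegregation(family, marker):
    """For each allele at `marker`, count carriers who are affected vs. unaffected."""
    table = {}
    for person in family:
        for allele in set(person["genotype"][marker]):
            counts = table.setdefault(allele, {"affected": 0, "unaffected": 0})
            counts["affected" if person["affected"] else "unaffected"] += 1
    return table


def perfectly_segregating(family, marker):
    """Alleles carried by every affected person and by no unaffected person."""
    n_affected = sum(1 for person in family if person["affected"])
    return sorted(
        allele
        for allele, counts in cosegregation(family, marker).items()
        if counts["affected"] == n_affected and counts["unaffected"] == 0
    )


# Hypothetical pedigree members typed at two hypothetical markers.
family = [
    {"affected": True,  "genotype": {"markerA": (1, 2), "markerB": (1, 1)}},
    {"affected": False, "genotype": {"markerA": (2, 2), "markerB": (1, 2)}},
    {"affected": True,  "genotype": {"markerA": (1, 3), "markerB": (2, 2)}},
    {"affected": False, "genotype": {"markerA": (2, 3), "markerB": (1, 2)}},
    {"affected": True,  "genotype": {"markerA": (1, 2), "markerB": (1, 2)}},
    {"affected": False, "genotype": {"markerA": (3, 3), "markerB": (1, 1)}},
]

for marker in ("markerA", "markerB"):
    print(marker, perfectly_segregating(family, marker))
```

A marker allele that tracks the disease this closely suggests the disease gene lies nearby, which is the kind of signal that narrows the search to a region such as the one around D14S43.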