Study Notes: SNPs - Key Concepts and Classifications

What is a SNP?

  • SNP stands for Single Nucleotide Polymorphism. The slides give the pronunciation as "snip" (written as "snips" on one slide).

  • Definition: A SNP is a single-nucleotide substitution of one base for another that occurs in more than one percent of the general population.

  • Not every single-nucleotide change is a SNP.

  • Distribution in the genome: SNPs occur throughout the human genome, about one in every 300 nucleotide base pairs.

  • This translates to roughly $10^7$ SNPs within the $3 \times 10^9$-base-pair human genome.

  • At a SNP location, there can be up to four versions corresponding to the four nucleotides: A, C, G, and T.
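
The estimate above can be checked with a few lines of arithmetic. This is a minimal sketch in Python; the names `GENOME_SIZE_BP`, `BP_PER_SNP`, and `estimate_snp_count` are illustrative and do not come from the slides.

```python
# Figures taken from the notes above; the names are illustrative.
GENOME_SIZE_BP = 3_000_000_000  # about 3 x 10^9 base pairs
BP_PER_SNP = 300                # about one SNP per 300 base pairs


def estimate_snp_count(genome_size_bp: int, bp_per_snp: int) -> int:
    """Return the approximate number of SNPs for a given average spacing."""
    return genome_size_bp // bp_per_snp


snp_count = estimate_snp_count(GENOME_SIZE_BP, BP_PER_SNP)
print(f"{snp_count:.0e}")  # prints 1e+07
```

Both inputs are rounded averages, so the result is an order-of-magnitude figure rather than an exact count.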

Why SNPs matter

  • SNPs can serve as predictive markers that inform medical decisions across various areas:

    • Diseases

    • Effectiveness of different drugs

    • Adverse reactions to specific drugs

  • Pharmacogenetic approach: uses SNP information to personalize medical care, potentially saving time, money, and discomfort by enabling accurate diagnoses and matching patients with appropriate medicines.

How SNPs are identified

  • Genomic approaches (the big picture):

    • Large-scale projects involve hundreds of scientists from many institutions.

    • Goal: identify and catalog all SNPs in the 3-billion-base-pair human genome.

    • Methods rely on comparing the genomes of many individuals and require substantial computer-powered data analysis.

    • Results are sorted and cataloged in databases that are available to anyone over the Internet, including the public.

  • Functional approaches (focus on specific processes):

    • Target particular diseases or drug responses.

    • Many genes control biological processes involved in diseases and drug responses.

    • Scientists select genes known to be involved in a process and examine them in people with and without the response/disease.

    • By comparing DNA sequences, researchers identify SNPs that correlate with a function or response.

  • SNP Quick Reference (slide label):

    • Appears as a reference section; content specifics not detailed in the transcript.
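
The comparison step in the functional approach above can be sketched as follows. This is a toy illustration under stated assumptions, not the method used by any particular study: the allele lists are made up, and the helper names `allele_frequencies` and `frequency_difference` are hypothetical.

```python
from collections import Counter


def allele_frequencies(alleles: list[str]) -> dict[str, float]:
    """Return the fraction of each nucleotide observed at one position."""
    counts = Counter(alleles)
    total = len(alleles)
    return {base: count / total for base, count in counts.items()}


def frequency_difference(cases: list[str], controls: list[str], allele: str) -> float:
    """Return how much more frequent `allele` is in cases than in controls."""
    case_freq = allele_frequencies(cases).get(allele, 0.0)
    control_freq = allele_frequencies(controls).get(allele, 0.0)
    return case_freq - control_freq


# Made-up alleles observed at one position in a candidate gene.
with_response = ["A", "A", "A", "G"]     # people with the drug response
without_response = ["G", "G", "G", "A"]  # people without it

diff = frequency_difference(with_response, without_response, "A")
print(diff)  # prints 0.5
```

A large difference only suggests a correlation between the allele and the response. Real studies use far larger groups and a statistical test before drawing conclusions.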

SNPs vs disease-causing mutations: Not the same

  • SNPs and disease-causing mutations are both single-nucleotide changes, but they are not the same thing.

  • Key distinctions:

    • To be classified as a SNP, the change must be present in at least one percent of the general population (frequency $\ge 0.01$).

    • Most disease-causing mutations occur within a gene's coding or regulatory regions and affect the protein encoded by the gene.

    • SNPs are not necessarily located within genes, and they do not always affect protein function.
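
The frequency criterion above can be written as a simple check. This is a minimal sketch: the function name `is_snp` and the constant `SNP_FREQUENCY_THRESHOLD` are illustrative, and the comparison uses "at least 1%" as stated in this section (the definition earlier in the notes says "more than one percent", so the boundary case is an assumption).

```python
SNP_FREQUENCY_THRESHOLD = 0.01  # the 1% population-frequency criterion from the notes


def is_snp(variant_frequency: float, threshold: float = SNP_FREQUENCY_THRESHOLD) -> bool:
    """Return True if a single-nucleotide change is common enough to count as a SNP."""
    if not 0.0 <= variant_frequency <= 1.0:
        raise ValueError("variant_frequency must be between 0 and 1")
    # Assumption: "at least 1%" is inclusive of exactly 1%.
    return variant_frequency >= threshold


print(is_snp(0.05))   # prints True: found in 5% of the population
print(is_snp(0.001))  # prints False: too rare to be called a SNP
```

Frequency alone is the test here. Whether the change affects a protein is a separate question, which is why a rare disease-causing mutation fails this check even though it is also a single-nucleotide change.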

SNP categories: Linked vs Causative

  • Linked SNPs (also called indicative SNPs):

    • Do not reside within genes.

    • Do not affect protein function.

    • Nevertheless, they correlate with a particular drug response or with the risk of developing a certain disease.

  • Causative SNPs:

    • Affect the way a protein functions.

    • Correlate with a disease or influence a person’s response to medication.

    • Come in two forms:

      • Coding SNPs: located within the coding region of a gene; change the amino acid sequence of the gene’s protein product.

      • Non-coding SNPs: located within regulatory sequences of a gene; change the timing, location, or level of gene expression.

Coding vs non-coding SNPs

  • Coding SNPs:

    • Located in the coding region of a gene.

    • Change the amino acid sequence of the protein (could affect protein function).

  • Non-coding SNPs:

    • Located in regulatory regions (non-coding regions) of a gene.

    • Alter when, where, or how much a gene is expressed, rather than changing the amino acid sequence.
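
The categories described in these notes can be summarized as a small decision rule. This is a simplified sketch that assumes the SNP is already known to correlate with a disease or drug response and that its location is known; the names `SnpCategory` and `classify_snp` are hypothetical.

```python
from enum import Enum


class SnpCategory(Enum):
    LINKED = "linked"
    CODING = "causative (coding)"
    NON_CODING = "causative (non-coding)"


def classify_snp(in_coding_region: bool, in_regulatory_region: bool) -> SnpCategory:
    """Classify a SNP by location, following the scheme in the notes."""
    if in_coding_region:
        # Changes the amino acid sequence of the protein product.
        return SnpCategory.CODING
    if in_regulatory_region:
        # Changes when, where, or how much the gene is expressed.
        return SnpCategory.NON_CODING
    # Outside genes: a marker that correlates without affecting the protein.
    return SnpCategory.LINKED


print(classify_snp(in_coding_region=False, in_regulatory_region=True).value)
# prints causative (non-coding)
```

The rule mirrors the notes' three-way split only. It does not model cases the notes leave out, such as a change inside a coding region that leaves the amino acid sequence unchanged.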

Grasping the scale and practical implications

  • Genome-wide perspective:

    • The human genome contains about $3 \times 10^9$ base pairs.

    • SNPs occur at a rate of roughly $\frac{1}{300}$ per base pair, i.e., about one SNP per 300 base pairs, on average.

    • This results in an estimated $10^7$ SNPs across the genome.

  • Practical implications:

    • SNPs provide a framework for understanding genetic variation among individuals.

    • They enable researchers to link genetic variation to disease risk, drug response, and adverse drug reactions.

    • The predictive power of SNPs supports the broader goal of personalized medicine.

Connections to foundational principles and real-world relevance

  • Genetic variation underpins diversity in disease risk and drug response across populations.

  • Use of SNPs aligns with the principle of tailoring medical care to individual genetic profiles, enhancing effectiveness and reducing harm.

  • Large-scale genomic projects showcase the importance of data-sharing and openly accessible databases for scientific progress.

  • Functional approaches illustrate how focusing on specific diseases or drug responses helps identify clinically relevant SNPs that might be missed in a purely genome-wide search.

Ethical, philosophical, and practical implications

  • Ethical considerations include privacy and consent for genomic data sharing and potential misuse of genetic information.

  • Practical implications involve the integration of SNP-based insights into clinical workflows, including decision-making, cost, and accessibility for patients.

Notation and key numerical references (recap in LaTeX)

  • Human genome size: $3 \times 10^9$ base pairs

  • SNP frequency per base pair: approximately $\frac{1}{300}$

  • Total number of SNPs: about $10^7$

  • Population prevalence criterion for SNPs: at least $0.01$ (1%)

Summary and takeaways

  • SNPs are single-nucleotide substitutions common in the population (>= 1%), distributed throughout the genome, with an average occurrence of roughly one per 300 base pairs, equating to about 10 million SNPs in the human genome.

  • Not all SNPs affect protein function; some are simply indicative markers that correlate with disease risk or drug response, while others causally affect protein function via coding or non-coding changes.

  • Two main approaches identify SNPs: Genomic (broad cataloging across the genome) and Functional (focused on specific diseases or drug responses).

  • Distinctions between SNPs and disease-causing mutations underscore that commonality (>= 1%) excludes typical disease-causing mutations, which are often rarer and more likely to impact coding or regulatory regions.

  • Understanding SNPs supports personalized medicine by informing disease risk assessment, drug choice, and potential adverse reactions, while also raising important ethical and practical considerations for clinical implementation.