Design Project


Last updated 5:18 PM on 1/10/26

28 Terms

1

Query

A sequence read presented to an aligner; typically a subsequence of an entire read (a subread), i.e. the basecalls from a single pass of the insert DNA molecule

2

Telomere

Region of repetitive DNA sequences and proteins at the end of a chromosome that acts like a protective cap, preventing the chromosome from fraying, fusing with other chromosomes, or being mistakenly recognized as damaged DNA

3

Binary Alignment/Map = BAM file

  • Binary format, not human-readable.

  • More efficient: smaller size, faster computation, reduced storage costs.

  • Almost all tools expect BAM input.

  • Must be coordinate-sorted and indexed (BAI file) before most downstream use; a BAM can also be sorted by read name, but only coordinate-sorted files can be indexed.

4

Sequence Alignment/Map = SAM file

  • Contains alignment info of mapped (against the reference genome) and unmapped sequences.

  • Text-based, human-readable.
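
Because SAM is plain tab-separated text, a single alignment line can be read with a few lines of code. The sketch below is only an illustration: the record type `SamRecord`, the helper names, and the example line are made up for this card, and real pipelines normally rely on dedicated tools rather than hand-written parsing.

```python
from typing import NamedTuple


class SamRecord(NamedTuple):
    qname: str  # read (query) name
    flag: int   # bitwise flag
    rname: str  # reference sequence name, "*" if unmapped
    pos: int    # 1-based leftmost mapping position, 0 if unmapped
    mapq: int   # mapping quality
    cigar: str  # alignment description, "*" if unavailable
    seq: str    # read sequence


def parse_sam_line(line: str) -> SamRecord:
    """Parse one SAM alignment line (not a header line starting with '@')."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 11:
        raise ValueError("a SAM alignment line has 11 mandatory fields")
    return SamRecord(
        qname=fields[0],
        flag=int(fields[1]),
        rname=fields[2],
        pos=int(fields[3]),
        mapq=int(fields[4]),
        cigar=fields[5],
        seq=fields[9],
    )


def is_unmapped(record: SamRecord) -> bool:
    # Bit 0x4 of the flag marks a read as unmapped.
    return bool(record.flag & 0x4)


# Hypothetical example lines, one mapped and one unmapped read.
mapped = parse_sam_line("read1\t0\tchr12\t100\t60\t12M\t*\t0\t0\tTTAGGGTTAGGG\t*")
unmapped = parse_sam_line("read2\t4\t*\t0\t0\t*\t*\t0\t0\tACGTACGT\t*")
```

Here `mapped.rname` is the reference name (`chr12`) and `is_unmapped(unmapped)` is true, which mirrors the card: one file carries both mapped and unmapped sequences.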

5

BAM Index = BAI

  • Companion file, much smaller.

  • Acts like a “table of contents” for BAM.

  • Must be regenerated after sorting.

  • Required by most analysis software.

6

Contig

= a set of overlapping DNA segments that together represent a consensus region of DNA
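
A toy sketch of the idea of joining overlapping segments: two sequences are merged when the end of one exactly matches the start of the other. The function name `merge_overlap`, the `min_overlap` parameter, and the example sequences are assumptions for illustration; real assemblers tolerate sequencing errors and build a consensus from many reads.

```python
from typing import Optional


def merge_overlap(left: str, right: str, min_overlap: int = 3) -> Optional[str]:
    """Merge two sequences on their longest exact suffix/prefix overlap.

    Returns None when the overlap is shorter than min_overlap.
    """
    max_len = min(len(left), len(right))
    # Try the longest possible overlap first, then shrink.
    for k in range(max_len, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]
    return None


# Hypothetical segments sharing the 4-base overlap "TAGG".
contig = merge_overlap("ACGTTAGG", "TAGGCCA")
```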

7

How are you going to investigate the methylation status?

IGV: Integrative Genomics Viewer

8

What is the average telomere length in humans?

  • Highly variable

  • Primarily depends on age and on cell or tissue type

  • Blood cells:

    • At birth (newborns): 5-15 kb

    • Young adults (24): 12 kb

    • Older adults (72): 7.2 kb

  • Shorten with (chronological) age

    • Shorter telomeres are associated with advanced age and an increased risk of age-related diseases

  • Cell-type variation

9

Rate of attrition

  • around 23 bp per year in cross-sectional studies

  • about 38 bp per year in longitudinal studies
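
As a rough worked example, the two rates can be turned into an expected loss over a span of years. This assumes a constant, linear rate, which is a simplification, and the 48-year span (ages 24 to 72) is chosen only for illustration; the function name is hypothetical.

```python
CROSS_SECTIONAL_BP_PER_YEAR = 23  # rate from cross-sectional studies
LONGITUDINAL_BP_PER_YEAR = 38     # rate from longitudinal studies


def expected_loss_bp(years: float, rate_bp_per_year: float) -> float:
    """Expected telomere loss in bp, assuming a constant linear rate."""
    return years * rate_bp_per_year


years = 72 - 24  # illustrative span of 48 years
loss_cross_sectional = expected_loss_bp(years, CROSS_SECTIONAL_BP_PER_YEAR)  # 1104 bp
loss_longitudinal = expected_loss_bp(years, LONGITUDINAL_BP_PER_YEAR)        # 1824 bp
```

Under these assumptions the two study designs differ by about 0.7 kb over 48 years.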

10

G-rich overhang

= single-stranded, G-rich 3' end of the telomere (TTAGGG repeats, typically 20-500 nucleotides) left after DNA replication → necessary substrate for telomerase and the binding site for the protective shelterin protein POT1 (protection of telomeres 1) → POT1 binding helps prevent the overhang from activating the DNA damage checkpoint

11

T-loop (telomere loop)

  • = large, protective lariat structure formed when the G-rich overhang folds back and invades the double-stranded region, effectively sealing the end of the chromosome to prevent it from being mistaken for a DNA break (inaccessible to nucleases) (= end protection)

  • TRF2 (telomeric repeat-binding factor 2), part of the shelterin complex, promotes and stabilizes the formation of this loop.

12

D-Loop (Displacement-Loop)

A three-stranded DNA junction formed where the G-rich overhang invades the double helix, displacing the original C-rich strand and stabilizing the crucial T-loop structure.

13

Subtelomere

  • Segments of DNA in the transition zone between the highly repetitive, protective telomere and the chromosome’s unique, gene-rich sequences (euchromatin)

  • Directly adjacent to telomere repeats

  • Mosaic patchwork of DNA sequences: multiple blocks of highly homologous (similar) DNA repeats and large, evolutionarily recent segmental duplications

  • Highly variable in size and sequence between different chromosome ends and even between the two copies (alleles) of the same chromosome in an individual => most structurally unstable and dynamic regions of the entire human genome

  • Roles: genome stability (buffer zone: TPE), gene regulation, and evolutionary adaptation (hotspot for recombination)

14

Can you explain the principle of mapping? Why do you need subtelomeres?

  • Mapping = determining the relative locations of DNA sequences along a chromosome → Where are shorter telomeres, for example?

  • Subtelomeres = contain unique, chromosome-specific blocks of DNA that are distinct for each arm

  • Long-range haplotypes = high level of variation between the two alleles of subtelomeres

15

Why are you still determining the sequence of the telomere if you know it is the same repeat and you can just count the number of repeats?

  • Determining telomeric variants, which are deviations from the canonical human telomere sequence → leads to problems (instability, potential cell death)

  • Disease risk determination: maybe genetically predetermined short or long telomeres (telomeropathies, cancers…)

  • Biomarkers: biological age, predisposition to age-related diseases
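
The difference between counting repeats and reading the sequence can be sketched as follows: the code walks a telomeric sequence in 6-base steps, counts canonical TTAGGG units, and reports every unit that deviates. The function name and the example sequence are hypothetical, and the fixed reading frame is a simplification; real reads contain insertions, deletions, and basecalling errors that shift the frame.

```python
from typing import List, Tuple

CANONICAL = "TTAGGG"  # canonical human telomere repeat


def scan_hexamers(seq: str, unit: str = CANONICAL) -> Tuple[int, List[Tuple[int, str]]]:
    """Count canonical repeat units and list (offset, unit) for each variant.

    Assumes the sequence starts in frame with the repeat and has no indels.
    """
    canonical_count = 0
    variants = []
    step = len(unit)
    for offset in range(0, len(seq) - step + 1, step):
        kmer = seq[offset:offset + step]
        if kmer == unit:
            canonical_count += 1
        else:
            variants.append((offset, kmer))
    return canonical_count, variants


# Hypothetical stretch: 3 canonical units, 1 variant unit, 2 canonical units.
example = "TTAGGG" * 3 + "TGAGGG" + "TTAGGG" * 2
count, variants = scan_hexamers(example)
```

A pure repeat count would report six units here, while the scan also shows that the unit at offset 18 is not canonical.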

16

Why link telomere length to the methylation of the adjacent subtelomere?

Determine the length of the telomere (a highly variable trait) and link it to the specific genetic variations (the haplotype) found in the adjacent subtelomere of the maternal or paternal chromosome. This is crucial for understanding the cis-acting elements (DNA sequences on the same chromosome) that regulate telomere length.

17

Terminal Restriction Fragment Analysis

  • DNA isolation

  • DNA digestion (enzymes that cut outside of telomeric repeats)

  • Southern blot

  • Hybridization of telomeric probe

  • Telomere length analysis

18

Telomere maintenance mechanisms (TMM)

Cancers: enabling replicative immortality

  • Mainly: activating telomerase, a reverse transcriptase that adds telomeric repeats (stem cells also use this)

  • Cellular recombination machinery: alternative lengthening of telomeres (ALT)

19

Why is/was it challenging to sequence/map/assemble entire human telomeres by Sanger sequencing/NGS?

  • repetitive

  • only about 0.015% of the total human genome

  • length: ranges from short to several tens of kb

  • 92 chromosome ends in diploid human cell
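
A back-of-the-envelope check of these numbers, as a sketch only: the mean telomere length of 10 kb and the diploid genome size of 6.2 Gb are assumptions chosen for illustration, not values from the card.

```python
CHROMOSOMES_DIPLOID = 46
CHROMOSOME_ENDS = CHROMOSOMES_DIPLOID * 2  # two ends per chromosome = 92

MEAN_TELOMERE_BP = 10_000  # assumption: about 10 kb per chromosome end
DIPLOID_GENOME_BP = 6.2e9  # assumption: approximate diploid genome size

total_telomeric_bp = CHROMOSOME_ENDS * MEAN_TELOMERE_BP  # 920,000 bp
percent_of_genome = 100 * total_telomeric_bp / DIPLOID_GENOME_BP
```

With these assumed inputs, telomeres make up roughly 0.015% of the genome, in line with the figure on the card.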

20

Where do the donor-derived fibroblasts that you have used come from? What is the influence of Alzheimer’s?

  • Collected at University of California, San Diego (UCSD)

  • Part of Salk AHA-Allen aging cohort.

  • Alzheimer’s Disease Research Center participants at UCSD

  • Multiple studies and meta-analyses confirm shorter telomeres in AD patients, particularly leukocyte telomere length (LTL) measured in white blood cells. → This is not the focus of our study at the moment, but we have to take it into account.

21

Why also interested in chromosome-specific telomere length?

  • suggestions of chromosome-arm specific factors that influence telomere length

  • some chromosome arms have significantly longer telomeres: telomere 18q is the longest in 9 individuals

  • some chromosome arms have significantly shorter telomeres

22

Why do we want to visualize chromosome- and allele-specifically?

  • For decades, telomere research relied on measuring Average Telomere Length (ATL) across all 92 chromosome ends in a person's cells.

    • Limitation: This is like measuring the average shoe size in a city; the average is uninformative if the city's problems are caused by one person wearing critically small shoes. Similarly, cell senescence (aging) is not triggered by the average length, but by the shortest telomere (the "first critical telomere") becoming dysfunctional.

    • Need for Specificity: To understand aging and disease, we need to know: Which chromosome end is the shortest, and why does it shorten faster than the others?

  • The Solution: Allele-specific mapping connects the physical telomere length to the unique genetic fingerprint of the immediately adjacent subtelomere.

    • Telomere length: the highly variable, measured length of the terminal repeats on a single chromosome end (e.g., 5 kb on the paternal copy of chromosome 12p).

    • Subtelomere haplotype (the regulator): the unique combination of structural variants (segmental duplications, non-coding transcripts like TERRA) in the adjacent subtelomere of the same physical chromosome (the allele). It is the determining factor that carries the regulatory signals.

    • Cis-acting element: any DNA sequence (e.g., a specific gene variant or non-coding RNA promoter) located on the same DNA molecule (in cis) that regulates the telomere length of that molecule. It is the molecular switch that is physically linked to, and governs, the length of its own telomere.

  • Cis-acting elements regulate telomere length: the key insight from this mapping is the identification of cis-acting elements in the subtelomere that influence how long a telomere is maintained, independently of the overall cellular environment.

    • TERRA regulation: The subtelomere contains promoters (start sites) for TERRA (Telomeric Repeat-containing RNA), a long non-coding RNA implicated in telomerase recruitment. A specific subtelomere haplotype (e.g., one with a particular variation in the promoter region) might cause higher or lower TERRA production.

      • Effect: A lower-producing TERRA haplotype could result in less telomerase being recruited to that specific chromosome end, causing that telomere to shorten more quickly than its partner allele.

    • Chromatin state: Subtelomeric variations influence the spreading of heterochromatin (compacted, silent DNA) from the subtelomere into the telomere. This chromatin state regulates telomerase access.

      • Effect: A haplotype that promotes a more open chromatin state might increase telomerase access, leading to a longer telomere on that specific allele.

  • Importance for diagnostics and research: allele-specific mapping moves telomere biology from a general measure of aging to a precise genetic tool.

    • Pinpointing risk: It allows researchers to pinpoint the specific chromosome end (e.g., the maternal copy of 17p) that is genetically predisposed to being the shortest, the one most likely to trigger disease (like bone marrow failure).

    • Therapeutic targets: By identifying the exact cis-acting sequence in the subtelomere responsible for a telomere's unique length, scientists could develop targeted therapies (e.g., gene therapy to restore TERRA production) for specific chromosome ends, rather than broadly targeting the entire cell's telomerase.

  • In short: mapping length to haplotype allows us to read the genetic instructions embedded in each subtelomere that determine the specific length and stability of its adjacent telomere cap.
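
The contrast between an average and the shortest telomere can be sketched with a small table of per-allele lengths. All values below are made up for illustration (only the 5 kb on paternal 12p echoes the example in this card), and the variable names are hypothetical.

```python
from statistics import mean

# Hypothetical telomere lengths in kb, keyed by (chromosome arm, parental allele).
lengths_kb = {
    ("12p", "paternal"): 5.0,
    ("12p", "maternal"): 8.5,
    ("17p", "maternal"): 2.1,
    ("17p", "paternal"): 7.0,
    ("18q", "maternal"): 11.0,
    ("18q", "paternal"): 10.4,
}

# What an averaged measurement would report.
average_kb = mean(lengths_kb.values())

# What allele-specific mapping adds: which chromosome end is the shortest.
shortest_end = min(lengths_kb, key=lengths_kb.get)
shortest_kb = lengths_kb[shortest_end]
```

In this made-up data the average is about 7.3 kb and looks unremarkable, while the maternal 17p telomere at 2.1 kb is the one that would become critical first.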

23

Telomere Position Effect (TPE)

phenomenon where the structure of the telomere influences the transcriptional activity (gene expression) of genes located in the adjacent subtelomeric region of the chromosome. This influence typically results in the silencing or reduced expression of those nearby genes.

24

Telomere Position Effect-Over Long Distances (TPE-OLD)

extension of TPE, describing the observation that telomeres can influence the expression of genes located much further away along the chromosome, potentially impacting genes that are megabases away from the telomere itself.

25

Basecalling

Computational process: converting raw, analog or digital signal data produced by a sequencing instrument (like changes in current, light intensity, or voltage) into the corresponding sequence of nucleotide bases (A, C, G, T) that make up a DNA or RNA strand.

26

Bonito basecalling model

  • Open-source, deep-learning framework

  • Oxford Nanopore Technologies (ONT)

  • Convert the raw electrical signals from their sequencers into A, C, G, T nucleotide sequences.

  • Architecture: Convolutional Neural Network (CNN) layers process the noisy current signal, a Recurrent Neural Network (RNN) core interprets the time-series context, and a Connectionist Temporal Classification (CTC) decoder outputs the base sequence. Acts primarily as a platform for research and development of new high-accuracy basecalling algorithms.

  • Why? analog electrical current fluctuation → noisy and continuous + electrical current influenced by 4 to 6 bases simultaneously sitting within the pore (k-mer)
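
The CTC decoding step mentioned above can be illustrated with generic greedy decoding: pick the most likely symbol per signal frame, collapse consecutive repeats, then drop the blank symbol. This is a textbook sketch, not Bonito's actual decoder; the alphabet layout, the function name, and the per-frame probabilities are made up for illustration.

```python
from itertools import groupby
from typing import List, Sequence

ALPHABET = "-ACGT"  # assumption: index 0 is the CTC blank symbol
BLANK_INDEX = 0


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]]) -> str:
    """Greedy CTC decoding: argmax per frame, collapse repeats, remove blanks."""
    # 1. Most likely symbol index for every frame.
    path: List[int] = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    # 2. Collapse runs of the same symbol into one.
    collapsed = [index for index, _ in groupby(path)]
    # 3. Remove blanks; a blank between two identical bases keeps both bases.
    return "".join(ALPHABET[i] for i in collapsed if i != BLANK_INDEX)


# Hypothetical per-frame probabilities over (blank, A, C, G, T).
frames = [
    [0.10, 0.80, 0.05, 0.03, 0.02],  # A
    [0.10, 0.70, 0.10, 0.05, 0.05],  # A (repeat, collapsed)
    [0.90, 0.02, 0.03, 0.03, 0.02],  # blank
    [0.10, 0.10, 0.70, 0.05, 0.05],  # C
    [0.80, 0.05, 0.05, 0.05, 0.05],  # blank
    [0.10, 0.10, 0.60, 0.10, 0.10],  # C (separated by a blank, so kept)
    [0.10, 0.05, 0.05, 0.70, 0.10],  # G
    [0.10, 0.05, 0.05, 0.10, 0.70],  # T
]
decoded = ctc_greedy_decode(frames)
```

The blank symbol is what lets the decoder distinguish one base held over several frames from a true homopolymer such as CC.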

27

Dorado

  • current, high-performance, and officially supported production basecaller software developed by Oxford Nanopore Technologies (ONT) for processing the raw electrical signals from their sequencers.

  • Integrated bioinformatics features such as:

    • Modified base calling: detecting epigenetic markers like 5mC and 6mA directly from the signal.

    • Duplex basecalling: combining signals from both strands of a DNA molecule for the highest possible accuracy.

    • Read trimming and alignment: post-processing steps like removing adapters and aligning reads using tools like minimap2.

28

What is the difference between reference mapping and de novo assembly?