Gene 4001: Transcriptomics

40 Terms

1

Transcriptome

- all the RNA molecules transcribed from a genome.

- It varies between cells according to their function and structure

2

Transcriptomics

The study of the complete transcriptome expressed by a specific cell (or population of cells) at a specific time point and/or under specific conditions

- most transcriptomic studies are focused on differential gene expression and are more interested in mRNA.

- in that case you use the polyA enrichment method.

3

Why do you use a polyA enrichment method when looking at mRNA in transcriptomics?

It allows you to selectively isolate messenger RNA from the total RNA pool.

- In eukaryotes, mature mRNA carries a polyA tail, whereas rRNA and tRNA (most of the total RNA) do not

4

Types of RNA

1. Protein synthesis:

- mRNA, tRNA, rRNA

2. Co/post transcriptional modification/DNA replication

- small nuclear RNA (snRNA) and small nucleolar RNA (snoRNA)

3. Regulatory

- miRNA, piRNA, siRNA and lncRNA

4. Parasitic

- viral RNA

5

Transcription

- In eukaryotes, the coding region is often split into segments (exons) by one or more non-coding introns.

- The entire gene sequence is transcribed into pre-mRNA.

- The introns are excised and the exons are spliced together to make the mature mRNA molecule.

- Exon splicing is achieved by snRNAs and proteins in the spliceosomes.
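
As a rough illustration of intron excision and exon joining, here is a minimal Python sketch. The sequence and the exon coordinates are made up for the example, and the function name `splice` is hypothetical; real splicing is carried out by the spliceosome, not by coordinate lookup.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in order, dropping everything in between (the introns).

    Coordinates are 0-based and end-exclusive, like Python slices.
    """
    return "".join(pre_mrna[start:end] for start, end in sorted(exons))


# Toy pre-mRNA: exon 1 (6 nt) + intron (10 nt) + exon 2 (6 nt).
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "UUUGAA"
exons = [(0, 6), (16, 22)]

mature_mrna = splice(pre_mrna, exons)
print(mature_mrna)  # AUGGCCUUUGAA
```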

6

Transcriptome methods

1. Microarray

2. RNAseq (cDNAseq)

7

Microarray

A grid of DNA segments (probes) of known sequence, used to test and map DNA fragments, antibodies, or proteins. For transcriptomics, labelled cDNA made from the sample RNA is hybridised to the probes.

- The probes are against known transcripts, and the level change can be detected through fluorescence detection

8

Microarray advantages

- High precision within its dynamic range

- Higher throughput

- Lower cost

9

Microarray disadvantages

- No alternative splicing information

- Balanced chromosomal rearrangements are not detected (relevant when arrays are used for genomic copy-number analysis)

- Imbalances in regions not included on the array platform are not detected; likewise, transcripts without a probe on the array are not detected

- Low dynamic range (about 3 orders of magnitude between a lowly expressed gene and a highly expressed gene)

10

RNA seq

The sequencing technique used to determine directly the nucleotide sequence of a collection of RNAs.

- It gives a more transcriptome-wide analysis than a microarray, because it is not limited to probes for known transcripts

11

RNA-seq advantages

- Don't need to know the sequence before the assay

- Learn about what isoforms are expressed in a cell

- higher reproducibility

- Higher dynamic range

- higher precision

12

RNA-seq workflow

1. In vivo

Pre-mRNA is transcribed, intron splicing occurs, and you get a mature mRNA.

2. In vitro

The RNA is fragmented and reverse transcribed into ds-cDNA fragments, the library is prepared, and high-throughput sequencing is run.

3. In silico

Read QC, mapping and alignment, gene expression estimates

13

RNAseq

- Usually you use Illumina (a short-read technology)

14

Illumina TruSeq RNA protocol and how it works?

- A commonly used kit for RNA-seq library preparation.

- It is focused on mRNA.

How it works:

- It uses polyT (oligo-dT) beads, which are complementary to the polyA tail of messenger RNA.

- This selects for mRNA.

- Random primers are added to make first-strand cDNA.

- The RNA strand is removed and a second cDNA strand is made, giving double-stranded cDNA copied from the messenger RNA transcript.

- End repair and phosphorylation, A-tailing (adding a single A to the 3' ends) and adapter ligation follow.

- Then PCR amplification and sequencing.

15

What do the universal p5 and p7 adapters do?

- They are complementary to the oligos on the Illumina flow cell, so they let your library fragments bind to the flow cell.

- Without them, your fragments would wash away during sequencing.

16

What is the goal of an index in an adapter?

- allows for multiplexing of samples
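
To make the idea of multiplexing concrete, the sketch below sorts pooled reads back into their samples by index sequence (demultiplexing). The index sequences, sample names and the `demultiplex` helper are all hypothetical. It uses exact matching only, whereas real demultiplexing tools usually tolerate a mismatch or two.

```python
from collections import defaultdict


def demultiplex(reads, index_to_sample):
    """Group read sequences by sample, using an exact match on the index.

    reads: iterable of (index_sequence, read_sequence) pairs.
    Reads whose index is not in the sample sheet go to "undetermined".
    """
    by_sample = defaultdict(list)
    for index_seq, read_seq in reads:
        sample = index_to_sample.get(index_seq, "undetermined")
        by_sample[sample].append(read_seq)
    return dict(by_sample)


# Made-up sample sheet and pooled reads.
sample_sheet = {"ATCACG": "control", "CGATGT": "treated"}
pooled = [
    ("ATCACG", "GGCTA"),
    ("CGATGT", "TTGCA"),
    ("ATCACG", "CCGTA"),
    ("NNNNNN", "AAAAA"),
]

print(demultiplex(pooled, sample_sheet))
```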

17

Why do paired-end sequencing?

- If you want more information in highly repetitive regions or across exon junctions: reading both ends of each fragment gives two linked reads, which makes alignment in these regions less ambiguous.

- Improves accuracy in detecting differential expression of lowly expressed genes.

18

Why do single-end sequencing?

- cheaper than paired-end (PE)

- faster turn-around

- smaller data footprint

19

RNA quality check

- Before creating a library, you must check the quality of your RNA.

- You can do so by running a gel (it lets you see contamination and smearing from degraded RNA).

- You can also run a fluorometric assay on a Qubit to look at your RNA quality.

- You can also use a Bioanalyser, which gives you an RNA integrity number (RIN); you want a RIN above 8.

20

What's an important thing to look for when creating an RNA library?

- The presence of your ribosomal RNA.

- Humans have 18S and 28S rRNA; good-quality total RNA will show two bands corresponding to these two rRNAs.

21

Qubit

- Will give you an RNA integrity and quality score.

- It does so by using two unique dyes: one binds to large, intact and/or highly structured RNA (mRNA, tRNA, rRNA) and the other selectively binds to small and/or degraded RNA.

- Together they enable you to quickly assess the quality and integrity of the RNA sample.

- A score is given from 1 to 10, where a small number indicates the sample is compromised and a large number indicates that the sample consists mainly of large, intact RNA.

22

Bioanalyser (electropherogram)

- A bioanalyser gives you an electropherogram, a graph showing the fluorescence intensity versus fragment size.

- The two main peaks are the 18S and 28S rRNA respectively, with other peaks for degraded RNA etc.

23

Bioanalyser (electropherogram) for plants

- Plant samples contain additional ribosomal RNA species (for example from chloroplasts), which add extra peaks and affect the score, so you cannot have much confidence in it for plants

- no single metric is sufficient; rely on judgment and experience

24

Library QC

- you need to have a uniform fragment size of around 260 bp.

- If you see multiple peaks, you do not have a uniform fragment size, which is not ideal for sequencing

25

Why do you have to do mRNA enrichment to look at gene expression?

- mRNA enriched library gives roughly 40% more sequence against exons (i.e. message).

26

Sequence QC

1. Filter for quality.

2. Trim adapters (with longer reads there is a lot of adapter contamination, because the read can run past the end of the insert into the adapter)
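
The two QC steps can be sketched in Python as below. This is a simplified illustration, not how a dedicated trimming tool works: it trims only on an exact adapter match and filters on mean read quality. The function names, the cutoff of 30 and the example adapter string are assumptions made for the example.

```python
def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 encoding assumed)."""
    return sum(ord(ch) - offset for ch in qual) / len(qual)


def trim_adapter(seq: str, qual: str, adapter: str):
    """Cut the read at the first exact occurrence of the adapter, if any."""
    pos = seq.find(adapter)
    if pos == -1:
        return seq, qual
    return seq[:pos], qual[:pos]


def qc_read(seq: str, qual: str, adapter: str, min_mean_q: float = 30):
    """Trim the adapter, then drop the read if it is empty or low quality."""
    seq, qual = trim_adapter(seq, qual, adapter)
    if not seq or mean_quality(qual) < min_mean_q:
        return None  # read fails QC
    return seq, qual


ADAPTER = "AGATCGGAAGAGC"  # example adapter sequence, used here for illustration
read_seq = "ACGTACGT" + ADAPTER
read_qual = "I" * len(read_seq)  # 'I' encodes Q40 in Phred+33

print(qc_read(read_seq, read_qual, ADAPTER))  # ('ACGTACGT', 'IIIIIIII')
print(qc_read("ACGT", "####", ADAPTER))       # None ('#' encodes Q2)
```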

27

Phred score/FastQ

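- A measure of the quality of each base call in DNA sequencing: it encodes the probability that the base call is incorrect (Q = -10 log10 P). FASTQ files store one Phred score per base.

- For most sequencing experiments you want at least Q30 (a 1 in 1,000 chance that the base call is wrong)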
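
The relationship between a Phred score and the error probability can be checked with a short Python sketch. The function names are made up for the example, and the Phred+33 encoding is an assumption, although it is what current Illumina FASTQ files use.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Phred score for an error probability p (Q = -10 * log10(p))."""
    return -10 * math.log10(p)


def decode_fastq_quality(qual: str, offset: int = 33) -> list[int]:
    """Turn a FASTQ quality string into per-base Phred scores."""
    return [ord(ch) - offset for ch in qual]


for q in (10, 20, 30, 40):
    print(f"Q{q}: error probability {phred_to_error_prob(q):.4f}")

print(decode_fastq_quality("I5#"))  # [40, 20, 2]
```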

28

What percentage of your reads won't map to the reference genome?

- Roughly 20%; this is standard and could be due to contamination, sequencing error etc.

29

What percentage of reads are PCR duplicates on average, and what do we use to detect them?

- Roughly 15%; we correct for this by using a unique molecular identifier (UMI).

- Some adapters carry a UMI, which allows you to identify and remove any PCR duplicates present.
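
A minimal sketch of how a UMI lets you tell PCR duplicates from independent molecules: reads that map to the same place and carry the same UMI are treated as copies of one original molecule. The read records and the `remove_pcr_duplicates` helper are hypothetical, and real tools also allow for sequencing errors in the UMI, which this example ignores.

```python
def remove_pcr_duplicates(reads):
    """Keep one read per (chromosome, position, strand, UMI) combination."""
    seen = set()
    kept = []
    for read in reads:
        key = (read["chrom"], read["pos"], read["strand"], read["umi"])
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


reads = [
    {"name": "r1", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "ACGT"},
    # Same position and same UMI as r1: counted as a PCR duplicate.
    {"name": "r2", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "ACGT"},
    # Same position but a different UMI: a different original molecule.
    {"name": "r3", "chrom": "chr1", "pos": 100, "strand": "+", "umi": "TTAG"},
]

unique_reads = remove_pcr_duplicates(reads)
print([r["name"] for r in unique_reads])  # ['r1', 'r3']
```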

30

How do I determine quantity and compare gene expression (after QC and mapping)?

1. Normalise for sequencing depth i.e. number of reads per sample (between samples, libraries) and gene length.

2. This is critical for both within and between sample comparisons. Even though Gene X is the same length between samples, the libraries of the samples will vary, so expression must be normalised.

31

What is RPKM?

- Reads per kilobase of exon model per million reads

- it is a within sample normalisation method that removes the transcript length and library size effects.

- it basically tells you how active a gene is while adjusting for gene size and how deeply you sequenced the sample.

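How do you calculate it:

1. Count the total reads in the sample and divide by 1 million (this is the per-million scaling factor).

2. Divide each gene's read count by this scaling factor; this normalises for sequencing depth and gives reads per million (RPM).

3. Divide the RPM by the gene length in kilobases.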
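
The calculation can be written out as a short Python sketch. The gene names, counts and lengths are toy numbers chosen so the arithmetic is easy to follow, and the `rpkm` function is illustrative rather than taken from any package.

```python
def rpkm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """RPKM per gene: depth normalisation first, then gene length."""
    per_million = sum(counts.values()) / 1_000_000   # step 1: scaling factor
    result = {}
    for gene, count in counts.items():
        rpm = count / per_million                    # step 2: reads per million
        result[gene] = rpm / (lengths_bp[gene] / 1000)  # step 3: per kilobase
    return result


# Toy sample: 2 million reads in total.
counts = {"geneA": 500_000, "geneB": 1_500_000}
lengths_bp = {"geneA": 2000, "geneB": 1000}

print(rpkm(counts, lengths_bp))
# geneA: 500,000 / 2 = 250,000 RPM; 250,000 / 2 kb = 125,000
# geneB: 1,500,000 / 2 = 750,000 RPM; 750,000 / 1 kb = 750,000
```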

32

What is FPKM?

Fragments per kilobase of transcript per million mapped reads

- Similar to RPKM but used for paired-end reads, where the two reads of a pair come from one fragment and are counted once

33

What is TPM?

- Transcripts per million.

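- Here you do the gene length correction first, then the sequencing depth correction.

- This means the TPM values add up to the same total (1 million) in every sample, so proportions are comparable between samples.

1. Divide the read counts by the length of each gene in kb; this gives reads per kilobase (RPK).

2. Sum all the RPK values in a sample and divide by 1 million (this is the per-million scaling factor).

3. Divide each RPK value by this scaling factor to get TPM.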
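
A small Python sketch of the TPM calculation, using toy counts and gene lengths chosen only for illustration. The `tpm` function is a hypothetical helper, not part of any library.

```python
def tpm(counts: dict[str, int], lengths_bp: dict[str, int]) -> dict[str, float]:
    """TPM per gene: gene length correction first, then depth."""
    # Step 1: reads per kilobase (RPK) for every gene.
    rpk = {gene: count / (lengths_bp[gene] / 1000) for gene, count in counts.items()}
    # Step 2: per-million scaling factor from the sum of RPK values.
    scaling = sum(rpk.values()) / 1_000_000
    # Step 3: divide each RPK by the scaling factor.
    return {gene: value / scaling for gene, value in rpk.items()}


counts = {"geneA": 10, "geneB": 20, "geneC": 30}
lengths_bp = {"geneA": 2000, "geneB": 4000, "geneC": 1000}

values = tpm(counts, lengths_bp)
print(values)                # RPK is 5, 5 and 30, so TPM is about 125,000, 125,000 and 750,000
print(sum(values.values()))  # about 1,000,000 for any sample
```

Because the values in each sample sum to one million, a gene's TPM can be read as its share of the transcripts in that sample.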

34

What FDR (false discovery rate) do we go for?

- around 0.05 or less
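
The card does not say how the FDR is controlled. One widely used procedure is Benjamini-Hochberg, sketched below in Python as an assumption about what might be used. The p-values are invented, and differential expression packages compute these adjusted values for you.

```python
def benjamini_hochberg(pvalues: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.20]  # made-up p-values for four genes
adj = benjamini_hochberg(pvals)
significant = [p <= 0.05 for p in adj]

print(adj)          # approximately [0.04, 0.0533, 0.0533, 0.2]
print(significant)  # [True, False, False, False]
```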

35

What type of analysis can you do with your data?

- heatmaps

- Principal component analysis (PCA)

- Functional annotation of DEG

36

CLIP-seq

Stands for Crosslinking and Immunoprecipitation sequencing

Used to identify binding sites of RNA-binding proteins (RBPs) on RNAs

Combines UV crosslinking, immunoprecipitation (IP), and high-throughput sequencing

Reveals where on the RNA a protein binds; useful for studying gene regulation, splicing, stability, etc.

37

CLIP-seq workflow

1. UV Crosslinking

Irradiate living cells with UV light to covalently link RBPs to bound RNA.

2. Cell Lysis

Break open cells to extract RNA-protein complexes.

3. Immunoprecipitation (IP)

Use an antibody to pull down the specific RBP (and its bound RNAs).

4. RNase Digestion

Partially digest RNA to trim unbound regions, leaving the protected fragment.

5. Gel Purification

Run complexes on a gel, cut out the desired band (RBP-RNA complex).

6. Proteinase Treatment

Remove protein, leaving a short RNA fragment that was bound.

7. cDNA Library Preparation

Convert RNA fragment to cDNA, add sequencing adapters.

8. High-throughput Sequencing

Sequence the cDNA to identify RNA fragments.

9. Data Analysis

Map reads to the genome to find binding sites and target RNAs.

38

Single cell sequencing

- Used when you are interested in a rare cell type

- You have to have a single cell suspension.

- then form your libraries

39

What is the difference between bulk RNA seq and single cell RNA seq?

🔬 1. Sample Resolution

Bulk RNA-seq: Measures average gene expression across a large population of cells. It blends all the RNA from the sample together.

Single-cell RNA-seq: Measures gene expression at the level of individual cells, capturing cell-to-cell variability.

📊 2. Biological Insight

Bulk RNA-seq: Useful for understanding overall gene expression patterns in tissues or large cell populations.

Single-cell RNA-seq: Allows identification of rare cell types, cell states, and heterogeneous responses that bulk methods would average out.
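
A toy Python example of why averaging can hide a rare cell type. The expression values are invented, and the two helper functions are hypothetical names used only for this illustration.

```python
from statistics import mean

# Hypothetical expression of one marker gene in four cells (arbitrary units).
single_cell = {"cell_1": 0, "cell_2": 0, "cell_3": 0, "cell_4": 100}


def bulk_estimate(per_cell: dict[str, float]) -> float:
    """What a bulk experiment reports: one average over all cells."""
    return mean(per_cell.values())


def expressing_cells(per_cell: dict[str, float], threshold: float = 0) -> list[str]:
    """What single-cell data can show: which cells express the gene."""
    return [cell for cell, value in per_cell.items() if value > threshold]


print(bulk_estimate(single_cell))     # 25: looks like modest expression everywhere
print(expressing_cells(single_cell))  # ['cell_4']: one rare, highly expressing cell
```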

40

Single cell sequencing workflow

- To make a library, we use the 10x Chromium.

- Then you would do your adapter ligation and sequence it on an Illumina instrument.

- Three lanes:

1. Cells and RT master mix

2. Gel beads (could be a specific colour bead for each sample)

3. Oil + recovery well

- The beads and the cells with the enzyme mix enter from different entry points and then meet the oil, forming a water-in-oil emulsion.

- Each droplet of the emulsion encapsulates 1 bead.

- About 65% of these beads will have a single cell with them.

- After that you break your emulsion, amplify cDNA, construct library.

- 25k cells per lane which is approx 200k cells in a day