Sequencing II: Bioinformatics

45 Terms

1

NM_004006.2

Reference sequence

-The accession and version (.2) of the sequence against which the variant's position is described

2

c.

Letter prefix indicating the type of reference sequence (c. = coding DNA)

3

124A>G

Position and type of change (position 124, A changed to G)
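Putting cards 1-3 together gives the full description NM_004006.2:c.124A>G. A minimal sketch of splitting such a description back into its parts, assuming only simple substitutions (the pattern and the function name parse_substitution are illustrative, not a full HGVS parser):

```python
import re

# Matches a simple substitution only, e.g. "NM_004006.2:c.124A>G".
# Real descriptions have many more forms (intronic offsets, indels, etc.).
HGVS_SUBSTITUTION = re.compile(
    r"^(?P<reference>[A-Z]{2}_\d+\.\d+):"   # accession.version
    r"(?P<prefix>[gcnm])\."                 # DNA-level letter prefix
    r"(?P<position>\d+)"                    # position of the change
    r"(?P<ref_base>[ACGT])>(?P<alt_base>[ACGT])$"
)


def parse_substitution(description: str) -> dict:
    """Split a simple substitution into reference, prefix, position, bases."""
    match = HGVS_SUBSTITUTION.match(description)
    if match is None:
        raise ValueError(f"not a simple substitution: {description!r}")
    parts = match.groupdict()
    parts["position"] = int(parts["position"])
    return parts


print(parse_substitution("NM_004006.2:c.124A>G"))
```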

4

What do reference genomic sequences start with?

NG

5

Genome builds

NCBI36/hg18, GRCh37/hg19, GRCh38/hg38

6

What do reference mRNA sequences start with?

NM

7

What do reference protein sequences start with?

NP
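The NG / NM / NP prefixes can be kept as a small lookup. This is only a sketch covering the three prefixes on these cards; the function name molecule_type is made up for illustration:

```python
# Only the three prefixes from the cards; other accession prefixes exist.
REFSEQ_PREFIXES = {
    "NG": "genomic",
    "NM": "mRNA",
    "NP": "protein",
}


def molecule_type(accession: str) -> str:
    """Return the molecule type implied by the accession prefix."""
    prefix = accession.split("_", 1)[0]
    return REFSEQ_PREFIXES.get(prefix, "unknown")


print(molecule_type("NM_004006.2"))
```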

8

Why is the correct reference sequence important?

-Genes can have multiple isoforms

--Different isoforms of a gene will have different reference sequences

-Alternative splicing produces different transcripts, which can result in different proteins

9

Genomic DNA (g.)

First nucleotide of the genomic reference sequence

10

Coding DNA (c.)

First nucleotide of the translation start codon of the coding DNA reference sequence

11

Noncoding DNA (n.)

First nucleotide of the noncoding DNA reference sequence

12

Mitochondrial DNA (m.)

First nucleotide of the mito DNA reference sequence

13

RNA (r.)

First nucleotide of the translation start codon of the RNA reference sequence, or first nucleotide of the noncoding RNA reference sequence

14

Protein (p.)

First amino acid of the protein sequence

15

CNV- Deletion (del) format

"prefix""position(s)_deleted""del"

16

CNV- Duplication (dup) format

Only used for tandem duplication

"prefix""position(s)_duplicated""dup"

17

CNV- Insertion (ins) format

"prefix""positions_flanking""ins""inserted_sequence"

18

CNV- Inversion (inv) format

"prefix""positions_inverted""inv"
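The four formats on cards 15-18 can be sketched as string builders. The function names and the example positions are hypothetical; the sketch assumes positions are plain integers and does no validation against a real reference sequence:

```python
def describe_range(prefix: str, start: int, end: int, kind: str) -> str:
    """del, dup and inv share one shape: prefix, position(s), keyword."""
    if kind not in {"del", "dup", "inv"}:
        raise ValueError(f"unsupported kind: {kind}")
    if kind == "inv" and start == end:
        raise ValueError("an inversion needs a range of positions")
    positions = str(start) if start == end else f"{start}_{end}"
    return f"{prefix}.{positions}{kind}"


def describe_insertion(prefix: str, left_flank: int, inserted: str) -> str:
    """ins names the two flanking positions, then the inserted sequence."""
    if not inserted:
        raise ValueError("inserted sequence must not be empty")
    return f"{prefix}.{left_flank}_{left_flank + 1}ins{inserted}"


print(describe_range("c", 19, 21, "del"))
print(describe_range("c", 19, 21, "dup"))
print(describe_insertion("c", 19, "TT"))
print(describe_range("c", 19, 21, "inv"))
```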

19

Contig

Historical term from when the genome was first sequenced

"Chunks" of contiguous sequence that serve as the 'reference' point for a g.___ position

20

Duplicate reads

Generated during PCR amp in library prep

-Too many can skew variant calling algorithms and carry PCR-introduced errors into the variant calls
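A rough sketch of how duplicates could be flagged before variant calling, assuming reads that share chromosome, start, end and strand are PCR copies of one fragment. The Read fields and the "keep the best-quality read" rule are illustrative assumptions, not how any particular tool works:

```python
from collections import defaultdict
from typing import NamedTuple


class Read(NamedTuple):
    name: str
    chrom: str
    start: int
    end: int
    strand: str
    mean_quality: float


def mark_duplicates(reads):
    """Return the names of reads flagged as duplicates."""
    groups = defaultdict(list)
    for read in reads:
        # Same coordinates and strand: assumed to come from one fragment.
        groups[(read.chrom, read.start, read.end, read.strand)].append(read)

    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda r: r.mean_quality)
        duplicates.update(r.name for r in group if r is not best)
    return duplicates


reads = [
    Read("r1", "chr1", 100, 250, "+", 35.0),
    Read("r2", "chr1", 100, 250, "+", 30.0),  # same coordinates as r1
    Read("r3", "chr1", 120, 270, "+", 33.0),
]
print(mark_duplicates(reads))
```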

21

Depth

The number of sequencing reads at a given locus

-How many unique seq reads at a given locus?

22

Breadth

The fraction of target loci with reads aligned

-How much of the genome are we covering?
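Depth and breadth can be computed side by side. A minimal sketch, assuming reads are already deduplicated and given as half-open (start, end) intervals on one target region; the function names and the toy reads are hypothetical:

```python
def per_base_depth(reads, target_start, target_end):
    """Count how many reads cover each position of the target."""
    depth = [0] * (target_end - target_start)
    for start, end in reads:
        for pos in range(max(start, target_start), min(end, target_end)):
            depth[pos - target_start] += 1
    return depth


def breadth(depth, min_depth=1):
    """Fraction of target positions covered by at least min_depth reads."""
    return sum(d >= min_depth for d in depth) / len(depth)


depth = per_base_depth([(0, 5), (3, 8)], target_start=0, target_end=10)
print(depth)                     # depth at each locus
print(sum(depth) / len(depth))   # mean depth across the target
print(breadth(depth))            # breadth of coverage
```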

23

Possible issues w/ calling SNVs

-Zygosity

-Mosaicism

-Seq errors that are called as variants

24

Variant calling algorithms

Predict the likelihood of a variant vs. the likelihood of an artifact

-Quality score of individual reads

-Allele counts at the locus

-Strand bias

-Repeated regions

-Low complexity regions

The more high-quality reads w/ the same allele, the greater the likelihood that the variant is called
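The "variant vs. artifact" comparison can be sketched as a log-likelihood ratio under a simple binomial model. This is a toy illustration only: it assumes a fixed per-read error rate (0.01 here, a made-up value), compares against a heterozygous variant, and ignores quality scores, strand bias and repeat content, which real callers take into account:

```python
from math import log


def log_likelihood_ratio(alt_reads, total_reads, error_rate=0.01):
    """log[ P(data | heterozygous variant) / P(data | sequencing error) ].

    Positive values favor a true variant, negative values favor an artifact.
    The binomial coefficient is the same in both models, so it cancels.
    """
    ref_reads = total_reads - alt_reads
    return (alt_reads * log(0.5 / error_rate)
            + ref_reads * log(0.5 / (1 - error_rate)))


print(log_likelihood_ratio(10, 20))   # half the reads carry the allele
print(log_likelihood_ratio(20, 40))   # same fraction, twice the reads
print(log_likelihood_ratio(0, 20))    # no supporting reads
```

With the same allele fraction, doubling the number of reads doubles the ratio, which matches the last line of the card.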

25

Coverage depth

Important to ensure accuracy of variant calling, especially in samples w/ low allelic fractions

-The more reads that support an allele, the more likely it's a true variant!
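Why depth matters for low allelic fractions can be shown with a binomial calculation: the chance of seeing enough supporting reads at a given depth. A sketch, assuming each read independently carries the alternative allele with probability equal to the allelic fraction; the threshold of 5 reads is an arbitrary example:

```python
from math import comb


def prob_at_least(min_alt_reads, depth, allelic_fraction):
    """P(at least min_alt_reads of depth reads carry the alternative allele)."""
    f = allelic_fraction
    p_fewer = sum(
        comb(depth, k) * f**k * (1 - f) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1 - p_fewer


# Same 10% allelic fraction, shallow vs. deep coverage.
print(prob_at_least(5, depth=30, allelic_fraction=0.10))
print(prob_at_least(5, depth=500, allelic_fraction=0.10))
```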

26

What are some circumstances where you may detect a low fraction of reads with an alternative allele?

1. Low level mosaicism

2. Tumor heterogeneity

27

Allelic fraction cut-off for mosaicism?

20% or lower

28

Allelic fraction of G:100%

Genotype is G/G

Homozygous for alternative allele

29

Allelic fraction of G:60% and A:40%

A/G

Heterozygous for alternative allele

30

Allelic fraction of G:20% and A:80%

A/G

Heterozygous BUT w/ mosaicism
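The three allelic-fraction cards can be written as one rule. A sketch only: the 20% mosaicism cut-off comes from card 27, while the 90% cut-off for calling homozygous is an assumed value chosen for illustration:

```python
def interpret_alt_fraction(alt_fraction, mosaic_cutoff=0.20, hom_cutoff=0.90):
    """Map the fraction of reads with the alternative allele to a label."""
    if not 0 <= alt_fraction <= 1:
        raise ValueError("alt_fraction must be between 0 and 1")
    if alt_fraction == 0:
        return "no alternative allele"
    if alt_fraction <= mosaic_cutoff:
        return "heterozygous with possible mosaicism"
    if alt_fraction >= hom_cutoff:        # hom_cutoff is an assumption
        return "homozygous alternative"
    return "heterozygous"


for fraction in (1.0, 0.6, 0.2):
    print(fraction, interpret_alt_fraction(fraction))
```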

31

Deeper coverage = ?

Higher quality variant calling

32

Calling structural variants

Measure relative read depth to infer changes in copy number

33

Less than average depth = ?

Deletion

34

More than average depth = ?

Duplication
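Cards 32-34 as a sketch: compare a region's depth to the average and label the ratio. The 0.75 and 1.25 thresholds are made-up illustrative values, not established cut-offs:

```python
def call_copy_change(region_depth, average_depth, loss=0.75, gain=1.25):
    """Label a region by its read depth relative to the average depth."""
    if average_depth <= 0:
        raise ValueError("average_depth must be positive")
    ratio = region_depth / average_depth
    if ratio < loss:       # fewer reads than expected
        return "deletion"
    if ratio > gain:       # more reads than expected
        return "duplication"
    return "no change"


print(call_copy_change(15, 30))
print(call_copy_change(45, 30))
print(call_copy_change(30, 30))
```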

35

Split reads

A single read whose 5' and 3' ends do not align contiguously in the genome

36

Mapping paired reads- Above average distance between paired end reads = ?

Deletion

37

Mapping paired reads- Below average distance between paired end reads = ?

Insertion
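The paired-read rule on cards 36-37 as a sketch: flag a pair when its mapped distance is far from what the library is expected to give. The expected mean, standard deviation and the 3-SD window are assumptions for illustration:

```python
def classify_pair(distance, expected_mean, expected_sd, n_sd=3):
    """Compare the mapped distance between mates to the expected distance."""
    if distance > expected_mean + n_sd * expected_sd:
        return "deletion"     # mates map farther apart than expected
    if distance < expected_mean - n_sd * expected_sd:
        return "insertion"    # mates map closer together than expected
    return "concordant"


print(classify_pair(900, expected_mean=300, expected_sd=30))
print(classify_pair(100, expected_mean=300, expected_sd=30))
print(classify_pair(310, expected_mean=300, expected_sd=30))
```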

38

FASTQ

Raw sequence reads and quality scores directly from the sequencer
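A FASTQ record is four lines: header, sequence, separator, quality string. A small sketch of reading one record, assuming the common Phred+33 encoding (quality = ASCII code minus 33); the read name and bases below are invented:

```python
def parse_fastq_record(lines):
    """Return (read name, sequence, per-base Phred scores) for one record."""
    header, sequence, separator, quality = (line.rstrip("\n") for line in lines)
    if not header.startswith("@") or not separator.startswith("+"):
        raise ValueError("not a FASTQ record")
    if len(sequence) != len(quality):
        raise ValueError("sequence and quality lengths differ")
    scores = [ord(char) - 33 for char in quality]  # assumes Phred+33
    return header[1:], sequence, scores


record = ["@read1\n", "ACGT\n", "+\n", "II5!\n"]
print(parse_fastq_record(record))
```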

39

BAM

Position information of sequence reads aligned to the reference genome, along w/ quality info

40

VCF

Variant call format

-Every location where the sample differs from the reference
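Each VCF data line is tab-separated, with eight fixed columns first. A sketch of reading one line into a dictionary; the example line is invented, and header lines and per-sample columns are not handled:

```python
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def parse_vcf_line(line):
    """Split one VCF data line into its eight fixed columns."""
    fields = line.rstrip("\n").split("\t")
    record = dict(zip(VCF_COLUMNS, fields[:8]))
    record["POS"] = int(record["POS"])
    record["ALT"] = record["ALT"].split(",")  # several alt alleles are allowed
    return record


# Hypothetical data line: an A>G difference from the reference on chr1.
print(parse_vcf_line("chr1\t12345\t.\tA\tG\t50\tPASS\tDP=100"))
```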

41

Filters to identify disease-causing variants exclude

1. Intronic or intergenic variants (not within an exon)

2. Common variants

3. Synonymous variants

4. Variants not predicted to alter function

5. Variants that don't fit the inheritance pattern
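The five exclusion filters above can be sketched as one pass/fail check. Every field name here is hypothetical (real annotation tools use their own), and the 1% population-frequency cut-off for "common" is an assumed value:

```python
def passes_filters(variant, max_population_frequency=0.01):
    """Return True if the variant survives all five exclusion filters."""
    if variant["region"] in {"intronic", "intergenic"}:
        return False                                  # 1. not within an exon
    if variant["population_frequency"] > max_population_frequency:
        return False                                  # 2. common variant
    if variant["consequence"] == "synonymous":
        return False                                  # 3. synonymous
    if not variant["predicted_to_alter_function"]:
        return False                                  # 4. no predicted effect
    if not variant["fits_inheritance_pattern"]:
        return False                                  # 5. wrong inheritance
    return True


candidate = {
    "region": "exonic",
    "population_frequency": 0.0001,
    "consequence": "missense",
    "predicted_to_alter_function": True,
    "fits_inheritance_pattern": True,
}
print(passes_filters(candidate))
```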

42

Variant filtering

-Sorting through thousands of variants to find those that contribute to phenotype

-Filtering scheme depends on the sequencing method and phenotype/inheritance pattern

43

How would you filter to identify de novo variants?

Check parents: a de novo variant is present in the child but in neither parent

44

How would you filter to identify variants causing an AR condition?

-Are parents affected?

-Do they have 1 or 2 copies?

-Does child have 2 bad copies?

45

How would you filter to identify variants causing an X-linked recessive condition in a male?

See if mom is a carrier of the condition
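The last three cards (de novo, AR, X-linked recessive in a male) can be sketched as checks on a trio. Genotypes are assumed to be given as the number of copies of the variant allele (0, 1 or 2); the function names are illustrative, and the AR check does not cover compound heterozygotes:

```python
def is_de_novo(child, mother, father):
    """Variant present in the child but in neither parent."""
    return child >= 1 and mother == 0 and father == 0


def fits_autosomal_recessive(child, mother, father):
    """Child has 2 copies; each parent carries at least 1."""
    return child == 2 and mother >= 1 and father >= 1


def fits_x_linked_recessive_male(son, mother):
    """Affected son carries the variant on his X; mom carries it too."""
    return son >= 1 and mother >= 1


print(is_de_novo(child=1, mother=0, father=0))
print(fits_autosomal_recessive(child=2, mother=1, father=1))
print(fits_x_linked_recessive_male(son=1, mother=1))
```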