Week 15 - Genomics and Bioinformatics

Last updated 4:31 PM on 4/28/26
107 Terms

1
New cards

What is the genome?

All the genetic material of an organism, including the nuclear DNA as well as the genetic material of organelles (mitochondria, chloroplasts, etc.).

2
New cards

How large is the human genome?

Human genome: ~3,200,000,000 base pairs (3.2 Gb) and roughly 23,000 genes.

3
New cards

What is the “C value” paradox?

That genome size does NOT increase with perceived complexity of organisms.

4
New cards

What does the “C value” refer to in the “C value” paradox?

The "C-value" is the amount of DNA in a haploid nucleus.

5
New cards

Why can eukaryotic genes produce more than one protein per gene?

Because of alternative splicing of RNA transcripts (different combinations of exons are joined in RNA processing, producing different mRNA transcripts and therefore different proteins).
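As a toy illustration of why one gene can give several mRNAs, the sketch below enumerates exon-skipping isoforms. It assumes a hypothetical four-exon gene in which the first and last exons are always kept and any internal exon may be skipped; real cells produce only a subset of these combinations.

```python
from itertools import combinations

def splice_isoforms(exons):
    """Return every isoform that keeps the first and last exon and any
    subset of the internal exons, always in genomic order."""
    first, *internal, last = exons
    isoforms = []
    for k in range(len(internal) + 1):
        for kept in combinations(internal, k):
            isoforms.append("-".join([first, *kept, last]))
    return isoforms

# Hypothetical gene with four exons (names are illustrative only).
isoforms = splice_isoforms(["E1", "E2", "E3", "E4"])
for iso in isoforms:
    print(iso)
```

Two skippable internal exons already give four possible transcripts from a single gene.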

6
New cards

Why do parasites usually have small genomes?

Because they rely on their host for many functions, they lose unnecessary genes over time, resulting in a reduced genome.

7
New cards

Which animals have the smallest known gene complement?

Mesozoans, with about 9,000 genes.

8
New cards

Which group of animals has the lowest gene density?

Mammals have the lowest gene density.

9
New cards

What is gene density?

The number of genes in a given length of DNA.
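A minimal worked example, assuming the figures from card 2 (roughly 23,000 genes in a 3.2 Gb genome) and expressing density as genes per megabase. The function name is illustrative.

```python
def gene_density(n_genes, genome_size_bp):
    """Genes per megabase (1 Mb = 1,000,000 bp)."""
    return n_genes / (genome_size_bp / 1_000_000)

# Approximate human figures taken from card 2.
human = gene_density(23_000, 3_200_000_000)
print(round(human, 1))  # about 7.2 genes per Mb
```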

10
New cards

What proportion of the human genome does not code for genes (genes meaning proteins, rRNAs, or tRNAs)?

98.5%

11
New cards

What proportion of the human genome consists of introns and gene-related regulatory sequences (e.g., enhancers, promoters)?

Approx 25%

12
New cards

Are protein-coding genes made up of only exons and introns?

No, they also include regulatory regions and untranslated regions that control gene expression.

13
New cards

What is the role of the promoter in a gene?

It is the region where transcription begins and where RNA polymerase binds.

14
New cards

What is the 5′ UTR?

An untranslated region before the coding sequence that helps regulate translation.

15
New cards

What is the 3′ UTR?

An untranslated region after the coding sequence that influences mRNA stability and regulation.

16
New cards

What does a protein-coding gene consist of?

Exons, introns, promoters, enhancers, a 5′ UTR, a 3′ UTR, and a polyadenylation signal (the poly-A tail itself is added to the mRNA, not encoded in the gene).

17
New cards

What proportion of the human genome is unique (non-repetitive) noncoding DNA?

15%

18
New cards

What is noncoding DNA?

DNA that does not code for proteins but often produces functional RNAs (ncRNAs) with important roles in the cell.

19
New cards

What are the two main types of noncoding RNAs?

Small ncRNAs and long ncRNAs (lncRNAs).

20
New cards

What is rRNA and its function (small noncoding RNA)?

Ribosomal RNA; forms part of the ribosome for protein synthesis.

21
New cards

What is tRNA and its function (small noncoding RNA)?

Transfer RNA; carries amino acids to the ribosome during translation.

22
New cards

What is snRNA and its function (small noncoding RNA)?

Small nuclear RNA; involved in splicing pre-mRNA in the spliceosome.

23
New cards

What is snoRNA and its function (small noncoding RNA)?

Small nucleolar RNA; helps process and assemble ribosomes.

24
New cards

What is gRNA and its function (small noncoding RNA)?

Guide RNA; directs RNA editing.

25
New cards

What is miRNA and its function (small noncoding RNA)?

MicroRNA (~24 nucleotides); regulates gene expression by silencing mRNA.

26
New cards

What are long ncRNAs (lncRNAs)?

RNAs longer than 200 nucleotides that regulate gene expression.

27
New cards

What proportion of the human genome is repetitive DNA unrelated to transposable elements?

14%

28
New cards

What are repeated DNA sequences?

DNA sequences that occur multiple times in the genome.

29
New cards

What are the two main types of repeated sequences?

Repeats unrelated to transposable elements and repeats related to transposable elements.

30
New cards

What are repeats unrelated to transposable elements?

Tandem repeats that arise from DNA replication errors, not from mobile genetic elements.

31
New cards

What are tandem repeats?

DNA sequences repeated directly next to each other.

32
New cards

How do tandem repeats form?

Through strand slippage during DNA replication.

33
New cards

What is strand slippage?

An error during DNA replication where the DNA polymerase slips and repeats a section of DNA.

34
New cards

What are minisatellites?

Tandem repeats with repeat units of 10–60 base pairs.

35
New cards

What are microsatellites?

Tandem repeats with repeat units shorter than 10 base pairs.

36
New cards

To other scientists, what are microsatellites also referred to as?

Microsatellites are often referred to as short tandem repeats (STRs) by forensic geneticists, or as simple sequence repeats (SSRs) by plant geneticists.

37
New cards

What are microsatellites used in?

DNA profiling/fingerprinting.
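DNA profiling compares the number of repeat copies at a microsatellite locus. A rough sketch of how such tandem runs could be located with a regular expression is shown below; the unit-length and copy-number thresholds and the test sequence are assumptions for illustration, not a forensic standard.

```python
import re

def find_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=3):
    """Return (start, unit, copies) for each tandem run found in seq."""
    # A short unit (captured lazily) followed by at least
    # min_repeats - 1 further copies of exactly the same unit.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits

# Made-up sequence containing four tandem copies of GATA.
hits = find_microsatellites("CCTTGGATAGATAGATAGATACCG")
print(hits)
```

Two individuals with different copy numbers at the same locus would give different `copies` values, which is the property profiling relies on.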

38
New cards

What are repeats related to transposable elements?

Interspersed repeats spread throughout the genome, originating from mobile genetic elements.

39
New cards

What are interspersed repeats?

Repeated DNA sequences scattered across the genome rather than located next to each other.

40
New cards

What are transposable elements (TEs)?

DNA sequences that can move or copy themselves to new locations in the genome.

41
New cards

What are the two main types of transposons?

DNA transposons and RNA transposons (retrotransposons).

42
New cards

How do DNA transposons move?

By a “cut and paste” mechanism: the element is excised from one location and inserted at another.

43
New cards

How do RNA transposons move?

By a “copy and paste” mechanism: the element is transcribed into RNA, reverse-transcribed back into DNA, and the new copy inserts elsewhere.

44
New cards

What is the main difference between DNA and RNA transposons?

DNA transposons move without increasing copy number, while RNA transposons increase copy number.

45
New cards

Who discovered Transposons and how?

Barbara McClintock discovered them through breeding experiments with Indian corn (maize).

McClintock identified changes in the colour of corn kernels that made sense only if some genetic elements move from other genome locations into the genes for kernel colour.

46
New cards

What is transposase?

An enzyme that cuts and inserts DNA transposons into new genomic locations.

47
New cards

What are TIRs (Terminal Inverted Repeats)?

Short sequences at the ends of a transposon that are recognized by transposase.
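"Inverted" means the sequence at one end is the reverse complement of the sequence at the other end. A small sketch of that check, using a made-up element and an assumed TIR length of 8 bp:

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Complement each base, then reverse the strand."""
    return seq.translate(COMPLEMENT)[::-1]

def has_tirs(element, tir_len):
    """True if the last tir_len bases are the reverse complement
    of the first tir_len bases."""
    return element[-tir_len:] == reverse_complement(element[:tir_len])

# Hypothetical element: left TIR + internal sequence + right TIR.
left_tir = "CAGGGTTC"
element = left_tir + "ATTTACGAT" + reverse_complement(left_tir)
print(element, has_tirs(element, 8))
```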

48
New cards

What are TSDs (Target Site Duplications)?

Short repeated sequences created at the site where a transposon inserts.

49
New cards

What is the first step in DNA transposition?

Transposase binds to the TIRs.

50
New cards

What happens after transposase binds to the transposon?

The DNA is cut and the transposon is excised.

51
New cards

What happens during insertion of a DNA transposon?

The transposon is inserted into a new site, creating target site duplications (TSDs).

52
New cards

What is the role of TIRs, transposase, and TSDs in DNA transposition?

TIRs allow recognition, transposase moves the element, and TSDs are created upon insertion.
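The cut-and-paste steps can be sketched on plain strings. This is a simplified model under stated assumptions: the sequences are invented, the element is removed cleanly from the donor (real excision often leaves a small footprint), and the target site duplication is modelled by copying the bases at the insertion site to both sides of the element.

```python
def excise(donor, element):
    """'Cut': remove the element from the donor sequence."""
    start = donor.index(element)
    return donor[:start] + donor[start + len(element):]

def insert_with_tsd(target, element, site, tsd_len):
    """'Paste': insert the element at `site`; the tsd_len bases at the
    site end up on both sides of it (the target site duplication)."""
    tsd = target[site:site + tsd_len]
    return target[:site] + tsd + element + tsd + target[site + tsd_len:]

element = "CAGGTTTTCCTG"              # hypothetical transposon
donor = "GGG" + element + "TTT"       # old location
donor_after = excise(donor, element)
target_after = insert_with_tsd("AAACCGTTT", element, site=3, tsd_len=3)
print(donor_after)
print(target_after)
```

The element count stays at one overall, which matches the point that DNA transposons move without increasing copy number.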

53
New cards

What are Alu elements and why are they important?

Alu elements are a family of retrotransposons found in primates; they are the most abundant repetitive elements in the human genome (~1 million copies, ~10% of the genome). Many are transcribed, and some may regulate gene expression, especially through CpG-rich regions involved in epigenetics.

54
New cards

How can transposable elements influence recombination?

Multiple similar TE copies can promote recombination between different regions, causing genomic rearrangements.

55
New cards

What happens when a transposable element inserts into a protein-coding sequence?

It can disrupt the gene and block protein production.

56
New cards

What happens when a transposable element inserts into a regulatory region?

It can increase or decrease gene expression.

57
New cards

How can transposable elements move genes within the genome?

They can carry genes or gene fragments to new genomic locations.

58
New cards

How do transposable elements contribute to gene silencing?

They can trigger mechanisms like DNA methylation and non-coding RNAs, which may also silence nearby genes.

59
New cards

What is the overall impact of transposable elements on genome evolution?

They usually cause harmful changes, but can sometimes create beneficial genetic variation.

60
New cards

List major effects of transposable elements on the genome.

Recombination, gene disruption, altered gene regulation, gene movement, and gene silencing; mostly harmful but occasionally beneficial.

61
New cards

What is junk/selfish DNA?

Noncoding DNA that allegedly has no function.

62
New cards

What is epigenetics?

Heritable changes of genetic information not caused by changes in the DNA sequence.

63
New cards

Describe 3 epigenetic mechanisms.

DNA methylation

How it works: A methyl group (CH3) is added to DNA.

Effect: Usually represses gene expression (makes DNA inaccessible).

Histone modification

How it works: Chemical tags attach to histone "tails".

Effect: Changes how tightly DNA is wrapped. Tight = off; loose = on.

RNA interference

How it works: Non-coding RNAs (ncRNAs) are used.

Effect: Regulates gene expression post-transcriptionally.

64
New cards

Describe DNA methylation and its effect (an epigenetic mechanism).

How it works: A methyl group is added to DNA.

Effect: Makes the DNA inaccessible, usually repressing gene expression.

65
New cards

Describe histone modification and its effect (an epigenetic mechanism).

How it works: Chemical tags attach to histone “tails”, modifying them.

Effect: Changes the folding of DNA. Tightly folded DNA is “off” and loosely folded DNA is “on”.

66
New cards

What role do histone modifications play in gene regulation?

They control whether DNA is folded (compact) or unfolded (accessible), affecting gene expression.

67
New cards

What happens when DNA is unfolded?

It becomes accessible to regulatory elements and proteins.

68
New cards

What happens when DNA is folded?

Regulatory elements are not accessible, and gene expression is reduced or silenced.

69
New cards

How does DNA methylation affect gene expression?

Methylation reduces access to regulatory elements, decreasing gene expression.

70
New cards

What happens when DNA is unmethylated and unfolded?

Regulatory elements are accessible, allowing transcription factors to activate gene expression.

71
New cards

What is genomic imprinting?

Gene expression that depends on whether the allele is inherited from the mother or father.

72
New cards

How does genomic imprinting affect gene expression?

One parental allele is silenced, so only one copy of the gene is expressed.

73
New cards

What mechanism usually causes imprinting?

DNA methylation that silences one allele.

74
New cards

Give an example of an imprinted gene.

IGF2, where only the paternal allele is expressed.

75
New cards

First generation genome sequencing.

  1. What is first-generation sequencing and how does it work?

  2. What are the key features and limitations of Sanger sequencing?

  3. What is a major historical example of Sanger sequencing use?

  1. Sanger sequencing uses a modified PCR reaction with normal dNTPs and fluorescent chain-terminating ddNTPs, which stop DNA strand synthesis and allow sequence detection.

  2. It requires prior genetic information (specific primers), has low throughput, and is not suitable for large-scale sequencing.

  3. The Human Genome Project, which cost about $3 billion, took 15 years, and involved 18 countries.
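The chain-termination idea in answer 1 can be mimicked in a few lines. This is a toy simulation rather than a model of the real chemistry: it assumes exactly one terminated fragment per position and perfect separation by size.

```python
import random

def sanger_read(template):
    """Simulate chain termination on a template and read the newly
    synthesised strand in the 5'->3' direction."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    # Strand made by the polymerase (reverse complement of the template).
    new_strand = "".join(complement[b] for b in reversed(template))
    # One fragment per position; each ends in a labelled ddNTP.
    fragments = [new_strand[:i] for i in range(1, len(new_strand) + 1)]
    random.Random(0).shuffle(fragments)   # fragments are mixed in the tube
    # Electrophoresis: shortest fragments run first.
    fragments.sort(key=len)
    # The fluorescent colour of each terminal base gives the sequence.
    return "".join(f[-1] for f in fragments)

print(sanger_read("ATGC"))
```

Reading the terminal base of each fragment from shortest to longest recovers the synthesised strand one base at a time.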

76
New cards

Second generation genome sequencing.

  1. What is second-generation sequencing?

  2. What are the key features of second-generation sequencing?

  3. What are its practical advantages?

  1. Next-generation sequencing (e.g. Illumina) that does not require prior genetic information and sequences DNA in parallel from fragmented samples.

  2. High throughput, high sensitivity, and quantitative data output.

  3. A human genome can be sequenced for around $1,000 in under a week in a single lab.

77
New cards

Third generation genome sequencing.

  1. What is third-generation sequencing?

  2. What are the key features of third-generation sequencing?

  1. Single-molecule sequencing (e.g. Nanopore) that reads DNA directly in real time without prior genetic information.

  2. Single-molecule sequencing, long reads, real-time analysis, and no need for shotgun sequencing.

78
New cards

What are orthologous genes a product of?

A product of speciation.

79
New cards

What are paralogous genes a product of?

Duplication within a genome.

80
New cards

What does BLAST stand for?

Basic Local Alignment Search Tool

81
New cards

What is BLAST used for?

Comparing DNA or protein sequences to find regions of similarity in a database.

82
New cards

How does BLAST work in simple terms? Why is BLAST sometimes called “Google for genes”?

It takes a query sequence and searches a database to find similar sequences (“hits”) above a statistical threshold.

Because it lets you input a sequence and find matching or similar sequences in a large database.

83
New cards

Who developed BLAST and when?

Altschul et al. in 1990.

84
New cards

Describe and explain the steps of BLAST.

  1. Split the query into short “words”

  • The sequence is broken into small fragments (e.g. 3-letter words for proteins; longer words, 11 letters by default in blastn, for DNA)

  2. Search the database for matching words

  • BLAST scans sequence databases for identical or similar words

  • Similarity is evaluated using a scoring matrix (especially for proteins)

  3. Extend the matches

  • When a match is found, BLAST extends it in both directions

  • This builds longer alignments if the similarity is strong enough

  4. Score and evaluate significance

  • All alignments are scored

  • BLAST calculates an E-value (Expect value):

    • Represents how many matches would be expected by chance

    • Lower E-value = more significant match

    • E-value of 0 = extremely strong match, virtually no chance of random similarity
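The word, search, extend, and score steps above can be sketched as a toy seed-and-extend search. This is not BLAST itself: it assumes exact word matches only, extends only through identical letters (real BLAST tolerates mismatches and uses a scoring matrix), scores one point per matched letter, and uses a fixed score cut-off in place of an E-value. The word length `w`, the cut-off, and the sequences are illustrative.

```python
def seed_and_extend(query, subject, w=3, min_score=5):
    """Return sorted (score, query_start, subject_start) for ungapped
    exact matches grown from shared words of length w."""
    # Steps 1-2: index every word of the subject by its position.
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)

    hits = set()
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            # Step 3: extend the seed to the left, then to the right.
            qs, ss = i, j
            while qs > 0 and ss > 0 and query[qs - 1] == subject[ss - 1]:
                qs -= 1
                ss -= 1
            qe, se = i + w, j + w
            while qe < len(query) and se < len(subject) and query[qe] == subject[se]:
                qe += 1
                se += 1
            hits.add((qe - qs, qs, ss))
    # Step 4: keep only hits that reach the score threshold.
    return sorted(h for h in hits if h[0] >= min_score)

result = seed_and_extend("GATTACAGG", "CCGATTACATT")
print(result)
```

Several seeds grow into the same 7-letter match, which is reported once; a short chance match of 3 letters falls below the cut-off and is dropped.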

85
New cards

What is the difference between identity, similarity, and homology in sequence comparison?

Identity is exact matches of amino acids/nucleotides; similarity is chemically similar residues based on scoring matrices (e.g. PAM, BLOSUM); homology is an evolutionary relationship meaning shared common ancestry.
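A small sketch of the difference between identity and similarity for two already aligned, gap-free protein sequences. The residue groups below are a hypothetical simplification standing in for the positive scores of a real matrix such as BLOSUM; the sequences are invented.

```python
# Assumed groups of chemically similar amino acids (illustrative only).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"),
                  set("DE"), set("ST"), set("NQ")]

def identity_and_similarity(a, b):
    """Percent identity and percent similarity of two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    identical = sum(x == y for x, y in zip(a, b))
    similar = sum(
        x == y or any(x in g and y in g for g in SIMILAR_GROUPS)
        for x, y in zip(a, b)
    )
    return 100 * identical / len(a), 100 * similar / len(a)

identity, similarity = identity_and_similarity("MKVLDE", "MRILDS")
print(identity, round(similarity, 1))
```

Similarity is always at least as high as identity, because identical residues also count as similar. Neither number by itself establishes homology.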

86
New cards

Can homology be expressed as a percentage?

No, homology is qualitative—sequences are either homologous or not.

87
New cards

What are orthologs and paralogs?

Orthologs are homologous genes in different species; paralogs are homologous genes within the same species due to gene duplication.

88
New cards

What are sequence databases?

Online repositories that store DNA, RNA, and protein sequences for search and analysis.

89
New cards

Give examples of sequence databases.

GenBank, EBI databases, ARB-SILVA, and Pfam.

90
New cards

What are genome browsers used for?

To visually explore genomes, including gene location, structure, and function.

91
New cards

Give examples of genome browsers.

Ensembl and Genomicus.

92
New cards

What is GenBank?

GenBank is an open-access sequence database, established in 1982, that contains an annotated collection of all publicly available nucleotide sequences and their corresponding protein translations.

93
New cards

What is ARB-SILVA and what is it used for?

ARB-SILVA is a database of aligned ribosomal RNA sequences (16S/18S SSU and 23S/28S LSU) from all three domains of life (Bacteria, Archaea, and Eukarya), commonly used for environmental sequencing.

94
New cards

What is Genomicus used for?

Genomicus is a genome browser that visualises synteny (conserved gene order/physical co-localisation of genes) and uses it to study genome evolution across different species.

95
New cards

What does the protein SRGAP2 stand for and what is it involved in? LABS

SLIT-ROBO Rho GTPase-activating protein 2.

It is a protein involved in neuronal migration and differentiation.

It plays a critical role in synaptic development.

96
New cards

How are human genes written differently to proteins? LABS

Human gene names use capital letters and italics, while the name of the corresponding protein uses uppercase without italics.

97
New cards

What is FASTA formatting? LABS

FASTA is a sequence format in which the first line contains the greater-than symbol (>) followed by the name of the sequence, and the following line or lines contain the primary sequence of the gene or protein.
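A minimal sketch of reading FASTA text into a dictionary of name to sequence. The record names and sequences are placeholders, and the parser assumes well-formed input in which sequences may wrap over several lines.

```python
def parse_fasta(text):
    """Map each header (text after '>') to its full sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue                      # skip blank lines
        if line.startswith(">"):
            name = line[1:].strip()       # header line starts a new record
            records[name] = []
        elif name is not None:
            records[name].append(line)    # sequence line(s) for that record
    return {n: "".join(parts) for n, parts in records.items()}

example = """>seq1 example gene
ATGGCC
TTAA
>seq2 example protein
MKVLDE
"""
records = parse_fasta(example)
print(records)
```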

98
New cards

On BLAST, in the graphical representations, which colours indicate high-similarity hits and which colours indicate low-similarity hits? LABS

Black for low similarity

Red for high similarity

99
New cards

What are orthologous genes? LABS

Orthologous genes are versions of the same gene in the genomes of different species; the different versions arose through speciation events.

100
New cards

What are paralogous genes? LABS

Paralogous genes are copies of a gene within the genome of one or more species, and appear by gene duplication.