The interrupted gene and content of the genome (anatomy of a gene and genome)
Interrupted Gene
a gene in which the coding sequence is not continuous, often due to introns
Primary RNA Transcript
the original unmodified RNA product corresponding to a transcription unit
Mature Transcript
a modified RNA transcript. Modification may include splicing as well as alterations to the 5’ and 3’ ends.
RNA Splicing
a process that excises introns from RNA and connects exons into continuous mRNA
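As a minimal sketch of what splicing does to the sequence, the snippet below joins exon segments and drops the intron between them. The function name `splice`, the coordinate format, and the toy transcript are assumptions made for illustration only; real splicing is carried out by the spliceosome, not by fixed coordinates.

```python
def splice(primary_transcript, exons):
    """Join exon segments of a primary transcript into one continuous sequence.

    exons: list of (start, end) pairs, 0-based and half-open, assumed to be
    sorted and non-overlapping (a hypothetical input format).
    """
    return "".join(primary_transcript[start:end] for start, end in exons)

# Toy pre-mRNA: exon 1, then an intron that starts with GU and ends with AG, then exon 2
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "GAAUAA"
exons = [(0, 6), (18, 24)]

mature = splice(pre_mrna, exons)
print(mature)  # AUGGCCGAAUAA
```

Everything outside the listed exon coordinates (here, positions 6 to 17) is treated as intron and left out of the mature sequence.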
Introns are removed in cis
Mutations in exons can affect polypeptide sequences; mutations in introns can affect RNA processing and may influence the sequence and/or production of a polypeptide
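Chargaff’s First Parity Rule
in duplex (double-stranded) DNA, the proportion of As equals the proportion of Ts, and the proportion of Gs equals the proportion of Cs. (Basically A% = T% and G% = C%)
Chargaff’s Second Parity Rule
within a single strand of DNA, the first rule still holds approximately (Basically A% ≈ T% and G% ≈ C%). This is consistent with the potential to extrude structured stem-loop segments from duplex DNA, a potential that would be greater in introns.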
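The two parity rules can be checked by counting bases. The sketch below is illustrative only: the helper names and the short toy sequence are assumptions, and a sequence this short is not expected to obey the second rule, which is an approximate trend seen in long natural sequences.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def base_percentages(seq):
    """Percentage of each base in a DNA sequence."""
    counts = Counter(seq)
    return {base: 100 * counts[base] / len(seq) for base in "ACGT"}

top = "AAGGAGACGT"                       # one strand (toy example)
duplex = top + reverse_complement(top)   # both strands counted together

single = base_percentages(top)
both = base_percentages(duplex)

print(single)  # single strand: A% and T% need not match in a short toy sequence
print(both)    # duplex: A% == T% and G% == C% by base pairing
```

The first rule holds exactly for the duplex because every A on one strand pairs with a T on the other. The second rule is an empirical observation about single strands and has to be tested on real, long sequences.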
Cluster Rule
on a single strand, specifically the nontemplate strand, there are more purines (As, Gs), and on the template strand you see more pyrimidines (all applying to exons). You can also get some clustering in the stem-loops of introns
GC Rule
GC base pairs have 3 hydrogen bonds and are more stable than AT pairs, which have 2. Organisms in high-stress environments often have more GC pairs, which makes their DNA more stable. GC pairs are more often found in exons.
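A small sketch of the arithmetic behind this card: GC content as a fraction, and the number of hydrogen bonds in a duplex counted as 3 per GC pair and 2 per AT pair. The function names are assumptions for illustration, and hydrogen-bond count is only a rough proxy for stability.

```python
def gc_content(seq):
    """Fraction of bases in one strand that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)

def hydrogen_bonds(seq):
    """Hydrogen bonds holding the duplex formed by seq and its complement."""
    gc_pairs = seq.count("G") + seq.count("C")
    at_pairs = seq.count("A") + seq.count("T")
    return 3 * gc_pairs + 2 * at_pairs

print(gc_content("GGAT"))       # 0.5
print(hydrogen_bonds("GGCC"))   # 12
print(hydrogen_bonds("AATT"))   # 8
```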
Reverse transcriptase
can convert a mature RNA transcript into cDNA, which is a single-stranded DNA complementary to an RNA (in vitro). cDNA is different from genomic DNA since the introns are spliced out.
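To make "complementary to an RNA" concrete, here is a minimal sketch that writes out the first-strand cDNA for a short mRNA. The function name and the toy sequence are assumptions; the enzyme itself needs a primer and works in the test tube, which this string operation does not model.

```python
# RNA base -> complementary DNA base
RNA_TO_CDNA = str.maketrans("AUGC", "TACG")

def reverse_transcribe(mrna):
    """Return the single-stranded cDNA complementary to an mRNA, written 5' to 3'."""
    return mrna.translate(RNA_TO_CDNA)[::-1]

cdna = reverse_transcribe("AUGGCC")
print(cdna)  # GGCCAT
```

Because the template is the mature (spliced) transcript, the cDNA contains only exon sequence.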
How to map exons and introns using reverse transcriptase
Compare the primary transcript (or genomic DNA) with the mature RNA (as cDNA) by restriction mapping, electron microscopy, or sequencing to see which regions are introns and which are exons
Intron Organization
Positions of introns are usually conserved when homologous genes are compared between different organisms. The lengths of the introns may vary greatly.
Introns usually do not encode proteins
Vary by size
the sequences of introns are much less similar. Introns evolve much more rapidly than exons because they lack selective pressure to produce a polypeptide with a useful sequence.
Exons and introns under negative selection:
Exons are conserved but introns vary
Exons and introns under positive selection (new mutations are beneficial)
Exons vary but introns are conserved. Individuals with an advantageous mutation survive relative to others without the mutation. Due to intrinsic genomic pressures that conserve the potential to extrude stem-loops from duplex DNA, introns evolve more slowly than exons under positive selection.
Gene Size
Genes vary in size
Gene size does NOT correlate with complexity
Unicellular organisms typically have more uninterrupted genes than multicellular eukaryotes
Exons are usually short, typically coding for fewer than 100 amino acids
Introns are short in unicellular eukaryotes but can be much longer in multicellular organisms
Overall gene size is determined mainly by intron size
Pleiotropy
more than one independent trait from one gene sequence
Overlapping genes
a gene in which part of the sequence is found within a part of the sequence of another gene. If the reading frame changes, the polypeptide sequence can too.
Alternative splicing
different combinations of exons from the same primary transcript are joined, which leads to different mRNAs and therefore different proteins being made.
Some exons correspond to protein functional domains
Proteins consist of independent functional modules, the boundaries of which correspond to those of exons
Sometimes evolution is limited by what is available. Natural selection doesn't always produce the BEST possible solution, but rather the best of what the genome provides. Genes that aren’t the best but are kept can often be repurposed.
Exons of genes can appear homologous to exons of others, which can suggest a common exon ancestor
Gene Family
A set of genes within a genome that encode related or identical proteins or RNAs
Members are derived by duplication of an ancestral gene, followed by accumulation of changes in sequence between copies
Most members are related but NOT identical
All globins have a common form of organization with 3 exons and 2 introns, which suggests they all descended from a single ancestral gene
Intron positions in the actin gene family are highly variable, which suggests that intron placement is not tied to gene function; the genes still resemble one another, which suggests common descent.
Pseudogenes
junk DNA that looks like a gene but has lost its function over time. Tells you a lot about evolutionary history. Can serve as a source of parts for other functional genes.
Sub-functionalization
after a gene is duplicated, each copy loses part of the original function, so the two partly degraded copies work together to do the job of the one whole ancestral gene
Neo-functionalization
duplicates are free to mutate to do something new but related
Superfamily
gene families, but with more variation. One notch above gene families (Nested hierarchy).
Many forms of Information in DNA
Genetic information includes not only that related to characters corresponding to the conventional phenotype but also that related to characters (pressures) corresponding to the genome phenotype
Positional information might be important to development
Genetic sequence can be transferred horizontally from other species to the germ line, where it can land in introns OR intergenic DNA and then be passed on vertically through generations.
Genome
the complete set of sequences in the genetic material of an organism, including the sequence of each chromosome plus any DNA in organelles.
Transcriptome
the complete set of RNAs present in a cell, tissue, or organism. Its complexity is due mostly to mRNAs, but it also includes noncoding RNAs. Pretty fluid, since many different mRNAs are made at different times to produce different proteins for different functions.
Proteome
the complete set of proteins that is expressed by the entire genome. Sometimes used to describe the complement of proteins expressed by a cell at any one time.
Interactome
the complete set of protein complexes/protein-to-protein interactions present in a cell, tissue, or organism. Use this to see which proteins are interacting with other proteins.
Genome Mapping
Genomes are mapped by sequencing their DNA and identifying functional genes
Linkage Maps
based on the frequency of recombination between genetic markers. The closer two genes are to each other, the less likely it is that a crossover will fall between them and separate them. Genes that are far apart are separated more often during recombination. Less commonly used now that an entire genome can be sequenced relatively easily.
Linkage groups- you hope the number of linkage groups corresponds with the number of chromosomes.
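The distance on a linkage map comes from a simple ratio: recombinant offspring divided by total offspring. The sketch below assumes the usual convention that 1% recombination corresponds to 1 map unit (centimorgan), which is a reasonable approximation only for markers that are fairly close together. The function name and the offspring counts are made up for illustration.

```python
def recombination_frequency(recombinants, total):
    """Percentage of offspring that carry a recombinant combination of two markers."""
    return 100 * recombinants / total

# Hypothetical test cross: 1000 offspring scored, 80 of them recombinant
rf = recombination_frequency(80, 1000)
print(rf)  # 8.0, read as roughly 8 map units between the two markers
```

Markers on different chromosomes, or very far apart on the same one, approach 50% recombination and behave as unlinked.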
Restriction Maps
based on the physical distances between markers
Annotating a genome
figuring out what genes do (function) after you sequence the genome.
Polymorphism
variation in a sequence between individuals which can be detected at the phenotypic level when a sequence affects gene function, at the restriction fragment level when it affects a restriction enzyme target site, and at the sequence level by direct analysis of DNA. The alleles of a gene show extensive polymorphism at the sequence level, but many sequence changes do not affect function.
Single Nucleotide Polymorphism (SNP)
a polymorphism caused by a change in a single nucleotide. SNPs are responsible for most of the genetic variation between individuals.
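At the sequence level, spotting a SNP amounts to comparing the same region in two individuals position by position. This is a minimal sketch that assumes the two sequences are already aligned and equal in length; the function name and the sequences are hypothetical.

```python
def find_snps(seq_a, seq_b):
    """List (position, base_in_a, base_in_b) wherever two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return [(i, a, b) for i, (a, b) in enumerate(zip(seq_a, seq_b)) if a != b]

person_1 = "ATGGCCTTAGC"
person_2 = "ATGGCTTTAGC"

snps = find_snps(person_1, person_2)
print(snps)  # [(5, 'C', 'T')]
```

Positions are 0-based, so the single difference here is at the sixth base.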
Haplotype
a particular combination of alleles in a defined region of some chromosome; in effect, the genotype in miniature. Originally used to describe combinations of Major Histocompatibility Complex (MHC) alleles, it may now be used to describe particular combinations of RFLPs, SNPs, or other markers.
Genome Wide Association Studies
researchers can identify SNPs that are found more frequently in patients with a particular disorder. Compare people with the disease to people without it to look for disease-associated SNPs
DNA Profiling
a technique that analyzes the differences between individuals in the fragments generated by PCR or by using restriction enzymes to cleave regions that contain short repeated sequences. Lengths of the repeated regions are unique to every individual. The presence of a particular subset in any two individuals can be used to define their common inheritance. (ex: parent to child relationship, 23andMe)
Repetition in DNA sequences
The kinetics of DNA reassociation after a genome has been denatured distinguish sequences by their frequency of repetition in the genome.
Polypeptides are generally encoded by sequences in non-repetitive DNA
Larger genomes within a taxonomic group do not contain more genes but have large amounts of repetitive DNA
C-Value paradox
genome size does not correlate with the complexity of the organism, so you can’t measure complexity based on genome size.
Satellite DNA
Junk DNA that has repeating fragments (tandem repeats) that do not code for anything. Can cause problems if they get big enough in functional regions (ex: cause Huntington's disease). Can be good markers though, and seem to have some sort of structural function.
DNA that consists of many tandem repeats (identical or related) of short basic repeating units. Largest satellite variant.
Highly repetitive DNA (or satellite DNA) has a very short repeating sequence and no coding function
Satellite DNA occurs in large blocks that can have distinct physical properties
Satellite DNA is found more often in heterochromatin
Satellite DNA is often the major constituent of centromeric heterochromatin
Transposons
jumping genes that move by cut and paste or copy and paste. DNA that codes for proteins which insert the element into other parts of the genome. Accounts for much of the repetitive DNA.
Pseudogenes
inactive but stable components of the genome derived by mutation of an ancestral active gene. Usually, they are inactive because of mutations that block transcription or translation or both.
Gene cluster
a group of adjacent genes that are identical or related
Minisatellite
DNAs consisting of tandemly repeated copies of a short repeating sequence, with more repeat copies than a microsatellite but fewer than a satellite; the length of the repeating unit is measured in tens of base pairs. The number of repeats varies between individual genomes. Used for paternity testing and fingerprinting.
Ribosomal RNA (rRNA)
encoded by a large number of identical genes that are tandemly repeated to form one or more clusters.
ribosomal DNA (rDNA)
cluster that is organized so that transcription units giving a joint precursor to the major rRNAs alternate with nontranscribed spacers. The genes in an rDNA cluster all have identical sequences.
nontranscribed spacers
consist of shorter repeating units whose number varies, so that the lengths of individual spacers differ. They give things room to move, but do not code for anything.
Unequal crossing-over
changes the size of a cluster of tandem repeats.
Individual repeating units can be eliminated or can spread through the cluster
Crossing over can lead to one copy being longer than the other, depending on where the cuts are made
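A toy model can show why unequal crossing-over changes cluster size. Each chromosome is represented as a list of repeat units, and the crossover joins the left part of one to the right part of the other at misaligned positions. The function name, the list representation, and the break points are assumptions for illustration.

```python
def unequal_crossover(array_a, array_b, break_a, break_b):
    """Exchange the right-hand portions of two tandem repeat arrays.

    break_a and break_b are the numbers of repeat units to the left of the
    crossover point on each chromosome; they differ when the repeats mispair.
    """
    recombinant_1 = array_a[:break_a] + array_b[break_b:]
    recombinant_2 = array_b[:break_b] + array_a[break_a:]
    return recombinant_1, recombinant_2

chrom_a = ["repeat"] * 5
chrom_b = ["repeat"] * 5

longer, shorter = unequal_crossover(chrom_a, chrom_b, 4, 2)
print(len(longer), len(shorter))  # 7 3
```

The total number of units is unchanged (10), but one product gains units and the other loses them. With equal break points both products would keep 5 units.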
Concerted evolution (coincidental evolution)
the ability of two or more related genes to evolve together as though constituting a single locus.
Simple Sequence DNA
short repeating units of DNA sequence
Heterochromatin
Tightly coiled regions of chromosomes that are mostly non-transcribed. More satellite DNA is found here.
Cryptic Satellite
a satellite DNA sequence not identified as such by a separate peak on a density gradient. It remains present in the main-band DNA
Centrifugation of the satellite band
Can separate the main band and the satellite band by centrifugation through a density gradient of CsCl.
Euchromatin
regions that comprise most of the genome in the interphase nucleus, are less tightly coiled and contain most of the active or potentially active single-copy genes.
Arthropod Satellites
have very short, identical repeats. Only a few nucleotides long, most copies of the sequence are identical
Mammalian Satellites
have hierarchical repeats, longer basic repeating units.
Minisatellites for DNA profiling
Not used as much now as they used to be
Used for paternity tests, by showing 50% of the bands in an individual are inherited from a particular parent.
Variable number tandem repeat (VNTR)- very short repeated sequences, including microsatellites and minisatellites
Convenient because if the repeat changes sizes, you can get different-sized bands.
SNPs took a long time to be used for testing because, even though there are more of them, they are harder to test.
DNA Profiling- analysis of the differences between individuals in restriction fragments or PCR products that contain short repeated sequences. Lengths of repeated regions are unique to individuals, so the presence of a particular subset in any two individuals shows their common inheritance.
If the alignment gets off because of the repeats, slippage can occur during replication, which causes a loop to form and then increases (or decreases) the number of repeated segments/microsatellite units copied
Variable number tandem repeat (VNTR)
very short repeated sequences, including microsatellites and minisatellites. Convenient because if the repeat changes sizes, you can get different-sized bands.
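The reason different repeat numbers give different-sized bands can be sketched by counting tandem copies of a repeat unit and comparing fragment lengths. The function name, the repeat unit `GATA`, and the flanking sequences below are hypothetical, and the fragment is assumed to span the flanks plus the repeats.

```python
def longest_tandem_run(seq, unit):
    """Largest number of back-to-back copies of unit found anywhere in seq."""
    best = 0
    for start in range(len(seq)):
        run = 0
        pos = start
        while seq.startswith(unit, pos):
            run += 1
            pos += len(unit)
        best = max(best, run)
    return best

# Two hypothetical alleles of one locus: same flanks, different repeat numbers
allele_short = "TTC" + "GATA" * 3 + "CCG"
allele_long = "TTC" + "GATA" * 7 + "CCG"

for allele in (allele_short, allele_long):
    print(longest_tandem_run(allele, "GATA"), "repeats,", len(allele), "bp fragment")
```

The two alleles give fragments of 18 bp and 34 bp, which would run as separate bands on a gel.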
DNA Profiling
analysis of the differences between individuals in restriction fragments or PCR products that contain short repeated sequences. Lengths of repeated regions are unique to individuals, so the presence of a particular subset in any two individuals shows their common inheritance.
Eukaryotic Genome
Mitochondria and chloroplasts have their own genomes, which show non-Mendelian inheritance and are typically MATERNALLY inherited.
Organelle genomes can undergo somatic segregation in plants.
Extranuclear genes- genes that reside outside of the nucleus, such as in the mitochondria and chloroplasts.
Endosymbiotic theory- the theory that mitochondria (and chloroplasts) descended from free-living prokaryotes that came to live inside of eukaryotic cells
Human mitochondrial DNA suggests that all modern human mtDNA descended from a single population that existed around 200,000 years ago in Africa.