Lecture 1: proteins, pKa, amino acids
Structure of a typical amino acid
There are 20 common amino acids found in proteins; they differ according to their side chains (R group).
The amino group and the carboxyl group are ionizable (i.e., can gain or lose H+).
Amino acid chirality
An amino acid and its mirror image are non-superimposable because the α-carbon is a chiral centre.
A chiral carbon is bonded to 4 DIFFERENT atoms or groups.
Properties of amino acids
Properties are conferred by the R group.
They are chiral (all except glycine, whose R group is H, so there are only 3 different groups bonded to the central C atom).
L-amino acids and D-amino acids are mirror images (enantiomers) and cannot be superimposed.
Only L-amino acids are found in proteins in cells; D-amino acids are found in the peptidoglycan cell walls of some bacteria, but not in bacterial proteins.
D/L relate to configuration relative to (+) glyceraldehyde; all L-amino acids have the same steric configuration as L-glyceraldehyde.
Enantiomers
Enantiomers are 2 stereoisomers that are non-superimposable mirror images of each other.
They have the same physical and chemical properties (in an achiral environment) but rotate plane-polarized light in opposite directions.
A mixture of equal concentrations of 2 enantiomers is called a racemic mixture.
The specific rotation is zero for a racemate because the rotations cancel out.
Relevance to drugs: only one enantiomer may have activity or one may be toxic.
Acidic R Groups
Examples include L-glutamic acid (Glu or E).
The side-chain carboxyl group is ionizable: it can lose a proton (COOH → COO− + H+).
Neutral-Nonpolar R Groups
R groups are largely hydrocarbon; they are neutral overall and nonpolar.
Includes glycine (Gly, G), alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), phenylalanine (Phe, F), methionine (Met, M), proline (Pro, P), cysteine (Cys, C) with a nonpolar thiol group.
Cysteine’s -SH group is nonpolar but can form weak H bonds with O and N; can form disulfide bonds to form cystine. There are 2 sulfur-containing amino acids, cysteine and methionine, but only cysteine can form disulfide bonds.
Neutral-Polar R Groups
R groups can hydrogen bond and are hydrophilic (water-loving).
Include serine (Ser, S), threonine (Thr, T), glutamine (Gln, Q), asparagine (Asn, N).
Serine and threonine have OH groups in their R groups; glutamine and asparagine are amides of glutamic and aspartic acid.
Acidic amino acids
There are 2 acidic amino acids: glutamic acid (Glu, E) and aspartic acid (Asp, D).
When ionized (COO−), they are called glutamate and aspartate.
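In glutamic acid and aspartic acid, both COOH groups (the α-carboxyl and the R-group carboxyl) can ionize, giving glutamate and aspartate, respectively.
Note: the drawings often show non-ionized forms; at physiological pH, the COOH groups are ionized (COO−) and the amino group is protonated (NH3+).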
Basic amino acids
There are 3 basic amino acids: arginine (Arg, R), lysine (Lys, K), and histidine (His, H).
Arginine can be protonated at the guanidino group; lysine’s R group NH2 can be protonated to NH3+; histidine’s imidazole ring can gain H+.
Proline
The R group of proline is bonded to the amino nitrogen, so proline is technically an imino acid, not a standard amino acid.
Nevertheless, it is commonly found in proteins.
Hydrophobic vs hydrophilic classification
Two broad classes of amino acids based on R-group:
Hydrophobic: repelled by water; mostly interior of proteins; do not ionize or form hydrogen bonds.
Hydrophilic: interact with aqueous environment; often have -OH groups; frequently form hydrogen bonds; mostly on protein surfaces or in active sites of enzymes.
Titration and isoelectric point (pI)
The predominant ionic form of an amino acid in solution depends on pH.
At low pH, alanine carries a net positive charge; as pH increases, the -COOH group ionizes, leading to a zwitterion with no net charge; at high pH, the -NH3+ group loses a proton to become -NH2, giving a net negative charge.
The pI is the pH at which the net charge is zero.
R group for alanine: -CH3
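The charge changes described above can be sketched numerically. The example below is a minimal illustration, not a definitive model: it assumes the approximate alanine pKa values quoted later in these notes (2.3 and 9.7), treats the two groups as independent, and uses hypothetical function names.

```python
# Sketch: average net charge of alanine versus pH.
# The pKa values are the approximate ones quoted for alanine in these notes.
PKA_CARBOXYL = 2.3  # alpha-COOH <-> COO- + H+
PKA_AMINO = 9.7     # alpha-NH3+ <-> NH2 + H+


def fraction_deprotonated(ph: float, pka: float) -> float:
    """Fraction of an ionizable group that has lost its proton at this pH."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def alanine_net_charge(ph: float) -> float:
    """Average net charge per alanine molecule (groups treated as independent)."""
    carboxyl_charge = -fraction_deprotonated(ph, PKA_CARBOXYL)  # between 0 and -1
    amino_charge = 1.0 - fraction_deprotonated(ph, PKA_AMINO)   # between +1 and 0
    return carboxyl_charge + amino_charge


if __name__ == "__main__":
    for ph in (1.0, 6.0, 12.0):
        print(f"pH {ph:4.1f}: net charge {alanine_net_charge(ph):+.2f}")
```

At low pH the result is close to +1, near pH 6 it is close to zero (the zwitterion), and at high pH it approaches −1.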
Henderson-Hasselbalch equation
For a weak acid, HA ⇌ H+ + A−
Ka = [H+][A−]/[HA]
pKa = -log Ka
The relationship between pH and pKa is described by the Henderson–Hasselbalch equation:
pH = pKa + log([A−]/[HA])
This enables prediction of the ratio of charged to uncharged forms at a given pH, given pKa.
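A minimal sketch of how the equation can be used numerically, assuming its standard form pH = pKa + log([A−]/[HA]) for the weak acid HA defined above. The function names are hypothetical, and the pKa of 4.0 in the usage lines is an illustrative value, not a measured one.

```python
import math


def ph_from_concentrations(pka: float, base_conc: float, acid_conc: float) -> float:
    """pH of a solution containing [A-] = base_conc and [HA] = acid_conc."""
    return pka + math.log10(base_conc / acid_conc)


def ratio_base_to_acid(ph: float, pka: float) -> float:
    """[A-]/[HA] at a given pH, obtained by rearranging the equation."""
    return 10.0 ** (ph - pka)


def fraction_ionized(ph: float, pka: float) -> float:
    """Fraction of a weak acid present as A- at a given pH."""
    ratio = ratio_base_to_acid(ph, pka)
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    pka = 4.0  # illustrative weak acid
    for ph in (3.0, 4.0, 5.0, 7.4):
        print(f"pH {ph}: [A-]/[HA] = {ratio_base_to_acid(ph, pka):.3g}, "
              f"fraction ionized = {fraction_ionized(ph, pka):.3f}")
```

When pH equals pKa the two forms are present in equal amounts; each pH unit above the pKa multiplies the [A−]/[HA] ratio by ten.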
Why Henderson–Hasselbalch is important
Buffer calculations (e.g., blood acid-base status, buffer preparation in labs).
Calculation of ionized/unionized concentrations of chemicals, including drugs.
See examples and applications in pharmacology resources.
Calculating isoelectric point (pI)
For alanine: pKa(COOH) ≈ 2.3; pKa(NH3+) ≈ 9.7.
pI(alanine) = (pKa1 + pKa2) / 2 = (2.3 + 9.7) / 2 = 6.0.
For glutamate or aspartate, the pI is halfway between the two carboxyl group pKa values.
For lysine and arginine (basic amino acids), the pI is halfway between the pKa values of the two basic groups (the α-amino group and the R group).
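The averaging rules above can be sketched as a small function. This is an illustration under stated assumptions: the function name and the `kind` labels are hypothetical, and the three-pKa examples reuse alanine's α-group values (2.3 and 9.7) together with rounded R-group values (about 4 and about 10), so the results are rough estimates rather than tabulated pI values.

```python
def isoelectric_point(pka_values, kind="neutral"):
    """Average the two pKa values that flank the zero-net-charge form.

    kind="neutral": two pKa values, average both.
    kind="acidic":  three pKa values, average the two lowest (both COOH groups).
    kind="basic":   three pKa values, average the two highest (both basic groups).
    """
    ordered = sorted(pka_values)
    if kind == "neutral" and len(ordered) == 2:
        pair = ordered
    elif kind == "acidic" and len(ordered) == 3:
        pair = ordered[:2]
    elif kind == "basic" and len(ordered) == 3:
        pair = ordered[1:]
    else:
        raise ValueError("expected 2 pKa values for 'neutral', 3 for 'acidic' or 'basic'")
    return sum(pair) / 2.0


if __name__ == "__main__":
    print(f"alanine:             {isoelectric_point([2.3, 9.7]):.2f}")
    # Assumed values: alpha groups as for alanine, R-group COOH ~4.
    print(f"acidic (illustrative): {isoelectric_point([2.3, 4.0, 9.7], 'acidic'):.2f}")
    # Assumed values: alpha groups as for alanine, R-group amine ~10.
    print(f"basic (illustrative):  {isoelectric_point([2.3, 9.7, 10.0], 'basic'):.2f}")
```

Sorting the pKa values first means the caller does not need to remember which group is which; only the class of amino acid decides which pair is averaged.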
pKa values for ionizable groups
Neutral amino acids: 2 pKa values (for the α-carboxyl and α-amino groups).
Acidic or basic amino acids: 3 pKa values because the R group ionizes.
Approximate values:
pKa for the COOH of R groups in aspartate/glutamate ≈ 4
pKa for histidine R group ≈ 6–7
pKa for lysine R group ≈ 10
pKa for arginine R group ≈ 12
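To see what these approximate values imply, the sketch below estimates the average charge on each R group at a chosen pH. It assumes a pH of 7.4 as "physiological", takes 6.5 as the midpoint of the 6–7 range given for histidine, and uses hypothetical names throughout; real pKa values shift with the protein environment.

```python
# Approximate R-group pKa values from the list above.
# Each entry: (pKa, charge of the protonated form).
SIDE_CHAINS = {
    "Asp/Glu carboxyl": (4.0, 0),
    "His imidazole": (6.5, +1),  # assumed midpoint of the 6-7 range
    "Lys amino": (10.0, +1),
    "Arg guanidino": (12.0, +1),
}


def average_charge(ph: float, pka: float, charge_protonated: int) -> float:
    """Average charge of a group; losing the proton lowers the charge by one."""
    fraction_protonated = 1.0 / (1.0 + 10.0 ** (ph - pka))
    return charge_protonated - (1.0 - fraction_protonated)


if __name__ == "__main__":
    ph = 7.4  # assumed physiological pH
    for name, (pka, charge_protonated) in SIDE_CHAINS.items():
        print(f"{name:18s} {average_charge(ph, pka, charge_protonated):+.2f}")
```

With these inputs the acidic side chains come out almost fully negative, lysine and arginine almost fully positive, and histidine only partly charged, which is why histidine is often the residue whose charge changes near neutral pH.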
Useful references
See LibreTexts for amino acids/proteins: https://bio.libretexts.org/Bookshelves/Biochemistry/Book%3ABiochemistryFreeForAll(AhernRajagopalandTan)/02%3AStructureandFunction/202%3AStructureFunction-AminoAcids
Henderson–Hasselbalch video and related buffers resources: Khan Academy
Peptide bonds
Peptide bonds are rigid due to resonance stabilization, and are planar (flat) and resistant to hydrolysis.
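Formation is a condensation reaction: the carboxyl group of one amino acid reacts with the amino group of the next, releasing water. The resulting polypeptide has a free amino group at one end (N-terminus) and a free carboxyl group at the other (C-terminus).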
Levels of protein structure
Primary structure: sequence of amino acids in a protein or peptide (the linked amino acids are residues).
The primary structure alone tells us nothing about 3D arrangement.
Convention: sequences are written from the N-terminus (the first amino acid) to the C-terminus.
Secondary structure: includes α-helix and β-sheet (two main types).
The α-helix
Held together by hydrogen bonds between the C=O group of one amino acid and the N–H group of another, four residues away (i to i+4).
R groups protrude; not involved in maintaining the helical hydrogen-bond network.
Found in almost all proteins; rigid, right-handed structure.
Approximately 3.6 amino acids per turn.
Proline tends to disrupt helices (ring restricts rotation); glycine is too small/flexible to stabilize an α-helix.
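The helix geometry above can be turned into simple arithmetic. The sketch below uses only the figures stated in these notes (3.6 residues per turn, hydrogen bonds from residue i to i+4); the function names are hypothetical, and the angle calculation is the idealized one used for helical-wheel diagrams.

```python
RESIDUES_PER_TURN = 3.6
DEGREES_PER_RESIDUE = 360.0 / RESIDUES_PER_TURN  # about 100 degrees per residue


def wheel_angle(position: int) -> float:
    """Angle (0-360 degrees) of a residue around the helix axis.

    Position 0 is the reference residue at 0 degrees.
    """
    return (position * DEGREES_PER_RESIDUE) % 360.0


def hydrogen_bond_partner(i: int) -> int:
    """Residue whose N-H bonds to the C=O of residue i (the i to i+4 rule)."""
    return i + 4


if __name__ == "__main__":
    for position in range(8):
        print(f"residue {position}: {wheel_angle(position):6.1f} degrees, "
              f"C=O bonds to N-H of residue {hydrogen_bond_partner(position)}")
```

Residues three or four positions apart end up at similar angles, so they sit on the same face of the helix.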
β-sheet
Forms when two or more segments of polypeptide chain line up side by side.
Stabilized by hydrogen bonds between N–H and C=O groups of adjacent chains.
Can be parallel or antiparallel (running in same or opposite directions).
Many globular proteins contain mixtures of α-helix and β-sheet; some proteins (e.g., silk fibroin) are β-sheet–rich; others (e.g., myosin) are mainly α-helix.
Collagen triple helix
Collagen is the most abundant protein in the body; major component of connective tissue.
Composed of a triple helix, tightly wound.
Every third amino acid is glycine; the chains are rich in proline and also contain hydroxyproline.
Tertiary structure
The 3D conformation adopted by proteins when they fold.
Folding brings distant residues in sequence into close spatial proximity, creating a compact structure.
Large globular proteins are often composed of domains (structurally independent parts with separate functions).
Example: hexokinase 3D space-filled model (enzyme that catalyses the first step in glycolysis).
Quaternary structure
Some proteins consist of several polypeptide chains; subunits may be identical or different.
If the subunits are identical, the protein is a homo-oligomer.
Quaternary structure can be important for function (e.g., cooperative ligand binding).
Subunits held together by hydrophobic interactions, salt bridges, hydrogen bonds, and disulfide bonds.
Classic example: haemoglobin (tetramer).
Prosthetic groups and non-protein components in quaternary structure
Some quaternary structures involve binding to non-protein groups (prosthetic groups).
Examples:
Haem in haemoglobin (binds Fe2+, which binds O2).
Sugars in glycoproteins (N- and O-linked glycosylation) added in ER and Golgi.
Coenzymes permanently linked to enzymes (e.g., pyridoxal phosphate in transaminase enzymes).
Protein stability and interactions (summary)
Stabilizing forces in protein structure include:
Hydrophobic interactions (as proteins fold, hydrophobic R groups cluster away from water).
Hydrogen bonds (between R groups, and between R groups and the backbone).
Electrostatic interactions (salt bridges).
Weak van der Waals interactions.
Covalent bonds (disulfide bonds between cysteines; present in many extracellular proteins).
Posttranslational modifications (e.g., phosphorylation, glycosylation) can form covalent attachments.
Protein denaturation
Denaturation disrupts protein structure and often causes loss of activity.
Causes include:
Extremes of pH.
Organic solvents (some solvents disrupt hydrogen bonding, others disrupt hydrophobic interactions).
Detergents (e.g., SDS) disrupt hydrophobic interactions and unfold the molecule.
Reducing agents (e.g., β-mercaptoethanol) reduce disulfide bonds.
High temperature – hydrogen bonds are disrupted.
Heavy metals and high salt concentrations.
Note on terminology and resources
Proline is an imino acid due to its ring-linked amino nitrogen.
Cysteine can form disulfide bonds; two cysteines linked form cystine.
Aromatic amino acids absorb ultraviolet light at 280 nm (mainly tryptophan and tyrosine; phenylalanine absorbs only weakly at this wavelength), and this absorbance is used to measure protein concentration by spectrophotometry.
Useful reference for amino acids/proteins: LibreTexts (see page link above).
General concept: D- and L- nomenclature relates to glyceraldehyde configuration; only L-amino acids are incorporated into proteins in most organisms.
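The 280 nm measurement mentioned above is normally converted to a concentration with the Beer–Lambert law, A = ε × c × l. The sketch below is an illustration only: the function name is hypothetical, and the extinction coefficient in the usage line is an invented value for an imaginary protein, since the real value depends on each protein's tryptophan and tyrosine content.

```python
def molar_concentration(absorbance: float,
                        extinction_coefficient: float,
                        path_length_cm: float = 1.0) -> float:
    """Beer-Lambert law rearranged: c = A / (epsilon * l), in mol/L.

    extinction_coefficient is in M^-1 cm^-1; path_length_cm is the cuvette width.
    """
    if extinction_coefficient <= 0 or path_length_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (extinction_coefficient * path_length_cm)


if __name__ == "__main__":
    # Hypothetical protein with an assumed epsilon of 50,000 M^-1 cm^-1 at 280 nm.
    conc = molar_concentration(absorbance=0.5, extinction_coefficient=50_000.0)
    print(f"{conc * 1e6:.1f} micromolar")
```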
Quick reference values (useful anchors)
Neutral amino acids: typically have 2 pKa values (α-COOH and α-NH3+).
Acidic or basic amino acids: typically have 3 pKa values because the R group ionizes.
Common approximate pKa values:
COOH of Asp/Glu side chains: ~4
Imidazole (His) R-group: ~6–7
Aliphatic amine (Lys) R-group: ~10
Guanidinium (Arg) R-group: ~12
For alanine, pI ≈ 6.0, calculated as (pKa1 + pKa2) / 2 = (2.3 + 9.7) / 2.
Extra notes and context
For isoelectric point calculations of other amino acids, use the rule: acidic amino acids have a pI midway between their two COOH pKa values; basic amino acids have a pI midway between the pKa values of their two basic groups (the α-amino group and the R group).
Primary, secondary, tertiary, and quaternary structure concepts form the basis of understanding protein structure and function.
This compilation mirrors the content from Dr. Jane Irwin’s course materials and linked references (e.g., LibreTexts, Khan Academy).