Lecture 1: proteins, pKa, amino acids

Structure of a typical amino acid

  • There are 20 common amino acids found in proteins; they differ according to their side chains (R group).

  • The amino group and the carboxyl group are ionizable (i.e., can gain or lose H+).

Amino acid chirality

  • Amino acids have 1 chiral centre (the α-carbon), so each exists as two mirror-image forms that cannot be superimposed.

  • A chiral carbon is bonded to 4 DIFFERENT atoms or groups.

Properties of amino acids

  • Properties are conferred by the R group.

  • They are chiral (all except glycine, whose R group is H, so there are only 3 different groups bonded to the central C atom).

  • L-amino acids and D-amino acids are mirror images (enantiomers) and cannot be superimposed.

  • Only L-amino acids are found in proteins in cells; D-amino acids are found in the peptidoglycan cell walls of some bacteria, but not in bacterial proteins.

  • D/L describe configuration relative to glyceraldehyde, with (+)-glyceraldehyde defined as the D form; all L-amino acids have the same steric configuration as L-glyceraldehyde.

Enantiomers

  • Enantiomers are 2 stereoisomers that are non-superimposable mirror images of each other.

  • They have the same physical and chemical properties in an achiral environment but rotate plane-polarized light in opposite directions.

  • A mixture of equal concentrations of 2 enantiomers is called a racemic mixture.

  • The specific rotation is zero for a racemate because the rotations cancel out.

  • Relevance to drugs: only one enantiomer may have activity or one may be toxic.

Acidic R Groups

  • Examples include L-glutamic acid (Glu or E).

  • In structural drawings, the ionizable feature of the R group is the side-chain carboxyl group, which can lose a proton (COOH → COO−).

Neutral-Nonpolar R Groups

  • R groups are hydrocarbon; overall neutral and nonpolar.

  • Includes glycine (Gly, G), alanine (Ala, A), valine (Val, V), leucine (Leu, L), isoleucine (Ile, I), phenylalanine (Phe, F), methionine (Met, M), proline (Pro, P), cysteine (Cys, C) with a nonpolar thiol group.

  • Cysteine’s -SH group is nonpolar but can form weak H bonds with O and N; can form disulfide bonds to form cystine. There are 2 sulfur-containing amino acids, cysteine and methionine, but only cysteine can form disulfide bonds.

Neutral-Polar R Groups

  • R groups can hydrogen bond and are hydrophilic (water-loving).

  • Include serine (Ser, S), threonine (Thr, T), glutamine (Gln, Q), asparagine (Asn, N).

  • Serine and threonine have OH groups in their R groups; glutamine and asparagine are the amides of glutamic acid and aspartic acid, respectively.

Acidic amino acids

  • There are 2 acidic amino acids: glutamic acid (Glu, E) and aspartic acid (Asp, D).

  • When ionized (COO−), they are called glutamate and aspartate.

  • In glutamic acid and aspartic acid, both COOH groups (the α-carboxyl and the R-group carboxyl) can ionize, giving glutamate and aspartate, respectively.

  • Note: the drawings often show non-ionized forms; at physiological pH, the COOH groups are deprotonated (COO−) and the α-amino group is protonated (NH3+).

Basic amino acids

  • There are 3 basic amino acids: arginine (Arg, R), lysine (Lys, K), and histidine (His, H).

  • Arginine can be protonated at the guanidino group; lysine's R group NH2 can be protonated to NH3+; histidine's imidazole ring can be protonated (gaining H+).
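
The R-group classes listed so far can be collected into a small lookup table. The sketch below is only an illustration of that grouping: the names `R_GROUP_CLASSES`, `classify_residue`, and `class_counts` are made up for this example, the example sequence is arbitrary, and tyrosine (Y) and tryptophan (W) are left unassigned because these notes do not place them in a class.

```python
# One-letter codes grouped as in these notes (hypothetical helper names).
# Y and W are not assigned to a class in the notes, so they are omitted.
R_GROUP_CLASSES = {
    "neutral-nonpolar": set("GAVLIFMPC"),
    "neutral-polar": set("STQN"),
    "acidic": set("DE"),
    "basic": set("RKH"),
}


def classify_residue(code):
    """Return the R-group class for a one-letter amino acid code."""
    code = code.upper()
    for class_name, members in R_GROUP_CLASSES.items():
        if code in members:
            return class_name
    return "not classified in these notes"


def class_counts(sequence):
    """Count how many residues of a sequence fall into each class."""
    counts = {}
    for residue in sequence:
        label = classify_residue(residue)
        counts[label] = counts.get(label, 0) + 1
    return counts


# Arbitrary example sequence, written N-terminus first.
print(class_counts("MKTAYIAKQR"))
```

A plain dictionary of sets keeps the grouping easy to edit if a course uses a slightly different classification.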

Proline

  • The R group of proline is bonded to the amino nitrogen, so proline is technically an imino acid, not a standard amino acid.

  • Nevertheless, it is commonly found in proteins.

Hydrophobic vs hydrophilic classification

  • Two broad classes of amino acids based on R-group:

    • Hydrophobic: repelled by water; mostly interior of proteins; do not ionize or form hydrogen bonds.

    • Hydrophilic: interact with aqueous environment; often have -OH groups; frequently form hydrogen bonds; mostly on protein surfaces or in active sites of enzymes.

Titration and isoelectric point (pI)

  • The predominant ionic form of an amino acid in solution depends on pH.

  • At low pH, alanine carries a net positive charge; as pH increases, the -COOH group ionizes, leading to a zwitterion with no net charge; at high pH, the -NH3+ group loses a proton to become -NH2, giving a net negative charge.

  • The pI is the pH at which the net charge is zero.

  • R group for alanine: -CH3
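
The titration behaviour of alanine described above can be sketched as a simple rule: below the carboxyl pKa the cation predominates, above the amino pKa the anion predominates, and in between the zwitterion predominates. The function name `predominant_form` is hypothetical, and the pKa values are the approximate alanine values used elsewhere in these notes.

```python
# Approximate pKa values for alanine, as quoted in these notes.
PKA_COOH = 2.3  # alpha-carboxyl group
PKA_NH3 = 9.7   # alpha-amino group


def predominant_form(ph, pka_cooh=PKA_COOH, pka_nh3=PKA_NH3):
    """Return the predominant ionic form of alanine at a given pH.

    At a pH exactly equal to a pKa the two neighbouring forms are present
    in equal amounts; this sketch reports the zwitterion in that case.
    """
    if ph < pka_cooh:
        return "cation (net +1): COOH and NH3+"
    if ph > pka_nh3:
        return "anion (net -1): COO- and NH2"
    return "zwitterion (net 0): COO- and NH3+"


for ph in (1.0, 6.0, 12.0):
    print(ph, predominant_form(ph))
```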

Henderson-Hasselbalch equation

  • For a weak acid, HA ⇌ H+ + A−

  • Ka = [H+][A−]/[HA]

  • pKa = -log Ka

  • The relationship between pH and pKa is described by the Henderson–Hasselbalch equation:
    pH = pKa + log([A−]/[HA])

  • This enables prediction of the ratio of deprotonated (A−) to protonated (HA) forms, and therefore of charged to uncharged forms, at a given pH when the pKa is known.
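
As a minimal sketch of how the Henderson–Hasselbalch equation is used in both directions, the functions below compute the pH from a known [A−]/[HA] ratio and the ratio from a known pH. The function names are illustrative, and the pKa of 4.0 in the example is an arbitrary value, not a measured one.

```python
import math


def ph_from_ratio(pka, base_conc, acid_conc):
    """pH = pKa + log10([A-]/[HA])."""
    return pka + math.log10(base_conc / acid_conc)


def ratio_from_ph(ph, pka):
    """[A-]/[HA] = 10 ** (pH - pKa)."""
    return 10 ** (ph - pka)


def fraction_deprotonated(ph, pka):
    """Fraction of the total present as A-, i.e. [A-] / ([A-] + [HA])."""
    ratio = ratio_from_ph(ph, pka)
    return ratio / (1 + ratio)


# Arbitrary weak acid with pKa 4.0 (illustrative value only).
print(ph_from_ratio(4.0, base_conc=10.0, acid_conc=1.0))  # ten-fold excess of A-
print(ratio_from_ph(4.0, 4.0))                            # pH = pKa gives a 1:1 ratio
print(round(fraction_deprotonated(5.0, 4.0), 3))
```

Each unit of pH above the pKa multiplies the [A−]/[HA] ratio by ten, which is why groups are treated as essentially fully ionized two or more units away from their pKa.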

Why Henderson–Hasselbalch is important

  • Buffer calculations (e.g., blood acid-base status, buffer preparation in labs).

  • Calculation of ionized/unionized concentrations of chemicals, including drugs.

  • See examples and applications in pharmacology resources.
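
One buffer calculation of the kind mentioned above is the bicarbonate system in blood. The sketch below is an illustration only: the pKa of 6.1 and the CO2 solubility factor of 0.03 mmol/L per mmHg are commonly quoted textbook values that do not come from these notes, and the function name `blood_ph` is made up for this example.

```python
import math

# Assumed textbook constants (not taken from these notes).
PKA_CARBONIC = 6.1       # apparent pKa of the CO2/bicarbonate system
CO2_SOLUBILITY = 0.03    # mmol/L of dissolved CO2 per mmHg of pCO2


def blood_ph(bicarbonate_mmol_per_l, pco2_mmhg):
    """Henderson-Hasselbalch applied to the bicarbonate buffer.

    [A-] is the bicarbonate concentration and [HA] is approximated by the
    dissolved CO2 concentration (solubility factor times pCO2).
    """
    dissolved_co2 = CO2_SOLUBILITY * pco2_mmhg
    return PKA_CARBONIC + math.log10(bicarbonate_mmol_per_l / dissolved_co2)


# Example inputs: bicarbonate 24 mmol/L, pCO2 40 mmHg.
print(f"{blood_ph(24.0, 40.0):.2f}")  # prints 7.40
```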

Calculating isoelectric point (pI)

  • For alanine: pKa(COOH) ≈ 2.3; pKa(NH3+) ≈ 9.7.

  • pI(alanine) = (pKa1 + pKa2) / 2 = (2.3 + 9.7) / 2 = 6.0.

  • For glutamate or aspartate, the pI is halfway between the two carboxyl group pKa values.

  • For lysine and arginine (basic amino acids), the pI is halfway between the pKa values of the two basic groups (the α-amino group and the R group).
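
The averaging rules above can be sketched as one small function. This is an illustration under stated assumptions: the function name and parameters are hypothetical, and because these notes only give α-group pKa values for alanine, the glutamate and lysine examples reuse those values (2.3 and 9.7) as stand-ins, so the printed results are rough estimates rather than reference values.

```python
def isoelectric_point(pka_cooh, pka_amino, pka_side=None, side_is_acidic=False):
    """Estimate pI by averaging the two pKa values that flank the neutral form.

    - no ionizable R group: average the alpha-COOH and alpha-NH3+ pKa values
    - acidic R group:       average the two carboxyl pKa values
    - basic R group:        average the alpha-NH3+ and R-group pKa values
    """
    if pka_side is None:
        return (pka_cooh + pka_amino) / 2
    if side_is_acidic:
        return (pka_cooh + pka_side) / 2
    return (pka_amino + pka_side) / 2


# Alanine, using the values from these notes.
print(round(isoelectric_point(2.3, 9.7), 2))

# Glutamate and lysine: alpha-group pKa values are ASSUMED equal to
# alanine's; side-chain pKa values (about 4 and about 10) are approximate.
print(round(isoelectric_point(2.3, 9.7, pka_side=4.0, side_is_acidic=True), 2))
print(round(isoelectric_point(2.3, 9.7, pka_side=10.0), 2))
```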

pKa values for ionizable groups

  • Neutral amino acids: 2 pKa values (for the α-carboxyl and α-amino groups).

  • Acidic or basic amino acids: 3 pKa values because the R group ionizes.

  • Approximate values:

    • pKa for the COOH of R groups in aspartate/glutamate ≈ 4

    • pKa for histidine R group ≈ 6–7

    • pKa for lysine R group ≈ 10

    • pKa for arginine R group ≈ 12
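
Combining these approximate pKa values with the Henderson–Hasselbalch equation gives the fraction of each R group that carries a charge at a chosen pH. The sketch below assumes pH 7.4 as an example and takes 6.5 for histidine (the midpoint of the 6–7 range above, an assumption); the names `SIDE_CHAINS` and `fraction_charged` are hypothetical.

```python
# Approximate R-group pKa values from the list above.
# "acidic" groups are charged when deprotonated (COO-);
# "basic" groups are charged when protonated.
SIDE_CHAINS = {
    "Asp/Glu": (4.0, "acidic"),
    "His": (6.5, "basic"),   # assumed midpoint of the 6-7 range
    "Lys": (10.0, "basic"),
    "Arg": (12.0, "basic"),
}


def fraction_charged(ph, pka, kind):
    """Fraction of the group in its charged form at the given pH."""
    deprotonated = 1 / (1 + 10 ** (pka - ph))
    if kind == "acidic":
        return deprotonated
    return 1 - deprotonated


for name, (pka, kind) in SIDE_CHAINS.items():
    print(f"{name}: {fraction_charged(7.4, pka, kind):.3f} charged at pH 7.4")
```

With these inputs the acidic, lysine, and arginine side chains come out almost fully charged, while histidine is only partly protonated because its pKa lies close to the chosen pH.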

Useful references

  • See LibreTexts for amino acids/proteins: https://bio.libretexts.org/Bookshelves/Biochemistry/Book%3ABiochemistryFreeForAll(AhernRajagopalandTan)/02%3AStructureandFunction/202%3AStructureFunction-AminoAcids

  • Henderson–Hasselbalch video and related buffers resources: Khan Academy

Peptide bonds

  • Peptide bonds are rigid due to resonance stabilization, and are planar (flat) and resistant to hydrolysis.

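  • Formation is a condensation reaction (water is lost) between the carboxyl group of one amino acid and the amino group of the next; the resulting polypeptide has a free amino (N-terminal) end and a free carboxyl (C-terminal) end.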

Levels of protein structure

  • Primary structure: sequence of amino acids in a protein or peptide (the linked amino acids are residues).

    • The primary structure alone tells us nothing about 3D arrangement.

    • Convention: sequences are written starting from the N-terminus, so the first amino acid is the N-terminal residue.

  • Secondary structure: includes α-helix and β-sheet (two main types).

The α-helix

  • Held together by hydrogen bonds between the C=O group of one amino acid and the N–H group of another, four residues away (i to i+4).

  • R groups protrude; not involved in maintaining the helical hydrogen-bond network.

  • Found in almost all proteins; rigid, right-handed structure.

  • Approximately 3.6 amino acids per turn.

  • Proline tends to disrupt helices (ring restricts rotation); glycine is too small/flexible to stabilize an α-helix.
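
The two numbers given for the α-helix (3.6 residues per turn and hydrogen bonds from residue i to residue i+4) can be turned into a small geometric sketch. The function names are hypothetical and residues are numbered from 1 for illustration; the 100° step per residue simply follows from 360° divided by 3.6.

```python
RESIDUES_PER_TURN = 3.6  # value from these notes


def wheel_angles(n_residues):
    """Angular position (degrees) of each residue when viewed down the helix axis."""
    step = 360 / RESIDUES_PER_TURN  # about 100 degrees per residue
    return [round((i * step) % 360, 1) for i in range(n_residues)]


def hbond_pairs(n_residues):
    """Backbone hydrogen-bond pairs (i, i+4), with residues numbered from 1."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


print(wheel_angles(5))
print(hbond_pairs(6))  # [(1, 5), (2, 6)]
```

Because 3.6 is not a whole number, residues three or four positions apart end up on roughly the same face of the helix, which is one way to see why R groups along one side can share a character (for example, all hydrophobic).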

β-sheet

  • Forms when two or more segments of polypeptide chain (β-strands) line up side-by-side.

  • Stabilized by hydrogen bonds between N–H and C=O groups of adjacent chains.

  • Can be parallel or antiparallel (running in same or opposite directions).

  • Many globular proteins contain mixtures of α-helix and β-sheet; some proteins (e.g., silk fibroin) are β-sheet–rich; others (e.g., myosin) are mainly α-helix.

Collagen triple helix

  • Collagen is the most abundant protein in the body; major component of connective tissue.

  • Composed of a triple helix, tightly wound.

  • Every third amino acid is glycine (repeating Gly-X-Y sequence), which is small enough to fit inside the tightly wound helix; X and Y are often proline and hydroxyproline.

Tertiary structure

  • The 3D conformation adopted by proteins when they fold.

  • Folding brings distant residues in sequence into close spatial proximity, creating a compact structure.

  • Large globular proteins are often composed of domains (structurally independent parts with separate functions).

  • Example: hexokinase 3D space-filled model (enzyme that catalyses the first step in glycolysis).

Quaternary structure

  • Some proteins consist of several polypeptide chains; subunits may be identical or different.

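  • Proteins made of several subunits are oligomers; if the subunits are identical they are homo-oligomers, and if different, hetero-oligomers.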

  • Quaternary structure can be important for function (e.g., cooperative ligand binding).

  • Subunits held together by hydrophobic interactions, salt bridges, hydrogen bonds, and disulfide bonds.

  • Classic example: haemoglobin (tetramer).

Prosthetic groups and non-protein components in quaternary structure

  • Some quaternary structures involve binding to non-protein groups (prosthetic groups).

  • Examples:

    • Haem in haemoglobin (binds Fe2+, which binds O2).

    • Sugars in glycoproteins (N- and O-linked glycosylation) added in ER and Golgi.

    • Coenzymes permanently linked to enzymes (e.g., pyridoxal phosphate in transaminase enzymes).

Protein stability and interactions (summary)

  • Stabilizing forces in protein structure include:

    • Hydrophobic interactions (as proteins fold, hydrophobic R groups cluster away from water).

    • Hydrogen bonds (between R groups, and between R groups and the backbone).

    • Electrostatic interactions (salt bridges).

    • Weak van der Waals interactions.

    • Covalent bonds (disulfide bonds between cysteines; present in many extracellular proteins).

    • Posttranslational modifications (e.g., phosphorylation, glycosylation) can form covalent attachments.

Protein denaturation

  • Denaturation disrupts protein structure and often causes loss of activity.

  • Causes include:

    • Extremes of pH.

    • Organic solvents (some solvents disrupt hydrogen bonding, others disrupt hydrophobic interactions).

    • Detergents (e.g., SDS) disrupt hydrophobic interactions and unfold the molecule.

    • Reducing agents (e.g., β-mercaptoethanol) reduce disulfide bonds.

    • High temperature – hydrogen bonds are disrupted.

    • Heavy metals and high salt concentrations.

Note on terminology and resources

  • Proline is an imino acid due to its ring-linked amino nitrogen.

  • Cysteine can form disulfide bonds; two cysteines linked form cystine.

  • Aromatic amino acids (phenylalanine, tyrosine, tryptophan) absorb ultraviolet light at 280 nm and are used in protein concentration measurements by spectrophotometry.

  • Useful reference for amino acids/proteins: LibreTexts (see page link above).

  • General concept: D- and L- nomenclature relates to glyceraldehyde configuration; only L-amino acids are incorporated into proteins in most organisms.
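
The note above on absorbance at 280 nm relies on the Beer–Lambert law, A = ε × c × l, which is not spelled out in these notes. The sketch below assumes that relationship; the function name and the extinction coefficient of 50,000 M⁻¹ cm⁻¹ are hypothetical illustration values, since the real coefficient depends on how many aromatic residues a given protein contains.

```python
def concentration_from_a280(absorbance, molar_extinction, path_cm=1.0):
    """Beer-Lambert law rearranged: c = A / (epsilon * l), in mol/L."""
    if molar_extinction <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (molar_extinction * path_cm)


# Hypothetical protein with an assumed extinction coefficient (M^-1 cm^-1).
EXAMPLE_EPSILON = 50_000.0

conc = concentration_from_a280(0.5, EXAMPLE_EPSILON)
print(f"{conc * 1e6:.1f} micromolar")
```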

Quick reference values (useful anchors)

  • Neutral amino acids: typically have 2 pKa values (α-COOH and α-NH3+).

  • Acidic or basic amino acids: typically have 3 pKa values because the R group ionizes.

  • Common approximate pKa values:

    • COOH of Asp/Glu side chains: ~4

    • Imidazole (His) R-group: ~6–7

    • Aliphatic amine (Lys) R-group: ~10

    • Guanidinium (Arg) R-group: ~12

  • For alanine, pI ≈ 6.0, calculated as pI = (pKa1 + pKa2) / 2 = (2.3 + 9.7) / 2 = 6.0

Extra notes and context

  • For isoelectric point calculations of other amino acids, use the rule: acidic amino acids have pI midway between their two COOH pKa values; basic amino acids have pI midway between the pKa values of their two basic groups (the α-amino group and the R group).

  • Primary, secondary, tertiary, and quaternary structure concepts form the basis of understanding protein structure and function.

  • This compilation mirrors the content from Dr. Jane Irwin’s course materials and linked references (e.g., LibreTexts, Khan Academy).