Protein
In 1958, John Kendrew and Max Perutz achieved a feat that had seemed impossible just decades earlier: they determined the three-dimensional structures of myoglobin and hemoglobin using X-ray crystallography. This breakthrough revealed that proteins were not amorphous blobs but precisely folded machines whose specific shapes dictate their function. Before this, the scientific community was largely guessing at how these molecules worked, relying on the linear sequences discovered by Frederick Sanger in 1949. Sanger had shown that proteins were linear polymers of amino acids, but the leap from a string of beads to a functional 3D object remained a mystery until Kendrew and Perutz provided atomic-level maps. The myoglobin structure, with its alpha-helices and a central heme group that binds oxygen, became the blueprint for understanding how life operates at the molecular level. This discovery did not just solve a puzzle; it launched the field of structural biology, allowing scientists to see the mechanisms of life in atomic detail for the first time.
From Albumins to Amino Acids
The story of protein science begins not with complex structures but with a simple observation of coagulation in the 1700s. Antoine Fourcroy and his contemporaries identified substances they called albumins, recognizing that these materials behaved differently from fats or sugars. By 1789, Fourcroy had distinguished three types of animal proteins: albumin, fibrin, and gelatin, while plant proteins like gluten were being isolated from wheat around 1747. The true nature of these substances remained elusive until 1838, when Dutch chemist Gerardus Johannes Mulder and Swedish chemist Jöns Jacob Berzelius collaborated to name them. Mulder performed elemental analysis and found that nearly all proteins shared the same empirical formula, leading him to erroneously believe they were all giant molecules of a single type. Berzelius coined the term protein from the Greek word meaning primary or standing in front, reflecting the belief that these were the fundamental building blocks of life. Early nutritional scientists like Carl von Voit championed the idea that flesh makes flesh, emphasizing the importance of these substances for body structure. It was not until the early 20th century that Franz Hofmeister and Hermann Emil Fischer established the polypeptide theory, proving that proteins were chains of amino acids linked by peptide bonds, a concept that would eventually replace the idea of proteins as simple colloids.
The Genetic Blueprint
The instructions for building proteins are encoded in genes and read through the genetic code, a system of three-nucleotide sets called codons that dictate the sequence of amino acids. This process begins when DNA is transcribed into messenger RNA, which is then translated into a polypeptide chain by the ribosome. The genetic code is redundant: of the 64 possible codons, 61 specify the 20 standard amino acids and three signal a stop. In some organisms, a stop codon can also be read as selenocysteine or pyrrolysine. The ribosome reads the mRNA three nucleotides at a time, matching each codon to a transfer RNA molecule carrying the correct amino acid. This synthesis proceeds from the N-terminus to the C-terminus, creating a linear chain that must fold into a specific shape to function. The rate of synthesis varies, reaching up to 20 amino acids per second in prokaryotes. Average protein size also differs across the domains of life, with eukaryotic proteins averaging 438 residues compared to 283 in archaea. The largest known protein, titin, spans almost 27,000 amino acids and forms a massive component of muscle sarcomeres. This genetic precision ensures that every protein, from the smallest peptide to the largest enzyme, is constructed with the exact sequence required to perform its specific biological role.
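The codon-by-codon reading described above can be illustrated with a minimal Python sketch. It assumes the standard genetic code only (no selenocysteine or pyrrolysine), starts at the first AUG it finds, and stops at the first stop codon. The function name `translate` and the example sequence are hypothetical and chosen for illustration; real translation involves initiation factors, tRNAs, and reading-frame context that this sketch ignores.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon, listed in U, C, A, G order
# for the first, second, and third base. "*" marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna: str) -> str:
    """Translate an mRNA string (A, C, G, U) into one-letter amino acids.

    Illustrative assumption: translation starts at the first AUG and ends
    at the first in-frame stop codon or when the sequence runs out.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing to translate
    peptide = []
    # Step through the sequence three nucleotides at a time.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    # The first residue corresponds to the N-terminus, the last to the C-terminus.
    return "".join(peptide)


sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]
print(len(CODON_TABLE), len(sense_codons))  # 64 codons, 61 of them sense codons
print(translate("GGAUGGCUUGGUAA"))          # MAW
```

The table makes the redundancy concrete: 61 sense codons map onto only 20 distinct amino acids, so most amino acids are specified by more than one codon.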
Who determined the three-dimensional structure of myoglobin and hemoglobin in 1958?
John Kendrew and Max Perutz determined the three-dimensional structure of myoglobin and hemoglobin in 1958 using X-ray crystallography. This breakthrough revealed that proteins were precise folded machines with specific shapes that dictated their function.
When did scientists first name proteins and what term did they use?
Dutch chemist Gerardus Johannes Mulder and Swedish chemist Jöns Jacob Berzelius collaborated to name proteins in 1838. Berzelius coined the term protein from the Greek word meaning primary or standing in front to reflect the belief that these were the fundamental building blocks of life.
How many amino acids does the largest known protein titin contain?
The largest known protein, titin, spans almost 27,000 amino acids and forms a massive component of muscle sarcomeres.
Who proved that enzymes were proteins and when did this discovery occur?
James B. Sumner proved that the enzyme urease was a protein in 1926. This discovery shattered the prevailing belief that enzymes were non-protein substances and established the highly specific nature of these catalysts.
What is the average lifespan of a protein in mammalian cells?
The lifespan of a protein varies wildly from minutes to years with an average of 1 to 2 days in mammalian cells. This dynamic turnover ensures that cells can adapt to changing conditions by removing old or damaged proteins and recycling their components.
How many X-ray structures are currently stored in the Protein Data Bank?
The Protein Data Bank now contains over 181,000 X-ray structures. This repository of structural data is a treasure trove for researchers and supports computational methods including homology modeling and molecular dynamics simulations.
Once a protein chain is synthesized, it faces a critical challenge: folding into its native conformation. While some proteins can fold unassisted, guided by the chemical properties of their amino acids, many require the aid of molecular chaperones to reach their functional state. The folding process is described by the thermodynamic hypothesis, which holds that the folded form represents the free energy minimum of the molecule. The fold is stabilized by hydrophobic interactions, salt bridges, hydrogen bonds, and disulfide bonds, creating a unique three-dimensional structure. The concept of protein folding was solidified by Christian Anfinsen's studies on ribonuclease A, which won him the Nobel Prize in Chemistry in 1972. However, the process is not always perfect; misfolded proteins are degraded rapidly by the cell's machinery to prevent damage. The lifespan of a protein varies wildly, from minutes to years, with an average of 1 to 2 days in mammalian cells. This dynamic turnover ensures that cells can adapt to changing conditions, removing old or damaged proteins and recycling their components. The ability of proteins to shift between conformations allows them to perform complex tasks like catalyzing reactions or transmitting signals, making them the dynamic engines of cellular life.
The Enzyme Revolution
The most celebrated role of proteins is as enzymes, the catalysts that drive the chemical reactions of life. In 1926, James B. Sumner proved that the enzyme urease was a protein, shattering the prevailing belief that enzymes were non-protein substances. Enzymes are highly specific and can accelerate reactions by factors as high as 10^17, as seen in the case of orotate decarboxylase, which shortens a reaction that would take 78 million years without the enzyme to 18 milliseconds. These molecules possess an active site, a small pocket where substrates bind and undergo transformation. While enzymes can consist of hundreds of amino acids, only a few residues are directly involved in catalysis. The diversity of enzymatic functions is vast, covering metabolism, DNA replication, and repair. Some enzymes act on other proteins, adding or removing chemical groups through post-translational modifications to regulate activity. The study of enzyme kinetics has provided deep insights into the chemical mechanisms of life, revealing how proteins achieve such efficiency and specificity. This understanding has paved the way for drug design and the development of new materials, as scientists learn to mimic or inhibit these powerful biological catalysts.
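The 10^17 figure can be checked with simple arithmetic from the two times quoted for orotate decarboxylase. The sketch below is only a unit conversion and a division; the function name `rate_enhancement` is hypothetical, and treating a year as 365.25 days is an assumption that does not affect the order of magnitude.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # assumption: Julian year of 365.25 days


def rate_enhancement(uncatalyzed_seconds: float, catalyzed_seconds: float) -> float:
    """Return how many times faster the catalyzed reaction is."""
    return uncatalyzed_seconds / catalyzed_seconds


uncatalyzed = 78e6 * SECONDS_PER_YEAR  # 78 million years, in seconds
catalyzed = 18e-3                      # 18 milliseconds, in seconds

enhancement = rate_enhancement(uncatalyzed, catalyzed)
print(f"{enhancement:.1e}")  # 1.4e+17
```

The ratio comes out at roughly 1.4 x 10^17, consistent with the order of magnitude stated in the text.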
The Structural Tapestry
Proteins are not merely catalysts; they form the structural fabric of life. Fibrous proteins like collagen and keratin provide stiffness and rigidity to connective tissues, bones, and hair, while elastin offers elasticity to blood vessels and lungs. The stiffness of these proteins is quantified by Young's modulus. Collagen, for instance, has a modulus of 5 to 7.5 GPa, making it one of the stiffest biological materials, while elastin is much softer at about 1 MPa. Globular proteins, such as bovine serum albumin, move relatively freely in solution, often function as enzymes, and are much less stiff, which allows for conformational changes. Membrane proteins, including ion channels and receptors, integrate into the cell membrane to control the passage of molecules. The cytoskeleton, composed of actin and tubulin, maintains cell shape and enables intracellular transport. Motor proteins like myosin and kinesin generate mechanical forces for muscle contraction and cellular movement. These structural proteins are essential for the integrity and function of organisms, from the microscopic scale of a single cell to the macroscopic scale of a human body. Their diverse mechanical properties allow them to perform tasks ranging from supporting the body to facilitating the movement of sperm and the contraction of muscles.
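To give a feel for what these moduli mean, the sketch below applies Hooke's law (stress = Young's modulus x strain) to the values quoted for collagen and elastin. Linear elasticity is an assumption made for illustration; real biological tissues respond nonlinearly, and the 1% strain and the function name `stress_pa` are hypothetical choices, not values from the text.

```python
def stress_pa(youngs_modulus_pa: float, strain: float) -> float:
    """Hooke's law for a linear elastic material: stress = E * strain."""
    return youngs_modulus_pa * strain


# Moduli quoted in the text, converted to pascals.
COLLAGEN_E_RANGE = (5e9, 7.5e9)  # 5 to 7.5 GPa
ELASTIN_E = 1e6                  # 1 MPa

strain = 0.01  # illustrative 1% stretch (assumption)

elastin_stress = stress_pa(ELASTIN_E, strain)
collagen_stress = tuple(stress_pa(e, strain) for e in COLLAGEN_E_RANGE)
stiffness_ratio = tuple(e / ELASTIN_E for e in COLLAGEN_E_RANGE)

print(f"elastin:  {elastin_stress / 1e3:.0f} kPa")
print(f"collagen: {collagen_stress[0] / 1e6:.0f} to {collagen_stress[1] / 1e6:.0f} MPa")
print(f"collagen is {stiffness_ratio[0]:.0f} to {stiffness_ratio[1]:.0f} times stiffer")
```

Under this simplified model, the same 1% stretch requires several thousand times more stress in collagen than in elastin, which is why one suits load-bearing tissue and the other suits tissue that must stretch and recoil.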
The Proteomic Frontier
The study of proteins has evolved into the field of proteomics, which examines the total complement of proteins in a cell or organism. This field relies on advanced techniques such as 2D electrophoresis, mass spectrometry, and protein microarrays to identify and quantify proteins. The Protein Data Bank, established to store structural data, now contains over 181,000 X-ray structures, providing a treasure trove of information for researchers. Computational methods, including homology modeling and molecular dynamics simulations, allow scientists to predict protein structures and interactions that have not yet been experimentally determined. The Folding@home project harnesses distributed computing to simulate protein folding, tackling complex problems that require immense computational power. Proteomics has revealed that the set of proteins expressed in a cell varies with cell type and external stimuli, and that a human cell can contain up to 3 billion protein molecules. The study of the interactome, the full set of protein-protein interactions, has uncovered the complex networks that regulate cellular function. This field continues to expand, offering new insights into diseases, drug development, and the fundamental mechanisms of life.