DALL-E Images: These images were all created by OpenAI’s DALL-E using the following prompt: “One AI robot hand holding a large clear lensball containing a closeup image of a 3D rendering of a single human apolipoprotein A-I protein.”
AI is moving at stunning speed. Here’s an amazing example that impacts health and longevity...
In December 2018, a program called AlphaFold pulled off the challenge of the century—it accurately predicted the structure of a single protein solely from its amino acid sequence. And in 2018, that was a big deal.
In 2021, DeepMind stunned the world again when it predicted the structure of not one, but 350,000 proteins, work recognized by the journal Science as the 2021 Breakthrough of the Year.
Then in July of 2022, DeepMind and its partners went much, much further. The company unveiled the likely structures of nearly all known proteins: more than 200 million from bacteria to humans, delivering a potential treasure trove for drug development and evolutionary studies.
What’s more, all this data has been added to a public database that anyone can scour for free.
As the journal Nature noted: “From today, determining the 3D shape of almost any protein known to science will be as simple as typing in a Google search.”
The implications are enormous.
A protein’s shape determines its function, but until now scientists were in the dark about how most of them are structured. This new resource could turbocharge discoveries in the life sciences and accelerate the development of new drugs.
In today’s blog, I will explain how DeepMind reached this stunning milestone and what it could mean for both science and the future of human health.
As we discuss in our Abundance360 Community, given how fast the biotech field is accelerating, the coming decade will bring multi-$100-billion startups and untold breakthroughs.
Let’s dive in…
THE PROTEIN FOLDING PROBLEM
Proteins are made up of long chains of amino acids that twist and bend to create complex 3D structures.
These shapes largely determine the protein’s function, whether that’s serving as a structural support, an enzyme that catalyzes critical cellular processes, or a channel that helps transport other important biomolecules.
This means that determining a protein’s structure is critical to understanding what it does. It’s possible to work this out experimentally using techniques like X-ray crystallography and cryo-electron microscopy, but they are time-consuming and very expensive.
That’s why scientists have long sought to design software that can predict a protein’s structure based on its amino acid sequence, which is much simpler to deduce experimentally. However, this is easier said than done.
The problem is that every link in the sequence can fold in several different ways, so the number of possible configurations for each string of amino acids is enormous. So, cracking the code of protein folding has been one of biology’s “grand challenges” for more than 50 years.
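To get a feel for the scale, here is a back-of-the-envelope count in Python. The numbers are illustrative assumptions, not figures from this post: three possible conformations per amino acid, a chain of 100 amino acids, and a wildly optimistic brute-force rate of one trillion conformations checked per second.

```python
# Back-of-the-envelope estimate of a protein's conformational search space.
# All three constants below are illustrative assumptions, not measured values.
CONFORMATIONS_PER_RESIDUE = 3      # assumed choices per amino acid
CHAIN_LENGTH = 100                 # assumed length of a small protein
SAMPLES_PER_SECOND = 10**12        # assumed brute-force sampling rate
SECONDS_PER_YEAR = 365 * 24 * 3600


def count_conformations(per_residue: int, length: int) -> int:
    """Number of distinct chains if every residue folds independently."""
    return per_residue ** length


def years_to_enumerate(total: int, rate: int) -> float:
    """Years needed to visit every conformation once at the given rate."""
    return total / rate / SECONDS_PER_YEAR


total = count_conformations(CONFORMATIONS_PER_RESIDUE, CHAIN_LENGTH)
years = years_to_enumerate(total, SAMPLES_PER_SECOND)
print(f"{total:.2e} conformations, about {years:.2e} years to enumerate")
```

Under these assumptions the count lands on the order of 10^47 conformations, and trying them one by one would take vastly longer than the age of the universe. That is why brute-force search was never an option.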
ALPHAFOLD CRASHES THE PARTY
In 2018, DeepMind turned the protein folding world on its head after winning the prestigious Critical Assessment of Techniques for Protein Structure Prediction (CASP) competition on its first attempt.
It relied on a deep learning neural network called AlphaFold that had been trained on huge amounts of protein data to estimate the distances between specific pairs of amino acids, and the angles of the chemical bonds that join them together.
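To make “distances between pairs of amino acids” concrete, here is a minimal Python sketch of that representation. It is not AlphaFold’s code: it simply takes a known 3D structure and computes the table of pairwise distances, which is the kind of target a network like this learns to estimate from sequence alone. The five coordinates are invented for illustration.

```python
import math

# Toy 3D positions (in angstroms), one per amino acid; invented for illustration.
coords = [
    (0.0, 0.0, 0.0),
    (3.8, 0.0, 0.0),
    (3.8, 3.8, 0.0),
    (0.0, 3.8, 0.0),
    (0.0, 3.8, 3.8),
]


def distance_matrix(points):
    """Return the symmetric matrix of Euclidean distances between all points."""
    return [[math.dist(a, b) for b in points] for a in points]


matrix = distance_matrix(coords)
for row in matrix:
    # One row per amino acid: its distance to every other amino acid.
    print("  ".join(f"{d:5.2f}" for d in row))
```

Going from structure to distances is the easy direction. The hard part, and the part the neural network tackles, is predicting a table like this without ever seeing the structure.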
It wasn’t the first time someone had tried to apply modern AI techniques to the problem, but DeepMind’s solution was far more accurate than previous approaches. It achieved the best prediction on 25 out of the 43 most challenging protein targets, compared to just 3 for the team that came in second place.
These predictions were still a long way from being practically useful, so two years later DeepMind came back to CASP with a new version of AlphaFold that blew its earlier results out of the water. It did this by harnessing the power of transformers, the neural network architecture behind the rise of large language models like GPT-3.
On roughly two thirds of the proteins it was tested on, the new version of AlphaFold scored more than 90 out of 100 on a measure of structural similarity. At that accuracy, any remaining mismatch is more likely to be explained by experimental errors in the lab than by errors made by the software.
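For the curious, here is roughly how a 0-to-100 similarity score of this kind can work. The sketch below assumes the measure in question is CASP’s Global Distance Test (GDT_TS), and it is deliberately simplified: it takes two structures that are already lined up on top of each other, whereas the real score also searches for the best alignment. The toy coordinates are invented.

```python
import math

THRESHOLDS = (1.0, 2.0, 4.0, 8.0)  # angstrom cutoffs used by GDT_TS


def gdt_ts_like(predicted, experimental):
    """Simplified GDT_TS on a 0-100 scale.

    Assumes both structures are already superimposed and list one
    (x, y, z) position per residue in the same order. The real score
    also searches for the best superposition at each cutoff.
    """
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("need two non-empty structures of equal length")
    deviations = [math.dist(p, e) for p, e in zip(predicted, experimental)]
    # Fraction of residues that land within each cutoff of their true position.
    fractions = [
        sum(d <= t for d in deviations) / len(deviations) for t in THRESHOLDS
    ]
    return 100.0 * sum(fractions) / len(THRESHOLDS)


# Toy example: four residues with errors of 0.5, 1.5, 3.0 and 9.0 angstroms.
experimental = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
predicted = [(0.0, 0.5, 0.0), (3.8, 1.5, 0.0), (7.6, 3.0, 0.0), (11.4, 9.0, 0.0)]
score = gdt_ts_like(predicted, experimental)
print(f"GDT_TS-like score: {score:.2f}")  # prints 56.25 for this toy case
```

A perfect prediction scores 100. Clearing 90 means nearly every residue sits within an angstrom or two of where the experiment says it should be.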
Researchers hailed the results as a “game-changer” that had largely solved the problem of protein structure prediction. In July 2021, the company launched the freely accessible AlphaFold Protein Structure Database in partnership with the European Bioinformatics Institute (EMBL-EBI) and has been steadily adding new proteins to the registry ever since.
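If you would rather query the database from code than from a browser, a lookup might look something like the Python sketch below. Treat the details as assumptions: the base URL reflects the database’s public prediction API as I understand it, the helper names are hypothetical, and the shape of the returned JSON is not spelled out here because it may change between releases.

```python
import json
import urllib.request

# Assumed base URL of the AlphaFold Protein Structure Database prediction API.
API_BASE = "https://alphafold.ebi.ac.uk/api/prediction"


def prediction_url(uniprot_accession: str) -> str:
    """Build the lookup URL for one UniProt accession (hypothetical helper)."""
    accession = uniprot_accession.strip().upper()
    if not accession.isalnum():
        raise ValueError(f"unexpected accession: {uniprot_accession!r}")
    return f"{API_BASE}/{accession}"


def fetch_prediction(uniprot_accession: str, timeout: float = 30.0):
    """Download and parse the JSON metadata for one predicted structure.

    Makes a network request; nothing is fetched until this is called.
    """
    url = prediction_url(uniprot_accession)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_prediction("P02647")` would ask for human apolipoprotein A-I, the protein pictured in this post’s images, using its UniProt accession.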
A NEW ERA
Then this summer, DeepMind announced that it had used AlphaFold to work out the structures of almost every protein found in nature, adding more than 200 million new records to the database.
“Essentially you can think of it covering the entire protein universe,” DeepMind CEO Demis Hassabis told reporters at a press briefing. “We’re at the beginning of a new era of digital biology.”
That might sound hyperbolic, but the significance of this treasure trove of new protein structures is hard to overstate. While these are still just predictions, around 35% are believed to be as accurate as structures determined by experiments, according to EMBL-EBI, and a further 45% are still accurate enough to be practically useful for a broad range of applications.
For a sense of what they could make possible, it helps to look at the breakthroughs already achieved using the much smaller collection of protein structures DeepMind had already released.
In basic science, AlphaFold has helped to unravel the structure of the nuclear pore complex: a jumble of 1,000 proteins that controls what gets in and out of the cellular nucleus. It’s also helped to characterize a key protein involved in the immune system of honeybees.
On a more practical level, researchers at the University of Portsmouth in the UK are using AlphaFold to develop new enzymes that can break down plastic. And it has also been used in combination with crystallography to determine the structure of a protein that could prove to be a promising target for new malaria vaccines.
ACCELERATING DRUG DISCOVERY
That last example is indicative of where the greatest excitement lies.
Understanding the structure of proteins is crucial for identifying whether or not they can be targeted by a specific drug. And if you can predict a protein’s shape from its amino acid sequence, then you can also work out the structural impact of tweaks to that sequence, which could be a powerful tool for designing new medicines.
This is clearly part of DeepMind’s plan. Hassabis says he’s starting to think about “end-to-end drug design,” and last year the company spun off a new venture called Isomorphic Labs, which will use AlphaFold and other AI tools to accelerate drug discovery.
There are still questions around how useful the technology will be for the hardest problems in this field. AlphaFold can tell you little about how proteins interact with other biomolecules, or about so-called “disordered” proteins that don’t have a fixed structure. And crucially, it has been found to be particularly unreliable when it comes to predicting the structure of the regions that small-molecule drugs bind to.
Nonetheless, the broad consensus seems to be that the tool could significantly speed up parts of the drug development pipeline. The database may not answer every question scientists have, but they are finding that even fairly good predictions provide vital clues about where to look for breakthroughs.
And this is only the beginning.
If you’d told structural biologists before AlphaFold’s debut that we’d have accurate predictions for nearly every protein in nature within five years, they would have laughed at you.
Yet DeepMind is unlikely to rest on its laurels, and the remaining challenges are almost certainly within its sights.
DeepMind’s continuing breakthroughs demonstrate the impact AI can have on scientific discovery.
And if we couple artificial intelligence’s ability to predict the structure of nearly every protein with the anticipated breakthroughs in quantum computing—another technology poised to disrupt medicine and healthcare—we’re not far from a world where individually customized, precision medicine will move from science fiction to the standard of care.
Nowhere is the convergence of exponential tech bringing greater breakthroughs than in healthcare.
If you’re an ambitious entrepreneur interested in the use of artificial intelligence in healthcare and excited by the progress we’ve made so far, there’s no better time than the present to get involved.
ARE YOU READY TO GO FROM SUCCESS TO SIGNIFICANCE?
Then consider joining my year-round Abundance360 Mastermind and Executive program and participate in our A360 Summit March 20-22, 2023.
How will you use exponential tech like AI to solve the world’s problems and uplift humanity?