How something in biology is arranged, its structure, usually explains how well it can perform a particular job. Our health and risk of disease often depend on the function, and therefore the structure, of the many thousands of proteins in the human body. However, predicting and resolving protein structures has been a slow and difficult task.
Now, an artificial intelligence system called AlphaFold has predicted the structure of over 200 million proteins from more than 1 million species. Developed by the research organization DeepMind, the AI platform is an open-access tool that is freely available to anyone with a computer. With unprecedented access to the protein universe, scientists around the world have a powerful tool to better understand disease and accelerate their search for new medicines.
“Up until now, we could only have predicted the structure of a select few small proteins using very expensive computational methods,” says Stefano Forli, PhD, an associate professor at Scripps Research who is using the program to discover promising new drug targets. “AlphaFold truly represents a revolution in the field.”
The AlphaFold program made its predictions using machine learning, a branch of artificial intelligence that builds complex models from statistical patterns in data. By looking at structures that were solved in past decades, the system was able to find common patterns behind how proteins assume their folded, three-dimensional form simply from their starting sequence of amino acid building blocks. The predicted structures were then compiled into a protein database that is readily searchable and free to use for scientists in a range of disciplines. Alongside the existing database, AlphaFold is also available as a standalone tool that can be used to predict the folding patterns of any new protein of interest.
Knowing the hidden forces that determine protein shape is enabling researchers to study how different proteins interact within the cell and carry out essential processes, as well as how these relationships go awry and contribute to the development of disease. The predictions are also giving scientists the ability to perform very focused, structure-based drug discovery, which Forli believes could have huge implications for cancer, Alzheimer’s disease and vaccine design, just to name a few.
“I would imagine that nearly every therapeutic aspect of research is going to be impacted,” says Forli. “With all this open-source information, we can start looking for our own hidden patterns.” Some of these patterns involve how certain proteins interact with small organic molecules that might act as drugs. Others relate to the similarities between proteins within the same family, which could help researchers find better drugs than those that currently exist.
“We’re dreaming big and we’re dreaming even bigger than we did before,” he says. “Now that we have all these large data sets, we can use them to ask questions that, up until recently, we couldn’t even comprehend.”