AlphaFold: How AI can accelerate scientific discovery

Inside every cell in our bodies, billions of tiny molecular machines allow our eyes to detect light, our neurons to fire, and the ‘instructions’ in our DNA to be read. These intricate machines are proteins: they underpin the biological processes in every living thing. Currently, there are over 200 million known proteins, with many more found every year, each with a unique 3D shape that determines how it works and what it does. Determining the exact structure of a protein remains an expensive and often time-consuming process, and until now scientists have been able to study the exact 3D structure of only a tiny fraction of the proteins known to science.

200 million protein structures freely accessible to researchers

Last year, we released and open-sourced AlphaFold, our AI system that can accurately predict the structure of proteins. In partnership with the European Molecular Biology Laboratory’s European Bioinformatics Institute (EMBL-EBI), we created the AlphaFold Protein Structure Database to freely share this scientific knowledge with the world.

Our first release, on 22 July 2021, covered over 350,000 structures, including the human proteome – all of the ~20,000 known proteins expressed in the human body – along with the proteomes of 20 additional organisms important for biological research. Our latest release, announced on 28 July 2022, expands this database to over 200 million structures, including nearly all catalogued proteins known to science. This update includes predicted structures for plants, bacteria, animals, and other organisms, opening up many new opportunities for researchers to use AlphaFold to advance their work on issues such as sustainability, food insecurity, and neglected diseases.

Accelerating progress on real-world problems

Twelve months on from AlphaFold’s initial release, we also wanted to reflect on the impact AlphaFold has already had. To date, more than 500,000 researchers from 190 countries have accessed the AlphaFold database. Our partners are already using AlphaFold to accelerate progress on important real-world problems. The Drugs for Neglected Diseases initiative (DNDi) is advancing drug discovery for neglected diseases, such as Chagas disease and leishmaniasis, which affect millions of people in vulnerable communities. At the Centre for Enzyme Innovation (CEI), researchers are discovering and engineering enzymes for breaking down single-use plastics, while teams from universities across Norway and the USA mapped the structure of honey bee Vitellogenin (Vg), a central protein for understanding the immune systems of egg-laying animals. And at the University of Colorado Boulder, another team is studying antibiotic resistance, a problem that causes 2.8 million infections in the US alone each year.

For DeepMind, AlphaFold’s success was especially rewarding, both because it was the most complex AI system we’d ever built and because it has had the most meaningful downstream impact. It became the first major proof point of our founding thesis: that artificial intelligence can dramatically accelerate scientific discovery and in turn benefit humanity. AlphaFold is a glimpse of what might be possible with computational and AI methods applied to biology. We’re excited to see the huge potential of AI starting to be realised as one of humanity’s most useful tools for advancing scientific discovery.



Disclaimer: The opinions expressed and arguments employed herein are solely those of the authors and do not necessarily reflect the official views of the OECD or its member countries. The Organisation cannot be held responsible for possible violations of copyright resulting from the posting of any written material on this website/blog.