Nature has now published the AlphaFold 2 paper, after eight long months of waiting. The main text reports more or less what we have known for nearly a year, with some added tidbits, but it is accompanied by a painstaking description of the architecture in the supplementary information. Perhaps more importantly, the authors have released the entirety of the code, including all the details needed to run the pipeline, on GitHub. And there is no small print this time: you can run inference on any protein (I’ve checked!).
Have you not heard the news? Let me refresh your memory. In November 2020, a team of AI scientists from Google DeepMind indisputably won the 14th Critical Assessment of Structure Prediction (CASP14) competition, a biennial blind test in which computational biologists try to predict the structures of several proteins whose structures have been determined experimentally but not yet publicly released. Their results were so astounding, and the problem so central to biology, that they took the entire world by surprise and left an entire discipline, computational biology, wondering what had just happened.
Now that the article is live, the excitement is palpable. We have 70+ pages of long-awaited answers, and several thousand lines of code that will, no doubt, become a fundamental part of computational biology. At the same time, however, we have many new questions. What is the secret sauce behind the news splash, and why is it so effective? Is it a piece of code that the average user can actually run? What are AlphaFold 2’s shortcomings? And, most important of all, what will it mean for computational biology? And for all of us?