DeepMind and OpenAI: Yin and Yang?

In December 2020, DeepMind’s AlphaFold2 was announced as the winner of the Critical Assessment of Protein Structure Prediction (CASP) contest, ushering in a seismic shift in structural biology as the most accurate protein structure prediction algorithm created to date. Seven months later, the source code was released together with a Nature paper describing it. The impact can be assessed by a proxy: ca. 400k people accessed EMBL-EBI’s AlphaFold Database in its first 12 months, and the rest is history.

In parallel, OpenAI started making waves in the NLP (Natural Language Processing) department: they developed Whisper for speech recognition, DALL·E for creating images from text, and ChatGPT, a language model for dialogue. ChatGPT is trained with Reinforcement Learning from Human Feedback (RLHF) and can create code, debug code, follow a conversation and logically link to previous statements, detect intent and basically just be super fun to play with.

Why do the two parallel developments (AlphaFold and ChatGPT) complement each other?

One (AlphaFold) creates large, complex data sets that are not human readable, while the other (ChatGPT) follows natural language rules and is easily understood and used by humans. Hence, information created by the former could be assessed and analyzed by the latter with human guidance.

What does this mean for the (bio)pharmaceutical and biotech industries?

In my book the following:

  • protein structure-function relationship elucidation will become faster (and, combined with all the omics information at DNA, RNA and protein level, the “digital cell” might become a reality in my lifetime … and I’m not old, just mature ;) )

  • low-level coding for basic use cases (initially … later, with GPT version X, higher-level coding) will become democratised and accessible through natural language to a wider audience (e.g. research scientists without basic coding knowledge)

  • software development will become faster (e.g. fewer syntax errors) and leaner … programmers will not be replaced any time soon, as it takes more than just coding to be a good software professional

  • scientific information (knowledge) generation and dissemination will iteratively shift towards semi-automation (think Materials and Methods in any scientific paper - highly standardisable for basic applications like PCR etc.)

BUT the adoption rate will be slow, as the industry (especially the GxP part of it) is very resistant to change. That being said, it’s still only a question of time … a long time.

References:

  1. Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-03819-2

  2. Callaway, E. What's next for AlphaFold and the AI protein-folding revolution? (2022)

  3. openai.com/research/
