
FuturologyBot t1_iutlf87 wrote

The following submission statement was provided by /u/Soupjoe5:



Microbial molecules from soil, seawater and human bodies are among the planet’s least understood proteins.

When London-based DeepMind unveiled predicted structures for some 220 million proteins this year, it covered nearly every protein from known organisms in DNA databases. Now, another tech giant is filling in the dark matter of our protein universe.

Researchers at Meta (formerly Facebook, headquartered in Menlo Park, California) have used artificial intelligence (AI) to predict the structures of some 600 million proteins from bacteria, viruses and other microbes that haven’t been characterized.

“These are the structures we know the least about. These are incredibly mysterious proteins. I think they offer the potential for great insight into biology,” says Alexander Rives, the research lead for Meta AI’s protein team.

The team generated the predictions, described in a 1 November preprint, using a ‘large language model’, a type of AI that is the basis for tools that can predict text from just a few letters or words.

Normally, language models are trained on large volumes of text. To apply them to proteins, Rives and his colleagues fed them sequences of known proteins, which can be expressed as chains of 20 different amino acids, each represented by a letter. The network then learned to ‘autocomplete’ proteins with a proportion of amino acids obscured.
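To make the ‘obscured amino acids’ idea concrete, here is a rough sketch of how a training example could be prepared. This is not Meta's actual code. The `mask_sequence` helper, the `?` placeholder, the 15% masking fraction and the example sequence are all assumptions chosen for illustration; the article only says that "a proportion" of amino acids is hidden.

```python
import random

# The 20 standard amino acids, one letter each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Hypothetical placeholder for a hidden residue (an assumption, not from the article).
MASK = "?"


def mask_sequence(sequence, fraction=0.15, seed=0):
    """Hide a proportion of residues.

    Returns the masked string plus a dict mapping each hidden position
    to the residue the model would be asked to fill back in.
    The 15% default is an illustrative assumption.
    """
    if not sequence:
        return "", {}
    rng = random.Random(seed)
    # Hide at least one residue, but never more than the sequence holds.
    n_masked = min(len(sequence), max(1, round(len(sequence) * fraction)))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    residues = list(sequence)
    targets = {}
    for pos in positions:
        targets[pos] = residues[pos]
        residues[pos] = MASK
    return "".join(residues), targets


# Made-up example sequence, for illustration only.
masked, targets = mask_sequence("MKTAYIAKQRQISFVKSHFSRQ")
print(masked)
print(targets)
```

During training, the network would see only the masked string and be scored on how well it recovers the hidden letters in `targets`, which is the ‘autocomplete’ task the article describes.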

Protein ‘autocomplete’

This training imbued the network with an intuitive understanding of protein sequences, which hold information about their shapes, says Rives. A second step — inspired by DeepMind’s pioneering protein structure AI AlphaFold — combines such insights with information about the relationships between known protein structures and sequences, to generate predicted structures from protein sequences.

Please reply to OP's comment here: