Workflow from Scientific Research

Open access visualization of Workflow, Flowchart, Deep Learning Model, Protein Sequences, Missense Variant Effects
CC-BY

A 650-million-parameter unsupervised deep learning model, developed by Meta AI scientists, was trained on 250 million protein sequences. This model was used to predict the effects of all ~450 million possible missense variants (i.e., single-nucleotide changes that substitute one amino acid for another in the protein produced by a gene) across >40,000 protein isoforms encoded in the human genome. During training, random positions in each protein sequence are masked, and the model is trained to recover the left-out amino acids. In doing so, such models implicitly learn and represent how one-dimensional amino acid sequences give rise to two-dimensional and three-dimensional features of protein structure and function, including ligand-receptor binding sites. Protein language models of this kind can provide high-quality predictions for any amino acid sequence and for different kinds of coding variants. Reproduced with permission from Brandes et al. (ref. 29).
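
The masking objective and its use for variant scoring can be illustrated with a small sketch. The code below is not the model from the figure: the neural network is replaced by a hypothetical stand-in (`toy_position_probs`, smoothed residue counts over four made-up sequences), and the log-likelihood-ratio score is an assumption about how masked predictions are commonly turned into a missense score, not something the caption states. All function names and sequences are illustrative.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids
MASK = "#"                            # placeholder symbol for a hidden residue


def mask_random_positions(sequence, fraction, rng):
    """Hide a random subset of residues, as in masked-language-model training.

    Returns the masked sequence and the sorted list of hidden positions.
    """
    n_masked = max(1, round(len(sequence) * fraction))
    positions = sorted(rng.sample(range(len(sequence)), n_masked))
    masked = list(sequence)
    for i in positions:
        masked[i] = MASK
    return "".join(masked), positions


def toy_position_probs(training_seqs, position, pseudocount=1.0):
    """Hypothetical stand-in for the trained model.

    Returns a probability for each amino acid at `position`, estimated from
    smoothed counts over equal-length toy sequences. A real protein language
    model would condition on the whole unmasked context instead.
    """
    counts = {aa: pseudocount for aa in AMINO_ACIDS}
    for seq in training_seqs:
        counts[seq[position]] += 1.0
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def missense_llr(probs, wild_type, mutant):
    """Log-likelihood ratio of mutant vs. wild-type residue (assumed scoring rule).

    Negative values mean the model finds the substitution less plausible
    than the wild-type residue at that position.
    """
    return math.log(probs[mutant]) - math.log(probs[wild_type])


# Made-up sequences, for illustration only.
toy_seqs = ["MKTAYIAK", "MKTAYLAK", "MKSAYIAK", "MRTAYIAK"]

rng = random.Random(0)
masked, hidden = mask_random_positions(toy_seqs[0], fraction=0.25, rng=rng)

# Score a hypothetical M -> W substitution at position 0.
probs = toy_position_probs(toy_seqs, position=0)
score = missense_llr(probs, wild_type="M", mutant="W")
print(masked, hidden, round(score, 3))
```

In this toy setting every sequence starts with M, so the stand-in assigns M a much higher probability than W and the substitution receives a negative score. With a real model, the same ratio would be read from the network's predicted distribution at the masked position.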
