A 650-million-parameter unsupervised deep learning model, developed by Meta AI scientists, was trained on 250 million protein sequences. This framework was used to predict the effects of the totality of ~450 million potential missense variants (i.e., single-nucleotide changes that result in the substitution of one amino acid for another in the protein encoded by a gene), screening through >40,000 proteins across the full human genome. During model training, amino acids at random positions in the protein sequences were masked (hidden) from the model, and the model was trained to recover these left-out amino acids. Such modeling tools implicitly learn how the one-dimensional amino acid sequence gives rise to two-dimensional and three-dimensional features of protein structure and function, including ligand-receptor binding sites. These protein language models can provide high-quality predictions for any amino acid sequence as well as for different kinds of coding variants. Reproduced with permission from Brandes et al.29
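To make the masked-prediction idea concrete, the sketch below scores a single missense variant with the publicly released 650-million-parameter ESM-1b model: the residue of interest is masked, the model predicts a probability distribution over amino acids at that position, and the variant is summarized as the log-likelihood ratio between the alternate and reference residues. This is one common way to turn masked-language-model outputs into a variant effect estimate, not necessarily the exact pipeline of Brandes et al.; the `fair-esm` package, the `esm1b_t33_650M_UR50S` loader, and the toy sequence and substitution are assumptions for illustration only.

```python
# Minimal sketch: masked log-likelihood-ratio scoring of one missense variant.
# Assumes the fair-esm package (pip install fair-esm) and PyTorch are installed;
# the first call downloads the pretrained ESM-1b weights.
import torch
import esm


def score_missense_variant(sequence: str, position: int, ref_aa: str, alt_aa: str) -> float:
    """Return log P(alt) - log P(ref) at `position` (1-based) with that residue masked."""
    model, alphabet = esm.pretrained.esm1b_t33_650M_UR50S()  # 650M-parameter model
    model.eval()
    batch_converter = alphabet.get_batch_converter()

    assert sequence[position - 1] == ref_aa, "reference residue does not match the sequence"

    # Tokenize the wild-type sequence; ESM-1b prepends a BOS token,
    # so residue i (1-based) sits at token index i.
    _, _, tokens = batch_converter([("query", sequence)])
    tok_pos = position
    tokens[0, tok_pos] = alphabet.mask_idx  # hide the residue of interest

    with torch.no_grad():
        logits = model(tokens)["logits"]  # shape: [1, seq_len, vocab]
    log_probs = torch.log_softmax(logits[0, tok_pos], dim=-1)

    # Variant effect score: how much less (or more) likely the alternate
    # residue is than the reference residue at the masked position.
    return (log_probs[alphabet.get_idx(alt_aa)] - log_probs[alphabet.get_idx(ref_aa)]).item()


# Hypothetical example: score a V->E substitution at position 4 of a toy peptide.
print(score_missense_variant("MKTVKQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG",
                             4, "V", "E"))
```

A strongly negative score indicates that the model considers the alternate residue much less plausible than the reference at that position, which is the intuition behind using such scores as proxies for variant deleteriousness.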