Inverse folding models predict protein sequences from protein backbones. While they are often used in protein design pipelines to generate multiple candidate sequences for designed binders, their probabilistic nature also makes them attractive for other tasks, such as predicting the effect of residue substitutions.
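As a rough illustration of how the probabilistic output can be used to assess substitutions, the sketch below scores a mutation as the log-likelihood ratio between the mutant and wild-type residue at one position. It assumes you already have per-position amino acid probabilities conditioned on the backbone; the `substitution_score` helper and the `toy_probs` values are hypothetical and are not output from any of the models listed here. Autoregressive models condition each position on previously decoded residues, so treating positions independently is a simplification.

```python
import math

def substitution_score(probs, position, wild_type, mutant):
    """Return log p(mutant) - log p(wild_type) at a single position.

    probs: list with one dict per residue, mapping amino acid letter -> probability.
    A negative score means the model finds the mutant less likely than the wild type.
    """
    position_probs = probs[position]
    return math.log(position_probs[mutant]) - math.log(position_probs[wild_type])

# Hypothetical per-position probabilities for a two-residue backbone (toy values).
toy_probs = [
    {"A": 0.7, "G": 0.2, "V": 0.1},
    {"L": 0.5, "I": 0.4, "M": 0.1},
]

# Score the substitution A -> G at the first position (0-based index).
score = substitution_score(toy_probs, 0, "A", "G")
print(f"A1G log-likelihood ratio: {score:.3f}")
```

Scores near zero suggest the model considers the substitution roughly as compatible with the backbone as the wild-type residue; how well such scores track experimental effects depends on the model and the protein.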
Methods
ProteinMPNN
ProteinMPNN: Robust deep learning–based protein sequence design using ProteinMPNN
Paper GitHub Colab HuggingFace YouTube
LigandMPNN
LigandMPNN: Atomic context-conditioned protein sequence design using LigandMPNN
ESM-IF1
ESM-IF1: Learning inverse folding from millions of predicted structures
ESM-IF1 predicts protein sequences from backbone atom coordinates and was trained on structures predicted by AlphaFold2 (AF2). The model consists of invariant geometric input processing layers followed by a sequence-to-sequence transformer, and it can predict sequences for partially masked structures.
PiFold
PiFold: Toward effective and efficient protein inverse folding
GraDe_IF
GraDe_IF: Graph Denoising Diffusion for Inverse Protein Folding
ProRefiner
ProRefiner: an entropy-based refining strategy for inverse protein folding with global graph attention
nanand2/proteins
Protein Structure and Sequence Generation with Equivariant Denoising Diffusion Probabilistic Models
Benchmarks
ProteinInvBench
ProteinInvBench: Benchmarking Protein Inverse Folding on Diverse Tasks, Models, and Metrics
Additional resources
- Knowledge-Design: Pushing the Limit of Protein Design via Knowledge Refinement