Table of contents
  1. Structure-based
    1. FoldSeek
    2. Progres
  2. Protein Language Model-based
    1. PROST
  3. Additional resources

Structure-based

FoldSeek

Fast and accurate protein structure search with Foldseek

Paper GitHub Webserver

Progres

Fast protein structure searching using structure graph embeddings

Paper GitHub

Generally speaking, averaging protein embedding vectors over a whole protein can introduce biases and cause a substantial loss of information. This is partly because most proteins are composed of multiple domains and disordered regions that change during evolution. Progres therefore uses individual domains as query structures, which can be obtained with tools such as Merizo, SWORD2, and Chainsaw. For domain similarity search, Progres embeds each structure with a graph neural network trained by supervised contrastive learning to produce a low-dimensional representation of protein structure.
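The effect of whole-protein averaging can be illustrated with a toy sketch. This is not the Progres API: the function names (`cosine`, `mean_pool`, `search`), the 3-dimensional vectors, and the database entries are all made up for illustration, and cosine similarity is assumed as the scoring function.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pool(vectors):
    """Average several embeddings into one (whole-protein style pooling)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def search(query, database, top_k=3):
    """Rank database entries by cosine similarity to the query embedding."""
    scored = [(name, cosine(query, emb)) for name, emb in database.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings for the two domains of one protein.
domain_a = [1.0, 0.0, 0.0]
domain_b = [0.0, 1.0, 0.0]

# Averaging over the whole protein blends the two domain signals.
whole_protein = mean_pool([domain_a, domain_b])

# Hypothetical database of pre-computed domain embeddings.
database = {
    "target_with_domain_a": [0.9, 0.1, 0.0],
    "unrelated": [0.0, 0.0, 1.0],
    "target_with_domain_b": [0.1, 0.9, 0.0],
}

hits_domain = search(domain_a, database)
hits_whole = search(whole_protein, database)

print("domain query:", hits_domain[0])
print("whole-protein query:", hits_whole[0])
```

In this toy setup the single-domain query scores its matching target noticeably higher than the averaged whole-protein query scores any target, which is the kind of dilution the paragraph above describes. Real structure embeddings have many more dimensions and come from a trained model rather than hand-written vectors.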

Protein Language Model-based

PROST

Improved global protein homolog detection with major gains in function identification

Paper GitHub Webserver

Additional resources