Molecular docking and protein-ligand interactions
This project is a work in progress. In general, I am interested in applying molecular docking methods, such as DiffDock and DynamicBind, to predicted plant specialized metabolism enzymes, primarily terpene synthases (TPSs). Part of this work has been to develop an easy-to-use bioinformatics infrastructure for high-throughput analysis and visualization of protein-ligand interactions. You can find some relevant information on my protein-ligand page, with more additions to come in the future.
PhosBoost
PhosBoost is a machine learning approach that leverages protein language models and gradient boosting trees to predict protein phosphorylation from experimentally derived data. PhosBoost offers improved performance when recall is prioritized while consistently providing more confident probability scores.
Poretsky, E., Andorf, C. M., & Sen, T. Z. (2023). PhosBoost: Improved phosphorylation prediction recall using gradient boosting and protein language models. Plant Direct, 7(12), e554.
PanPPI
In the PanPPI framework, we generated predicted STRING-db interactomes for the 26 maize NAM inbred lines and used ClusterONE to cluster the genome-, core-, and pan-interactomes. The clusters were then annotated using GO term enrichment, gene coexpression, and gene descriptions. The annotated clusters can be used for putative gene function prediction and prioritization of candidate genes. The framework can be applied to any list of pan-genomes; see the GitHub repository for instructions. We also developed a Python Dash web application that helps find relevant PPI clusters for user-provided genes of interest. The easiest way to access the app, with the maize data pre-loaded, is to follow the instructions on the provided Docker page (instructions to install Docker). Alternatively, the code and instructions for the Dash app are available in the GitHub repository.
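To illustrate the GO term enrichment step used to annotate clusters, here is a minimal sketch of a one-sided hypergeometric test for a single GO term in a single cluster. This is an assumption about how such an enrichment can be computed, not the PanPPI implementation; the function name `enrichment_p_value` and the toy gene IDs are hypothetical, and a real analysis would also correct for multiple testing across terms and clusters.

```python
from math import comb


def enrichment_p_value(cluster_genes, annotated_genes, background_genes):
    """One-sided hypergeometric p-value for over-representation.

    Returns P(X >= k), where k is the number of cluster genes carrying
    the annotation, given the cluster size and the number of annotated
    genes in the background. All names here are illustrative.
    """
    background = set(background_genes)
    # Restrict both gene sets to the background so the counts are consistent.
    cluster = set(cluster_genes) & background
    annotated = set(annotated_genes) & background

    n_background = len(background)
    n_annotated = len(annotated)
    n_cluster = len(cluster)
    n_overlap = len(cluster & annotated)

    if n_cluster == 0:
        return 1.0  # an empty cluster cannot be enriched

    total = comb(n_background, n_cluster)
    upper = min(n_annotated, n_cluster)
    # Sum the probabilities of observing n_overlap or more annotated genes.
    tail = sum(
        comb(n_annotated, i) * comb(n_background - n_annotated, n_cluster - i)
        for i in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy example (hypothetical gene IDs): 20 background genes, 5 of which
# carry the GO term, and a 4-gene cluster containing 3 annotated genes.
background = [f"g{i}" for i in range(20)]
go_term_genes = ["g0", "g1", "g2", "g3", "g4"]
cluster = ["g0", "g1", "g2", "g10"]

p_value = enrichment_p_value(cluster, go_term_genes, background)
print(f"enrichment p-value: {p_value:.4f}")
```

A small p-value suggests the cluster contains more genes with that GO term than expected by chance, which is the kind of signal that can support a putative function assignment for unannotated genes in the same cluster.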
G3: Genes, Genomes, Genetics GitHub Docker
Poretsky, E.*, Cagirici, H. B.*, Andorf, C. M., & Sen, T. Z. (2024). Harnessing the predicted maize pan-interactome for putative gene function prediction and prioritization of candidate genes for important traits. G3: Genes, Genomes, Genetics, jkae059.