Hermes: Large DEL Datasets Train Generalizable Protein-Ligand Binding Prediction Models

Maxwell Kleinsasser, Brayden J. Halverson, Edward Kraft, Sean Francis-Lyon, Sarah E. Hugo, Mackenzie R. Roman, Ben Miller, Andrew D. Blevins, Ian K. Quigley

arXiv - QuanBio - Biomolecules 

Abstract

The quality and consistency of training data remain critical bottlenecks for protein-ligand binding prediction. Public affinity datasets, aggregated from thousands of labs and assay formats, introduce biases that limit model generalization and complicate evaluation. DNA-encoded chemical libraries (DELs) offer a potential solution: unified experimental protocols generating massive binding datasets across diverse chemical and protein target space. We present Hermes, a lightweight transformer trained exclusively on DEL data from screens against hundreds of protein targets, representing one of the largest and most protein-diverse DEL training sets applied to protein-ligand interaction (PLI) modeling to date. Despite never seeing traditional affinity measurements during training, Hermes generalizes to held-out targets, novel chemical scaffolds, and external benchmarks derived from public binding data and high-throughput screens. Our results demonstrate that DEL data alone captures transferable protein-ligand interaction representations, while Hermes' minimal architecture enables inference speeds suitable for large-scale virtual screening.

Summary

The paper introduces Hermes, a lightweight transformer-based model trained exclusively on DNA-encoded library (DEL) screening data across 239 protein targets. Despite never using traditional affinity measurements (e.g., IC50, Kd), Hermes generalizes to unseen protein targets, novel chemical scaffolds, and external benchmarks derived from public binding data. The model demonstrates that DEL data alone captures transferable protein-ligand interaction representations, with inference speeds 500–700× faster than state-of-the-art structure-based models like Boltz-2, making it highly suitable for large-scale virtual screening.
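To make the speed claim concrete, the following is a minimal back-of-the-envelope sketch using the per-GPU throughput figures this page reports (28.2 samples/second for Hermes on an H200, 0.04 samples/second for Boltz-2 on an H100). The library size of one million compounds is a hypothetical value chosen only for illustration, and the raw ratio is not hardware-normalized; the summary does not say how the 500–700× range was derived, so the ratio here is only a sanity check.

```python
# Throughput figures as reported in this summary (samples per second per GPU).
HERMES_RATE = 28.2   # reported on H200 hardware
BOLTZ2_RATE = 0.04   # reported on H100 hardware

# Hypothetical screening library size (an assumption, not from the paper).
LIBRARY_SIZE = 1_000_000


def gpu_seconds(n_compounds: int, rate: float) -> float:
    """Single-GPU wall-clock seconds needed to score n_compounds at a given rate."""
    return n_compounds / rate


# Raw throughput ratio; the two rates were measured on different GPU models.
raw_speedup = HERMES_RATE / BOLTZ2_RATE

hermes_hours = gpu_seconds(LIBRARY_SIZE, HERMES_RATE) / 3600
boltz2_days = gpu_seconds(LIBRARY_SIZE, BOLTZ2_RATE) / 86400

print(f"Raw speedup: about {raw_speedup:.0f}x")
print(f"Hermes:  about {hermes_hours:.1f} GPU-hours per million compounds")
print(f"Boltz-2: about {boltz2_days:.0f} GPU-days per million compounds")
```

Under these assumptions, the difference is between hours and months of single-GPU time for the same library, which is why throughput dominates the cost of large virtual screens.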

Highlights

  • Strong generalization: Achieves mean AUROC of 0.68 on the DEL Protein Split (unseen proteins) and 0.60 on Public Binders/Decoys (external benchmarks), with significantly better performance for kinase targets due to kinase-enriched training data.
  • Speed advantage: Processes 28.2 samples/second/GPU on H200 hardware, compared with 0.04 samples/second for Boltz-2 on an H100. Throughput at this level is critical for cost-effective virtual screening.
  • Limitations: Performance drops on the DEL Chemical Library Split (AUROC ~0.56), indicating difficulty generalizing to entirely new chemical libraries. Binarized binding labels and noise in DEL screening results also constrain model expressivity.
  • Practical impact: Highlights DEL datasets as a scalable, unified alternative to fragmented public affinity data (e.g., ChEMBL), with potential to accelerate drug discovery pipelines.
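For readers less familiar with the metric quoted above, here is a minimal sketch of how AUROC can be computed from binary labels and model scores. It uses the pairwise (Mann-Whitney) definition: the probability that a randomly chosen binder is scored above a randomly chosen non-binder, with ties counted as half. The function name and the toy data are illustrative assumptions, not taken from the paper, and this is not the authors' evaluation code. A value of 0.5 corresponds to random ranking, which is the baseline against which the reported 0.56 to 0.68 should be read.

```python
from typing import Sequence


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")

    correct = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(pos) * len(neg))


# Toy example (made-up data): 2 binders, 3 non-binders.
toy_labels = [1, 1, 0, 0, 0]
toy_scores = [0.9, 0.4, 0.6, 0.3, 0.1]
toy_auroc = auroc(toy_labels, toy_scores)
print(f"Toy AUROC: {toy_auroc:.3f}")
```

The quadratic pairwise loop is chosen for readability; a rank-based implementation gives the same value and scales better to benchmark-sized datasets.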

Conclusion

Hermes demonstrates that DEL-derived data alone can train generalizable protein-ligand binding prediction models without reliance on traditional affinity measurements. Its success underscores the value of large-scale, consistent DEL screening data for capturing transferable protein-ligand interaction representations. As DEL datasets grow beyond the scale of public affinity resources, DEL-trained models like Hermes are positioned to contribute to computational drug discovery, particularly for targets underrepresented in existing public data. Future improvements could incorporate structural augmentation and continuous binding strength modeling to address the current limitations.
