Marissa D Dolorfino, Daniel Santos Perez, Yao Fu, Shu-Hang Lin, Sean McCarty, Matthew James O'Meara, Terra Sztain
bioRxiv - Biophysics
DOI: 10.64898/2026.04.18.719394
Abstract
DNA-encoded libraries (DELs) enable ultra-large screening of billions of molecules simultaneously. However, various limitations of DELs have prompted interest in training machine learning (ML) models on these large datasets to extrapolate predictions to non-DEL compounds. A recent NeurIPS competition revealed that even top-performing ML models trained on DEL data failed to generalize to out-of-distribution (OOD) chemical space. We investigated whether integrating structural modeling could bridge this generalization gap. We systematically assessed state-of-the-art ML, docking, and co-folding methods on three biologically diverse protein targets screened against libraries containing multiple DEL synthesis formats. We show that while ML excels in-distribution, the optimal approach for OOD hit discrimination is both target- and ligand-dependent. We conclude that, regardless of performance reported in aggregated benchmarks, rigorous, system-dependent pilot testing is critical for reliable virtual screening predictions. We provide these workflows and analysis tools in an open-source package, DEL-iver.