Welcome to OpenDEL™ Community

A central hub to connect with global DEL professionals, access the latest industry insights and product updates, and collaborate to accelerate drug discovery.

DEL Hunter

  • DEL-Related Publications

    DNA-Encoded Library Screening Identifies CDK2-Targeting Lead Compounds with Favorable Drug-like Properties for Anticancer Development

    Li Zhou, Yong Ju, Zhijuan Cao, Sheng Cai, Jiayuan Su, Jianzhong Lu

    Journal of Pharmaceutical Analysis

    DOI: 10.1016/j.jpha.2025.101498

    Highlight

    • Through screening 31 DNA-encoded chemical libraries, totaling 4.4 billion molecules, we identified a novel class of selective CDK2 inhibitors.
    • The drug-likeness of C172 was evaluated through in vitro enzymatic and cellular assays, mechanistic studies on protein degradation, ADME characterization, single-dose pharmacokinetics in rats, and metabolite identification.

  • DEL-Related Publications

    Deciphering DEL Pocket Patterns through Contrastive Learning

    Wenyi Zhang, Yuxing Wang, Rui Zhan, Runtong Qian, Qi Hu, Jing Huang

    bioRxiv - Biophysics

    DOI: 10.1101/2025.06.12.659183

    Abstract

    DNA-encoded libraries (DELs) facilitate high-throughput screening of trillions of molecules against protein targets through split-pool synthesis and DNA tagging. Despite their potential, only a few DEL-derived compounds have advanced to clinical trials or reached the market. A better understanding of the defining characteristics of target proteins, particularly those with binding pockets suitable for DEL screening, is critical to improving success rates. However, existing approaches remain limited in assessing pocket flexibility and functional similarity. Here, we present ErePOC, a pocket representation model based on contrastive learning with ESM-2 embeddings to address these challenges. ErePOC captures both structural and functional features of binding pockets, enabling identification of shared characteristics among DEL targets. By integrating analyses of low-dimensional physicochemical properties and high-dimensional ErePOC embeddings, we provide a comprehensive view of DEL target space. With 98% precision in downstream classification tasks, ErePOC demonstrates high performance in pocket representation, which is then applied to predict human proteins suitable for DEL screening, with enrichment uncovered across 18 functional categories. This work establishes a new framework for enhancing DEL-based drug discovery through more effective target selection and pocket similarity analysis.

    Summary

    This study introduces ErePOC, a novel pocket representation model that employs contrastive learning with ESM-2 embeddings to decode the defining characteristics of protein binding pockets amenable to DNA-encoded library (DEL) screening. Despite DEL technology's capacity to screen trillions of compounds, clinical translation remains limited due to poor understanding of target druggability. The researchers analyzed 128 successful DEL targets and compared them to 326,416 general ligand pockets (BioLiP2) and 340 FDA-approved drug pockets, revealing that DEL pockets are uniquely larger (28.1 vs 16.1 residues), more hydrophobic, and enriched in specific amino acids (Met, Tyr, Trp, Phe, Leu). ErePOC was trained to map pockets to a 256-dimensional latent space aligned with ligand chemical similarity, achieving 98% precision in functional classification. Applied to 23,391 AlphaFold2-predicted human proteins, the model identified 2,739 DEL-compatible targets with pockets showing >0.8 cosine similarity to known DEL pockets. Enrichment analysis revealed 18 functional categories, particularly oxidoreductases, transferases, and multifunctional enzymes. In silico docking of 2.8 million virtual DEL compounds against 14 selected targets confirmed that ErePOC-enriched proteins exhibit significantly better predicted binding affinities than neutral controls. This work establishes a computational framework for rational DEL target selection beyond traditional structural similarity metrics.

    Highlights

    • Distinct DEL Pocket Signature: DEL-binding pockets are 1.3× larger (3,301 ų volume), more hydrophobic (50.7% hydrophobic interactions vs 32.5% in natural pockets), and enriched in flexible aromatic/hydrophobic residues (Met, Tyr, Trp, Phe) compared to regular ligand and FDA-approved drug pockets.
    • ErePOC Model Innovation: A contrastive learning framework that aligns pocket representations with ligand Morgan fingerprints via KL divergence loss, generating function-aware embeddings that capture physicochemical and evolutionary features beyond 3D geometry and remain robust to pocket flexibility.
    • Robust Zero-Shot Performance: Achieves superior classification of 7 ligand-binding pocket types (~43,000 pockets) with 98.5–98.9% accuracy; maintains strong performance even for pocket classes excluded from training, demonstrating powerful generalization.
    • Large-Scale Human Proteome Screening: Identified 2,739 unique human proteins with DEL-compatible pockets from AlphaFold2 structures, with significant enrichment in transferases (17.9%), hydrolases (11.6%), and oxidoreductases (9.4%), plus novel classes like RNA-binding proteins and chromatin regulators.
    • In Silico Validation via Docking: In silico screening of 2.8M DEL-like molecules against ErePOC-selected targets showed statistically significantly better predicted binding affinity (mean Z-score -2.18 vs -1.07) and higher enrichment for DEL-enriched vs DEL-neutral protein families.
    • Case Study Insights: The regulatory protein MAT2B exhibits higher DEL compatibility (cosine similarity 0.93, docking -8.8 kcal/mol) than its catalytic paralog MAT2A (0.66, -5.3 kcal/mol), demonstrating ErePOC's ability to resolve subtle family-level differences in druggability.

    Conclusion

    ErePOC provides a transformative approach to DEL target selection by learning high-dimensional, function-aware representations of binding pockets that transcend traditional structural alignment limitations. The model successfully deciphers a unique DEL pocket pattern, characterized by larger size, enhanced hydrophobicity, and specific amino acid biases, and leverages this to predict over 2,700 human proteins likely amenable to DEL screening across 18 enriched functional categories. By capturing physicochemical relationships rather than relying solely on geometric similarity, ErePOC addresses the critical challenge of pocket flexibility and low structural overlap among functionally related sites. The significant enrichment of oxidoreductases, transferases, and multifunctional enzymes validates known DEL success stories while expanding the targetable space to include chromatin regulators and RNA-binding proteins. In silico validation confirms that ErePOC-selected targets bind DEL-like molecules more favorably, supporting its practical utility. This framework not only enhances DEL efficiency but also offers broad applicability for virtual screening, molecule generation, and protein design, particularly when integrated with advanced structure prediction tools like AlphaFold3.
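
The >0.8 cosine-similarity cutoff mentioned in the ErePOC summary can be illustrated with a small sketch. This is not the authors' code: the function names, the toy 4-dimensional vectors (the real embeddings are described as 256-dimensional), and the choice to score each candidate by its best match against any reference pocket are all assumptions made for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def select_compatible(candidates, references, threshold=0.8):
    """Keep candidates whose best similarity to any reference exceeds threshold.

    Assumption: "best match against any known pocket" is the scoring rule;
    the source only states that a >0.8 cosine similarity was required.
    """
    selected = {}
    for name, embedding in candidates.items():
        score = max(cosine_similarity(embedding, ref) for ref in references)
        if score > threshold:
            selected[name] = score
    return selected


# Hypothetical toy embeddings standing in for known DEL pockets.
reference_pockets = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]

# Hypothetical candidate pockets from predicted structures.
candidate_pockets = {
    "protein_A": [0.9, 0.1, 1.1, 0.0],    # close to the first reference
    "protein_B": [1.0, -1.0, -1.0, 1.0],  # orthogonal to both references
}

hits = select_compatible(candidate_pockets, reference_pockets)
print(sorted(hits))
```

Because cosine similarity ignores vector length, only the direction of an embedding matters here, which is why a fixed cutoff such as 0.8 can be applied across pockets of different sizes.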

  • DEL-Related Publications

    Biocatalytic- and Chemoproteomic-Guided Discovery of a PHGDH Inhibitor from Chemoenzymatic-Promoted DNA-Encoded Libraries Selection Platform

    Yiwei Zhang, Yuqiu Lan, Rufeng Fan, Lei Feng, Guoliang Wang, Xinyuan Wu, Lulu Wen, Zhiqiang Duan, Yueyue Xia, Xudong Wang, Lingrui Zhang, Lu Zhou, Minjia Tan, Cangsong Liao, Xiaojie Lu

    Journal of the American Chemical Society

    DOI: 10.1021/jacs.5c14634

    Abstract

    DNA-encoded libraries (DELs) have emerged as an effective and efficient selection strategy for lead compound discovery in academia and industry over the past few decades. Despite recent advancements in this field, DEL remains limited by sensitive DNA-based constructs, particularly with low selection success rates resulting from the random selection of targets. Here, we describe a chemoenzymatic on-DNA reaction for DEL syntheses and develop a chemoproteomic-guided DEL selection platform. This platform, termed FF tags-biocatDEL, integrates DEL technology, chemoenzymatic synthesis, and fully functionalized (FF) chemical tags to match DELs with selection targets, even with limited information about ligandable hotspots. Using two diazirine-based FF indole probes, we comprehensively surveyed binding partners in cells and identified phosphoglycerate dehydrogenase (PHGDH) as a potential target for DEL selection. DEL01 and DEL02 were designed, synthesized, and selected against PHGDH, leading to the identification of a novel enzyme-active compound with an IC₅₀ value of 2.5 μM. Our strategy, utilizing FF tags-biocatDEL, establishes a generalizable workflow for rapid target hunting and ligand discovery. It provides an effective method for precisely matching DELs with potential targets, demonstrating its significant potential as a complementary approach to drug discovery based on DELs.

    Summary

    This study presents a novel FF tags-biocatDEL platform that integrates chemoenzymatic synthesis, chemoproteomics, and DNA-encoded library (DEL) technology to overcome the low success rates of traditional DEL selection. The researchers developed a DNA-compatible decarboxylative aldol reaction using the PLP-dependent enzyme ApUstD to generate indole scaffolds bearing amine and carboxyl functional groups. Through chemoproteomic profiling with diazirine-based fully functionalized (FF) indole probes, they identified phosphoglycerate dehydrogenase (PHGDH) as a high-priority target from 2,208 enriched proteins. Two focused DELs were constructed: DEL01 (281,158 members via 2-cycle synthesis) and DEL02 (1.35 million members derived from a lactone fragment). Affinity selection against PHGDH yielded L5, a novel indole-based inhibitor with an IC₅₀ of 2.5 μM that acts via an allosteric mechanism. This strategy demonstrates that chemoproteomic guidance significantly enhances DEL selection efficiency and expands the chemical space for challenging targets.

    Highlights

    • Innovative Chemoenzymatic Reaction: The first application of ApUstD on DNA substrates, achieving quantitative conversion (up to 100%) under mild aqueous conditions to generate complex indole scaffolds with γ-hydroxy-α-amino acid structures.
    • Chemoproteomic-Guided Target Identification: Diazirine-based FF indole probes enabled unbiased profiling of 2,208 ligandable proteins, with PHGDH emerging as a clinically relevant target (ranked 111th) for cancer and neurodegenerative diseases.
    • Potent Allosteric Inhibitor Discovery: L5, derived from a lactone byproduct (L3) scaffold, showed low-micromolar potency (IC₅₀ = 2.5 μM) and allosteric inhibition independent of NAD⁺ concentration, representing a new chemotype for PHGDH.
    • Scaffold Optimization via DEL Iteration: Initial hits (L1, L2) showed modest activity, but leveraging a side-product scaffold (L3) to build DEL02 (1.35M compounds) enabled a 20-fold activity improvement over the parent fragment.
    • Technical Milestones: Successfully synthesized large-scale DELs using biocatalysis, validated target engagement via photo-crosslinking and thermal shift assays, and established a generalizable workflow combining fragment-based DELs with proteome-wide targeting data.

    Conclusion

    The FF tags-biocatDEL platform successfully bridges biocatalysis, chemoproteomics, and DEL technology to create a highly efficient, target-directed drug discovery workflow. By using chemoproteomic data to rationally select PHGDH and focused DELs to optimize a biocatalytically derived indole scaffold, the team discovered L5, a novel, compact PHGDH inhibitor with promising activity. This approach significantly outperforms random target selection and expands the accessible chemical space for traditionally challenging enzymes. While the platform currently leverages biocatalysis primarily for scaffold generation, future expansion to multiple DEL synthesis steps could further enhance diversity. Additionally, the affinity-based selection may identify non-functional binders that could be repurposed as PROTACs or other modalities. Overall, this strategy offers a robust complement to conventional DEL methods and holds substantial promise for accelerating lead discovery against emerging therapeutic targets.

  • DEL-Related Publications

    The Current Toolbox for Covalent Inhibitors: From Hit Identification to Drug Discovery

    Mengke You, Hong Liu, Chunpu Li

    JACS Au

    DOI: 10.1021/jacsau.5c01134

    Abstract

    Covalent modification of therapeutic targets has emerged as a powerful platform for creating clinical drugs and chemical probes. Covalent drugs have evolved from serendipitous discoveries to rationally designed therapeutics, driven by advances in electrophile-first screening technologies. This perspective takes stock of alternative technologies currently available in laboratories and industry that collectively enable targeted covalent inhibitor development across historically “undruggable” targets. We highlight five such technologies: activity-based protein profiling (ABPP), which provides functional proteomic mapping to identify ligandable residues; covalent tethering, which exploits dynamic chemistry to capture transient pockets; covalent DNA-encoded libraries, which leverage trillion-member libraries for multiresidue targeting; phage/mRNA display, which facilitates evolution of covalent macrocyclic peptides; and sulfur(VI) fluoride exchange (SuFEx), which engages residues beyond cysteine. Integration of these approaches with chemoproteomics and artificial intelligence accelerates the discovery of covalent inhibitors with enhanced selectivity and reduced off-target risks. This technological convergence establishes a new paradigm for precision covalent therapeutics, offering innovative solutions to overcome drug resistance and target challenging protein interfaces.

  • DEL-Related Publications

    TREM2 Activation by First-in-Class Direct Small Molecule Agonists: DEL Screening, Optimization, Biophysical Validation, and Functional Characterization

    Hossam Nada, Shaoren Yuan, Farida El Gaamouch, Sungwoo Cho, Katarzyna Kuncewicz, Laura Calvo-Barreiro, Moustafa T. Gabr

    European Journal of Medicinal Chemistry

    DOI: 10.1016/j.ejmech.2025.118358

    Abstract

    Triggering receptor expressed on myeloid cells 2 (TREM2) is a key regulator of microglial function, and its loss-of-function variants are linked to Alzheimer’s disease (AD) and neurodegenerative disorders. While TREM2 activation is a promising therapeutic strategy, no small molecule agonists acting via direct TREM2 binding have been reported to date. Here, we describe the discovery of first-in-class, direct small molecule TREM2 agonists identified through DNA-encoded library (DEL) screening. The DEL hit (4a) demonstrated TREM2 binding affinity, as validated by three biophysical screening platforms (TRIC, MST, and SPR), induced Syk phosphorylation, luciferase assay and enhanced microglial phagocytosis. Preliminary optimization yielded 4i, which maintained TREM2 engagement with improved selectivity over TREM1 and no cytotoxicity. Molecular dynamics simulations predicted that 4a stabilizes a transient binding pocket on TREM2, indicating the possibility of a novel mechanism for receptor activation. These findings provide the first proof-of-concept for direct pharmacological TREM2 agonism, offering a foundation for developing therapeutics against AD and related disorders.

  • DEL-Related Publications

    C2PO: an ML-powered optimizer of the membrane permeability of cyclic peptides through chemical modification

    Roy Aerts, Joris Tavernier, Alan Kerstjens, Mazen Ahmad, Jose Carlos Gómez-Tamayo, Gary Tresadern, Hans De Winter

    Journal of Cheminformatics

    DOI: https://doi.org/10.1186/s13321-025-01109-x

    Abstract

    Peptide drug development is currently receiving due attention as a modality between small and large molecules. Therapeutic peptides represent an opportunity to achieve high potency, selectivity, and reach intracellular targets. A new era in the development of therapeutic peptides emerged with the arrival of cyclic peptides which avoid the limitations of parenteral administration via achieving sufficient oral bioavailability. However, improving the membrane permeability of cyclic peptides remains one of the principal bottlenecks. Here, we introduce a deep learning regression model of cyclic peptide membrane permeability based on publicly available data. The model starts with a chemical structure and goes beyond the limited vocabulary language models to generalize to monomers beyond the ones in the training dataset. Moreover, we introduce an efficient estimator2generative wrapper to enable using the model in direct molecular optimization of membrane permeability via chemical modification. We name our application C2PO (Cyclic Peptide Permeability Optimizer). Lastly, we demonstrate how a molecule correction tool can be used to limit the presence of unfamiliar chemistry in the generated molecules.

    Summary

    This study presents C2PO (Cyclic Peptide Permeability Optimizer), a novel machine learning-driven application that improves the membrane permeability of cyclic peptides through chemical structure modification. The core of C2PO consists of a Graph Transformer deep learning model trained on the CycPeptMPDB dataset (7,451 permeability measurements), achieving state-of-the-art performance (R² = 0.61, Pearson r = 0.78, MAE = 0.37 on test set). Unlike conventional generative models, C2PO employs an estimator2generative approach, using gradient-based optimization based on the HotFlip algorithm to suggest structural modifications. The framework operates in two stages: first, it generates permeability-optimized peptide analogs by mutating side chains while preserving the macrocycle backbone; second, it automatically corrects chemically invalid structures using a dictionary-based correction tool referencing ChEMBL31. A case study on 700 low-permeability cyclic peptides demonstrated that 76.86% of optimization campaigns successfully produced at least one offspring with improved permeability (logPapp > -6.0), with 42.05% of all 13,043 generated molecules crossing this threshold. The system allows flexible user control over modification scope, elemental composition, and optimization parameters, making it a practical tool for medicinal chemists to generate ideas for improving peptide drug candidates.

    Highlights

    • First-in-class application converting a machine learning model into a generative optimizer specifically for cyclic peptide permeability improvement.
    • Estimator2generative paradigm that decouples property estimation from structure generation, enabling broader chemical space exploration beyond training vocabulary.
    • State-of-the-art Graph Transformer model (based on GRAPHGPS framework) trained on comprehensive CycPeptMPDB dataset with robust cross-validation performance.
    • Automated chemistry correction workflow using a dictionary-based tool to validate and fix chemically unrealistic structures post-optimization, preserving 78% of successful optimizations.
    • Demonstrated effectiveness in a large-scale case study: 76.86% success rate for campaigns and 42.05% of offspring molecules achieving high permeability.
    • High flexibility allowing user-defined constraints on backbone protection, elemental modifications, molecular size changes (±5 atoms), and optimization parameters.
    • Beyond-vocabulary generalization capability to handle monomers not present in training data, overcoming limitations of language model-based approaches.

    Conclusion

    Generally, cyclic peptides lack adequate membrane permeability to be developed into medicines. We propose C2PO (Cyclic Peptide Permeability Optimizer), an application that improves permeability by modifying the chemical structure of a given cyclic peptide. C2PO is ML-driven, trained on the experimental CycPeptMPDB dataset, and can be categorized in the estimator2generative optimization paradigm. However, ML-based applications that output chemical structures have the tendency of occasionally proposing strange chemistry, attributable to the loss of chemical knowledge, although it is generally considered to be implicitly learned. Therefore, we opted for checking and correcting the outcomes of C2PO using a chemistry library-based autocorrection application in a subsequent step. This contribution provides insights into what one can expect when applying these two applications. Seven hundred permeability optimization campaigns were launched where only peptide side chains were allowed to be modified. In general, we observed optimization for many campaigns, meaning that bad permeability starting points were optimized to structures with estimated permeability above the threshold of -6.0 logPapp. In the chemical correctness check step, we identified that a substantial portion (22.9%) of output structures needed correction. The autocorrection tool modified these, and we tracked how optimized permeability altered upon chemical correction. Various scenarios occurred, but the most important was that for many campaigns, the second step did not counteract the initial permeability optimization. We discussed in detail how to properly use our model and workflow, noting its flexibility for user customization. We focused on providing insights into basic capabilities rather than pursuing optimal performance, while informing about ways to improve both permeability optimization and molecular autocorrection. Finally, we hope to raise general interest in adopting estimator2generative optimizer strategies for chemical problems and deploying chemistry-library-driven applications for post-correcting ML-generated structures.
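
The C2PO summary reports two different success figures: the share of campaigns with at least one offspring above the logPapp threshold of -6.0, and the share of all generated molecules above it. The sketch below shows how the two rates differ on made-up data. The function name, the dictionary layout, and every permeability value are hypothetical; only the -6.0 threshold and the two definitions come from the text above.

```python
PERMEABILITY_THRESHOLD = -6.0  # logPapp cutoff quoted in the summary


def success_rates(campaigns, threshold=PERMEABILITY_THRESHOLD):
    """Return (campaign_rate, molecule_rate) for predicted logPapp values.

    campaigns maps a starting peptide to the predicted logPapp of each
    offspring molecule generated for it (hypothetical layout).
    """
    if not campaigns:
        raise ValueError("at least one campaign is required")
    all_values = [v for offspring in campaigns.values() for v in offspring]
    if not all_values:
        raise ValueError("at least one offspring molecule is required")

    # A campaign succeeds if any single offspring crosses the threshold.
    successful = sum(
        1 for offspring in campaigns.values()
        if any(v > threshold for v in offspring)
    )
    # The molecule-level rate counts every offspring individually.
    passing = sum(1 for v in all_values if v > threshold)

    return successful / len(campaigns), passing / len(all_values)


# Made-up predictions for four hypothetical optimization campaigns.
toy_campaigns = {
    "peptide_1": [-5.8, -6.4, -6.9],
    "peptide_2": [-6.2, -6.5],
    "peptide_3": [-5.5, -5.9, -6.1, -7.0],
    "peptide_4": [-6.3],
}

campaign_rate, molecule_rate = success_rates(toy_campaigns)
print(f"campaign-level: {campaign_rate:.0%}, molecule-level: {molecule_rate:.0%}")
```

The campaign-level rate is always at least as high as the molecule-level rate when every campaign yields offspring in similar numbers, since one passing molecule is enough to count a whole campaign as a success.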

Product & Services

OpenDEL™ - Small Molecule

Starting Your Journey to Access the Vast Chemical Space

The Kit

  • 57 Libraries
  • ~3.8Bn compounds
  • 10 DEL samples

 

To Access

  • Fully Enumerated Molecules
  • Building Block Structures
  • DNA Codon Sequences
  • Scaffolds Information

 

✔ No Structure Disclosure Fee

✔ No Compound IP License Fee

OpenDEL™ Screening

OpenDEL™ screening is carried out by our team of experienced professionals, proficient in handling over 50 different target types including protein-protein interactions, kinases, enzymes, transcription factors, and RNA targets. Our team typically completes the screening experiments within 1-2 weeks. 

OpenDEL™ Sequencing

HitGen offers a high-quality, gold-standard sequencing service that includes:
  • Global Sample Shipment

  • Outstanding Sequencing Quality

  • Lightning-speed Result Delivery

  • Diverse Sequencing Options


OpenDEL™ Hit Proposal

Analyzing DEL selection data and choosing the right compounds for follow-up requires multidisciplinary expertise spanning biology, computational science, and chemistry. This includes a deep understanding of experimental design and mechanisms of action (MOAs) in biology, data processing and analysis in computational science, and both synthetic and DEL chemistry.

OpenDEL™ Off-DNA Synthesis

HitGen Chemical Services: Innovation-Driven and Precision-Empowered.

We transform your DEL hits into tangible results by delivering the pure, complex structures critical for validating discoveries and accelerating their advancement.

Choose Your Path:

A. Traditional Chemical Synthesis @ HitGen 
B. High Throughput Chemical Synthesis @ HitGen


What are people in the community saying?

Connect with peers. Access breakthrough science. Spark your next discovery.

  • HitGen
    HitGen

    Li Zhou , Yong Ju , Zhijuan Cao , Sheng Cai , Jiayuan Su , Jianzhong Lu

    Journal of Pharmaceutical Analysis

    DOI: 10.1016/j.jpha.2025.101498

    Highlight

    • Through screening 31 DNA-encoded chemical libraries, totaling 4.4 billion molecules, we identified a novel class of selective CDK2 inhibitors.
    • The drug-likeness of C172 at the cellular level was evaluated, such as in vitro enzymatic and cellular assays, mechanistic studies on protein degradation, ADME characterization, single-dose pharmacokinetics in rats and metabolite identification.

     

  • HitGen
    HitGen

    Yiwei Zhang, Yuqiu Lan, Rufeng Fan, Lei Feng, Guoliang Wang, Xinyuan Wu, Lulu Wen, Zhiqiang Duan, Yueyue Xia, Xudong Wang, Lingrui Zhang, Lu Zhou, Minjia Tan, Cangsong Liao, Xiaojie Lu

    Journal of the American Chemical Society

    DOI: 10.1021/jacs.5c14634

    Abstract

    DNA-encoded libraries (DELs) have emerged as an effective and efficient selection strategy for lead compound discovery in academia and industry over the past few decades. Despite recent advancements in this field, DEL remains limited by sensitive DNA-based constructs, particularly with low selection success rates resulting from the random selection of targets. Here, we describe a chemoenzymatic on-DNA reaction for DEL syntheses and develop a chemoproteomic-guided DEL selection platform. This platform, termed FF tags-biocatDEL, integrates DEL technology, chemoenzymatic synthesis, and fully functionalized (FF) chemical tags to match DELs with selection targets, even with limited information about ligandable hotspots. Using two diazirine-based FF indole probes, we comprehensively surveyed binding partners in cells and identified phosphoglycerate dehydrogenase (PHGDH) as a potential target for DEL selection. DEL01 and DEL02 were designed, synthesized, and selected against PHGDH, leading to the identification of a novel enzyme-active compound with an IC50 value of 2.5 μM. Our strategy, utilizing FF tags-biocatDEL, establishes a generalizable workflow for rapid target hunting and ligand discovery. It provides an effective method for precisely matching DELs with potential targets, demonstrating its significant potential as a complementary approach to drug discovery based on DELs.

    Summary

    This study presents a novel FF tags-biocatDEL platform that integrates chemoenzymatic synthesis, chemoproteomics, and DNA-encoded library (DEL) technology to overcome the low success rates of traditional DEL selection. The researchers developed a DNA-compatible decarboxylative aldol reaction using the PLP-dependent enzyme ApUstD to generate indole scaffolds bearing amine and carboxyl functional groups. Through chemoproteomic profiling with diazirine-based fully functionalized (FF) indole probes, they identified phosphoglycerate dehydrogenase (PHGDH) as a high-priority target from 2,208 enriched proteins. Two focused DELs were constructed: DEL01 (281,158 members via 2-cycle synthesis) and DEL02 (1.35 million members derived from a lactone fragment). Affinity selection against PHGDH yielded L5, a novel indole-based inhibitor with an IC₅₀ of 2.5 μM that acts via an allosteric mechanism. This strategy demonstrates that chemoproteomic guidance significantly enhances DEL selection efficiency and expands the chemical space for challenging targets.

    Highlights

    • Innovative Chemoenzymatic Reaction: The first application of ApUstD on DNA substrates, achieving quantitative conversion (up to 100%) under mild aqueous conditions to generate complex indole scaffolds with γ-hydroxy-α-amino acid structures.
    • Chemoproteomic-Guided Target Identification: Diazirine-based FF indole probes enabled unbiased profiling of 2,208 ligandable proteins, with PHGDH emerging as a clinically relevant target (ranked 111th) for cancer and neurodegenerative diseases.
    • Potent Allosteric Inhibitor Discovery: L5, derived from a lactone byproduct (L3) scaffold, showed sub-micromolar potency (IC₅₀ = 2.5 μM) and allosteric inhibition independent of NAD⁺ concentration, representing a new chemotype for PHGDH.
    • Scaffold Optimization via DEL Iteration: Initial hits (L1, L2) showed modest activity, but leveraging a side-product scaffold (L3) to build DEL02 (1.35M compounds) enabled a 20-fold activity improvement over the parent fragment.
    • Technical Milestones: Successfully synthesized large-scale DELs using biocatalysis, validated target engagement via photo-crosslinking and thermal shift assays, and established a generalizable workflow combining fragment-based DELs with proteome-wide targeting data.

    Conclusion

    The FF tags-biocatDEL platform successfully bridges biocatalysis, chemoproteomics, and DEL technology to create a highly efficient, target-directed drug discovery workflow. By using chemoproteomic data to rationally select PHGDH and focused DELs to optimize a biocatalytically derived indole scaffold, the team discovered L5, a novel, compact PHGDH inhibitor with promising activity. This approach significantly outperforms random target selection and expands the accessible chemical space for traditionally challenging enzymes. While the platform currently leverages biocatalysis primarily for scaffold generation, future expansion to multiple DEL synthesis steps could further enhance diversity. Additionally, the affinity-based selection may identify non-functional binders that could be repurposed as PROTACs or other modalities. Overall, this strategy offers a robust complement to conventional DEL methods and holds substantial promise for accelerating lead discovery against emerging therapeutic targets.

  • HitGen
    HitGen

    Wenyi Zhang, Yuxing Wang, Rui Zhan, Runtong Qian, Qi Hu, Jing Huang

    bioRxiv - Biophysics

    DOI: 10.1101/2025.06.12.659183

    Abstract

    DNA-encoded libraries (DELs) facilitate high-throughput screening of trillions of molecules against protein targets through split-pool synthesis and DNA tagging. Despite their potential, only a few DEL-derived compounds have advanced to clinical trials or reached the market. A better understanding of the defining characteristics of target proteins, particularly those with binding pockets suitable for DEL screening, is critical to improving success rates. However, existing approaches remain limited in assessing pocket flexibility and functional similarity. Here, we present ErePOC, a pocket representation model based on contrastive learning with ESM-2 embeddings to address these challenges. ErePOC captures both structural and functional features of binding pockets, enabling identification of shared characteristics among DEL targets. By integrating analyses of low-dimensional physicochemical properties and high-dimensional ErePOC embeddings, we provide a comprehensive view of DEL target space. With 98% precision in downstream classification tasks, ErePOC demonstrates high performance in pocket representation, which is then applied to predict human proteins suitable for DEL screening, with enrichment uncovered across 18 functional categories. This work establishes a new framework for enhancing DEL-based drug discovery through more effective target selection and pocket similarity analysis.

    Summary

    This study introduces ErePOC, a novel pocket representation model that employs contrastive learning with ESM-2 embeddings to decode the defining characteristics of protein binding pockets amenable to DNA-encoded library (DEL) screening. Despite DEL technology's capacity to screen trillions of compounds, clinical translation remains limited due to poor understanding of target druggability. The researchers analyzed 128 successful DEL targets and compared them to 326,416 general ligand pockets (BioLiP2) and 340 FDA-approved drug pockets, revealing that DEL pockets are uniquely larger (28.1 vs 16.1 residues), more hydrophobic, and enriched in specific amino acids (Met, Tyr, Trp, Phe, Leu). ErePOC was trained to map pockets to a 256-dimensional latent space aligned with ligand chemical similarity, achieving 98% precision in functional classification. Applied to 23,391 AlphaFold2-predicted human proteins, the model identified 2,739 DEL-compatible targets with pockets showing >0.8 cosine similarity to known DEL pockets. Enrichment analysis revealed 18 functional categories, particularly oxidoreductases, transferases, and multifunctional enzymes. In silico docking of 2.8 million virtual DEL compounds against 14 selected targets confirmed that ErePOC-enriched proteins exhibit significantly better predicted binding affinities than neutral controls. This work establishes a computational framework for rational DEL target selection beyond traditional structural similarity metrics.

    Highlights

    • Distinct DEL Pocket Signature: DEL-binding pockets are 1.3× larger (3,301 ų volume), more hydrophobic (50.7% hydrophobic interactions vs 32.5% in natural pockets), and enriched in flexible aromatic/hydrophobic residues (Met, Tyr, Trp, Phe) compared to regular ligand and FDA-approved drug pockets.
    • ErePOC Model Innovation: A contrastive learning framework that aligns pocket representations with ligand Morgan fingerprints via KL divergence loss, generating function-aware embeddings that capture physicochemical and evolutionary features beyond 3D geometry, robust to pocket flexibility.
    • Robust Zero-Shot Performance: Achieves superior classification of 7 ligand-binding pocket types (~43,000 pockets) with 98.5–98.9% accuracy; maintains strong performance even for pocket classes excluded from training, demonstrating powerful generalization.
    • Large-Scale Human Proteome Screening: Identified 2,739 unique human proteins with DEL-compatible pockets from AlphaFold2 structures, with significant enrichment in transferases (17.9%), hydrolases (11.6%), and oxidoreductases (9.4%), plus novel classes like RNA-binding proteins and chromatin regulators.
    • Computational Validation via Docking: In silico screening of 2.8M DEL-like molecules against ErePOC-selected targets showed statistically significantly better predicted binding affinity (mean Z-score –2.18 vs –1.07) and higher enrichment for DEL-enriched than for DEL-neutral protein families.
    • Case Study Insights: The regulatory protein MAT2B exhibits higher DEL compatibility (cosine similarity 0.93, docking –8.8 kcal/mol) than its catalytic paralog MAT2A (0.66, –5.3 kcal/mol), demonstrating ErePOC's ability to resolve subtle family-level differences in druggability.
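
    The highlights mention that pocket representations are aligned with ligand Morgan fingerprints via a KL divergence loss. One plausible form of such an objective is sketched below in plain Python: within a batch, the row-wise softmax of ligand Tanimoto similarities serves as the target distribution, and the softmax of pocket-embedding dot products is pushed toward it. The exact loss used by ErePOC is not given here, so the temperature, the dot-product similarity, and all function names are assumptions for illustration only.

```python
import math


def softmax(row, temperature=1.0):
    """Numerically stable softmax over a list of scores."""
    scaled = [x / temperature for x in row]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions; q must be strictly positive."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)


def tanimoto(a, b):
    """Tanimoto similarity of two fingerprints given as sets of on-bit indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def alignment_loss(pocket_embs, ligand_fps, temperature=0.5):
    """Mean KL between ligand-similarity and pocket-similarity distributions."""
    n = len(pocket_embs)
    total = 0.0
    for i in range(n):
        target = softmax(
            [tanimoto(ligand_fps[i], ligand_fps[j]) for j in range(n)], temperature
        )
        predicted = softmax(
            [dot(pocket_embs[i], pocket_embs[j]) for j in range(n)], temperature
        )
        total += kl_divergence(target, predicted)
    return total / n


# Hypothetical batch: ligands 0 and 1 are similar, ligand 2 is unrelated.
ligand_fps = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}]
aligned = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]     # pockets 0 and 1 close
misaligned = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]  # pockets 0 and 2 close

aligned_loss = alignment_loss(aligned, ligand_fps)
misaligned_loss = alignment_loss(misaligned, ligand_fps)
print(aligned_loss < misaligned_loss)
```

    In this toy batch the loss is lower when pockets that bind similar ligands sit close together in embedding space, which is the behavior such an alignment objective is meant to reward.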

    Conclusion

    ErePOC provides a transformative approach to DEL target selection by learning high-dimensional, function-aware representations of binding pockets that transcend traditional structural alignment limitations. The model successfully deciphers a unique DEL pocket pattern—characterized by larger size, enhanced hydrophobicity, and specific amino acid biases—and leverages this to predict over 2,700 human proteins likely amenable to DEL screening across 18 enriched functional categories. By capturing physicochemical relationships rather than relying solely on geometric similarity, ErePOC addresses the critical challenge of pocket flexibility and low structural overlap among functionally related sites. The significant enrichment of oxidoreductases, transferases, and multifunctional enzymes validates known DEL success stories while expanding the targetable space to include chromatin regulators and RNA-binding proteins. In silico validation confirms that ErePOC-selected targets bind DEL-like molecules more favorably, supporting its practical utility. This framework not only enhances DEL efficiency but also offers broad applicability for virtual screening, molecule generation, and protein design, particularly when integrated with advanced structure prediction tools like AlphaFold3.

  • HitGen

    Mengke You, Hong Liu, Chunpu Li

    JACS Au 

    DOI: 10.1021/jacsau.5c01134

    Abstract

    Covalent modification of therapeutic targets has emerged as a powerful platform for creating clinical drugs and chemical probes. Covalent drugs have evolved from serendipitous discoveries to rationally designed therapeutics, driven by advances in electrophile-first screening technologies. This perspective takes stock of alternative technologies currently available in laboratories and industry that collectively enable targeted covalent inhibitor development across historically “undruggable” targets. We highlight five such technologies: activity-based protein profiling (ABPP), which provides functional proteomic mapping to identify ligandable residues; covalent tethering, which exploits dynamic chemistry to capture transient pockets; covalent DNA-encoded libraries, which leverage trillion-member libraries for multiresidue targeting; phage/mRNA display, which facilitates evolution of covalent macrocyclic peptides; and sulfur(VI) fluoride exchange (SuFEx), which engages residues beyond cysteine. Integration of these approaches with chemoproteomics and artificial intelligence accelerates the discovery of covalent inhibitors with enhanced selectivity and reduced off-target risks. This technological convergence establishes a new paradigm for precision covalent therapeutics, offering innovative solutions to overcome drug resistance and target challenging protein interfaces.

  • HitGen

    Hossam Nada, Shaoren Yuan, Farida El Gaamouch, Sungwoo Cho, Katarzyna Kuncewicz, Laura Calvo-Barreiro, Moustafa T. Gabr

    European Journal of Medicinal Chemistry 

    DOI: 10.1016/j.ejmech.2025.118358

    Abstract

    Triggering receptor expressed on myeloid cells 2 (TREM2) is a key regulator of microglial function, and its loss-of-function variants are linked to Alzheimer’s disease (AD) and neurodegenerative disorders. While TREM2 activation is a promising therapeutic strategy, no small molecule agonists acting via direct TREM2 binding have been reported to date. Here, we describe the discovery of first-in-class, direct small molecule TREM2 agonists identified through DNA-encoded library (DEL) screening. The DEL hit (4a) demonstrated TREM2 binding affinity, as validated by three biophysical screening platforms (TRIC, MST, and SPR), induced Syk phosphorylation, luciferase assay and enhanced microglial phagocytosis. Preliminary optimization yielded 4i, which maintained TREM2 engagement with improved selectivity over TREM1 and no cytotoxicity. Molecular dynamics simulations predicted that 4a stabilizes a transient binding pocket on TREM2, indicating the possibility of a novel mechanism for receptor activation. These findings provide the first proof-of-concept for direct pharmacological TREM2 agonism, offering a foundation for developing therapeutics against AD and related disorders.

  • HitGen

    Roy Aerts, Joris Tavernier, Alan Kerstjens, Mazen Ahmad, Jose Carlos Gómez-Tamayo, Gary Tresadern, Hans De Winter

    Journal of Cheminformatics

    DOI: 10.1186/s13321-025-01109-x

    Abstract

    Peptide drug development is currently receiving due attention as a modality between small and large molecules. Therapeutic peptides represent an opportunity to achieve high potency and selectivity and to reach intracellular targets. A new era in the development of therapeutic peptides emerged with the arrival of cyclic peptides, which avoid the limitations of parenteral administration by achieving sufficient oral bioavailability. However, improving the membrane permeability of cyclic peptides remains one of the principal bottlenecks. Here, we introduce a deep learning regression model of cyclic peptide membrane permeability based on publicly available data. The model starts with a chemical structure and goes beyond limited-vocabulary language models to generalize to monomers beyond the ones in the training dataset. Moreover, we introduce an efficient estimator2generative wrapper to enable using the model in direct molecular optimization of membrane permeability via chemical modification. We name our application C2PO (Cyclic Peptide Permeability Optimizer). Lastly, we demonstrate how a molecule correction tool can be used to limit the presence of unfamiliar chemistry in the generated molecules.

    Summary

    This study presents C2PO (Cyclic Peptide Permeability Optimizer), a novel machine learning-driven application that improves the membrane permeability of cyclic peptides through chemical structure modification. The core of C2PO consists of a Graph Transformer deep learning model trained on the CycPeptMPDB dataset (7,451 permeability measurements), achieving state-of-the-art performance (R² = 0.61, Pearson r = 0.78, MAE = 0.37 on test set). Unlike conventional generative models, C2PO employs an estimator2generative approach, using gradient-based optimization based on the HotFlip algorithm to suggest structural modifications. The framework operates in two stages: first, it generates permeability-optimized peptide analogs by mutating side chains while preserving the macrocycle backbone; second, it automatically corrects chemically invalid structures using a dictionary-based correction tool referencing ChEMBL31. A case study on 700 low-permeability cyclic peptides demonstrated that 76.86% of optimization campaigns successfully produced at least one offspring with improved permeability (logPapp > -6.0), with 42.05% of all 13,043 generated molecules crossing this threshold. The system allows flexible user control over modification scope, elemental composition, and optimization parameters, making it a practical tool for medicinal chemists to generate ideas for improving peptide drug candidates.
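
    For readers less familiar with the three regression metrics quoted above (R², Pearson r, and MAE), the sketch below shows how they are conventionally computed from measured and predicted values. The logPapp numbers are made up for illustration and are unrelated to CycPeptMPDB or to the C2PO model.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measurements and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def pearson_r(y_true, y_pred):
    """Pearson correlation between measurements and predictions."""
    mean_true = sum(y_true) / len(y_true)
    mean_pred = sum(y_pred) / len(y_pred)
    cov = sum((t - mean_true) * (p - mean_pred) for t, p in zip(y_true, y_pred))
    var_true = sum((t - mean_true) ** 2 for t in y_true)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    return cov / math.sqrt(var_true * var_pred)


# Hypothetical logPapp values (measured vs predicted).
measured = [-7.0, -6.0, -5.0, -4.0]
predicted = [-6.5, -6.0, -5.5, -4.0]

mae = mean_absolute_error(measured, predicted)
r2 = r_squared(measured, predicted)
r = pearson_r(measured, predicted)
print(mae, r2, r)
```

    R² penalizes systematic offsets that Pearson r ignores, which is one reason a model can report a Pearson r noticeably higher than the square root of its R².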

    Highlights

    • First-in-class application converting a machine learning model into a generative optimizer specifically for cyclic peptide permeability improvement
    • Estimator2generative paradigm that decouples property estimation from structure generation, enabling broader chemical space exploration beyond training vocabulary
    • State-of-the-art Graph Transformer model (based on GRAPHGPS framework) trained on comprehensive CycPeptMPDB dataset with robust cross-validation performance
    • Automated chemistry correction workflow using a dictionary-based tool to validate and fix chemically unrealistic structures post-optimization, preserving 78% of successful optimizations
    • Demonstrated effectiveness in a large-scale case study: 76.86% success rate for campaigns and 42.05% of offspring molecules achieving high permeability
    • High flexibility allowing user-defined constraints on backbone protection, elemental modifications, molecular size changes (±5 atoms), and optimization parameters
    • Beyond-vocabulary generalization capability to handle monomers not present in training data, overcoming limitations of language model-based approaches
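
    The HotFlip idea referenced in the summary can be illustrated with a small sketch. For a one-hot encoded input, the first-order estimate of how much the model output changes when the symbol at a position is swapped is the gradient at the new symbol minus the gradient at the current one; the swap with the largest estimated gain is proposed. The gradient values, the three-symbol vocabulary, and the function name below are hypothetical, and how C2PO maps this onto molecular graphs is not described here, so this only shows the generic selection step.

```python
def best_flip(one_hot, grad, allowed_positions):
    """Pick the single substitution with the largest first-order estimated gain.

    one_hot[pos] is a one-hot list marking the current symbol at pos.
    grad[pos][k] is the gradient of the predicted property with respect to
    one_hot[pos][k]. Returns (gain, position, new_symbol_index), or None if
    no position may be changed.
    """
    best = None
    for pos in allowed_positions:
        current = one_hot[pos].index(1)
        for candidate, g in enumerate(grad[pos]):
            if candidate == current:
                continue
            # First-order Taylor estimate of the change in the model output.
            gain = g - grad[pos][current]
            if best is None or gain > best[0]:
                best = (gain, pos, candidate)
    return best


vocabulary = ["C", "N", "O"]          # hypothetical symbol set
one_hot = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
grad = [                              # made-up gradient values
    [0.1, 0.4, -0.2],
    [0.0, 0.3, 0.9],
    [0.5, 0.1, 0.2],
]

# Position 0 is treated as protected (for example, part of the backbone).
gain, position, new_index = best_flip(one_hot, grad, allowed_positions=[1, 2])
print(position, vocabulary[new_index], round(gain, 3))
```

    Restricting `allowed_positions` mirrors the kind of user-defined constraint listed in the highlights, such as leaving the macrocycle backbone untouched while side chains are modified.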

    Conclusion

    Generally, cyclic peptides lack adequate membrane permeability to be developed into medicines. We propose C2PO (Cyclic Peptide Permeability Optimizer), an application that improves permeability by modifying the chemical structure of a given cyclic peptide. C2PO is ML-driven, trained on the experimental CycPeptMPDB dataset, and can be categorized in the estimator2generative optimization paradigm. However, ML-based applications that output chemical structures tend to occasionally propose strange chemistry, attributable to the loss of chemical knowledge, although it is generally considered to be implicitly learned. Therefore, we opted to check and correct the outcomes of C2PO using a chemistry library-based autocorrection application in a subsequent step. This contribution provides insights into what one can expect when applying these two applications. Seven hundred permeability optimization campaigns were launched where only peptide side chains were allowed to be modified. In general, we observed optimization for many campaigns, meaning that bad permeability starting points were optimized to structures with estimated permeability above the threshold of -6.0 logPapp. In the chemical correctness check step, we identified that a substantial portion (22.9%) of output structures needed correction. The autocorrection tool modified these, and we tracked how the optimized permeability changed upon chemical correction. Various scenarios occurred, but the most important was that for many campaigns, the second step did not counteract the initial permeability optimization. We discussed in detail how to properly use our model and workflow, noting its flexibility for user customization. We focused on providing insights into basic capabilities rather than pursuing optimal performance, while informing about ways to improve both permeability optimization and molecular autocorrection. Finally, we hope to raise general interest in adopting estimator2generative optimizer strategies for chemical problems and deploying chemistry-library-driven applications for post-correcting ML-generated structures.
