Welcome to OpenDEL™ Community

A central hub to connect with global DEL professionals, access the latest industry insights and product updates, and collaborate to accelerate drug discovery.

DEL Hunter

  • DEL-Related Publications

    DNA-Compatible Synthesis of β-Ketoamides as Intermediates for On-DNA Chemical Diversification

    Xianfeng Li, Zehao Yin, Qiuyi Chen, Xinlong Hu, Gong Zhang, Xiaohong Fan, Yizhou Li

    Organic Letters

    DOI: 10.1021/acs.orglett.6c00490

    Abstract

    The β-ketoamide motif represents both a privileged scaffold and a versatile synthetic intermediate in medicinal chemistry. Herein, we developed a DNA-compatible method for the efficient conversion of various DNA-conjugated amines into β-ketoamides. The resulting β-ketoamides facilitate rapid diversification into a panel of structurally diverse molecular scaffolds. Importantly, the synthetic route and subsequent derivatization steps were validated to be fully compatible with DNA encoding, offering a reliable and versatile platform for DNA-encoded library synthesis.

  • DEL-Related Publications

    Toward generalizable predictive models for DNA-encoded libraries

    Vasanthanathan Poongavanam, S. Pauliina Turunen, Kristian Sandberg, Ulrika Yngve, Johan Wannberg

    Drug Discovery Today

    DOI: 10.1016/j.drudis.2026.104629

    Abstract

    DNA-encoded libraries (DELs) combined with machine learning (ML) offer a powerful paradigm for hit identification. However, sequencing-derived enrichment data are inherently noisy and biased, often resulting in models that overfit to specific chemical libraries. In this review, we critically evaluate the capabilities and limitations of DEL-ML, illustrating key challenges using Aurora Kinase A (AURKA) DEL affinity selection data. We demonstrate that standard ML models often struggle to generalize to unseen chemical space because of the specific structural constraints of combinatorial libraries. Furthermore, we discuss the necessity of rigorous denoising strategies and evaluate approaches, such as domain adaptation, to mitigate these limitations, offering a roadmap for building robust models capable of exploring diverse chemical space.

    Summary

    This review critically examines the integration of machine learning (ML) with DNA-encoded library (DEL) technology for drug discovery. While DEL-ML offers a powerful paradigm for hit identification by generating massive binding datasets (10⁶–10¹² data points), the authors identify a critical "generalizability gap" that limits the practical utility of current models. Using Aurora Kinase A (AURKA) as a case study with OpenDEL 4.0 screening data (~1.5 million data points), the authors demonstrate that standard ML models achieve high accuracy on internal validation but frequently fail to generalize to structurally novel scaffolds due to domain shift, the substantial difference between DEL chemical space and known pharmacological compounds.

    The review provides methodological best practices for data preprocessing, denoising, and validation, while evaluating advanced strategies such as domain adaptation to improve model robustness. The authors argue that future DEL-ML development must move beyond simple accuracy maximization toward explicit handling of distribution shifts to transform DEL-ML from a retrospective analysis tool into a reliable engine for novel chemical discovery.

    Highlights

    1. The Generalizability Challenge in DEL-ML

      • Models trained on DEL data often memorize library-specific building blocks rather than learning transferable structure-activity relationships
      • The BELKA competition revealed that models perform well on test sets within the same chemical space but fail on structurally novel scaffolds
      • Domain shift between DEL training data and external compound collections represents a fundamental barrier to practical application

    2. Data Quality and Preprocessing Considerations

      • DEL sequencing data contains unique noise profiles including matrix binding, DNA-tag interference, unequal synthesis yields, and "jackpot" effects
      • Multiple denoising strategies are evaluated: fold-enrichment, Z-scores for ultra-large libraries, disynthon aggregation, and uncertainty-aware probabilistic loss functions
      • Critical importance of subtracting background noise from control experiments (matrix/bead-only) to prevent false positives

    3. Class Imbalance and Data Splitting Strategies

      • DEL selections produce highly imbalanced datasets (10¹–10⁴ binders vs. up to ~10¹² nonbinders)
      • Random splitting leads to overoptimistic metrics due to high structural similarity within DEL congeneric series
      • Scaffold-based or library-based splitting provides more rigorous assessment of generalizability to novel chemotypes
      • Undersampling nonbinders (e.g., 1:1 ratio) can boost external sensitivity from ~1% to 20–30%, though this may reflect bias exploitation rather than true generalization

    4. Molecular Representation and Model Architectures

      • Traditional fingerprints and physicochemical descriptors often fail to capture subtle variations in DEL compounds
      • Graph neural networks (GNNs) and variational autoencoders (VAEs) show promise but require careful handling of linker/DNA-tag artifacts
      • Compositional (disynthon) approaches reduce sparsity but risk losing "whole-molecule" structural fidelity
      • Conformal prediction frameworks provide calibrated confidence intervals essential for prioritizing predictions in noisy DEL environments

    5. Domain Adaptation as a Solution Strategy

      • Covariate shift correction reduces divergence between source (DEL) and target (known binder) domains
      • Using high-confidence predictions from diverse compound collections (e.g., Enamine REAL Diversity Set) as an intermediate domain improves generalization
      • Domain adaptation reduced PCA centroid distance from 0.77 to 0.32 between DEL training data and known AURKA space
      • Retraining with both predicted binders and nonbinders improved Matthews Correlation Coefficient (MCC) from 0.2 to 0.4 on external datasets while maintaining 20–39% sensitivity

    6. AURKA Case Study Findings

      • OpenDEL 4.0-derived binders tended to be larger, more lipophilic, and less polar compared to known AURKA inhibitors
      • Despite overall domain shift, highly enriched DEL hits from sublibrary 27 shared conserved hinge-binding motifs with established inhibitors (e.g., VX-680)
      • Mechanistic alignment between DEL hits and known binders confirms that domain shift, rather than fundamental binding mode differences, drives prediction failures

    Conclusion

    The integration of DELs with ML presents transformative opportunities for early drug discovery, but realizing this potential requires overcoming the critical generalizability gap. The primary challenge is not data volume but data nature: intrinsic structural biases and systematic false negatives (often linker-induced) cause models to memorize library-specific artifacts rather than learn transferable pharmacophore principles. High internal validation metrics frequently mask failures to extrapolate to novel, pharmacologically relevant scaffolds.

    The authors advocate for a paradigm shift in DEL-ML development emphasizing:

      • Rigorous validation standards: Moving beyond random splits to scaffold-based and out-of-distribution evaluation
      • Domain alignment strategies: Explicit handling of distribution shifts through domain adaptation and transfer learning
      • Data diversity expansion: Open-source DEL datasets spanning broader drug-like chemical space to reduce single-library bias
      • Integration of physics-based priors: Incorporating docking constraints to reduce overfitting to synthetic artifacts
      • Uncertainty quantification: Systematic use of conformal prediction and applicability domain assessment

    By pivoting from simple accuracy maximization to robust domain alignment, DEL-ML can evolve from a retrospective analysis tool into a reliable engine for identifying novel chemical starting points. The establishment of standardized benchmarks and community resources will be essential to accelerate the development of generalizable predictive models capable of exploring the vast chemical space beyond individual DEL compositions.
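    Two ideas from this digest, group-aware splitting and the Matthews Correlation Coefficient, can be sketched in a few lines of Python. This is a minimal illustration and not the authors' pipeline: the function names and the `group_keys` input (a precomputed scaffold or library label per compound) are assumptions made for the example, and deriving real scaffold labels would need a cheminformatics toolkit that is not shown here.

```python
import math
import random
from collections import defaultdict


def grouped_split(group_keys, test_fraction=0.2, seed=0):
    """Split sample indices so that no group lands in both train and test.

    group_keys[i] is a hypothetical precomputed label for sample i,
    for example a scaffold identifier or a library identifier.
    """
    groups = defaultdict(list)
    for idx, key in enumerate(group_keys):
        groups[key].append(idx)

    keys = sorted(groups)  # deterministic order before shuffling
    random.Random(seed).shuffle(keys)

    target = test_fraction * len(group_keys)
    train, test = [], []
    for key in keys:
        # Fill the test set one whole group at a time until it reaches the target size.
        bucket = test if len(test) < target else train
        bucket.extend(groups[key])
    return sorted(train), sorted(test)


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    # Toy scaffold labels, invented for illustration only.
    scaffolds = ["A", "A", "B", "B", "B", "C", "D", "D", "E", "E"]
    train_idx, test_idx = grouped_split(scaffolds, test_fraction=0.3, seed=1)
    print("train:", train_idx)
    print("test:", test_idx)
    print("MCC:", round(mcc(tp=30, fp=10, tn=50, fn=10), 3))
```

    Because whole groups are assigned to one side, the test set only contains scaffolds the model never saw in training, which is the property the review contrasts with random splitting.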

  • DEL-Related Publications

    Hermes: Large DEL Datasets Train Generalizable Protein-Ligand Binding Prediction Models

    Maxwell Kleinsasser, Brayden J. Halverson, Edward Kraft, Sean Francis-Lyon, Sarah E. Hugo, Mackenzie R. Roman, Ben Miller, Andrew D. Blevins, Ian K. Quigley

    arXiv - QuanBio - Biomolecules

    Abstract

    The quality and consistency of training data remain critical bottlenecks for protein-ligand binding prediction. Public affinity datasets, aggregated from thousands of labs and assay formats, introduce biases that limit model generalization and complicate evaluation. DNA-encoded chemical libraries (DELs) offer a potential solution: unified experimental protocols generating massive binding datasets across diverse chemical and protein target space. We present Hermes, a lightweight transformer trained exclusively on DEL data from screens against hundreds of protein targets, representing one of the largest and most protein-diverse DEL training sets applied to protein-ligand interaction (PLI) modeling to date. Despite never seeing traditional affinity measurements during training, Hermes generalizes to held-out targets, novel chemical scaffolds, and external benchmarks derived from public binding data and high-throughput screens. Our results demonstrate that DEL data alone captures transferable protein-ligand interaction representations, while Hermes' minimal architecture enables inference speeds suitable for large-scale virtual screening.

    Summary

    The paper introduces Hermes, a lightweight transformer-based model trained exclusively on DNA-encoded library (DEL) screening data across 239 protein targets. Despite never using traditional affinity measurements (e.g., IC50, Kd), Hermes generalizes to unseen protein targets, novel chemical scaffolds, and external benchmarks derived from public binding data. The model demonstrates that DEL data alone captures transferable protein-ligand interaction representations, with inference speeds 500–700× faster than state-of-the-art structure-based models like Boltz-2, making it highly suitable for large-scale virtual screening.

    Highlights

      • Strong generalization: Achieves mean AUROC of 0.68 on the DEL Protein Split (unseen proteins) and 0.60 on Public Binders/Decoys (external benchmarks), with significantly better performance for kinase targets due to kinase-enriched training data.
      • Speed advantage: Processes 28.2 samples/second/GPU on H200 hardware, far outpacing Boltz-2 (0.04 samples/second on H100), critical for cost-effective virtual screening.
      • Limitations: Performance drops on the DEL Chemical Library Split (AUROC ~0.56), suggesting challenges in generalizing to entirely new chemical libraries. Data binarization (binary binding labels) and noise in DEL screening results constrain model expressivity.
      • Practical impact: Highlights DEL datasets as a scalable, unified alternative to fragmented public affinity data (e.g., ChEMBL), with potential to accelerate drug discovery pipelines.

    Conclusion

    Hermes demonstrates that DEL-derived data alone can train generalizable protein-ligand binding prediction models without reliance on traditional affinity measurements. Its success underscores the value of large-scale, consistent DEL screening data for capturing transferable biological interactions. As DEL datasets continue to grow beyond public affinity resources, DEL-trained models like Hermes are poised to drive the next generation of computational drug discovery, particularly for targets underrepresented in existing public data. Future improvements could incorporate structural augmentation and continuous binding strength modeling to address current limitations.
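    The throughput figures quoted in the highlights can be turned into a quick back-of-the-envelope comparison. The sketch below uses only the two rates stated above. The one-million-compound library size is a hypothetical value chosen for illustration, and because the two rates were reported on different GPUs (H200 and H100), the ratio should be read as indicative rather than as a controlled benchmark.

```python
def speedup(fast_rate, slow_rate):
    """Ratio of two throughputs given in samples per second."""
    return fast_rate / slow_rate


def gpu_hours(n_compounds, samples_per_second):
    """Single-GPU wall time, in hours, to score n_compounds at a fixed rate."""
    return n_compounds / samples_per_second / 3600.0


HERMES_RATE = 28.2  # samples/second/GPU on H200, as quoted in the digest
BOLTZ2_RATE = 0.04  # samples/second on H100, as quoted in the digest
LIBRARY_SIZE = 1_000_000  # hypothetical screening library, not from the paper

if __name__ == "__main__":
    print(f"speedup: {speedup(HERMES_RATE, BOLTZ2_RATE):.0f}x")
    print(f"Hermes:  {gpu_hours(LIBRARY_SIZE, HERMES_RATE):.1f} GPU-hours")
    print(f"Boltz-2: {gpu_hours(LIBRARY_SIZE, BOLTZ2_RATE):.0f} GPU-hours")
```

    With these inputs the ratio comes out at roughly 705, close to the upper end of the 500–700× range given in the summary.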

  • DEL-Related Publications

    Recent Advances in the Use and Impact of DNA-Encoded Libraries in Drug Discovery

    Amanda W. Dombrowski, Florent Samain

    ACS Medicinal Chemistry Letters

    DOI: 10.1021/acsmedchemlett.6c00047

    Abstract

    Over the past 30 years, the field of DNA-encoded libraries (DELs) has become a mature and robust technology platform for the identification of ligands against relevant biological protein targets. Most of the major innovative pharmaceutical companies have integrated DEL platforms into their drug discovery workflows. Indeed, DELs have significantly impacted drug discovery efforts in the last 10 years with the identification of ligands that have progressed into clinical trials for various disease indications. One could assume that there are likely even more DEL-derived ligands that have reached the clinic, but candidate stories do not necessarily mention the hit generation methods utilized. (1−3)

    The fundamental concept of DELs emerged in a theoretical paper from Lerner and Brenner in 1992. (4) DELs are defined as collections of small molecules that are covalently attached to unique DNA tags, serving as amplifiable identification barcodes. Encoding procedures allow the generation and screening of combinatorial libraries of high diversity and unprecedented size. Preferential binding molecules identified by high throughput DNA sequencing of libraries after affinity capture procedures are typically resynthesized and tested to characterize their binding properties. (5)

    Success of DEL-based drug discovery screening campaigns is affected by various factors, including quality and diversity of the libraries, screening protocols, protein selection conditions, downstream validation and data analysis methods. Advances in the field have expanded DEL applications beyond traditional synthesis and screening procedures. Recently, the incorporation of artificial intelligence (AI) and machine learning (ML) techniques in DEL workflows has been accelerating, driving significant advancements and extending the potential of technology. (6,7)

    This ACS Medicinal Chemistry Letters Collection features recent success stories that highlight the value and impact of DEL technology in drug discovery. The following report, A Novel Small Molecule Allosteric Inhibitor of IL-17A from a DNA-Encoded Library, demonstrates the ability of DEL to identify small molecules that directly modulate the IL-17 pathway by inhibiting IL-17A with their cognate receptors, proving that small molecules can mimic the action of macromolecular biologics to disrupt high affinity protein–protein interactions. DEL macrocycle-like libraries have also proven effective for the recognition of larger protein surfaces (DNA-Encoded Macrocyclic Peptide Libraries Enable the Discovery of a Neutral MDM2–p53 Inhibitor).

    Beyond traditional inhibitor identification, DEL technology has become a pivotal approach for the development of targeted protein degradation (TPD) therapeutics. The report, Discovery of Small-Molecule Ligands for the E3 Ligase STUB1/CHIP from a DNA-Encoded Library Screen, from AstraZeneca scientists, shows that DEL can be effectively used to identify small molecule ligands for STIP1 homology and U-box containing protein 1 (STUB1), an E3 ligase that contains protein–protein interaction (PPI) sites, where previously only peptide binding ligands have been discovered.

    DEL hits can also serve as tools to provide structural basis for further hit-to-lead progression, thus accelerating medicinal chemistry lead optimization activities (Structural and Molecular Insight into the PWWP1 Domain of NSD2 from the Discovery of Novel Binders Via DNA-Encoded Library Screening; Optimization of a Novel DEL Hit That Binds in the Cbl-b SH2 Domain and Blocks Substrate Binding; Chemical Space Profiling of SARS-CoV-2 PLpro Using DNA-Encoded Focused Libraries).

    By leveraging the broad chemical diversity offered by DEL techniques, researchers have been able to identify robust covalent inhibitors that specifically interact with cysteine residues. This targeted approach helps improve both the selectivity and efficacy of potential therapeutic agents (Identification and Evaluation of Reversible Covalent Binders to Cys55 of Bfl-1 from a DNA-Encoded Chemical Library Screen).

    Recent literature indicates that DEL designs have become more strategic, influenced by progress in synthetic methodologies, expanded chemical diversity, and enhanced access to structural biology data. The perspective, Design of DNA Encoded Libraries for Medicinal Chemistry, provides a comprehensive analysis of DEL derived hits that enabled medicinal chemistry programs from publications between 2020 and 2025. The development of computational tools has offered new opportunities to rationalize DEL designs and DEL data set analysis. The integration of DEL technology with computational approaches, such as machine learning, continues to unleash the potential of the technology (Highly Selective Novel Heme Oxygenase-1 Hits Found by DNA-Encoded Library Machine Learning beyond the DEL Chemical Space; Evaluating the Diversity and Target Addressability of DELs using Scaffold Analysis and Machine Learning).

    This Collection features 10 publications, which highlight the rapid growth in many areas of the DEL field. We thank the authors for their work, which we hope will inform and inspire our readers to expand upon that work.

    Summary

    This editorial introduces a special collection of 10 publications highlighting the rapid evolution and maturation of DNA-encoded library (DEL) technology over the past 30 years. DELs, which consist of small molecules covalently attached to unique DNA barcodes for identification, have transitioned from a theoretical concept (first proposed by Lerner and Brenner in 1992) to a robust, industry-standard platform for drug discovery. Major pharmaceutical companies have now integrated DEL platforms into their workflows, with several DEL-derived ligands progressing into clinical trials over the last decade. The editorial emphasizes that recent advances (including AI/ML integration, novel library designs, and expanded applications beyond traditional inhibitor discovery) are significantly extending the potential of this technology.

    Highlights

    1. Diverse Therapeutic Applications

      • IL-17A Inhibition: DEL technology successfully identified small molecule allosteric inhibitors that disrupt high-affinity protein-protein interactions (PPIs), demonstrating that small molecules can mimic macromolecular biologics.
      • Macrocyclic Libraries: DEL macrocycle-like libraries enabled the discovery of neutral MDM2-p53 inhibitors, effective for recognizing larger protein surfaces.
      • Targeted Protein Degradation (TPD): DEL screening identified small molecule ligands for STUB1/CHIP (an E3 ligase containing PPI sites), where previously only peptide binders were known.

    2. Structural and Mechanistic Insights

      DEL hits provide structural foundations for hit-to-lead optimization, accelerating medicinal chemistry efforts. Examples include:

      • Novel binders to the PWWP1 domain of NSD2
      • Inhibitors targeting the Cbl-b SH2 domain
      • Chemical space profiling of SARS-CoV-2 PLpro

    3. Covalent Inhibitor Discovery

      Strategic DEL designs enable the identification of reversible covalent binders targeting specific cysteine residues (e.g., Cys55 of Bfl-1), improving selectivity and efficacy.

    4. Integration of Artificial Intelligence and Machine Learning

      AI/ML techniques are increasingly incorporated into DEL workflows for:

      • Rationalizing library designs
      • Analyzing DEL datasets
      • Discovering hits beyond traditional DEL chemical space (e.g., highly selective heme oxygenase-1 inhibitors)
      • Scaffold analysis and target addressability prediction

    5. Strategic Library Design Evolution

      Recent DEL designs have become more sophisticated, influenced by:

      • Progress in synthetic methodologies
      • Expanded chemical diversity through validated on-DNA chemical reactions
      • Enhanced access to structural biology data

    Conclusion

    The editorial concludes that DEL technology has become a mature, robust, and indispensable platform in modern drug discovery. The featured collection demonstrates the technology's versatility across multiple applications, from traditional inhibitor identification to targeted protein degradation and covalent drug discovery. The integration of computational approaches, particularly machine learning, continues to unlock new potential and extend DEL capabilities beyond its original scope. The editors (Amanda W. Dombrowski and Florent Samain) express gratitude to the contributing authors and anticipate that these works will inform and inspire readers to further expand the boundaries of DEL technology in pharmaceutical research.

  • DEL-Related Publications

    Microfluidic Agarose Microdroplets for DNA-Encoded Chemical Library Screening

    Yoojin Kim, Hayeon Kim, Jinhui Hong, Minseo Kang, Jaeyoung Bae, Sangyoon Ko, Minjae Kim, Byumseok Koh, Hakjoong Kim, Sang-Hee Shim, Kyubong Jo

    bioRxiv - Bioengineering

    DOI: 10.64898/2026.02.15.706034

    Abstract

    DNA-encoded library (DEL) technology enables high-throughput small-molecule discovery but is typically performed using purified proteins under in vitro conditions that do not reflect native intracellular environments. Here, we present a microfluidic agarose microdroplet platform for cellular-context DEL screening. The porous hydrogel droplets provide mechanically stable yet permeable microenvironments that protect weak protein-ligand interactions while enabling efficient buffer exchange and ligand diffusion. Importantly, mild cell permeabilization within droplets selectively retained chromatin-associated proteins, allowing screening directly in a cellular context. Using BRD4 as a model target, we validated intracellular ligand engagement by fluorescence imaging and super-resolution microscopy. Small-scale DEL screening selectively enriched JQ1 in both bead-based and cell-based formats, and large-scale DEL screening across millions of encoded compounds successfully identified hit molecules by sequencing. This agarose microdroplet-based strategy expands DEL technology toward biologically relevant and chromatin-associated targets under near-native conditions.

    Summary

    This work introduces a microfluidic agarose µ-droplet platform for DNA-encoded chemical library (DECL/DEL) screening against intracellular targets, with validation on BRD4, a nuclear epigenetic reader protein. The system encapsulates single cells or target-coated magnetic beads in monodisperse ~100 µm agarose droplets via flow-focusing microfluidics; the low-gelling-temperature (LGT) agarose forms a porous hydrogel upon cooling, permitting rapid diffusion of DNA-encoded small molecules while preserving intracellular architecture and protein–bead complexes. Permeabilization enables controlled probe access, and super-resolution Exchange-PAINT imaging confirms nanoscale colocalization of JQ1-BP with GFP-BRD4 in nuclear nanoclusters.

    Highlights

      • Agarose µ-droplets enable gravity-assisted washing, shear protection, and uniform molecular diffusion.
      • Two-color Exchange-PAINT with orthogonal R2/R6 docking strands validates specific intracellular target engagement.
      • DEL screening yields target-specific enrichment: JQ1 barcode is selectively enriched in BRD4-overexpressing HeLa droplets.
      • Scalable to large DELs (96×96×96 combinatorial space across three scaffolds) with nanopore sequencing–based enrichment quantification.

    Conclusion

    The platform bridges functional intracellular DEL screening with high spatial fidelity and quantitative readouts, enabling both PCR- and sequencing-based hit identification while preserving native biomolecular context.
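    The "96×96×96 combinatorial space across three scaffolds" figure can be expanded as a short worked calculation. The sketch below assumes, purely for illustration, that each of the three scaffolds carries the full 96×96×96 set of building-block combinations; the digest does not state this explicitly, and the variable names are invented for the example.

```python
from math import prod


def combinatorial_size(building_blocks_per_cycle):
    """Number of encoded compounds for one scaffold: product of per-cycle counts."""
    return prod(building_blocks_per_cycle)


PER_SCAFFOLD = combinatorial_size([96, 96, 96])  # three cycles of 96 building blocks
N_SCAFFOLDS = 3  # assumption: every scaffold is combined with the full set
TOTAL = PER_SCAFFOLD * N_SCAFFOLDS

if __name__ == "__main__":
    print(f"per scaffold: {PER_SCAFFOLD:,}")
    print(f"all scaffolds: {TOTAL:,}")
```

    Under that assumption the total is about 2.65 million compounds, which would be in line with the abstract's description of screening "across millions of encoded compounds".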

  • DEL-Related Publications

    Universal Baseline for in vitro Selection of Genetically Encoded Libraries

    Kejia Yan, Guilherme M. Lima, Tara Bahadur, Vincent Albert, Zoe O’Gara, Gary Bao, Christin Kossmann, William Kirby, Fernando B. Mejia, Matthew L. Michnik, Kristen Maiorana, Ratmir Derda

    bioRxiv - Biochemistry

    DOI: 10.64898/2026.02.14.705946

    Abstract

    Genetically encoded (GE) libraries enable identification of high-affinity ligands for diverse molecular targets through iterative in vitro selection and DNA sequencing or next-generation sequencing (NGS). Despite their impact in therapeutic development, a systematic framework for evaluating reproducibility in GE-molecular discoveries remains limited. To aid such analysis, we introduce the concept of baseline response, which reproducibly partitions active and inactive members of in vitro selection. The baseline response is provided by spiking a random DNA-barcoded population. We calibrated the baseline concept using Bioconductor EdgeR differential enrichment (DE) analysis of NGS of phage-displayed selection on oligosaccharide chitin and hepatitis virus NS3a* protease as model targets. We further show that mixing discovery campaigns also offers an effective baseline: chitin-enriched peptides serve as a baseline for DE-analysis of NS3a* selection and NS3a*-enriched peptides serve as a baseline for chitin binders. We applied baseline-stratified DE-analysis to 66 parallel selections performed in 3–5 replicates across 22 extracellular targets, including HER1-3, EpCAM, CAIX, PD-L1, and eight integrin receptors. Automated DE-analysis across hundreds of NGS files produced hits validated in a secondary screen and yielded synthetic macrocyclic ligands with mid-nanomolar affinity confirmed in 2–3 biophysical assays. For PD-L1, we further demonstrated how baseline-calibrated NGS data provide decision-enabling information for optimization of peptide macrocycles to yield potent single-digit nanomolar ligands for the cell-surface receptor. We anticipate that baseline-based analyses of NGS data from in vitro selection procedures will offer a scalable framework for reproducible hit discovery and standardized analysis across diverse in vitro selection campaigns.

    Summary

    This work introduces a universal baseline framework for in vitro selection of genetically encoded (GE) libraries (e.g., phage-displayed peptide libraries) to improve reproducibility, statistical rigor, and cross-target comparability. The core innovation is spiking a DNA-barcoded random peptide library (serving as an in situ or “cross-target” empirical baseline) into every selection round. This baseline mimics naïve library binding behavior and enables robust normalization and differential enrichment (DE) analysis using Bioconductor EdgeR on NGS data. Validation spanned 22–24 extracellular protein targets (including HER1–3, PD-L1, integrins, NS3a*, chitin) across 66 parallel selections. Baseline-stratified DE identified high-confidence hits, including synthetic macrocyclic ligands with mid- to single-digit nM affinity confirmed by biophysical assays. The method also supports functional benchmarking (e.g., revealing reduced infectivity in MBX-modified phage libraries) and replaces synthetic or computational baselines with empirically derived, target-agnostic mixtures.

    Highlights

      • Spiked DNA-barcoded random peptides serve as composition-agnostic, in situ baselines for normalization.
      • Cross-target library mixing (e.g., chitin + NS3a*-selected peptides) yields effective empirical controls.
      • EdgeR-based DE with TMM normalization and BH-FDR correction (α = 0.05) enables quantitative FC estimation binned by input abundance.
      • Baseline depletion <1% after NS3a* selection confirms high selectivity.

    Conclusion

    The universal baseline standardizes hit discovery, improves enrichment fidelity assessment, and enables ML-ready, statistically benchmarked data generation without structural priors.
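    The highlights mention BH-FDR correction at α = 0.05. For readers unfamiliar with it, the Benjamini-Hochberg step-up procedure can be sketched in plain Python as below. This is a generic illustration of the textbook procedure, not the EdgeR code path or the authors' workflow; TMM normalization and the differential enrichment model are not reproduced, and the function name and toy p-values are assumptions made for the example.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns (reject, adjusted), both in the same order as p_values:
    reject[i] is True when hypothesis i is declared significant at level alpha,
    adjusted[i] is the BH-adjusted p-value.
    """
    m = len(p_values)
    if m == 0:
        return [], []

    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Walk from the largest p-value down, keeping a running minimum of
    # p * m / rank so that adjusted values are monotone in rank.
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min

    reject = [q <= alpha for q in adjusted]
    return reject, adjusted


if __name__ == "__main__":
    # Toy p-values for four library members, invented for illustration.
    toy = [0.01, 0.04, 0.03, 0.20]
    flags, q_values = benjamini_hochberg(toy, alpha=0.05)
    for p, q, keep in zip(toy, q_values, flags):
        print(f"p={p:.2f}  q={q:.4f}  significant={keep}")
```

    The step-up behaviour matters in practice: a p-value that misses its own rank threshold can still be called significant if a larger p-value further down the sorted list meets its threshold.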

Product & Services

OpenDEL™ - Small Molecule

Starting Your Journey to Access the Vast Chemical Space

The Kit

  • 57 Libraries
  • ~3.8Bn compounds
  • 10 DEL samples

 

To Access

  • Fully Enumerated Molecules
  • Building Block Structures
  • DNA Codon Sequences
  • Scaffolds Information

 

✔ No Structure Disclosure Fee

✔ No Compound IP License Fee

OpenDEL™ Screening

OpenDEL™ screening is carried out by our team of experienced professionals, proficient in handling over 50 different target types including protein-protein interactions, kinases, enzymes, transcription factors, and RNA targets. Our team typically completes the screening experiments within 1-2 weeks. 

OpenDEL™ Sequencing

HitGen offers a high-quality, gold-standard sequencing service that includes:
  • Global Sample Shipment

  • Outstanding Sequencing Quality

  • Lightning-speed Result Delivery

  • Diverse Sequencing Options


OpenDEL™ Hit Proposal

Analyzing DEL selection data and choosing the right compounds for follow-up requires multidisciplinary expertise spanning biology, computational science, and chemistry. This includes a deep understanding of experimental design and mechanisms of action (MOAs) in biology, data processing and analysis in computational science, and both synthetic and DEL chemistry.

OpenDEL™ Off-DNA Synthesis

HitGen Chemical Services: Innovation-Driven and Precision-Empowered.

We transform your DEL hits into tangible results by delivering the pure, complex structures critical for validating discoveries and accelerating their advancement.

Choose Your Path:

A. Traditional Chemical Synthesis @ HitGen 
B. High Throughput Chemical Synthesis @ HitGen


What are people in the community saying?

Connect with peers. Access breakthrough science. Spark your next discovery.

  • HitGen

    Congbao Kang ,  Hung T. Nguyen ,  David E. Heppner ,  Bin Yu ,  Weijun Xu

    Journal of Medicinal Chemistry

    DOI: 10.1021/acs.jmedchem.6c00070

     

    Over the past decade, RNA has emerged as an attractive and evolving landscape for small-molecule drug discovery. RNA functions as an intermediate macromolecule during gene expression and plays a diverse role in regulating cellular processes. (1) The promise of targeting RNA for therapeutic intervention arises from recent breakthroughs in structural biology, chemical biology, and computational modeling that collectively facilitate understanding of RNA biology. Multiple classes of RNA have been identified, contributing to its functional diversity, including messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), long noncoding RNA (lncRNA), microRNA (miRNA), and other noncoding RNAs (ncRNAs). While only about 1.5% of the human genome is translated into proteins, approximately 75% is transcribed into RNA, (2−4) making RNA a vastly larger reservoir of potential drug targets than the traditional protein targets. Moreover, many disease-associated pathways including viral replication, cancer progression, neurodegeneration, and immune regulation are regulated by RNA or through its interactions with different proteins, making RNA an increasingly relevant and strategic focus for next-generation drug discovery. (5)

    Several classes of small molecules define the landscape of RNA-targeting drug discovery (Figure 1), and many other modulators of RNA have been discovered for further development. (6) Small molecules targeting RNA can act through diverse mechanisms, including direct RNA binding to modulate RNA function, disruption of RNA interactions with RNA binding proteins, and stimulation of RNA binding to RNA-degrading enzymes for degradation. (7) The success in developing RNA drugs and RNA ligands proves the feasibility of RNA-focused drug discovery. (2)

    Patients with spinal muscular atrophy (SMA) lack enough survival motor neuron (SMN) protein to maintain adequate muscle function. The lack of SMN protein is believed to drive the pathophysiology of SMA.
    In August 2020, the FDA approved risdiplam (Evrysdi, RG7916, RO7034067) developed by Roche for the treatment of SMA in patients of all ages. (8) As a small-molecule modulator of SMN2 splicing, risdiplam represents the first FDA-approved RNA-targeting small-molecule drug. Mechanistically, risdiplam acts as a “glue”-like compound that enhances binding of an RNA binding protein (RBP) to RNA to facilitate splicing and increases the production of full-length functional SMN protein. In addition to risdiplam, there are other drugs or clinical candidates that directly bind to RNA, including small molecules targeting RNA G-quadruplexes such as Quarfloxin and CX-5461, ribosome-targeting antibiotics like macrolides, and viral RNA polymerase nucleosides like sofosbuvir. (9) Together, these precedents highlight the growing potential of RNA as a therapeutic target and lay the foundation for the rational discovery of small-molecule RNA modulators.

    Figure 1. Small-molecule drugs that target RNAs.

    While strong incentives have been directed to drug RNA, several fundamental challenges make RNA-targeted drug discovery inherently more intractable than conventional protein-directed approaches. (10) Although proteins can undergo complicated conformational changes and post-translational modifications, RNA is even more dynamic and flexible, adopting distinct conformations that depend on its sequence, cellular environment, and interactions with other biomolecules. (11) Said another way, the highly dynamic nature of RNA implies that RNA, like unstructured regions of proteins, may lack robust pockets for binding small molecules. Unlike proteins formed by 20 amino acids, RNA is composed of only four nucleotides, presenting challenges for specific and tight interactions with small molecules. Additional hurdles include the limited identification of clearly druggable RNA motifs, the intrinsic structural dynamics of RNA, and the difficulty in achieving selectivity across closely related RNA sequences.
Additionally, traditional protein-directed drug discovery often relies on both polar and nonpolar interactions within well-defined hydrophobic and hydrophilic pockets. By contrast, RNA features overwhelmingly negatively charged backbones and typically lacks hydrophobic pockets. As a result, the number and structural diversity of druggable RNA binding sites are expected to be limited, which further complicates efforts to achieve selective modulation. As such, the structural and functional nature of RNA intrinsically favors chemical space dominated by highly polar molecules, which leads to at least two major issues: limited selectivity due to the abundance of similarly polar features across cellular RNAs, and significant challenges in optimizing ADME properties, particularly lipophilicity. As with drug discovery against protein targets, hit identification via high-throughput screening (HTS), fragment-based screening, and virtual screening is commonly adopted. A major challenge for HTS against RNA targets is that most existing HTS libraries were designed for proteins and are poorly suited for recognizing the unique physicochemical and structural characteristics of RNA. Therefore, the development of RNA-focused libraries has become a key priority, which requires pharmacophore-based strategies that incorporate scaffolds known to interact with RNA motifs. This can be further complemented by DNA-encoded libraries (DELs), which have emerged as a powerful source of hits for RNA targets. (4) For RNA to become a more tractable and widely druggable class of biomolecules, advances are needed in generating well-behaved biochemical tools and obtaining detailed structural information on unique or transient RNA conformations. Such innovations will be essential for enabling robust discovery platforms analogous to those that have long supported successful drug development for proteins and enzymes. 
Despite the above-mentioned challenges, RNA-targeted therapeutics continue to gain momentum across oncology, virology, and neurology. Recent advances in structural biology and computation, artificial intelligence, and multiscale modeling have begun to overcome these barriers and lay a rational foundation for RNA-targeted drug discovery. Structural studies of RNA using chemical biology and biophysical methods have made progress in recent years, providing detailed insights into secondary and tertiary motifs such as hairpins, internal loops, bulges, and pseudoknots, which are considered important structural architectures for developing small molecules. NMR spectroscopy remains a powerful technique for probing RNA structure and dynamics in solution. (12) Its ability to resolve conformational equilibria, detect transient states, and characterize ligand binding has made it indispensable for understanding RNA functional motions and identifying druggable conformations. X-ray crystallography, usually hindered by the intrinsic flexibility of RNA, has become increasingly feasible due to improved construct design, the presence of stabilizing ligands, and the use of RBPs to facilitate crystallization. These innovations have enabled high-resolution structures of diverse RNA motifs and RNA–ligand complexes to be solved. (13) Cryo-EM has transformed the field through efficient structure determination of large RNA molecules and RNA–protein complexes that were previously inaccessible. (14) Complementing experimental approaches, computational methods have evolved rapidly and have played a critical role in RNA structural biology and drug discovery. (15) Several algorithms and servers have been developed to predict RNA secondary and tertiary structures, model conformational ensembles, identify binding pockets, and screen compound libraries against RNA targets. 
(16) Machine-learning-based methods, supported by continuously expanding high-quality structural data, have improved accuracy in predicting RNA structures, RNA–RBP interactions, and RNA–ligand interactions. Collectively, these methods provide a robust foundation for rational discovery of RNA modulators. Advances in physics-based molecular dynamics (MD) simulations are now mapping RNA conformational landscapes to reveal hidden and metastable pockets. Enhanced sampling MD, (17) coarse-grained (CG) modeling, (18) and Markov state models (MSMs) (19) allow the identification of transient states that are invisible to standard spectroscopy and experimental structure determination. (20) Coupled with ensemble-based docking against multiple conformers, these methods may improve hit rates compared with static docking against RNA targets. Furthermore, the integration of hybrid QM/MM and machine-learning (ML) corrected scoring functions is providing a more rigorous treatment of stacking interactions, hydration shells, and ion-mediated electrostatics, which are critical for accurate RNA–ligand affinity prediction. These methods are increasingly used to derive druggability maps for riboswitches, repeat RNAs, viral elements, and structured motifs within long noncoding RNAs. (21) On the other hand, AI-driven prediction and generative design for RNA-targeted chemistry are on the rise. However, a central barrier is the limited availability of high-resolution RNA–ligand structures, which constrains traditional structure-based design. Emerging AI approaches are addressing this gap through multimodal learning frameworks that integrate RNA sequence, chemical features, SHAPE/DMS reactivity, and evolutionary covariation. Deep learning models improve secondary and tertiary structure inference, enabling more accurate identification of ligandable motifs and conformational states relevant for binding. 
(22) Recently, prediction of small-molecule–RNA interactions was achieved without the need for RNA tertiary structures as input. (23) However, the broader utility of such models in RNA-focused medicinal chemistry remains to be fully established and will likely require larger data sets and more rigorous experimental validation. Moreover, many disease-relevant RNAs function through multivalent interactions with RBPs. Small molecules that restore normal RNA–RBP equilibrium have demonstrated proof-of-principle activity in correcting splicing, transcriptional regulation, and translation. (24) It remains to be seen whether disrupting RNA–protein binding interactions would bring clinical benefits. As data sets from CLIP–seq, RNP–MaP, and ligand–RNA cross-linking expand, predictive modeling of RBP selectivity will continue to improve. Last but not least, a rapidly growing therapeutic direction targets ribonucleoprotein condensates, dynamic assemblies whose physicochemical properties (viscoelasticity, aging kinetics, etc.) encode key regulatory functions. (25) Aberrant condensate behavior is implicated in many diseases, including neurodegenerative diseases and cancer. (25) Small molecules that alter condensate properties, such as softening, hardening, dissolving, or preventing gelation, represent a new modality of RNA-focused therapeutics. (26) Computational modeling is expected to play a central role in capturing the mesoscale organization and emergent behaviors of ribonucleoprotein condensates. These approaches collectively support rational design of condensate-modulating drugs, a modality fundamentally different from classical lock-and-key pharmacology. Overall, a deep understanding of RNA targets, together with innovative screening and validation as a collaborative effort between chemical biology and medicinal chemistry, is key to the success of RNA-targeted drug discovery campaigns. 
With techniques in structural biology and artificial intelligence advancing rapidly, we may soon see whether small-molecule drug discovery targeting RNA will reach its inflection point or remain a niche curiosity in the years to come.

This article references 26 other publications.

  • HitGen

    Xianfeng Li ,  Zehao Yin ,  Qiuyi Chen ,  Xinlong Hu ,  Gong Zhang ,  Xiaohong Fan ,  Yizhou Li

    Organic Letters 

    DOI: 10.1021/acs.orglett.6c00490

    Abstract

    The β-ketoamide motif represents both a privileged scaffold and a versatile synthetic intermediate in medicinal chemistry. Herein, we developed a DNA-compatible method for the efficient conversion of various DNA-conjugated amines into β-ketoamides. The resulting β-ketoamides facilitate rapid diversification into a panel of structurally diverse molecular scaffolds. Importantly, the synthetic route and subsequent derivatization steps were validated to be fully compatible with DNA encoding, offering a reliable and versatile platform for DNA-encoded library synthesis.

     

  • HitGen

    Yulong An ,  Ruolan Zhou ,  Xiang Li

    ACS Medicinal Chemistry Letters 

    DOI: 10.1021/acsmedchemlett.5c00738

    Abstract

    DNA-encoded library (DEL) technology has emerged as a transformative platform for discovering chemical inducers of proximity (CIPs), addressing challenges in both degrader and non-degrader CIP development. This Microperspective analyzes the results of recent DEL technology screens (2021–2025) to enable medicinal chemistry programs, focusing on CIP development including CIP-focused DELs, DEL-derived ligands for proteins of interest (POIs) and E3 ligase in rational CIP design, and directly functional CIP identification. Finally, we address current limitations of DEL technology in CIP research and outline future directions. This Microperspective underscores DEL’s pivotal role in advancing CIP discovery, providing actionable insights for addressing “undruggable” targets and accelerating translational research in chemical biology and medicinal chemistry.

    Summary

    This MicroPerspective reviews recent advances (2021–2025) in DNA-encoded library (DEL) technology for discovering chemical inducers of proximity (CIPs), spanning both degraders (e.g., PROTACs, molecular glue degraders) and non-degrader modalities (e.g., protein stabilization, subcellular relocalization, transcriptional activation). It synthesizes three key strategies: (1) CIP-focused DELs (CIP-DELs), enabling simultaneous dual-target (POI + E3 ligase) selection to directly identify cooperatively binding bifunctional compounds; (2) Conversion of DEL-derived POI/E3 ligands—leveraging well-defined DNA attachment sites as “exit vectors”—into functional CIPs; and (3) Discovery of non-degradative CIPs, including FKBP12-recruiting molecular glues and function-driven DEL screening (e.g., direct ubiquitination readout). DEL overcomes longstanding limitations of traditional HTS—including library size, cost, and scarcity of E3 ligands—thereby accelerating CIP development against “undruggable” targets.

    Highlights

    • Dual-Target CIP-DEL Screening: CRBN- or VHL-targeted DELs enable concurrent selection against POIs and E3 ligases, directly identifying ternary complex stabilizers with high cooperativity (e.g., BRD4/BRD2-selective PROTACs, BRD9 molecular glue).
    • Ligand-to-CIP Conversion Paradigm: DEL-derived ligands for ERα, MAGE-A3, PIN1, DNPH1, and TRIM21 were optimized and converted into functional PROTACs or TrimTACs; the DNA attachment site serves as a built-in, precise “exit vector” for linker conjugation.
    • Expansion to Non-Degradative Functions: An FKBP12-biased CIP-DEL identified a molecular glue that stabilizes the Crohn’s disease-associated ATG16L1 T300A variant; function-driven DEL screening (in presence of E1/E2/ATP) directly enriches ubiquitination-competent PROTACs, eliminating affinity-only false positives.
    • New Frontier: RNA Targets: DEL screening against RNase L led to the design of RiboTACs targeting pre-miR-21—extending CIP therapeutics to RNA biology.

    Conclusion

    DEL technology has evolved from a single-target ligand discovery platform into a central engine driving the discovery of the full spectrum of CIPs—from degraders to non-degraders, and from proteins to RNA. Its core advantages lie in vast chemical space coverage, barcode-enabled precise hit identification, and intrinsic structural information (e.g., defined exit vectors). Future directions include integrating AI for POI–E3 interface–guided library design, developing robust in-cell DEL screening, expanding the repertoire of E3 ligase ligands, and strengthening functional phenotypic and preclinical translational studies—to fully unlock the therapeutic potential of CIPs against “undruggable” targets.

  • HitGen

    Qingao Xue ,  Ze Liang ,  Yi Zhang ,  Fei Wang ,  Fulian Wang ,  Lili Liu ,  Guang Yang ,  Lei Yan

    Bioorganic Chemistry

    DOI: 10.1016/j.bioorg.2026.109627

    Abstract

    The membrane fusion process mediated by the SARS-CoV-2 spike protein is a key therapeutic target. Its heptad repeat 1 (HR1) domain forms a conserved trimeric groove critical for forming the fusion-competent six-helix bundle with HR2. We used DNA-encoded library screening to identify small molecules binding HR1. Hits including Rabeprazole-related compound E (Rab RCE), Omeprazole, Alvimopan, and Olmesartan were characterized. Biophysical assays confirmed binding, while computational simulations revealed distinct interaction modes, with Alvimopan showing high predicted affinity. Cell-cell fusion assays demonstrated potent inhibitory activity for Olmesartan and Rab RCE. Notably, Rabeprazole and Rab RCE showed partial antiviral activity against SARS-CoV-2 variants and HCoV-OC43, rescuing virus-induced apoptosis. Mechanistically, Rabeprazole competitively occupies the HR2-binding groove on HR1, blocking fusion. Our findings identify HR1-targeting molecules like Rabeprazole as promising leads for broad-spectrum coronaviral fusion inhibitors.

    Highlights

    • A DNA-encoded library (DEL) screening strategy was established to rapidly identify small-molecule binders targeting the conserved heptad repeat 1 (HR1) domain of the SARS-CoV-2 spike protein, enabling efficient mining of repurposed drug candidates from a ~4 billion-compound chemical space.
    • Four clinically approved drugs (alvimopan, olmesartan, rabeprazole sulfide, and omeprazole) were validated as HR1-targeting agents, sharing biaryl/heteroaryl cores and hydrogen-bond acceptor groups that mediate specific interactions with HR1 (binding affinities ranging from micromolar to millimolar).
    • Two distinct inhibitory mechanisms were delineated: classical competitive occupancy of the HR1 hydrophobic groove (olmesartan, rabeprazole sulfide) and a novel ‘molecular wedge’ mode disrupting the trimeric HR1 interface (alvimopan), providing complementary strategies for targeting viral fusion.
    • Olmesartan and rabeprazole sulfide exhibited potent inhibition of SARS-CoV-2-mediated cell-cell fusion, with efficacy comparable to the positive control Sal-C, validating their potential as lead compounds for anti-COVID-19 therapeutics.
    • This study establishes a robust pipeline integrating DEL screening, biophysical validation, molecular docking, and functional assays, offering valuable chemical scaffolds and mechanistic insights for developing broad-spectrum coronaviral fusion inhibitors.

     

  • HitGen

    Kejia Yan ,  Guilherme M. Lima ,  Tara Bahadur ,  Vincent Albert ,  Zoe O’Gara ,  Gary Bao ,  Christin Kossmann ,  William Kirby ,  Fernando B. Mejia ,  Matthew L. Michnik ,  Kristen Maiorana ,  Ratmir Derda

    bioRxiv - Biochemistry

    DOI: 10.64898/2026.02.14.705946

    Abstract

    Genetically encoded (GE) libraries enable identification of high-affinity ligands for diverse molecular targets through iterative in vitro selection and DNA sequencing or next-generation sequencing (NGS). Despite their impact in therapeutic development, a systematic framework for evaluating reproducibility in GE-molecular discoveries remains limited. To aid such analysis, we introduce the concept of baseline response, which reproducibly partitions active and inactive members of in vitro selection. The baseline response is provided by spiking a random DNA-barcoded population. We calibrated the baseline concept using Bioconductor EdgeR differential enrichment (DE) analysis of NGS of phage-displayed selection on oligosaccharide chitin and hepatitis virus NS3a* protease as model targets. We further show that mixing discovery campaigns also offers an effective baseline: chitin-enriched peptides serve as a baseline for DE-analysis of NS3a* selection and NS3a*-enriched peptides serve as a baseline for chitin binders. We applied baseline-stratified DE-analysis to 66 parallel selections performed in 3–5 replicates across 22 extracellular targets, including HER1-3, EpCAM, CAIX, PD-L1, and eight integrin receptors. Automated DE-analysis across hundreds of NGS files produced hits validated in a secondary screen and yielded synthetic macrocyclic ligands with mid-nanomolar affinity confirmed in 2–3 biophysical assays. For PD-L1, we further demonstrated how baseline-calibrated NGS data provide decision-enabling information for optimization of peptide macrocycles to yield potent single-digit nanomolar ligands for the cell-surface receptor. We anticipate that baseline-based analyses of NGS data from in vitro selection procedures will offer a scalable framework for reproducible hit discovery and standardized analysis across diverse in vitro selection campaigns.

    Summary

    This work introduces a universal baseline framework for in vitro selection of genetically encoded (GE) libraries—e.g., phage-displayed peptide libraries—to improve reproducibility, statistical rigor, and cross-target comparability. The core innovation is spiking a DNA-barcoded random peptide library (serving as an in situ or “cross-target” empirical baseline) into every selection round. This baseline mimics naïve library binding behavior and enables robust normalization and differential enrichment (DE) analysis using Bioconductor EdgeR on NGS data. Validation spanned 22–24 extracellular protein targets (including HER1–3, PD-L1, integrins, NS3a*, chitin) across 66 parallel selections. Baseline-stratified DE identified high-confidence hits, including synthetic macrocyclic ligands with mid- to single-digit nM affinity confirmed by biophysical assays. The method also supports functional benchmarking—e.g., revealing reduced infectivity in MBX-modified phage libraries—and replaces synthetic or computational baselines with empirically derived, target-agnostic mixtures.

    Highlights

    • Spiked DNA-barcoded random peptides serve as composition-agnostic, in situ baselines for normalization.
    • Cross-target library mixing (e.g., chitin + NS3a*-selected peptides) yields effective empirical controls.
    • EdgeR-based DE with TMM normalization and BH-FDR correction (α = 0.05) enables quantitative FC estimation binned by input abundance.
    • Baseline depletion <1% after NS3a* selection confirms high selectivity.
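
    The BH-FDR step named in the highlights above can be illustrated with a minimal sketch of the Benjamini-Hochberg procedure. This is not the edgeR implementation (edgeR is an R/Bioconductor package) and not the authors' pipeline; the function name and the p-values are hypothetical and only show how adjusted p-values are compared against α = 0.05.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, reject flags), both in the input order.

    Plain-Python sketch of the Benjamini-Hochberg step-up procedure;
    it is an illustration, not the edgeR code path.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    reject = [q <= alpha for q in adjusted]
    return adjusted, reject


# Hypothetical per-sequence p-values from a differential enrichment test.
example_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p, is_hit = benjamini_hochberg(example_p, alpha=0.05)
print(adjusted_p, is_hit)
```

    In this toy input only the first sequence passes the 5% FDR threshold, even though three raw p-values are below 0.05, which is the kind of filtering the correction is meant to provide.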

    Conclusion

    The universal baseline standardizes hit discovery, improves enrichment fidelity assessment, and enables ML-ready, statistically benchmarked data generation without structural priors.

  • HitGen

    Yoojin Kim ,  Hayeon Kim ,  Jinhui Hong ,  Minseo Kang ,  Jaeyoung Bae ,  Sangyoon Ko ,  Minjae Kim ,  Byumseok Koh ,  Hakjoong Kim ,  Sang-Hee Shim ,  Kyubong Jo

    bioRxiv - Bioengineering 

    DOI: 10.64898/2026.02.15.706034

    Abstract

    DNA-encoded library (DEL) technology enables high-throughput small-molecule discovery but is typically performed using purified proteins under in vitro conditions that do not reflect native intracellular environments. Here, we present a microfluidic agarose microdroplet platform for cellular-context DEL screening. The porous hydrogel droplets provide mechanically stable yet permeable microenvironments that protect weak protein-ligand interactions while enabling efficient buffer exchange and ligand diffusion. Importantly, mild cell permeabilization within droplets selectively retained chromatin-associated proteins, allowing screening directly in a cellular context. Using BRD4 as a model target, we validated intracellular ligand engagement by fluorescence imaging and super-resolution microscopy. Small-scale DEL screening selectively enriched JQ1 in both bead-based and cell-based formats, and large-scale DEL screening across millions of encoded compounds successfully identified hit molecules by sequencing. This agarose microdroplet based strategy expands DEL technology toward biologically relevant and chromatin-associated targets under near-native conditions.

    Summary

    This work introduces a microfluidic agarose µ-droplet platform for DNA-encoded chemical library (DECL/DEL) screening against intracellular targets, with validation on BRD4—a nuclear epigenetic reader protein. The system encapsulates single cells or target-coated magnetic beads in monodisperse ~100 µm agarose droplets via flow-focusing microfluidics; the low-gelling-temperature (LGT) agarose forms a porous hydrogel upon cooling, permitting rapid diffusion of DNA-encoded small molecules while preserving intracellular architecture and protein–bead complexes. Permeabilization enables controlled probe access, and super-resolution Exchange-PAINT imaging confirms nanoscale colocalization of JQ1-BP with GFP-BRD4 in nuclear nanoclusters.

    Highlights

    • Agarose µ-droplets enable gravity-assisted washing, shear protection, and uniform molecular diffusion.
    • Two-color Exchange-PAINT with orthogonal R2/R6 docking strands validates specific intracellular target engagement.
    • DEL screening yields target-specific enrichment: JQ1 barcode is selectively enriched in BRD4-overexpressing HeLa droplets.
    • Scalable to large DELs (96×96×96 combinatorial space across three scaffolds) with nanopore sequencing–based enrichment quantification.
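
    As a rough check on the library size quoted above, the sketch below multiplies out a 96×96×96 combinatorial space. Reading "across three scaffolds" as three independent sub-libraries of that size is an assumption made here for illustration; the helper name is hypothetical.

```python
from math import prod


def library_size(building_blocks_per_cycle, scaffolds=1):
    """Number of encoded compounds for a split-and-pool style library.

    building_blocks_per_cycle: building-block counts, one per synthesis cycle.
    scaffolds: number of sub-libraries sharing that layout (assumed here).
    """
    return prod(building_blocks_per_cycle) * scaffolds


per_scaffold = library_size([96, 96, 96])            # 96^3 = 884,736
total = library_size([96, 96, 96], scaffolds=3)      # 2,654,208 under the assumption
print(per_scaffold, total)
```

    Under that reading the total is about 2.65 million compounds, which is consistent with the abstract's "millions of encoded compounds".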

    Conclusion

    The platform bridges functional intracellular DEL screening with high spatial fidelity and quantitative readouts—enabling both PCR- and sequencing-based hit identification while preserving native biomolecular context.

Messages and Feedback

By submitting your information, you acknowledge having received, read and understood our Privacy Notice as made available above.

logo
logo