The drug development pipeline is a costly and lengthy process. Identifying high-quality “hit” compounds—those with high potency, selectivity, and favorable metabolic properties—at the earliest stages is important for reducing cost and accelerating the path to clinical trials. For the last decade, scientists have looked to machine learning to make this initial screening process more efficient.
Computer-aided drug design is used to computationally screen for compounds that interact with a target protein. However, estimating the strength of these interactions both accurately and rapidly remains a challenge.
“Machine learning promised to bridge the gap between the accuracy of gold-standard, physics-based computational methods and the speed of simpler empirical scoring functions,” said Dr. Benjamin P. Brown, an assistant professor of pharmacology at the Vanderbilt University School of Medicine Basic Sciences.
“Unfortunately, its potential has so far been unrealized because current ML methods can unpredictably fail when they encounter chemical structures that they were not exposed to during their training, which limits their usefulness for real-world drug discovery.”
Brown is the sole author of a Proceedings of the National Academy of Sciences paper, “A generalizable deep learning framework for structure-based protein-ligand affinity ranking,” that addresses this “generalizability gap.”
In the paper, Brown proposes a targeted approach. Instead of learning from the entire 3D structure of a protein and a drug molecule, his task-specific model architecture is intentionally restricted to learn only from a representation of their interaction space, which captures the distance-dependent physicochemical interactions between atom pairs.
“By constraining the model to this view, it is forced to learn the transferable principles of molecular binding rather than structural shortcuts present in the training data that fail to generalize to new molecules,” Brown said.
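To make the idea of an interaction-space representation concrete, here is a minimal sketch. It is an illustration only, not the architecture from the paper: the atom-type labels, the distance bin edges, and the function names are all hypothetical. It reduces a protein-ligand complex to counts of atom pairs grouped by type and distance, so the features carry no information about the overall structure of either molecule.

```python
import math
from collections import Counter

# Hypothetical distance bin upper edges in angstroms; not taken from the paper.
BIN_EDGES = (2.0, 4.0, 6.0, 8.0)


def distance_bin(d):
    """Return the index of the first bin whose upper edge exceeds d.

    Returns None when d is at or beyond the last edge (the cutoff).
    """
    for i, edge in enumerate(BIN_EDGES):
        if d < edge:
            return i
    return None


def interaction_features(protein_atoms, ligand_atoms):
    """Count protein-ligand atom pairs by (protein type, ligand type, bin).

    Each atom is a (type_label, (x, y, z)) tuple. The result depends only on
    pairwise atom types and distances, not on absolute coordinates.
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            b = distance_bin(math.dist(p_xyz, l_xyz))
            if b is not None:
                counts[(p_type, l_type, b)] += 1
    return counts


# Toy complex with made-up atom types and coordinates.
protein = [("N_donor", (0.0, 0.0, 0.0)), ("C_hydrophobic", (5.0, 0.0, 0.0))]
ligand = [("O_acceptor", (3.0, 0.0, 0.0)), ("C_hydrophobic", (5.0, 3.0, 0.0))]
features = interaction_features(protein, ligand)
```

Because only pair types and distances enter the features, translating or rotating the whole complex leaves them unchanged. A model trained on such features cannot memorize a particular protein fold or ligand scaffold directly, which is the intuition behind the constraint Brown describes.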
A key aspect of Brown’s work was the rigorous evaluation protocol he developed. “We set up our training and testing runs to simulate a real-world scenario: If a novel protein family were discovered tomorrow, would our model be able to make effective predictions for it?” he said.
To do this, he left out entire protein superfamilies and all their associated chemical data from the training set, creating a challenging and realistic test of the model’s ability to generalize.
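The hold-out protocol described above can be sketched as a grouped split. This is a simplified illustration under assumed data: the record layout, the superfamily labels, and the function name are hypothetical, and the paper's actual pipeline may differ.

```python
def leave_superfamily_out(records, held_out):
    """Split records so that held-out superfamilies never appear in training.

    records: list of dicts with a "superfamily" key (hypothetical layout).
    held_out: set of superfamily labels reserved for testing.
    """
    train = [r for r in records if r["superfamily"] not in held_out]
    test = [r for r in records if r["superfamily"] in held_out]
    return train, test


# Toy dataset with placeholder superfamily labels.
records = [
    {"complex_id": "c1", "superfamily": "sf_A"},
    {"complex_id": "c2", "superfamily": "sf_A"},
    {"complex_id": "c3", "superfamily": "sf_B"},
    {"complex_id": "c4", "superfamily": "sf_C"},
    {"complex_id": "c5", "superfamily": "sf_C"},
]
train, test = leave_superfamily_out(records, held_out={"sf_C"})
```

Unlike a random split, in which close relatives of a test protein can leak into the training set, this split ensures that every test complex comes from a protein superfamily the model has never seen.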
Brown’s work provides several key insights for the field:
- Task-specific architectures provide a clear avenue for building generalizable models using today’s publicly available datasets. A model designed with a specific “inductive bias,” one that forces it to learn from a representation of molecular interactions rather than from raw chemical structures, generalizes more effectively.
- Rigorous, realistic benchmarks are critical. The paper’s validation protocol revealed that contemporary ML models performing well on standard benchmarks can show a significant drop in performance when faced with novel protein families. This highlights the need for more stringent evaluation practices in the field to accurately gauge real-world utility.
- Current performance gains over conventional scoring functions are modest, but the work establishes a clear, reliable baseline for a modeling strategy that doesn’t fail unpredictably, which is a critical step toward building trustworthy AI for drug discovery.
Brown, a core faculty member of the Center for AI in Protein Dynamics, knows that there is more work to be done. His current project focused exclusively on scoring—ranking compounds based on the strength of their interaction with the target protein—which is only part of the structure-based drug discovery equation.
“My lab is fundamentally interested in modeling challenges related to scalability and generalizability in molecular simulation and computer-aided drug design. Hopefully, soon we can share some additional work that aims to advance these principles,” Brown said.
For now, significant challenges remain, but Brown’s work on building a more dependable approach for machine learning in structure-based computer-aided drug design has clarified the path forward.
More information: Benjamin P. Brown, A generalizable deep learning framework for structure-based protein–ligand affinity ranking, Proceedings of the National Academy of Sciences (2025). doi.org/10.1073/pnas.2508998122
Journal information: Proceedings of the National Academy of Sciences
Provided by Vanderbilt University