The drug development pipeline is a costly and lengthy process. Identifying high-quality “hit” compounds—those with high potency, selectivity, and favorable metabolic properties—at the earliest stages is important for reducing cost and accelerating the path to clinical trials. For the last decade, scientists have looked to machine learning to make this initial screening process more efficient.
Computer-aided drug design is used to computationally screen for compounds that interact with a target protein. However, the ability to accurately and rapidly estimate the strength of these interactions remains a challenge.
“Machine learning promised to bridge the gap between the accuracy of gold-standard, physics-based computational methods and the speed of simpler empirical scoring functions,” said Dr. Benjamin P. Brown, an assistant professor of pharmacology at the Vanderbilt University School of Medicine Basic Sciences.
“Unfortunately, its potential has so far been unrealized because current ML methods can unpredictably fail when they encounter chemical structures that they were not exposed to during their training, which limits their usefulness for real-world drug discovery.”
Brown is the sole author of a Proceedings of the National Academy of Sciences paper titled “A generalizable deep learning framework for structure-based protein-ligand affinity ranking,” which addresses this “generalizability gap.”
In the paper, he proposes a targeted approach. Instead of learning from the entire 3D structure of a protein and a drug molecule, his task-specific model architecture is intentionally restricted to learn only from a representation of their interaction space, which captures the distance-dependent physicochemical interactions between atom pairs.
“By constraining the model to this view, it is forced to learn the transferable principles of molecular binding rather than structural shortcuts present in the training data that fail to generalize to new molecules,” Brown said.
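To make the idea of an interaction-space representation concrete, here is a minimal Python sketch. It is an illustration only, not the featurization used in the paper: the atom type labels, the bin edges, and the function names are all assumptions. It counts protein-ligand atom pairs by atom types and distance bin, so a downstream model would see only pairwise interactions and never the raw structure of either molecule.

```python
import math
from collections import Counter

# Hypothetical distance bin edges in angstroms (an assumption, not from the paper).
BIN_EDGES = (0.0, 2.0, 4.0, 6.0, 8.0)


def distance_bin(d, edges=BIN_EDGES):
    """Return the index of the bin containing d, or None if d falls outside all bins."""
    for i in range(len(edges) - 1):
        if edges[i] <= d < edges[i + 1]:
            return i
    return None


def interaction_features(protein_atoms, ligand_atoms):
    """Count protein-ligand atom pairs by (protein type, ligand type, distance bin).

    Each atom is a (type_label, (x, y, z)) tuple. The type labels are
    illustrative placeholders such as "donor" or "hydrophobic".
    """
    counts = Counter()
    for p_type, p_xyz in protein_atoms:
        for l_type, l_xyz in ligand_atoms:
            b = distance_bin(math.dist(p_xyz, l_xyz))
            if b is not None:  # pairs beyond the last bin edge are ignored
                counts[(p_type, l_type, b)] += 1
    return counts


# Toy example with made-up coordinates.
protein = [("donor", (0.0, 0.0, 0.0)), ("hydrophobic", (5.0, 0.0, 0.0))]
ligand = [("acceptor", (3.0, 0.0, 0.0))]
features = interaction_features(protein, ligand)
```

Because the features depend only on pair types and distances, two very different protein-ligand complexes with similar contact patterns would map to similar inputs, which is the kind of constraint the article describes.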
A key aspect of Brown’s work was the rigorous evaluation protocol he developed. “We set up our training and testing runs to simulate a real-world scenario: If a novel protein family were discovered tomorrow, would our model be able to make effective predictions for it?” he said.
To do this, he left out entire protein superfamilies and all their associated chemical data from the training set, creating a challenging and realistic test of the model’s ability to generalize.
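A leave-superfamily-out split of this kind might be sketched as follows. This is a simplified illustration under stated assumptions: the record layout, the `superfamily` key, and the example labels are hypothetical, and the paper's actual protocol may differ in its details.

```python
def leave_superfamily_out(complexes, held_out):
    """Split complexes so that no held-out superfamily appears in the training set.

    `complexes` is a list of dicts, each with a "superfamily" key (hypothetical
    layout). Every complex whose protein belongs to a held-out superfamily goes
    to the test set, together with its ligand data.
    """
    held_out = set(held_out)
    train = [c for c in complexes if c["superfamily"] not in held_out]
    test = [c for c in complexes if c["superfamily"] in held_out]
    return train, test


# Toy records with made-up identifiers and superfamily labels.
complexes = [
    {"id": "c1", "superfamily": "kinase-like"},
    {"id": "c2", "superfamily": "kinase-like"},
    {"id": "c3", "superfamily": "protease-like"},
    {"id": "c4", "superfamily": "nuclear-receptor-like"},
]
train, test = leave_superfamily_out(complexes, held_out=["kinase-like"])
```

Splitting by superfamily rather than by individual complex prevents close relatives of a test protein from appearing in the training data, which a random split would allow.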
Brown’s work provides several key insights for the field:
- Task-specific, specialized architectures provide a clear avenue for building generalizable models using today’s publicly available datasets. A model designed with a specific “inductive bias,” one that forces it to learn from a representation of molecular interactions rather than from raw chemical structures, generalizes more effectively.
- Rigorous, realistic benchmarks are critical. The paper’s validation protocol revealed that contemporary ML models performing well on standard benchmarks can show a significant drop in performance when faced with novel protein families. This highlights the need for more stringent evaluation practices in the field to accurately gauge real-world utility.
- Current performance gains over conventional scoring functions are modest, but the work establishes a clear, reliable baseline for a modeling strategy that doesn’t fail unpredictably, which is a critical step toward building trustworthy AI for drug discovery.
Brown, a core faculty member of the Center for AI in Protein Dynamics, knows that there is more work to be done. His current project focused exclusively on scoring—ranking compounds based on the strength of their interaction with the target protein—which is only part of the structure-based drug discovery equation.
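Since scoring here means ranking, one common way to judge a scoring model is a rank correlation between predicted and measured affinities. The sketch below implements Spearman's rank correlation in plain Python. It is offered as a generic illustration; the article does not state which metric the paper reports, and the function names are assumptions.

```python
import math


def average_ranks(values):
    """Return 1-based ranks for values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(predicted, measured):
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    rp, rm = average_ranks(predicted), average_ranks(measured)
    n = len(rp)
    mean_p, mean_m = sum(rp) / n, sum(rm) / n
    cov = sum((a - mean_p) * (b - mean_m) for a, b in zip(rp, rm))
    var_p = sum((a - mean_p) ** 2 for a in rp)
    var_m = sum((b - mean_m) ** 2 for b in rm)
    if var_p == 0 or var_m == 0:
        raise ValueError("correlation is undefined when all values are tied")
    return cov / math.sqrt(var_p * var_m)
```

A value near 1 means the model orders compounds almost exactly as experiment does, even if the predicted numbers themselves are off, which is what matters when choosing which compounds to test first.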
“My lab is fundamentally interested in modeling challenges related to scalability and generalizability in molecular simulation and computer-aided drug design. Hopefully, soon we can share some additional work that aims to advance these principles,” Brown said.
For now, significant challenges remain, but Brown’s work on building a more dependable approach for machine learning in structure-based computer-aided drug design has clarified the path forward.
More information: Benjamin P. Brown, A generalizable deep learning framework for structure-based protein–ligand affinity ranking, Proceedings of the National Academy of Sciences (2025). doi.org/10.1073/pnas.2508998122
Journal information: Proceedings of the National Academy of Sciences
Provided by Vanderbilt University