A researcher has just finished writing a scientific paper. She knows her work could benefit from another perspective. Did she overlook something? Or perhaps there’s an application of her research she hadn’t thought of. A second set of eyes would be great, but even the friendliest of collaborators might not be able to spare the time to read all the required background publications to catch up.
Rapid advances in artificial intelligence (AI) and machine learning (ML) have given rise to programs that can generate creative text and useful software code. These general-purpose chatbots have recently captured the public imagination. However, existing chatbots, which are based on large, diverse language models, lack detailed knowledge of scientific sub-domains.
By leveraging a document-retrieval method, a chatbot built by Kevin Yager of the Center for Functional Nanomaterials (CFN) at Brookhaven Lab is knowledgeable in areas of nanomaterial science that other bots are not. The details of this project, and how other scientists can use this AI colleague for their own work, have recently been published in Digital Discovery.
Rise of the robots
“CFN has been looking into new ways to leverage AI/ML to accelerate nanomaterial discovery for a long time. Currently, it’s helping us quickly identify, catalog, and choose samples, automate experiments, control equipment, and discover new materials. Esther Tsai, a scientist in the electronic nanomaterials group at CFN, is developing an AI companion to help speed up materials research experiments at the National Synchrotron Light Source II (NSLS-II).” NSLS-II is another DOE Office of Science User Facility at Brookhaven Lab.
At CFN, there has been a lot of work on AI/ML that can help drive experiments through the use of automation, controls, robotics, and analysis, but having a program that was adept with scientific text was something that researchers hadn’t explored as deeply. Being able to quickly document, understand, and convey information about an experiment can help in a number of ways—from breaking down language barriers to saving time by summarizing larger pieces of work.

Watching your language
Building a specialized chatbot required domain-specific text: language taken from the areas the bot is intended to focus on. In this case, the text is scientific publications. Domain-specific text helps the AI model understand new terminology and definitions and introduces it to frontier scientific concepts. Most importantly, this curated set of documents enables the AI model to ground its reasoning in trusted facts.
To emulate natural human language, AI models are trained on existing text, enabling them to learn the structure of language, memorize various facts, and develop a primitive sort of reasoning. Rather than laboriously retrain the AI model on nanoscience text, Yager gave it the ability to look up relevant information in a curated set of publications. Providing it with a library of relevant data was only half of the battle. To use this text accurately and effectively, the bot would need a way to decipher the correct context.
“A challenge that’s common with language models is that sometimes they ‘hallucinate’ plausible-sounding but untrue things,” explained Yager. “This has been a core issue to resolve for a chatbot used in research as opposed to one doing something like writing poetry. We don’t want it to fabricate facts or citations. This needed to be addressed. The solution for this was something we call ‘embedding,’ a way of categorizing and linking information quickly behind the scenes.”
Embedding is a process that transforms words and phrases into numerical values. The resulting “embedding vector” quantifies the meaning of the text. When a user asks the chatbot a question, the question is also sent to the ML embedding model, which calculates its vector. This vector is used to search a pre-computed database of text chunks from scientific papers that were embedded in the same way. The bot then uses the text snippets it finds that are semantically related to the question to get a more complete understanding of the context.
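The lookup described above can be sketched in a few lines of Python. This is a minimal illustration, not the code from the CFN project: the `embed` function here is a toy word-count vector standing in for an ML embedding model, so it only captures word overlap rather than meaning, and the names `embed`, `cosine`, `retrieve`, `CHUNKS`, and `INDEX`, as well as the sample text chunks, are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a unit-length word-count vector.

    Assumption: a real system would call an ML embedding model here.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(words)
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {w: c / norm for w, c in counts.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two unit-length sparse vectors (a dot product)."""
    return sum(x * v.get(w, 0.0) for w, x in u.items())


# Hypothetical text chunks standing in for passages from scientific papers.
CHUNKS = [
    "Block copolymers self-assemble into ordered nanoscale patterns.",
    "X-ray scattering measures the ordering of nanoscale structures.",
    "The cafeteria menu changes every week.",
]

# Pre-computed database: each chunk is stored alongside its vector.
INDEX = [(chunk, embed(chunk)) for chunk in CHUNKS]


def retrieve(question, k=2):
    """Return the k chunks whose vectors are most similar to the question's."""
    q = embed(question)
    ranked = sorted(INDEX, key=lambda item: cosine(q, item[1]), reverse=True)
    return [chunk for chunk, _ in ranked[:k]]


if __name__ == "__main__":
    for snippet in retrieve("How do block copolymers self-assemble?", k=1):
        print(snippet)
```

The chunk vectors are computed once, ahead of time, so answering a question only requires embedding the question itself and comparing it against the stored vectors.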
The user’s query and the text snippets are combined into a “prompt” that is sent to a large language model, an expansive program that creates text modeled on natural human language. The model then generates the final response. The embedding step ensures that the text being pulled is relevant in the context of the user’s question. By providing text chunks from the body of trusted documents, the chatbot generates answers that are factual and sourced.
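As a rough sketch of the prompt-assembly step, the function below joins retrieved snippets and a question into a single string. The function name `build_prompt`, the instruction wording, and the numbering scheme are assumptions made for illustration and are not taken from the published software. The call to the language model itself is omitted.

```python
def build_prompt(question, snippets):
    """Combine retrieved snippets and the user's question into one prompt.

    Each snippet is numbered so that an answer can point back to its source.
    The instruction text is a hypothetical example, not the project's own.
    """
    context = "\n".join(f"[{i}] {s}" for i, s in enumerate(snippets, start=1))
    return (
        "Answer the question using only the numbered excerpts below, "
        "and cite the excerpts you rely on.\n\n"
        f"Excerpts:\n{context}\n\n"
        f"Question: {question}\n"
    )


if __name__ == "__main__":
    example = build_prompt(
        "How do block copolymers self-assemble?",
        ["Block copolymers self-assemble into ordered nanoscale patterns."],
    )
    print(example)
```

Numbering the excerpts is one simple way to let the model's answer cite where each statement came from, which is how retrieved text can help keep responses sourced.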
“The program needs to be like a reference librarian,” said Yager. “It needs to heavily rely on the documents to provide sourced answers. It needs to be able to accurately interpret what people are asking and be able to effectively piece together the context of those questions to retrieve the most relevant information. While the responses may not be perfect yet, it’s already able to answer challenging questions and trigger some interesting thoughts while planning new projects and research.”

Bots empowering humans
CFN is developing AI/ML systems as tools that can liberate human researchers to work on more challenging and interesting problems and to get more out of their limited time while computers automate repetitive tasks in the background. There are still many unknowns about this new way of working, but these questions are the start of important discussions scientists are having right now to ensure AI/ML use is safe and ethical.
“There are a number of tasks that a domain-specific chatbot like this could clear from a scientist’s workload. Classifying and organizing documents, summarizing publications, pointing out relevant info, and getting up to speed in a new topical area are just a few potential applications,” remarked Yager. “I’m excited to see where all of this will go, though. We never could have imagined where we are now three years ago, and I’m looking forward to where we’ll be three years from now.”
For researchers interested in trying this software out for themselves, the source code for CFN’s chatbot and associated tools can be found in this GitHub repository.
More information: Kevin G. Yager, Domain-specific chatbots for science using embeddings, Digital Discovery (2023). DOI: 10.1039/D3DD00112A