Researchers from Mass General Brigham determined that ChatGPT achieved an accuracy rate of almost 72 percent across all medical specialties and phases of clinical care, and 77 percent accuracy in making final diagnoses.
A study by researchers from Mass General Brigham found that ChatGPT was approximately 72 percent accurate in overall clinical decision-making, from suggesting potential diagnoses to making final diagnoses and care management decisions. The large language model (LLM)-based AI chatbot performed consistently in both primary care and emergency settings across medical specialties. The findings were published in the Journal of Medical Internet Research.
"Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management," said corresponding author Marc Succi, MD, associate chair of innovation and commercialization and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator.
"No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident. This tells us that LLMs, in general, have the potential to be an augmenting tool for the practice of medicine and support clinical decision-making with impressive accuracy."
The researchers pasted successive portions of 36 standardized, published clinical vignettes into ChatGPT. The tool was first asked to come up with a set of possible, or differential, diagnoses based on the patient's initial information, which included age, gender, symptoms, and whether the case was an emergency. ChatGPT was then given additional pieces of information and asked to make management decisions and give a final diagnosis, simulating the entire process of seeing a real patient. The team assessed ChatGPT's accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured, blinded process, awarding points for correct answers and using linear regressions to assess the relationship between ChatGPT's performance and each vignette's demographic information.
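The scoring approach described above (points awarded per question type, then a linear regression against a demographic variable) can be illustrated with a minimal sketch. This is not the authors' code: the function names, the category labels, and every number below are hypothetical placeholders chosen only to show the arithmetic, and they are not results from the study.

```python
from collections import defaultdict


def accuracy_by_category(scored):
    """Return points earned / points possible for each question category.

    `scored` is an iterable of (category, points_earned, points_possible).
    """
    earned = defaultdict(float)
    possible = defaultdict(float)
    for category, got, total in scored:
        earned[category] += got
        possible[category] += total
    return {c: earned[c] / possible[c] for c in earned}


def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical placeholder scores, NOT data from the study.
toy_scores = [
    ("differential", 3, 5),
    ("differential", 2, 5),
    ("final_diagnosis", 1, 1),
    ("final_diagnosis", 0, 1),
    ("management", 3, 4),
]
toy_accuracy = accuracy_by_category(toy_scores)

# Hypothetical per-vignette accuracy against patient age, again made up.
toy_ages = [20, 40, 60, 80]
toy_vignette_accuracy = [0.6, 0.7, 0.8, 0.9]
slope, intercept = ols_slope_intercept(toy_ages, toy_vignette_accuracy)

print(toy_accuracy)
print(slope, intercept)
```

A slope near zero in a fit like this would suggest that performance does not vary with the demographic variable. Assessing whether a slope is statistically distinguishable from zero would require a significance test, which this sketch omits.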
The researchers found that ChatGPT was about 72 percent accurate overall and performed best at making a final diagnosis, where it was 77 percent accurate. It performed worst at differential diagnosis, where it was only 60 percent accurate. It was 68 percent accurate in clinical management decisions, such as choosing which medications to prescribe after arriving at the correct diagnosis. Other notable findings were that ChatGPT's answers did not show gender bias and that its overall performance was steady across both primary and emergency care.
"ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do," said Succi. "That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed."
The authors note that more benchmark research and regulatory guidance are needed before tools like ChatGPT can be considered for integration into clinical care. Next, Succi's team is looking at whether AI tools can improve patient care and outcomes in hospitals' resource-constrained areas.
The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation's top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.
"Mass General Brigham sees great promise for LLMs to help improve care delivery and clinician experience," said co-author Adam Landman, MD, MS, MIS, MHS, chief information officer and senior vice president of digital at Mass General Brigham. "We are currently evaluating LLM solutions that assist with clinical documentation and draft responses to patient messages with a focus on understanding their accuracy, reliability, safety, and equity. Rigorous studies like this one are needed before we integrate LLM tools into clinical care."
Reference: "Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study" by Arya Rao, Michael Pang, John Kim, Meghana Kamineni, Winston Lie, Anoop K Prasad, Adam Landman, Keith Dreyer and Marc D Succi, 22 August 2023, Journal of Medical Internet Research.
DOI: 10.2196/48659
The study was funded by the National Institute of General Medical Sciences.