Researchers from Mass General Brigham determined that ChatGPT achieved an accuracy rate of almost 72 percent across all medical specialties and phases of clinical care, and 77 percent accuracy in making final diagnoses.
Researchers from Mass General Brigham have conducted a study showing that ChatGPT was about 72 percent accurate in overall clinical decision-making, from suggesting possible diagnoses to making final diagnoses and care management decisions. The large language model (LLM)-based AI chatbot performed consistently in both primary care and emergency settings and across medical specialties. The findings were published in the Journal of Medical Internet Research.
"Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management," said corresponding author Marc Succi, MD, associate chair of innovation and commercialization and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator.
"No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident. This tells us that LLMs, in general, have the potential to be an augmenting tool for the practice of medicine and support clinical decision-making with impressive accuracy."
The study was done by pasting successive portions of 36 standardized, published clinical vignettes into ChatGPT. The tool first was asked to come up with a set of possible, or differential, diagnoses based on the patient's initial information, which included age, gender, symptoms, and whether the case was an emergency. ChatGPT was then given additional pieces of information and asked to make management decisions as well as give a final diagnosis—simulating the entire process of seeing a real patient. The team compared ChatGPT's accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured blinded process, awarding points for correct answers and using linear regressions to assess the relationship between ChatGPT's performance and the vignette's demographic information.
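The scoring and regression steps described above can be illustrated with a minimal sketch. This is not the authors' code: the function names are hypothetical, and the assumption that accuracy equals points awarded divided by points possible is made only for illustration. The study's actual rubric and regression models may differ.

```python
from statistics import mean


def accuracy(points_awarded, points_possible):
    """Fraction of available points earned over a set of questions.

    Assumed scoring rule: each question has a maximum score, and the
    grader awards some number of points up to that maximum.
    """
    return sum(points_awarded) / sum(points_possible)


def ols_fit(x, y):
    """Ordinary least squares fit of y = intercept + slope * x.

    x could be a vignette attribute such as patient age, and y the
    per-vignette accuracy. Returns (slope, intercept).
    """
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar
```

A slope close to zero in such a fit would suggest that accuracy does not vary much with the attribute in question. A real analysis would also report confidence intervals or p-values, which this sketch omits.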
The researchers found that ChatGPT was about 72 percent accurate overall and that it performed best at making a final diagnosis, where it was 77 percent accurate. It performed worst at differential diagnosis, where it was only 60 percent accurate, and it was only 68 percent accurate in clinical management decisions, such as deciding which medications to use after arriving at the correct diagnosis. Other notable findings were that ChatGPT's answers did not show gender bias and that its overall performance was steady across both primary and emergency care.
"ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do," said Succi. "That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed."
The authors note that more benchmark research and regulatory guidance are needed before tools like ChatGPT can be considered for integration into clinical care. Next, Succi's team is looking at whether AI tools can improve patient care and outcomes in hospitals' resource-constrained areas.
The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation's top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.
"Mass General Brigham sees great promise for LLMs to help improve care delivery and clinician experience," said co-author Adam Landman, MD, MS, MIS, MHS, chief information officer and senior vice president of digital at Mass General Brigham. "We are currently evaluating LLM solutions that assist with clinical documentation and draft responses to patient messages with a focus on understanding their accuracy, reliability, safety, and equity. Rigorous studies like this one are needed before we integrate LLM tools into clinical care."
Reference: "Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study" by Arya Rao, Michael Pang, John Kim, Meghana Kamineni, Winston Lie, Anoop K Prasad, Adam Landman, Keith Dreyer and Marc D Succi, 22 August 2023, Journal of Medical Internet Research.
DOI: 10.2196/48659
The study was funded by the National Institute of General Medical Sciences.















