Researchers from Mass General Brigham found that ChatGPT was about 72 percent accurate across all medical specialties and phases of clinical care, and 77 percent accurate in making final diagnoses.
A study led by researchers from Mass General Brigham found that ChatGPT was about 72 percent accurate in overall clinical decision-making, from suggesting possible diagnoses to making final diagnoses and care management decisions. The large language model (LLM) chatbot performed consistently in both primary care and emergency settings across medical specialties. The findings were published in the Journal of Medical Internet Research.
"Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management," said corresponding author Marc Succi, MD, associate chair of innovation and commercialization and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator.
"No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident. This tells us that LLMs, in general, have the potential to be an augmenting tool for the practice of medicine and support clinical decision-making with impressive accuracy."
The researchers pasted successive portions of 36 standardized, published clinical vignettes into ChatGPT. The tool was first asked to come up with a set of possible, or differential, diagnoses based on the patient's initial information, which included age, gender, symptoms, and whether the case was an emergency. ChatGPT was then given additional pieces of information and asked to make management decisions and give a final diagnosis, simulating the entire process of seeing a real patient. The team scored ChatGPT's accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured, blinded process, awarding points for correct answers. They then used linear regressions to assess the relationship between ChatGPT's performance and each vignette's demographic information.
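The scoring approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the article does not give the study's rubric, so the record layout, the function names, and every number in `GRADED` are hypothetical placeholders, not study data. Accuracy is taken here as points earned divided by points possible, and the regression is a plain one-variable least-squares fit.

```python
from collections import defaultdict

# Hypothetical graded answers, NOT data from the study:
# (vignette_id, question_type, points_earned, points_possible)
GRADED = [
    (1, "differential", 3, 5),
    (1, "testing", 2, 3),
    (1, "final", 1, 1),
    (1, "management", 2, 3),
    (2, "differential", 2, 4),
    (2, "testing", 3, 4),
    (2, "final", 0, 1),
    (2, "management", 3, 4),
]


def accuracy_by_type(graded):
    """Return points earned / points possible for each question type."""
    earned = defaultdict(int)
    possible = defaultdict(int)
    for _vignette, qtype, got, total in graded:
        earned[qtype] += got
        possible[qtype] += total
    return {qtype: earned[qtype] / possible[qtype] for qtype in earned}


def overall_accuracy(graded):
    """Return total points earned / total points possible."""
    got = sum(row[2] for row in graded)
    total = sum(row[3] for row in graded)
    return got / total


def ols_fit(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    for qtype, acc in accuracy_by_type(GRADED).items():
        print(f"{qtype}: {acc:.0%}")
    print(f"overall: {overall_accuracy(GRADED):.0%}")
```

Passing per-vignette accuracy as `ys` and a demographic variable such as patient age as `xs` to `ols_fit` would give a slope near zero if performance does not vary with that variable. The study's actual regression setup may differ.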
The researchers found that overall, ChatGPT was about 72 percent accurate and that it performed best in making a final diagnosis, where it was 77 percent accurate. It performed worst in making differential diagnoses, where it was only 60 percent accurate. It was also only 68 percent accurate in clinical management decisions, such as figuring out what medications to treat the patient with after arriving at the correct diagnosis. Other notable findings were that ChatGPT's answers did not show gender bias and that its overall performance was steady across both primary and emergency care.
"ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do," said Succi. "That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed."
The authors note that more benchmark research and regulatory guidance are needed before tools like ChatGPT can be considered for integration into clinical care. Next, Succi's team is looking at whether AI tools can improve patient care and outcomes in hospitals' resource-constrained areas.
The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation's top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.
"Mass General Brigham sees great promise for LLMs to help improve care delivery and clinician experience," said co-author Adam Landman, MD, MS, MIS, MHS, chief information officer and senior vice president of digital at Mass General Brigham. "We are currently evaluating LLM solutions that assist with clinical documentation and draft responses to patient messages with a focus on understanding their accuracy, reliability, safety, and equity. Rigorous studies like this one are needed before we integrate LLM tools into clinical care."
Reference: "Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study" by Arya Rao, Michael Pang, John Kim, Meghana Kamineni, Winston Lie, Anoop K Prasad, Adam Landman, Keith Dreyer and Marc D Succi, 22 August 2023, Journal of Medical Internet Research.
DOI: 10.2196/48659
The study was funded by the National Institute of General Medical Sciences.