Researchers from Mass General Brigham determined that ChatGPT achieved an accuracy rate of almost 72 percent across all medical specialties and phases of clinical care, and 77 percent accuracy in making final diagnoses.
Researchers from Mass General Brigham have conducted a study showing that ChatGPT was approximately 72 percent accurate in overall clinical decision-making, from suggesting possible diagnoses to making final diagnoses and care management decisions. The large language model (LLM) AI chatbot performed consistently in both primary care and emergency settings across medical specialties. The findings were recently published in the Journal of Medical Internet Research.
“Our paper comprehensively assesses decision support via ChatGPT from the very beginning of working with a patient through the entire care scenario, from differential diagnosis all the way through testing, diagnosis, and management,” said corresponding author Marc Succi, MD, associate chair of innovation and commercialization and strategic innovation leader at Mass General Brigham and executive director of the MESH Incubator.
“No real benchmarks exist, but we estimate this performance to be at the level of someone who has just graduated from medical school, such as an intern or resident. This tells us that LLMs, in general, have the potential to be an augmenting tool for the practice of medicine and support clinical decision-making with impressive accuracy.”
The study was done by pasting successive portions of 36 standardized, published clinical vignettes into ChatGPT. The tool first was asked to come up with a set of possible, or differential, diagnoses based on the patient’s initial information, which included age, gender, symptoms, and whether the case was an emergency. ChatGPT was then given additional pieces of information and asked to make management decisions as well as give a final diagnosis—simulating the entire process of seeing a real patient. The team compared ChatGPT’s accuracy on differential diagnosis, diagnostic testing, final diagnosis, and management in a structured blinded process, awarding points for correct answers and using linear regressions to assess the relationship between ChatGPT’s performance and the vignette’s demographic information.
The researchers found that overall, ChatGPT was about 72 percent accurate and that it was best in making a final diagnosis, where it was 77 percent accurate. It was lowest-performing in making differential diagnoses, where it was only 60 percent accurate. And it was only 68 percent accurate in clinical management decisions, such as figuring out what medications to treat the patient with after arriving at the correct diagnosis. Other notable findings from the study included that ChatGPT’s answers did not show gender bias and that its overall performance was steady across both primary and emergency care.
“ChatGPT struggled with differential diagnosis, which is the meat and potatoes of medicine when a physician has to figure out what to do,” said Succi. “That is important because it tells us where physicians are truly experts and adding the most value—in the early stages of patient care with little presenting information, when a list of possible diagnoses is needed.”
The authors note that before tools like ChatGPT can be considered for integration into clinical care, more benchmark research and regulatory guidance are needed. Next, Succi’s team is looking at whether AI tools can improve patient care and outcomes in hospitals’ resource-constrained areas.
The emergence of artificial intelligence tools in health has been groundbreaking and has the potential to positively reshape the continuum of care. Mass General Brigham, as one of the nation’s top integrated academic health systems and largest innovation enterprises, is leading the way in conducting rigorous research on new and emerging technologies to inform the responsible incorporation of AI into care delivery, workforce support, and administrative processes.
“Mass General Brigham sees great promise for LLMs to help improve care delivery and clinician experience,” said co-author Adam Landman, MD, MS, MIS, MHS, chief information officer and senior vice president of digital at Mass General Brigham. “We are currently evaluating LLM solutions that assist with clinical documentation and draft responses to patient messages with a focus on understanding their accuracy, reliability, safety, and equity. Rigorous studies like this one are needed before we integrate LLM tools into clinical care.”
Reference: “Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study” by Arya Rao, Michael Pang, John Kim, Meghana Kamineni, Winston Lie, Anoop K Prasad, Adam Landman, Keith Dreyer and Marc D Succi, 22 August 2023, Journal of Medical Internet Research.
DOI: 10.2196/48659
The study was funded by the National Institute of General Medical Sciences.