Press Release

Can ChatGPT diagnose your condition? Not yet

September 19, 2023

A research group led by Tokyo Medical and Dental University (TMDU) finds that when common orthopedic symptoms are given, ChatGPT’s diagnosis and recommendations are inconsistent

Tokyo, Japan – ChatGPT, a sophisticated chatbot driven by artificial intelligence (AI) technology, has been increasingly used in health care contexts, one of which is assisting patients in self-diagnosing before seeking medical help. Although it seems very useful at first glance, AI may cause more harm than good to the patient if it is not accurate in its diagnosis and recommendations. A research team from Japan and the United States recently found that the precision of ChatGPT’s diagnoses and the degree to which it recommends medical consultation require further development.

In a study published in September, the multi-institutional research team led by Tokyo Medical and Dental University (TMDU) evaluated the accuracy (percentage of correct responses) and precision of ChatGPT’s responses to questions about five common orthopedic diseases (including carpal tunnel syndrome, cervical myelopathy, and hip osteoarthritis). The team chose orthopedic conditions because such complaints are very common in clinical practice and comprise up to 26% of the reasons why patients seek care. Over a 5-day course, each of the study researchers submitted the same questions to ChatGPT. The reproducibility between days and researchers was also calculated, and the strength of the recommendation that the patient seek medical attention was evaluated.

“We found that accuracy and reproducibility of ChatGPT’s diagnosis are not consistent over the five conditions. ChatGPT's diagnosis was 100% accurate for carpal tunnel syndrome, but only 4% for cervical myelopathy,” says lead author Tomoyuki Kuroiwa. Additionally, reproducibility between days and researchers varied from “poor” to “almost perfect” among the five conditions even though researchers entered the same questions every time.
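
For readers who want to see how figures of this kind can be computed, the following is a minimal sketch, not the study's actual analysis. It assumes accuracy is the share of responses naming the correct condition, and it uses Cohen's kappa between two sets of responses as a stand-in agreement statistic, because this release does not state which statistic the researchers used. The labels "poor" and "almost perfect" match the widely used Landis and Koch scale for kappa, which is also an assumption here. The function names and the response data are hypothetical and made up purely for illustration.

```python
from collections import Counter


def accuracy(responses, correct):
    """Fraction of responses that name the correct diagnosis."""
    return sum(r == correct for r in responses) / len(responses)


def cohen_kappa(a, b):
    """Cohen's kappa for two equal-length lists of categorical answers."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    # Observed agreement: share of positions where both answers match.
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement: product of the marginal shares, summed over categories.
    count_a, count_b = Counter(a), Counter(b)
    p_chance = sum(
        (count_a[c] / n) * (count_b[c] / n) for c in set(a) | set(b)
    )
    if p_chance == 1.0:
        return 1.0  # both lists contain one identical category only
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Verbal label for a kappa value on the Landis and Koch scale."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical answers to the same question on two different days
# (invented data, not taken from the study).
day1 = ["CTS", "CTS", "CTS", "other", "CTS"]
day2 = ["CTS", "CTS", "other", "other", "CTS"]

print(accuracy(day1, "CTS"))
kappa = cohen_kappa(day1, day2)
print(kappa, landis_koch_label(kappa))
```

With more than two sets of responses (for example, several researchers over several days), a multi-rater statistic such as Fleiss' kappa would be the more natural choice than the pairwise version sketched here.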

ChatGPT was also inconsistent in recommending medical consultation. Although almost 80% of ChatGPT’s answers recommended medical consultation, only 12.8% included a strong recommendation as set by the study standards. “Without direct language, it is possible that the patient is left confused after self-diagnosis, or worse, experiences harm from a misdiagnosis,” says Kuroiwa.

This is the first study to evaluate the reproducibility of ChatGPT’s self-diagnoses and the degree to which it recommends medical consultation. “In its current form, ChatGPT is inconsistent in both accuracy and precision to help patients diagnose their disease,” explains senior author Koji Fujita. “Given the risk of error and potential harm from misdiagnosis, it is important for any diagnostic tool to include clear language alerting patients to seek expert medical opinions for confirmation of a disease.”

The researchers also note some limitations of the study, including the use of questions simulated by the research team rather than patient-derived questions, the focus on only five orthopedic diseases, and the use of only ChatGPT. While it is still too early to use AI for self-diagnosis, training ChatGPT on diseases of interest could change this. Future studies can help shed light on the role of AI as a diagnostic tool.

ChatGPT responds differently on different days
An example of a ChatGPT response in this study. Although the researchers asked ChatGPT the exact same questions about orthopedic symptoms, it offered different diagnoses on different days. Sometimes the same author asked the same question, but the diagnosis differed.

###

The article, “The potential of ChatGPT as a self-diagnostic tool in common orthopedic diseases: exploratory study,” was published in Journal of Medical Internet Research at DOI: 10.2196/47621

Summary

A research team led by Tokyo Medical and Dental University (TMDU) investigated for the first time whether ChatGPT can be used precisely by patients to make a diagnosis from their symptoms. They found that the accuracy and reproducibility of ChatGPT’s self-diagnoses of common orthopedic conditions were inconsistent; moreover, its recommendations for medical consultation were weak. Although ChatGPT could serve as a potential first step in accessing care, further development is necessary before relying on AI for self-diagnosis.

Journal Article

JOURNAL: Journal of Medical Internet Research

TITLE: The Potential of ChatGPT as a Self-Diagnostic Tool in Common Orthopedic Diseases: Exploratory Study

DOI: https://doi.org/10.2196/47621

Correspondence to

Koji Fujita, MD, PhD, Professor

Division of Medical Design Innovations,
Open Innovation Center, Institute of Research Innovation,
Tokyo Medical and Dental University (TMDU)
E-mail: fujiorth[@]tmd.ac.jp

*Please change [@] in the e-mail address to @ when sending your e-mail to the contact person.