Abstract
Objectives 1) To compare and assess the accuracy of Google Bard and ChatGPT in patient education for systemic lupus erythematosus (SLE). 2) To compare, score, and validate the answers generated by these AI tools as assessed by rheumatologists versus other specialists treating lupus.
Methods This prospective study was performed in the Department of Medicine, KMC Manipal, from November 2023 to January 2024. Answers to twenty questions commonly asked by SLE patients (table 1) were retrieved from ChatGPT and Google Bard and then evaluated by expert consultants. This procedure was iterated three times. SMOG (Simple Measure of Gobbledygook) scoring was performed to assess the clarity and accessibility of each answer, and the best answer was chosen based on the highest readability score. Seventeen experts treating lupus patients from different specialities were engaged to evaluate these answers through an online Google Form. Their role was to assess and score the answers derived from the two AI tools on a scale from 0 to 10, with 10 indicating highly satisfactory and scores below 5 indicating not satisfactory. Informed consent and approval were obtained from all the experts.
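For reference, the abstract does not reproduce the readability formula, but the conventional SMOG measure is calculated as: SMOG grade = 1.0430 × √(number of polysyllabic words × 30 / number of sentences) + 3.1291, where polysyllabic words are those with three or more syllables and a lower grade indicates more readable text.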
Results Google Bard provided more relevant, comprehensive, and clear information than ChatGPT for most of the questions (figure 1). For a few questions, the experts' comments were as follows:
ChatGPT (experts’ comments): Answer 1: a very concise answer, not descriptive; common characteristics are not highlighted. Answer 7: missing many markers while mentioning rare ones. Answer 14: incorporated more clinical symptoms rather than indices to differentiate, and missed CRP and other markers. Answer 19: did not recognise the importance of other specialities in treating SLE, which, as a multi-organ disease, requires a multi-disciplinary approach.
Google Bard (experts’ comments): Answer 2: did not go into the intrinsic details of the etiopathogenesis. Answer 5: the explanation was unclear. Rheumatologists rated Google Bard more highly, whereas ChatGPT was rated best among the other specialities treating lupus (figure 2a, b).
Conclusions Our study revealed that Google Bard is significantly more precise and comprehensive about SLE than ChatGPT. SLE is a heterogeneous disease whose management involves multiple specialities alongside the rheumatologist, who, as the expert, treats and makes critical decisions when there is organ involvement. SMOG scoring ultimately contributes to the successful transmission of information and knowledge to diverse audiences in an easily readable manner.
Acknowledgements The authors did not receive any funding for the study.
This is an open access article distributed in accordance with the Creative Commons Attribution Non Commercial (CC BY-NC 4.0) license, which permits others to distribute, remix, adapt, build upon this work non-commercially, and license their derivative works on different terms, provided the original work is properly cited, appropriate credit is given, any changes made indicated, and the use is non-commercial. See: http://creativecommons.org/licenses/by-nc/4.0/ .