Oral Presentations

O19 ChatSLE – Consulting ChatGPT for 100 frequently asked lupus questions

Abstract

Objective Lupus is a rare and complex disease that affects almost all aspects of life. Inevitably, patients are constantly confronted with questions about their disease. Nevertheless, the shortage of rheumatology expert care often contrasts with patients' demand for adequate information. To provide reliable disease-related information in patient-friendly language, 'Lupus100.org' was launched, where experts have answered 100 common questions.

With the advent of widely accessible artificial intelligence, the question arises to what extent AI large language models (LLMs) could fill this information gap and support physicians in patient care. Therefore, this study aimed to assess the capability of the LLM ChatGPT-4 to answer 100 frequently asked patient questions related to lupus.

Methods ChatGPT-4 responses were generated by entering the English questions from https://lupus100.org/ in a fresh session on October 16, 2023. Three senior rheumatologists, blinded to authorship, independently evaluated the responses from ChatGPT-4 and Lupus100. The evaluation criteria were quality, empathy (each on a 5-point Likert scale) and the selection of a preferred answer. Differences between the scores were analysed using a two-tailed Student's t-test. A one-sample chi-square test was performed to assess whether one source was preferred over the other. All statistical analyses were conducted in SPSS; the significance threshold was p < 0.05.

Results The quality of the answers provided by ChatGPT-4 was rated significantly higher than that of the Lupus100 responses (table 1). A similar trend was observed for empathy, but the difference was not statistically significant. Very few responses from either ChatGPT-4 or Lupus100 were rated 'poor' or 'very poor' in quality, or 'not empathetic'. Overall, ChatGPT-4 responses (n = 171, 57%) were preferred over Lupus100 responses (n = 129, 43%), and this difference was statistically significant (p = 0.02).
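The reported preference test can be reproduced from the counts above. The following is a minimal sketch (assuming SciPy rather than the SPSS used by the authors) of a one-sample chi-square test against a uniform expectation, i.e. no preference between sources:

```python
from scipy.stats import chisquare

# Reported preference counts: ChatGPT-4 n = 171, Lupus100 n = 129
observed = [171, 129]

# One-sample chi-square test; the default expected frequencies are
# uniform (150 each), corresponding to the null of no preference.
stat, p = chisquare(observed)
print(f"chi2 = {stat:.2f}, p = {p:.3f}")
```

The resulting p-value falls below the 0.05 threshold, consistent with the reported p = 0.02.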

Conclusions In this study, the LLM ChatGPT-4 generated high-quality and empathetic responses to patient questions concerning lupus. The study suggests that such models might be a valuable source of patient information and may support physicians in generating beneficial patient information.

Abstract O19 Table 1