Application of ChatGPT as a content generation tool in continuing medical education: acne as a test topic


Submitted: 10 September 2024
Accepted: 5 November 2024
Published: 28 November 2024

Authors
Naldi L, Bettoli V, Santoro E, Valetto MR, Bolzon A, Cassalia F, Cazzaniga S, Cima S, Danese A, Emendi S, Ponzano M, Scarpa N, Dri P

Abstract
The large language model (LLM) ChatGPT can answer open-ended and complex questions, but its accuracy in providing reliable medical information requires careful assessment. As part of the AICHECK (Artificial Intelligence for CME Health E-learning Contents and Knowledge) Study, aimed at evaluating the potential of ChatGPT in continuing medical education (CME), we compared ChatGPT-generated educational content with the recommendations of the National Institute for Health and Care Excellence (NICE) guideline on acne vulgaris. ChatGPT version 4 was presented with a 23-item questionnaire developed by an experienced dermatologist. A panel of five dermatologists gave the answers positive ratings for "quality" (87.8% of cases), "readability" (94.8%), "accuracy" (75.7%), "thoroughness" (85.2%), and "consistency" with guidelines (76.8%). The references provided by ChatGPT received positive ratings for "pertinence" (94.6%), "relevance" (91.2%), and "update" (62.3%). Internal reproducibility was adequate for both answers (93.5%) and references (67.4%). Answers addressing issues that are uncertain or controversial within the scientific community scored lowest. This study underscores the need to develop rigorous evaluation criteria for AI-generated medical content and for expert oversight to ensure accuracy and guideline adherence.
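
The percentages reported above are shares of positive panel ratings per evaluation dimension. As a minimal sketch of how such shares could be computed, the Python snippet below aggregates ratings from five reviewers across questionnaire items; the 1-4 rating scale, the positivity threshold, and the toy data are assumptions for illustration and are not taken from the study.

from typing import Dict, List

POSITIVE_THRESHOLD = 3  # assumption: scores of 3 or 4 on a 1-4 scale count as positive

def positive_share(ratings: Dict[str, List[List[int]]]) -> Dict[str, float]:
    """Return the percentage of positive panel ratings for each dimension."""
    result = {}
    for dimension, items in ratings.items():
        scores = [score for item in items for score in item]  # flatten items x raters
        positive = sum(1 for score in scores if score >= POSITIVE_THRESHOLD)
        result[dimension] = round(100 * positive / len(scores), 1)
    return result

if __name__ == "__main__":
    # Toy data: 2 questionnaire items rated by 5 panelists each (not real study data).
    example = {
        "quality":     [[4, 4, 3, 3, 4], [3, 4, 2, 4, 3]],
        "readability": [[4, 4, 4, 3, 4], [4, 3, 4, 4, 4]],
    }
    print(positive_share(example))  # {'quality': 90.0, 'readability': 100.0}

In the study, 23 items were rated by five dermatologists on each dimension; the same aggregation logic would apply to that larger matrix of ratings.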


References

OpenAI. ChatGPT. Available from: https://chat.openai.com/chat

Noy S, Zhang W. Experimental evidence on the productivity effects of generative artificial intelligence. Science 2023;381:187-92. DOI: https://doi.org/10.1126/science.adh2586

Meskó B, Topol EJ. The imperative for regulatory oversight of large language models (or generative AI) in healthcare. NPJ Digit Med 2023;6:120. DOI: https://doi.org/10.1038/s41746-023-00873-0

Dave T, Athaluri SA, Singh S. ChatGPT in medicine: an overview of its applications, advantages, limitations, future prospects, and ethical considerations. Front Artif Intell 2023;6:1169595. DOI: https://doi.org/10.3389/frai.2023.1169595

Safranek CW, Sidamon-Eristoff AE, Gilson A, Chartash D. The role of large language models in medical education: applications and implications. JMIR Med Educ 2023;9:e50945. DOI: https://doi.org/10.2196/50945

Eysenbach G. The role of ChatGPT, Generative Language Models, and Artificial Intelligence in medical education: a conversation with ChatGPT and a call for papers. JMIR Med Educ 2023;9:e46885. DOI: https://doi.org/10.2196/46885

Sallam M. ChatGPT utility in healthcare education, research, and practice: systematic review on the promising perspectives and valid concerns. Healthcare (Basel) 2023;11:887. DOI: https://doi.org/10.3390/healthcare11060887

Shah NH, Entwistle D, Pfeffer MA. Creation and adoption of Large Language Models in medicine. JAMA 2023;330:866-9. DOI: https://doi.org/10.1001/jama.2023.14217

Gordon ER, Trager MH, Kontos D, et al. Ethical considerations for artificial intelligence in dermatology: a scoping review. Br J Dermatol 2024;190:789-97. DOI: https://doi.org/10.1093/bjd/ljae040

Chen S, Kann BH, Foote MB, et al. Use of Artificial Intelligence chatbots for cancer treatment information. JAMA Oncol 2023;9:1459-62. DOI: https://doi.org/10.1001/jamaoncol.2023.2954

Ferreira AL, Chu B, Grant-Kels JM, et al. Evaluation of ChatGPT dermatology responses to common patient queries. JMIR Dermatol 2023;6:e49280. DOI: https://doi.org/10.2196/49280

Goodman RS, Patrinely JR, Stone CA Jr, et al. Accuracy and reliability of chatbot responses to physician questions. JAMA Netw Open 2023;6:e2336483. DOI: https://doi.org/10.1001/jamanetworkopen.2023.36483

Lam Hoai XL, Simonart T. Comparing meta-analyses with ChatGPT in the evaluation of the effectiveness and tolerance of systemic therapies in moderate-to-severe plaque psoriasis. J Clin Med 2023;12:5410. DOI: https://doi.org/10.3390/jcm12165410

Rossettini G, Cook C, Palese A, et al. Pros and cons of using Artificial Intelligence chatbots for musculoskeletal rehabilitation management. J Orthop Sports Phys Ther 2023;53:1-7. DOI: https://doi.org/10.2519/jospt.2023.12000

Temsah O, Khan SA, Chaiah Y, et al. Overview of early ChatGPT's presence in medical literature: insights from a hybrid literature review by ChatGPT and human experts. Cureus 2023;15:e37281. DOI: https://doi.org/10.7759/cureus.37281

Bettoli V, Naldi L, Santoro E, et al. ChatGPT and acne: accuracy and reliability of the information provided - the AI-check study. J Eur Acad Dermatol Venereol 2024. DOI: https://doi.org/10.1111/jdv.20324

National Institute for Health and Care Excellence (NICE). Acne vulgaris: management. NICE guideline NG198. Available from: https://www.nice.org.uk/guidance/ng198/resources/acnevulgaris-management-pdf-66142088866501

Nast A, Dréno B, Bettoli V, et al. European evidence-based (S3) guideline for the treatment of acne - update 2016 - short version. J Eur Acad Dermatol Venereol 2016;30:1261-8. DOI: https://doi.org/10.1111/jdv.13776

Reynolds RV, Yeung H, Cheng CE, et al. Guidelines of care for the management of acne vulgaris. J Am Acad Dermatol 2024;90:1006.e1-1006.e30. DOI: https://doi.org/10.1016/j.jaad.2023.12.017

Zaenglein AL, Pathy AL, Schlosser BJ, et al. Guidelines of care for the management of acne vulgaris. J Am Acad Dermatol 2016;74:945-73.e33. DOI: https://doi.org/10.1016/j.jaad.2015.12.037

Lakdawala N, Channa L, Gronbeck C, et al. Assessing the accuracy and comprehensiveness of ChatGPT in offering clinical guidance for atopic dermatitis and acne vulgaris. JMIR Dermatol 2023;6:e50409. DOI: https://doi.org/10.2196/50409

Cirone K, Akrout M, Abid L, Oakley A. Assessing the utility of multimodal Large Language Models (GPT-4 Vision and Large Language and Vision Assistant) in identifying melanoma across different skin tones. JMIR Dermatol 2024;7:e55508. DOI: https://doi.org/10.2196/55508

Reynolds K, Tejasvi T. Potential use of ChatGPT in responding to patient questions and creating patient resources. JMIR Dermatol 2024;7:e48451. DOI: https://doi.org/10.2196/48451

O'Hagan R, Poplausky D, Young JN, et al. The accuracy and appropriateness of ChatGPT responses on nonmelanoma skin cancer information using zero-shot chain of thought prompting. JMIR Dermatol 2023;6:e49889. DOI: https://doi.org/10.2196/49889

Charvet-Berard AI, Chopard P, Perneger TV. Measuring quality of patient information documents with an expanded EQIP scale. Patient Educ Couns 2008;70:407-11. DOI: https://doi.org/10.1016/j.pec.2007.11.018

How to cite
Naldi L, Bettoli V, Santoro E, Valetto MR, Bolzon A, Cassalia F, Cazzaniga S, Cima S, Danese A, Emendi S, Ponzano M, Scarpa N, Dri P. Application of ChatGPT as a content generation tool in continuing medical education: acne as a test topic. Dermatol Rep 2024. DOI: https://doi.org/10.4081/dr.2024.10138
