Abstract
Background
Most health information does not meet the health literacy needs of our communities. Writing health information in plain language is time-consuming, but the release of tools such as ChatGPT may make it easier to produce reliable plain language health information.
Objective
To investigate the capacity for ChatGPT to produce plain language versions of health texts.
Design
Observational study of 26 health texts from reputable websites.
Methods
ChatGPT was prompted to ‘rewrite the text for people with low literacy’. Researchers captured three revised versions of each original text.
Main Measures
Objective health literacy assessments, including Simple Measure of Gobbledygook (SMOG) grade, the proportion of the text containing complex language (%), the number of instances of passive voice, and subjective ratings of the proportion of key messages retained (%).
Key Results
On average, original texts were written at grade 12.8 (SD = 2.2) and revised to grade 11.0 (SD = 1.2), p < 0.001. Original texts were on average 22.8% complex (SD = 7.5%) compared to 14.4% (SD = 5.6%) in revised texts, p < 0.001. Original texts had on average 4.7 instances (SD = 3.2) of passive voice compared to 1.7 (SD = 1.2) in revised texts, p < 0.001. On average 80% of key messages were retained (SD = 15.0). More complex original texts improved more than simpler ones. For example, when original texts were ≥ grade 13, revised versions improved by an average 3.3 grades (SD = 2.2), p < 0.001, whereas simpler original texts (< grade 11) improved by an average 0.5 grades (SD = 1.4), p < 0.001.
Conclusions
This study used multiple objective assessments of health literacy to demonstrate that ChatGPT can simplify health information while retaining most key messages. However, the revised texts typically did not meet health literacy targets for grade reading score, and improvements were marginal for texts that were already relatively simple.
In recent years, health literacy has come to the forefront of public health research and practice, with persistent calls to provide health information that is easy to access and understand.1, 2 Studies consistently report that most health information does not address the health literacy needs of our communities, particularly people who are older, have less education, or are less fluent in a community's dominant language.3,4,5,6 This includes information developed by government, health services and non-government organisations.7, 8
Addressing this issue is challenging given the vast amount of health information available online. Currently, writing in plain language requires a health information provider to manually implement advice from health literacy guidelines and checklists,9,10,11,12 a process that demands considerable expertise and time. Though there are tools for objectively assessing the health literacy of health information and automating text simplification,13,14,15 revisions are still largely carried out by humans.
Recent advances in large language models present new opportunities that might transform our ability to develop plain language health information at scale. For example, in November 2022, OpenAI publicly released ChatGPT, a large language model trained on large volumes of text data to produce plausible, contextually appropriate and human-like responses to prompts, typically questions or requests to produce writing that meets certain constraints. Large language models do not synthesise or evaluate evidence; rather, they predict what should come next in a piece of text by learning from their training data.16 ChatGPT is also capable of adapting text to different writing styles and audiences, has a simple user interface that does not require software or programming expertise, and is freely available.
There is limited evidence showing that ChatGPT can produce information that adheres to health literacy guidelines. For example, one study showed that ChatGPT prompts can produce patient letters written at a 9th grade reading level,17 and another rated ChatGPT output describing patient postoperative instructions as adequately understandable, actionable and generally complete.18 However, there is substantial room for improvement, both in terms of optimising the ChatGPT prompts and employing more comprehensive assessment of plain language. Other studies have found that ChatGPT outputs in health domains were generally correct and complete, with low potential for harm, though the complexity of the language was not assessed.19, 20 Several studies have also identified a reasonable level of accuracy in ChatGPT output that responds to health questions.21,22,23,24
This study sought to investigate the capacity for ChatGPT (GPT-3.5) to produce plain language versions of health texts across a range of health topics. To our knowledge, no studies have evaluated the appropriateness of plain language health information generated by ChatGPT using multiple objective assessments.
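For readers unfamiliar with the SMOG measure used in this study: it estimates the school grade needed to understand a text from the number of words with three or more syllables per 30 sentences, following McLaughlin's published formula. The sketch below illustrates the calculation; the function names and the vowel-group syllable heuristic are illustrative simplifications, not the validated counting procedure used by automated assessment tools.

```python
import math
import re

def smog_grade(n_polysyllables: int, n_sentences: int) -> float:
    """McLaughlin's SMOG formula: estimated U.S. grade level needed to
    understand a text, from the count of words with 3+ syllables and
    the number of sentences."""
    return 1.043 * math.sqrt(n_polysyllables * (30 / n_sentences)) + 3.1291

def count_syllables(word: str) -> int:
    # Naive heuristic: count groups of consecutive vowels (incl. 'y').
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def smog_for_text(text: str) -> float:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    polysyllables = sum(1 for w in words if count_syllables(w) >= 3)
    return smog_grade(polysyllables, len(sentences))
```

For example, a sample of 30 sentences containing 10 polysyllabic words yields a SMOG grade of about 6.4. Because the syllable heuristic is approximate, scores from this sketch will differ slightly from those produced by validated tools.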
References
1. Wild A, Kunstler B, Goodwin D, Onyala S, Zhang L, Kufi M, et al. Communicating COVID-19 health information to culturally and linguistically diverse communities: insights from a participatory research collaboration. Public Health Res Pract. 2021;31(1):e311210. https://doi.org/10.17061/phrp3112105
2. White SJ, Barello S, Cao di San Marco E, Colombo C, Eeckman E, Gilligan C, et al. Critical observations on and suggested ways forward for healthcare communication during COVID-19: pEACH position paper. Patient Educ Couns. 2021;104(2):217-22. https://doi.org/10.1016/j.pec.2020.12.025
3. Mac OA, Muscat DM, Ayre J, Patel P, McCaffery KJ. The readability of official public health information on COVID-19. Med J Aust. 2021;215(8):373-5. https://doi.org/10.5694/mja2.51282
4. Ayre J, Muscat DM, Mac O, Batcup C, Cvejic E, Pickles K, et al. Main COVID-19 information sources in a culturally and linguistically diverse community in Sydney, Australia: a cross-sectional survey. Patient Educ Couns. 2022;105(8):2793-800. https://doi.org/10.1016/j.pec.2022.03.028
5. McCaffery KJ, Dodd RH, Cvejic E, Ayre J, Batcup C, Isautier JM, et al. Health literacy and disparities in COVID-19-related knowledge, attitudes, beliefs and behaviours in Australia. Public Health Res Pract. 2020;30(4):30342012. https://doi.org/10.17061/phrp30342012
6. Mishra V, Dexter JP. Comparison of readability of official public health information about COVID-19 on websites of international agencies and the governments of 15 countries. JAMA Netw Open. 2020;3(8):e2018033. https://doi.org/10.1001/jamanetworkopen.2020.18033
7. Cheng C, Dunn M. Health literacy and the Internet: a study on the readability of Australian online health information. Aust N Z J Public Health. 2015;39(4):309-14. https://doi.org/10.1111/1753-6405.12341
8. Daraz L, Morrow AS, Ponce OJ, Farah W, Katabi A, Majzoub A, et al. Readability of online health information: a meta-narrative systematic review. Am J Med Qual. 2018;33(5):487-92. https://doi.org/10.1177/1062860617751639
9. Shoemaker SJ, Wolf MS, Brach C. Development of the Patient Education Materials Assessment Tool (PEMAT): a new measure of understandability and actionability for print and audiovisual patient information. Patient Educ Couns. 2014;96(3):395-403. https://doi.org/10.1016/j.pec.2014.05.027
10. Brega A, Barnard J, Mabachi N, Weiss B, DeWalt D, Brach C, et al. AHRQ Health Literacy Universal Precautions Toolkit, 2nd edition. Rockville, MD: Agency for Healthcare Research and Quality; 2015. https://www.ahrq.gov/professionals/quality-patient-safety/quality-resources/tools/literacy-toolkit/healthlittoolkit2.html. Accessed 14 Jun 2017.
11. Plain Language Action and Information Network. Federal plain language guidelines, March 2011. 2011. https://www.plainlanguage.gov/media/FederalPLGuidelines.pdf. Accessed 12 Dec 2018.
12. National Adult Literacy Agency. Simply Put: writing and design tips. Dublin, Ireland: National Adult Literacy Agency; 2011.
13. VisibleThread. The language analysis platform that means business. 2022. https://www.visiblethread.com/. Accessed 2 Dec 2022.
14. Leroy G, Kauchak D, Haeger D, Spegman D. Evaluation of an online text simplification editor using manual and automated metrics for perceived and actual text difficulty. JAMIA Open. 2022;5(2):ooac044. https://doi.org/10.1093/jamiaopen/ooac044
15. Ayre J, Bonner C, Muscat DM, Dunn AG, Harrison E, Dalmazzo J, et al. Multiple automated health literacy assessments of written health information: development of the SHeLL (Sydney Health Literacy Lab) Health Literacy Editor v1. JMIR Form Res. 2023;7:e40645. https://doi.org/10.2196/40645
16. Farrokhnia M, Banihashem SK, Noroozi O, Wals A. A SWOT analysis of ChatGPT: implications for educational practice and research. Innov Educ Teach Int. 2023:1-15. https://doi.org/10.1080/14703297.2023.2195846
17. Ali SR, Dobbs TD, Hutchings HA, Whitaker IS. Using ChatGPT to write patient clinic letters. Lancet Digit Health. 2023;5(4):e179-e81. https://doi.org/10.1016/S2589-7500(23)00048-1
18. Ayoub NF, Lee Y-J, Grimm D, Balakrishnan K. Comparison between ChatGPT and Google Search as sources of postoperative patient instructions. JAMA Otolaryngol Head Neck Surg. 2023. https://doi.org/10.1001/jamaoto.2023.0704
19. Jeblick K, Schachtner B, Dexl J, Mittermeier A, Stüber AT, Topalis J, et al. ChatGPT makes medicine easy to swallow: an exploratory case study on simplified radiology reports. 2022. https://doi.org/10.48550/arXiv.2212.14882
20. Lyu Q, Tan J, Zapadka ME, Ponnatapuram J, Niu C, Wang G, et al. Translating radiology reports into plain language using ChatGPT and GPT-4 with prompt learning: promising results, limitations, and potential. 2023. https://doi.org/10.48550/arXiv.2303.09038
21. Gilson A, Safranek CW, Huang T, Socrates V, Chi L, Taylor RA, et al. How does ChatGPT perform on the United States Medical Licensing Examination? The implications of large language models for medical education and knowledge assessment. JMIR Med Educ. 2023;9:e45312. https://doi.org/10.2196/45312
22. Kung TH, Cheatham M, Medenilla A, Sillos C, De Leon L, Elepaño C, et al. Performance of ChatGPT on USMLE: potential for AI-assisted medical education using large language models. PLOS Digit Health. 2023;2(2):e0000198. https://doi.org/10.1371/journal.pdig.0000198
23. Walker HL, Ghani S, Kuemmerli C, Nebiker CA, Müller BP, Raptis DA, et al. Reliability of medical information provided by ChatGPT: assessment against clinical guidelines and patient information quality instrument. J Med Internet Res. 2023;25:e47479. https://doi.org/10.2196/47479
24. Samaan JS, Yeo YH, Rajeev N, Hawley L, Abel S, Ng WH, et al. Assessing the accuracy of responses by the language model ChatGPT to questions regarding bariatric surgery. Obes Surg. 2023;33(6):1790-6. https://doi.org/10.1007/s11695-023-06603-5
25. Australian Bureau of Statistics. Health literacy, Australia, 2006. Canberra, Australia; 2008. https://www.abs.gov.au/ausstats/abs@.nsf/Latestproducts/4233.0Main%20Features22006
26. Clinical Excellence Commission. NSW Health Literacy Framework 2019-2024. Sydney: Clinical Excellence Commission; 2019. https://www.cec.health.nsw.gov.au/__data/assets/pdf_file/0008/487169/NSW-Health-Literacy-Framework-2019-2024.pdf. Accessed 20 Apr 2022.
27. McLaughlin GH. SMOG grading: a new readability formula. Journal of Reading. 1969;12(8):639-46.
28. Mac O, Ayre J, Bell K, McCaffery K, Muscat DM. Comparison of readability scores for written health information across formulas using automated vs manual measures. JAMA Netw Open. 2022;5(12):e2246051. https://doi.org/10.1001/jamanetworkopen.2022.46051
29. Office of Disease Prevention and Health Promotion. Health literacy online: a guide to simplifying the user experience. 2015. https://health.gov/healthliteracyonline/. Accessed 27 Oct 2023.
JGIM
Author Descriptions
Sydney Health Literacy Lab, Sydney School of Public Health, Faculty of Medicine and Health, The University of Sydney, Rm 128C Edward Ford Building, Sydney, NSW, Australia
Julie Ayre PhD, Olivia Mac MPH, Kirsten McCaffery PhD, Brad R. McKay FRACGP, MPH, Mingyi Liu MPH, Yi Shi MPH & Atria Rezwan BPsychSc(Hons)
Discipline of Biomedical Informatics and Digital Health, School of Medical Sciences, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW, Australia
Adam G. Dunn PhD