Mining Electronic Medical Records
The second speaker, Bryant England, MD, PhD, associate professor in the Division of Rheumatology, University of Nebraska Medical Center, Omaha, began his talk by discussing the wide variability in estimates of RA-ILD incidence and prevalence. Such variations raise questions: Are we over- or underdiagnosing RA-ILD? How wide is the spectrum of RA-ILD? One point of general agreement is that clinicians must seek to identify RA-ILD as early as possible to have any hope of reducing mortality (of note, the median survival for patients with RA-ILD is 3–10 years).5
Dr. England pointed out that electronic medical record systems contain a wealth of clinical information and that, with the right algorithmic techniques, these data could be harnessed to better identify patients with RA-ILD. In a 2020 study, Dr. England et al. used the records of more than 500 patients with RA from Veterans Affairs administrative data sets and identified patients with RA-ILD based on International Classification of Diseases, Ninth Revision (ICD-9) and Tenth Revision (ICD-10) codes. They then compared the performance of administrative algorithms for identifying RA-ILD against detailed chart review. The authors found that the best-performing algorithm incorporated at least two ILD diagnosis codes 30 days apart; a pulmonologist diagnosis or CT/pulmonary function test (PFT) evidence of ILD prior to the rheumatologist’s diagnosis; and exclusion of other ILD causes.6
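To make the shape of such an administrative rule concrete, here is a minimal sketch in Python. It is not the study's algorithm: the function names, the boolean inputs and the reading of "30 days apart" as "at least 30 days apart" are all assumptions made for illustration, and the timing requirement relative to the rheumatologist's diagnosis is left out.

```python
from datetime import date, timedelta


def two_codes_apart(code_dates, min_gap_days=30):
    """True if at least two ILD-coded encounters are >= min_gap_days apart.

    Assumption: "30 days apart" is read as "at least 30 days apart".
    If any pair of dates meets the gap, the earliest and latest dates do too.
    """
    if len(code_dates) < 2:
        return False
    return max(code_dates) - min(code_dates) >= timedelta(days=min_gap_days)


def flag_ra_ild(code_dates, pulmonologist_dx, ct_or_pft_evidence, other_ild_cause):
    """Hypothetical combination of the three components described above."""
    return (
        two_codes_apart(code_dates)
        and (pulmonologist_dx or ct_or_pft_evidence)
        and not other_ild_cause
    )


# Illustrative patient: two ILD-coded visits 51 days apart, seen by a pulmonologist.
dates = [date(2019, 1, 10), date(2019, 3, 2)]
print(flag_ra_ild(dates, pulmonologist_dx=True,
                  ct_or_pft_evidence=False, other_ild_cause=False))
```

In practice, the inputs would come from coded encounter data, and any such rule would need to be validated against chart review, as the authors did.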
In a separate study, Dr. England et al. extended their work on RA-ILD algorithms by applying text mining techniques to high-resolution CT (HRCT) reports. In this study, patients with RA-ILD were identified, and natural language processing software flagged ILD-related terms, such as reticulation, ground glass, honeycomb and interstitial, in chest CT reports. Importantly, the researchers confirmed that no negating terms appeared near these words (e.g., such phrases as no honeycombing or absence of reticulations). Combining ILD-related terms from CT reports with administrative algorithms yielded a positive predictive value of more than 90% for identifying RA-ILD.7 Projects like these matter when thinking about how to identify RA-ILD in real-world scenarios; frequently, patients with RA present to their primary care doctor with cough or other vague pulmonary symptoms and undergo chest CT imaging. Dr. England asked the audience to picture this scenario and add the element of an artificial intelligence system working in the background, culling radiology reports from such patients and flagging any with potential concern for RA-ILD, prompting the treating rheumatologist to order PFTs and consider a pulmonology referral.
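The term-matching step with a negation check can be sketched in a few lines of Python. This is a simplified illustration, not the software used in the study: the term list is taken from the examples above, while the negation cues, the five-word look-back window and the function name are assumptions.

```python
import re

# ILD-related terms drawn from the examples in the text above.
ILD_TERMS = re.compile(
    r"\b(reticulations?|ground[- ]glass|honeycomb(?:ing)?|interstitial)\b",
    re.IGNORECASE,
)
# Hypothetical negation cues; a production system would use a fuller list.
NEGATION_CUES = re.compile(
    r"\b(no|not|without|absence of|negative for)\b", re.IGNORECASE
)


def find_ild_terms(report, window=5):
    """Return ILD terms not preceded by a negation cue in the same sentence."""
    hits = []
    for sentence in re.split(r"[.;\n]+", report):
        for match in ILD_TERMS.finditer(sentence):
            # Look back over the few words before the term for a negation cue.
            preceding = " ".join(sentence[:match.start()].split()[-window:])
            if NEGATION_CUES.search(preceding):
                continue
            hits.append(match.group(0).lower())
    return hits


report = "No honeycombing. Peripheral reticulation with ground-glass opacities."
print(find_ild_terms(report))  # ['reticulation', 'ground-glass']
```

A fixed look-back window is a crude stand-in for real negation detection; it misses cues that follow the term and can over-apply a cue to later findings in a long sentence.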