For clinical research, electronic health records (EHRs) give investigators access to large amounts of patient data, including longitudinal information on a patient’s disease course and genetic data. Given the right algorithm, EHRs may be an efficient tool for identifying and studying patients with uncommon conditions, such as systemic lupus erythematosus (SLE). In the past, researchers have used algorithms based on multiple counts of the International Classification of Diseases, Ninth Revision (ICD-9) Clinical Modification billing code for SLE, 710.0. However, this method has not been rigorously validated and has yielded positive predictive values of only 50–60% in general populations.
A recent study in patients with rheumatoid arthritis expanded the ICD-9-based approach, designing algorithms that incorporated more disease-related data with ICD-9 (710.0) coding. Building on this concept, April Barnado, MD, and colleagues from the Vanderbilt University Medical Center in Nashville set out to design and validate EHR algorithms that incorporate data from multiple counts of ICD-9 (710.0) coding, laboratory testing, medication data and keywords to accurately identify patients with SLE. The results of their work appear in the May 2017 issue of Arthritis Care & Research.
Using a de-identified version of the Vanderbilt EHR with 2.5 million subjects, researchers identified all individuals with at least one SLE ICD-9 code (710.0) (N=5,959). They then built a training set of 200 randomly selected individuals, whose true disease status was determined through chart review by two rheumatologists. Positive predictive values and sensitivity were calculated for each algorithm, and the algorithms with the highest positive predictive values were internally validated using a random set of 100 individuals drawn from the remaining 5,759 subjects.
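As an illustration of the two metrics named above, here is a minimal Python sketch that computes positive predictive value and sensitivity by comparing an algorithm's flags with chart-review labels. The function name and the toy data are hypothetical and are not taken from the study.

```python
def ppv_and_sensitivity(algorithm_flags, chart_review_labels):
    """Return (PPV, sensitivity) for two equal-length sequences of booleans.

    algorithm_flags: True where the algorithm flags a patient as having SLE.
    chart_review_labels: True where chart review confirms SLE.
    """
    if len(algorithm_flags) != len(chart_review_labels):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(algorithm_flags, chart_review_labels))
    true_pos = sum(1 for flag, truth in pairs if flag and truth)
    false_pos = sum(1 for flag, truth in pairs if flag and not truth)
    false_neg = sum(1 for flag, truth in pairs if not flag and truth)

    flagged = true_pos + false_pos      # everyone the algorithm flagged
    confirmed = true_pos + false_neg    # everyone chart review confirmed

    # PPV: share of flagged patients who truly have the disease.
    ppv = true_pos / flagged if flagged else float("nan")
    # Sensitivity: share of true patients whom the algorithm flagged.
    sensitivity = true_pos / confirmed if confirmed else float("nan")
    return ppv, sensitivity


# Toy data for illustration only (six hypothetical patients).
flags = [True, True, True, False, False, True]
labels = [True, True, False, True, False, True]
ppv, sensitivity = ppv_and_sensitivity(flags, labels)
print(f"PPV={ppv:.2f}, sensitivity={sensitivity:.2f}")
```

In this toy example, three of the four flagged patients are confirmed, and three of the four confirmed patients are flagged, so both metrics come out at 0.75.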
The Results
“We developed and validated three novel algorithms to identify patients with SLE using multiple classes of data available in the EHR,” write the authors in their discussion. “It is the first instance of validated algorithms to incorporate laboratory and medication values with the SLE ICD-9 code.”
In the training set, the three algorithms exhibited positive predictive values of 95%, 89% and 91%. The rheumatologists, with 96% agreement, identified 90 patients with SLE and 95 without, and excluded 15 patients with missing information. (Note: For this study, a case was identified as a patient diagnosed with SLE by a rheumatologist, nephrologist or dermatologist.) Using these 185 patients, researchers found that “as the frequency of the code T2 counts increased, the [positive predictive value] increased. Excluding ICD-9 codes for [systemic sclerosis] and [dermatomyositis] further increased the [positive predictive values] for all the algorithms by 2–5%,” they write.
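The two levers described in that quote, a minimum count of the SLE code and the exclusion of overlapping diagnoses, can be sketched as a simple rule. This is not the authors' algorithm, which also drew on laboratory, medication and keyword data. The names, the default threshold of two counts, and the exclusion codes (710.1 for systemic sclerosis and 710.3 for dermatomyositis, the standard ICD-9-CM codes, which the article does not list) are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List

SLE_CODE = "710.0"
# Assumed exclusion codes: 710.1 systemic sclerosis, 710.3 dermatomyositis.
EXCLUSION_CODES = {"710.1", "710.3"}


@dataclass
class PatientRecord:
    """Hypothetical record: one list entry per billed ICD-9 code instance."""
    icd9_codes: List[str] = field(default_factory=list)


def flag_possible_sle(record, min_sle_counts=2, exclude_overlap=True):
    """Flag a patient when the SLE code appears at least min_sle_counts times.

    With exclude_overlap set, any systemic sclerosis or dermatomyositis
    code removes the patient from the flagged group.
    """
    sle_count = record.icd9_codes.count(SLE_CODE)
    if sle_count < min_sle_counts:
        return False
    if exclude_overlap and any(c in EXCLUSION_CODES for c in record.icd9_codes):
        return False
    return True


# Hypothetical patients for illustration only.
repeated_code = PatientRecord(["710.0", "710.0", "710.0"])
single_code = PatientRecord(["710.0"])
overlap = PatientRecord(["710.0", "710.0", "710.1"])

print(flag_possible_sle(repeated_code))
print(flag_possible_sle(single_code))
print(flag_possible_sle(overlap))
```

Raising `min_sle_counts` or enabling `exclude_overlap` makes the rule stricter, which would be expected to trade sensitivity for a higher positive predictive value, in line with the pattern the authors report.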