


Use of Electronic Medical Records and Administrative Claims Data for Assessing Type 2 Diabetes Care

Research Report Apr 7, 2010

This report is available in PDF (1.5 MB) only. People using assistive technology may not be able to fully access information in this file. For assistance, please contact us.

Note: This report is greater than 5 years old. Findings may be used for research purposes but should not be considered current.



Evaluating the adequacy of diabetic care requires access to clinical data not available from administrative claims data. Electronic medical records (EMRs) may be a valuable resource for studying diabetes and related complications. More medical systems are adopting EMRs, but we know relatively little about the opportunities and challenges of using these newer EMRs for clinical and health services research.


This project had 2 parallel goals. The informatics goal was to understand the challenges in conducting research using EMR data and to compare the usefulness of these data with that of administrative data from the North Carolina (NC) Medicaid program. The clinical goal was to evaluate medication use in patients with newly diagnosed diabetes mellitus type 2 (DMT2), some of whom may have comorbid conditions such as hypertension and dyslipidemia, and to assess the adequacy of diabetic care.


The study design consisted of cohort analyses of patients with newly diagnosed DMT2.

Setting and Patients

Data from 2 populations were used: patients seen by clinicians affiliated with the University of North Carolina Health Care System (UNCHCS) and individuals covered by NC Medicaid. DMT2 was required to have been diagnosed after January 1, 2001 for the UNC cohort and after January 1, 2002 for the NC Medicaid cohort.


We conducted a literature search to identify publications focusing on the efficacy and effectiveness of medications for newly diagnosed DMT2. We used this review to determine which glycemic indicators should be assessed to determine the adequacy of medications used in the 2 study populations.

The EMR we used for this project was from the UNCHCS (WebCIS). Because UNC is an academic medical center, we developed inclusion and exclusion criteria to restrict the population to those who appeared to be seen regularly by UNC clinicians. We used the patient problem, laboratory test, medication prescribing, and transcription files to identify newly diagnosed DMT2 patients. We developed and tested a medical record abstraction form to guide clinical review of the EMR data, whereby clinicians evaluated the adequacy of glycemic, lipid, and hypertension control.

For the NC Medicaid data, we used typical claims-based algorithms to identify new DMT2; these required eligibility for coverage and the absence of DMT2 diagnosis and medication codes in the 1 year before diagnosis.

In both patient cohorts, we described comorbidities present when DMT2 was diagnosed and medications used within 12 months after diagnosis. We also identified patients who died, had a stroke, or had a myocardial infarction (MI) early in the course of their DMT2 and described their care in the 12 months before the outcome.
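The claims-based rule described above (coverage eligibility plus no DMT2 diagnosis or medication codes in the year before diagnosis) can be illustrated with a minimal sketch. This is not the report's actual algorithm: the `Claim` and `Enrollment` records, the `has_dmt2_code` flag, the single continuous enrollment span, and the 365-day lookback are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional

LOOKBACK = timedelta(days=365)  # assumed length of the "1 year before diagnosis" window


@dataclass
class Claim:
    service_date: date
    has_dmt2_code: bool  # hypothetical flag: DMT2 diagnosis code or antidiabetic drug code


@dataclass
class Enrollment:
    start: date  # first day of coverage (assumed to be one continuous span)
    end: date    # last day of coverage


def new_dmt2_index_date(
    claims: List[Claim], enrollment: Enrollment, cohort_start: date
) -> Optional[date]:
    """Return the first DMT2 claim date if the patient looks newly diagnosed, else None."""
    dmt2_dates = sorted(c.service_date for c in claims if c.has_dmt2_code)
    if not dmt2_dates:
        return None  # no DMT2 evidence at all
    index = dmt2_dates[0]  # earliest DMT2 code in the data
    if index <= cohort_start:
        return None  # diagnosis must fall after the cohort start date
    # The patient must have been covered for the whole lookback year; otherwise
    # the absence of earlier DMT2 codes says nothing about earlier disease.
    covered = enrollment.start <= index - LOOKBACK and enrollment.end >= index
    return index if covered else None
```

Because the index date is the earliest DMT2 code on file, requiring coverage throughout the preceding year is what makes the "no prior codes" condition meaningful. A patient who enrolled only a few months before the first code is excluded rather than counted as new.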


We identified numerous challenges in meeting the informatics goal for this project. First, although structured data such as laboratory results had been deidentified by anonymizing the medical record number, full-text transcription notes from the visit files still contained identifiable patient information. These notes had to be manually deidentified before leaving the clinical site repository. We incorporated text data-mining procedures to use the free-text visit data most efficiently for the project.
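To make the text-mining step concrete, here is a minimal sketch of negation-aware term matching in free-text notes. It is a generic illustration, not the procedure used in this project: the term pattern, the negation cue list, and the 40-character lookback window are all assumptions.

```python
import re
from typing import List

# Hypothetical patterns, chosen only for illustration.
TERM = re.compile(r"\b(type 2 diabetes|diabetes mellitus type 2|dmt2)\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(no|denies|without|negative for)\b", re.IGNORECASE)
WINDOW = 40  # characters before the term that are scanned for a negation cue


def affirmed_mentions(note: str) -> List[str]:
    """Return the sentences that mention DMT2 without a preceding negation cue."""
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        match = TERM.search(sentence)
        if match is None:
            continue
        preceding = sentence[max(0, match.start() - WINDOW):match.start()]
        if NEGATION.search(preceding):
            continue  # e.g. "denies type 2 diabetes"
        hits.append(sentence.strip())
    return hits
```

A rule this simple ignores family history, historical notations, and cues that follow the term (such as "was ruled out"), so its output would still need checking against manual review.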

For the clinical goal of assessing medication use in DMT2 patients, we focused on patients with adverse outcomes. In all, 78 of the 1664 WebCIS patients had an MI or stroke or died after DMT2 diagnosis, only 30 of whom had a truly new diagnosis of DMT2 based on manual record review. From the 2794 newly diagnosed DMT2 NC Medicaid patients, we identified 49 who had an MI, 173 who had a stroke, and 14 who had both events (death was not captured in this population).
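The WebCIS counts above imply two proportions that help in reading the results. The short calculation below uses only the figures reported in this paragraph; the helper name `pct` is introduced here for illustration.

```python
def pct(numerator: int, denominator: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * numerator / denominator, 1)


webcis_cohort = 1664       # WebCIS patients identified as DMT2
webcis_with_outcome = 78   # had an MI or stroke or died after DMT2 diagnosis
confirmed_new = 30         # truly new DMT2 on manual record review

outcome_share = pct(webcis_with_outcome, webcis_cohort)    # 4.7
confirmed_share = pct(confirmed_new, webcis_with_outcome)  # 38.5
print(outcome_share, confirmed_share)
```

In other words, about 4.7% of the WebCIS cohort had one of these outcomes, and only about 38.5% of those patients were confirmed as newly diagnosed on manual review.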

Of the 30 newly diagnosed WebCIS diabetics who had an MI or stroke after their DMT2 diagnosis, most had hypertension (HT) and/or dyslipidemia (DL) in addition to their DMT2 (71% of MI patients and 60% of stroke patients). Only 20% of the patients who had both HT and DL in addition to DMT2 received adequate pharmacological treatment. Of the 19 WebCIS patients who had comorbid HT, only 42% were prescribed an angiotensin-converting enzyme inhibitor (ACEI) or angiotensin receptor blocker (ARB) soon after their DMT2 was diagnosed.

We could not assess the adequacy of glycemic, lipid, or hypertension control in the Medicaid population because the claims data lack clinical information. Of the 49 patients who had an MI, 49% had comorbid DMT2, HT, and DL, whereas only 30.6% of the patients who had a stroke had comorbid disease. Of the 1753 Medicaid patients who had DMT2 with comorbid HT, 61.5% were dispensed an ACEI or ARB within 12 months after the DMT2 diagnosis. The proportion of WebCIS diabetic patients who had an MI and/or stroke and comorbid HT and/or DL was larger than that in the Medicaid cohort, possibly reflecting a greater burden of illness at the tertiary care center.


By conducting similar analyses in both patient populations, we could discern the value of each data resource for conducting observational research on DMT2. We applied many of the principles of claims-based analysis to the EMR but faced many new challenges throughout the project.

  1. With regard to the informatics goal, the UNC WebCIS EMR was a rich data source with good representation across all UNC care sites. However, the penetration and use of some of the specific functions (electronic prescribing) within the WebCIS system were variable and may have resulted in underreporting of medications. Although text-mining methods were helpful in addressing this issue, comparison of WebCIS data versus text-mining results indicated that neither data source provided complete reporting of patients’ medications.
  2. Given the fragmentation of healthcare in the US, we do not have access to longitudinal data sources that allow us to determine when a condition was first diagnosed. Identifying the onset of a chronic condition such as DMT2 is difficult when using any electronic database, whether EMRs or administrative claims databases. The challenge is to achieve high sensitivity while maintaining high specificity, which limits the number of false-positive cases, especially when dealing with relatively prevalent diseases such as diabetes. Validation of case status is critical to ensure accurate disease classification. Discrete EMR values may be validated using full-text information from the same EMR, whereas validation of administrative claims data requires access to outside sources.
  3. Text-mining methods may be useful for ascertaining information critical to research studies and drug safety assessments. Greater focus should be placed on methods to maximize automated extraction of clinically relevant text data regarding historical notations, negation, diagnostic issues, and adverse effects of medications.
  4. Substantial technical challenges are present when using clinical data from an EMR for research. The promise of this activity is substantial and overall we were encouraged, but we underestimated the technical issues involved in carrying out research with an information-rich resource that has both discrete and free-text data.
  5. With regard to the clinical goal of this project, patients who suffered adverse outcomes (MI, stroke, death) early in the course of their DMT2 had substantial preexisting and coexisting morbidities. Treatment of these coexisting conditions, specifically dyslipidemia and hypertension, might have reduced the complication rate. These patients had relatively high utilization of care in the year before the adverse outcome, suggesting ample opportunity for intensification of therapies.
  6. Overall patterns of care and diabetes comorbidities in the Medicaid population were similar to those in the WebCIS population. The use of medications appeared to be somewhat more intensive in the Medicaid population. We caution that the populations were likely not directly comparable regarding factors such as demographics, socioeconomic status, and severity of disease.
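The sensitivity and specificity discussed in point 2 can be computed from a validation sample in which case status from the electronic algorithm is compared with manual record review. The sketch below is illustrative only: the `Validation` class is an assumption, and the counts in the example are hypothetical, not results from this report.

```python
from dataclasses import dataclass


@dataclass
class Validation:
    tp: int  # algorithm says new DMT2, record review agrees
    fp: int  # algorithm says new DMT2, record review disagrees
    fn: int  # algorithm misses a case that record review confirms
    tn: int  # algorithm and record review both say not a new case

    @property
    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    @property
    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    @property
    def ppv(self) -> float:
        # Positive predictive value: share of flagged cases that are true cases.
        return self.tp / (self.tp + self.fp)


# Hypothetical counts for illustration only.
example = Validation(tp=90, fp=30, fn=10, tn=870)
print(example.sensitivity, example.specificity, example.ppv)
```

With these made-up counts, sensitivity is 0.90 and specificity is about 0.97, yet one in four flagged cases is a false positive (PPV of 0.75). This is why point 2 calls validation of case status critical.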

Journal Publications

West SL, Blake C, Zhiwen L, et al. Reflections on the use of electronic health record data for clinical research. Health Informatics J 2009;15:108-21.

Kudyakov R, Bowen J, Ewen E, et al. Electronic health record use to classify patients with newly diagnosed versus preexisting type 2 diabetes: infrastructure for comparative effectiveness research and population health management. Popul Health Manag 2011 Aug 30 [Epub ahead of print]. PMID: 21877923.

Project Timeline

Mar 22, 2010: Topic Initiated
Apr 7, 2010: Research Report
Page last reviewed November 2017
Page originally created November 2017

Internet Citation: Research Report: Use of Electronic Medical Records and Administrative Claims Data for Assessing Type 2 Diabetes Care. Content last reviewed November 2017. Effective Health Care Program, Agency for Healthcare Research and Quality, Rockville, MD.
