Effective Health Care Program

Adaptation of Data Mining Algorithms Assessing the Comparative Effectiveness and Safety of NSAIDs

Research Report

Findings from this study were published in the following journal article.

Curtis JR, Cheng H, Delzell E, Fram D, Kilgore M, Saag K, Yun H, DuMouchel W. Adaptation of Bayesian Data Mining Algorithms to Longitudinal Claims Data: Coxib Safety as an Example. Med Care. 2008 Sep;46(9):969-975.



Bayesian data mining methods have been used to evaluate drug safety signals from adverse event reporting systems and allow for evaluation of multiple endpoints that are not prespecified. Their adaptation for use with longitudinal data such as administrative claims has not been previously evaluated or validated.


In this pilot study, we evaluated the feasibility of adapting data mining methods to longitudinal administrative claims data using the empirical Bayes Multi-item Gamma Poisson Shrinkage (MGPS) algorithm. The Medicare Current Beneficiary Survey was used to identify a cohort of Medicare enrollees who were exposed to cyclooxygenase-selective (coxib) or nonselective nonsteroidal anti-inflammatory drugs (NS-NSAIDs) from 1999 to 2003. The empirical Bayes MGPS algorithm was used to simultaneously evaluate 259 outcomes associated with current use of coxibs versus NS-NSAIDs while adjusting for key covariates and multiple comparisons. For comparison, a parallel analysis used traditional epidemiologic methods to evaluate the relationship between coxib versus NS-NSAID use and acute myocardial infarction, with the goal of establishing the concurrent validity of the data mining approach.
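
The shrinkage idea behind a gamma-Poisson model can be illustrated with a minimal sketch. This is not the study's implementation: it assumes a single gamma prior (MGPS itself fits a mixture of gamma distributions by empirical Bayes), and the function names, hyperparameters `alpha` and `beta`, and the example counts are all hypothetical. Given an observed event count N and an expected count E for one drug-outcome pair, the posterior for the rate ratio is Gamma(alpha + N, beta + E), and the empirical Bayes geometric mean (EBGM) is the exponential of the posterior mean of the log ratio.

```python
import math


def digamma(x: float) -> float:
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    if x <= 0:
        raise ValueError("x must be positive")
    result = 0.0
    # Shift x upward with psi(x) = psi(x + 1) - 1/x until the series is accurate.
    while x < 6.0:
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    # psi(x) ~ ln x - 1/(2x) - 1/(12x^2) + 1/(120x^4) - 1/(252x^6)
    result += math.log(x) - 0.5 / x - inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result


def ebgm_single_gamma(n_observed: float, n_expected: float,
                      alpha: float, beta: float) -> float:
    """Geometric-mean estimate of the observed/expected ratio.

    Assumes N ~ Poisson(lambda * E) with lambda ~ Gamma(shape=alpha, rate=beta),
    so the posterior is Gamma(alpha + N, beta + E) and
    E[log lambda | N] = digamma(alpha + N) - log(beta + E).
    """
    if n_expected <= 0 or alpha <= 0 or beta <= 0 or n_observed < 0:
        raise ValueError("invalid counts or hyperparameters")
    return math.exp(digamma(alpha + n_observed) - math.log(beta + n_expected))


if __name__ == "__main__":
    # Hypothetical counts: 3 events observed where 1 was expected.
    raw_ratio = 3 / 1
    shrunk = ebgm_single_gamma(3, 1, alpha=2.0, beta=2.0)
    print(f"raw ratio {raw_ratio:.2f}, shrunken estimate {shrunk:.2f}")
```

With a prior centered on a ratio of 1 (alpha equal to beta), sparse counts are pulled strongly toward 1, while large counts stay close to the raw observed/expected ratio. This is what makes it practical to screen many outcomes at once without flagging every small-count fluctuation.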


Among 9431 Medicare beneficiaries using NSAIDs and considering all 259 possible outcomes, empirical Bayes MGPS identified an association between current celecoxib use and acute myocardial infarction (empirical Bayes geometric mean [EBGM] ratio 1.91) but not other outcomes. Rofecoxib use was associated with acute cerebrovascular events (EBGM ratio 1.85) and several other diagnoses that likely represented indications for the drug. Results from the analyses using traditional epidemiologic methods were similar, supporting the concurrent validity of the data mining results.


Bayesian data mining methods appear useful for evaluating drug safety using administrative data. Further work will be needed to extend these findings to different types of drug exposures and to other claims databases.


The objective of this Technical Brief is to provide an overview of U.S. Food and Drug Administration (FDA)-approved enzyme replacement therapy (ERT) for the treatment of six lysosomal storage diseases (LSDs). The purpose of a Technical Brief is to report what outcomes (benefits and harms) have been studied for a technology, drug, or procedure; it does not enumerate those outcomes. The Technical Brief also addresses research gaps identified during its preparation. It is not intended as a comparative effectiveness review or systematic review that draws conclusions as to the clinical benefits and harms of a drug, device, or procedure. It does not assess study quality or the strength of the body of evidence on specific outcomes.