In practice, our learning healthcare system relies primarily on observational studies that each generate a single effect estimate, using a customized study design with unknown operating characteristics, and that are then published, or not, one estimate at a time. When we examine the distribution of estimates that this process has produced, we see clear evidence of its shortcomings, including an apparent overabundance of statistically significant effects.
We propose a standardized process for performing observational research that can be evaluated, calibrated and applied at scale to generate a more reliable and complete evidence base than previously possible. We demonstrate this new paradigm by generating evidence about all pairwise comparisons of 39 treatments for hypertension for a relevant set of 58 health outcomes using nine large-scale health record databases from four countries.
In total, we estimate 1.3M hazard ratios, each using a comparative effectiveness study design and propensity score stratification on par with current one-off observational studies in the literature. Moreover, the process enables us to employ negative and positive controls to evaluate and calibrate the estimates, ensuring, for example, that the 95% confidence interval includes the true effect size 95% of the time. Where established knowledge exists, the result set consistently reflects it, and the distribution of results shows no evidence of the faults of the current process.
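To make the coverage claim concrete, the following is a minimal Python sketch of how negative controls (exposure-outcome pairs assumed to have a true hazard ratio of 1) can be used to check and adjust interval coverage. It is an illustration under stated assumptions, not the authors' procedure: the data are synthetic, the function names `coverage` and `fit_systematic_error` are hypothetical, and the systematic error is fitted with a simple method-of-moments estimate rather than whatever estimator the study itself uses.

```python
import math
import random

Z95 = 1.959963984540054  # two-sided 95% quantile of the standard normal


def coverage(estimates, true_log_hr=0.0, bias_mean=0.0, bias_sd=0.0):
    """Fraction of 95% intervals that contain the true log hazard ratio.

    With bias_mean = bias_sd = 0 this is the coverage of the nominal
    intervals; otherwise each interval is shifted by the fitted bias and
    widened by the fitted spread of the systematic error.
    """
    hits = 0
    for log_hr, se in estimates:
        center = log_hr - bias_mean
        half_width = Z95 * math.sqrt(se ** 2 + bias_sd ** 2)
        if center - half_width <= true_log_hr <= center + half_width:
            hits += 1
    return hits / len(estimates)


def fit_systematic_error(estimates):
    """Method-of-moments fit of systematic error from negative controls.

    Assumption: the variance of the estimates equals the variance of the
    systematic error plus the average sampling variance.
    """
    n = len(estimates)
    mean = sum(x for x, _ in estimates) / n
    total_var = sum((x - mean) ** 2 for x, _ in estimates) / (n - 1)
    sampling_var = sum(se ** 2 for _, se in estimates) / n
    return mean, math.sqrt(max(0.0, total_var - sampling_var))


# Synthetic negative controls: true log hazard ratio is 0, but every
# estimate carries an unknown systematic error in addition to sampling error.
rng = random.Random(0)
negative_controls = []
for _ in range(2000):
    se = rng.uniform(0.05, 0.3)        # standard error of this estimate
    bias = rng.gauss(0.1, 0.15)        # simulated systematic error
    negative_controls.append((rng.gauss(bias, se), se))

mu, tau = fit_systematic_error(negative_controls)
uncalibrated = coverage(negative_controls)
calibrated = coverage(negative_controls, bias_mean=mu, bias_sd=tau)

print(f"uncalibrated coverage: {uncalibrated:.3f}")
print(f"calibrated coverage:   {calibrated:.3f}")
```

In this simulation the nominal intervals cover the true value less often than 95% of the time because they ignore systematic error, while the adjusted intervals come close to the nominal level. Positive controls, which the sketch omits, would be needed to check coverage at effect sizes other than 1.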
Joint work with George Hripcsak, Patrick Ryan, Martijn Schuemie, and Marc Suchard.