“How do I find 3,000 geographically dispersed diabetics who are currently taking metformin and who do not have chronic kidney disease, of whom at least 1,000 are currently in good control and at least 1,000 are not?”
Welcome back! Our opening question illustrates the opportunity of healthplan data for efficient recruiting for a widely-distributed pragmatic study (WDPS, as discussed in Episodes 2 and 3 of this series); and we’ll use the question to illustrate the strengths, limitations and “gotchas” inherent in using this kind of data--and how it can be used wisely in expert hands.
Cut to the punchline: Healthplan data carries many advantages in recruiting for your pragmatic study (as well as for the retrospective analytics often needed to clarify questions and design your study). But it’s designed for payment, not for clinical outcomes studies, so it is most useful in expert hands and when properly used. Working directly with a healthplan can bring additional advantages, along with certain caveats. We provide a handy checklist to help you navigate.
Why healthplan data? The WDPS can yield actionable insights about the use, adherence to, and consequences (outcomes) of treatments over a wide range of appropriate real-world clinical scenarios, patient and clinician characteristics, and healthcare settings. This implies your study will need to recruit and track a lot of patients (subjects) and clinicians over a breadth of geography. Large healthplans have this kind of data.
How big a population do you need to do your study? For example, if you need 3,000 type 2 diabetics age 45-64, let’s estimate that: your healthplan’s age distribution mirrors the overall US population, with 26.4% age 45-64 (1) (of course, if you don’t have Medicare data, this proportion is probably larger); the prevalence of diagnosed diabetes in this age group in your healthplan data approximates the CDC’s estimate of 12.7% (2); 8.5% of those members will be excluded due to chronic kidney failure (3); and 60% are taking metformin, of whom 85% aren’t using insulin and, of those, 70% aren’t taking other oral hypoglycemics (4). This gets us to about 109,000 potential candidates.
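The arithmetic behind that estimate can be sketched as a simple multiplicative funnel. Note the assumption: the starting membership is not stated above, so the 10 million used here is a hypothetical figure, chosen only because it lands near the quoted total (about 109,500). The percentages are the ones cited in the paragraph.

```python
# Recruiting funnel sketch: multiply the starting membership by each
# successive inclusion fraction cited in the text.
# ASSUMPTION: STARTING_MEMBERS is hypothetical; the article does not state it.
STARTING_MEMBERS = 10_000_000

FUNNEL_STEPS = [
    ("age 45-64", 0.264),                      # (1)
    ("diagnosed diabetes", 0.127),             # (2)
    ("no chronic kidney failure", 1 - 0.085),  # (3) 8.5% excluded
    ("taking metformin", 0.60),                # (4)
    ("not using insulin", 0.85),               # (4)
    ("no other oral hypoglycemics", 0.70),     # (4)
]


def run_funnel(start, steps):
    """Return a list of (label, members_remaining) after each step."""
    remaining = float(start)
    rows = []
    for label, fraction in steps:
        remaining *= fraction
        rows.append((label, remaining))
    return rows


funnel_rows = run_funnel(STARTING_MEMBERS, FUNNEL_STEPS)
for label, remaining in funnel_rows:
    print(f"{label:<30} {remaining:>12,.0f}")
```

Swap in your own plan’s membership and proportions; because the steps multiply, a modest change in any one of them moves the final count by the same percentage.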
But then you (probably through the healthplan) have to recruit their doctors (directly, or indirectly by first reaching out to potentially qualifying healthplan members), who in turn screen and recruit patients, some of whom drop out:
Those numbers look promising, but if you know healthplan data you’ll know that you still have some cutting down to do. For example, your protocol may specify that qualifying members must have been “continuously eligible (insured)” for at least 12 months, must not turn 65 during the 12-month study, and must not have been in the ER or hospital with hypo- or hyperglycemia or acute cardiovascular disease in the past 12 months. And, of course, they must give consent if the study involves being randomized to, or offered a choice to receive, an intervention (or if your IRB says consent is required, whatever you may think).
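As a rough illustration of how such protocol criteria might be applied to a member list, here is a minimal sketch. Everything in it is an assumption made for the example: the Member record, its field names, and the sample members are hypothetical and do not reflect any real healthplan’s schema. Consent is left out because it is collected during recruiting rather than found in the data.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Member:
    # Hypothetical record layout, for illustration only.
    member_id: str
    birth_date: date
    months_continuously_enrolled: int
    # ER or hospital visit for hypo-/hyperglycemia or acute
    # cardiovascular disease in the past 12 months.
    acute_event_past_12_months: bool


def age_on(birth_date, on_date):
    """Age in whole years on a given date."""
    years = on_date.year - birth_date.year
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


def is_eligible(member, study_end):
    """Apply the three example criteria from the protocol."""
    return (
        member.months_continuously_enrolled >= 12
        and age_on(member.birth_date, study_end) < 65  # does not turn 65 in-study
        and not member.acute_event_past_12_months
    )


study_end = date(2024, 12, 31)  # hypothetical 12-month study ending on this date
sample_members = [
    Member("A", date(1960, 6, 15), 24, False),  # meets all criteria
    Member("B", date(1959, 6, 15), 24, False),  # turns 65 during the study
    Member("C", date(1965, 3, 1), 8, False),    # under 12 months enrolled
    Member("D", date(1965, 3, 1), 36, True),    # recent acute event
]
eligible_ids = [m.member_id for m in sample_members if is_eligible(m, study_end)]
print(eligible_ids)
```

In practice each of these flags has to be derived from enrollment and claims records, which is where the healthplan’s informatics expertise comes in.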
The accuracy vs. inclusiveness trade-off: The CDC prevalence is based on more accurate sources than claims, which are notoriously susceptible to false-positive disease identifications (5). False negatives (failure to identify a diabetic) may occur, too, if there’s inadequate data history. In a study (6) that compared several Medicare claims-based algorithms and used self-report as the gold standard (“correct by definition”), the best model’s sensitivity was about 70% and its specificity 97.5% (7). Of course, self-report is a weaker gold standard than lab testing, which, with today’s electronic medical records, may automatically inform a claim (billing) diagnosis code. Unsurprisingly, the study found that combining more than one data source (e.g. inpatient, outpatient, lab, pharmacy), having a longer claims history, and requiring multiple claims for outpatient services reduced the false-positive rate. The higher the bar, the likelier your clues point to the real McCoy, but also the more real McCoys you will miss.
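To make the trade-off concrete, here is a small worked example of what those sensitivity and specificity figures would mean per 100,000 members. One assumption to flag: it pairs the Medicare study’s figures with the 12.7% CDC prevalence used earlier, a mix made purely for illustration and not a result from either source.

```python
def screening_counts(population, prevalence, sensitivity, specificity):
    """Split a population into true/false positives and missed cases."""
    with_disease = population * prevalence
    without_disease = population - with_disease
    true_positives = with_disease * sensitivity
    false_negatives = with_disease - true_positives
    false_positives = without_disease * (1 - specificity)
    return true_positives, false_negatives, false_positives


def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of flagged members who really have the condition."""
    tp, _, fp = screening_counts(1.0, prevalence, sensitivity, specificity)
    return tp / (tp + fp)


# Illustrative inputs: prevalence from (2), accuracy figures from (7).
tp, fn, fp = screening_counts(100_000, 0.127, 0.70, 0.975)
ppv = positive_predictive_value(0.127, 0.70, 0.975)
print(f"correctly flagged: {tp:,.0f}")
print(f"missed diabetics:  {fn:,.0f}")
print(f"falsely flagged:   {fp:,.0f}")
print(f"positive predictive value: {ppv:.1%}")
```

Under these inputs roughly four in five flagged members are true diabetics, while about 30% of real diabetics are never flagged. Raising the bar (for example, requiring multiple claims) trades the first problem for more of the second.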
Strengths and limitations of healthplan data for recruiting, outcomes:
Whether you’re looking to use healthplan data for recruiting or as part of assessing treatment, you must understand how to utilize this fabulous data trove wisely.
Here’s a checklist for using healthplan data:
The easiest way to work with healthplan data is, of course, to work with a healthplan! While it’s unlikely they’ll give you direct access to their data, their informatics specialists--who know their data extremely well--will know how to elicit your research questions and recruiting criteria, convert them to queries, and--if the data will be used as part of outcomes evaluation--develop analytics. In addition, healthplans are entitled to reach out to their members and contracted clinicians and facilities (with certain constraints).
Working with a healthplan may imbue your relationship with a heightened sense of collaboration. A downside to working with a single healthplan is the limited number of potential subjects for a study, which could be important with a rare disease or treatment. In some circumstances it may be possible to work with more than one plan, and in the near future multi-data vendors may arise that can identify patients and doctors for recruiting.
Closing remarks: Healthplan data offers a wealth of advantages and opportunities for recruiting and gaining insights into the drivers and outcomes of therapies. This is especially important to the widely-distributed type of pragmatic study--but please engage (or be) an expert!
Next up: Let’s talk recruiting!
Want to know more? Find us HERE!
NOTES