Health care redesign identifies new approaches for treating chronic disease to improve outcomes, increase satisfaction, and lower resource utilization (RU) (Berwick 1996). To deliver these improvements, policy makers propose systematic changes to traditional primary care delivery, including institutional disease management, case management, care management (CM), the chronic care model (CCM), guided care, transitional care, personalized care, and the patient-centered medical home (PCMH). Some elements included in these models have been enumerated (Wagner 1998, Boult 2008, Coleman 2006, Coulter 2015, Berenson 2008, Stellefson 2013). They include: 1) registries and information systems, 2) self-management support, 3) decision support, 4) care managers in multidisciplinary teams (CMT), 5) delivery-system redesign, and 6) change in community resources and public policy. Researchers have used randomized controlled trials (RCTs), quasi-randomized studies, and cohort studies to evaluate implementation of specific elements of these models (Peters 1995, Davidson 2007, Shojania 2006). Reviews and policy statements based on these elements propose that deployment of some of these models will lead to quality and utilization benefits (Weingarten 2002). The American College of Physicians website promotes implementation of the CCM (Allweiss 2015) and a PCMH (ACP 2016). A large portion of literature on transforming the delivery of care stresses how these models will improve the care of people with chronic illnesses, and PCMHs have been promoted as benefiting all patients. However, the paucity of high-quality evidence (Shojania 2006, Stokes 2015, Hussey 2009, Kolbasovsky 2011, Solberg 2007) substantiating success in lowering cost, decreasing RU, increasing access, and improving satisfaction and quality should give those eager to implement these models some pause. A review by Jackson (2013) cited several factors needed to create a valid body of literature.
Results of rigorous studies of CMT and elements of that model have shown improved process without showing benefits in clinical outcomes, RU, or costs (Norris 2001, Loveman 2003, Goldman 2014). The large-scale comparison in the Health Disparities Collaboratives demonstrated improved process measures for only two of three disease states within safety-net institutions. No improvement in clinical outcomes, RU, or costs was demonstrated (Landon 2007).
Optimism for the various CM templates stemmed from early studies (Davidson 2007, Wagner 2001, Diabetes 1993). In a 12-month RCT, Aubert showed a statistically significant reduction in HbA1c for patients managed by a CMT versus those receiving standard care (1.7 vs. 0.6 percentage points, P=.001) but did not find differences in hospital admissions, patient satisfaction, or RU (Aubert 1998). Norris et al. (2001) reviewed 72 studies and found no economic effect. They concluded that external generalization of results for self-management was limited with respect to quality improvement and RU. Typically, evaluations of successful PCMH programs are reports of case-control designs (Higgins 2014). In a 15-center Medicare review of care coordination studies, Peikes (2009) reported no benefit in cost or RU with CM and CMT. In a meta-analysis, Stokes (2015) stated, “Current results do not support case management as an effective model, especially concerning reduction of secondary care use or total costs.”
Evaluation of the effect of CM is vulnerable to negative and positive spillover effects. This is due to the statistical aberration of clustering. In safety-net institutions, an additional source of misinterpretation is the diversion of fixed resources to benefit an intervention group at the expense of control and nonstudy patients.
This study looks at RU during reorganization of traditional primary care physician (PCP) practices. It compares RU by patients with diabetes who were care managed with those managed traditionally. Additional outcomes include comparison of RU for all patients in these physicians’ panels to identify “unintended consequences” as suggested by Jackson (2013). Comparisons required a cluster adjustment, which was also suggested by Jackson. Results reflect overall RU by the panels.
The study was a prospective RCT in a safety-net institution. The institutional review board (IRB) approved this study as a quality-improvement program based on the planned implementation of an organizational care delivery change. The IRB approved the consent for physicians participating in the study. There were no adjustments to the study protocol during the entirety of the study.
A query of a registry with 16,824 diabetic patients identified 18 PCPs working in one Federally Qualified Health Center, each caring for >300 patients with diabetes. These physicians were included as potential subjects if they were board-certified in internal medicine, had practiced at the study site for more than five years, and were willing to participate as members of Group 1 or Group 2 (Figure). The IRB approved a cluster randomization scheme in which a study coordinator assigned each PCP using a computer-generated random-number scheme, with each assignment placed in an opaque sealed envelope. After consent was obtained, a blinded study coordinator opened the envelope. The study’s principal investigator consented and randomized 12 PCPs to Group 1 or Group 2, as depicted in the Figure. The intervention group, Group 1, consisted of 5 PCPs assigned CMTs. The control group, Group 2, consisted of 5 PCPs who continued with traditional care. Each group had one alternate physician, randomized in case a physician left, similar to the way juries have alternate jurors.
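The allocation described above can be sketched in a few lines. This is a simplified illustration only: the source does not describe the random-number software, the seed, or the exact order in which envelopes were prepared, so the function name `allocate_clusters`, the seed, and the PCP labels are all hypothetical.

```python
import random

def allocate_clusters(pcp_ids, per_group=5, seed=None):
    """Randomly split physicians (clusters) into two groups plus one alternate each.

    Assumption: a single shuffle stands in for the sealed-envelope procedure.
    """
    order = list(pcp_ids)
    if len(order) != 2 * per_group + 2:
        raise ValueError("expected two full groups plus two alternates")
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible
    rng.shuffle(order)
    return {
        "group_1": order[:per_group],                # intervention (CMT)
        "group_2": order[per_group:2 * per_group],   # traditional care
        "alternate_1": order[2 * per_group],
        "alternate_2": order[2 * per_group + 1],
    }

# Hypothetical labels for the 12 consenting physicians.
allocation = allocate_clusters([f"PCP-{i:02d}" for i in range(1, 13)], seed=2024)
```

Because the physician, not the patient, is the unit of randomization, every patient in a panel inherits the physician's group.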
D-CMT=diabetes care management teams, D-TC=diabetes traditional care
A system-wide electronic Health Information System (HIS; Invision, Siemens, Malvern, Pa.) maintained a list of all patients with their assigned PCP. Patients diagnosed with diabetes by their PCP had been entered into a proprietary registry (DM, FileMaker Pro v. 6.0) either upon referral for diabetes education or abstraction from a one-time query of the HIS (ICD-9 codes 249.**, 250.**, 357, 362.0*, 366.4, 648.0*). A patient’s relationship with a specific PCP determined the patient’s group assignment. Prior to and during the baseline year of the study, the PCPs established the approach to diabetes care guided management.
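As an illustration of the one-time HIS query, the listed ICD-9 patterns can be turned into a simple matcher. The source does not define its wildcard notation, so this sketch assumes that each `*` stands for exactly one digit and that codes without a wildcard must match exactly; the function names are hypothetical.

```python
import re

# Patterns as listed in the text for the one-time HIS query.
REGISTRY_PATTERNS = ["249.**", "250.**", "357", "362.0*", "366.4", "648.0*"]

def compile_pattern(pattern):
    """Convert a registry pattern into a regex (assumption: '*' = one digit)."""
    # re.escape turns "." into "\." and "*" into "\*"; each escaped star
    # is then replaced by a single-digit matcher.
    return re.compile(re.escape(pattern).replace(r"\*", r"\d"))

COMPILED = [compile_pattern(p) for p in REGISTRY_PATTERNS]

def is_diabetes_code(code):
    """Return True if an ICD-9 code matches any registry pattern in full."""
    code = code.strip()
    return any(rx.fullmatch(code) for rx in COMPILED)
```

Under a different reading of the notation (for example, prefix matching), the regular expressions would need to change accordingly.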
The study lasted three years. Patients maintained their initial group assignment for the duration of the study and for analysis. Patients in both groups had an opportunity to attend a series of standardized self-management educational sessions (Norris 2001) discussing monitoring of blood sugar, adjusting lifestyle and medication based on test results, diet, exercise, and coping with chronic conditions and complications.
CMTs were formed as part of an operational change of the daily routine of care delivery (Solberg 2007). Three certified diabetic educators (CDEs) from the Diabetes & Metabolism clinic were reassigned to a Group 1 PCP to join a care-manager team (D-CMT). To be assigned, the CDE met specific criteria: 1) a current RN or PharmD license; 2) certification as a diabetes educator; and 3) passing an internal examination certifying familiarity with the protocols for medication adjustment, aspirin use, statin initiation, blood pressure treatment, microalbuminuria screening, and management. A medical assistant (MA) was assigned to work exclusively with the D-CMT. The team thus consisted of a Group 1 physician, a care manager, and the MA.
D-CMT members assumed extended roles. The PCP authorized the D-CMT to use all the management protocols for their patients with diabetes. D-CMTs implemented elements of the CCM program with emphasis on information technology, patient self-management, practice reorganization, and protocol-driven management. Patients were encouraged to call their PCP first, 24/7, for new problems or questions. The D-CMT worked with coaches from the MacColl Institute for practice facilitation during the study (Coleman 2009). D-CMT met routinely to discuss patients failing to achieve a clinical goal according to regular registry reports. The PCP provided backup for exceptions falling outside the guidelines. Patients agreed to contact by D-CMT to receive recommendations for management of medication and testing. The registry automatically generated reminders, results, and alerts. Medication adjustments followed protocols. D-CMT scheduled patients for follow-up after admission to a hospital, emergency department (ED), or urgent care (UC). The PCPs managed their remaining paneled patients with the assistance of regular clinic staff.
The traditional care (D-TC) followed routines, guidelines (ADA 2005), and protocols developed by practice consensus over 10 years with individual variations. Groups 1 and 2 had information systems with read-only access available for demographic information, clinical laboratory, diagnostic radiology, a separate picture archiving system, and access to guidelines and protocols, all through a nonintegrated clinical IT system. The system did not have registry functionality. Although Group 2 shared in the continuing medical education programs presenting the CCM model, there was no proactive effort to provide access to a registry or point of care access to guidelines or to initiate a redistribution of tasks according to skill level, training, aptitude, and interest among staff. A head nurse managed the practice of the 18 physicians. An assistant head nurse worked with 8–10 physicians and was assisted by MAs whose assignment to specific physicians varied through the week.
Group 2 practiced in the same clinic as Group 1. Group 2 physicians worked without a team structure. CDEs reported to a remote Diabetes & Metabolism clinic and provided consultation and self-management educational sessions on an ad hoc basis (Norris 2001). Eight CDEs shared an MA and had no access to the registry. Group 2 received ad hoc laboratory and radiological results per that PCP’s routine practice.
The divisions of endocrinology and primary care wrote management guidelines and protocols for diabetes and the cardiovascular cluster of diseases. The pharmacy and therapeutics committee approved the protocols before implementation. The protocols conformed to national guidelines with annual updates (ADA 2005).
A mature database (DM, FileMaker Pro, v. 6.0) was available to the D-CMT. The registry included robust query capabilities with algorithms that identified individuals failing to reach goals and in need of management (Peterson 2008). The registry produced lists of patients for D-CMT with clinical details. The D-CMT contacted patients based on these reports. The number of tests performed divided by the number recommended by guidelines determined compliance. Lists identified patients requiring additional assistance in accomplishing their self-management goals. Lists included notification of a patient’s admission to the hospital and a monthly update on visits to the ED and UC. D-TC received no reports.
The primary outcome was the change in RU rates between the two groups during the baseline and Year 3 of the study. RU included panel rates of admissions to the hospital, ED, and UC. An assumed 2% absolute reduction in urgent care visits would be meaningful with an observed UC visit rate of 13 visits per 100 patients per year. An 80% likelihood of detecting a change at the P<.05 level required 9,000 paneled patients and 300 diabetic patients per group. Based on historical data, 5 PCPs in each group would achieve this power. After randomizing 12 PCPs, the last two PCPs were designated alternates. The study excluded the remaining six PCP panels.
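The power statement above can be illustrated with the textbook normal-approximation formula for comparing two independent proportions. The source does not state which formula, software, or clustering inflation produced its figure of 9,000 paneled patients per group, so this sketch is an assumption and is not expected to reproduce that number; the function name and defaults are illustrative.

```python
import math
from statistics import NormalDist

def n_per_group_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Unadjusted sample size per group for a two-sided test of two proportions.

    Assumption: simple normal approximation with no adjustment for clustering.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_power) ** 2 * variance / (p1 - p2) ** 2
    return math.ceil(n)

# UC visit rate of 13 per 100 patients per year, with a 2% absolute reduction.
unadjusted_n = n_per_group_two_proportions(0.13, 0.11)
```

In a cluster-randomized design, an unadjusted figure like this is normally inflated to account for correlation among patients of the same physician.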
Effects were calculated using prespecified criteria. Intention-to-treat design included the entire panel of patients in Groups 1 and 2 as well as the subset of patients in the diabetes registry within those panels (D-CMT, D-TC). Statistical comparisons used standard statistical software for Student’s t-tests for continuous variables and a chi square for dichotomous values. An intracluster level adjustment used a rho, ρ, value of 0.01 with 10 clusters (Killip 2004). The unit of measure for RU is a rate of visits to a resource annually.
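One common way an intracluster correlation enters an analysis is through the design effect, 1 + (m − 1)ρ, where m is the average cluster size. The source does not describe exactly how its adjustment was applied to the t-tests and chi-square tests, so the sketch below shows only the general technique; the function names are hypothetical, and the inputs are the ρ and average panel size reported in the text.

```python
def design_effect(mean_cluster_size, icc):
    """Variance inflation from cluster randomization: 1 + (m - 1) * rho."""
    return 1 + (mean_cluster_size - 1) * icc

def effective_sample_size(total_n, mean_cluster_size, icc):
    """Number of independent observations the clustered sample is worth."""
    return total_n / design_effect(mean_cluster_size, icc)

# Inputs taken from the text: rho = 0.01 and an average panel of 1,969 patients.
deff = design_effect(mean_cluster_size=1969, icc=0.01)
```

Standard errors from an unadjusted analysis can be multiplied by the square root of the design effect to approximate the clustered result.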
Extended stays due to socioeconomic factors and regional cost idiosyncrasies precluded the usefulness of cost data. Readmission rates were determined for patients readmitted within 30 days of a discharge for any cause. Significance is reported by confidence intervals or P<.05.
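The 30-day, all-cause readmission rule can be sketched as follows. The source does not specify the denominator of the readmission rate or how transfers and same-day returns were treated, so this example only flags admissions that begin within 30 days of the previous discharge; the function name and the dates are hypothetical.

```python
from datetime import date

def flag_readmissions(stays, window_days=30):
    """Flag each stay that starts within `window_days` of the prior discharge.

    `stays` is a list of (admit_date, discharge_date) tuples for one patient.
    Returns (admit_date, is_readmission) pairs in chronological order.
    """
    flags = []
    previous_discharge = None
    for admit, discharge in sorted(stays):
        is_readmission = (
            previous_discharge is not None
            and 0 <= (admit - previous_discharge).days <= window_days
        )
        flags.append((admit, is_readmission))
        previous_discharge = discharge
    return flags

# Hypothetical patient with three admissions in one year.
example_flags = flag_readmissions([
    (date(2010, 1, 1), date(2010, 1, 4)),
    (date(2010, 1, 20), date(2010, 1, 22)),  # 16 days after prior discharge
    (date(2010, 4, 1), date(2010, 4, 3)),    # 69 days after prior discharge
])
```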
For the entire group of PCPs, the average practice experience was 14 years (median 13 years) at the same clinic. The average age was 40, and median, 35. Group 1 had two male PCPs and three female PCPs and Group 2 had three male PCPs and two female PCPs, with all boarded in internal medicine. The 10 physicians randomized to Groups 1 and 2 had combined panels of 9,708 and 9,988 patients respectively (Table 1). The panels averaged 1,969 (Group 1=1,941; Group 2=1,998) patients. Baseline characteristics of the patients were similar (Table 1). D-CMT averaged 370 patients with diabetes while D-TC averaged 385 patients.
Table 1. Baseline characteristics of Groups 1 and 2
|Characteristic||Group 1||Group 2|
|Age, years||55.2 ± 14.9||54.2 ± 15.0|
|Gender female (male), %||53 (47)||51 (49)|
|Financial class, %|
|Marital status, %|
|RU by panel|
|Hospital admission rate (CI), %/yr||12.1 (9.1, 15.1)||11.9 (10.9, 12.9)|
|Hospital LOS (CI), mean days||3.7 (3.69, 3.71)||3.79 (3.78, 3.80)|
|Readmission rate (CI), %/yr||15.5 (12.4, 18.6)||14.8 (11.8, 17.8)|
|Emergency room visits (CI), %/yr||13.3 (10.2, 16.4)||13.4 (13.3, 16.5)|
|Urgent care visits (CI), %/yr||15.1 (11.1, 19.1)||12.7 (8.7, 16.7)|
|RU by patients with diabetes (% panel)||18.4||19.7|
|HbA1c, mean (CI); mmol/mol (SD)||7.8 (7.7, 7.9); 62 (9)||7.7 (7.6, 7.7); 61 (6)|
|Hospital admission rate (CI), %/yr||10.1 (9.2, 11.0)||10.6 (9.6, 11.5)|
|Hospital LOS (CI), mean days||3.7 (3.6, 3.7)||3.8 (3.7, 3.8)|
|Readmission rate (CI), %/yr||21.0 (19.6, 21.9)||20.0 (18.7, 21.1)|
|Emergency room visits (CI), %/yr||22.3 (20.2, 24.2)||22.9 (21.0, 24.8)|
|Urgent care visits (CI), %/yr||22.2 (19.9, 24.5)||17.9 (15.6, 20.2)|
|CI=95% confidence interval, D-CMT=diabetes care management team, D-TC=diabetes traditional care, LOS=length of stay, RU=resource utilization|
Both Groups 1 and 2 had a decrease in UC visits from Year 1 to Year 3 (Table 2). The decreases, –2.5% (–2.1, –2.9%) for Group 1 and –1.7% (–1.3, –2.1%) for Group 2 (mean and CI), did not differ significantly between the groups (P=.73).
Table 2. Change in visit rates for both groups
|Site||Group 1||Group 2||P value|
|Panel size, n||9,708||9,988|
|Urgent care visits (CI), % per year||–2.5 (–2.1, –2.9)||–1.7 (–1.3, –2.1)||.73|
|Emergency room visits (CI), % per year||3.8 (3.5, 4.1)||3.6 (3.3, 3.9)||.90|
|Hospital admissions (CI), % per year||–1.4 (–3.3, 0.5)||–1.9 (–3.9, 0.0)||.28|
|Readmission rate (CI), % per year||7.5 (4.4, 10.5)||3.7 (1.5, 5.8)||.31|
|Percent change in visits to the respective clinical site. A positive number indicates an increase in visits from baseline to Year 3.|
Groups 1 and 2 increased ER utilization, with visits rising by 3.8% (3.5, 4.1%) and 3.6% (3.3, 3.9%), respectively (Table 2). The within-group increase was not statistically significant (P=.11). Combining the UC and ER visits for the respective groups represents the total number of unanticipated visits. The groups had baseline combined ER and UC visit rates of 28.4 and 26.2 visits per 100 patient-years, respectively (P=.61). The combined encounters of Group 1 increased 1.3 visits per 100 patient-years, compared with an increase of 1.9 visits per 100 patient-years for Group 2. The number of visits per 100 patient-years for UC (P=.71) and ER (P=.87) and the combined totals (P=.86) did not differ significantly.
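The combined figures above can be checked against the published tables. This sketch sums the rounded ER and UC values from Tables 1 and 2, treating the tabulated changes as visits per 100 patient-years, as the text appears to do. Because the inputs are rounded, the Group 2 baseline sums to 26.1 rather than the reported 26.2, a gap presumably due to rounding of the underlying data. Variable names are illustrative.

```python
# Rounded values copied from Table 1 (baseline rates) and Table 2 (changes).
baseline = {
    "Group 1": {"ER": 13.3, "UC": 15.1},
    "Group 2": {"ER": 13.4, "UC": 12.7},
}
change = {
    "Group 1": {"ER": 3.8, "UC": -2.5},
    "Group 2": {"ER": 3.6, "UC": -1.7},
}

def combined(rates):
    """Sum ER and UC values for each group, rounded to one decimal place."""
    return {group: round(v["ER"] + v["UC"], 1) for group, v in rates.items()}

combined_baseline = combined(baseline)  # unanticipated visits at baseline
combined_change = combined(change)      # net change from baseline to Year 3
```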
The change in admission rates between baseline and Year 3 was similar for Groups 1 and 2. The within-group change in hospital admissions for both groups was a nonsignificant decrease between baseline and Year 3 (P=.63). This decrease did not differ significantly between the two groups (Table 2). By Year 3, the length of stay had increased by 1.1 (0.7, 1.5) days for Group 1 and 0.6 (0.5, 0.7) days for Group 2. Group 1 stayed 4.8 (4.7, 4.9) days and Group 2 stayed 4.4 (4.3, 4.5) days (P=.01).
The readmission rate for Groups 1 and 2 increased. The readmission rate did not differ within groups. The difference between the two groups was likewise not statistically significant (P=.79) (Table 2).
The D-CMT group had reductions in UC visits similar to those of D-TC (Table 3). ER visits increased similarly for both D-CMT and D-TC (Table 3). Admissions to the hospital decreased for D-CMT, and the difference from the reduction seen for D-TC was not significant (Table 3). The increase in the readmission rate for D-CMT was more than twice that for D-TC, but the difference was not statistically significant (Table 3). Combining all visits to these sites, there was no statistically significant difference between the two groups (P=.69).
Table 3. Change in rate of visits for subset of patients with diabetes
|Site||Diabetes care management team||Diabetes traditional care||P value|
|Panel size, n||1,850||1,925|
|Urgent care visits (CI), % per year||–2.5 (–4.8, –0.2)||–3.0 (–5.3, –0.7)||.54|
|Emergency room visits (CI), % per year||4.0 (3.1, 4.9)||4.1 (3.2, 5.0)||.48|
|Hospital admissions (CI), % per year||–1.8 (–5.8, 2.2)||–1.1 (–4.2, 1.9)||.22|
|Readmission rate (CI), % per year||5.7 (3.4, 8.0)||2.4 (0.1, 4.7)||.09|
|Percent change in the rate of visits to the respective clinical site for patients with diabetes.|
Studies have shown that implementation of a CM model may improve process measures and have variable improvement in intermediate outcomes for patients receiving the intervention (Davidson 2007, Chin 2007). According to several sources (Stokes 2015, Hussey 2009, Kolbasovsky 2011, Holtz-Eakin 2004, Jackson 2013), insufficient evidence exists showing that CM programs reduce overall spending, reduce RU, or improve clinical outcomes. In addition, the body of literature supporting CM has yet to identify essential elements for an effective program (Wasson 2017). Finally, studies evaluating CM ignore the concept of externality (Jackson 2013). In medicine, these are the “unintended consequences.” These include consequences affecting nonstudy and control individuals when attention and resources divert to study patients.
This study of a broad comparison of the effects of CM and TC in a large practice includes a simultaneous analysis of effects on the subset of patients with diabetes. This study showed comparable changes in RU for Groups 1 and 2 as well as equivalent changes occurring in RU for D-CMT and D-TC.
This study contributes a randomized evaluation of RU between primary care models across entire panels in an integrated safety-net institution. In addition, with appropriate statistical adjustments for cluster randomization, it demonstrates equivalent RU between actively care-managed and traditionally managed patients with diabetes.
Failure to demonstrate a benefit is not proof that CM cannot influence RU. As pointed out by Wasson, the typical approach to PCMH and CM is to regulate process and measure compliance. Perhaps this approach misses the essence and value of clinical care, which is doing “what matters to patients” (Wasson 2017). As Rothman (2005) demonstrated, using a similar design, improvement in HbA1c can be observed despite a lack of improvement in RU. Boult’s “guided care” approach (Sylvia 2008) relies on CMT across heterogeneous panels. In this population, preliminary reports, as with other studies noted, were not statistically significant and these findings remain unsubstantiated by independent evaluation. Further rigorous investigations are required.
Without an evidentiary foundation for improved access, quality, or reduced RU, support for the CM hypothesis relies on improvement of process measures and potential benefit to intermediate and surrogate outcomes (Davidson 2007, Norris 2001, Landon 2007, Aubert 1998, Higgins 2014, Peikes 2009, Holtz-Eakin 2004). Even the studies of CM reporting intermediate and clinical outcomes have varying results in different care venues (Weingarten 2002, Diabetes 1993, Rothman 2005, Sylvia 2008, Steiner 2008, Homer 2005).
For redesign studies to support generalization of their findings, methodology should include a randomized, controlled study design with intention-to-treat analysis. When randomization employs physician practices, statistical methods require defined adjustments (Killip 2004). Before implementing a reorganization of the national primary care system, policy makers should require evidence levels I and II in demonstrating benefit with minimal adverse effects to access, quality, cost, resource utilization, and satisfaction of patients and providers (Asch 2005, Hayward 2007). An additional goal for future studies is to identify the model’s elements essential to delivering benefits in the associated care settings (Kahn 2015).
Patrick Kearns, MD
7 W. Central Ave.
Los Gatos, CA 95030
ClinicalTrials.gov Identifier: NCT00838825; https://clinicaltrials.gov/ct2/show/NCT00838825
Acknowledgments: The author thanks Jeffry Young, MD, Stanford School of Medicine, Division of Nephrology, for reviewing the manuscript and contributing to the discussion.
Funding: Santa Clara Valley Medical Center’s Quality Improvement budget provided funding for the study.
Conflict of interest and data integrity: The author generated the hypothesis, collected and maintained the data, performed the research and analysis, and wrote the manuscript. The author has no conflict of interest in performing the study or publishing the results.
ACP (American College of Physicians). Patient-centered medical home. www.acponline.org/running_practice/delivery_and_payment_models/pcmh. Accessed Sept. 7, 2017.
ADA (American Diabetes Association). Standards of medical care in diabetes—2005. Diabetes Care. 2005;28(suppl 1):S4–S36.
Allweiss P. The chronic care model and practice transformation: tools of the trade. May 19, 2015. www.acponline.org/practice-resources/quality-improvement/acp-quality-connect/practice-transformation-in-chronic-care-management. Accessed Sept. 7, 2017.
Asch SM, Baker DW, Keesey JW, et al. Does the collaborative model improve care for chronic heart failure? Med Care. 2005;43(7):667–675.
Aubert RE, Herman WH, Waters J, et al. Nurse case management to improve glycemic control in diabetic patients in a health maintenance organization: a randomized, controlled trial. Ann Intern Med. 1998;129(8):605–612.
Berenson RA, Hammons T, Gans DN, et al. A house is not a home: keeping patients at the center of practice redesign. Health Aff (Millwood). 2008;27(5):1219–1230.
Berwick DM. A primer on leading the improvement of systems. BMJ. 1996;312(7031):619–622.
Boult C, Karm L, Groves C. Improving chronic care: the “guided care” model. Perm J. 2008;12(1):50–54.
Chin MH, Drum ML, Guillen M, et al. Improving and sustaining diabetes care in community health centers with the Health Disparities Collaborative. Med Care. 2007;45(12):1135–1143.
Coleman C, Pearson M, Wu S. Integrating chronic care and business strategies in the safety net: a practice coaching manual. AHRQ publication no. 09-0061-EF, prepared by Rand Corp. Rockville, MD: Agency for Healthcare Research and Quality. 2009;21–36. www.improvingchroniccare.org/downloads/icic_practice_coaching_manual.pdf. Accessed Sept. 7, 2017.
Coleman E, Parry C, Chalmers S, Min S. The care transitions intervention: results of a randomized controlled trial. Arch Intern Med. 2006;166(17):1822–1828.
Coulter A, Entwistle VA, Eccles A, et al. Personalised care planning for adults with chronic or long-term health conditions. Cochrane Database Syst Rev. 2015;(3):CD010523.
Davidson MB, Ansari A, Karlan VJ. Effect of a nurse-directed diabetes disease management program on urgent care/emergency room visits and hospitalizations in a minority population. Diabetes Care. 2007;30:224–227.
The Diabetes Control and Complications Trial Research Group. The effect of intensive treatment of diabetes on the development and progression of long-term complications in insulin dependent diabetes mellitus. N Engl J Med. 1993;329(14):977–986.
Goldman LE, Sarkar U, Kessell E, et al. Support from hospital to home for elders. Ann Intern Med. 2014;161(7):472–481.
Hayward RA. Performance measurement in search of a path. N Engl J Med. 2007;356(9):951–953.
Higgins S, Chawla R, Colombo C, et al. Medical homes and cost and utilization among high-risk patients. Am J Manag Care. 2014;20(3):e61–e71.
Holtz-Eakin D. An analysis of the literature on disease management programs. Oct. 13, 2004. www.cbo.gov/ftpdocs/59xx/doc5909/10-13-DiseaseMngmnt.pdf. Accessed Sept. 7, 2017.
Homer CJ, Forbes P, Horvitz L, et al. Impact of a quality improvement program on care and outcomes for children with asthma. Arch Pediatr Adolesc Med. 2005;159(5):464–469.
Hussey PS, Eibner C, Ridgely MS, McGlynn EA. Controlling U.S. health care spending—separating promising from unpromising approaches. N Engl J Med. 2009;361(22):2109–2111.
Jackson GL, Powers BJ, Chatterjee R, et al. Improving patient care. The patient centered medical home: a systematic review. Ann Intern Med. 2013;158(3):169–178.
Kahn KL, Timbie JW, Friedberg MW, et al. Evaluation of CMS’s Federally Qualified Health Center (FQHC) advanced primary care practice (APCP) demonstration: final second annual report. Santa Monica, CA: Rand Corp. 2015. www.rand.org/pubs/research_reports/RR886z1.html. Accessed Sept. 7, 2017.
Killip S, Mahfoud Z, Pearce K. What is an intracluster correlation coefficient? Crucial concepts for primary care researchers. Ann Fam Med. 2004;2:204–208.
Kolbasovsky A. Strategies for measuring outcomes and ROI for managed care programs. Manag Care. 2011;20(11):54–57.
Landon BE, Hicks LS, O’Malley AJ, et al. Improving the management of chronic disease at community health centers. N Engl J Med. 2007;356(9):921–934.
Loveman E, Royle P, Waugh N. Specialist nurses in diabetes mellitus. Cochrane Database Syst Rev. 2003;(2):CD003286.
Norris SL, Engelgau MM, Narayan KM. Effectiveness of self-management training in type 2 diabetes: a systematic review of randomized controlled trials. Diabetes Care. 2001;24(3):561–587.
Peikes D, Chen A, Schore J, Brown R. Effects of care coordination on hospitalization, quality of care, and health care expenditures among Medicare beneficiaries: 15 randomized trials. JAMA. 2009;301(6):603–618.
Peters AL, Davidson MB, Ossorio RL. Management of patients with diabetes by nurses with support of subspecialists. HMO Practice. 1995;9(1):8–13.
Peterson KA, Radosevich DM, O’Connor PJ, et al. Improving diabetes care in practice: findings from the TRANSLATE trial. Diabetes Care. 2008;31:2238–2243.
Rothman RL, Malone R, Bryant B, et al. A randomized trial of a primary care-based disease management program to improve cardiovascular risk factors and glycated hemoglobin levels in patients with diabetes. Am J Med. 2005;118(3):276–284.
Shojania KG, Ranji SR, McDonald KM, et al. Effects of quality improvement strategies for type 2 diabetes on glycemic control: a meta-regression analysis. JAMA. 2006;296(4):427–440.
Solberg LI. Improving medical practice: a conceptual framework. Ann Fam Med. 2007;5(3):251–256.
Steiner BD, Denham AC, Ashkin E, et al. Community care of North Carolina: improving care through community health networks. Ann Fam Med. 2008;6(4):361–367.
Stellefson M, Dipnarine K, Stopka C. The chronic care model and diabetes management in US primary care settings: a systematic review. Prev Chronic Dis. 2013;10:E26.
Stokes J, Panagioti M, Alam R, et al. Effectiveness of case management for ‘at risk’ patients in primary care: a systematic review and meta-analysis. PLoS ONE. 2015;10(7):e0132340.
Sylvia ML, Griswold M, Dunbar L, et al. Guided care: cost and utilization outcomes in a pilot study. Dis Manag. 2008;11(1):29–36.
Wagner EH, Sandhu N, Newton KM, et al. Effect of improved glycemic control on health care costs and utilization. JAMA. 2001;285(2):182–189.
Wagner EH. Chronic disease management: what will it take to improve care for chronic illness? Eff Clin Pract. 1998;1(1):2–4.
Wasson J. A troubled asset relief program for patient-centered medical home. J Ambul Care Manage. 2017;40(2):89–100.
Weingarten S, Henning J, Badamgarav E, et al. Interventions used in disease management programmes for patients with chronic illness—which ones work? Meta-analysis of published reports. BMJ. 2002;325(7370):925–932.
Chronic kidney disease (CKD) threatens to become a major health burden in coming years—and that may come as news to many of the Americans who will be affected, according to a study in the American Journal of Kidney Diseases. As the authors put it, awareness of CKD “remains low in the United States, and few estimates of its future burden exist.” In fact, according to federal government health surveys, less than 10% of Americans with the early stages of CKD are aware of their condition.
Researchers at RTI International, a not-for-profit research group, used their own previously developed CKD Health Policy Model to make their predictions. By their reckoning, more than half (54%) of Americans ages 30 and older who don’t currently have CKD will develop the condition some time in their lives.
They also forecast a ramping up of CKD prevalence from the current level of 13.2% to 14.4% in 2020 (28 million Americans). By 2030, their model projects the prevalence will reach 16.7% (38 million Americans).
Source: Hoerger TJ et al., American Journal of Kidney Diseases, March 2015.
Lead author Thomas Hoerger, PhD, tells Managed Care that the group’s results argue for “interventions to control the conditions that increase the risk of CKD (primarily, tight glycemic control for persons with diabetes and better blood pressure control for persons with hypertension), and [also] partly for the development of new interventions to slow progression among persons in the early stages of CKD.”
The main risk factors for CKD include diabetes, hypertension, and age. Early detection and treatment of CKD can forestall or delay heart disease and kidney failure. Likewise, early treatment of diabetes and hypertension can prevent CKD from developing.
However, as Hoerger and his colleagues point out, the clinical significance of early stage CKD among the elderly with borderline numbers is somewhat debatable, partly because of competing health problems.
Even if these CKD forecasts come true, there’s good news about one of its most dire consequences, end-stage renal disease (ESRD).
When Kaiser Permanente Northern California rolled out a new electronic health record (EHR) system for outpatients a few years back, a team of researchers considered it a golden opportunity to evaluate how such systems affect care and outcomes.
The staggered implementation of the EHR system at 17 KP-owned medical centers from 2004 to 2009 allowed researchers to “examine the association between use of a commercially available certified EHR and clinical care processes and disease control in patients with diabetes,” says the study “Outpatient Electronic Health Records and the Clinical Care and Outcomes of Patients With Diabetes Mellitus” in the Oct. 12 issue of Annals of Internal Medicine.
Mary Reed, DrPH, the lead author, tells Managed Care that “patients’ diabetes and cholesterol control were actually significantly better when their physicians used an EHR compared to when they didn’t. We found that the patients who needed the most care, meaning their lab values were furthest out of control, were most helped by using the EHR.”
The study included nearly 170,000 patients and found that use of EHRs led to better control of diabetes and dyslipidemia.
The report says that when an EHR was used, patients with the greatest needs got increased testing, treatment, and physiologic improvement, and people already meeting glycemic and lipid targets had “decreased testing and treatment intensification.”
There’s a lot at stake, the study points out. Federal incentives for “meaningful use” of certified EHRs total about $29 billion, as much as $44,000 per doctor, and financial penalties for lack of a certified EHR begin in 2015.
“The outpatient EHR completely replaced the paper-based medical record and a limited patchwork of pre-existing nonintegrated health IT tools” at the 17 medical centers, says the study. “Use of those early health IT tools was limited because paper-based alternatives were still in use.”
It’s good to know that the investment will be worth it because “even with federal incentive payments … implementing a complete EHR system requires a large up-front investment in money and time, with careful coordination....”
Everything in life has an “80–20” rule. Example: 20 percent of the population accounts for more than 80 percent of income; 80 percent of a ball club’s salary goes to 20 percent of its players, and so forth. The 80–20 rule is everywhere.
In population health spending, the 80–20 rule is that 80 percent of the time, there is no 80–20 rule. For instance, the Centers for Disease Control claims that the 50 percent of adults who have chronic disease account for 75 percent of health care spending. A 75–50 rule is about as far from an 80–20 rule as you can get, and means that costs are diffused throughout the system, rather than concentrated. (It is also not the slightest bit clear how they can define “chronic disease” so broadly that fully 50 percent of us have it. Are they including insomnia? Tooth decay? Dandruff? Ring around the collar? And how do they even know I suffer from these afflictions, let alone how much I spend on white noise machines, toothpaste, or Head and Shoulders? But we shall leave both these questions and Those Dirty Rings for another day.)
Consistent with that observation about unconcentrated costs, it turns out that large chunks of potential savings are not sitting in one place just waiting to be harvested by a vendor imploring people to smoke fewer Marlboros and eat more broccoli. Yes, the lesson will be: A simple, usually voluntary, program isn’t going to make a noticeable dent in health spending.
But try explaining this to the population health improvement (PHI) industry, which prides itself on saying it does exactly that. Fortunately, a modicum of math and critical thinking, using one or more of seven informal, common-sense rules, can help determine whether this pride is justified. These rules are not footnoted or otherwise sourced, because there is no precedent and no governing body for validating PHI outcomes.
Or, to quote the immortal words of the great philosopher Groucho Marx: “Who are you gonna believe, me or your own eyes?”
The goal of these common-sense rules is not to validate every study that is truly valid, which would be a Herculean task, but rather to invalidate those claims that are obviously invalid, a first level of intellectual triage to avoid making misguided resource allocation decisions. The plausibility rules are as follows, with their shorthand in boldface:
The 100 Percent Rule: Outcomes explicitly or implicitly cannot require any element of cost to decline by more than 100 percent.
The Every Metric Can’t Improve Rule: Every element of resource use or group of people cannot decline in cost, through programs aimed generally at improving prevention. In particular, the actual costs associated with prevention, such as primary care visits, drug use, and health screening, must rise.
The 50 Percent Savings Rule: In a voluntary program with no incentives, declines in excess of 50 percent in any given resource category are the result of invalidity, not effectiveness.
The Nexus Rule: There must be a logical link between the goal of the program and the source of savings.
The Quality Dose–Cost Response Rule: Just as in pharmacology, cost cannot decline significantly faster or more than the related quality variables improve.
The Control Group Equivalency Rule: Control groups, if not prospectively sorted into two similar or equivalent groups, based on objective data, before members are even contacted to determine willingness to enroll, are likely to mislead. This is especially true of historic controls (meaning pre-post), matched controls, and using the non-disease group as a control for the disease group.
The Multiple Violations Rule: When one of these rules is violated, others are likely to be violated, as well.
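Two of these rules are purely numeric, so they can be screened mechanically. The following is a minimal sketch, not a validated tool: the `Claim` structure, its field names, and the sample inputs are all hypothetical, and only the 100 percent and 50 percent thresholds come from the rules above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Claim:
    """One vendor-reported outcome for one resource category (hypothetical structure)."""
    category: str
    percent_decline: float        # reported decline, as a percent of baseline
    has_incentives: bool = False  # True if the program uses strong financial incentives


def violated_rules(claim: Claim) -> List[str]:
    """Return the names of the numeric plausibility rules that a claim breaks."""
    violations = []
    # Nothing can fall by more than its entire baseline.
    if claim.percent_decline > 100:
        violations.append("100 Percent Rule")
    # Without incentives, a decline above 50 percent signals invalidity.
    if claim.percent_decline > 50 and not claim.has_incentives:
        violations.append("50 Percent Savings Rule")
    return violations


# Illustrative inputs only; neither figure is taken from a real report.
print(violated_rules(Claim("sick-time hours", 130)))
print(violated_rules(Claim("ER visits", 12)))
```

A claim that breaks the 100 Percent Rule in a program without incentives necessarily breaks the 50 Percent Savings Rule too, which is a small instance of the Multiple Violations Rule.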
There is a concept in testing called “face validity,” meaning what you’d expect it to mean: A study has face validity if it looks like it’s fairly measuring what it’s supposed to measure. These plausibility tests introduce a companion measure: “face impossibility.” An example has face impossibility if rather than challenge the data or the study design to question an outcome, you can simply tell from the numbers themselves as presented by the vendor that the outcome is impossible.
Every example here has face impossibility.
The textbook example of face impossibility is violating this plausibility rule: You cannot reduce a number by more than 100 percent, period. This is true no matter how hard you try. And just to avoid any potential misunderstandings by our readers Down Under, this also isn’t one of those things that’s the opposite in the Southern hemisphere.
Give it a shot yourself if you don’t believe me. A guy with two PhDs tried and even he couldn’t do it. He posted online — for the world to see if the world didn’t have better things to do with its time — the following: “Suppose you buy a stock at $10. It goes up to $50 and then down to $5. You’ve lost 450 percent.” Nope. Your stock has gone from $10 to $5, a fifth-grade textbook case of a 50 percent decline.
The 100 percent rule is a rule of math, and as mentioned earlier, rules of math are strictly enforced. That means the web page is wrong.
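For readers who want to check the stock arithmetic themselves, here is a short sketch. The function name `percent_decline` is made up for illustration, and the guess about where "450" came from is only an inference from the numbers in the quote.

```python
def percent_decline(start: float, end: float) -> float:
    """Percent decline from start to end; a negative result means an increase."""
    if start <= 0:
        raise ValueError("start must be positive")
    return (start - end) * 100 / start


# Bought at $10, now worth $5. The $50 peak in between does not matter.
print(percent_decline(10, 5))   # 50.0

# Even a total wipeout is a 100 percent decline, never more.
print(percent_decline(10, 0))   # 100.0

# Presumably the "450 percent" came from dividing the $45 drop from the peak
# by the $10 purchase price, which mixes two different baselines.
print((50 - 5) * 100 / 10)      # 450.0
```

As long as the ending value cannot go below zero, the result of `percent_decline` cannot exceed 100.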
It’s lucky math isn’t a popularity contest because these guys aren’t the only ones who think you can reduce a number by more than 100 percent, as the conclusion of a case study from Vendor D suggests: “Wellness program participants are 225% less likely [boldface is actually theirs, believe it or not] to utilize Extended Illness Benefit than nonparticipants.” Note that for copyright reasons (even though this brochure wasn’t copyrighted) both the hospital’s name and the percentage reduction were changed. We did them a favor not just on the former but also on the latter, because the actual number they claimed was even higher.
Maybe “225 percent” wasn’t enough to excite their customers, because the Vendor D website now proclaims “390 percent.”
It’s hard to tell which makes less sense: the numbers or the words. “390 percent” compared to what? There is also a misplaced modifier issue, as in “crossing the street, the bus hit me.” Or, perhaps they intended it to read this way. Perhaps the “400 percent losses” apply only to “employers associated with chronic disease,” such as Merck, Pfizer, and maybe Healthways or Alere. Presumably, being in the chronic disease business, they can make up their mathematically impossible losses in volume. The good news is that NASA employees don’t need to worry about their job security, because these people are obviously not rocket scientists.
It’s lucky arithmetic is not a popularity contest because here is another vendor whose outcomes break the mathematical impossibility barrier:
St. Mary’s Hospital started their [sic] first comprehensive wellness program in 2006, implementing a personalized approach focused around a high trust, high engagement strategy with [Vendor D]. The following provides data directly from that program.
Like most organizations, hours tied to sick time are categorized as Extended Illness Benefit (EIB). Anomalies such as maternal leave were pulled out, leaving 96 percent of the population for the analysis. The result was that wellness program participants are 225 percent less likely to utilize EIB than non-participants.
However, even highly respected organizations can trip over the 100 percent rule. Here is a press release citing the Institute for Healthcare Improvement (IHI). The consensus would be that IHI is one of the most capable and influential organizations in the field. And yet…
HI-WIRE George Miller January 04, 2010
A five-year prospective evaluation of the model yields a 129 percent increase in patients receiving optimal diabetes care and a 48 percent increase for heart-disease patients. The model also achieved a 350 percent reduction in appointment waiting time, as reported by the Institute for Healthcare Improvement.
More common violations of the 100 percent rule are not as flagrant. As a reader of these reports you can’t assume that your vendors will simply announce that they are violating basic rules of fifth-grade arithmetic. You will have to infer it.
The Center for Health Value Innovation (CHVI) has a vision statement that says, “[CHVI] will be the trusted resource to demonstrate how engagement in health can improve accountability and economic performance.” In one of their presentations they showed a savings of $5,000 per person per year (net savings, meaning after fees are subtracted) generated by a care management program for commercially insured members, where this number was said to be for the “average” person. However, the average commercially insured person doesn’t even incur $5,000/year in paid claims — and certainly not in claims that could be considered even theoretically avoidable — making it impossible to reduce claims by this amount, especially net of fees — a clear violation of the 100 percent rule.
An example like this demonstrates the need for more instruction in the health outcomes numeracy field, both in general and also specifically because the CHVI, which itself provides instruction in outcomes-based contracting, was unable to recognize it is not possible to save $5,000/year/person in a commercial population.
The Why Nobody Believes mantra: If you insulate your house, you should save money overall, but you won’t save money on insulation.
Likewise, in health care you need to spend more money in some areas to save money overall. So, for instance, unless you believe it’s possible to talk people out of taking their drugs and have their inpatient utilization decline nonetheless, this Health Plan C slide has face impossibility (not to mention that the quantities in the first two columns don’t sum to the quantity in the last column).
The result could also have been caused by comparing people who volunteered to participate in the program to those who didn’t participate — a classic fallacy. We will bring it up a bunch more times before the book is done.
Years of doing valid outcomes measurement have confirmed the obvious: You can’t “move the needle” a lot without strong financial incentives. Want people to stay out of the ER? Sure, you can entice doctors to keep longer hours by paying them more, and that should reduce ER usage a little bit. Double the ER co-pay, though, and watch ER visits decline.
You especially can’t move the needle on chronic disease events, because most adverse events simply aren’t preventable with a few outbound phone calls.
And in reality, the needle-moving impossibility threshold, using programs without strong economic incentives/disincentives, is more like 20 percent. I chose 50 percent because there are so many outcomes studies showing greater improvements than that.
State agencies routinely accept outcomes that violate one or more of the plausibility rules, as we will see in-depth in the next chapter, and again in Chapter 8. Here is an example from Georgia Medicaid, prepared by Benefits Consulting Firm A. A word-for-word reconstruction of the summary page of their report is shown below:
The 50 percent savings rule would guide readers to look at Region 1’s 19 percent overall decrease. True, the 50 percent savings rule focuses on declines of 50 percent or more, but that’s 50 percent in any single category. A 19 percent overall decrease can — and will — easily be shown to require a >50 percent decline in hospitalizations, since disease management generates savings almost exclusively in hospital costs and ER costs, the latter being quite trivial, though, as compared to the former. Because the idea of disease management is to provide enough preventive services and self-care to avoid hospitalizations, typically the cost of non-hospital services increases in successful programs.
Let us, however, generously assume away any likely increase in non-hospital costs and say that the hospitalization reduction was achieved without increasing prevention-oriented costs. Next, let us add back in the actual fees, approximately $9 per member per month or roughly 2 percent, making the gross savings before fees 21 percent (19 percent + 2 percent).
Let us also make some assumptions for program outreach and intervention effectiveness that are generous to the program in that they exceed, in most cases by a lot, what most programs achieve:
We can build these assumptions into a table to determine how many hospitalizations would need to be avoided in the last bullet-pointed group — the sub-category in which the program was effective — in order to save 21 percent gross, meaning 19 percent net plus the 2 percent fees.
|Category||% of Total||Reduction in costs needed to get 21% overall gross savings|
|Hospital costs||50% of costs are hospital costs||42% of total hospital costs must be avoided|
|Avoidable hospital costs||50% of hospital costs are avoidable through telephone disease management||84% of avoidable costs must be avoided|
|Engagement rate||50% of members are engaged||168% of avoidable hospital costs must be avoided in the engaged population|
|Success rate||50% of engaged members are successful in avoiding avoidable hospitalizations||336% of avoidable hospitalizations must be avoided in the engaged population|
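The cascade in the table is simple division: each time the savings must come from a smaller slice of spending, the required reduction within that slice grows by the inverse of the slice's share. A minimal sketch follows, using the table's four 50 percent assumptions and the 21 percent gross savings figure; the function name and labels are illustrative.

```python
from typing import List, Tuple


def required_reductions(gross_savings_pct: float,
                        shares: List[Tuple[str, float]]) -> List[Tuple[str, float]]:
    """Divide the required savings by each successive share of the prior pool."""
    required = gross_savings_pct
    steps = []
    for label, share in shares:
        required = required / share
        steps.append((label, required))
    return steps


# The four assumptions from the table, each expressed as a fraction.
assumptions = [
    ("hospital costs as a share of total costs", 0.5),
    ("avoidable share of hospital costs", 0.5),
    ("share of members engaged", 0.5),
    ("share of engaged members who succeed", 0.5),
]

for label, pct in required_reductions(21, assumptions):
    print(f"{label}: {pct:.0f}% reduction required")
```

The four steps give 42, 84, 168, and 336 percent. The 100 percent barrier is crossed at the third step, before the success-rate assumption is even applied.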
Despite generous assumptions for contact and success rates in disease management, the 19 percent net savings result that the state of Georgia accepted is so obviously a violation of the 50 percent savings rule that some might question whether the state’s administrators at the time accepted the findings not because they believed them but rather because the results justified further federally matched spending on the program.
Postscript: Vendor E, having grossly underbid the project, was later found to have made almost no outgoing phone calls to beneficiaries, and consequently agreed to return money to the state. So, Benefits Consulting Firm A was able to find mathematically impossible savings for the state despite the fact that the vendor allegedly generating those savings acknowledged not doing anything.
What list of states lying about finances would be complete without Illinois? Here is their press release, which claims more savings through disease management than the state actually spent on chronic disease events, a 100 percent rule violation and a poster child for face impossibility. (Oh, yes, and in actuality their chronic disease events did not even decline enough to pay for the program, a minor detail.) But some other bigger news at the time about that state’s governor relegated this news to the inside pages, sort of like Michael Jackson’s death did to Farrah Fawcett’s.
Listen carefully once again: You can only achieve savings in the categories in which you are trying to achieve savings. If costs decline in any other category, it had nothing to do with you. We will see two examples in our detailed case studies of this, but for now, consider this slide. We can’t say the name of — or even assign a code name to and then charge you for revealing the name of — the vendor shown in Figure 2 because this slide wasn’t published, but that doesn’t make it any less amusing.
It’s not just that everything declines. It’s that the biggest declines are in the two largely preventive categories (MD visits and drugs) where one would expect an increase — exactly contradicting the tenets of care management. Yes, once again that goes back to the observation that even if insulating your house saves money, the cost of the insulation itself doesn’t decline.
Perhaps Ned Flanders would be okay with this type of internal inconsistency because he believes everything in the Bible, including the “stuff that contradicts the other stuff” but obviously no one else would, right? Wrong. For three years these guys presented this material without anyone other than me noticing. To their credit, they did change their methodologies after I suggested doing so for the third time.
There is no way that events can decline if you don’t improve quality. If they do, that’s face impossibility. Usually the changes in quality variables are a smoking gun that invalidates the entire cost savings claim, as in Table 2.
|TABLE 2 How trivial quality improvements generate massive reductions in hospitalizations|
|% Cardiac Members||Base||Contract year 1||Improvement|
|With an LDL screen||75.0%||77.0%||2.0%|
|With at least one claim for a statin||69.0%||70.5%||1.5%|
|Receiving an ACE inhibitor or alternative||43.5%||44.7%||1.2%|
|Post-MI with at least one claim for a beta-blocker||0.89||0.89||0.0%|
|Hospitalizations per 1,000 cardiac members for a primary diagnosis of myocardial infarction||47.60||24.38||−48.8%|
Along with a lack of understanding of significant digits, percentages versus decimals, and changes in percentage versus changes in percentage points, this example clearly shows what is sometimes informally referred to in population health improvement as the “wishful thinking multiplier”:
% Event or cost reduction / % Improvement in quality indicators
Or, in wellness:
% Cost reduction / % Risk factor reduction
In this example, the wishful thinking multiplier is about 40, meaning that events fell about 40 times faster than the average of those four quality variables improved. The real wishful thinking multiplier, as we will see when we review the valid literature and review “mediation analysis” that connects the two, is only slightly greater than 0 for the first two to three years of a program.
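The "about 40" figure can be reproduced directly from Table 2. This sketch takes the table's Improvement column at face value, as the text does, and the function name is illustrative.

```python
from typing import List


def wishful_thinking_multiplier(event_reduction_pct: float,
                                quality_improvements_pct: List[float]) -> float:
    """Event reduction divided by the average quality-indicator improvement."""
    average_improvement = sum(quality_improvements_pct) / len(quality_improvements_pct)
    return event_reduction_pct / average_improvement


# Improvement column of Table 2: LDL screen, statin, ACE inhibitor, beta-blocker.
improvements = [2.0, 1.5, 1.2, 0.0]
# Reported reduction in myocardial infarction hospitalizations per 1,000.
mi_reduction = 48.8

print(round(wishful_thinking_multiplier(mi_reduction, improvements), 1))
```

The four indicators improved by an average of 1.175, so the ratio comes out a little above 41.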
Even the denominator itself can be gamed. Let’s start with quality indicators. Several vendors are partial to bragging that “10 of the 15 quality indicators either improved or stayed the same.” That means five deteriorated. If, of the 10 that improved or stayed the same, half stayed the same, as is often the case, no quality improvement took place. Five indicators got better and five got worse.
One of the vendor community’s favorite tools involving quality indicators is a “gaps in care” report, like this one in Figure 3.
The vendor reports that 43 percent of open care gaps were closed, while only 19 percent of closed care gaps opened up. Big success, right? Look again. This time, focus on the absolute number of gaps that changed over the course of the year. Turns out, there was virtually no change in open care gaps.
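To see how those two percentages can describe a year in which nothing changed, consider the sketch below. The starting counts are invented for illustration and are not the figures in the vendor's report; only the 43 percent and 19 percent rates come from the text.

```python
def net_change_in_open_gaps(open_at_start: int, closed_at_start: int,
                            pct_open_that_closed: float,
                            pct_closed_that_opened: float) -> float:
    """Gaps newly opened minus gaps newly closed over the year."""
    newly_closed = open_at_start * pct_open_that_closed / 100
    newly_opened = closed_at_start * pct_closed_that_opened / 100
    return newly_opened - newly_closed


# Hypothetical starting counts: far more gaps start the year closed than open.
open_at_start = 1900
closed_at_start = 4300

print(net_change_in_open_gaps(open_at_start, closed_at_start, 43, 19))  # 0.0
```

Because the two rates apply to pools of very different sizes, 43 percent of the smaller pool and 19 percent of the larger one can cancel out exactly.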
The wellness equivalent of quality indicator improvement, risk factor reduction, is equally if not more suspect and will be addressed in the wellness vignettes in the next chapter. It turns out that alleged risk factor reduction is often the result of fallacious measurement rather than actual impact. For instance, many wellness vendors measure only the engaged (participating) members, meaning the ones most likely to comply. We see this particular fallacy about once every 10 pages. And often vendors measure only the people who showed up in the baseline, against themselves a year later. That “historic control” fallacy is described in the next section.
One reason that there is a rule covering multiple violations is that you tend not to get impossible results without making myriad mistakes along the way. Some footage from the highlights reel:
Historic controls (meaning the same population before and after) create a fallacy where people who were high-cost enough to make it into your “before” population will as a group regress to the mean, but formerly low-cost people not in the “before” population who regress upwards during the “after” period will be excluded from this analysis.
Matched controls — volunteers are compared to non-volunteers with similar claims and demographics — fail to control for motivation, which is the key to successful self-management of a disease.
Using the non-disease group as a control will overstate savings because people who don’t generate disease-specific claims because they are mildly chronically ill will often slip into the control group, and then explode in costs as their disease progresses, thus inflating the trend line.
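The historic-control fallacy in the first item above is easy to reproduce with made-up numbers. The simulation below is a sketch under deliberately extreme assumptions: every member's annual cost is an independent random draw from the same distribution in both years, there is no program at all, and the $5,000 mean and $10,000 threshold are arbitrary.

```python
import random


def simulate_historic_control(n_members: int = 50_000,
                              threshold: float = 10_000,
                              seed: int = 0):
    """Average cost of members flagged as high-cost, before and after, with no program."""
    rng = random.Random(seed)
    mean_cost = 5_000  # assumed average annual cost per member
    before = [rng.expovariate(1 / mean_cost) for _ in range(n_members)]
    after = [rng.expovariate(1 / mean_cost) for _ in range(n_members)]
    # Only members who were high-cost in the "before" year enter the analysis.
    flagged = [i for i, cost in enumerate(before) if cost > threshold]
    before_avg = sum(before[i] for i in flagged) / len(flagged)
    after_avg = sum(after[i] for i in flagged) / len(flagged)
    return before_avg, after_avg


before_avg, after_avg = simulate_historic_control()
print(f"flagged members, before year: ${before_avg:,.0f}")
print(f"same members, after year:     ${after_avg:,.0f}")
```

The flagged group's average falls sharply in the second year even though nothing was done, while members who moved from low to high cost never appear in the comparison. Real costs persist from year to year more than this toy model allows, so the real effect is smaller, but it points in the same direction.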
One article traced what happened when you simply tracked the costs of people who were identified in the baseline year absent a program, which is a historic control. (Note that the study used a very tight algorithm to identify cardiac patients: a cardiac event in the baseline year.)
Consider perhaps the best pure example of failure to control for motivation by using matched controls, taken verbatim from the white paper downloadable from the website of Vendor G:
Farther down on their website, they note that they produce “valuable disease management reports” that “provide you with ROI.”
How do these “valuable disease management reports” determine an ROI? To what control group do they compare motivated, incentivized volunteers? They “match members in the measurement year with non-participating members with similar clinical, utilization, and cost characteristics.” They do precisely what a biostatistician or health services researcher would never do: They (1) find motivated volunteers, (2) bribe them to participate, and then (3) compare the results to people who lacked enough motivation to participate and were not given incentives.
“This approach is used because there is a need to compare the program participants to something [emphasis theirs] in order to judge whether there have been improvements.” In other words, they prefer to offer an obviously invalid ROI analysis than none at all. This is presumably because their customers, egged on by their consultants, demand to know: “What’s my ROI?” Yep, they want a number, notwithstanding that it is meaningless. Vendor G, to its credit, basically acknowledges online that this measurement is meaningless, and provides it only because their customers are insisting on it.
But we have run out of space for this excerpt.
The rest of the chapter, and all the other chapters, of course, are in the book.
Its costs exceed $177 billion annually. It results in 125,000 deaths, nationwide. Nonadherence to medication is so prevalent that about half of the 3.2 billion prescriptions issued in the United States are not taken as directed.
To encourage compliance, Medco Health Solutions tried an innovative approach. The result? “For people nonadherent at any point in time, we can get 75 percent of them back on therapy in about 12 weeks,” reports Glen Stettin, MD, chief medical officer.
How did the company do it? Back in 2004, as data analysis technology improved, Medco’s staff found that half its patient population has some chronic condition requiring daily medication. These meds account for 96 percent of Medco’s drug costs and about 75 percent of total medical costs for its 65 million members.
“We started segmenting our data to find areas with sufficient volume to merit training a pharmacist to specialize in [just] one common chronic condition,” the Medco CMO recalls.
Identifying six widespread conditions, Medco offered certification training to its pharmacists in different locations. Response was strong. “A majority found this an exciting, innovative concept in pharmacy,” says Stettin.
“It would give them a cause they could rally around, with metrics so they’d know how they were doing.”
Gradually, as Medco identified 15 common conditions — including diabetes, cardiovascular disease, and pulmonary illness — the specialist pharmacist approach became the foundation of its operations. “This is our business model,” declares Stettin.
“We created a new way to practice pharmacy.”
Under the new model, the 1,500 specialist pharmacists — approximately half of Medco’s pharmacists — target what the company calls “gaps in care.” The most common gap is adherence. Another is an “omission gap,” when no evidence indicates that a particular therapy was considered by the physician, although it may be part of the standard treatment regimen for that condition. In 2010, Medco closed 2.3 million gaps in care, with an estimated total health care savings of about $900 million.
Affordability is a key area, Stettin notes. “Even though a patient has coverage, for a [diabetic] taking eight to ten medications a day, that adds up on a fixed income. We help patients understand lower-cost, generic, or available options from our formulary, and that mail order, or getting a larger supply, can save money.” The pharmacist will also attempt to take out a med that may not be necessary.
Safety checking is crucial. With Medco able to view over-the-counter medications that members get through its retail pharmacy network, “a big question,” says Stettin, “is who it’s for and how to use it.” A specialist pharmacist learning that it’s for the member can check the product against that patient’s prescriptions. “For instance, if we see someone on warfarin ordering aspirin, we can discourage them from buying it, or suggest safer alternatives.”
People using Medco mail-order are notified that if they have one of the 15 chronic conditions, they’re entitled to a specialist pharmacist’s services, which include attention to affordability, adherence, omissions, and co-morbidities.
Some conditions even have an additional safety program. When a patient starts on insulin U500, a specialist pharmacist checks with that patient about how it is being used.
“Several times a week, we have to make a change because, for example, someone is using the wrong syringe for that medication,” Stettin reports. One woman said, “I’ve been on this for four months, and it’s not working well.”
Checking her prescription, the pharmacist discovered she’d been taking U100! “Because specialist pharmacists see the same condition and medications every day, they know how to talk to the patient and the physician about it, how to ask the questions and find the facts.”
Each condition has at least one “Therapeutic Resource Center” (TRC) in a separate location, to take care of what Stettin calls “the whole patient.” Several hundred specialist pharmacists are based at two TRCs for Medco’s five million diabetes patients, along with several diabetes educators. Three cardiovascular disease TRCs serve 11 million patients. Other large TRCs are pulmonary (5.8 million) and neuroscience (10.3 million). Some TRCs have a nurse who makes home visits.
“We want pharmacists committed to a specialty, who want to do counseling,” observes Stettin. “The clinical relationship is the heart of Medco’s program. We make an effort to have our pharmacists develop the level of patient care and service that any of us want for our own family.”
Trained in how to discuss adherence, specialist pharmacists call a patient when they spot nonadherence (for example, based on refills). “If you ask a patient, Are you taking your medication? it’s socially unacceptable to admit that you’re not,” Stettin explains. A more subtle approach — such as, “a lot of people may not take a medication exactly the way it’s prescribed. Do you ever miss a dose? For what reason?” — often evokes candor. “People may say they forget, or had a side effect. Getting at why that member isn’t taking a medication leads to exploring how we can help overcome the individual barrier.”
For practice development, Medco gets a patient’s permission to record a telephone counseling session. “We have people who listen and coach pharmacists on what they do well, and what they could do better. Pharmacists can also listen to their own recordings, and learn that way, too,” Stettin notes. “We constantly measure how the specialist pharmacists are doing at eliminating omissions and other gaps in care.”
If a member prefers, a pharmacist will alert him to any noticed problem, such as a late refill order, by way of www.medco.com. Any member opting for “click to call,” by simply entering his phone number on the Web site, is assured of a specialist pharmacist’s callback within two minutes of any question posted.
For each chronic condition, specialist pharmacists are alert to common problems. Someone noticed that many asthma patients were frequently ordering rescue inhalers.
“But for most people with asthma under control, an inhaler should last a month or more,” Stettin explains. “Through our campaign to reduce rescue inhaler use, we found that most people didn’t need as many as the prescriptions required. The right amount of controller medication reduced the quantity of rescue inhalers. Patients didn’t realize that without symptoms, they don’t need the inhaler.”
One in five Americans is a member of Medco, through its thousands of clients, comprising national and regional health plans, employers of all sizes, labor organizations, and federal, state, and local government agencies.
“Gaps in care may sound overly complicated,” Stettin reflects, “but we’ve done much analysis to make this a good service experience for patients and their families, and as simple as possible for the patient. The member needs to know only one phone number, one Web site for questions, one address if you mail. And we’ll bring the best pharmacist and pharmacy resources right to you, to help you.”
“Pharmacists always believed that they can affect medication therapy and proper compliance and identify adverse reactions,” observes Allan Zimmerman, national pharmacy practice director for the human resources services group at PricewaterhouseCoopers.
Now, increased national dialogue about health care reform, quality, and cost containment “creates a more holistic look at compliance programs and their value in delivering better care and outcomes. Payers are understanding how generalist and specialist pharmacists in this role can affect quality, cost, and coordination of care. Acceptance is broadening.”
Previously, Zimmerman suspects, many payers were dubious about potential ROI. “In a PBM-based initiative, they viewed it as a vehicle for the PBM to increase revenue, even with adherence as the focus.”
Marissa Schlaifer, RPh, agrees. “Pharmacists have a level of expertise that helps them identify a patient’s medical problems,” says Schlaifer, director of pharmacy affairs for the Academy of Managed Care Pharmacy. Pharmacists can prevent adverse drug reactions, make sure patients are on the most appropriate medication, and identify others they should be taking.
“Most people don’t realize that pharmacists look at not only a diabetic patient’s current medication, but also at whether he’s taking care of other potential problems, such as kidneys,” says Schlaifer.
“Substantial data show that the use of pharmacists for adherence, patient education, and reconciliation provides value in lowering cost and raising quality of care,” says Zimmerman, a registered pharmacist.
In one study of a PBM’s members with diabetes — which often has non-compliance rates of 50 percent or more — the payer saved $600 per patient annually through pharmacist counseling and coaching.
For some disease states, with active pharmacist intervention, every dollar spent can bring a return of nearly $5, notes Zimmerman. Documented return on investment includes decreased hospitalization and fewer emergency room visits.
“Our health care system, for too long, has not tapped into a valuable source for educating plan members to increase compliance, by integrating their pharmaceutical care into overall health care, through pharmacy practitioner counseling and intervention,” says Zimmerman.
“We’re already seeing results showing the effect of using pharmacists for these purposes.”
“Pharmacy has evolved from where it was 30 years ago, when we had fewer chronic care medications,” observes Merritt.
“We have moved from a surgical emergency system to a wellness-based system: Rather than perform that heart surgery, we prefer to prevent it by way of medications. That growing preventive model requires more and more specialization among pharmacists,” he says.
Heart failure results in substantial morbidity, mortality, and health care expenditures. There are 5.8 million people in the United States with heart failure, and this condition has one of the highest rates of hospitalization and rehospitalization. There are, fortunately, several therapies that are based on evidence and recommended by professional societies and that can reduce morbidity, mortality, and costs.
For example, a 2009 study in the Journal of the American Medical Association of 12,565 patients with HF investigated whether guidelines were being adhered to. It found that fewer than one third of patients who were eligible for aldosterone antagonist therapy were actually given these guideline-recommended drugs. At discharge, some hospitals prescribed them to no patients at all. Meanwhile, the rate of documented contraindication in the medical record was only 0.5 percent.
Yet for heart failure patients these drugs are proven to lower mortality, hospitalization, and rehospitalization rates.
“These heart failure therapies are highly cost effective,” says Gregg C. Fonarow, MD, professor of cardiovascular medicine at UCLA.
There is compelling evidence supporting the efficacy and effectiveness of the therapy. The 2009 update of the American College of Cardiology Foundation/American Heart Association’s heart failure guidelines recommends low-dose aldosterone antagonists for appropriate heart failure patients, and documents “strong data demonstrating reduced death and rehospitalization in two clinical trial populations.”
Furthermore, a 1999 New England Journal of Medicine study of 1,663 patients with severe heart failure found that with spironolactone “the frequency of hospitalization for worsening heart failure was 35 percent lower” and the risk of death 30 percent lower. Symptoms of heart failure showed “significant improvement,” which might result in fewer clinic visits and moderated costs.
In a 2011 NEJM study of 2,737 patients with systolic heart failure and mild symptoms, the number of hospitalizations — including second and subsequent hospitalizations — was consistently lower for patients receiving eplerenone. Compared with the control group, there was a 24 percent reduction in total hospitalizations, a 29 percent reduction in hospitalizations for cardiovascular reasons, and a 38 percent reduction in hospitalizations for heart failure.
There is a lot of room for care improvement here. This is just the sort of clear, unambiguous, focused opportunity that medical directors like, Fonarow says.
“By providing quality feedback and encouraging participation in performance improvement systems, managed care can provide highly effective but less costly care for heart failure” compared with current practice.
Importantly, the 2009 JAMA study found aldosterone antagonist use significantly higher in patients who were also prescribed other evidence-based heart failure therapies. These other therapies were also associated with reduced hospitalization.
You might think the situation would be different for drugs that have been around longer. Angiotensin converting enzyme inhibitors and angiotensin receptor blockers have also been proven to lower the risk of hospitalization, rehospitalization, and death in heart failure patients, Fonarow says. Yet, as he and colleagues report in Archives of Internal Medicine, prescriptions for these drugs were not given to 28 percent of eligible patients.
Similarly, “evidence-based beta blockers have been shown to lower the rates of death and rehospitalization in the 30 days after hospital discharge for heart failure,” Fonarow says. “This therapy provides early, intermediate, and long-term benefits.”
Other beta blockers have not been shown to provide such benefit, yet many physicians prescribe these other beta blockers for heart failure and not those proven to improve survival, Fonarow and colleagues reported in a 2007 study in Archives of Internal Medicine.
The pattern of underuse with clear consequences for both health and costs extends beyond the limits of pharmacotherapy. Cardiac resynchronization therapy for appropriate patients lowers mortality and hospitalization rates, but not rehospitalization. “Heart failure hospitalization rates were lowered by 50 percent in the CARE HF study,” Fonarow says, citing studies by Cleland and colleagues. “Yet only one third of ideal candidates for cardiac resynchronization therapy have received this therapy in cardiology practices.” It is clear that we are not taking sufficient advantage of guideline-recommended therapies and the substantial opportunity to improve outcomes for patients with heart failure.
Certain management strategies have been shown to provide benefits for both health and finances. Ample evidence demonstrates that certain forms of disease management are effective for heart failure, says Marvin A. Konstam, MD, past president of the Heart Failure Society and professor of medicine at Tufts University School of Medicine. The 2004 randomized study he led of heart failure management in a diverse provider network tracked 200 patients with high baseline use of approved heart failure pharmacotherapy.
At 90 days, patients randomized to disease management experienced half as many hospitalizations for heart failure as controls. Furthermore, intervention patients had fewer hospital days related to a primary diagnosis of heart failure.
Overall, days in hospital per patient-year for cardiovascular cause were vastly reduced in the intervention group, but when the program was ended, much of the gain in positive outcomes was lost.
Not all configurations of heart failure disease management reduce hospitalization equally. In a recent editorial in the Journal of the American College of Cardiology, Konstam and a colleague recommend better use of pharmaceuticals and good diet. They also push educating patients and families for the sake of better adherence, self-monitoring, and responses to changes in clinical status. “Patients who show the greatest improvement in adherence also show the greatest reductions in heart failure hospitalizations.”
Heart failure clinics led by heart failure specialists, together with advanced practice nurses and physician assistants, have also been proven to be effective. These achieve “very substantial reductions in hospitalization,” Fonarow says, referring to studies he published in 1997 and 2010. By encountering patients during hospitalization and following them during the ensuing period, such centers achieve improved compliance in the use of guideline-recommended drugs and devices, and better outcomes. Of course, Fonarow says, “Not every geographic region has access to such centers.”
Another approach uses telemedicine. Important questions regarding telemedicine include: Who conducts it (nurses? other staff members?), which patients receive it (only those with the highest likelihood of adverse events?), and how long does each category of patient receive it?
Recently, a study by Harlan M. Krumholz in NEJM found that telemonitoring reduced neither rates of death nor rates of rehospitalization in heart failure patients. Whether a different approach to telemonitoring or targeting of a different patient mix would be effective is not known. Meanwhile, this may not be the best place for managed care to expend resources for heart failure patients.
Should participation in performance improvement programs be encouraged?
Health plan medical directors use several approaches to monitoring successes in reducing hospitalization and extending life for patients with heart failure. “None of these replace hospital and outpatient participation in performance improvement programs for heart failure patients, such as the American Heart Association’s ‘Get With the Guidelines’ program,” Fonarow says.
“Hospitals participating in Get With the Guidelines have been shown to provide higher quality of care and have lower 30-day rehospitalization rates, compared to other hospitals,” he says.
That bears repeating: The Heart Association’s program achieves not only better quality of care but, in addition, fewer expensive rehospitalizations.
“Performance improvement programs can also be effective in improving care and outcomes for heart failure in the outpatient practice setting,” Fonarow says.
A 2010 prospective study of a registry including 34,810 patients with left ventricular ejection fraction of 35 percent or less and chronic heart failure or previous myocardial infarction achieved “substantial improvements in the use of guideline-recommended therapies in eligible patients with heart failure in outpatient cardiology practices.” The study was conducted at 167 outpatient cardiology and multispecialty practices.
At 24 months the use of beta blockers had increased from 86 percent to 92.2 percent, of aldosterone antagonists from 34.5 percent to 60.3 percent, of cardiac resynchronization therapy from 37.3 percent to 66.3 percent, of implantable cardioverter-defibrillators from 50.1 percent to 77.5 percent, and of education about heart failure from 59.5 percent to 72.1 percent.
“As these therapies are highly effective in reducing hospitalization rates and have been shown to be cost-effective, the expected benefits in terms of lives saved, hospitalizations prevented, and health care expenditures reduced would be expected to be substantial,” Fonarow says.
“Heart failure [is] perhaps the best example of a chronic disease for which care could be optimized by a medical home approach,” write Konstam and Barry H. Greenberg, MD, in the Journal of Cardiac Failure, while acknowledging that certain aspects of the approach have not yet been worked out adequately.
“The medical home is appealing in part because these are patients with lots of comorbidities,” says Greenberg, professor of medicine at the University of California–San Diego Medical Center.
It “offers the best opportunity to succeed in both driving quality and outcomes and controlling health care costs, while maintaining provider discretion to individualize care in the best interest of the patient,” the authors state.
The approach provides incentives for continuity of care, aligns multidisciplinary teams of providers to weigh the range of options appropriate for each patient, encourages competition with other systems, and avoids payer micromanagement.
Providing incentives for continuity of care means patients “have access to the health care system” and can be “educated in such a way as to improve adherence to prescribed medications, to dietary recommendations, to daily monitoring of weight, and similar steps,” Konstam says.
In contrast, such metrics as 30-day rehospitalization might lead to no more than deferral of a needed rehospitalization beyond 30 days to avoid penalty, he points out.
If managed care wishes to reduce hospitalization and rehospitalization, then these payers should attempt to alter the payment model, Konstam suggests.
“Managed care organizations should work with provider systems to develop models of reimbursement that change the approach to managing patients longitudinally and that align incentives to keeping patients out of the hospital,” he states.
Overall, we need to get to where patients monitor themselves and are monitored by health care professionals and medical devices “in a more systematic way than their showing up at the emergency department after they have gained 30 pounds,” Konstam says.
The 2003 National Health Interview Survey projected that chronic obstructive pulmonary disease would ascend from the fourth- to the third-leading cause of death in the United States by 2020. It was off by a mile.
COPD has already taken third place on the list, according to the U.S. Centers for Disease Control and Prevention (CDC).
It affects as many as 24 million Americans. There was an 18-percent increase in patients hospitalized for acute COPD exacerbations between 1998 and 2008, says Richard A. Mularski, MD, chairman of the American Thoracic Society’s quality improvement committee. He is also an investigator at the Kaiser Permanente Center for Health Research.
More than 822,000 patients are hospitalized annually for COPD. Among patients 64 years of age and younger, there were more than 230,000 hospitalizations in which COPD was the first-listed diagnosis, and many more in which COPD was involved.
COPD has an average length of stay of 4.8 days, Mularski points out, with direct costs for caring for patients coming to about $40 billion per year.
Meanwhile, our understanding of the disease has been changing. What was once thought of as a pulmonary problem is now considered a larger systemic disease that may involve many comorbidities. In this article, we use the definition of comorbid from Dorland’s Illustrated Medical Dictionary, 2007, “pertaining to a disease or other pathologic process that occurs simultaneously with another.”
COPD is now understood to involve inflammation, and inflammation is unwilling to remain neatly parked in the lungs. Rather, COPD-associated inflammation affects many organ systems, and recent studies have demonstrated a set of nonpulmonary morbidities associated with COPD.
“These are all intertwined,” says Brian Carlin, MD, chairman of the COPD Alliance — a multisociety organization that includes the American College of Chest Physicians. Carlin is an assistant professor of medicine at Drexel University College of Medicine, among other appointments.
As these are patients with complex conditions, “they require complex interventions” to keep them healthy and reduce hospitalization expenses, he says.
“We now understand that COPD is a systemic disease with synergistic interactions with other systemic diseases,” says Mularski. “You cannot succeed with these patients with just single interventions.”
According to a recent study by Bartolome Celli and colleagues in the American Journal of Respiratory and Critical Care Medicine, COPD “is a complex disease at the clinical, cellular, and molecular levels.” Currently “diagnosis, assessment, and therapeutic management are based almost exclusively on the severity of airflow limitation,” even while this measure “fails to adequately express this complexity.”
As each patient’s comorbidities differ, “a holistic approach makes sense,” says William M. Vollmer, PhD, a biostatistician and senior investigator at the Kaiser Permanente Center for Health Research.
“It is almost impossible to disentangle the comorbidities,” he says. “Look at what is going on overall with each patient.”
As is so often the case, complexity does not bring easy answers. But answers are needed, as the costs are far from trivial.
No one truly knows the full extent of these costs, even though this subject has been studied numerous times. Several of the experts interviewed for this article noted that all existing economic analyses of COPD underestimate the actual costs greatly, in that these analyses focus on COPD as a pulmonary condition and do not count expenses associated with the comorbidities.
Hospitalizations and rehospitalizations are costly and, in COPD, the second is usually longer than the first, Carlin says.
Moreover, hospitals would be prudent to improve their treatment of COPD patients now, rather than be vulnerable once new Medicare rules on readmissions go into effect, he says. “Lack of payment for readmissions will be a strong motivator,” he says.
At the present time, approximately 1 in 4 COPD patients is re-hospitalized within 30 days of discharge, Mularski says, summarizing the findings of several studies.
It is in the interest of managed care “to invest up front to keep patients healthy and keep them out of the hospital,” says Carlin.
“Health plan medical directors should educate providers to have a checklist of some type, both for the initial diagnosis and for follow-up visits,” Carlin says. They should be considering such comorbidities as heart failure, arterial stiffness, right ventricular dysfunction, left ventricular diastolic dysfunction, metabolic syndrome, osteoporosis, peripheral skeletal muscle dysfunction, nutritional abnormalities, or cancer. Furthermore, diabetes and metabolic abnormalities should be considered, particularly for patients receiving steroids.
Also, there is some evidence that COPD is associated with increased risk of long-term mortality in patients with peripheral arterial disease and those with chronic kidney disease.
Further, “between 40 percent and 50 percent of patients with COPD suffer from depression,” Carlin points out. It is well-known that depression is associated with noncompliance with physician recommendations.
Informational e-mails from medical directors alerting providers to these issues might be beneficial by improving patient health and reducing hospitalizations and rehospitalizations. “The provider should be thinking of these conditions and rule them out in the outpatient arena,” Carlin says. Thus, in terms of an established comorbidity such as osteoporosis, the provider should be pursuing the following question: “Does this patient have osteoporosis and, if so, how should that be managed to avoid a hospitalization for fracture?” At that point, he says, it is not necessary to decide precisely how much of the patient’s osteoporosis or heart failure relates to inflammation generated by COPD and how much all these conditions relate to a history of smoking or other factors.
“It is essential with these patients to search out comorbidities,” Carlin says. “Too frequently, these patients are siloed into one diagnosis” which neither protects their overall health nor effectively keeps them out of the hospital.
Vollmer points out that inadequacies in the literature are a problem. The National Institutes of Health itself is focused primarily on individual diseases, he explains, and, thus, research is lacking on diseases that manifest as multiple comorbidities. “Research tends not to cut across outcomes,” he says.
Meanwhile, a great deal is being spent on ineffective treatments. A study of 69,820 patients hospitalized for acute exacerbations of COPD, reported by Peter Lindenauer and colleagues in the Annals of Internal Medicine, found that 45 percent of these patients received at least one nonrecommended test or treatment.
Using claims data and computerized programs to search out possible comorbidities and then alerting physicians to them “is the essence of the care considerations,” says Haydee Muse, MD, MBA, senior medical director at Aetna.
Aetna’s Care Engine constantly scans Aetna’s system to look for potential gaps. This proprietary technology platform continuously analyzes claims and other data with reference to evidence-based best practices and alerts members and their physicians about possible care gaps and such inconsistencies as drug-drug interactions, missing preventive exams, or needed screening tests. If a member’s records, for example, indicate he has COPD but there is no evidence of spirometry testing, the member’s physician would then be informed.
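The spirometry example lends itself to a short illustration. What follows is a minimal sketch of a rule of that general kind, not Aetna's actual logic, which is proprietary and not described here. The record layout, the field names, and the function name are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MemberRecord:
    # Hypothetical record layout; real claims data are far richer.
    member_id: str
    diagnoses: set = field(default_factory=set)   # e.g. {"COPD"}
    procedures: set = field(default_factory=set)  # e.g. {"spirometry"}


def find_spirometry_gaps(members):
    """Return IDs of members with a COPD diagnosis but no spirometry on record."""
    return [
        m.member_id
        for m in members
        if "COPD" in m.diagnoses and "spirometry" not in m.procedures
    ]


# Made-up members, for illustration only.
members = [
    MemberRecord("A1", {"COPD"}, {"spirometry"}),
    MemberRecord("B2", {"COPD"}, set()),
    MemberRecord("C3", {"asthma"}, set()),
]

print(find_spirometry_gaps(members))  # ['B2']
```

In this toy data only member B2 would prompt an alert to the physician. A production system would presumably also weigh dates, coding variations, and data completeness, none of which is modeled here.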
For patients with COPD, Aetna supplements physician visits with “personalized outreach interventions for members,” with nurse case managers more likely to intervene during times of hospitalization or medical crisis and the disease management team to engage them at other times to help close gaps in care, says Muse.
Both case managers and the disease management nurses work one-on-one with members, educating them on their COPD action plans, she says. “For instance, they review the warning signs that symptoms might be getting worse and require treatment, such as recognizing when [patients are] getting shorter on breath, or paying attention to increased mucus, swollen ankles, fevers, and chills.”
Aetna’s National Clinical Improvement Work Group, which develops interventional quality programs for specific patient subpopulations, develops programs targeting patients with acute COPD exacerbations leading to hospitalization. The workgroup sends information to both providers and patients — such as the importance of avoiding metformin in COPD patients who have acidosis, or the need to address sleep apnea in patients with COPD.
Aetna’s Health Media furnishes highly personalized, self-paced online coaching sessions, she says, permitting members to choose between live and online smoking cessation programs.
Cigna is using claims data to attempt to tease out information on comorbidities. “We calculate a risk score that helps prioritize customers for outreach. The presence of comorbidities will often raise the risk score,” says Scott Josephs, MD, vice president and national medical officer for total health management.
“We also identify gaps in care through our Well Informed program,” Josephs says. “This program uses algorithms to combine medical, pharmacy, and laboratory findings to determine, for example, whether the patient is overdue for a blood test to monitor a particular medication.”
In approaching each patient, Cigna takes into account such factors as the patient’s ability to read and to comprehend medical terms, socioeconomic position, and “any financial or health barriers that might affect their ability to understand and manage their condition,” Josephs says. “We try to meet them where they are.”
The goals are preventing exacerbations and maintaining level of function, restoring the level of health whenever the patient becomes ill, and preventing complications of illness and of treatment.
Personal contact by health advocates is the key, he says. These advocates, mostly nurses, are provided with more than 40 hours of behavior-change theory training at Cigna and enrolled in a form of continuing education thereafter. They focus on educational, social, and functional barriers.
Smoking cessation is, of course, a prime concern with many COPD patients, and nicotine replacement therapy is provided free to appropriate patients. An example of another focus would be whether arthritis is interfering with inhaler use.
Cigna’s Well Aware Chronic Obstructive Pulmonary Disease program was developed using nationally recognized guidelines and the recommendations of such organizations as the American Thoracic Society and the Veterans Health Administration.
“Only by improving outcomes will you reduce costs,” Josephs says. “COPD is a problem that is increasing faster than was predicted and that needs attention.”
Perhaps no issue in disease management (DM) is more controversial than outcomes measurement. As for wellness, that field is five years behind DM in the ability to measure outcomes validly. Being five years behind DM in measurement is like being five years behind Iraq in democracy.
Many — if not most — reported results are wrong, infected by either obvious or insidious regression to the mean and distortions due to faulty trend calculations. How do you know if your results are among those so infected? Three simple tests will tell you whether your results are infected by regression to the mean:
(1) Did you see cost or utilization declines in categories which do not normally decline in DM, such as physician visits or drugs?
(2) Did drug costs decline (a reduction attributed to the program) while the quality indicators showed an improvement in adherence to drug therapy?
(3) Is the stated decline in admissions of a much greater proportion than the improvement in quality indicators?
If the answer to any one of these questions is positive, your results are infected and hence invalid at worst and controversial at best. But help is on the way. We are already seeing a glimpse of the future in measurement, and the good news is that “regression to the mean” specifically — and complex, invalid, expensive actuarial methodologies generally — are being banished to what Leon Trotsky once called “the dustbin of history.”
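The three tests can be written down as a simple checklist. The sketch below is one possible encoding, under stated assumptions: the report layout and field names are hypothetical, changes are expressed as percentages (negative means a decline), and "a much greater proportion" in test 3 is interpreted, arbitrarily, as more than twice the quality improvement.

```python
def regression_to_mean_flags(report, ratio_threshold=2.0):
    """Return the numbers (1 to 3) of the plausibility tests that the report fails.

    ratio_threshold is an assumed cutoff for "much greater"; the article
    does not give a number.
    """
    flags = []
    costs = report["cost_change_pct"]

    # Test 1: declines in categories that DM does not normally reduce.
    if any(costs.get(cat, 0.0) < 0 for cat in ("physician_visits", "drugs")):
        flags.append(1)

    # Test 2: drug costs fell even though adherence to drug therapy improved.
    if costs.get("drugs", 0.0) < 0 and report["adherence_change_pct"] > 0:
        flags.append(2)

    # Test 3: admissions fell far more than the quality indicators improved.
    admissions_decline = -report["admissions_change_pct"]
    quality_gain = max(report["quality_change_pct"], 0.0)
    if admissions_decline > 0 and admissions_decline > ratio_threshold * quality_gain:
        flags.append(3)

    return flags


# Made-up vendor report, for illustration only.
suspect = {
    "cost_change_pct": {"inpatient": -12.0, "physician_visits": -4.0, "drugs": 3.0},
    "adherence_change_pct": 5.0,
    "admissions_change_pct": -20.0,
    "quality_change_pct": 4.0,
}

print(regression_to_mean_flags(suspect))  # [1, 3]
```

An empty list means none of the three tests fired. It does not prove the results are valid, only that these particular warning signs are absent.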
What follows are the emerging insights which, taken into consideration when you measure, will remove most of the controversy around measurement and produce generally valid results … and will save both time and money in the process. That’s because validity of outcomes and complexity of the process used to generate those outcomes turn out to be inversely correlated.
Though generally not practical for health plans, the only truly valid methodology is randomized controlled trials (RCTs). Any other methodology needs to be confirmed with plausibility checking before being accepted.
Randomized control group trials were used by the Centers for Medicare & Medicaid Services (CMS) in its Medicare Health Support project. Whatever other mistakes CMS made that perhaps caused the contracted vendors to miss their targets (and there were many), there was no issue with measuring in this manner, the closest approximation to a double-blind study there could be in a field where placebos aren’t possible. Of course, in a “real” RCT, the doctors don’t have patients in both the control and study groups, the way they did in this situation. That is just one example of the mistakes made in the CMS study design.
Before embarking on your own RCT or accepting a study provided by a vendor, keep in mind that not all RCTs are created equal. In particular, there are a number of comparisons between the two groups which must be checked, and rarely are, in RCTs:
What was the previous hospitalization rate of the two groups? Often, the groups look like a match on demographics and illness burden, but had a much different rate of hospitalizations in the six months prior to the start of the trial.
Could the difference in results have been caused by the intervention? It’s not enough to just accept differing results between the control group and the intervention group in the study period. Some changes are not due to DM, including large percentage differences of any type; differences between the groups in categories like radiology or post-acute care, which simply do not get noticeably affected by DM; and differences which are larger in lower-acuity members than high-acuity members.
Even within the “expectable” categories such as hospitalizations, did the researchers rule out other possible causes for differing results? Once one focuses on knowing that only certain categories are affected by DM, one must go a step further to determine whether the differences — even if in the “expectable” categories like hospitalizations — were in fact due to the program. Was a differential decline in hospitalizations due to fewer hospitalizations for the conditions actually being managed? Was a differential decline in surgeries concentrated in the surgeries where patient preference can make a difference? Or was it across the board?
Did you achieve cost reductions in most or all categories? Keep in mind that it is not possible to reduce costs in most or all categories — the cost has to go somewhere. It might move from inpatient to outpatient, or inpatient to drugs, or ER to physician office visits. But it doesn’t go away.
Do the reported savings change when the outlier cutoff point is changed? If so, the savings are not likely caused by the program, since a few phone calls can’t prevent a six-figure hospitalization. A good way to check this: Does the vendor, who is touting the RCT, tell you what the outlier cutoff is, and whether changing it changed the savings? If not, chances are that they picked the cutoff which resulted in the greatest savings.
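The outlier-cutoff check is easy to run for yourself. The sketch below assumes that the cutoff works by capping each member's annual cost at the cutoff value; some analyses exclude outliers entirely instead, and the same sensitivity test applies. The function name and the cost figures are made up for illustration.

```python
def savings_per_member(control_costs, intervention_costs, cutoff):
    """Mean control cost minus mean intervention cost, with costs capped at cutoff."""
    capped_control = [min(c, cutoff) for c in control_costs]
    capped_intervention = [min(c, cutoff) for c in intervention_costs]
    control_mean = sum(capped_control) / len(capped_control)
    intervention_mean = sum(capped_intervention) / len(capped_intervention)
    return control_mean - intervention_mean


# Made-up annual costs per member; one six-figure case sits in the control group.
control = [2_000, 3_000, 4_000, 250_000]
intervention = [2_000, 3_000, 4_000, 5_000]

for cutoff in (50_000, 100_000, 300_000):
    print(cutoff, savings_per_member(control, intervention, cutoff))
# 50000 11250.0
# 100000 23750.0
# 300000 61250.0
```

Here the two groups are identical except for a single catastrophic case, yet the reported "savings" range from $11,250 to $61,250 per member depending on the cutoff. Savings that swing this much with the cutoff are being driven by outliers, not by the program.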
Assuming the checks above are taken into account, the RCT is the best comparison available. That is why it is used in drug trials. However, the major disadvantage is that only rarely does a health plan or any other entity find itself in a position to conduct an RCT. Occasionally a program is offered to the insured population but many self-insured groups don’t buy it. That was the case when Blue Shield of California offered a catastrophic case management program to its own members and used the California Public Employees Retirement System as a control. It had several hundred thousand people in each group, with essentially no migration between groups. The outcomes, peer-reviewed and published in the February 2007 American Journal of Managed Care, appeared to be valid.
RCTs provide a far more accurate analysis than a “pre-post” study, in which a single population is used as both the control and study group over two periods of time. In a pre-post study, generally it is assumed that the baseline cohort’s costs would stay the same adjusted for trend (as measured by the nondiseased population’s cost change) absent the DM intervention. Therefore, any change in costs (adjusted for the change in costs of the nondiseased population) is attributed to the program.
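The pre-post arithmetic just described can be sketched in a few lines. This is an illustration of the calculation as stated, not an endorsement of it; the function name, parameter names, and dollar figures are assumptions.

```python
def pre_post_savings(baseline_cost, actual_cost,
                     nondiseased_baseline, nondiseased_actual):
    """Savings per member attributed to the program under the pre-post assumption.

    Trend is taken from the cost change of the nondiseased population,
    as the pre-post method assumes.
    """
    trend_factor = nondiseased_actual / nondiseased_baseline
    expected_cost = baseline_cost * trend_factor  # what costs "would have been"
    return expected_cost - actual_cost


# Made-up annual per-member costs: the nondiseased population rose 10 percent,
# so the diseased cohort's $10,000 baseline is projected to $11,000.
savings = pre_post_savings(
    baseline_cost=10_000, actual_cost=10_200,
    nondiseased_baseline=2_000, nondiseased_actual=2_200,
)
print(round(savings, 2))
```

With these figures the method credits the program with roughly $800 per member, even though the cohort's costs actually rose. Everything rests on the assumption that the diseased cohort would otherwise have followed the nondiseased trend.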
Among the many problems with this methodology, the most obvious is that the baseline does not include the entire population, only the population sick enough to have claims. Hence the “planes on the ground” (as explained in Rule Five, below) are not included and the calculated average cost of everyone with claims for the condition is higher than the underlying average cost of everyone with the condition.
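A toy example shows how a baseline built only from members with claims can manufacture savings. The four members and their costs below are invented; every one of them has the condition, and there is no program of any kind.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Made-up annual costs for four members who all have the condition.
year1 = {"a": 0, "b": 0, "c": 9_000, "d": 9_000}
year2 = {"a": 9_000, "b": 0, "c": 0, "d": 9_000}


def claimant_cohort_change(year1, year2):
    """Average cost in each year for the cohort identified by year-1 claims."""
    cohort = [member for member, cost in year1.items() if cost > 0]
    before = mean(year1[m] for m in cohort)
    after = mean(year2[m] for m in cohort)
    return before, after


print(mean(year1.values()), mean(year2.values()))  # 4500.0 4500.0
print(claimant_cohort_change(year1, year2))        # (9000.0, 4500.0)
```

The whole population costs the same in both years, yet the cohort identified by year-1 claims appears to drop by half. The baseline of $9,000 overstates the true average of $4,500 because the members without claims were left out, and the apparent decline is pure regression to the mean.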
Pre-post methodologies can be divided into two types: “prospective identification,” in which anyone who ever had a claim for a condition is counted in all future periods, and “annual requalification,” in which only people with claims in any period are counted going forward.
Before initiating a program, you need to know which conditions are most out of control and are creating the most unnecessary admissions.
To know which conditions are out of control, you need to know basic facts. For instance, if you are managing or considering managing heart disease, you need to know your rate per 1,000 for heart attacks, angina attacks, and other cardiac events.
These two rules can be considered together. Today, too many health plans and employers say, “Let’s do DM.” Too many employers say, “Let’s do wellness.” Dollars are committed and spent and measured by actuaries … and yet, basic questions don’t get asked or answered. Let us use the example of heart disease. Health plans and employers are spending millions to manage this category, to reduce heart attacks and other ischemic events. But almost no one can answer basic questions like:
What is our rate per 1,000 members for heart attacks, angina attacks and other cardiac events?
How has it been trending since we started this program?
How does it compare to other similar populations? Are we out of control or in control?
That set of epidemiological questions raises another set of managerial questions: How can you manage something if you don’t know what you are managing? How do you know where to focus your DM efforts if you don’t know whether and where your adverse event rates are out of control?
These event-rate tests are very simple, and avoid all the actuarial data-crunching and what-if scenarios found in the typical benefits consultant analysis. You divide the number of ER and inpatient events primary-coded for the condition in question by the total plan membership, just as if you were calculating a birth rate.
For instance, if you count 3,000 asthma attacks overall, and you have 1 million members in your plan, your asthma attack rate is 3 per 1,000.
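The calculation is short enough to state as code. The function name below is an assumption; the numbers are the asthma example from the text.

```python
def event_rate_per_1000(events, members):
    """ER and inpatient events primary-coded for a condition, per 1,000 plan members."""
    if members <= 0:
        raise ValueError("members must be positive")
    return events * 1000 / members


# The asthma example: 3,000 attacks in a plan of 1 million members.
print(event_rate_per_1000(3_000, 1_000_000))  # 3.0
```

Note that the denominator is total plan membership, not the number of members identified with the condition, which is what frees the figure from any claims-extraction algorithm.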
It is vastly more actionable to know one’s out-of-control event rate, which is a known, valid, replicable figure, than it is to know the prevalence rate. Prevalence is a term of art whose parameters vary according to the “claims-extraction algorithm” used to find members. Suppose you are satisfied with the prevalence-rate algorithm and find that the prevalence is high. Does that mean you should “do DM?” Not necessarily. Perhaps usual care is quite good — that means there are few events left to avoid. The Boston area, for example, is a hotbed of asthma. Yet in the commercial population, the health plan which has the best event avoidance in the entire U.S. pulls its membership largely from greater Boston. How can they have such a low event rate in a high-asthma-prevalence environment? Because usual care is quite good thanks to years of physician education and disease management, so this plan is trying to move its customers into other care management programs.
Two examples show what health plans can learn by looking at their event-rate trend over time and by comparing that trend to national averages.
For instance, a Southeast health plan implemented programs but never asked whether the program was actually doing what it was intended to do — reduce adverse events in the conditions being managed. An observation of condition-specific event rates showed that there was no program impact on utilization (and hence cost), notwithstanding the actuarial calculations of large savings using its modeling system.
Comparing oneself to historical performance yields some insight, but one can’t be certain that the lack of decline isn’t reflective of excellent initial performance, and therefore the expectation of improvement could be unrealistic. In that particular case, it turned out that the Southeast health plan’s performance was average and therefore should have improved.
How is it possible to know that it was “average”? The results from 29 commercial health plans and employers (but not Medicare health plans or Medicaid health plans, which would have different event rates) were combined into one average. This allows a health plan to compare itself to a benchmark and see how it is performing over time. Another case is Harvard Pilgrim Health Care Inc. Harvard Pilgrim has the best outcomes in the country, roughly tied with Providence Health Plans in Oregon.
The trend lines suggest that Harvard Pilgrim had been improving both in absolute terms and versus the national averages, and — unlike the Southeast health plan above — was already much better than average before implementing its DM programs. Even so, its performance has improved since DM implementation.
The broader question: Why doesn’t everyone look ahead of time at adverse events by condition before deciding which programs to do? The goal of chronic DM is to reduce adverse events, so it would seem very logical to see ahead of time if — and in which conditions — there are enough to merit an attempt to reduce them.
In addition to looking at these event rates — the so-called “plausibility test” — biostatisticians also recommend a “number needed to decrease” (NND) analysis to confirm whether the ROI you believe you have achieved actually was achieved.
An NND test tells you how many of these events you need to avoid in order to hit your ROI targets, given inputs for program costs and emergency and inpatient care expenses. Then, you input an explicit, transparent assumption about the likelihood of comorbidities being reduced as well, if admissions for the specific primary morbidity are avoided. For instance, in asthma most of the event avoidance will take place in asthma itself. But in diabetes, good DM could avoid events across many related comorbidities.
In addition to the basic assumption that a DM program should reduce events in the disease being managed, there are two other assumptions implicit in an NND test. The second critical assumption is that events associated with those related comorbidities are falling at the same rate that the events coded to the primary morbidity are falling. The third is that it is plausible to say that related comorbidities could fall only if there appears to have been an impact on the primary morbidity.
For this last assumption, an analogy could be made to, yes, sports. If you watch a player hit a bunch of slow balls down the middle for home runs, and the player tells you he can also hit sliders on the corners, you might believe him. However, if he misses the slow balls down the middle, that very same statement about being able to hit sliders on the corners is simply not plausible. That’s why both the straight plausibility test and the NND analysis are so concerned with success or lack thereof in the primary morbidity. While easily measurable on its own, that success would also certainly correlate with much less easily measurable results across a range of comorbidities.
A very simple example of an NND analysis might be as follows. Assume you are spending $1 million on asthma, and that avoiding an average event — the weight-average of the costs of an admission and an ER visit — saves $1,000. If you are targeting a 2:1 ROI, and you assume that whatever minor comorbidity reduction you might achieve is offset by higher drug costs, you must therefore avoid 2,000 asthma events to save $2 million.
Is that achievable? Go back to the event-rate chart. There are about three asthma events for every 1,000 plan members. Recall that this is not diagnosed members or members participating in the DM program — this is just a raw rate of event incidence. Since asthma attacks can occur in anyone, since the health plan pays claims for everyone, and since it’s the program’s job to save money by avoiding events, the raw rate of incidence is the correct rate to measure.
As one example from an event-rate chart, 1 million members would yield about 3,000 asthma events in total. This would make avoidance of 2,000 events extraordinarily unlikely — it would be a 67 percent reduction, would generate a very sharp decline in the event-rate line, and would run into the reality that about a third of asthmatics are simply unknown to a health plan in the first place. Either they themselves don’t know, or they received their diagnosis while belonging to another health plan.
However, if you have 10 million members generating 30,000 events, it would indeed be possible to avoid 2,000 of them. You would just track events the following year using the event-rate test above to see if indeed you avoided 2,000 events — 6.7 percent of the total. An event-rate chart database would reveal multiple instances of a 6.7 percent decline in events.
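The asthma arithmetic above can be sketched as follows. All inputs come from the text; the function names are hypothetical, and the sketch assumes, as the example does, that any comorbidity reduction is offset by higher drug costs.

```python
def number_needed_to_decrease(program_cost: float, target_roi: float,
                              savings_per_event: float) -> float:
    """Events that must be avoided to hit the target ROI."""
    return program_cost * target_roi / savings_per_event


def required_reduction(nnd: float, members: int, rate_per_1000: float) -> float:
    """Share of baseline events that the NND represents."""
    baseline_events = members * rate_per_1000 / 1000
    return nnd / baseline_events


# $1 million program, 2:1 ROI target, $1,000 saved per avoided asthma event.
nnd = number_needed_to_decrease(1_000_000, 2, 1_000)

# About 3 asthma events per 1,000 members, at two plan sizes.
small_plan = required_reduction(nnd, members=1_000_000, rate_per_1000=3)
large_plan = required_reduction(nnd, members=10_000_000, rate_per_1000=3)

print(nnd)                   # 2000.0
print(round(small_plan, 3))  # 0.667
print(round(large_plan, 3))  # 0.067
```

The same 2,000 events are an implausible two-thirds reduction in the smaller plan and a reduction of under 7 percent in the larger one.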
Adding comorbidities creates another layer of complexity in search of more validity. Vary the asthma example above to substitute heart failure for asthma. For heart failure, the value of an avoided event is much greater than for asthma, because a much higher percentage of patients presenting in the ER get admitted, and the lengths of stay are much longer. By looking at the composition of the event rate as between ER and inpatient, and applying your costs for hospital use, you can figure that perhaps the average avoidable heart failure event saves $10,000, rather than $1,000 as in asthma.
Event rates for heart failure fluid overload in the commercial population are about 0.5 per 1,000 members. So assuming the same 1 million people as in the asthma example and the same $1 million in spending, you would have to avoid 200 events to get a 2:1 ROI.
But with an event rate of just 0.5 per 1,000, there are only 500 such events to begin with, making avoidance of 200 — a 40 percent reduction — a difficult challenge. This is where the comorbidity assumption comes in. Suppose that instead of virtually no comorbidity impact from a DM program, as in asthma, you assume that for every fluid overload case your disease managers avoid, they avoid four cases of other medically-related complications or comorbidities. These are nowhere near as measurable because they are spread out over many ICD-9 codes. That assumption significantly reduces the number of avoidable, measured fluid-overload cases needed to decrease to reach the target ROI.
Specifically, to avoid 200 total events in members with congestive heart failure, one has to measure only 40 avoided cases specifically of fluid overload, or about 8 percent of the 500 expected. This is enough to show up on the event-rate charts as a noticeable decline in all but the smallest health plans.
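A short sketch of the heart failure example shows how the “comorbidity multiplier” enters the calculation. The inputs are the figures from the text; the function name and the parameter `comorbid_per_primary` are hypothetical labels for the stated assumption that four related events are avoided for every fluid-overload case avoided.

```python
def measured_events_needed(program_cost: float, target_roi: float,
                           savings_per_event: float,
                           comorbid_per_primary: float) -> float:
    """Primary-morbidity events that must visibly decline to reach the target ROI.

    comorbid_per_primary is an explicit assumption: the number of related
    comorbid events avoided for each avoided primary event.
    """
    total_needed = program_cost * target_roi / savings_per_event
    return total_needed / (1 + comorbid_per_primary)


baseline_events = 1_000_000 * 0.5 / 1000  # 0.5 fluid-overload events per 1,000 members

no_multiplier = measured_events_needed(1_000_000, 2, 10_000, comorbid_per_primary=0)
with_multiplier = measured_events_needed(1_000_000, 2, 10_000, comorbid_per_primary=4)

print(no_multiplier, no_multiplier / baseline_events)      # 200.0 0.4
print(with_multiplier, with_multiplier / baseline_events)  # 40.0 0.08
```

Because the multiplier is a single named input, it can be varied to see how sensitive the required decline is to the assumption.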
While it is, of course, true that the “comorbidity multiplier” is clearly an assumption, the NND analysis has two huge advantages over actuarial pre-post analysis. Measurement of comorbidities is explicit and transparent, and the “comorbidity multiplier” can easily be varied. In actuarial methodologies, the sources of savings, by condition, are totally implicit, as results are presented in dollars. The best example: A public presentation by a William M. Mercer consultant showed savings of $6 million in asthma, without even checking to see if asthma admissions and ER visits changed. Had he done that, he would have noted that his client, a large retailer, didn’t even incur enough asthma events to spend $6 million on them in the first place, let alone save $6 million by avoiding them.
The “once chronic, always chronic” methodology will invariably overstate savings.
A popular methodology among health plans, vendors and especially benefits consultants is the “once chronic, always chronic” approach, a form of pre-post analysis. The Care Continuum Alliance refers to it formally as the “prospective identification” methodology, in which they find the same flaws as described in this section. In this methodology, any member who is identified in any period as having a chronic condition is assumed to continue to have that chronic condition in future periods. The assumption is based on the logic that chronic conditions, by definition, don’t go away and therefore everyone who has them, even if they are totally under control, should be tracked.
The flaw in this logic is that only members who have high enough claims to be identified through a claims algorithm are counted in the baseline. As is well known by now, tracking members with high claims forward will always yield a decline in costs, through regression to the mean.
The classic analogy, well-known to veterans of the DM field, is to aviation. Radar measures the altitude of all the flights it tracks. One could use that data to measure the altitude of planes actually in the air. This measurement will overstate the altitude of the average plane, because many planes are on the ground at any given time. So the “baseline” measurement of altitude will overstate the actual average altitude of all planes because planes on the ground are not captured in the initial average. Over time they will be, because the radar will “know” which planes have landed. So over time, the average will migrate from an average of flights in the air, to an average of all planes including the planes on the ground, thus showing a decline in measured altitude even if there is no change in actual average altitude in the U.S. aviation system.
Now assume that the radar is a claims-extraction algorithm and that the different times of the reading are years of the program. One can see, by analogy, that measured claims will decline as more people with the disease who didn’t happen to have claims in the baseline — metaphorical “planes on the ground” — are included in the measurement.
If indeed everyone with a chronic condition had claims in every period, a “prospective identification” methodology would work. However, as long as there are any “planes on the ground,” the average “altitude” (cost per disease-eligible member) noted by claims-extraction algorithms will overstate the true average cost per disease-eligible member.
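The “planes on the ground” effect can be reproduced with a small simulation. This is an illustrative sketch only: the population size, the 30 percent yearly event probability, and the $1,000 event cost are invented for the demonstration and do not come from any plan’s data. No intervention is modeled, so any measured decline is pure regression to the mean.

```python
import random


def simulate(n_members: int = 10_000, p_event: float = 0.3,
             cost_per_event: int = 1_000, seed: int = 0):
    """Two years of claims for members who all have the chronic condition."""
    rng = random.Random(seed)
    year1 = [cost_per_event if rng.random() < p_event else 0 for _ in range(n_members)]
    year2 = [cost_per_event if rng.random() < p_event else 0 for _ in range(n_members)]

    # The claims-extraction "radar" only sees members with baseline claims.
    cohort = [i for i, cost in enumerate(year1) if cost > 0]

    baseline_avg = sum(year1[i] for i in cohort) / len(cohort)
    followup_avg = sum(year2[i] for i in cohort) / len(cohort)
    true_avg_year1 = sum(year1) / n_members
    true_avg_year2 = sum(year2) / n_members
    return baseline_avg, followup_avg, true_avg_year1, true_avg_year2


baseline_avg, followup_avg, true1, true2 = simulate()
print(f"identified cohort: {baseline_avg:.0f} -> {followup_avg:.0f}")
print(f"all chronic members: {true1:.0f} -> {true2:.0f}")
```

The identified cohort shows a steep drop in cost per member while the true average across all members with the condition barely moves.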
The “annual requalification” methodology should prevent overstatement of savings, but often doesn’t, due to the correlation between higher compliance and recent events.
In theory, the problems noted above should be avoided by a methodology, recommended by the DMAA, which is essentially the same as the “prospective identification” methodology except that it does not count “planes on the ground” in any period, thus canceling out the bias by creating seemingly symmetrical measurement periods. It is a vastly preferable methodology, but still should not be taken as valid unless checked via an event-rate-based “plausibility test.”
Table 1 shows a hypothetical example illustrating how the “annual requalification” methodology gives a much more valid result than does the prospective identification methodology. Assume that there are only two asthmatics in the health plan, and one baseline and one program year. Further assume that inflation/trend have already been taken into account.
|TABLE 1 Cost per person with asthma in both periods using both methodologies (scenario 1)|
|2005 (baseline year)||2006 (contract year)|
|Cost per patient with asthma||$1,000 in both methodologies||$500 in prospective, $1,000 in annual requalification|
In the baseline, both methodologies yield the same result — $1,000 is the average cost per asthmatic because the second asthmatic, a classic “plane on the ground,” doesn’t show up in the measurement. In the contract period, however, the methodologies yield dramatically different results. The “annual requalification” shows a $1,000 cost per asthmatic because #1 is not counted since he had no asthma-identifiable claims. The prospective methodology, though, counts him because he had asthma in the baseline so he certainly still has it, even if it’s under control enough not to generate claims.
Even though the total costs to the plan have not changed, the “prospective” methodology shows a 50 percent reduction in cost per member just by counting both members, while the annual requalification methodology shows the correct result, that costs did not decline. Curiously, even though the annual requalification methodology finds the correct mathematical answer in this case and “prospective” does not, most epidemiologists would argue the opposite — that prospective identification truly captures the right population because actual chronic disease itself does not go away. Hence, people who have shown that they have it should be counted in all future periods, even without claims. The proponents of the “annual requalification” methodology would respond that the large majority of those who do not requalify represented false positives in the baseline, and so eliminating them is epidemiologically correct as well as mathematically correct.
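The two-asthmatic scenario can be worked through in code. The per-member claims below are inferred from the narrative (member #1 has a $1,000 claim only in 2005, member #2 only in 2006) rather than copied from a surviving table, and the function names are hypothetical.

```python
# Asthma-identifiable claims per member per year (inferred from the narrative).
claims = {
    "member_1": {2005: 1_000, 2006: 0},
    "member_2": {2005: 0, 2006: 1_000},
}


def annual_requalification(claims: dict, year: int) -> float:
    """Average cost over members who have identifying claims in that year only."""
    ids = [m for m, by_year in claims.items() if by_year[year] > 0]
    return sum(claims[m][year] for m in ids) / len(ids)


def prospective(claims: dict, year: int, baseline_year: int) -> float:
    """Average cost over members identified in any year from baseline onward."""
    ids = [
        m for m, by_year in claims.items()
        if any(cost > 0 for y, cost in by_year.items() if baseline_year <= y <= year)
    ]
    return sum(claims[m][year] for m in ids) / len(ids)


for year in (2005, 2006):
    print(year, prospective(claims, year, 2005), annual_requalification(claims, year))
# 2005 1000.0 1000.0
# 2006 500.0 1000.0
```

Total plan cost is $1,000 in both years, so only the annual requalification figure matches what actually happened.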
Yet even the annual requalification methodology must be plausibility-checked with an event-rate measurement, because it too can be flawed. Assume the previous example, but in this scenario, assume that #1 takes drugs for a while, having had a “scare” (see Table 2).
|TABLE 2 Cost per person with asthma in both periods using both methodologies (scenario 2)|
|2005 (baseline year)||2006 (contract year)|
|Cost per person with asthma||$1,000||$550|
Both methodologies identify the member in the contract period and both would then show a 45 percent decline in costs, even though the costs actually increased. Note, below, how a simple application of an event-based plausibility test highlights the flaw in the measurement. Having seen this red flag, the actuaries can now go back and remeasure to get the right answer (Table 3).
|TABLE 3 Asthma events in the payer as a whole|
|2005 (baseline year)||2006 (contract year)|
|Cost per person with asthma||$1,000||$550|
|Event per person with asthma||1||1|
One might say, “That’s not fair — you added the drugs to 2006 to create an artificial scenario where annual requalification wouldn’t work.” However, there is nothing “artificial” about this scenario. It is actually the most common scenario imaginable — that people are much more likely to take their drugs after they have a “scare” than before they do. If indeed it were the case that taking drugs was not more likely after a scare, then a “$100” would also consistently show up in the quadrant of baseline per patient #2, and the average in both years would be $550. However, who among us isn’t much more careful after just having had a “scare” of any kind than after the memory of the scare has faded?
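A brief sketch of the second scenario shows how the event count exposes the flaw. The member-level figures are inferred from the text (member #1 has a $1,000 event in 2005 and $100 of drug claims in 2006; member #2 has a $1,000 event in 2006) and are illustrative only.

```python
# Claims and events per member per year (inferred from the narrative).
claims = {
    "member_1": {2005: 1_000, 2006: 100},   # event in 2005, drugs after the "scare"
    "member_2": {2005: 0, 2006: 1_000},     # "plane on the ground" in 2005
}
events = {
    "member_1": {2005: 1, 2006: 0},
    "member_2": {2005: 0, 2006: 1},
}


def cost_per_identified_member(year: int) -> float:
    """Average cost over members with any asthma-identifiable claim that year."""
    ids = [m for m in claims if claims[m][year] > 0]
    return sum(claims[m][year] for m in ids) / len(ids)


measured_decline = 1 - cost_per_identified_member(2006) / cost_per_identified_member(2005)
total_cost = {y: sum(c[y] for c in claims.values()) for y in (2005, 2006)}
total_events = {y: sum(e[y] for e in events.values()) for y in (2005, 2006)}

print(f"measured decline in cost per member: {measured_decline:.0%}")  # 45%
print(total_cost)    # {2005: 1000, 2006: 1100}
print(total_events)  # {2005: 1, 2006: 1}
```

A reported 45 percent decline alongside a flat event count is the red flag: the per-member average fell only because the denominator grew.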
There are only two significant sources of savings: a reduction in inpatient admissions for the condition(s) and emergency room (ER) avoidance for the specific conditions being managed and their closely related comorbidities. No other savings of significance can be attributed to DM, so there is no reason to complicate measurement with claims from other cost categories.
There is no unit-cost change possible in DM and, therefore, no reason to measure inflation, or “trend.” Doing so just increases the cost and complexity of measurement, while reducing the validity.
These two facts can be grouped because they both point in the same direction: Keep the calculation simple and population-based.
If the answer is so simple and these facts are so incontrovertible, why did reconciliations develop into the complex, expensive, invalid methodologies that have been, until recently, so widely used?
There are three reasons:
(1) The industry evolved based on savings guarantees. Guarantees had to be financially based. So methodologies were developed which were totally based on financial results, and usually never even considered whether the underlying utilization declines necessary to support that analysis were even possible. To this day, one national health plan routinely presents financial savings which, according to its own data, are impossible, and no one notices.
(2) Even without guarantees, the program sponsors within a health plan felt that they needed to present “a number” to senior management. It did not matter that the “number” was no more relevant to the program’s success than the North Vietnamese “body count” was to the Vietnam War’s success. People felt that they needed “a number.”
(3) Most health plans rely on their actuaries, as employers rely on their actuarial consultants, for financial calculations. So the actuarial departments were either given responsibility or took responsibility for this function. And actuaries are clearly the authorities when it comes to answering questions like, “How will a three-tier drug program affect medical spending?” That is an actuarial question and should be given an actuarial answer.
However, DM is not an actuarial science, involving the application of numerical models to a set of assumptions. It is a biostatistical science requiring knowledge and inferences about dose-response relationships for behavior change and event avoidance. It is all about the avoidance of exacerbations and complications, either currently (as noted in the event-rate calculations) or in the future, through favorable changes in quality indicators which presage the avoidance of future events. It is not about unrelated hospitalizations. It is not about lab tests, radiology, home care or any other element of cost which gets included in savings calculations.
And it is not about where one sets the “outlier filter.” By the time people get anywhere near the six-figure claims level, they have long since surpassed a point where events can be prevented by phone calls. Yet, changes in outlier filters can dramatically change the savings “number” despite the lack of impact of DM on those very high-cost members. It’s all about luck, at that point: Did there happen to be more outliers in the baseline or in the study period?
Likewise, pricing has nothing to do with it. DM vendors do not affect contract pricing. So why include inflation in the calculations? Small changes in inflationary trend assumptions can create massive changes in perceived savings, when inflation has nothing to do with it.
Mercer’s lead actuary, Seth Serxner, graciously and candidly acknowledges this fatal flaw in actuarial methodologies in a 2008 article when he writes:
“We can conclude, however, that the choice [emphasis added] of trend has a large impact on estimates of financial savings. Evaluators may be wise, therefore, to conduct their analyses with more than one trend in mind in order to attain a range.”
In other words, there is no way of knowing what the underlying savings actually are since it is all dependent on one’s “choice” of inflation trend. A methodology which does not need to be adjusted for inflation avoids that fatal flaw.
Speaking of “trend,” there is no evidence that the trend for the nonchronic conditions can be used as a proxy for what the chronic-disease trend would have been absent the intervention.
Contrary to what actuaries will tell you, the nonchronically ill population typically differs from the diseased population on nearly all demographic or economic variables. Using a noncomparable group to determine expected trends in cost will introduce measurement bias and limit the ability to draw accurate conclusions about the results. Only if many serial observations of cost are determined to be equivalent between the populations can some degree of confidence be achieved in using the nonchronic trend as a comparator for the chronic population.
These concerns are illustrated by the National Hospital Discharge Survey (NHDS) data presented in Figure 3. Three major chronic disease categories (circulatory, endocrine, and respiratory) are compared to “all other” discharges. As shown, those categories in which the majority of conditions are chronic have been flat, while nonchronic discharges have gone up by 7.5 percent over the observed five-year period. The assumption can also be made that the “all other” category of discharges is more costly than the chronic conditions because many of the diagnoses require surgeries (i.e., injuries, deliveries, and complications), as opposed to less costly medical stays. While these data do not represent chronic versus nonchronic populations per se and some degree of overlap is inevitable, these data do demonstrate that both the level and trend of discharges for nonchronic conditions are significantly higher than those for the categories considered chronic. These findings suggest that applying a nonchronic “trend” to the diseased population will bias the results in favor of the DM program.
There is also the problem that the two populations aren’t static. The “planes on the ground” will show up in the nonchronic population in the baseline. Then suppose some of them have an event. That event will show up in the nonchronic trend line, causing it to rise even though the person had the condition. In the next period, those people will be shifted into the chronic population. And, like many people with an event, they will then regress to the mean and not have another event. Thus their spike in costs will be counted as part of the nonchronic trend and their subsequent regression to the mean will be part of the chronic trend. As a result, the calculation might favor the diseased population.
Or, some actuaries might do it the other way around to avoid this, and recalculate the trend for both the chronic and nonchronic populations retrospectively, once it is learned that some of the people in the baseline were really “planes on the ground” and not nonchronic, and should have been in the chronic population to begin with. How many years would one do this for, retrospectively? How many times would one recalculate?
No matter which way one looks at it, finding an answer is cumbersome and the answer is probably invalid anyway. One can see why simply counting how many events are avoided is becoming the preferred approach.
A population selected based on “risk scores” of any type — including members selected on the basis of predictive modeling — will also regress to the mean.
One of the most common fallacies of the actuarial approach is that starting not with a population based on claims identification, but rather with a population based on its risk score, will avoid regression to the mean. However, all risk-scoring methodologies weigh either last year’s claims or some proxy for last year’s claims. Why? For the simple reason that last year’s claims are a good predictor of this year’s claims. Many people who are high-cost last year will stay high-cost, though some will move to low-cost.
Likewise, most of the people who were low-cost last year will remain so, though some will move to high-cost. If predictive models or risk scoring could predict those two moves, then indeed “risk scores” would avoid regression to the mean. But they can’t. If your doctor can’t predict when you will have a heart attack, how can a software claims algorithm?
A major risk-bearing health system once tested the ability of predictive algorithms to truly “predict,” meaning to determine which low-cost people would transition to become high-cost. What they found was, ironically, in itself quite predictable but also disappointing. Only a few low-cost members were correctly predicted to become high-cost.
In one exercise, predictive modeling vendors were asked to, well, predict. They were given a two-year-old data set and asked to predict the low-cost people who would become high-cost in the following year. “Low cost” was defined as having had claims of $4,000 or less in the base year, while “high cost” was defined as $10,000 or more. The results were that very few members were predicted to transition in this manner, with only Vendor A predicting more than a handful. The line, representing the percentage of those predicted who actually became high-cost, tells a story too. Even Vendor D, which tried to be most specific in this prediction and predicted fewer than 20, was right about only one of them.
In reality, several thousand people transitioned in this manner. Most were a surprise to the predictive modeling vendors, and were probably also surprises to themselves and their physicians.
Conclusion: While risk scoring and its predictive-modeling cousin may have a role in identifying members for DM, they cannot be relied upon as a tool to predict a cohort’s claims cost, nor can they be used as a study design when trying to select a population whose future claims will be immune to regression to the mean.
The good news is that you can measure validly using “ingredients you have around the kitchen,” without the need for expensive actuarial consulting.
The preferred methodologies that have been described in this chapter can be easily measured by any health plan or large employer using just ICD-9 codes. There is no reason to spend large sums of money on actuarial modeling when you can get greater validity and transparency simply by determining whether you have avoided events and complications closely associated with the conditions which you are managing specifically in order to avoid events and complications.
Taking the advice in this chapter will improve your measurement dramatically. Too many reports are accepted which contain too many unnoticed mistakes, mistakes which would be caught if these simple rules and observations were followed. You may think you can spot these mistakes already. But you probably didn’t even notice that this chapter on “10 Things You Need To Know about Measuring Outcomes” actually contained a list of eleven.
Copyright ©2011 by Atlantic Information Services Inc. (AIS). This is an excerpt from Disease Management and Wellness in the Post-Reform Era, published by Atlantic Information Services Inc. (www.AISHealth.com). It is reprinted with permission from AIS.
Purpose: Insulin pump users discard unused medication and infusion sets according to labeling and manufacturer’s instructions. The stability labeling for insulin aspart [rDNA origin] (Novolog) was increased from two days to six. The associated savings was modeled from the perspective of a hypothetical one-million member health plan and the total United States population.
Design: The discarded insulin volume and the number of infusion sets used under a two-day stability scenario versus six were modeled.
Methods: A mix of insulin pumps of various reservoir capacities with a range of daily insulin dosages was used. The average daily insulin dose was 65 units, with a range of 10 to 150 units. Costs of discarded insulin aspart [rDNA origin] were calculated using the wholesale acquisition cost (WAC), taken as average wholesale price (AWP) minus 16.67%. The cost of pump supplies was computed for the two-day scenario assuming a complete infusion set change, including reservoirs, every two days. Under the six-day scenario, complete infusion sets were discarded every six days, while cannulas at the insertion site were changed midway between complete changes. The AWP of the least expensive supplies was used to compute their costs.
Principal findings: For the hypothetical health plan (1,182 pump users) the annual reduction in discarded insulin volume between scenarios was 19.8 million units. The corresponding cost reduction for the plan due to drug and supply savings was $3.4 million. From the U.S. population perspective, savings of over $1 billion were estimated.
Conclusions: Using insulin that is stable for six days in pump reservoirs can yield substantial savings to health plans and other payers, including patients.
Diabetes continues to be a critical health issue, both medically and financially. Its incidence in the United States has grown significantly, rising in tandem with the increase in the obesity rate (CDC 2010, Flegel 2010). Currently, almost 18 million people have diabetes in the United States; of these, 90 to 95 percent have type 2 (CDC 2010, American Diabetes Association 2010). As a result, diabetes is a major contributor to morbidity and mortality, driving increased use of a wide variety of medical services. The annual direct medical and pharmacy costs of diabetes, estimated at $116 billion in 2007, are substantial and are expected to continue to rise as the population of patients with diabetes expands (CDC 2010, American Diabetes Association 2010). These costs are largely borne by payers, including private carriers, employers, federal and state governments, and ultimately taxpayers and consumers. This article presents a method with potential to reduce the costs of insulin for individual payers and across the U.S. health care system.
The U.S. Agency for Healthcare Research and Quality (AHRQ) recognized that diabetes is a difficult and complex disease to manage. AHRQ has identified that intensive therapy and a team approach result in improved care of patients with diabetes. Specifically, the agency recommends the following:
Using these guidelines, the mean glycosylated hemoglobin (HbA1c) for type 1 patients was 7.1 percent, and the mean for type 2 patients was 6.9 percent (AHRQ).
An important subset of the diabetic population consists of insulin pump users, estimated at over 360,000 in the United States (JMP Securities, 2009). Insulin pumps are readily available and may simplify the diabetes treatment regimen.
The insulin pump has changed diabetes management but has added new complexities to diabetes care. Patients must fill the pump reservoirs, whose capacities differ by pump product, with insulin, discarding unused insulin that has passed the in-use expiration date, as well as changing the needle and the tubing on a regular basis. The insulin and the other disposable parts constitute a significant portion of the pump care costs. Extending the expiration period for insulin potentially decreases drug wastage and reduces the frequency of changing infusion sets. This could lead to lower costs for payers and patients, as well as simplifying the regimen for patients.
Recently, the labeling for Novolog (insulin aspart [rDNA origin] Injection) has changed to reflect the improved stability in insulin pump reservoirs from two days to six days in three insulin pump devices (Novolog prescribing information). In this study, we used a population-based model to project the potential impact on medical and pharmacy costs, from the payer point of view. We also examined care management issues related to pump use for insulin pump–dependent patients with diabetes. Some of these care management issues included complexity of treatment regimen, patient dosing requirements, and frequency of changing insulin in reservoir and infusion sets.
The cohort of insulin pump users under each scenario (“two-day” and “six-day”) was distributed among a mix of the three insulin pumps with various reservoir capacities (Minimed Report). These pumps have been evaluated for use with insulin aspart [rDNA origin] and are listed in the insulin aspart product label. Table 1 summarizes reservoir capacity, cost of consumable pump components, and the approximate U.S. market share of three insulin pump types.
|TABLE 1 Characteristics and market share of top three pumps in market|
|Pump||Reservoir size in units||AWP of full infusion set with reservoir||AWP of needle and cannula||Patient mix|
|Source: MediSpan 2009|
In the two-day scenario, the entire infusion set, including the reservoir, is discarded at the end of each two-day period. In the six-day scenario, the needle and cannula are changed at the infusion site at day 3 (as per pump manufacturer recommendation) and the entire infusion set is discarded at the end of each six-day period. We calculated the total cost of discarded pump components for each two-day period for the “two-day scenario” and each six-day period for the “six-day scenario” based on AWP prices derived from the MediSpan database for December 2009.
The daily insulin doses assumed to be taken by the pump user cohort were based on the distribution shown in Figure 1, which is derived from a large sample of U.S. diabetes patients (IMS LifeLink). All patients are assumed to be using insulin aspart [rDNA origin] 10-ml vials, as this would appear to be the most convenient method for transferring insulin into pump reservoirs.
We calculated the amount of discarded insulin remaining in the reservoir at the end of each two-day or six-day period, assuming the pump reservoir is filled at the start of the period and refilled whenever it empties during the period. We also included in our calculations any unused insulin remaining in the vial 28 days after opening. The cost of the discarded insulin in each two-day and six-day period was based on the wholesale acquisition cost (WAC) of insulin aspart [rDNA origin] 10-ml vials (Red Book).
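The reservoir part of this calculation can be illustrated with a minimal sketch. It assumes a hypothetical 300-unit reservoir (the actual capacities in Table 1 are not reproduced here) and the mean daily dose of 65 units cited in the article. It ignores the 28-day vial wastage, so it is not a reconstruction of the authors' full model.

```python
import math

def discarded_units_per_period(reservoir_units, daily_dose_units, period_days):
    """Units left in the reservoir, and thrown away, at the end of one in-use period.

    The reservoir is filled at the start of the period and refilled each time
    it empties before the period ends.
    """
    used = daily_dose_units * period_days
    fills = max(1, math.ceil(used / reservoir_units))
    return fills * reservoir_units - used

def annual_discarded_units(reservoir_units, daily_dose_units, period_days, days_per_year=365):
    """Scale the per-period waste to a year."""
    periods_per_year = days_per_year / period_days
    return periods_per_year * discarded_units_per_period(
        reservoir_units, daily_dose_units, period_days
    )

# Hypothetical reservoir size; 65 units/day is the mean dose cited in the article.
RESERVOIR_UNITS = 300
DAILY_DOSE = 65

two_day = annual_discarded_units(RESERVOIR_UNITS, DAILY_DOSE, 2)
six_day = annual_discarded_units(RESERVOIR_UNITS, DAILY_DOSE, 6)
print(f"two-day waste: {two_day:,.0f} units/year")
print(f"six-day waste: {six_day:,.0f} units/year")
print(f"reduction:     {two_day - six_day:,.0f} units/year")
```

Under these assumptions the two-day scenario discards 170 units every two days and the six-day scenario discards 210 units every six days. This is why the longer in-use period lowers annual waste even though it needs a second fill.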
Finally, the difference in total cost of all discarded insulin and all pump components consumed over a one-year period for the six-day scenario was compared with the two-day scenario.
The two population cohorts used in the model were based upon 1) a one-million member health plan and 2) total estimated pump users in the U.S. (360,000 U.S. pump users correspond to a prevalence of 0.118 percent or 1,182 pump users in a one-million member plan) (JMP Securities). These cohorts were selected as they represent common planning scenarios used by payers and epidemiologists.
Table 2 shows the breakout of cost savings on a per-patient basis, which are independent of total population. On average, a typical health plan of any size can expect total annual savings in drug and supply costs for each pump patient to average $2,873, assuming that patients follow the six-day scenario regimen.
|TABLE 2 Per-person average annual cost savings for a six-day versus a two-day scenario|
|Source of annual per-person savings||Estimated average annual per-patient savings|
|Reduction in discarded insulin (WAC)||$1,556|
|Reduction in supplies (AWP)||$1,317|
Table 3 shows the breakout of savings associated with each of the studied populations. For the one-million member health plan, the model estimates that there will be 1,182 insulin pump users with prescribed insulin amounting to 28 million units of insulin aspart [rDNA origin] at a WAC of $2.6 million each year. The reduction in discarded insulin in the “six-day scenario” compared to the “two-day scenario” is estimated at 19.8 million units, saving $1.84 million. The reduction in pump supply costs yields a further $1.55 million in savings. The overall annual savings associated with all 1,182 pump users is $3.39 million.
|TABLE 3 Population-based savings for a six-day versus a two-day scenario|
|Measure||Health plan scenario (annual)||Health plan scenario PMPM||U.S. population scenario (annual)|
|Population||1 million||n/a||307 million|
|Estimated insulin pump users||1,182||n/a||362,746|
|Insulin units prescribed (avg. 65 units per day)||28 million||n/a||8.6 billion|
|WAC of prescribed insulin||$2.6 million||n/a||$798 million|
|Reduction in discarded insulin (units)||19.8 million||n/a||6.1 billion|
|Reduction in discarded insulin (WAC)||$1.84 million||$.15||$564 million|
|Reduction in supplies (AWP)||$1.55 million||$.13||$478 million|
|Total annual savings||$3.39 million||$.28||$1.04 billion|
Based on U.S. census data for July 2009, the population is estimated to be just over 307 million people (U.S. Census Bureau). The model estimates 362,746 pump users with prescribed insulin, amounting to 8.6 billion units at a WAC of just under $800 million annually. The reduction in discarded insulin in the six-day scenario, compared with the two-day scenario, for the total U.S. cohort of pump users is estimated at 6.1 billion units, saving $564 million. The reduction in pump supply costs yields a further $478 million in savings. The overall annual savings associated with the entire U.S. cohort of pump users is just over $1 billion.
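The population figures follow from the per-patient savings in Table 2 multiplied by the number of pump users. The sketch below shows that scaling and the per-member-per-month (PMPM) conversion. The function name and dictionary keys are illustrative, not taken from the authors' model.

```python
# Per-patient annual savings reported in Table 2 (USD).
PER_PATIENT_INSULIN_SAVINGS = 1556  # reduction in discarded insulin, at WAC
PER_PATIENT_SUPPLY_SAVINGS = 1317   # reduction in supplies, at AWP

def population_savings(pump_users, members):
    """Scale per-patient savings to a covered population and express them PMPM."""
    insulin = pump_users * PER_PATIENT_INSULIN_SAVINGS
    supplies = pump_users * PER_PATIENT_SUPPLY_SAVINGS
    total = insulin + supplies
    return {
        "insulin": insulin,
        "supplies": supplies,
        "total": total,
        "pmpm": total / members / 12,  # per member per month
    }

plan = population_savings(pump_users=1_182, members=1_000_000)
us = population_savings(pump_users=362_746, members=307_000_000)

print(f"plan total: ${plan['total']:,}  PMPM: ${plan['pmpm']:.2f}")
print(f"U.S. total: ${us['total']:,}")
```

With the article's inputs this gives roughly $3.4 million and $0.28 PMPM for the one-million member plan, and about $1.04 billion for the U.S. cohort, in line with Table 3.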
The potential clinical impact of uncontrolled diabetes on patients is well established, as is the economic burden on both the patient and society. Patients using insulin pumps have the additional cost and inconvenience of periodically replacing both their insulin and their infusion sets. Our results demonstrate that there is an opportunity to offer patients who use an insulin pump a less complicated, cost-saving regimen based on insulin with six-day in-pump stability, one that is likely to increase patient and provider satisfaction with therapy.
It is important to consider the major stakeholders in any decision to implement a broad-based program to migrate patients to a new regimen. These include the payer, provider, and patient. Insulin and pump supplies for patients covered under traditional multi-tier pharmacy benefit plans with typical medical benefit coverage for durable medical equipment cost substantially less under the six-day regimen than under the two-day regimen. Each patient using a daily insulin dose ranging from 10 to 150 units, with a mean of 65 units, saved on average $2,873 ($1,556 insulin and $1,317 supplies) annually (IMS LifeLink). These savings are greater than the total cost at WAC ($2,199) of the prescribed insulin for a patient taking the average daily dose of 65 units. For a one-million member health plan, overall savings are estimated to be almost $3.4 million, equating to savings of 28¢ per member per month (PMPM).
The implications of this level of savings to a health plan are dramatic, because there is a substantial difference in cost and profit compared with the two-day regimen. Assuming that a health plan is operating on a 3 percent margin, to achieve $3.4 million in bottom-line profit, it would have to acquire $113 million in incremental revenue from new sources. In 2009, the average cost of health care premiums was $13,375 (Kaiser Family Foundation 2009); therefore, it would take new employer groups totaling 8,460 employees to yield an equivalent amount. Health care premiums are likely to rise, and implementation of this new regimen may help slow some of these increases.
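The revenue-equivalence argument is a two-step division, sketched below with the article's inputs. The function names are illustrative. The result lands close to, but not exactly on, the article's figure of 8,460 employees, presumably because the published figures were rounded at intermediate steps.

```python
def revenue_equivalent(savings, margin):
    """Incremental revenue needed to produce `savings` in profit at a given margin."""
    return savings / margin

def employees_needed(revenue, premium_per_employee):
    """Number of newly covered employees whose premiums would supply that revenue."""
    return revenue / premium_per_employee

ANNUAL_SAVINGS = 3_400_000  # USD, one-million member plan (rounded, from the article)
MARGIN = 0.03               # 3 percent operating margin (article assumption)
AVG_PREMIUM = 13_375        # USD, 2009 average premium cited in the article

revenue = revenue_equivalent(ANNUAL_SAVINGS, MARGIN)
employees = employees_needed(revenue, AVG_PREMIUM)
print(f"revenue needed:   ${revenue:,.0f}")
print(f"employees needed: {employees:,.0f}")
```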
Health plans also have a stake in patients maintaining glycemic control through adherence to their treatment regimens, because members with diabetes use more health care services than members without the illness. One study of a privately insured population found that adults with type 2 diabetes and 24 months of continuous health plan enrollment had 2.4 times the adjusted health care costs of matched non-diabetic controls (Durden 2009).
It is important to recognize that managed care organizations make decisions by committee; therefore, a change in protocol is likely to require P&T or even medical technology committee approvals. As a result, as with any other systematic change, an internal champion must emerge to make the case and handle any objections that may arise. Based on the impact of the cost savings involved, it seems likely that either a pharmacy director or medical director with pharmacy budget responsibility will take the lead.
Although these estimated cost savings from reduced wastage are impressive for private health plans, the same cost implications apply to the significant portion of the diabetic population covered by government-sponsored plans such as Medicare, Medicaid, Tricare, and the Veterans Affairs Department. This further magnifies the positive impact on both costs and patient outcomes at a societal level.
Diabetes is a complex disease requiring patients to demonstrate excellent care management over their lifetimes in order to successfully keep their HbA1c levels within target range. It has been shown that medication adherence rates for patients with type 1 and type 2 diabetes who use a self-monitoring blood glucose regimen are 70 percent and 64 percent, respectively (Delamater 2006). Complex treatment regimens are a barrier to successful glucose control, resulting in increased morbidity and mortality as well as economic consequences. The Diabetes Control and Complications Trial proved that intense management of patients with respect to glycemic control resulted in dramatic reductions in the rate of development and progression of retinopathy, neuropathy, and nephropathy. In turn, tighter glycemic control yielded lower long-term medical costs from reduced hospitalization and fewer medical services (Zinman 1997).
Lower adherence rates have been correlated with chronic conditions, with asymptomatic or varying symptomatology during disease progression, and with complex treatment regimens, especially those requiring lifestyle changes (Delamater 2006). All of these are associated with diabetes and can represent a challenge to patients and health care professionals. From the payer perspective, tighter control represents a significant medical cost savings opportunity.
Patients who make pharmacy copayments for drugs and have out-of-pocket coinsurance for some supplies under more traditional pharmacy and medical benefit plans have much to gain by converting to a six-day regimen. In 2007, Kaiser Family Foundation reported average copayments for generics, preferred brands, and non-preferred brands of $11, $25, and $43, respectively (Kaiser Family Foundation 2011). Under a six-day scenario, we assume that doctors will continue to prescribe the same number of insulin units per prescription; however, patients will refill them less frequently because the changed label of insulin aspart allows for 6 days’ use in a pump and reduces discarded insulin. Furthermore, patients save on infusion sets because fewer are required throughout the year. Based on the six-day regimen, patients will refill 3 to 4 fewer prescriptions per year, resulting in $75 to $172 in annual savings, depending on whether the six-day insulin is reimbursed as a preferred or non-preferred brand.
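The $75 to $172 range is the product of the number of avoided refills and the applicable copayment tier. A short sketch, using only figures from the text and an illustrative function name:

```python
# Average copayments reported in the text (USD per prescription fill).
COPAY_PREFERRED_BRAND = 25
COPAY_NON_PREFERRED_BRAND = 43

def annual_copay_savings(fewer_refills, copay):
    """Copayments avoided per year when a patient fills fewer prescriptions."""
    return fewer_refills * copay

low = annual_copay_savings(3, COPAY_PREFERRED_BRAND)       # 3 fewer fills, preferred tier
high = annual_copay_savings(4, COPAY_NON_PREFERRED_BRAND)  # 4 fewer fills, non-preferred tier
print(low, high)  # 75 172
```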
Relative to insulin costs and pump supplies, greater savings to the patient are more likely to occur through the reduction of the number of full infusion sets required during the year. Under the two-day regimen, patients require 15 infusion sets per month, costing an estimated $2,500 annually. Conversely, a six-day regimen requires only five complete infusion sets per month, costing $833 per year and resulting in $1,667 in infusion set savings for the year.
There is, however, an additional cost of $350 to the six-day regimen associated with new needles and cannulas, which are required every three days. But even with this adjustment, the supply cost savings net to $1,317 per annum per patient. We would expect patients to realize additional savings from reduced medical costs, because adherence and glycemic control improve with the less complicated regimen.
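The supply savings can be retraced from the figures in the two paragraphs above. In the sketch below, the per-set cost is not stated in the article; it is derived here as an assumption from the $2,500 annual estimate for 15 sets per month.

```python
def net_supply_savings(annual_cost_two_day, sets_per_month_two_day,
                       sets_per_month_six_day, needle_cannula_annual_cost):
    """Annual per-patient supply savings of the six-day versus the two-day regimen."""
    # Implied cost of one full infusion set (an assumption derived from the annual figure).
    cost_per_set = annual_cost_two_day / (sets_per_month_two_day * 12)
    annual_cost_six_day = cost_per_set * sets_per_month_six_day * 12
    set_savings = annual_cost_two_day - annual_cost_six_day
    # The six-day regimen adds a needle and cannula change at day 3 of each period.
    net = set_savings - needle_cannula_annual_cost
    return annual_cost_six_day, set_savings, net

six_day_cost, set_savings, net = net_supply_savings(
    annual_cost_two_day=2500,        # USD, article estimate for 15 sets per month
    sets_per_month_two_day=15,
    sets_per_month_six_day=5,
    needle_cannula_annual_cost=350,  # USD, article estimate
)
print(f"six-day set cost: ${six_day_cost:,.0f}")
print(f"set savings:      ${set_savings:,.0f}")
print(f"net savings:      ${net:,.0f}")
```

Rounded to whole dollars, this reproduces the article's $833, $1,667, and $1,317.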
Further potential benefits to diabetic patients result from these supply cost savings. Many health plans impose a durable medical equipment cap for the year, typically $2,500. With lower pump supply costs, patients are less likely to reach their cap, or they can have more coverage for other durable medical equipment purposes.
In the United States, the number of individuals with high deductible health plans (HDHP) is increasing (AHIP 2009). According to a 2009 report from America’s Health Insurance Plans Center for Policy and Research, HDHP coverage rose to 8 million in January 2009. This is an increase of 1.9 million from 6.1 million in January 2008. For patients with high deductible health plans the six-day regimen has an even greater favorable financial impact than for patients with more traditional coverage. For the former, 100 percent of their medical and pharmaceutical expenses are paid out of pocket until their high deductible is reached. For a member with a $5,000 deductible, a reduction in out-of-pocket expenses of $2,500 or more is quite considerable.
The change to the six-day regimen would be especially important for patients who are trying to stretch their health care out-of-pocket dollars. Perhaps these patients are not changing their insulin and infusion sets every two days as directed and therefore are at risk for less efficacious results because the insulin that they are using is outdated. Benefits of the six-day protocol include a reduction in frequency of changes.
The six-day protocol also has meaningful advantages for health care professionals trying to simplify the complex and costly treatment regimens. With the ever-increasing costs of health care, the new regimen enables the provider to offer a more economical option to patients. It provides yet another opportunity to educate patients on the need to follow the health care professional’s guidance. It provides a much simpler method for patients to follow that could lead to better adherence, not only to the insulin pump dosing but to overall treatment.
There are limitations to this analysis. The model assumes patients are currently adhering to the manufacturers’ recommendations for changing the insulin reservoir every two days, as per the FDA’s indications or the pump instructions. In the real world, patients are not always compliant with their medication or with device instructions. If a patient continues to use the same insulin reservoir until it is empty, regardless of the two-day in-pump use limitation, this would diminish the savings projection. However, there may be clinical and economic consequences associated with adverse events caused by the use of expired insulin. In addition, in an effort to reduce costs, there may be patients who currently do not fill their reservoir to capacity as instructed.
There will also be a certain level of noncompliance with proper insulin pump use, although this is likely to be less common in a pump-dependent group than among insulin-dependent diabetic patients in general (Raccah 2011), because of the screening that typically occurs before a patient is started on the pump and because of the hyperglycemic risks associated with running out of insulin too soon. A reasonable amount of resources applied by health plans toward educating health care professionals and patients on the clinical and economic benefits of the six-day protocol should result in new pump users initiating therapy with the new regimen.
The model assumes typical distributions of patients and market shares of the three different insulin pumps. Individual health plans may, of course, have their own preferred insulin pump products and, therefore, have a different mix of pumps that will affect the outputs from the model.
Another assumption in the model is that patients will continue to obtain prescriptions for insulin from their doctors as they are currently written. For example, if a patient is currently prescribed 2,000 insulin units for a 30-day supply, the patient will continue to refill that prescription, but the refills will be less frequent (more than 30 days apart). The result is less cost to the patient and payer; however, the plan will collect fewer copayments during the year.
Overall, we believe that these limitations do not significantly change the impact of the savings that plans and patients are likely to experience.
Our model demonstrates that significant cost savings can be achieved by converting insulin pump users from a two-day insulin regimen to a six-day regimen. Under the six-day regimen, pharmacy budgets can be expected to see insulin costs fall by $1,556 per patient and medical supply costs fall by $1,317 per patient. Health plans with even small numbers of patients who use insulin pumps may find it useful to encourage the use of the six-day regimen. From a societal perspective, it appears that significant cost savings to the health care system might be achieved.
Funding for this study was provided to Managed Solutions by Novo Nordisk Inc., the manufacturer of Novolog. An honorarium was provided to Derek van Amerongen, MD, MS, for his contributions.
Richard C. Weiss had primary responsibility for concept and design with assistance from van Amerongen and Gary Bazalo. The manuscript was drafted primarily by Weiss and van Amerongen with input from Bazalo.
PO Box 526
Mt. Freedom, NJ 07970
Appropriate patients for insulin therapy include patients with type 1 diabetes and individuals with type 2 diabetes who are unable to be controlled with other medications. Type 1 is an autoimmune disease in which the ability of the pancreas to make insulin has been destroyed. In type 2 diabetes, either the pancreas does not produce enough insulin or the body’s cells are resistant to the action of insulin (AHRQ).
It is not for every illness that the FDA recommends massage and emotional support. But fibromyalgia is not like most other illnesses. First, it should be said that there is no longer any question about whether fibromyalgia syndrome is a specific illness — the FDA itself states that these are patients who have gone through a change in the way their brains perceive pain.
And emotional support forms an essential element in management of this syndrome, according to the FDA. “Medication is just one part of the treatment approach” and the FDA also mentions walking, jogging, biking, gently stretching muscles, water therapy, massage, yoga, and other exercises.
Lyrica and Cymbalta and other pain relievers are not the only medications that the FDA considers relevant to fibromyalgia patients. Because comorbidities play a significant role, the FDA also considers sleep medications, antidepressants, muscle relaxants, and antiseizure medications.
But fibromyalgia patients should be approached with the assumption that no single modality alone is sufficient to address the syndrome, says Michael Siegel, MD, corporate vice president and medical director for utilization management and quality improvement at Molina Healthcare. Doctors, he says, should review the several approaches appropriate for the patient, explain each, and guide while at the same time empowering the patient.
Many patients who are managing their illness effectively achieve a moderating of symptoms, says Connie Luedtke, RN, assistant professor of nursing at the Mayo Clinic Pain Rehabilitation Center and nursing supervisor of the Fibromyalgia & Chronic Fatigue Clinic at Mayo Clinic.
Speaking directly to the interests of health plan medical directors, Luedtke urges an attempt to achieve diagnosis at earlier stages of the illness and before patients respond to their symptoms by decreasing exercise and activities and become deconditioned. In her experience it takes an average of five years for primary care doctors to complete this diagnosis. The American College of Rheumatology says fibromyalgia develops in between 2 and 4 percent of the population, predominantly in women. So, prevalence is considerable and providers should be alert to the presence of the illness. Luedtke has treated patients who had to be started on a daily exercise regimen that lasted no more than two minutes!
Fibromyalgia remains challenging for clinicians because it generates tremendous amounts of friction between providers and patients. With fibromyalgia, as with similar conditions, “the biggest problem confronting the interaction between medical providers and patients is that patients fully expect that it will be more or less obvious what the problem is, and that modern medical science will have a way to fix it. And physicians, trapped in the same paradigm, feel they should be able to provide a discrete diagnosis and prescribe a discrete cure,” says Bill Clark, MD, past president of the American Academy on Communication in Healthcare and a lecturer in medicine at Harvard Medical School.
There are no simple answers. Providers have to work carefully to build trust and respect. What is important here is that how to do this can be learned. “Communication and interactional skills can be taught and learned,” says Clark. Modules 26 and 27 of the AACH’s “doc.com” project (http://www.aachonline.org) help physicians provide self-management support.
In most situations, giving emotional support is simple, Clark says. It amounts to actively noticing patients’ emotions and letting patients know you see or hear what they are experiencing, and that their uneasy and negative feelings about fibromyalgia are normal.
To convey emotional support, the academy advises:
Clark says to “Listen carefully and ask the patient to tell you more about what worries him and about how the fibromyalgia has been affecting daily life.” Interrupt if the patient goes on and on; patients want not only your attention, but also your expertise, Clark says. One study found physicians tend to interrupt patients’ opening statement after an average of 18 seconds; a more recent repeat study found that they did so after 23 seconds.
Molina Healthcare educates providers to use a shared decision-making model, says Siegel. Rather than saying, “This is what you should do,” the physician reviews the findings of the examination and test results with the patient and informs the patient about options. After a full discussion about what treatments might be used, the patient and physician decide together what the next steps are. “Often patients will choose the less aggressive option,” says Siegel.
There is a very high comorbidity of serious depression in fibromyalgia patients, points out Michael Golinkoff, PhD, MBA, chief clinical officer at Aetna Behavioral Health. He says that Aetna routinely screens for depression in these patients.
“Even with patients who do not meet diagnostic criteria for depression, we often find that there are many psychological symptoms present,” says Golinkoff. As fibromyalgia is “very stressing, very debilitating, addressing this suffering is an important part of the treatment plan.” Thus it is critical to refer patients as appropriate to psychotherapy and other associated treatments, Golinkoff says.
Golinkoff recommends the meta-analysis of psychological treatments for fibromyalgia by Julia A. Glombiewski et al in the journal Pain (2010; 151: 280–295), both the text and the incorporated references. This study found cognitive-behavioral therapy was associated with the greatest effect sizes and reports that “meta-analytic integration resulted in a significant but small effect size for short-term pain reduction (Hedges’s g = 0.37, 95 percent confidence interval (CI): 0.27–0.48) and a small-to-medium effect size for long-term pain reduction over an average follow-up phase of 7.4 months (Hedges’s g = 0.47, 95 percent CI: 0.3–0.49). Psychological treatments also proved effective in reducing sleep problems (Hedges’s g = 0.46, 95 percent CI: 0.28–0.64), depression (Hedges’s g = 0.33, 95 percent CI: 0.20–0.45), functional status (Hedges’s g = 0.42, 95 percent CI: 0.25–0.58), and catastrophizing (Hedges’s g = 0.33, 95 percent CI: 0.17–0.49).”
Notably these results are “comparable to those reported for other pain and drug treatments used for this disorder.”
Of course, this meta-analysis provides no statistics on the degree to which psychological treatments helped empower patients for self-management or how increased capacity for self-management affects patients’ frequency of physician visits and use of other expensive resources. However, the possibility that such relationships do exist might be kept in mind.
Group therapy as well as individual therapy has been shown to be effective, Golinkoff says. Siegel also emphasized the value of group sessions “if at all possible,” both because they help patients feel they’re not alone and because they provide an opportunity for them to learn from the experience of other fibromyalgia patients.
When patients receive individual or group therapy, “we find symptoms not necessarily associated with depression but associated directly with fibromyalgia improve,” Golinkoff says.
With some patients with fibromyalgia, biofeedback is extremely useful in helping them relax and become aware of when they are relaxed, Luedtke says. “Patients seem to love that.”
At Mayo Clinic, a 2.5-day program introduces fibromyalgia patients to the syndrome, although staffing is not large enough to enroll any but patients referred internally.
Some patients’ employers provide employee assistance programs and, if this is the case, Golinkoff says, it is often beneficial to integrate the resources of such a program into the care of this patient.
In selecting medications for these patients, it may be prudent to monitor carefully or avoid entirely medications that cause musculoskeletal pain, Siegel says, mentioning statins in particular.
What about fibromyalgia patients who look to unconventional approaches to manage their illness? Aetna offers eligible members who are interested in natural and complementary health care services access to many complementary health care services.
Even more important may be how providers address a patient’s wish to embark on some entirely unscientific activities — avoiding all sugar, for example. Too often the doctor’s stance is: I know the right answer, Clark says, although other approaches may be more productive in the long run. Keep in mind that the situation is not urgent. Focus on staying in a relationship with the patient and building a reserve of trust for the future. Understand the patient’s perspective. Support the patient’s determination to make the best decisions for herself, whether or not this perspective is very helpful — the physician has the expertise to explain there are other options.
Strengthening the physician-patient relationship is not only more effective for the patient but also may be more rewarding — and less frustrating — for the physician over the long run as well, Clark says. “Every patient wants a miracle cure and every doctor wants to prescribe it,” says Clark. “It is one thing to take out an appendix; it is another to wrestle with the symptoms and the mystery of fibromyalgia, an illness that is endlessly distressing for patients.”