Written by: Steve Doig
Investigative reporters at CaliforniaWatch received tips that a large chain of hospitals in California might be systematically gaming the federal Medicare program, which pays for the medical treatment of Americans aged 65 or older. The alleged scam is called upcoding: reporting patients as having more complicated conditions, worth higher reimbursement, than actually existed. But a key source was a union that was fighting with the hospital chain’s management, and the CaliforniaWatch team knew that independent verification was necessary for the story to have credibility.
Luckily, California’s department of health has public records that give very detailed information about each case treated in all the state’s hospitals. The 128 variables include up to 25 diagnosis codes from the "International Statistical Classification of Diseases and Related Health Problems" manual (commonly known as ICD-9) published by the World Health Organization. While patients aren’t identified by name in the data, other variables tell the age of the patient, how the costs are paid and which hospital treated him or her. The reporters realized that with these records, they could see if the hospitals owned by the chain were reporting certain unusual conditions at significantly higher rates than were being seen at other hospitals.
The data sets were large: nearly 4 million records per year, and the reporters wanted to study six years’ worth of records in order to see how patterns changed over time. They ordered the data from the state agency; it arrived on CD-ROMs that were easily copied onto a desktop computer. The reporter doing the actual data analysis used a system called SAS to work with the data. SAS is very powerful (allowing analysis of many millions of records) and is used by many government agencies, including the California health department, but it is expensive. The same kind of analysis could have been done using any of a variety of other database tools, such as Microsoft Access or the open-source MySQL.
With the data in hand and the programs written to study it, finding suspicious patterns was relatively simple. For example, one allegation was that the chain was reporting various degrees of malnutrition at much higher rates than were seen at other hospitals. Using SAS, the data analyst extracted frequency tables that showed the numbers of malnutrition cases being reported each year by each of California’s more than 300 acute care hospitals. The raw frequency tables then were imported into Microsoft Excel for closer inspection of the patterns for each hospital; Excel’s ability to sort, filter and calculate rates from the raw numbers made seeing the patterns easy.
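The frequency-table step can be sketched in a few lines of Python. This is not the SAS program the analyst used, only an illustration of the same idea under stated assumptions: the record layout, the hospital names and the diagnosis code "MAL" are all hypothetical placeholders, and real ICD-9 codes would take their place.

```python
from collections import Counter

# Hypothetical discharge records: (hospital, year, diagnosis codes).
# Names and codes are invented placeholders, not real data.
records = [
    ("Hospital A", 2008, ["X1", "MAL"]),
    ("Hospital A", 2008, ["X2"]),
    ("Hospital A", 2008, ["MAL", "X3"]),
    ("Hospital A", 2008, ["X1"]),
    ("Hospital B", 2008, ["X1"]),
    ("Hospital B", 2008, ["X2", "X3"]),
    ("Hospital B", 2008, ["MAL"]),
    ("Hospital B", 2008, ["X4"]),
    ("Hospital B", 2008, ["X4"]),
]

# Placeholder set standing in for the malnutrition diagnosis codes.
TARGET_CODES = {"MAL"}


def frequency_table(records, target_codes):
    """Count flagged cases and all cases per (hospital, year).

    Returns {(hospital, year): (flagged_cases, total_cases, rate)}.
    """
    total = Counter()
    flagged = Counter()
    for hospital, year, codes in records:
        key = (hospital, year)
        total[key] += 1
        # A record counts once if any of its codes is a target code.
        if target_codes.intersection(codes):
            flagged[key] += 1
    return {
        key: (flagged[key], total[key], flagged[key] / total[key])
        for key in total
    }


table = frequency_table(records, TARGET_CODES)

# Sort by rate, highest first, as one might do in a spreadsheet.
for (hospital, year), (cases, n, rate) in sorted(
    table.items(), key=lambda item: item[1][2], reverse=True
):
    print(f"{hospital} {year}: {cases} of {n} cases ({rate:.1%})")
```

Counting each record at most once, even when several of its diagnosis fields match, keeps the rate interpretable as a share of cases rather than a share of codes.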
Particularly striking were reports of a condition called Kwashiorkor, a protein deficiency syndrome that is seen almost exclusively in starving infants in famine-afflicted developing countries. Yet the chain was reporting that its hospitals were diagnosing Kwashiorkor among elderly Californians at rates as much as 70 times higher than the state average for all hospitals.
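A comparison of this kind comes down to dividing each hospital's rate by a statewide rate. The sketch below assumes the statewide figure is the pooled rate (all flagged cases divided by all cases across hospitals); the original analysis may have defined the state average differently. The hospital names and counts are invented for illustration and are not the figures from the stories.

```python
def rate_ratios(counts):
    """Compare each hospital's rate with the pooled statewide rate.

    counts maps hospital -> (flagged_cases, total_cases).
    Returns hospital -> hospital_rate / statewide_rate.
    """
    state_cases = sum(cases for cases, _ in counts.values())
    state_total = sum(total for _, total in counts.values())
    if state_total == 0 or state_cases == 0:
        raise ValueError("statewide rate is zero or undefined")
    state_rate = state_cases / state_total
    return {
        hospital: (cases / total) / state_rate
        for hospital, (cases, total) in counts.items()
        if total > 0  # skip hospitals with no cases at all
    }


# Hypothetical counts: (flagged cases, total cases) per hospital.
counts = {
    "Hospital A": (30, 1000),
    "Hospital B": (1, 1000),
    "Hospital C": (1, 2000),
}

ratios = rate_ratios(counts)
for hospital, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{hospital}: {ratio:.2f} times the statewide rate")
```

Because a large hospital's own cases are included in the pooled rate, a ratio computed this way slightly understates how far that hospital departs from everyone else; excluding the hospital under study from the baseline is a reasonable alternative.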
For other stories, the analysis used similar techniques to examine the reported rates of conditions like septicemia, encephalopathy, malignant hypertension and autonomic nerve disorder. And another analysis looked at allegations that the chain was admitting from its emergency rooms into hospital care unusually high percentages of Medicare patients, whose source of payment for hospital care is more certain than is the case for many other emergency room patients.
To summarize, stories like these become possible when you use data to produce evidence that independently tests allegations made by sources who may have their own agendas. These stories also are a good example of the necessity for strong public records laws; the reason the government requires hospitals to report this data is so that these kinds of analyses can be done, whether by government, academics, investigators or even citizen journalists. The subject of these stories is important because it examines whether millions of dollars of public money are being spent properly.