It’s now been a full year since the new breach reporting requirements went into effect for HIPAA-covered entities. Although I’ve regularly updated this blog with new incidents revealed on HHS’s web site, it might be useful to look at some statistics for the first year’s worth of reports.
During this period, 166 breaches each affecting 500 or more individuals were reported to HHS. We won’t know how many smaller breaches occurred unless or until HHS reports that figure to Congress at some future date, but for the 166 breaches reported, 4,905,768 patients were affected. Keep in mind that breaches may not have been reported if the entity decided that the incident did not reach the “harm” threshold incorporated in the interim rule. That has since been pulled, and it’s not clear whether there will be a harm threshold in the final rule (there shouldn’t be one). If HHS did not have the “harm” threshold, how many more incidents would we have learned about?
Here are a few statistics to mull over from the 166 cases in the dataset:
- 4 of the incidents involved hacking, affecting 63,000 patients (mean number of patients per incident = 15,750)
- 6 involved improper disposal of PHI, affecting 35,439 (mean = 5,906.5)
- 20 involved loss of PHI, affecting 1,007,576 (mean = 50,378.8). These figures do not include incidents that were reported as “theft, loss” or “loss” in combination with some other threat vector, so they should be interpreted as a low estimate of loss.
- 80 involved theft of PHI, affecting 3,043,292 (mean = 38,041.15). These figures do not include incidents that were reported as “theft, loss” or “theft” in combination with some other threat vector, so they should be interpreted as a low estimate of theft.
- 10 involved unauthorized access, affecting 50,491 (mean = 5,049.1)
- 10 were described as “theft, unauthorized access,” affecting 40,835 (mean = 4,083.5)
- 33 of the breaches involved a business associate, affecting 1,460,980 (mean = 44,272.12)
- 34 involved paper records, affecting 121,106 (mean = 3,561.94). This figure does not include some of the entities involved in a recent case in Massachusetts.
- 43 involved a laptop, accounting for 1,503,370 (mean = 34,962.09)
- 21 involved a desktop computer, affecting 243,365 (mean = 11,588.81)
- 5 additional incidents involved both a desktop and a laptop
- 23 involved a portable electronic device, affecting 1,139,419 (mean = 49,539.96)
- An additional 12 incidents indicated network server as the location of the PHI, affecting 169,656 (mean = 14,138)
Other incidents were coded as “other,” some combination of other events, or other categories such as e-mail disclosures.
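For anyone who wants to recheck the per-incident means in the list above, here is a minimal Python sketch. The incident counts and patient totals are copied from the bullets; the names `categories` and `mean_per_incident` are just labels chosen for this illustration, and the mixed desktop-and-laptop item is left out because no patient total was given for it.

```python
# (number of incidents, patients affected) for each category in the list above.
categories = {
    "hacking": (4, 63_000),
    "improper disposal": (6, 35_439),
    "loss": (20, 1_007_576),
    "theft": (80, 3_043_292),
    "unauthorized access": (10, 50_491),
    "theft, unauthorized access": (10, 40_835),
    "business associate": (33, 1_460_980),
    "paper records": (34, 121_106),
    "laptop": (43, 1_503_370),
    "desktop computer": (21, 243_365),
    "portable electronic device": (23, 1_139_419),
    "network server": (12, 169_656),
}


def mean_per_incident(incidents, patients):
    """Mean number of patients affected per incident, rounded to 2 decimals."""
    return round(patients / incidents, 2)


means = {
    name: mean_per_incident(incidents, patients)
    for name, (incidents, patients) in categories.items()
}

for name, value in means.items():
    print(f"{name}: {value:,.2f}")
```

Note that the categories overlap (a stolen laptop held by a business associate would count under theft, laptop, and business associate), so the rows are not meant to sum to 166 incidents or 4,905,768 patients.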
Viewing the data as above, it appears that somewhat more than half of all reported breaches involved theft, and that theft accounted for over 62% of all patients whose records were involved in reported breaches of unsecured PHI. Loss, which accounted for 12% of all reported incidents, accounted for 21% of all patients affected.
Significantly, the theft vector appears greater than that reported across all sectors in DataLossDB.org for the current year. It’s not immediately obvious why the healthcare sector would have higher rates of theft than other sectors, but one possible explanation is that healthcare facilities are more likely to have computers in areas where there is greater traffic and public access (e.g., nurses’ stations, outpatient clinics).
The data on business associates are also suggestive, as the percentage of incidents involving a business associate (BA) in this dataset is somewhat higher than in DataLossDB.org’s dataset (20% vs. 15%), and the difference in records or individuals is even larger. Breaches involving business associates accounted for 30% of all affected patients in the HHS dataset, but account for only 8% of individuals or records in DataLossDB.org’s dataset.
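The HHS-side percentages quoted above for theft, loss, and business associates can be rechecked from the counts in the list with a short Python sketch. The helper name `share` and the dictionary layout are illustrative only. One assumption is labeled in the code: "more than half" of incidents involving theft seems to require adding the 10 "theft, unauthorized access" incidents to the 80 theft-only ones, since 80 of 166 on its own is a little under half. The DataLossDB.org comparison figures are not recomputed here.

```python
TOTAL_INCIDENTS = 166
TOTAL_PATIENTS = 4_905_768


def share(part, whole):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)


shares = {
    # Theft-only incidents and patients, as listed.
    "theft incidents": share(80, TOTAL_INCIDENTS),
    "theft patients": share(3_043_292, TOTAL_PATIENTS),
    # ASSUMPTION: adds the 10 "theft, unauthorized access" incidents to theft.
    "theft incidents incl. combined": share(80 + 10, TOTAL_INCIDENTS),
    "loss incidents": share(20, TOTAL_INCIDENTS),
    "loss patients": share(1_007_576, TOTAL_PATIENTS),
    "business associate incidents": share(33, TOTAL_INCIDENTS),
    "business associate patients": share(1_460_980, TOTAL_PATIENTS),
}

for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

Because "theft, loss" and other combined codes are excluded from these counts, the theft and loss shares are best read as lower bounds.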
Significantly, but not surprisingly perhaps, stolen laptops contained more unsecured PHI than stolen desktops, suggesting that healthcare entities still need to address adequate security for information on portable devices used either by their employees on or off premises or by their business associates.
Of no small concern to me, not all states are represented in the 166 reports, which came from 38 states plus D.C. Of these reports:
- 19 were from California
- 15 were from Texas
- 14 were from New York
Do we really believe that 12 states have had no reportable breaches for an entire year, or do we suspect that breaches are either not being detected or not being reported?
Seven of the breaches each affected over 100,000 patients. The three largest incidents were:
- 1,222,000 – AvMed (theft)
- 998,442 – Blue Cross Blue Shield TN (theft)
- 800,000 – South Shore Hospital / Archive Data Solutions (loss)
What can we learn from these breach reports? And do we really believe that there are proportionally fewer hacks in the health care sector than in other sectors (4% vs. 14%)? Although it’s certainly possible that hackers or other cybercriminals are less interested in health care databases than in retail or financial sector databases, it’s also possible that the health care sector is not detecting hacks as well. Merchants and financial sector entities are subject to many security compliance requirements and to assessment by others. In light of other studies suggesting very high attack rates on databases, I cannot help but strongly suspect that the health care sector is often just not detecting intrusions in its systems. In the alternative, entities are just not reporting them. In either case, the low rate of reported hacks raises a caution flag in my mind.
What do you make of the data so far?
Update: Adam Shostack has responded to my question on his blog, here.
I agree. The numbers for “hacking” seem suspiciously low. It’s difficult to know exactly what the attacks mean, especially when terms like “theft” are used – is that physical theft of a system or electronic theft of records?
I was just reading the latest report from the Verizon Business Risk Team – it includes data on threat vectors reported by their VERIS framework. The numbers for hacking (including SQL injection, backdoors, etc.) are generally in the mid-20% range, which is more reasonable.
I think the only way these numbers make sense is if either these kinds of attacks are being rolled up into other categories (for example, unauthorized use of credentials being counted under “business associate”) or a lot of the more sophisticated attacks are simply going unnoticed.
Thanks for your thoughts, Geoff. Given how many hacks we saw in 2008 that took a long time to detect (and some are seemingly first being detected in the business sector), one of my nightmare scenarios is that we may have many covered entities who were hacked in 2008 and still haven’t detected it.
The breach reporting form is at http://transparency.cit.nih.gov/breach/index.cfm. I don’t see any guide to clarify how to use some terms such as “theft” or what to do if an incident involved more than one attack (e.g., the insider who exceeds authorized access, downloads info and sells it to others).
When I start getting HHS’s investigatory reports on incidents, maybe it will be clearer to me how they are using the labels in their dataset. For now, I’m just guessing.