The UpGuard Cyber Risk Team can now disclose that sensitive data from the Los Angeles County 211 service, a nonprofit assistance organization described on their website as “the central source for providing information and referrals for all health and human services in LA County,” was publicly exposed online.
The contents of the downloadable files include access credentials for those operating the 211 system, email addresses for contacts and registered resources of LA County 211, and most troubling, detailed call notes. These notes describe the reason for the calls, including personally identifying information for people reporting the problem, persons in need, and, where applicable, their reported abusers. Included in the more than 3 million rows of call logs are 200,000 rows of detailed notes, including graphic descriptions of elder abuse, child abuse, and suicidal distress, raising serious, large-scale privacy concerns. In many of these cases, full names, phone numbers, addresses, and even 33,000 instances of full Social Security numbers are revealed among the data.
This information was stored in an Amazon Web Services (AWS) S3 bucket configured to be publicly and anonymously accessible. Though some of the files in the bucket were not publicly downloadable, those that were included Postgres database backups and CSV exports of that data, with hundreds of thousands of rows of sensitive personal information. Despite 211’s dedication to preserving the confidentiality of reports, a technical misconfiguration (in this case, an inadvertently public cloud storage instance) exposed not only email addresses and weakly hashed passwords for LA County 211 employees, but also six years of highly sensitive call logs regarding some of the most vulnerable people in LA County.
The Discovery
On March 14, 2018, the UpGuard Cyber Risk team discovered an Amazon Web Services S3 cloud storage bucket located at the subdomain “lacounty.” After initial analysis revealed the sensitive nature of the information inside, the team began notification efforts immediately, calling LA County 211 and emailing the recommended contact. Those efforts ultimately succeeded on April 24, 2018, when the team reached a member of the information security staff. Our contact at LA County 211 assured us the problem would be taken care of, and in less than 24 hours, UpGuard confirmed the bucket itself was no longer publicly accessible.
Amazon S3 access rules can be set for both the bucket as a whole and for the files within it. In the case of the “lacounty” bucket, permission settings allowed anyone to list the contents; some of the files inside, however, had additional rules preventing public users from downloading them. Other files had no such rules and were publicly downloadable, including the Postgres database backup and CSV exports containing the call records. Such combinations of permission levels can become convoluted quickly, which helps explain why misconfigurations, such as those arising from complex security rules, are the leading cause of breaches and outages for users of cloud services.
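To make the bucket-versus-file distinction concrete, here is a minimal Python sketch that inspects access control lists (ACLs) shaped like the ones S3 returns. It is an illustration under stated assumptions, not a reconstruction of the “lacounty” bucket's real configuration: the sample ACLs and object names are hypothetical, and the sketch looks only at ACL grants, ignoring bucket policies and other controls that also affect access. The `AllUsers` group URI is the identifier S3 uses for anonymous access.

```python
# URI that S3 ACLs use to denote "anyone on the internet".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"

# Permissions that let an anonymous user read (list a bucket / fetch an object).
READ_LIKE = {"READ", "FULL_CONTROL"}


def public_permissions(acl):
    """Return the set of permissions an ACL grants to anonymous users."""
    return {
        grant["Permission"]
        for grant in acl.get("Grants", [])
        if grant.get("Grantee", {}).get("URI") == ALL_USERS
    }


def classify(bucket_acl, object_acls):
    """Report whether the bucket is publicly listable and which objects
    are publicly downloadable, judged from ACL grants alone."""
    listable = bool(public_permissions(bucket_acl) & READ_LIKE)
    downloadable = sorted(
        key
        for key, acl in object_acls.items()
        if public_permissions(acl) & READ_LIKE
    )
    return listable, downloadable


# Hypothetical sample data: a listable bucket holding one private
# object and one publicly readable object.
SAMPLE_BUCKET_ACL = {
    "Grants": [
        {"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "READ"},
    ]
}
SAMPLE_OBJECT_ACLS = {
    "example-private.csv": {
        "Grants": [
            {"Grantee": {"Type": "CanonicalUser", "ID": "owner-id"},
             "Permission": "FULL_CONTROL"},
        ]
    },
    "example-public.csv": {
        "Grants": [
            {"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "READ"},
        ]
    },
}

if __name__ == "__main__":
    listable, downloadable = classify(SAMPLE_BUCKET_ACL, SAMPLE_OBJECT_ACLS)
    print("Bucket publicly listable:", listable)
    print("Publicly downloadable objects:", downloadable)
```

In this sample, anyone can enumerate both file names, but only one of the two files can actually be fetched anonymously, which mirrors the mixed state described above.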
The Contents
Several CSV files found within the bucket contain personal information critical to the operation of the LA County 211 service. In “users.csv”, the names, email addresses, and hashed passwords for 384 users were exposed, with 153 marked as active. Almost all of the email addresses were at the @211LA.org domain. The passwords were hashed with MD5, an algorithm that is considered weak relative to modern computing power and security standards, and one for which many hashes have already been cracked, negating the protection entirely. If the hashes were cracked, these passwords would not only make 211LA.org accounts vulnerable, but would also open individuals up to attacks on other platforms if they have reused their passwords, as many people do. The other contents of the bucket indicate that LA County 211 uses remote desktop applications to administer its resources, meaning that usernames and passwords compromised from this public file could potentially be used to remotely access other systems and obtain further data.
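The weakness of MD5 for password storage can be shown in a few lines of Python. The sketch below assumes unsalted MD5, which the exposed file's description does not confirm, and uses a made-up three-word list in place of the very large lists real attackers use. It then contrasts this with a salted, deliberately slow derivation using the standard library's `hashlib.pbkdf2_hmac`. The function names and the iteration count are illustrative choices, not anything used by LA County 211.

```python
import hashlib
import hmac
import secrets


def md5_hex(password):
    """Unsalted MD5 digest: fast to compute, and so fast to guess against."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def crack_md5(target_hash, wordlist):
    """Dictionary attack: hash each candidate and compare with the target."""
    for candidate in wordlist:
        if md5_hex(candidate) == target_hash:
            return candidate
    return None


def hash_password(password, iterations=600_000):
    """Salted PBKDF2-HMAC-SHA256. The iteration count is an illustrative
    default; a random per-user salt makes identical passwords hash differently."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, iterations, digest


def verify_password(password, salt, iterations, digest):
    """Recompute the derivation and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, digest)


if __name__ == "__main__":
    leaked = md5_hex("sunshine")  # stand-in for a hash found in a leaked file
    print("Recovered:", crack_md5(leaked, ["password", "123456", "sunshine"]))

    salt, iterations, digest = hash_password("sunshine")
    print("Verifies:", verify_password("sunshine", salt, iterations, digest))
```

Because unsalted MD5 always maps the same password to the same digest, one precomputed table serves against every account at once; the per-user salt and high iteration count in the second approach remove that shortcut and make each guess far more expensive.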
However, the bulk of the find is contained in a 1.3GB CSV file titled “t_contact.” This file, exported from a Postgres table of the same name, contains a massive amount of personally identifiable information (PII), including the call notes themselves, for over 200,000 calls logged between 2010 and 2016.
Among the data scattered throughout this table were the following:
3,500,000 Total Records
396,000 Contact Emails
200,000 Detailed Call Notes
33,000 Social Security Numbers
These records contained, where available, data compiled in dozens of fields, revealing a breadth of information about many of the individuals involved in the call incidents.
Also included in some records were the full names of people in need, caregivers, reporters, and sometimes even abusers.
Relevant home addresses, phone numbers, and birth dates were included in many reports, as well as what relationship the person reporting the incident had to the person in need.
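An organization holding a similar export could audit it for identifiers of this kind before the file is ever placed in shared storage. The following Python sketch counts Social Security number-shaped strings in CSV text. It is a rough illustration only: the column names and rows are synthetic, the pattern matches just the dashed `NNN-NN-NNNN` format, and it will both miss other formats and flag lookalike numbers.

```python
import csv
import io
import re

# Matches the common dashed SSN layout only (an assumption of this sketch).
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def count_ssn_like(csv_text):
    """Return (rows containing a match, total matches) across all fields."""
    rows_with_hits = 0
    total_hits = 0
    for row in csv.DictReader(io.StringIO(csv_text)):
        hits = sum(
            len(SSN_PATTERN.findall(value))
            for value in row.values()
            if isinstance(value, str)  # skip missing fields
        )
        if hits:
            rows_with_hits += 1
            total_hits += hits
    return rows_with_hits, total_hits


# Synthetic rows; the 000 prefix is never issued as a real SSN.
SAMPLE_CSV = (
    "id,notes\n"
    "1,Caller gave SSN 000-12-3456 for intake\n"
    "2,No identifiers recorded\n"
    "3,Two numbers: 000-98-7654 and 000-11-2222\n"
)

if __name__ == "__main__":
    rows, hits = count_ssn_like(SAMPLE_CSV)
    print(f"{rows} rows contain {hits} SSN-like values")
```

A scan like this is a detection aid rather than a safeguard; its value is in surfacing free-text fields, such as call notes, where identifiers accumulate outside the columns designed to hold them.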
The Significance
LA County 211 is frequently cited as a leader in delivering badly needed services to people in need with few alternatives, effectively organizing and providing assistance to the citizens it serves. According to their website, they “provide over 500,000 people every year with information and referrals to the services that best meet their needs.” LA County 211 is a top-of-the-funnel operation, triaging reports to the appropriate areas and assisting people with the sometimes confusing bureaucracy of getting the right help. This top-of-the-funnel position means that they cast a wide net when it comes to the data they gather. Reports of all types are centralized into a single database. From a functional perspective, this makes sense: centralized and standardized technology makes administration and collaboration faster and easier, and reduces the overhead of multiple systems.
But from a cyber risk perspective, it means that you are creating a crown jewel: a single asset with nearly the value of your entire operation. If this dataset is not carefully handled, the magnitude of exposure is far greater than if it occurred at any of the more specialized links down the triage chain. Furthermore, the specific work done by 211 adds another layer of sensitivity on top of the normal things digital businesses have to worry about, such as user credentials being exploited or systems being compromised. Those could damage the business. But it should be self-evident how the detailed, non-anonymized call records of an emergency, crisis, and abuse hotline could be used to hurt any number of individuals involved. There are few situations that call for greater confidentiality.
When critical social infrastructure like LA County 211 is placed on top of technological infrastructure, it takes on the risks of that technology. For example, the LA County 211 website has a cyber risk score of 608 out of 950, according to the UpGuard Cloud Scanner, meaning that while some data protection is in place, improvements could be made to further harden systems. The public dispersal of the information contained in the LA County 211 files could be extremely damaging to those involved, and the measures taken to protect such information should be proportionate to those repercussions. The obvious and necessary advantages of using centralized databases, cloud hosting, and online storage must be seen alongside the threats they pose to the business being conducted with them; not so that such innovations can be avoided, but so that the risks can be accounted for upfront and controlled as well as possible.
Any loss of trust in a crisis and abuse reporting system will deter people from using it, removing one of the few mechanisms available to people in need. This problem isn’t unique to 211 or to their sector, but a problem facing all organizations using cloud technology and internet applications to store and process their data. This incident highlights the importance of building a resilient digital ecosystem that can provide privacy and reliability as effectively as it does speed and power.
Note: On June 11, 2018, we further redacted the images containing call notes.