Is it wrong to collect everyone’s public information and put it in one place?
At the very least, it’s extremely creepy. On Friday, two security researchers reported uncovering a 4-terabyte database that held records on a whopping 1.2 billion people. The database was also openly exposed on the internet without any security.
“The leaked data contained names, email addresses, phone numbers, LinkedIn and Facebook profile information,” the researcher Vinny Troia wrote in a blog post about the findings.
Troia goes on to say the finding is “one of the largest data leaks from a single source organization in history.” However, the exposed data wasn’t exactly private information; a lot of it was actually scraped from the internet.
After uncovering the database, Troia and security researcher Bob Diachenko traced the trove of information back to two companies, People Data Labs (PDL) and Oxydata, which specialize in analytics and marketing. Collectively, they also hold demographic details on more than a billion people.
For example, PDL boasts of possessing more than 400 million phone numbers and a billion personal email addresses. In some cases, PDL scraped the data from the web and social media profiles. In other cases, it bought the information from third-party data brokers, which can specialize in collecting people’s contact data from sources like public records or surveys.
Both companies have been offering access to the records to help businesses connect with prospective clients. Oxydata claims it even knows people’s education and employment histories. But what happens if that same data falls into the wrong hands?
That’s why Diachenko and Troia find the 4TB database so disturbing. Sure, it may have been created for marketing purposes. But the information has been gathered in a way that makes it all too easy for someone to search through and look up a person’s detailed profile over a span of years. For instance, Troia mentions that the database contained a landline phone number that AT&T apparently registered in his name as part of a TV bundle a decade ago.
As far as the researchers can tell, the database wasn’t owned by either PDL or Oxydata, but by an anonymous user. For whatever reason, the mysterious actor was pulling the personal records from both companies and then storing them on what turned out to be an Elasticsearch server hosted on Google Cloud.
“If this was a customer that had normal access to PDL’s data, then it would indicate the data was not actually ‘stolen’, but rather misused,” Troia wrote. “This unfortunately does not ease the troubles of any of the 1.2 billion people who had their information exposed.”
The good news is that the exposed records are no longer online. Troia told Wired the database was shut down after he notified the FBI about its existence. Nevertheless, the incident underscores how easily people’s personal data can circulate over the internet without any safeguards in place. In the past, both Troia and Diachenko have uncovered repeated cases of companies accidentally exposing online databases holding troves of customer information.
“If this was not a breach, then who is accountable for this exposure?” Troia asks in his write-up.
Neither PDL nor Oxydata immediately responded to a request for comment.