Never Heard of People Data Labs? They’ve Heard of You, Likely Have Your Data and It Was Exposed in a Breach


People Data Labs (PDL), a data scraping and selling “enrichment” company, boasts that they have data on 1.5 billion people. That data, stored on an Elasticsearch server, was exposed in a breach in November. To put this in perspective, the total population of the United States is 330,130,233; PDL says they have data on more than four times that many people, and it was stored in 4 billion records, belonging to at least 1.2 billion people, on that compromised Elasticsearch server.


Here’s what People Data Labs says about themselves:

“People Data Labs builds people data. Use our dataset of 1.5 billion unique person profiles to build products, enrich person profiles, power predictive modeling/AI, analysis, and more. We work with technical teams as their engineering focused people data partner.”

And here is what they have and will sell to anyone who pays them, according to them:

  • Over 1.5 Billion unique people, including close to 260 million in the US.
  • Over 1 billion personal email addresses. Work email for 70%+ decision makers in the US, UK, and Canada.
  • Over 420 million LinkedIn URLs
  • Over 1 billion Facebook URLs and IDs.
  • 400 million+ phone numbers. 200 million+ US-based valid cell phone numbers.
Credit (if you can call it that): People Data Labs

This data was found on an Elasticsearch server. Elasticsearch allows you to “store, search, and analyze” data. The Elasticsearch server, holding those 4 billion records, was completely open and unsecured – no password or other authentication was needed to access any of that data.
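If you run an Elasticsearch server of your own, here is a rough sketch of how you might check whether it has the same problem. It sends one request with no credentials and looks at the HTTP status that comes back. The function names `classify_status` and `probe` and the `localhost` address are made up for illustration, and 9200 is only Elasticsearch's default HTTP port, so adjust it to match your setup. Only point this at servers you administer.

```python
import urllib.error
import urllib.request


def classify_status(status: int) -> str:
    """Turn the HTTP status of a credential-free request into a verdict."""
    if status in (401, 403):
        return "protected"      # the server asked for credentials or refused access
    if 200 <= status < 300:
        return "open"           # the server answered without any credentials
    return "inconclusive"       # anything else needs a closer look by hand


def probe(base_url: str, timeout: float = 5.0) -> str:
    """Send a single GET with no credentials to a server you administer."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return classify_status(resp.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx/5xx responses; the code is still useful.
        return classify_status(err.code)
    except (urllib.error.URLError, OSError):
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical address: replace with a server you own.
    print(probe("http://localhost:9200/"))
```

A result of "open" means anyone who can reach that address can read from the server without logging in, which is the situation described above. A result of "protected" only tells you that this one request was turned away; it is not a full security audit.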

HaveIBeenPwned Notification of PDL and Elasticsearch Breach

All of this was discovered by security researcher Vinny Troia, over at Data Viper. According to Troia, “The discovered Elasticsearch server containing all of the information was unprotected and accessible via web browser at [address omitted]. No password or authentication of any kind was needed to access or download all of the data.”

To further complicate things, the Elasticsearch server was set up in Google Cloud.

But wait, there’s more: Troia discovered the data of a second “data enrichment” company on that same Elasticsearch server. That company is OxyData, which offers “In-Depth Data on People and Companies”, boasting “Finally, an easy way to get business data whenever you need it.”

Now, here is where it gets even more interesting. Both PDL and OxyData deny ownership of the server. That means, as Troia deduces, that either the data was stolen from both companies and stashed on the Elasticsearch server, or a customer of one or both companies misappropriated it, and stored it on the Elasticsearch server, completely unprotected.


So, where does the liability rest for this breach? PDL? They say it isn’t their server, and that they didn’t put their data there. Ditto OxyData. Is Elasticsearch responsible? Google, for hosting the Elasticsearch server?

We’ll probably never know, and it’s likely that, at the end of the day, nobody will be held accountable. And this is why lawmakers have introduced the Federal Online Privacy Act (finally), and why so many states have introduced their own online privacy laws in the meantime.
