Opinion | Stanford Study: It Is Trivially Easy to Identify People With Metadata

When the NSA's bulk collection of every single American's phone records was disclosed this past summer, defenders of the program argued it was not invasive surveillance because it's only metadata (who you called, when, and for how long) and doesn't include the identity of the callers or the content of the conversation. "There are no names, there's no content in that database," Obama said in June.

A new study at Stanford University has just ripped that argument to shreds.

Stanford computer scientists Jonathan Mayer and Patrick Mutchler found that it is "trivially" easy to determine the identity of callers if all you have is metadata. They write about their research in a blog post:

So, just how easy is it to identify a phone number?
Trivial, we found. We randomly sampled 5,000 numbers from our crowdsourced MetaPhone dataset and queried the Yelp, Google Places, and Facebook directories. With little marginal effort and just those three sources--all free and public--we matched 1,356 (27.1%) of the numbers. Specifically, there were 378 hits (7.6%) on Yelp, 684 (13.7%) on Google Places, and 618 (12.3%) on Facebook.
What about if an organization were willing to put in some manpower? To conservatively approximate human analysis, we randomly sampled 100 numbers from our dataset, then ran Google searches on each. In under an hour, we were able to associate an individual or a business with 60 of the 100 numbers. When we added in our three initial sources, we were up to 73.
How about if money were no object? We don't have the budget or credentials to access a premium data aggregator, so we ran our 100 numbers with Intelius, a cheap consumer-oriented service. 74 matched.¹ Between Intelius, Google search, and our three initial sources, we associated a name with 91 of the 100 numbers.
If a few academic researchers can get this far this quickly, it's difficult to believe the NSA would have any trouble identifying the overwhelming majority of American phone numbers.
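The arithmetic behind those figures is easy to check. Here is a minimal Python sketch that uses only the counts quoted above; the variable names are illustrative and are not taken from the researchers' own code. It recomputes the overall match rate for the 5,000-number sample, shows that the per-directory hits add up to more than the number of unique matches (so some numbers turned up in more than one directory), and tallies how much each extra step added in the 100-number sample.

```python
# Counts quoted from the blog post; the names here are illustrative only.
SAMPLE_SIZE = 5000
directory_hits = {"Yelp": 378, "Google Places": 684, "Facebook": 618}
unique_matches = 1356  # numbers matched by at least one directory

# Overall match rate for the three free directories, as a percentage.
overall_rate = round(100 * unique_matches / SAMPLE_SIZE, 1)

# The per-directory hits sum to more than the unique matches, so the
# surplus counts hits on numbers already found in another directory.
surplus_hits = sum(directory_hits.values()) - unique_matches

# Cumulative matches in the 100-number sample as each step was added:
# Google searches alone, plus the three directories, plus Intelius.
cumulative = [60, 73, 91]
gains = [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]

print(f"Overall match rate: {overall_rate}%")
print(f"Hits duplicated across directories: {surplus_hits}")
print(f"Numbers newly identified at each step: {gains}")
```

In other words, each additional source kept paying off: the paid Intelius lookup identified 18 numbers that free searching had missed.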

This shouldn't be too surprising to anyone who has been paying attention. When the Snowden leaks broke, NSA whistleblower William Binney took issue with arguments like Obama's that metadata isn't revealing. Binney said collecting metadata can be more revealing than listening in on the content of phone calls.

This study is just the latest in a long line of definitive knock-downs of pro-NSA arguments. The transparency that Snowden's leaks have imposed on the government and its defenders has mortally embarrassed them and allowed each of their arguments - which we would otherwise have had to take on their word - to be disproven.

That is true in general, but it is especially true of the metadata program. The disclosure of this program proved James Clapper, the Director of National Intelligence, to be a bald-faced liar, given that he testified to Congress that no such program existed. Then NSA chief Gen. Keith Alexander said the metadata program foiled 54 terrorist plots, a justification that was later proven (and admitted by Alexander) to be completely false. Then they said it was perfectly legal, until we learned that a 2011 FISC ruling found that the NSA "frequently and systematically violated" statutory laws restricting how intelligence agents can search databases of Americans' telephone communications. To add to that, a federal judge essentially ruled it unconstitutional. And now we discover that their "metadata-isn't-really-invasive" argument is also baloney.

Before Edward Snowden, NSA overreach was, to borrow a phrase, an unknown unknown. After Edward Snowden, they have to lie about it...repeatedly...apparently without a whiff of shame.

Our work is licensed under Creative Commons (CC BY-NC-ND 3.0). Feel free to republish and share widely.
