De-anonymizing Social Networks

March 19, 2009 at 11:09 am

Our social networks paper is finally officially out! It will be appearing at this year’s IEEE S&P (Oakland).

Download: PDF | PS | HTML

Please read the FAQ about the paper.

Abstract:

Operators of online social networks are increasingly sharing potentially sensitive information about users and their relationships with advertisers, application developers, and data-mining researchers. Privacy is typically protected by anonymization, i.e., removing names, addresses, etc.

We present a framework for analyzing privacy and anonymity in social networks and develop a new re-identification algorithm targeting anonymized social-network graphs. To demonstrate its effectiveness on real-world networks, we show that a third of the users who can be verified to have accounts on both Twitter, a popular microblogging service, and Flickr, an online photo-sharing site, can be re-identified in the anonymous Twitter graph with only a 12% error rate.

Our de-anonymization algorithm is based purely on the network topology, does not require creation of a large number of dummy “sybil” nodes, is robust to noise and all existing defenses, and works even when the overlap between the target network and the adversary’s auxiliary information is small.
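
To give a feel for what matching "based purely on the network topology" can mean, here is a toy sketch. It is not the algorithm from the paper: it is a simplified greedy seed-and-propagate matcher written for illustration, and the function name `propagate`, the `min_margin` parameter, and the tiny example graphs are all assumptions. It starts from a few known correspondences between the auxiliary graph and the anonymized target graph, then repeatedly maps a node when one candidate is supported by clearly more already-mapped neighbours than any other.

```python
from collections import defaultdict


def propagate(aux, target, seeds, min_margin=1):
    """Greedy topology-only matching of nodes in `aux` to nodes in `target`.

    aux, target: dict mapping node -> set of neighbour nodes (undirected).
    seeds: dict mapping a few aux nodes to their known target counterparts.
    min_margin: how far the best candidate must beat the runner-up.
    (Hypothetical helper for illustration; not the paper's algorithm.)
    """
    mapping = dict(seeds)
    changed = True
    while changed:
        changed = False
        used = set(mapping.values())
        for u in aux:
            if u in mapping:
                continue
            # Score each unused target node by how many of u's already-mapped
            # neighbours are mapped to one of that node's neighbours.
            scores = defaultdict(int)
            for nb in aux[u]:
                if nb in mapping:
                    for cand in target[mapping[nb]]:
                        if cand not in used:
                            scores[cand] += 1
            if not scores:
                continue
            ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
            best, best_score = ranked[0]
            runner_up = ranked[1][1] if len(ranked) > 1 else 0
            # Only accept a match that is unambiguous; ties are left unmapped.
            if best_score - runner_up >= min_margin:
                mapping[u] = best
                used.add(best)
                changed = True
    return mapping


# Made-up example: the same five-node graph under two different labelings.
aux = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "e"},
    "d": {"b", "e"},
    "e": {"c", "d"},
}
target = {
    1: {2, 3},
    2: {1, 3, 4},
    3: {1, 2, 5},
    4: {2, 5},
    5: {3, 4},
}
result = propagate(aux, target, seeds={"a": 1, "b": 2})
print(result)
```

On this example the two seeds are enough to recover the remaining three nodes. Real graphs are noisy and only partially overlapping, so a practical attack needs considerably more care than this sketch shows.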

The HTML version was produced using my Project Luther software, which, in my opinion, produces much prettier output than anything else, especially for math formulas. Another big benefit is its handling of citations: it automatically searches various bibliographic databases, adds abstract, BibTeX, and download links, and even finds and links author homepages in the bibliography entries.

I have never formally announced or released Luther; it needs more work before it can be generally usable, and my time is limited. Drop me a line if you’re interested in using it.

Entry filed under: Uncategorized. Tags: anonymity, privacy, re-identification, social networks.


18 Comments

1. Researchers can ID anonymous Twitterers | March 27, 2009 at 11:21 am

[…] look at the way anonymous data can be analyzed and have come to some troubling conclusions. In a paper set to be delivered at an upcoming security conference, they showed how they were able to map out […]
2. View From Planet Jamie » Blog Archive » De-anonymizing Social Networks: Anonymity Is Not Enough | March 27, 2009 at 12:31 pm

[…] their paper “De-Anonymising Social Networks“, Arvind Narayanan and Dr Vitaly Shmatikov from the University Of Texas at Austin present a […]
3. riot | March 27, 2009 at 2:24 pm

Somehow i’m not that much impressed. I could do that with multiple jabber accounts 3 years ago.

Well you didn’t patent it, so i still have to say: nice work ;)
- 4. Arvind | March 27, 2009 at 2:28 pm
  
  I suspect you haven’t read the paper :-)
5. riot | March 27, 2009 at 2:42 pm

Wee, i came here from /. so what did you expect? ;)

Nah, i read parts of it and it seems quite similar. Yet you’re right, it does something different.. but imho the basic concepts are really quite similar. Gonna read the full paper when i have time this evening.
Are you going to hold a lecture at some european conference? Like 26c3 or HAR2009 or something?
6. Arvind | March 27, 2009 at 2:50 pm

The underlying concept is relatively simple, but the hard part was to pull it off at scale, in a fully automated way, with very noisy data.

Re. conferences, that all depends on visas and funding :-)
7. Privacy Lives » Blog Archive » IDG News: Researchers Can ID Anonymous Twitterers | March 27, 2009 at 6:31 pm

[…] look at the way anonymous data can be analyzed and have come to some troubling conclusions. In a paper set to be delivered at an upcoming security conference, they showed how they were able to map out […]
8. New Study Shows Anonymous Data Isn’t Very Anonymous At All | SolidWebs | March 27, 2009 at 10:21 pm

[…] a real person. It looks like there’s now some research to support that. Steven Hoy points us to a new paper where some researchers wrote an algorithm that takes anonymized data from social networks and […]
9. Nowinki » New Study Shows Anonymous Data Isn’t Very Anonymous At All | March 27, 2009 at 10:28 pm

[…] a real person. It looks like there’s now some research to support that. Steven Hoy points us to a new paper where some researchers wrote an algorithm that takes anonymized data from social networks and […]
10. Games » Blog Archive » New Study Shows Anonymous Data Isn’t Very Anonymous At All | March 30, 2009 at 1:36 am

[…] person. It looks like there’s now some research to support that. Steven Hoy points us to a new paper where some researchers wrote an algorithm that takes anonymized data from social networks and […]
11. staycek | March 30, 2009 at 12:47 pm

I enjoyed your paper, I’m grateful for your research, and I think it is important to raise awareness that anonymity /= privacy; however I personally do not find this shocking nor do I perceive any personal threat.

I realize the threats you cited are all plausible scenarios for large-scale attacks, but I fail to see the personal threat. If my birthday, gender or relationship status were accidentally shared with strangers, I would not care in the slightest. If it was truly personal, I would never have published it on Facebook, even if I trusted their privacy policy.
12. Anonymity and studies of social networks | Population of One | March 30, 2009 at 2:35 pm

[…] They’re actually looking at something that is an issue that is a lot more delicate: are anonymised data from social networks truly anonymous? Operators of online social networks are increasingly sharing potentially sensitive information […]
13. Privacy Value Networks » Blog Archive » The limits of anonymisation | March 30, 2009 at 2:38 pm

[…] Narayanan and Dr Vitaly Shmatikov (University of Texas at Austin) have a fascinating new paper on the impact of social networks on the anonymisation of personal data (thanks, Mo!): Operators of […]
14. Socal Networks in the News | March 30, 2009 at 3:56 pm

[…] De-anonymizing Social Networks – Arvind Narayanan & Vitaly Shmatikov (http://33bits.org/2009/03/19/de-anonymizing-social-networks/) […]
15. Your Morning Commute is Unique: On the Anonymity of Home/Work Location Pairs « 33 Bits of Entropy | May 13, 2009 at 6:42 am

[…] trail of the user/vehicle unknown even to the service provider — unlike in the context of social networks, people often don’t even trust the service provider. There are several papers on anonymizing […]
16. Graduation and plans « 33 Bits of Entropy | May 20, 2009 at 6:36 am

[…] presented the social network de-anonymization paper at the S&P conference today at Oakland. Email me for the […]
17. Livejournal done right: the case for a social network with built-in privacy « 33 Bits of Entropy | September 9, 2009 at 11:52 am

[…] on my work on de-anonymizing social networks with Shmatikov, and other research such as Bonneau & Preibusch’s survey of the dismal […]
18. Anonymous Data? « Virtual Shadows | January 25, 2010 at 9:22 am

[…] logs of over a half million of their users (here) and in 2009 by researchers in social networks (here). Stripping personal identifiable information such as usernames from data sets is an insufficient […]

33 Bits of Entropy
