Attacking Bitcoin’s anonymity with graph analytics
A new paper by researchers from Harvard and MIT shows that Bitcoin users’ anonymity might be vulnerable. Using the transaction history and open-source information, the researchers applied graph analytics to associate “real” names with transactions.
Bitcoin allows users to store and exchange bitcoins, a virtual currency. Its supposed popularity among criminals has made headlines around the world. Critics claim that criminals can use the anonymity of Bitcoin to conduct their illegal activities safely.
The paradox is that, by nature, Bitcoin is very public. The code of the project is open-source. The entire Bitcoin transaction history is public too: all transactions are recorded in a public ledger, the block chain, and are viewable by everyone. Bitcoin addresses appear in this public history. These addresses, like “3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy”, tell nothing in themselves about the people behind them. The best source of information about a given address is its transaction history. For each address, it is possible to know which addresses it sent bitcoins to, how many and when.
It is very much like having access to someone’s complete bank history, except that all the recipients of the money are indecipherable character strings. If criminals can enjoy anonymity on Bitcoin, it is by hiding in plain sight. Of course, those who want to obfuscate their identity further have a few resources available.
Now researchers from Harvard and MIT claim that it is possible to link Bitcoin addresses to public identities via graph analytics. How is that possible, and does it mean that Bitcoin is not anonymous? To answer these questions, we need to understand how the researchers cracked the anonymity of Bitcoin.
Bitcoin transactions form a network or graph. In this graph Bitcoin addresses are linked together via the transactions they take part in.
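To make the idea concrete, here is a minimal sketch of such a graph in Python. It assumes a simplified model in which each transaction has a single sender and a single recipient (real Bitcoin transactions can have several inputs and outputs), and the addresses and amounts are made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy transactions: (sender, recipient, amount in BTC).
transactions = [
    ("addr_A", "addr_B", 1.5),
    ("addr_A", "addr_C", 0.2),
    ("addr_B", "addr_C", 0.7),
    ("addr_C", "addr_A", 0.1),
]

def build_graph(txs):
    """Return a directed graph: address -> list of (recipient, amount)."""
    graph = defaultdict(list)
    for sender, recipient, amount in txs:
        graph[sender].append((recipient, amount))
        # Make sure addresses that only receive still appear as nodes.
        graph.setdefault(recipient, [])
    return dict(graph)

graph = build_graph(transactions)
for address, outgoing in graph.items():
    print(address, "->", outgoing)
```

Each address is a node and each transaction is a directed edge, so questions like “who did this address pay?” become simple graph lookups.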
That information is public but there is no way to find who is behind the Bitcoin addresses. Except that it is possible to use public information (or open-source intelligence) to tie the addresses to “identities”. For example, on popular Bitcoin forums users tend to add their Bitcoin address to their signature. It is thus possible to tie a forum user account to an address. Similar data is available on social networks like Twitter or Facebook.
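As a rough illustration of how forum signatures could be mined, the sketch below pulls address-like strings out of text. The posts and usernames are hypothetical, and the regular expression is a commonly used approximation for Base58 addresses starting with 1 or 3; it does not verify checksums, so it is not necessarily what the researchers used.

```python
import re

# Approximate pattern for Base58 Bitcoin addresses starting with 1 or 3.
# Assumption: no checksum validation, so false positives are possible.
ADDRESS_PATTERN = re.compile(r"\b[13][a-km-zA-HJ-NP-Z1-9]{25,34}\b")

# Hypothetical scraped forum posts: (username, signature text).
posts = [
    ("alice", "Tips welcome: 3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy thanks!"),
    ("bob", "No address here."),
]

def addresses_by_user(posts):
    """Map each username to the set of address-like strings they posted."""
    found = {}
    for username, text in posts:
        for address in ADDRESS_PATTERN.findall(text):
            found.setdefault(username, set()).add(address)
    return found

links = addresses_by_user(posts)
print(links)
```

Every match becomes a potential link between a public identity and a node of the transaction graph.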
Actually, even without mentioning their Bitcoin address, someone may leak enough information to be tied to it. Let’s say you overhear someone say “hey, I’ll send you $100 today at noon”. The researchers demonstrate that it is possible to identify potential transactions matching this information. That means that the innocuous overheard sentence may leak the identity of two people.
Scraping websites and correlating transactions makes it possible to enrich the Bitcoin graph. Instead of just looking at a set of addresses, we can tie in the public identities used unsafely by Bitcoin users. These identities are not necessarily easy to match with “real” persons though. For non-transparent identities, one would have to investigate further to find the name or location of the person using the identity. That practice is called doxing and can be effective.
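A minimal sketch of that kind of matching could look like the following. The ledger entries, the exchange rate, the time window and the tolerance are all made-up assumptions for illustration, not values from the paper.

```python
from datetime import datetime, timedelta

# Hypothetical ledger entries: (sender, recipient, amount in BTC, timestamp).
ledger = [
    ("addr_A", "addr_B", 0.50, datetime(2013, 11, 1, 12, 3)),
    ("addr_C", "addr_D", 0.50, datetime(2013, 11, 1, 18, 40)),
    ("addr_E", "addr_F", 2.00, datetime(2013, 11, 1, 12, 1)),
]

def candidate_transactions(ledger, usd_amount, usd_per_btc, around, window,
                           tolerance=0.05):
    """Return (sender, recipient) pairs close to the overheard amount and time."""
    target_btc = usd_amount / usd_per_btc
    matches = []
    for sender, recipient, amount, when in ledger:
        close_in_value = abs(amount - target_btc) <= tolerance * target_btc
        close_in_time = abs(when - around) <= window
        if close_in_value and close_in_time:
            matches.append((sender, recipient))
    return matches

# "$100 today at noon", with an assumed rate of 200 USD per bitcoin.
matches = candidate_transactions(
    ledger,
    usd_amount=100,
    usd_per_btc=200.0,
    around=datetime(2013, 11, 1, 12, 0),
    window=timedelta(minutes=30),
)
print(matches)
```

On a real ledger many transactions may fit, so the result is a list of candidates to narrow down rather than a definite answer.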
All of this doesn’t mean Bitcoin is not anonymous. People who are tying themselves publicly to Bitcoin addresses are less anonymous than the average Bitcoin user though.
Another interesting finding is that Bitcoin is similar to the web. Both can be modeled as graphs. With Bitcoin we have addresses and transactions, with the web it is websites and hypertext links. As a consequence, the tools used to analyse the web can be used for Bitcoin. The PageRank algorithm helped Google credibly assess the authority of different websites on topics. Michael Fleder, Michael S. Kester and Sudeep Pillai applied it to Bitcoin.
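For readers who want to see what PageRank does on a transaction graph, here is a minimal power-iteration sketch. The toy graph, the damping factor of 0.85 and the iteration count are assumptions for illustration; the paper’s exact setup may differ.

```python
def pagerank(graph, damping=0.85, iterations=100):
    """Basic PageRank by power iteration.

    graph maps each address to the list of addresses it sent bitcoins to.
    Every recipient must also appear as a key in graph.
    """
    nodes = list(graph)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by addresses with no outgoing edge is spread evenly.
        dangling = sum(rank[node] for node in nodes if not graph[node])
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {node: base for node in nodes}
        for node in nodes:
            targets = graph[node]
            if targets:
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank

# Hypothetical toy graph: three addresses all pay into "addr_hub".
graph = {
    "addr_A": ["addr_hub"],
    "addr_B": ["addr_hub"],
    "addr_C": ["addr_hub"],
    "addr_hub": ["addr_A"],
}
ranks = pagerank(graph)
top = max(ranks, key=ranks.get)
print(top, ranks)
```

An address that receives from many others ends up with a high score, which is why large collection points stand out.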
They found, for example, that among the Bitcoin addresses with a high PageRank is an address controlled by the FBI. During the seizure of the Silk Road, it orchestrated a series of 445 transactions of exactly 324 bitcoins each. Even without prior knowledge of that address or of the seizure, the analysis alone would have flagged the event as significant.
Bitcoin has a complicated relationship with anonymity. Just because the public addresses used by its users look cryptic doesn’t mean that they can’t be tied to real persons. Every transaction and every public mention of a Bitcoin address contributes to a clearer picture of the person (or persons) behind a public address. And with graph analytics, that data can be analysed to deliver interesting insights.