Interested in researching online criminal markets? Here is 1.6 TB of data
By Patricio Estevez-Soto, on 20 July 2016
It is not news that much organised crime activity has moved to the web. However, an article in this week’s edition of The Economist (“Shedding light on the dark web”, July 16 2016) provides an enlightening analysis of how the drug trade has moved from the street to online markets facilitated by anonymising technology such as Bitcoin and Tor.
The article focuses on how the market infrastructure provided by “Dark Net Markets” (DNMs), which includes escrow services, information sharing between buyers and sellers, and dispute resolution mechanisms, has transformed the business practices of these “organised criminals”, making them look more like Amazon and less like Al Capone.
The article is an interesting read in itself, but what makes it all the more interesting is the data it uses. As the article states:
The secretive nature of dark-web markets makes them difficult to study. But last year a researcher using the pseudonym Gwern Branwen cast some light on them. Roughly once a week between December 2013 and July 2015, programmes he had written crawled 90-odd cryptomarkets, archiving a snapshot of each page. (The Economist, July 16 2016)
Naturally, this data is a treasure trove for anyone interested in studying these criminal markets, and luckily for the research community, it is publicly available at Gwern Branwen’s Black-market archives. Branwen’s description is enticing (my emphasis):
Dark Net Markets (DNM) are online markets typically hosted as Tor hidden services providing escrow services between buyers & sellers transacting in Bitcoin or other cryptocoins, usually for drugs or other illegal/regulated goods; the most famous DNM was Silk Road 1, which pioneered the business model in 2011. From 2013-2015, I scraped/mirrored on a weekly or daily basis all existing English-language DNMs as part of my research into their usage, lifetimes/characteristics, & legal riskiness; these scrapes covered vendor pages, feedback, images, etc. In addition, I made or obtained copies of as many other datasets & documents related to the DNMs as I could. This uniquely comprehensive collection is now publicly released as a 50GB (~1.6TB) collection covering 89 DNMs & 37+ related forums, representing <4,438 mirrors, and is available for any research. (Branwen, July 14 2016)
Some of this data has already been used in articles and posts, but there is still a lot of potential for researchers working from an organised crime and/or cybercrime perspective. Branwen lists some possible uses, and I am sure researchers who specialise in this field can think of many more.
The views expressed in this blog post are the author’s own and do not necessarily represent the views of UCL, the Department of Security and Crime Science or the UCL Organised Crime Research Network.