| Tool | Data | Description | More info |
|---|---|---|---|
| http://infoextractor.org | YouTube, Facebook, Wikipedia discussions | A web-based tool for collecting data and metadata from focused social media content. It lets social science researchers collect data for quantitative analysis and delivers data from popular, influential social media sites in an easily accessible form. | InfoExtractor: a Tool for Social Media Data-Mining |
| http://tubekit.org | YouTube | A query-based YouTube crawling toolkit: a collection of tools for building a custom crawler that crawls YouTube from a set of seed queries and collects up to 17 different attributes. | Supporting research data collection from YouTube with TubeKit |
| http://roxyproxy.org | behavioral Web surfing data | Roxy gathers Web log data as well as the text and HTML code of each page visited by participants. | Research Real-World Web Use with Roxy, the Research Proxy |
| http://code.google.com/p/snowcrawl | political websites | Provides a common API for directed web crawls using a single process, multiple processes, or a client-server architecture. Snowcrawl also automates storage of downloaded files, edge lists, and state backups. | An automated snowball census of the political web |