Websites of interest
Data repositories and APIs
A number of websites make data available for download, either through an end-user interface or via an API aimed at developers.
- Freebase -- wide variety of community-sourced data
- Infochimps
- Numbrary
- UN data sources
- ITU Statistics on ICT usage
- World Bank
- Sunlight Foundation
- Data.gov (US federal government data)
- NYC data
- MBTA Transit Data
- San Francisco data
- Google n-grams corpus
- Security breach database
Data opportunities
These sources are not as immediately accessible as the repositories above. Nonetheless, they can be queried systematically to construct data sets of interest.
- Bing Search API
- Google Insights for Search
- DoubleClick Ad Planner
- Alexa
- Google Keyword Tool
- Weather Underground
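As a rough illustration of what "queried systematically" might look like in practice, the sketch below loops over a list of queries, pauses between requests, and collects one row per query into a CSV table. It does not call any of the services listed above: `fake_fetch` is a hypothetical stand-in that returns made-up values, and `build_dataset`, `rows_to_csv`, and `delay_seconds` are names invented for this example. A real version would replace `fake_fetch` with a call to the chosen service, following that service's documentation and terms of use.

```python
import csv
import io
import time
from typing import Callable, Dict, Iterable, List


def build_dataset(
    queries: Iterable[str],
    fetch: Callable[[str], Dict[str, object]],
    delay_seconds: float = 1.0,
) -> List[Dict[str, object]]:
    """Run each query through `fetch` and collect one row per query."""
    rows: List[Dict[str, object]] = []
    for i, query in enumerate(queries):
        if i > 0:
            # Pause between requests so the remote service is not flooded.
            time.sleep(delay_seconds)
        row: Dict[str, object] = {"query": query}
        row.update(fetch(query))
        rows.append(row)
    return rows


def rows_to_csv(rows: List[Dict[str, object]]) -> str:
    """Serialize rows to CSV text; assumes every row has the same keys."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


def fake_fetch(query: str) -> Dict[str, object]:
    """Hypothetical stand-in for a real service call; values are made up."""
    return {"length": len(query), "words": len(query.split())}


if __name__ == "__main__":
    data = build_dataset(["boston weather", "nyc"], fake_fetch, delay_seconds=0)
    print(rows_to_csv(data))
```

Passing the fetch function as a parameter keeps the looping and rate-limiting logic separate from any one service, so the same loop can be reused when switching sources.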
Exemplary papers and projects
- ProPublica Investigations
- National Obesity Comparison Tool
- Crowdsourcing black-market jobs