It seems that some Members of the UK Parliament have been rather… irregular in their expense claims. To investigate those claims thoroughly, someone has to trawl through hundreds of thousands of documents.
The Guardian decided to crowdsource the trawling by setting up a web site with copies of the expense documents and an interface that lets visitors classify each one. Michael Andersen at Harvard’s Nieman Journalism Lab presented four crowdsourcing lessons, based on an interview with Simon Willison, who developed the web application.
Two of the lessons are psychological:
- your workers are unpaid, so make it fun
- attention is fickle, so launch immediately.
The other two are technical:
- speed is mandatory, so use a framework
- participation will come in one big burst, so have servers ready.
Note that the technical lessons follow on from “attention is fickle.” The framework was Django, and the servers were in the cloud, on Amazon’s EC2. Glyn Moody remarked that open source made this crowdsourcing project feasible. I’ll be more explicit (or perhaps more glib) and remark that this is an example of open source serving the cause of open government.
Is this an example of citizen journalism? It’s certainly an example of investigative journalism, with much of the investigation done by citizens.