By Lukas Biewald, March 13, 2008

The Manifesto: Why We Built Dolores Labs on top of Mechanical Turk

The first time I used Amazon’s Mechanical Turk was at a search engine startup, Powerset, where I used it to compare the quality of a few versions of our early internal algorithm with Yahoo and Google. We had assumed we would have to hire a team of people to spend their entire day comparing the quality of results.

As an experiment, I set up a task with no quality control, put in about fifty bucks, and let it run overnight. The data that came back was noisy, but I was able to find meaningful differences between the search engines. Completely on my own. I didn’t have to get approval to hire people, put my experimental design through a committee, and wait a month for the results to come back. I could design the experiment empirically, doing meta-experiments on the data collection process itself.

Back when I was thinking about what machine learning papers to write at Stanford, the conversation always hinged on what kind of data sets were available. We’d go research what data was out there and then figure out what we wanted to do. We’d spend a ton of time wrangling data designed for one purpose into another. I think it’s the same in lots of disciplines that use data.

Here at Dolores Labs, we’ve built tools and processes to quickly and efficiently collect lots of data on Mechanical Turk and other places. I hope that this blog gives us a chance to play with our technology. Back when I made my first Mechanical Turk jobs, I thought about all the crazy experiments I wanted to run. Overnight, could you figure out which airline was the cheapest? Could you find the exact threshold where what most people call “red” becomes what most people call “orange”? Could you quantify the difference in sentiment between Fox News and NPR?

When I was in college, I had an art teacher who made everyone draw twenty pictures a day. I hope these experiments are like those pictures. Sloppy and fun and occasionally brilliant.

We’ve been brainstorming experiments that we’d like to run, but if there’s a data set you’d like to suggest, send us an email. Maybe we can make this deal: if you have a cool idea, we’ll collect the data for you, and you guest post a short analysis.

Our first experiments will be posted shortly, with many more to come. I hope you enjoy ’em!