By Brendan O'Connor, August 5, 2008

Wisdom of small crowds, part 2: individual workloads and rates

[ Update: see also another visualization of this. ]

AMT’s great new interface makes it easy to download completion times for individual worker assignments. Therefore, it’s easy to visualize 🙂 For a recent small job we did (250 HITs, 5 workers per HIT), here’s a graph of completion times per worker over the entire 15-minute duration of the job. Each assignment is a single point, plotted by when it was submitted (horizontal axis) against how long it took (vertical axis).

Most workers come in, do a string of HITs, then leave. Some do all of the HITs available. There seem to be two distinct work modes. Most people do lots of HITs in rapid succession. But several work slowly (e.g. workers 8, 13, and 18, with more horizontal space between points), either spending more time on each assignment, or perhaps leaving and then coming back.

This graph also illustrates a common trend we see: lots of the work gets done by “tail” workers; that is, people who do only a small amount of work. This is where crowdsourcing really shines — it’s OK if individuals give you a small number of judgments, because you can aggregate across many of them. The total “prolificness” of each worker was lightly skewed on this task — 50% of the work was done by 8 out of 37 workers. Usually, we see a split more like 50% of the work being done by the top 10% of workers; this one had a more even distribution probably because it was small, so enthusiastic workers didn’t have an opportunity to do a very large number of HITs.
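A figure like "50% of the work was done by 8 out of 37 workers" can be computed directly from the list of worker IDs, one per completed assignment. The following is a minimal sketch in R, not the code we actually ran; the function name `workers_for_share` and the toy data are made up for illustration.

```r
# Smallest number of workers whose assignments cover `share` of all work.
# `worker_ids` holds one entry per completed assignment.
workers_for_share <- function(worker_ids, share = 0.5) {
  counts <- sort(table(worker_ids), decreasing = TRUE)
  cumulative <- cumsum(as.numeric(counts)) / sum(counts)
  which(cumulative >= share)[1]
}

# Made-up toy data: worker "a" did 5 assignments, "b" did 3, "c" and "d" 1 each.
toy_ids <- rep(c("a", "b", "c", "d"), times = c(5, 3, 1, 1))

workers_for_share(toy_ids)         # 1: "a" alone did 5 of the 10 assignments
workers_for_share(toy_ids, 0.75)   # 2: "a" and "b" together did 8 of 10
```

Dividing the result by the number of distinct workers gives the "top X% of workers" form of the same statistic.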

Another phenomenon: some workers show a downward trend in work time. This could mean they are learning to do the task faster, or it could mean they are getting more careless. A quality analysis (along the lines of part 1) could tell the two apart.
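One rough way to put a number on the downward trend is to fit a straight line to each worker's work times in the order they were submitted. This is only a sketch under assumptions: the column names match the ones in the AMT download (`WorkerId`, `SubmitTime`, `WorkTimeInSeconds`), the helper `speedup_slopes` is hypothetical, and the data is invented. A negative slope says the worker got faster; it says nothing about whether quality dropped.

```r
# Slope of work time against assignment order, one value per worker.
# A negative slope means the worker's assignments got shorter over the job.
speedup_slopes <- function(d) {
  sapply(split(d, d$WorkerId), function(w) {
    if (nrow(w) < 2) return(NA_real_)      # no trend from a single point
    w <- w[order(w$SubmitTime), ]
    w$Order <- seq_len(nrow(w))
    unname(coef(lm(WorkTimeInSeconds ~ Order, data = w))["Order"])
  })
}

# Invented toy data: w1 speeds up steadily, w2 stays roughly flat.
toy <- data.frame(
  WorkerId = rep(c("w1", "w2"), each = 4),
  SubmitTime = rep(1:4, times = 2),
  WorkTimeInSeconds = c(20, 16, 12, 8, 10, 10, 11, 10)
)

slopes <- speedup_slopes(toy)
print(slopes)
```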

The task was a fairly subjective image classification problem where positives are rare; purple points are “YES” responses. Responding “YES” takes more time (presumably because of more cognitive load): average work times for YES vs NO responses are 22 vs 12 seconds, a difference that is significant by a t-test (p < .001).
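For reference, a comparison like this is a one-liner in R. The sketch below uses invented numbers, chosen only so that the group means come out to 22 and 12 seconds; it is not the real data. The `Answer` column name is an assumption, and `t.test` defaults to the Welch (unequal variance) test, which may or may not be the variant behind the figure above.

```r
# Invented toy data, NOT the real job: 5 YES and 8 NO responses.
toy <- data.frame(
  Answer = c(rep("YES", 5), rep("NO", 8)),
  WorkTimeInSeconds = c(25, 19, 23, 21, 22,
                        11, 13, 12, 10, 14, 12, 13, 11)
)

# Two-sample t-test of work time by response (Welch test by default).
result <- t.test(WorkTimeInSeconds ~ Answer, data = toy)

print(result$estimate)   # group means, NO first (alphabetical order)
print(result$p.value)
```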


p.s. The graph is due to R’s awesome lattice package. It’s incredibly easy to use: the call is not much more than xyplot(WorkTimeInSeconds ~ SubmitTime | WorkerId), which draws one panel per worker.
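Here is a small self-contained version of that call, as a sketch rather than the exact code behind the graph. The data frame is invented; the column names are the ones from the formula above, and the axis labels and `layout` are my own choices.

```r
library(lattice)

# Invented toy data: three workers, three assignments each.
toy <- data.frame(
  WorkerId = factor(rep(c("w1", "w2", "w3"), each = 3)),
  SubmitTime = c(10, 40, 70, 100, 160, 220, 300, 320, 340),
  WorkTimeInSeconds = c(25, 20, 15, 50, 55, 45, 12, 10, 11)
)

# One panel per worker: submit time on x, work time on y.
p <- xyplot(WorkTimeInSeconds ~ SubmitTime | WorkerId, data = toy,
            xlab = "Submit time (seconds into job)",
            ylab = "Work time (seconds)",
            layout = c(3, 1))

# Lattice plots are objects; inside a script they need an explicit print().
print(p)
```

Passing a `groups =` argument (for example a YES/NO response column) would color the points by response.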

[ This article is part of a series, Wisdom of Small Crowds, which focuses on crowdsourcing methodology for Amazon Mechanical Turk-like systems. ]