One of the best things about interning at CrowdFlower is that you get the opportunity to learn about human intelligence from the world’s most diverse workforce. In my previous blog post, I showed how we can predict task performance using task difficulty. This time, I would like to focus on our contributors, the most valuable resource in data enrichment and crowdsourcing, and pose the question: How well will a contributor do on a new task?
Naive Baseline: “what has been will be again”
Our first model makes the oversimplifying assumption that contributors' performance doesn't change – we use past behavior to make a prediction. We take a few parameters that describe each contributor's accuracy distribution across past tasks – the mean, variance, skewness, and kurtosis – feed them into a random forest model, and predict the contributor's accuracy on a new job.
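A minimal sketch of this baseline, assuming synthetic accuracy histories (the feature names and data below are illustrative, not CrowdFlower's actual pipeline):

```python
# Naive baseline sketch: summarize each contributor's past per-task
# accuracies with distribution moments, then fit a random forest.
import numpy as np
from scipy import stats
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

def moment_features(accuracies):
    """Mean, variance, skewness, and kurtosis of past task accuracies."""
    a = np.asarray(accuracies, dtype=float)
    return [a.mean(), a.var(), stats.skew(a), stats.kurtosis(a)]

# Toy history: each contributor has accuracies on several past tasks.
history = [rng.uniform(0.5, 1.0, size=12) for _ in range(200)]
X = np.array([moment_features(h) for h in history])

# Toy target: accuracy on a held-out "new" task (simulated here as the
# past mean plus noise; in practice it comes from test questions).
y = np.clip(X[:, 0] + rng.normal(0, 0.05, size=len(X)), 0, 1)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict(X[:5])  # predicted accuracy for five contributors
```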
Contributors Predicted Accuracy vs. True Accuracy
The x-axis is the actual accuracy observed on a task (true accuracy), the y-axis is the predicted accuracy; the line represents perfect prediction (true accuracy equal to predicted accuracy).
Well, not very surprisingly, our naive baseline does not do very well: the predictions based on past accuracy are very different from the accuracies observed in practice. Apparently, lumping together performance on a variety of tasks (like classifying images and determining the sentiment of tweets) is not a good way to predict a contributor's success on a new task.
Learning Contributors' Cognitive Abilities
How about using contributors’ cognitive abilities to predict their performance?
Since I had already classified tasks by the cognitive abilities they require, it wasn't too hard to associate these abilities with contributors. I combined the cognitive requirements of the tasks with how well each contributor performed on them to compute that contributor's cognitive abilities. For example, when assessing a contributor's memory:
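One plausible way to combine these signals – an assumption on my part, since the exact formula isn't shown – is a demand-weighted average: weight the contributor's accuracy on each task by how strongly that task exercises the ability in question. The demands and accuracies below are made up:

```python
# Assumed formula: rate a contributor on one cognitive ability as their
# accuracy across tasks, weighted by each task's demand for that ability.
import numpy as np

def ability_rating(task_demands, task_accuracies):
    """Demand-weighted average accuracy for a single cognitive ability."""
    w = np.asarray(task_demands, dtype=float)
    a = np.asarray(task_accuracies, dtype=float)
    return float(np.average(a, weights=w)) if w.sum() > 0 else float("nan")

# e.g. assessing memory: three tasks with memory demands 0.9, 0.1, 0.5
# and the contributor's accuracies 0.80, 0.95, 0.70 on those tasks.
memory_score = ability_rating([0.9, 0.1, 0.5], [0.80, 0.95, 0.70])  # ~0.777
```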
This leads to each contributor being rated on eight cognitive abilities, representing how well they perform in each ability.
Given these ratings, we can explore what "types" of contributors emerge. Using k-means, a well-known clustering algorithm, we grouped contributors with similar cognitive abilities. The beautiful result is below.
Clustering Contributors into Cognitive Groups
In this plot I selected 3 clusters (out of 10) and 4 cognitive abilities. Each dot is a contributor, and the axes represent their ratings on the cognitive skills (see labels). The colors indicate cluster membership. The plots on the diagonal show the distribution of contributors on each skill (a histogram per skill).
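The clustering step itself can be sketched in a few lines; the contributor ratings below are synthetic, whereas the post clusters real contributors and inspects 3 of the 10 resulting groups:

```python
# Minimal k-means sketch: cluster contributors by their
# eight-dimensional cognitive-ability vectors.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
n_contributors, n_abilities = 300, 8
ratings = rng.uniform(0, 1, size=(n_contributors, n_abilities))

kmeans = KMeans(n_clusters=10, n_init=10, random_state=1).fit(ratings)
labels = kmeans.labels_            # one cluster id per contributor
centers = kmeans.cluster_centers_  # "typical" ability profile per cluster
```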
Aside from the beautiful colours, we can now clearly see that contributors fall into very distinct clusters. Two results are especially interesting:
- Different contributors have different cognitive abilities.
  - The green contributors seem to be good overall. More importantly, while the red contributors are better than the blue contributors at categorization tasks (third box on the diagonal from the top), the blue contributors are better at visually difficult tasks (second box on the diagonal from the top).
  - We could not have seen this by looking at each contributor's average across all tasks, which ignores specific task requirements such as memory or visual difficulty.
- Cognitive skills are related to each other.
  - Categorization and memory seem to be related skills: the red group of contributors is very good at both, while the blue group is good at neither. People with good memory may be better at remembering categories, and thus at categorization. This demonstrates how big data can teach us something about human cognition. Interestingly, a similar result was recently published in the psychology literature.
These results are really useful. Think about product recommendations on Amazon ("People who bought X also liked Y"). We can recommend specific tasks based on a contributor's strengths: "Contributors like you were good at task Y."
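One simple way to act on this – a sketch of my own, not the system described in the post – is to recommend the tasks on which a contributor's cluster-mates did best:

```python
# "Contributors like you were good at task Y": recommend the tasks with
# the highest mean accuracy among a contributor's cluster peers.
# Cluster labels and the accuracy matrix are illustrative.
import numpy as np

def recommend_tasks(labels, acc, contributor, top_k=2):
    """acc[i, j] = accuracy of contributor i on task j (NaN if unseen)."""
    peers = labels == labels[contributor]
    peer_means = np.nanmean(acc[peers], axis=0)  # peers' avg per task
    return np.argsort(peer_means)[::-1][:top_k]  # best tasks first

labels = np.array([0, 0, 0, 1, 1])
acc = np.array([
    [0.9,    0.5, 0.7],
    [0.8,    0.4, 0.6],
    [np.nan, 0.5, 0.8],   # contributor 2 hasn't seen task 0 yet
    [0.3,    0.9, 0.4],
    [0.2,    0.8, 0.5],
])
best = recommend_tasks(labels, acc, contributor=2)  # tasks 0 and 2
```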
Predicting Contributor Performance using Cognitive Abilities
Let's get back to our goal – predicting a contributor's performance on a task. For each contributor we calculated their cognitive abilities. When a new task arrives, we can rate how demanding it is on each cognitive ability. Then, using both the cognitive abilities required by the task and those possessed by the contributor, we can predict the contributor's accuracy on the new task:
The x-axis is the actual accuracy observed on a task (true accuracy), the y-axis is the predicted accuracy. Each point is a contributor's accuracy on a new (held-out) task. The size of a dot represents the number of test questions the contributor answered; larger dots should be trusted more because they are based on more data.
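Under one plausible feature layout – concatenating the task's demands with the contributor's ratings on the eight abilities, an assumption since the post doesn't spell it out – the prediction model can be sketched as:

```python
# Cognitive prediction sketch: pair task demands with contributor
# abilities (16 features) and regress accuracy with a random forest.
# All data here is synthetic.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
n_pairs, n_abilities = 500, 8

task_demand = rng.uniform(0, 1, size=(n_pairs, n_abilities))
contrib_ability = rng.uniform(0, 1, size=(n_pairs, n_abilities))
X = np.hstack([task_demand, contrib_ability])  # 16 features per pair

# Toy target: accuracy is higher when abilities match the demands.
y = np.clip(1 - np.abs(task_demand - contrib_ability).mean(axis=1)
            + rng.normal(0, 0.05, n_pairs), 0, 1)

model = RandomForestRegressor(n_estimators=100, random_state=2).fit(X, y)
predicted_accuracy = model.predict(X[:3])  # accuracy on "new" pairs
```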
The results above look much better: we can now predict a contributor's accuracy on a new job using the cognitive model. Points cluster around the perfect-prediction line for the most part, across all clusters, and the large dots sit closer to the line than the small ones.
The good fit validates our model of cognitive abilities and shows that we have indeed captured something real about people. Better accuracy predictions mean that we can match contributors to tasks more effectively, saving time and cost while improving quality. The result is a win for everyone – contributors earn more money and CrowdFlower customers get better quality results.
Cognitive modeling improves the quality of data enrichment and crowdsourcing. Using concepts from psychology, we map cognitive abilities to both tasks and contributors.
We found that people tend to be good at some skills but not others, and that using this information improves the prediction of accuracy on new tasks. In the process of making predictions, we replicated recent findings from psychology at large scale, outside of lab settings. I think this is the beginning of a great friendship between psychology, data science, and data enrichment.