By Lukas Biewald, May 17, 2017

CrowdFlower’s Toolkit for Computer Vision

In 2011, Sarah Fortune, a Harvard University professor, approached CrowdFlower with a completely unexpected use case. In her work on fighting drug-resistant tuberculosis, she needed cells labeled and counted at massive scale. She set up a CrowdFlower job to collect the data she needed and compared the results to the work that her grad students were doing. She found that CrowdFlower’s contributor network provided results of equivalent quality to her students’ work but gave her access to much greater scale.

At the time, I thought her use case might be an interesting one-off, but in the past few years we’ve seen an explosion of image labeling tasks, not just in medical research but in industries ranging from retail and financial services to drones and satellites. The reason, of course, is the recent progress in deep learning.

Deep learning is a machine learning technique that amplifies the trend that CrowdFlower was built on: more training data = better algorithms. Deep learning algorithms can keep improving on millions or even billions of examples when other algorithms start to plateau. This has been especially powerful in computer vision, where algorithms have moved from struggling to reliably differentiate between cats and dogs to human-level object recognition in just a few years. But deep learning needs massive amounts of training data, and that’s why today we are announcing a set of dedicated tools for computer vision.

As companies collect data at ever larger volumes, good labeling tools become even more important. CrowdFlower’s Pixel Labeling tool uses machine learning to pre-segment images into superpixels, which allows annotators to label literally every pixel in every image. That is essential for the growing practice of semantic segmentation.

Pixel Labeling with Surfaces, Cars, Trucks, Pedestrians
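
To make the superpixel idea concrete, here is a minimal sketch of how one label per superpixel can be expanded into a label for every pixel. It is an illustration only, not CrowdFlower’s implementation: the function names are hypothetical, and a plain square grid stands in for the machine-learned pre-segmentation, which would follow object boundaries instead.

```python
def grid_superpixels(height, width, cell):
    """Stand-in segmentation (an assumption for this sketch): give every
    pixel the ID of the square cell it falls in. A real tool would use a
    learned or image-driven segmentation instead of a fixed grid."""
    cells_per_row = (width + cell - 1) // cell  # ceiling division
    return [
        [(row // cell) * cells_per_row + (col // cell) for col in range(width)]
        for row in range(height)
    ]


def paint(segments, superpixel_labels, default="unlabeled"):
    """Expand one label per superpixel into a label for every pixel."""
    return [
        [superpixel_labels.get(segment_id, default) for segment_id in row]
        for row in segments
    ]


# A tiny 4x6 "image" split into 2x2 cells, giving six superpixels (IDs 0-5).
segments = grid_superpixels(height=4, width=6, cell=2)

# The annotator clicks two superpixels; everything else stays unlabeled.
labels = {0: "road", 1: "car"}
mask = paint(segments, labels)
```

In this sketch the annotator makes two decisions and eight pixels receive a label, which is the source of the efficiency: the cost of labeling scales with the number of superpixels rather than the number of pixels.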

The human-in-the-loop pattern, where machine learning and humans work together, is essential for making vision tasks work well. For example, recognizing tiny pedestrians in a huge image becomes much more efficient when an algorithm guesses their locations and guides the annotator’s eye to where they might be. CrowdFlower contributors can efficiently make high-level judgments, such as setting the confidence thresholds of the boxes that they would like to be shown.
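
The confidence-threshold step can be sketched as a simple filter over the model’s guesses. This is a hedged illustration, not the tool’s actual code: the `Box` type, the `boxes_to_show` function, and the sample scores are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """A hypothetical model prediction: a labeled box with a score."""
    label: str
    confidence: float  # model score, assumed to lie in [0, 1]
    x: int
    y: int
    width: int
    height: int


def boxes_to_show(predictions, threshold):
    """Keep predictions at or above the threshold, most confident first."""
    kept = [box for box in predictions if box.confidence >= threshold]
    return sorted(kept, key=lambda box: box.confidence, reverse=True)


# Invented predictions for three candidate pedestrians in a wide image.
predictions = [
    Box("pedestrian", 0.92, x=40, y=310, width=12, height=30),
    Box("pedestrian", 0.35, x=800, y=295, width=10, height=26),
    Box("pedestrian", 0.61, x=1210, y=300, width=11, height=28),
]

# The annotator picks a threshold; only sufficiently confident boxes appear.
shown = boxes_to_show(predictions, threshold=0.5)
```

Lowering the threshold shows more candidate boxes to review, at the price of more false alarms. Raising it shows fewer, so the annotator has to search more of the image by eye.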

Computer vision has been propelled forward in recent years by open data sets such as ImageNet. CrowdFlower’s vision is to make high-quality data available for every application. Customers already use CrowdFlower’s Computer Vision tools for everything from finding elephants in drone photos to prevent poaching to labeling biopsies to develop personalized cancer treatments. We’re super excited to make our tools available to everyone, and we can’t wait to see the vision applications our customers develop with them.