By Emma Ferneyhough, September 29, 2014

Introducing CrowdFlower Labs

You may have noticed a new link in the left navbar of your jobs dashboard recently: “CrowdFlower Labs.” It leads to a new set of features that we have introduced to our customers and are excited to share with you.

Until now, CrowdFlower Labs has been a toolbox only available to CrowdFlower employees working collaboratively with our biggest customers to process data for their most complex jobs.

This morning, we’re pleased to announce three CrowdFlower Labs tools available on the CrowdFlower platform:

  • Taxonomyzer
  • Workflows (Enterprise Customers Only)
  • Doublepass (Enterprise Customers Only)


Taxonomyzer: Build Taxonomies with Confidence

If you use our platform for categorization tasks, you may be familiar with our cml:taxonomy widget, which lets you structure your categories hierarchically and provides a search function for easy browsing.

If you’ve ever used cml:taxonomy, however, you know how tricky it can be to wrangle your categories into the correct JSON format. Taxonomyzer takes away the pain of turning your hierarchical category tree into a properly formatted JS resource: it generates a search index, provides you with the properly formatted CML code, and hosts it all for you. All you need is a CSV of your categories; Taxonomyzer does the rest.
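
To give a feel for the kind of conversion involved, here is a minimal Python sketch that turns CSV rows of category paths into a nested tree. It is an illustration only: the column layout (one path per row, one level per column), the function name build_tree, and the nested-dict output are all assumptions made for this example, and the output is not the actual format that cml:taxonomy or Taxonomyzer uses.

```python
import csv
import io
import json


def build_tree(csv_text):
    """Build a nested dict from CSV rows, one category path per row.

    Assumed layout (hypothetical): each column is one level of the
    hierarchy, e.g. "Animals,Mammals,Dogs". Empty cells end the path.
    """
    tree = {}
    for row in csv.reader(io.StringIO(csv_text)):
        node = tree
        for name in row:
            name = name.strip()
            if not name:
                break
            # Reuse the existing child if present, otherwise create it.
            node = node.setdefault(name, {})
    return tree


sample = (
    "Animals,Mammals,Dogs\n"
    "Animals,Mammals,Cats\n"
    "Animals,Birds\n"
    "Plants,Trees\n"
)
tree = build_tree(sample)
print(json.dumps(tree, indent=2))
```

Even in this toy form, shared prefixes such as "Animals" have to be merged by hand, which is the sort of bookkeeping a tool can take off your plate.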

Workflows: Automate Multi-Step Data Enrichment

For Enterprise customers whose data requires multiple steps of enrichment, our Enterprise Engineering team builds out custom workflows that string multiple jobs together. The newly introduced Workflows tool allows our customers to easily upload, download and monitor data that flows through their workflows.

Features include:

  • Web interface for uploading data to the workflow and downloading results
  • Data pre-processing and validations
  • Unit processing and routing (ETL) between CrowdFlower jobs
  • Integration with Doublepass (see below)
  • Data post-processing and packaging
  • A webhook sender
  • An emailer

Doublepass: Generate Quality Content

Those of us who use CrowdFlower to create original content (e.g., writing reviews, tagging content, or harvesting images) understand that Test Questions aren’t the best quality-control tool for the job. This is where Doublepass comes in.

The most typical use case is a two-job peer-review workflow, where one job is set up for content generation and a second job is set up for content review. This setup was discussed in an earlier blog post.

To summarize, data flows from the generation job to the review job using our Workflows application, and only the content review job needs Test Questions. If a piece of content passes review, its original author receives a bonus. If it does not pass review, the original author’s judgment is flagged, and the content is passed back to the generation job for another run. This process continues until all pieces of content are approved. Doublepass also provides several automatic-review tools that programmatically evaluate content (e.g., for spelling, grammar, and appropriateness) before you spend money on it in the review job. The best part is that you only pay full price for the content that passes review.
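
The generate-review loop described above can be sketched in a few lines of Python. This is a simplified model, not Doublepass itself: run_doublepass, generate, review, and max_rounds are hypothetical names invented for this example, and the cap on rounds is an assumption added only so the sketch always terminates.

```python
from collections import deque


def run_doublepass(items, generate, review, max_rounds=5):
    """Route items between a generation step and a review step.

    generate(item, attempt) -> (author, content)   # hypothetical callback
    review(content) -> bool                        # hypothetical callback
    """
    queue = deque((item, 1) for item in items)
    approved, bonused, flagged, abandoned = {}, [], [], []
    while queue:
        item, attempt = queue.popleft()
        author, content = generate(item, attempt)
        if review(content):
            approved[item] = content
            bonused.append(author)   # author of approved content earns a bonus
        else:
            flagged.append(author)   # author's judgment is flagged
            if attempt < max_rounds:
                queue.append((item, attempt + 1))  # back to generation
            else:
                abandoned.append(item)  # assumption: give up after max_rounds
    return approved, bonused, flagged, abandoned


def generate(item, attempt):
    # Stand-in for a contributor producing content in the generation job.
    return f"writer-{attempt}", f"{item} v{attempt}"


def review(content):
    # Stand-in for the review job: reject only the first photo draft.
    return content != "photo v1"


approved, bonused, flagged, abandoned = run_doublepass(
    ["review", "photo"], generate, review
)
print(approved, bonused, flagged, abandoned)
```

In this toy run the first "photo" draft is rejected and regenerated once, so one author is flagged and two are bonused.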

Features include:

  • Auto-bonusing for good content: control the bonus amount, accuracy threshold, and messaging
  • Auto-flag for bad content: control the accuracy threshold to eliminate sub-optimal contributors
  • Auto-review: have our algorithms review the content first
  • The ability to peek at approved / rejected content  

CrowdFlower Labs: More Functionality for Complex Tasks

Taken together, the CrowdFlower Labs tools will allow our customers to tackle complex work more quickly and easily than ever before. We look forward to working with you to get up and running on these new tools and to hearing your feedback.