By Robert Munro, October 28, 2013

Translation and Crowdsourcing

Originally posted on Idibon’s Blog, Oct. 25.

This week was the fourth annual CrowdConf, the world’s largest crowdsourcing conference. I’ve enjoyed speaking at each one, seeing the conference evolve from a collection of curious early adopters to an industry event attended by the leading technology companies. This year I moderated a panel on translation: my co-panelists were Joseph Childress, product manager at CrowdFlower; Jani Penttinen, CEO of Transfluent; and Jessica Roland, senior director of sales development at Gengo.

We covered a number of topics in our panel, which I’m sharing here:

How is crowdsourcing disrupting (or complementing) professional translation?

The biggest disruption comes from the price: crowdsourced translation typically comes in at about 20 percent of the cost of professional translation. We have seen this coming for a few years: in 2010, at a collaborative translation workshop, Jost Zetzsche described crowdsourced translation as the perceived “straw that broke the camel’s back” for professional translators. After several years of worrying that professional translators were being replaced by machine translation, along came crowdsourced translation, where *any* multilingual person could now compete for the same job. But at the same time, Jost Zetzsche (coauthor of Found in Translation with Smartling’s Nataly Kelly) was also optimistic, arguing that new approaches could ultimately open up new markets and opportunities for translation.

In the few years since, we have seen that the difference between professional and crowdsourced translation is not that clear-cut. Many professional translators are also working on crowdsourced translation platforms like Gengo. Increasingly, the crowdsourcing platforms are also tracking and serving different levels of quality. Consumers of translation (including us at Idibon) use combinations of crowdsourced and professional translation.

At the far end of the continuum is machine translation. It has the lowest quality, but is the cheapest option. Joseph Childress reported that CrowdFlower is used by machine-translation teams to create sentences for their algorithms (machine translation typically works by learning from existing pairs of parallel sentences). Translators can also interact with machine translation by reviewing and correcting its output, on the assumption that this is quicker than translating from scratch, an assumption that has not always held up in practice. The interaction between machine and human translation is even more nascent than crowdsourced translation. For the cutting edge, see a recent paper by Spence Green, Jeff Heer, and Chris Manning, The Efficacy of Human Post-Editing for Language Translation.

What works?

The greatest advance for me comes from seeing the diversity of languages that can be served by crowdsourced translation, including many languages for which machine translation does not exist and professional translation would be prohibitively expensive. We are opening up the diversity of translation services and allowing economic sustainability for the translation of many less widely spoken languages—easily 10 times as many languages as were commonly served a decade ago. For more on this topic, see my post on worker diversity.

As with crowdsourcing more broadly, translation works best when broken into small chunks. Ideally, a crowd worker will only need to translate a sentence or small paragraph at a time. Beyond that, it becomes too difficult to programmatically identify errors in the work. It is possible to break longer documents into smaller chunks, and many crowdsourcing platforms take this approach, but this leaves the translations more vulnerable to errors caused by the loss of context, especially when the content is metaphoric or otherwise abstracted from the literal meaning:
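The chunking step above can be sketched in a few lines. This is a minimal, illustrative version (the function name, the naive regex-based sentence splitter, and the length threshold are all my assumptions; real platforms use language-aware segmenters):

```python
import re

def chunk_for_translation(document, max_len=300):
    """Split a document into sentence-or-small-paragraph translation tasks.

    Naive splitting on terminal punctuation; short sentences are grouped
    together so each task stays under max_len characters.
    """
    sentences = re.split(r'(?<=[.!?])\s+', document.strip())
    tasks, current = [], ""
    for sentence in sentences:
        # Start a new task when adding this sentence would exceed the limit.
        if current and len(current) + len(sentence) + 1 > max_len:
            tasks.append(current)
            current = sentence
        else:
            current = f"{current} {sentence}".strip()
    if current:
        tasks.append(current)
    return tasks
```

The trade-off the paragraph describes is visible here: each task is small enough to review independently, but a sentence translated in isolation loses the surrounding context.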


Researcher Translation, courtesy of XKCD.

What doesn’t work?

Unsurprisingly, overly technical text can be difficult for the crowd to translate. Without specialized knowledge, a typical person would struggle to comprehend documents like research papers aimed at cutting-edge academics or technical manuals aimed at specialized engineers.

Medical documents also fall into this category, with the added risk that mistranslated language can result in direct harm to patients. Hospitals have been addressing this problem for some time—those blue phones that you see on the walls of US hospitals connect medical staff to interpreters. Despite this, only 28 percent of hospitals have policies related to the use of bilingual doctors and nurses serving as interpreters, and fewer than 50 percent of US hospitals have formal policies regarding the use of minors, friends or family, or non-trained staff as interpreters. There is great potential in this domain that we are not yet serving. We explored these implications in more detail in our post on machine learning for medicine.

Quality Control When There is No One Right Answer

Crowdsourcing is most often evaluated by comparing a worker’s output to known answers (“gold answers”), or by giving the same task to multiple workers and calculating how much they agree with each other. This only works for tasks where there is a clear correct answer. For translation, there are often multiple correct answers, so it cannot take advantage of these common quality-control mechanisms.
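The agreement check described above is easy to sketch for categorical tasks, and the sketch also shows why it breaks down for translation (the function name is mine, for illustration):

```python
from itertools import combinations

def pairwise_agreement(judgments):
    """Fraction of worker pairs that gave identical answers for one item.

    Works for tasks with a single correct answer (labels, categories).
    For free-text translation, two correct answers rarely match
    string-for-string, so this score is misleadingly low.
    """
    pairs = list(combinations(judgments, 2))
    if not pairs:
        return 1.0  # a single judgment trivially agrees with itself
    return sum(a == b for a, b in pairs) / len(pairs)
```

Three workers labeling an image “cat, cat, dog” score 1/3, flagging the item for review; but two perfectly good translations of the same sentence would score 0.0, which is exactly the problem the panel discussed.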

The ways that crowdsourced translation companies address this are varied and inventive. Joseph Childress reported that CrowdFlower intersperses translation tasks with similar kinds of language tasks that can be presented as multiple-choice questions (e.g., “select the correct definition for this word”). They can embed gold answers among these multiple-choice questions, with the reasonable assumption that accuracy on those questions corresponds to the quality of the translation output from the same person.
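A minimal sketch of this embedded-gold-question approach, with illustrative names of my own (the source does not describe CrowdFlower’s actual implementation):

```python
def estimate_worker_quality(responses, gold_answers):
    """Estimate a worker's quality from embedded gold questions.

    `responses` maps question id -> the worker's chosen option;
    `gold_answers` maps the subset of ids that are gold questions
    to their known correct options.
    """
    gold_seen = [qid for qid in responses if qid in gold_answers]
    if not gold_seen:
        return None  # no gold questions answered yet
    correct = sum(responses[qid] == gold_answers[qid] for qid in gold_seen)
    return correct / len(gold_seen)
```

The worker never knows which questions are gold, so their accuracy on the gold subset serves as a proxy for the quality of their free-text translations, where no direct check is possible.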

Another common way that quality is addressed is through a process of training, management, and review. This approach builds on a wealth of experience from professional translation companies. Individuals take standardized translation tests to determine their skill level, and the most qualified and experienced translators manually review the output of less-experienced translators, ensuring quality control and providing further feedback on individual translators.

Things I regretted missing

There was only so much that we could cover! I would have especially liked to hear more about how Transfluent and Gengo addressed quality control: this part of the discussion got cut short.

With regret, we decided not to address interpretation (real-time voice translation), which is different from the translation of text or speech in offline batch scenarios. Organizations like BabelVerse are addressing interpretation within crowdsourcing models.

It would have been interesting to also talk about volunteer translation and its position relative to the more common paid crowdsourced translation models. I have run some of the largest uses of volunteer translation for social good, but at the same time I have been a strong advocate for making this sustainable through paid engagements. Organizations like Translators Without Borders are currently looking at the boundaries between the two.

The future: more sophistication

Just a few years ago, the field of crowdsourcing for language processing was fairly fragmented. Even within the panel, we can see this is no longer true. Transfluent uses Gengo for some of its processes. Gengo has used broader crowdsourcing platforms like CrowdFlower for some of its workforce needs. At Idibon, we use services from all three.

Translation is not one task: worker review, training, and post-editing are all part of the translation ecosystem, as is managing known translations (translation memory) to avoid repetition, and developing workflows and interfaces that make all these processes come together quickly. Even within a single document, it might be optimal to combine professional, crowdsourced, and machine translation, with different quality-control methods for each. Rather than having one company solve all these problems, it is likely that we will see increased specialization in many different areas, allowing increased sophistication to exist alongside flexible strategies.
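The translation-memory idea mentioned above—storing known translations so repeated source segments are never paid for twice—can be sketched as a simple exact-match cache (class and method names are mine; production systems also do fuzzy matching, as in the TMX standard):

```python
class TranslationMemory:
    """Minimal exact-match translation memory.

    Keyed on (source language, target language, normalized source text),
    so a repeated segment is served from memory rather than retranslated.
    """

    def __init__(self):
        self._memory = {}

    def add(self, source_lang, target_lang, source_text, translation):
        self._memory[(source_lang, target_lang, source_text.strip())] = translation

    def lookup(self, source_lang, target_lang, source_text):
        # Returns the stored translation, or None if this segment is new.
        return self._memory.get((source_lang, target_lang, source_text.strip()))
```

In the mixed workflows described here, a memory hit short-circuits the whole pipeline: no machine translation, no crowd task, no review.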

Disclosures: CrowdFlower’s CEO and Chris Manning are official advisors to Idibon; co-panelists Gengo and I have been advising Translators Without Borders; the BabelVerse team have been (tidy) houseguests in the course of their globetrotting business. This post is my independent reflection on the panel and themes.