This post is one in a weekly series in which we highlight one of the many amazing and diverse people who work on CrowdFlower tasks. There are many stories hidden within the world's largest workforce. This is just one.
It’s pretty clear to any parent that the Internet, in all its wonder and cat-filled glory, can be a dangerous place for kids. Not that we need any reminders – the media does a pretty good job of educating and alarming parents about the horrors lurking only a few clicks away. To put it mildly: NBC’s Chris Hansen wouldn’t have a job if it weren’t for the Internet.
This is where a company like Artimys Language Technologies comes in. Artimys uses machine learning to identify language patterns in online communication that indicate bullying, sexual predation or suicidal behavior. Sexual predators and bullies usually display common linguistic patterns – similar patterns of phrasing sentences, excessive use of particular words, stylistic elements, etc.
Data Scientists Spend 80 Percent of Their Time Cleaning Data
I heard this stat referenced over and over at the recent DataBeat conference, which is not surprising. It was also a prominent pain point discussed at the Big Data Innovation Summit in Las Vegas last week. Clean data is crucial to building a good predictive model, and cleaning massive amounts of data is a challenge. The task is as dreadful as it is time-consuming. I have yet to see a data scientist who prefers cleaning data to building a better model or creating a more insightful visualization.
Microtasking – The Data Scientist’s Best Friend
With data cleaning discussed as one of the primary pain points of data science, it surprises me that there is not more discussion of ways to address it. Microtasking, a form of crowdsourcing, is an extremely powerful way to clean up your data, create a labeled dataset, or mark up data to drive your business decisions.
Our platform is designed to help data scientists sort, classify and organize large datasets using our on-demand workforce. You simply upload your data, design a job that tells our on-demand workers how to label your data, and let our platform do the rest. You get results.
CrowdFlower has the world's largest network of on-demand workers, at more than 5 million. Who are these amazing people helping our customers produce accurate results, incredibly fast and on an unprecedented scale?