Open Source Data Sets

Access Real-World High-Quality Free Data Sets

A common problem in data science is acquiring clean data from various sources for use in higher-level analysis. Many data sets are now open source, but those data sets must be put to use before their value can be measured. Unfortunately, valuable information or metrics are frequently missing, forcing data scientists to repeat the same analysis over and over. CrowdFlower has your solution.

The CrowdFlower Data for Everyone library contains large, high-quality, real-world data sets collected by our customers and published with their permission. These open source, enriched data sets range from biomedical imagery to football strategy and are free for anyone to use. We are constantly updating our library to provide researchers, academics, small businesses, and startups with the tools necessary for success.

With Data for Everyone, you can:

  • Download data sets ranging from sentiment analysis of words or phrases to housing and wheelchair accessibility
  • Create machine learning algorithms you can trust
  • Train or retrain your algorithms as your models change, for steadily better results

CrowdFlower can also help convert unstructured data into rich structured data with our human-in-the-loop platform. Access rich open source data sets with Data for Everyone, or try CrowdFlower today to see how we can help you clean and enrich your own data.