By Drew Spitzer, June 24, 2015

How I cleaned thousands of sales leads before lunch for about $300

It’s my first week at CrowdFlower and my excitement is higher than it’s ever been. Not only is the company full to the brim with incredibly talented people, but those talented people are actually really fun to work with [ed note: he means me]. However, I noticed a problem in my first week as our Lead Sales Development Rep that I know a lot of my peers come across at their companies: Salesforce data.

It’s not that Salesforce gives you bad data per se, but rather that the data can be incredibly noisy and, therefore, not all that useful. There are dupes and incorrect emails everywhere, tens of thousands of them. Before dumping a bunch of new leads in there, I needed to know what was actually worth keeping and what was, essentially, a waste of space. How do you clean a database that large without spending weeks on it? Turns out I work at a place that does this exact sort of thing, so I decided to use CrowdFlower.

Here’s how I went about cleaning up my lead data in SFDC

First off, I had to define what exactly would constitute a clean lead database. I set my criteria at a pretty high level: I only wanted leads that have valid email domains, company or account names, and some activity history. That was pretty straightforward. To get there, I needed to eliminate any lead that had a missing or bad company/account name, a “bad” email domain (gmail, yahoo, etc.), or no activity history.
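As a rough illustration, those three criteria could be written as a single check. This is a minimal Python sketch under stated assumptions: the field names (`email`, `company`, `activity_count`) and the short list of free-mail domains are hypothetical stand-ins for the example, not actual Salesforce column names.

```python
# Hypothetical list of "bad" (free-mail) domains; extend as needed.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def is_clean_lead(lead):
    """Return True if a lead (a dict) meets all three criteria.

    The keys "email", "company" and "activity_count" are assumed names.
    """
    email = (lead.get("email") or "").strip().lower()
    company = (lead.get("company") or "").strip()
    activity_count = int(lead.get("activity_count") or 0)

    # Criterion 1: an email with a plausible, non-free-mail domain.
    local, sep, domain = email.rpartition("@")
    has_good_domain = (
        bool(local)
        and bool(sep)
        and "." in domain
        and domain not in FREE_MAIL_DOMAINS
    )

    # Criterion 2: a non-empty company or account name.
    has_company = bool(company)

    # Criterion 3: at least one recorded activity.
    has_activity = activity_count > 0

    return has_good_domain and has_company and has_activity
```

A check like this only catches the mechanical cases. It cannot tell whether a domain is defunct or whether a company name is junk, which is the part that still needs a human to look at it.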

I did some of the basic cleaning myself by downloading the Salesforce leads, deduping them, and searching for @gmail- or @hotmail-type addresses, which was a good start. But I needed to do more. I knew there were emails in there from defunct domains or without company names or with a dozen other problems. I could’ve handled that myself, but that meant going through line by line, reading each company name and domain, and Googling anything I didn’t recognize. Not only would that have been time-consuming and expensive, but doing it by myself would have been, frankly, exhausting. Thankfully, I work at a place that distributes this exact kind of work to contributors across the world. Funny how that works out, right?
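For anyone who wants to script that first pass rather than do it in a spreadsheet, here is one way it could look. It is a small Python sketch, not the exact steps used here: the column names (`Email`, `Company`), the free-mail list, and the sample rows are all made up for the example.

```python
import csv
import io

# Assumed list of free-mail domains to drop; adjust for your own data.
FREE_MAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com"}


def prefilter_leads(rows):
    """Drop blank, duplicate, and free-mail emails, keeping first occurrences."""
    seen = set()
    kept = []
    for row in rows:
        # Normalize so "JANE@acme.com" and "jane@acme.com" count as one lead.
        email = (row.get("Email") or "").strip().lower()
        domain = email.rpartition("@")[2]
        if not email or email in seen or domain in FREE_MAIL_DOMAINS:
            continue
        seen.add(email)
        kept.append(row)
    return kept


# Tiny in-memory stand-in for a downloaded leads export (invented rows).
sample = io.StringIO(
    "Email,Company\n"
    "jane@acme.com,Acme\n"
    "JANE@acme.com,Acme Inc\n"
    "joe@gmail.com,\n"
    "sam@globex.com,Globex\n"
)
remaining = prefilter_leads(csv.DictReader(sample))
print([row["Email"] for row in remaining])
```

On the sample rows this keeps one Acme lead and the Globex lead. Whatever survives a pass like this is the harder pile: defunct domains and questionable company names that a script will not catch.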

So I uploaded my remaining leads as a CSV (a.k.a. my source data). Then I built out my job parameters, wrote instructions so the leads would be cleaned to my liking, entered 50 test questions to ensure quality of work, and launched it. Right off the bat, dozens of contributors poured into my job. Flash forward two hours and I had a cleaned leads database, with thousands of noisy, bad leads removed from my workload and my team’s.

Not only did the whole process take only about three and a half hours, but I did it on the side on a Friday. I cleaned thousands upon thousands of leads, removing the nearly 40% of them that weren’t up to snuff. In other words, that’s thousands of leads I don’t have to chase down and my sales team doesn’t need to worry about. Additionally, we now have a process we can run every six months to maintain a tight, clean CRM. I spent an hour and a half prepping, two hours passively monitoring, and $300ish fixing what every sales org complains about every week. Which, frankly, just made my job a whole lot easier.