By Chris Van Pelt, September 10, 2015

The ever-evolving CrowdFlower tech stack

It was 2008. I was a bright-eyed 26-year-old who was thrilled to leave a stable paycheck for a life of entrepreneurial uncertainty. I had a blank canvas in front of me and a few massive decisions to make about how to build the CrowdFlower tech stack. Oh, but how little did I understand the impact those initial choices would have on the CrowdFlower of 2015.

Given that we just launched our new graphical editor, replacing a piece of CrowdFlower that was originally written in 2008, I thought this was the perfect time to reflect on the terrible choices I’ve made over the last 7 years. Maybe we can all learn something.

Application Servers

MERB

2008 was a pivotal time for the Ruby on Rails community. Rails had been around for a couple years and it had gotten rather bloated. It lacked modularity and a stable plugin API. Quite frankly, it felt a little boring and I was in the “shiny new toy” phase of my life. MERB emerged as a lightweight, edgy, and cool alternative written by dudes named Ezra and Yehuda. The choice was clear.

At the end of 2008 it was announced that MERB would be merged into Rails 3. I knew I was onto something! There were dreams of an easy transition from MERB to Rails, but that never happened. CrowdFlower still serves up a small percentage of its traffic from parts of the legacy MERB app. It’s been a thorn in my side since Rails 3 was launched in 2010.

Clojure

One of the most difficult-to-scale components of the CrowdFlower platform is our reservation system. It’s responsible for assigning tasks to the millions of contributors that sign up looking for extra money.

The system that handles these requests has gone through plenty of iterations. Originally, we just did it with vanilla SQL, then we implemented it using Redis as a backend store. Both were plagued with outages and scaling issues. We needed an enterprise-grade solution. Java is obviously too boring and I love parentheses, so… enter Clojure.

We’re a Ruby shop. Every engineer knows Ruby inside out. Often we dabble with interesting languages on the side, but I was hesitant to introduce a new language to our primary stack. I finally gave in with Clojure, and I really shouldn’t have. We ended up writing not one, but two versions of the reservation system in Clojure. Both were plagued with the same outages and scaling issues that our previous implementations had. A year ago we replaced these systems with an incredibly well-written PL/pgSQL function and some minor adjustments to our schema.

We haven’t had an issue since.
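For readers wondering what "let the database do it" looks like, here is a minimal sketch of the general idea. It is not the PL/pgSQL function mentioned above, and it makes several assumptions: SQLite (from Python's standard library) stands in for Postgres, and the `tasks` table, the `reserved_by` column, and the `make_db` and `reserve_task` helpers are hypothetical names invented for illustration. The point is only that finding a free task and claiming it happen inside one transaction, so two contributors can never be handed the same task.

```python
import sqlite3


def make_db(num_tasks):
    """Create an in-memory database with a hypothetical tasks table."""
    # isolation_level=None puts the connection in autocommit mode,
    # so transactions are controlled explicitly with BEGIN/COMMIT.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute(
        "CREATE TABLE tasks (id INTEGER PRIMARY KEY, reserved_by TEXT)"
    )
    conn.executemany(
        "INSERT INTO tasks (id) VALUES (?)",
        [(i,) for i in range(1, num_tasks + 1)],
    )
    return conn


def reserve_task(conn, contributor):
    """Claim the lowest-numbered free task, or return None if none remain."""
    # BEGIN IMMEDIATE takes the write lock up front, so the SELECT and
    # the UPDATE below cannot interleave with another writer.
    conn.execute("BEGIN IMMEDIATE")
    try:
        row = conn.execute(
            "SELECT id FROM tasks WHERE reserved_by IS NULL "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            conn.execute("COMMIT")
            return None
        conn.execute(
            "UPDATE tasks SET reserved_by = ? WHERE id = ?",
            (contributor, row[0]),
        )
        conn.execute("COMMIT")
        return row[0]
    except Exception:
        conn.execute("ROLLBACK")
        raise
```

A real system at scale would need more than this (row-level locking, expiring reservations, per-contributor limits), but the shape is the same: one round trip, one transaction, and no separate service to keep in sync with the data.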

JavaScript

MooTools

I’m originally from Iowa. Perhaps it was the years seeing cows at pasture that predisposed me to choose a JavaScript framework named MooTools.

Being an early Rails guy, I was weaned on Prototype and Scriptaculous. These were huge libraries and, quite frankly, a little boring. MooTools was way more compact and modular. I had heard of jQuery, but it was still in its infancy.

Today we include both jQuery and MooTools in parts of our frontend (so much for being more compact and modular). I’m revered by the engineering team for making the “$” mean different things in different contexts. At this stage I think I’m the only member of the team who is familiar with MooTools, and there are still tens of thousands of lines of actively used code. Bravo, Chris.

Ember

It was 2012. By now we had Rails, MERB, Clojure, MooTools, and jQuery deployed in production. Single page applications were all the rage, and a sexy new framework written by none other than the MERB Yehuda dude emerged. Enter Ember.js.

There were a couple of problems with us choosing to implement an application in Ember.js. The first was that Ember was still very early: the API was constantly changing, and documentation was sparse. The second was that the application we used it on wasn’t all that well suited to the single page paradigm. In the end the project went way over budget, was rather buggy, and was eventually thrown away.

Databases

MongoDB

I still remember the night I discovered MongoDB. This was it, the database that would finally solve all of our problems. Schema-less, auto-magic sharding, infinite query-ability, robust failover. I welcomed it into our stack with open arms. We eased into it obviously, using it for “non-critical” components of our platform alongside our primary data store Postgres.

We used a very early version of MongoDB, because that’s how I roll. We definitely ran into some hiccups, such as not being able to elect a master in a failover scenario or doing sequential scans just because. We actually used it prior to mainstream ORM development, so I wrote my own and called it Mongood. To this day, CrowdFlower engineers are searching online for Mongood documentation, which only exists in my brain. Version 1.8.0 of MongoDB is still running in our infrastructure. No need to thank me, CrowdFlower Operations team.

Tokyo Tyrant

This one was rather short-lived, but it was such an epically bad idea I had to mention it. What is Tokyo Tyrant, you might ask? Obviously it was the predecessor to Kyoto Tycoon. I believe I introduced Tokyo Tyrant into our stack to query location-based data… It’s all a bit fuzzy, honestly. Thank god it didn’t last.

Conclusions

So what can we learn from my amazing track record of bad architecture decisions?

  1. Popularity matters.

  2. Wait for a “stable” version.

  3. Be deliberate about your choices.

  4. Postgres is capable of more than you imagine.

  5. You’re going to make mistakes; be sure to learn from them.

That’s what we’ve learned. But have I learned? The new graphical editor I mentioned in the beginning is implemented using React, the sexiest JavaScript framework of late, and we’re also about to introduce some AWS Lambda functions written in Java… I guess time will tell.


If you want to come work on thrilling and/or wonky tech like MooTools and MERB, CrowdFlower’s hiring! Drop us a line, sir or madam.