As the photos and judgments are stacking up at FaceStat, I thought it would be interesting to look at relationships between tags – cases where tag X appears very frequently with tag Y. I downloaded more than 4,000 rows of tags, where each row corresponds to one face, and ran a Python script to count concurrences (two tags showing up on the same face).
Some interesting trends (where the first word often occurs with the others):
- Old: dad, wise, jolly, pedophile, grandpa, perv, professor, writer, sour, experienced, dead, matronly, alcoholic
- Smart: business, librarian, azn, graduate, genius, engineer, intent, bookworm
- Oily: wet, sweaty, shiny
- Drunk: drunkard, tipsy, alcoholic, partyboy, scene, stripper, wasted, ditzy, ew
- Young: tween, underage, uninterested, teen, childlike, jailbait, child, tooyoung, highschooler, babyface, kid, boy, innocent, virgin
- Athletic: runner, driven, jock, sporty
- Serious: angry, grumpy, direct, piercing, alert, dedicated, azn, doctor, suave
- Nerd: goodlooking, virgin, slacker, nerdy, goof, geek
- Gay: cowboy, flamboyant, metrosexual, yuppie, feminine, queer, homosexual, out, pissed, dangerous
Other highly concurrent pairs: army & dedicated, fighter & patriotic, skeezy & hairy, mustache & dad, sunglasses & secretive, cougar & milf, naked & creepy, pimp & playa, plastic & fake, tease & sexy, badass & cool.
A note on how I counted concurrence: in this diagram, I would count the tag "tall" as occurring 4 times with the tag "lanky", even though the concurrences are all on one line (that is, on a single face). I also converted all tags to lowercase.
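The counting script itself isn't shown, so here is only a minimal sketch of one way to count like this. It assumes the rows are already loaded as lists of tags (one list per face), and it assumes that every occurrence of one tag pairs with every occurrence of the other on the same row, so a row with "tall" twice and "lanky" twice yields 4. That reading of the "4 times" example is a guess, and the name `count_concurrences` and the sample rows are made up for illustration.

```python
from collections import Counter
from itertools import permutations


def count_concurrences(rows):
    """Count tag totals and ordered tag-pair concurrences.

    rows: iterable of tag lists, one list per face (assumed input shape).
    """
    tag_counts = Counter()
    pair_counts = Counter()
    for row in rows:
        # Lowercase so "Tall" and "tall" are treated as the same tag.
        in_row = Counter(tag.lower() for tag in row)
        tag_counts.update(in_row)
        # Assumption: each occurrence of the first tag pairs with each
        # occurrence of the second tag on the same row.
        for first, second in permutations(in_row, 2):
            pair_counts[(first, second)] += in_row[first] * in_row[second]
    return tag_counts, pair_counts


# Made-up sample rows, not real FaceStat data.
rows = [
    ["Tall", "lanky", "tall", "Lanky"],
    ["tall", "quiet"],
]
tag_counts, pair_counts = count_concurrences(rows)
print(pair_counts[("tall", "lanky")])  # 4
print(tag_counts["tall"])              # 3
```

Pairs are stored in both orders, ("tall", "lanky") and ("lanky", "tall"), because which tag comes first matters once the pairs are scored against the first tag's total.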
I sorted the list of tag pairs by a simple ratio: (number of concurrences of the pair) / (total number of occurrences of the first tag). Sorting this way helped surface the interesting pairs.
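As a rough sketch of that sort, assuming the pair counts and tag totals are already sitting in dictionaries: the function name `rank_pairs`, the `min_first_count` cutoff, and the numbers below are all hypothetical, not taken from the actual script or data.

```python
def rank_pairs(pair_counts, tag_counts, min_first_count=1):
    """Sort (first, second) tag pairs by
    concurrences of the pair / total occurrences of the first tag.
    """
    ranked = []
    for (first, second), together in pair_counts.items():
        total_first = tag_counts[first]
        # Hypothetical cutoff: skip pairs whose first tag is too rare.
        if total_first < min_first_count:
            continue
        ranked.append((first, second, together / total_first))
    # Highest ratio first.
    ranked.sort(key=lambda item: item[2], reverse=True)
    return ranked


# Toy counts for illustration only.
pair_counts = {("oily", "shiny"): 3, ("oily", "wet"): 1, ("old", "dad"): 2}
tag_counts = {"oily": 4, "shiny": 5, "wet": 2, "old": 10, "dad": 6}

for first, second, score in rank_pairs(pair_counts, tag_counts):
    print(f"{first} & {second}: {score:.2f}")
# oily & shiny: 0.75
# oily & wet: 0.25
# old & dad: 0.20
```

Dividing by the first tag's total keeps very common tags from dominating the top of the list just because they appear with everything. The flip side is that a tag used only once or twice scores high with whatever it happens to sit next to, which is why a cutoff like `min_first_count` might be worth having.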