Research & Insights

By Bryan O'Rourke, August 28, 2013

Fantasies of Big Data and Baseball Part 2

In part one of this series on mathematical insights into fantasy sports behavior, we covered how the crowd contributors who participated were distributed by baseball knowledge. Roughly 1,000 participants each fit into one of five categories: lowest knowledge, little knowledge, medium knowledge, higher knowledge and highest knowledge of baseball and fantasy baseball. The little knowledge group contained the most contributors, with just over 350. See the previous week's blog post for the details.

The task I designed first asked each contributor to place themselves into one of these categories of prior knowledge. It then asked them to choose one player for each position on the field, plus two pitchers and a closer, for their fantasy lineup. Each player would earn points for their play over one week, with scoring based on a standard fantasy baseball league scoring system. The top three finishers in the contest were awarded handsome payments as an incentive to choose wisely. After collecting this data, I created a team for each level of prior knowledge. Each team is made up of the players chosen most frequently at each position within that knowledge category.
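For readers who want to try this kind of tally themselves, here is a minimal sketch of picking the most frequently chosen player at each position within each knowledge category. It is not the code used for the contest: the `consensus_teams` function, the `submissions` layout and the sample picks are all hypothetical, and ties are broken by whichever player was seen first, which is an assumption.

```python
from collections import Counter, defaultdict


def consensus_teams(submissions):
    """Return {knowledge_class: {position: most frequently picked player}}."""
    # counts[knowledge_class][position] is a Counter of player picks.
    counts = defaultdict(lambda: defaultdict(Counter))
    for sub in submissions:
        for position, player in sub["picks"].items():
            counts[sub["knowledge"]][position][player] += 1

    # most_common(1) returns the top pick; on a tie, the first player seen wins
    # (an assumption for this sketch, not a rule from the contest).
    return {
        knowledge: {
            position: tally.most_common(1)[0][0]
            for position, tally in by_position.items()
        }
        for knowledge, by_position in counts.items()
    }


# Made-up submissions for illustration only; this is not the contest data.
sample = [
    {"knowledge": "lowest", "picks": {"Catcher": "Y. Molina", "Closer": "J. Johnson"}},
    {"knowledge": "lowest", "picks": {"Catcher": "Y. Molina", "Closer": "M. Rivera"}},
    {"knowledge": "lowest", "picks": {"Catcher": "B. Posey", "Closer": "J. Johnson"}},
    {"knowledge": "highest", "picks": {"Catcher": "B. Posey", "Closer": "M. Rivera"}},
]

teams = consensus_teams(sample)
print(teams["lowest"])
```

A nested counter keeps the whole tally to a single pass over the submissions, which is plenty for a crowd of about 1,000 contributors.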

Here are the players each knowledge class chose most frequently along with how many points they scored:

| Position | Lowest Knowledge | Little Knowledge | Medium Knowledge | Higher Knowledge | Highest Knowledge |
| --- | --- | --- | --- | --- | --- |
| Catcher | Y. Molina (12) | Y. Molina (12) | Y. Molina (12) | Y. Molina (12) | B. Posey (8) |
| 1st Base | A. Craig (39) | A. Craig (39) | P. Fielder (24) | C. Davis (33) | C. Davis (33) |
| 2nd Base | M. Carpenter (18) | M. Carpenter (18) | R. Cano (6) | R. Cano (6) | R. Cano (6) |
| 3rd Base | M. Cabrera (24) | M. Cabrera (24) | M. Cabrera (24) | M. Cabrera (24) | M. Cabrera (24) |
| Shortstop | J. Segura (25) | J. Segura (25) | J. Segura (25) | J. Segura (25) | J. Segura (25) |
| Left Fielder | M. Trout (33) | M. Trout (33) | M. Trout (33) | M. Trout (33) | M. Trout (33) |
| Center Fielder | J. Ellsbury (33) | A. McCutchen (55) | A. McCutchen (55) | A. McCutchen (55) | A. McCutchen (55) |
| Right Fielder | M. Cuddyer (31) | M. Cuddyer (31) | C. Beltran (39) | C. Beltran (39) | J. Bautista (26) |
| Pitcher 1 | C. Kershaw (20) | C. Kershaw (20) | C. Kershaw (20) | C. Kershaw (20) | C. Kershaw (20) |
| Pitcher 2 | M. Scherzer (22) | M. Scherzer (22) | M. Scherzer (22) | M. Scherzer (22) | M. Scherzer (22) |
| Closer | J. Johnson (8) | M. Rivera (10) | M. Rivera (10) | M. Rivera (10) | M. Rivera (10) |

Just like in most fantasy leagues I've played in, there are clear winners at most positions, and everyone knows who those winners are. The table above reflects this: M. Cabrera, J. Segura, M. Trout, C. Kershaw and M. Scherzer were chosen for every team, while Y. Molina, A. McCutchen and M. Rivera were chosen for four of the five teams. As a result, all of the knowledge classes scored relatively close to each other, with team totals from the table ranging from 262 to 289 points.
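To check how close the knowledge classes really were, the point values in the table can be added up per team. The short sketch below does that arithmetic; the numbers are copied from the table, while the variable names and layout are just one illustrative way to organize them.

```python
# Column order matches the table: Lowest, Little, Medium, Higher, Highest.
classes = ["Lowest", "Little", "Medium", "Higher", "Highest"]

# Points scored by the most frequently chosen player at each position.
points = {
    "Catcher":        [12, 12, 12, 12, 8],
    "1st Base":       [39, 39, 24, 33, 33],
    "2nd Base":       [18, 18, 6, 6, 6],
    "3rd Base":       [24, 24, 24, 24, 24],
    "Shortstop":      [25, 25, 25, 25, 25],
    "Left Fielder":   [33, 33, 33, 33, 33],
    "Center Fielder": [33, 55, 55, 55, 55],
    "Right Fielder":  [31, 31, 39, 39, 26],
    "Pitcher 1":      [20, 20, 20, 20, 20],
    "Pitcher 2":      [22, 22, 22, 22, 22],
    "Closer":         [8, 10, 10, 10, 10],
}

# Sum each column to get one team total per knowledge class.
totals = {
    name: sum(row[i] for row in points.values())
    for i, name in enumerate(classes)
}

for name, total in totals.items():
    print(f"{name} Knowledge: {total} points")

# Gap between the best and worst team.
spread = max(totals.values()) - min(totals.values())
print(f"Spread between best and worst team: {spread} points")
```

Because eight of the eleven positions are filled by the same player on at least four teams, most of the difference between totals comes from first base, second base and right field.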

For the final part of this series on crowdsourcing, data and fantasy baseball, we will dive a little deeper into the results of the knowledge classes and what that means for the big picture.

Surprised by who was chosen at each position? Think any players performed unusually well or poorly during this one-week span? Leave your thoughts below in a comment!