Scouting and Scoring: How We Know What We Know About Baseball

Christopher J. Phillips
Princeton University Press
Reviewed by Russ Goodman
Craig Biggio played professional baseball for 20 years, from 1988 to 2007, and hit 668 doubles, the most ever by a right-handed batter. That is 11 more than the 657 hit by Nap Lajoie, who played almost a century before Biggio. In Scouting and Scoring: How We Know What We Know About Baseball, Christopher J. Phillips takes us on a journey to understand just how we know with confidence that Lajoie hit exactly 657 doubles while Biggio hit exactly 11 more. While not explicitly a history of data analysis in baseball, Phillips’ book is an enticing read for baseball data enthusiasts and, more broadly, for those interested in notions such as “fact” and “truth,” in how one measures the seemingly immeasurable, and in attempts to quantify human potential.
Phillips’ book is split roughly equally between “scoring” and “scouting” and effectively connects the two. The first four chapters cover the history of baseball data and attempts to standardize its collection and recording in reliable databases. They lead smoothly into the final three chapters on the history of baseball scouting and player acquisition, scouts’ reliance on data and devotion to bureaucracy, and the notion of quantifying a baseball player’s OFP (Overall Future Potential).
Scouting and Scoring devotes significant time to the origins of baseball data and how games came to be scored in the manner they currently are. While not the inventor of scorekeeping, Henry Chadwick features prominently in the second chapter as a tireless promoter of the “scorer” of a baseball game as someone intimately knowledgeable about the game and also as trustworthy and credible as possible. The intensity of today’s baseball data and analytics owes quite a bit to pioneers like Chadwick, who firmly believed that credible analysis of the game required good data. That idea holds just as strongly in 2020 as it did in 1859, and Phillips does a page-turning job of pulling back the historical curtain on the subject.
The rest of the first four chapters provides a wonderfully nerdy arc through the development of the “art” of scoring a baseball game, where errors, judgments, and consistency are of the highest interest, and through the evolution of just who has been entrusted to score baseball games over the years. Phillips also describes Project Scoresheet, (at least nominally) Bill James’ attempt in the 1980s to gather and publicize Major League Baseball (MLB) play-by-play data for every game played. This all-volunteer effort played a role in the development of modern play-by-play scoring. Consider a player, Jones, who in 1984 batted second in the lineup and came to the plate three times: he grounded out shortstop-to-first leading off the third inning, he flew out to right field with one out and no one on base in the fifth, and he grounded out third-to-first to lead off the eighth inning. This player’s line would read: 2 Jones 3a.63 5b.9 8a.53. This form of encoding has continued to evolve to efficiently record the dynamics of every player and every play. Further, Phillips recounts the curation of play-by-play data via Project Scoresheet, STATS, Retrosheet, Elias, and finally MLB Advanced Media (MLBAM).
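To make the encoding concrete, here is a minimal Python sketch that decodes the single scoresheet line quoted above. It is an illustration only, not Project Scoresheet’s actual format specification: the token shape (inning number, one letter, a dot, then fielder digits) is inferred from this one example, the names parse_line and PlateAppearance are hypothetical, and the letter after the inning is stored without being interpreted. The only outside fact used is the standard scorekeeping numbering of fielding positions (1 = pitcher through 9 = right field).

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

# Standard scorekeeping numbers for the nine fielding positions.
POSITIONS = {
    1: "pitcher", 2: "catcher", 3: "first base",
    4: "second base", 5: "third base", 6: "shortstop",
    7: "left field", 8: "center field", 9: "right field",
}

# Assumed token shape, inferred from the one example in the review:
# inning number, one lowercase letter, a dot, then one or more fielder digits.
TOKEN = re.compile(r"(\d+)([a-z])\.([1-9]+)")


@dataclass
class PlateAppearance:
    inning: int
    slot: str            # letter after the inning, kept as-is and not interpreted
    fielders: List[str]  # fielders who handled the ball, in order


def parse_line(line: str) -> Tuple[int, str, List[PlateAppearance]]:
    """Split a line into lineup spot, player name, and plate appearances."""
    # Assumes a single-word player name, as in the example.
    spot, name, *tokens = line.split()
    appearances = []
    for token in tokens:
        match = TOKEN.fullmatch(token)
        if match is None:
            raise ValueError(f"unrecognized play token: {token!r}")
        inning, slot, digits = match.groups()
        fielders = [POSITIONS[int(d)] for d in digits]
        appearances.append(PlateAppearance(int(inning), slot, fielders))
    return int(spot), name, appearances


if __name__ == "__main__":
    spot, name, appearances = parse_line("2 Jones 3a.63 5b.9 8a.53")
    for pa in appearances:
        print(f"{name} (batting {spot}), inning {pa.inning}: "
              + " to ".join(pa.fielders))
```

The sketch covers only outs recorded as a string of fielder numbers. Hits, walks, baserunner advances, and multi-word names would need a richer grammar than this one example reveals.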
What one could consider a history of scoring and data in baseball provides significantly more context than expected and effectively helps the reader understand the development of baseball data from simple but messy beginnings to the Statcast world in which we live, with metrics such as launch angle, spin rate, BABIP, etc.
The latter half of Phillips’ book is devoted to scouting in baseball. Phillips shares an efficient yet lively history of scouting that helps the reader understand his conclusion that “good scouting meant good paperwork” (147). In the early period of scouting, and perhaps even today, scouts were valued not only for their evaluation and acquisition of talent but also for their bureaucratic abilities. Assessing talent and putting a number (most likely a dollar figure!) on a prospect meant there had to be a paper trail for the ball club to follow in assessing the scout’s success or failure. Regarding the necessity of bureaucracy in scouting, Phillips comments “[t]hat’s just how reliable knowledge usually gets made” (169).
The final two chapters of the book draw out the psychological role in scouting and how scouts’ ability to quantify “the invisible” often made or broke their careers. Phillips makes sure we are clear that the intersection of the subjective judgment of scouts with the objective world of quantitative facts is guided solely by one question: will this prospect succeed at the highest level? The point, then, was never to eliminate judgments but rather to find ways to express them numerically.
The remainder of the book is a fascinating account of MLB clubs striving for efficiency and consistency in judging prospects, especially with the transition from the free-for-all of signing any prospects they could to the introduction of the draft in 1965. The notions of pooled scouting and the use of one number, the OFP, to rank prospects play into this work to this day.
Phillips’ underlying message in all of this is what makes Scouting and Scoring a fascinating must-read: expertise isn’t being replaced by data, nor is data a threat to the (MLB) establishment. Rather, significant effort goes into deciding what counts as truth and data, what questions can be answered by data, and how to make inherently subjective judgments and decisions more reliable.


Russ Goodman is a Professor of Mathematics at Central College in Pella, Iowa. He is also the organizer of the Midwest Sports Analytics Meeting. His mathematical/pedagogical interests are sports analytics, computational commutative algebra, teaching via inquiry-based learning, and mathematics in pop culture. His non-mathematical interests are coaching and watching lots of soccer, running, reading, and spending time with his family.