A New York Times story tells of how Major League Baseball is hiring “loggers” to watch every game in the season and tag each play with descriptive terms – so that the video archive can be more accurately searched. As the story notes:
“Your archive is only as good as what you know is in it,” said Elizabeth Scott, M.L.B. Productions’ vice president for programming and business affairs.
A very good point.
The story gives examples of the kinds of pre-programmed tags that can be attached to plays, ranging from “ground out” to “fist pump” to “hugging,” “animals,” and “drinking.” Makes you wonder what actually happens in a baseball game.
But there’s a method to this madness. The idea, of course, is that (1) without some organized and systematic effort to catalog and describe everything that goes on, large parts of the value of the archive are lost, and (2) it’s easier to do it as things happen than to go back and try to do it after the fact. Both points are very much in sync with the ideas of structured journalism: get it right the first time by rejigging the workflow, and make sure we catalog everything.
This makes even more sense for MLB, which is dealing in this case with video – one of the more difficult things to tag and search for effectively. Building better search optimization isn’t just about getting better traffic via Google; it’s about really increasing the value of the archive. If you can be confident that a search for “A-Rod striking out swinging” gets you all the times that was caught on tape, that’s a lot more valuable than being able to randomly find a couple of examples. That turns the archive from a nice-to-have tool that people stumble on into a research asset that people can use seriously.
Tagging isn’t ideal in general, of course, for lots of reasons, not least the relative structural looseness of tags and the fact that they usually refer to an entire article rather than a specific fact or part of it. In this case, though, it seems like MLB is moving beyond those limitations: first, by specifying a set of acceptable tags (i.e., structuring tags), and second, by essentially microtagging each play in a game. That turns this into something much more like a structured set of information than simply a list of tags attached to a game.
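To make the difference concrete, here is a minimal sketch of what a fixed tag list plus per-play tagging might look like in Python. This is purely illustrative and not a description of MLB’s actual system: the `Play` class, the `find_plays` helper and the game identifiers are all hypothetical, and the tag names are simply the examples quoted in the story.

```python
from dataclasses import dataclass, field

# Hypothetical controlled vocabulary: only these tags are accepted.
# The names are the examples quoted in the story, not MLB's real list.
ALLOWED_TAGS = {"ground out", "fist pump", "hugging", "animals", "drinking"}


@dataclass
class Play:
    """One play in one game: the unit that gets tagged (assumed structure)."""
    game_id: str
    index: int
    tags: set = field(default_factory=set)

    def add_tag(self, tag: str) -> None:
        # Reject anything outside the agreed vocabulary, so tags stay consistent.
        if tag not in ALLOWED_TAGS:
            raise ValueError(f"unknown tag: {tag!r}")
        self.tags.add(tag)


def find_plays(plays, *required):
    """Return every play that carries all of the required tags."""
    wanted = set(required)
    return [p for p in plays if wanted <= p.tags]


plays = [Play("game-1", 1), Play("game-1", 2)]
plays[0].add_tag("ground out")
plays[1].add_tag("ground out")
plays[1].add_tag("fist pump")

# Only the second play has both tags.
matches = find_plays(plays, "ground out", "fist pump")
```

Because the tags hang off individual plays rather than whole games, a search returns the exact moments that match, and because the vocabulary is fixed, a search for a given tag should not miss plays that someone happened to label differently.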
Of course, MLB – and baseball in general – is well-versed in the notion of structured data. It probably collects the most statistics of any sport out there, and has done so with consistency for decades – making detailed comparisons and analysis of trends and performances something of an obsession for any number of fans (and a crushing burden to bear for their non-baseball-loving friends).
So if MLB can do it for baseball games, why can’t newsrooms do it for the stories we turn out? We don’t even need to hire “loggers” to do the work; we already have people in the newsroom who can.