OK, there isn’t any.
And there probably isn’t any sex appeal in structured journalism, either – it’s very nuts-and-bolts, data-structure-related stuff that should be invisible to readers. But, like plumbing, it should be an essential part of the infrastructure.
So it won’t have the wow impact of great visualizations or the gee-whiz factor that a well-designed website exudes.
It’s much more basic – what are the fields we want to capture in any given type of story; how do we categorize stories and elements in a story; how do we extract key information and store it; and so on. But if we get it right – as Matt Waite did with Politifact – what we have is the underlying structure that lets us build stronger and smarter applications on top of it.
That’s provided we design the plumbing well. That means making choices early on, and finding ways to amend them when they aren’t working out. That’s one reason why holding out for a technology-oriented solution to parsing free text is so seductive; we don’t have to make any decisions now, since the technology will figure it out for us eventually. But we do have to make those decisions, and we should.
For plumbing to work, enough people have to know how to use it. That means forcing a large enough newsroom – or a focused-enough small newsroom – to pick one structure and stick to it, at least until they see it’s not working well. And in an ideal world, other allied newsrooms would also use the same structure, so the two sets of databases can talk to each other.
So let’s say we were building a relationship database, and decided to capture people’s names, affiliations and connections in a particular format. And say the Washington Post wanted to do the same thing, via their WhoRunsGov site. If the SCMP has a great database on Chinese movers and shakers, and the WP has one on the US government, the smart thing is for both of us to share our databases and really increase the value of the shared resource.
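To make that concrete, here is a minimal sketch of what an agreed format might look like. Everything in it is hypothetical: the Person and Connection records, their field names and the merge_people helper are illustrations, not anything the SCMP or WhoRunsGov actually uses. It also matches people by exact name, which a real system could not get away with.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Person:
    # Hypothetical shared fields: a name plus the organisations a person belongs to.
    name: str
    affiliations: Tuple[str, ...] = ()


@dataclass(frozen=True)
class Connection:
    # Hypothetical link between two people, e.g. kind="colleague".
    source: str
    target: str
    kind: str


def merge_people(ours: List[Person], theirs: List[Person]) -> List[Person]:
    """Combine two newsrooms' records, unioning affiliations for matching names."""
    merged = {}
    for person in list(ours) + list(theirs):
        seen = merged.get(person.name, ())
        # dict.fromkeys drops duplicates while keeping first-seen order.
        merged[person.name] = tuple(dict.fromkeys(seen + person.affiliations))
    return [Person(name, affs) for name, affs in merged.items()]
```

The merge is only this simple because both sides filled in the same fields in the same way. If one newsroom stored affiliations as free text and the other as a list, someone would have to write a translation layer first.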
To do that, though, we’d have to agree on the underlying data structure (not to mention a whole bunch of other things). If we could agree, that would unlock a lot of potential value. We wouldn’t even have to be wedded for life; it’s probably possible for both sides to track their contributions to the dataset and pull them out if the relationship broke up. (It would be ugly, and nasty, and much of the database would get wrecked. But that’s like many breakups.) And there may be arguments about whether each side is pulling its weight in contributing information to the database. A bit like fights over whose turn it is to take the garbage out.
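Tracking contributions could be as simple as stamping every record with who supplied it. The sketch below assumes exactly that; the Fact record, its contributor field and the withdraw function are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Fact:
    subject: str
    claim: str
    contributor: str  # hypothetical provenance tag: which newsroom supplied this


def withdraw(shared: List[Fact], contributor: str) -> Tuple[List[Fact], List[Fact]]:
    """Split a shared dataset into (what stays, what the departing side takes)."""
    remaining = [f for f in shared if f.contributor != contributor]
    withdrawn = [f for f in shared if f.contributor == contributor]
    return remaining, withdrawn
```

This only handles the clean case, where each record has a single owner. Records that both sides have edited or enriched are where the breakup gets ugly.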
But even without a marriage, shared data structures would make it easier for both sides to talk and share information, even on an ad-hoc basis.
Everyone will want their own structure, of course. But compromise – or the imposition of an agreed standard by a consortium/group – could help goose value and cooperation.
That, like shared railway gauges, might determine who allies with whom and what they get out of it.