A nice, Jeff Jarvis-like title that conjures up images of goose-stepping Nazis trampling on internet freedoms (or enforcing them – whatever). But this post isn’t about powers of the state, or freedom to publish, or anything like that. It’s about how databases are really only valuable when they’re complete, covering a universe totally, and up to date. But it does start with Jeff and Germany.
Last month, Jeff took Germany to task for pushing Google to allow homeowners to opt out of the company’s Street View photos of their homes. Those who don’t want photos of their buildings shown can ask for them to be pixelated, and Jeff’s not happy about it.
“It is more offensive than I had imagined, a desecration of the public demanded and abetted by German politicians and media on a supposed privacy frenzy.”
His point is that you can see the building on the street, and you can even take a photo of it for your album, but it can’t be shown in Google Street View. Germany, he says, “has stolen from the public.”
“This is an issue of publicness. These are public visions now obscured. This is why I am writing a book about protecting the public, from assaults such as this. I can’t write it fast enough.”
Needless to say, this got a lot of discussion going – a lot of it very intelligent and nuanced, about the difference between private and public ownership of information, about availability of information and ease of use and access, and the looming issues of public and private privacy.
A couple of days ago Ron Rosenbaum jumped into the fray with a scathing criticism of Jeff’s argument. I don’t want to get into the middle of that catfight (and in any case, it looks like there’s a growing cottage industry in that business, including this earlier attack), but he makes an interesting point about the principles behind a product like Street View:
In the case of Germany, a global corporation is—or was, before the opt-out was allowed—trying to monetize an individual’s privacy, a monetization that is worth more if the company can claim absolute total, or totalitarian completeness.
I’m not sure it’s all that sinister – but it’s certainly true that “completeness” adds a great deal of value to any database. In fact, I could make the case that it’s fairly basic to creating any real value: Google Maps wouldn’t be worth much if you couldn’t be sure the maps were reasonably complete. True, there are countries it doesn’t cover – but you know which ones, so those are universes that aren’t included in the “completeness” you expect. What you do expect is not to come across streets that Google left out because of some random glitch.
And so it is for any newsroom database project – better to be smaller, and more complete, than larger and with unpredictable gaps in it. In other words, if you have one government department’s budget in detail, and with no omissions, it’s much better to build something based on that than to reach for the entire government budget and not be sure what is and isn’t in it. At least if you’re trying to charge for it.
Similarly, a database is only good as long as the information in it is up to date. Otherwise, it’s functionally incomplete, with the equivalent of random gaps in it. Or like a Google Map that hasn’t been refreshed after major road works.
So this may seem self-evident, but it bears repeating. Especially since it’s hard work to maintain databases over time, and even more so after prize season is over. Unless newsrooms get drafted in to keep inputting information – even if it’s just a couple of minutes a day – it’s hard for any dedicated research or projects team to keep it going for any length of time.
True, you can build databases that just scrape or pull information from other, public sources. But there’s limited monetary value in that in the long run; you can’t build a competitive advantage off what someone else can do just as easily. I believe you need your own “secret sauce” to give products that edge, and usually that means some human involvement (although, presumably, it could also be some proprietary algorithm).
Which is another way of saying – figuring out, in advance, how to keep a database going is probably as important as building it in the first place. And that means figuring out how to define the universe it covers so that the database is complete, and can stay that way.
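For the more technically inclined, here is one rough way to make that concrete. It is only a sketch under my own assumptions, not a description of how any particular newsroom does it: I assume the "universe" can be written down as a list of keys the database promises to cover, and that every entry records the date it was last updated. The function name, the key names, the dates and the 30-day threshold are all made up for illustration.

```python
from datetime import date, timedelta


def audit_coverage(universe, records, today, max_age_days=30):
    """Compare a database against the universe it claims to cover.

    universe: set of keys the database promises to include (hypothetical IDs)
    records:  dict mapping each key actually present to its last-updated date
    Returns (missing, stale): keys with no entry, and keys whose entry is
    older than max_age_days as of `today`.
    """
    present = set(records)
    missing = sorted(universe - present)
    cutoff = today - timedelta(days=max_age_days)
    stale = sorted(k for k in universe & present if records[k] < cutoff)
    return missing, stale


# Illustrative data only: one department's budget lines as the whole universe.
universe = {"dept-a/salaries", "dept-a/travel", "dept-a/equipment"}
records = {
    "dept-a/salaries": date(2010, 11, 1),
    "dept-a/travel": date(2010, 6, 1),
}

missing, stale = audit_coverage(universe, records, today=date(2010, 11, 15))
print("missing:", missing)  # keys in the universe with no entry at all
print("stale:", stale)      # keys whose entry has not been refreshed recently
```

The design choice that matters is the explicit universe: a gap can only be detected if you have written down what "complete" means, and a stale entry is treated as just another kind of gap.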
Otherwise it may be a fine database or site to explore – and may even yield great stories – but it wouldn’t be something you’d want to depend on to navigate from one part of town to the other, and that’s critical if you want to charge people real money for it.