I’ve been meaning for some time to write about China Vitae, an interesting site that offers biographies, tracks travels and appearances, and best of all, allows for explorations of the intersections of careers of senior Chinese officials. According to the site, it holds biographies for “4000 Chinese leaders in government, politics, the military, education, business, and the media,” and tracks travels and appearances of 300.
I can’t speak to the accuracy of the data, which is contributed by Wen Wei Publishing group – it certainly looks regularly updated and official. But the most interesting thing about the site is how the data appears to be structured, and as a result, how it can be accessed. It’s not all the way to a deep implementation of some of the ideas of structured journalism, but it’s a very nice way station.
The site allows you to call up biographies, both in text form and as a list, much like a CV. Now that’s not particularly innovative. What’s more interesting is that as you view biographies, checkboxes appear on the right against the names of the people you’ve called up. Check the ones you want to compare, and the site will call up what those people have in common as far as positions, locations or institutions go (e.g., they all worked in Chengdu at some point).
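To illustrate how that kind of comparison falls out of structured records, here is a minimal sketch in Python. It is not how China Vitae is built, which I haven't seen; the `Posting` record, its field names, the `shared_values` helper and the sample rows are all made up for the example. The idea is simply that if each posting is a row with discrete fields, "what do these people have in common" becomes a set intersection.

```python
from collections import namedtuple

# Hypothetical record layout: one row per posting in a person's career.
Posting = namedtuple("Posting", ["person", "institution", "role", "location"])

# Made-up sample data, not taken from the site.
postings = [
    Posting("Official A", "Institution X", "Deputy Director", "Chengdu"),
    Posting("Official A", "Institution Y", "Director", "Beijing"),
    Posting("Official B", "Institution Z", "Secretary", "Chengdu"),
    Posting("Official B", "Institution Y", "Deputy Director", "Shanghai"),
    Posting("Official C", "Institution X", "Director", "Chengdu"),
]


def shared_values(postings, people, field):
    """Return the values of `field` that every selected person has in common."""
    per_person = [
        {getattr(p, field) for p in postings if p.person == name}
        for name in people
    ]
    # Intersect the per-person sets; an empty selection shares nothing.
    return set.intersection(*per_person) if per_person else set()


everyone = ["Official A", "Official B", "Official C"]
print(shared_values(postings, everyone, "location"))
print(shared_values(postings, ["Official A", "Official B"], "institution"))
```

With these invented rows, the first call surfaces Chengdu as the one location all three share, and the second surfaces the one institution A and B both passed through. Nothing had to be parsed out of prose to get there.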
That’s adding context – and value – by using the structure inherent in a database, rather than relying on tagging or other ways of parsing text. (Not that the same thing couldn’t technically be done with tagged text; I just haven’t seen much of it, and I think it’s easier to work with databases, for a host of reasons I’ve outlined here.)
Now, China Vitae isn’t perfect by a long shot. The comparisons aren’t presented in a visually interesting way to encourage more exploration or engagement; they’re set out simply as lists. There are sites that are playing with more visual ways to portray information, such as muckety and silobreaker; those are also works-in-progress, but visually much more immersive. That said, China Vitae could probably bolt on a visual front-end without much trouble.
More of a question is the nuts and bolts of their data structure. Without looking under the hood, I’m just guessing, of course, but it seems like the site organizes information in terms of names, institutions, roles/positions, locations and dates, all of which makes sense. But the date structure doesn’t seem to be as flexible as it should be, to allow for matching periods of time across different records. In other words, if official A was based in Chengdu from 1986 to 1992, official B was there from 1991 to 1996, and official C was there from 2000 to 2005, the system ought to identify A and B as more closely related to each other than either is to C; C’s time in Chengdu, while still something he shares with them, isn’t as significant because he wasn’t there at the same time. Or at least that’s the theory.
Right now, however, calling for a comparison of the three will simply surface all those dates. Perhaps the data isn’t organized in a format that would allow for more detailed parsing; perhaps it isn’t important that A and B were there at the same time; perhaps I’ve missed something on the site.
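To make the overlap idea concrete, here is a rough sketch in Python of what more flexible date handling might look like. It assumes, and this is purely my assumption, that each posting is stored with an explicit start year and end year rather than as free text; the `Stint` record and the `overlap_years` and `rank_pairs` helpers are hypothetical names, and I count years inclusively, so 1991 to 1992 counts as two shared years.

```python
from itertools import combinations
from typing import NamedTuple


class Stint(NamedTuple):
    """Hypothetical record: one person's posting in one location."""
    person: str
    location: str
    start: int  # first year of the posting
    end: int    # last year of the posting, inclusive


def overlap_years(a, b):
    """Calendar years covered by both stints; 0 if they never coincide."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


def rank_pairs(stints):
    """Pair up people who shared a location, strongest overlap first."""
    pairs = [
        (overlap_years(a, b), a.person, b.person)
        for a, b in combinations(stints, 2)
        if a.location == b.location
    ]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# The three officials from the example above.
stints = [
    Stint("A", "Chengdu", 1986, 1992),
    Stint("B", "Chengdu", 1991, 1996),
    Stint("C", "Chengdu", 2000, 2005),
]

for years, first, second in rank_pairs(stints):
    print(f"{first} and {second}: {years} overlapping year(s) in Chengdu")
```

Run on the example, A and B come out on top with two shared years, while both pairings involving C score zero. All three still share Chengdu, but the ranking separates "were there together" from "were there at some point", which is the distinction a plain list of dates doesn't make.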
My broader point is that, as we build out data structures for collecting information, we need not only to focus on processes for collecting and updating that information, but also on the likely products and applications that we’ll want to build, so that we collect data in the most useful – and valuable – way, while obviously not burdening ourselves with too much work for the sake of it.
But there’s a mine of stuff here, and it also offers some lessons in terms of how sites like whorunsgov could ramp up their value.