Great Minds, Etc…

An oldie but goodie: what may be the granddaddy of the thinking on SJ. It’s an extremely good post from 2006 by Adrian Holovaty about how SJ isn’t in conflict with the notion of writing stories; it simply helps extract more value from those stories down the road.

A critical point he notes is simply the logistical difficulty of storing information in any structured form, and the need for CMSs and other technological fixes to make this easy.  As I mentioned in another post, one big step would be for Google or WordPress (is anyone listening?) to build out blogging software that would allow for easier input of standardized, structured information – and, to give people the instant feedback we all crave, some way of visualizing the aggregate data from everyone else.


  1. OpenCalais is worth looking into. Check this URL, paste any piece of content in there, and see how it extracts important information from the “blob of text”.

    • It’s good stuff. At Dow Jones there was also software (originally called Generate) designed to extract information about entities, and especially contact information, and I know the NYT has a high-powered engine (FAST ESP) as well. All these work well when they work well, and I suspect they’ll get better and better over time. I do think, however, that we need two more things. First, we need to go beyond identification and tagging and move information more actively into structured data fields so we can really work on it. Second, we need to get journalists to input more of what’s in their brains and notebooks that doesn’t get printed in stories (and hence can’t be accessed by engines like OpenCalais).
      I wonder if there’s an architecture that can allow us to, in real time, scan and tag and move into a database information from text, and simultaneously perform data search queries on it. In effect, building databases on the fly as we need them.

      • Yes, that’s possible. OpenCalais is a web service, so you can write your own script to parse the response you get from the service and then save it to a database.

  2. Ah I like that site (opencalais), works much better than merlin… tried putting what you replied into it and comes out:

    Technology Internet 96%

    Social Tags:
    Technology Internet
    Database theory
    Database management systems
    Data management

    hmmm, I didn’t realize that I am working with a mathematician…
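
To make the scripting idea from the first comment thread a bit more concrete, here is a rough sketch of the "parse the response, save it to a database, query it" loop. It does not call OpenCalais itself: the JSON shape, the field names (`entities`, `type`, `name`, `relevance`), the sample values, and the helper names `store_entities` and `entities_of_type` are all assumptions made up for illustration, and the real service’s response format may look quite different.

```python
import json
import sqlite3

# Made-up response standing in for what a tagging service might return.
# The structure and the relevance scores are illustrative assumptions only.
SAMPLE_RESPONSE = json.dumps({
    "entities": [
        {"type": "Company", "name": "Dow Jones", "relevance": 0.8},
        {"type": "Company", "name": "Google", "relevance": 0.6},
        {"type": "Technology", "name": "OpenCalais", "relevance": 0.9},
    ]
})


def store_entities(conn, doc_id, response_text):
    """Parse a tagging response and insert one row per extracted entity."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entities "
        "(doc_id TEXT, type TEXT, name TEXT, relevance REAL)"
    )
    entities = json.loads(response_text).get("entities", [])
    conn.executemany(
        "INSERT INTO entities VALUES (?, ?, ?, ?)",
        [(doc_id, e["type"], e["name"], e["relevance"]) for e in entities],
    )
    conn.commit()
    return len(entities)


def entities_of_type(conn, entity_type):
    """Return entity names of one type, most relevant first."""
    rows = conn.execute(
        "SELECT name FROM entities WHERE type = ? ORDER BY relevance DESC",
        (entity_type,),
    )
    return [name for (name,) in rows]


# An in-memory database is about as "on the fly" as it gets.
conn = sqlite3.connect(":memory:")
stored = store_entities(conn, "post-1", SAMPLE_RESPONSE)
companies = entities_of_type(conn, "Company")
```

The point of the sketch is only that once tagged entities land in ordinary table rows, the same text can be queried like any other data as soon as it has been scanned.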

