It’s a question that came up as I looked over an old NYT blog post on a story the paper had done about machine-generated stories; specifically, a profile of Narrative Science, a start-up that takes data and turns out stories from it. (I wrote about that piece late last year as well.) The NYT blog post riffs not just on the value of turning out sports stories, as Narrative Science is best known for doing, but more broadly on how much easier it can be for the average person to understand prose than spreadsheets.
For people who value words, the scientists at Narrative Science have good news. Up to now, the computer tools for helping people make sense of data have mostly been on-screen dashboards that distill mounds of information into graphs or symbols resembling traffic lights — green is good, red is bad.
“Story is a much more accessible medium,” said Kristian Hammond, chief technology officer of Narrative Science and a professor at Northwestern University. “It expresses what’s most important and expresses it first, and in words so it’s understandable to humans.”
“The narrative rules in terms of cognition,” Mr. Hammond said.
And up to a point, that’s true. Narrative is a great way of imparting information. I was talking to some very smart computer scientists the other day, and they framed nicely how they saw the main uses of language-generation engines like Narrative Science: The first is to turn out simple – and not-so-simple – stories (by itself a development that will likely have a profound impact on journalism). Another, as the NYT blog post suggests, is to make data easier to digest by turning numbers into words. And yet another use is to generate prose reports about that data: What’s the biggest number, what’s the smallest year-on-year increase, and so on. There’s no question there can be value in all these applications – especially if they don’t cost a lot to create and can be done at scale.
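To make that last use a little more concrete, here is a minimal sketch in Python of what a bare-bones "prose report" might look like. It is emphatically not how Narrative Science does it; the function name `describe_series` and the sales figures are made up for illustration, and a real engine would do far more than fill in a template.

```python
def describe_series(name, values_by_year):
    """Return a short prose report for a yearly series (hypothetical helper)."""
    years = sorted(values_by_year)
    if len(years) < 2:
        raise ValueError("need at least two years to compare")

    # Biggest number in the series.
    peak_year = max(years, key=lambda y: values_by_year[y])

    # Year-on-year changes, keyed by the later year of each pair.
    changes = {
        later: values_by_year[later] - values_by_year[earlier]
        for earlier, later in zip(years, years[1:])
    }

    # Smallest increase: only consider years in which the figure actually rose.
    increases = {year: change for year, change in changes.items() if change > 0}

    sentences = [f"{name} peaked at {values_by_year[peak_year]:,} in {peak_year}."]
    if increases:
        slow_year = min(increases, key=increases.get)
        sentences.append(
            f"The smallest year-on-year increase was {increases[slow_year]:,}, "
            f"in {slow_year}."
        )
    else:
        sentences.append(f"{name} did not rise in any year.")
    return " ".join(sentences)


# Made-up figures, purely for illustration.
sales = {2009: 120, 2010: 150, 2011: 155, 2012: 140}
report = describe_series("Sales", sales)
print(report)
```

Even a toy like this shows the appeal: the reader gets the headline finding first, in words, without having to scan the column of numbers.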
But other information formats have their advantages too: Video can convey emotion much more strongly than text when done well. Data visualization allows for tremendous amounts of interaction and exploration. (And, as I learned when I started out in broadcasting, “the pictures are much better on radio.”) The real question is how best to use each type of format for the story/information you have.
There are lots of great visualizations out there that convey information that could never be well-told in text; the groundbreaking theyrule is just one example, but of course there are many more. And there’s beauty in spreadsheets, too (I can’t believe I said that); I had a business-side colleague at the Journal who could glance over an A3-sized spreadsheet in 8-point font and immediately home in on an errant number. He didn’t need a visualization, or an auto-generated text story, to help him understand the message the numbers were conveying. (In fact, they would probably have gotten in the way.)
Which is another way of saying that we shouldn’t necessarily favor one information format over any other; they all have advantages and disadvantages – and, in any case, people’s capacity to understand information formats and “grammars” evolves over time. (Contrast the style of any 1950s film with the pace of a modern, MTV-inspired movie.) As visualizations become more commonplace, people will learn to use them as instinctively as children now swipe their fingers over any screen they see.
The real leap forward here – assuming machine story generation and natural-language processing capabilities progress apace – is the increasing ease of converting from one format to another, and our ability to present information in multiple formats for multiple audiences and purposes.