Posted by: structureofnews | February 1, 2012

Words, Pictures, Numbers

If a picture’s worth a thousand words, what’s a data visualization worth?  Or a spreadsheet?

It’s a question that came up as I looked over an old NYT blog post on a story the paper had done about machine-generated stories; specifically, a profile of Narrative Science, a start-up that takes data and turns out stories from it. (I wrote about that piece late last year as well.) The NYT blog post riffs about the value not just of turning out sports stories, as Narrative Science is best known for doing, but more broadly on how much easier it can be for the average person to understand prose rather than spreadsheets.

For people who value words, the scientists at Narrative Science have good news. Up to now, the computer tools for helping people make sense of data have mostly been on-screen dashboards that distill mounds of information into graphs or symbols resembling traffic lights — green is good, red is bad.

“Story is a much more accessible medium,” said Kristian Hammond, chief technology officer of Narrative Science and a professor at Northwestern University. “It expresses what’s most important and expresses it first, and in words so it’s understandable to humans.”

“The narrative rules in terms of cognition,” Mr. Hammond said.

And up to a point, that’s true.  Narrative is a great way of imparting information. I was talking to some very smart computer scientists the other day, and they framed nicely how they saw the main uses of language-generation engines like Narrative Science: The first is simply to turn out simple – and not-so-simple – stories (by itself a development that will likely have a profound impact on journalism).  Another is to – as the NYT blog post suggests – help make it easier to digest data by turning numbers into words.  And yet another use is to generate prose reports about that data: What’s the biggest number, smallest increase year-on-year, etc. There’s no question there can be value in all these applications – especially if they don’t cost a lot to create and can be done at scale.
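To make that third use concrete, here is a minimal sketch of a template-based prose report, written in Python. It is only an illustration of the general idea, not how Narrative Science or any real engine works; the function name `describe_series` and the sales figures are invented for the example, and it assumes whole-number values for at least two years.

```python
def describe_series(label, values_by_year):
    """Return a short prose summary of a {year: value} mapping.

    Hypothetical helper for illustration; assumes integer values
    and at least two years of data.
    """
    years = sorted(values_by_year)
    # The year with the biggest number.
    peak_year = max(years, key=lambda y: values_by_year[y])
    # Year-on-year change for each consecutive pair of years.
    changes = {
        later: values_by_year[later] - values_by_year[earlier]
        for earlier, later in zip(years, years[1:])
    }
    # The year with the smallest change from the year before.
    smallest_year = min(changes, key=lambda y: changes[y])
    return (
        f"{label} peaked at {values_by_year[peak_year]} in {peak_year}. "
        f"The smallest year-on-year change was "
        f"{changes[smallest_year]:+d}, in {smallest_year}."
    )


sales = {2009: 120, 2010: 135, 2011: 138}  # made-up figures
summary = describe_series("Sales", sales)
print(summary)
```

Even a toy like this shows the trade-off: the sentence puts the most important facts first, but it says nothing about anything the template wasn't written to look for.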

But other information formats have their advantages too: Video, when done well, can convey emotion much more strongly than text.  Data visualization allows for tremendous amounts of interaction and exploration.  (And, as I learned when I started out in broadcasting, “the pictures are much better on radio.”) The real question is how best to use each type of format for the story/information you have.

There are lots of great visualizations out there that convey information that could never be well-told in text; the groundbreaking theyrule is just one example, but of course there are many more.  And there’s beauty in spreadsheets, too (I can’t believe I said that); I had a business-side colleague at the Journal who could glance over an A3-sized spreadsheet in 8-point font and immediately home in on an errant number.  He didn’t need a visualization, or an auto-generated text story, to help him understand the message the numbers were conveying.  (In fact, they would probably have gotten in the way.)

Which is another way of saying that we shouldn’t necessarily favor one information format over any other; they all have advantages and disadvantages – and, in any case, people’s capacity to understand information formats and “grammars” evolves over time.  (Contrast the style of any 1950s film with the pace of a modern, MTV-inspired movie.)  As visualizations become more commonplace, people will learn to use them as instinctively as children now swipe their fingers over any screen they see.

The real leap forward here – assuming machine story generation and natural-language processing capabilities progress apace – is the increasing ease of converting from one format to another, and our ability to present information in multiple formats for multiple audiences and purposes.


  1. Hmm, perhaps you could one day produce a publication by pooling/ranking together what’s being said on different social media networks?

    My beef about traditional writing, versus data visualisation, is that in the former, you typically try to convey a general idea based on a limited number of actors, particular cases — whereas in the latter, it is possible to literally make the entire dataset talk, using the general to talk about… the general. I’ve got complaints on this point of view (from my features-writing friends) that a traditional story is perhaps easier to relate to. But in my eyes, it’s maybe because the well-designed data visualisation, with rich, dense data (as Tufte wrote/illustrated), is not widespread enough in today’s ever-transitioning old media.

  2. Cedric,

    There’s actually some nice stuff in Kahneman’s book (Thinking, Fast and Slow) about how we understand and use general and particular information; specific examples really do nail things in our memory, and that’s why traditional narratives, with story characters, make us remember ideas well. On the other hand, a narrative is not a great way of helping you explore a broader issue and all the nuances of it, and that’s where a more interactive data visualization or simulation may be better.


