Posted by: structureofnews | September 26, 2012

A Question of Trust

I was talking – more precisely, listening – to some smart colleagues discussing data visualization the other day, and something one of them said struck me.  He was talking about how visualizations could help users cut through complexity and a flood of information – but said that depended on a level of trust.  Not just trust in the quality of the data being visualized, but in how it's selected, analyzed and ultimately presented.

Perhaps that’s self-evident – and to some degree it is – but it’s an important issue that needs to be addressed as data becomes more and more part of the daily work of journalism.

We're very used to the notion of disclosing – where we know them – the biases of the people we interview and quote.  Readers, we hope, have a fairly well-attuned sense of which statements seem right, which are self-serving, and which are pure hyperbole.  (OK, so maybe I'm expecting a lot here – but certainly these are skills that we use in everyday life, whether sizing up the used-car salesman or figuring out if your teenaged son really did his homework.)

Data is different.  We're much less used to examining the fundamental biases of the data we work with.  True, good data journalists – and there are a lot of them out there – do this as a matter of course, filtering out a lot of the bad stuff before it ever makes it into a story.  But readers as a whole have less experience questioning how data is collected and assembled, and what basic assumptions go into building databases.

As Susan McGregor, a professor at Columbia University's Tow Center for Digital Journalism, notes in a video interview:

Using data is often like using responses to an interview that someone else wrote.  You don't necessarily know what the biases or objectives were that went into collecting a certain set of data.  It's up to you as the journalist to research that, find out what the implications are, why these particular questions were asked, what the answers really mean.

To say that a number is "true" is no different from saying that a quote is "true" – and just as we know not to quote people out of context, we shouldn't use numbers out of context either.

(The video is from a series that NPR’s Lam Thuy Vo made for a course she’s teaching at UMass Amherst; there’s one of me as well.  I’m sure my mother will watch it, doubling the page views.)

Here's a real example: in a very smart story, Sasha Chavkin at CJR takes apart the contradictory numbers about ad spending in this political season.  It isn't an investigative piece in the sense of uncovering wrongdoing, but it does dissect in detail how the data on ad buys is collected, and it shows how journalists often don't look beyond the headline number.

Which is a longish way of saying there are two somewhat contradictory forces at play here.  At one level – re my colleague's comment – as we turn more and more to algorithms and visualizations to help us understand the world, we need to be able to trust that their inner workings make sense and aren't biased (or broken).  But, per Susan's comments, it's also important that we keep a reasonably high level of skepticism about the data we're using, and that we work to educate readers about where the flaws in the numbers are, in the same way that we point out the biases of the people we interview.

Perhaps that lowers the overall faith in the stories we create – but in the longer term, one hopes, it gives people greater confidence in, and more understanding of, the data that increasingly pervades our lives.


Responses

  1. A question for you. What do you think about the need to explain the methodology of a data project? I know that some data people (myself included, sometimes) feel the need to explain their method, to appeal to a need for transparency and reproducibility. Such a how-to guide can, at the extreme, read almost like a scientific paper…

    Say, in a data project, I may feel the need to explain how I do the scraping, etc. But methodology in real life is often omitted, perhaps for clarity or just plain lack of time. And in traditional reporting, I don't really see journalists delve into meta stuff…

    • Cedric, good question. I think it's an evolving area – meaning that as people get familiar with certain practices and concepts, it becomes less and less important to note how something was done. No one needs to know how you managed to get someone on the phone for an interview, for example; but they might want to know how, and under what conditions, you got a terabyte of otherwise-exclusive data.

      But I suspect the real place where there needs to be disclosure is in the area of data quality – what does it actually measure, what are its limitations, etc. – and in the algorithms we write (or borrow) to assert a statement – i.e., "our analysis shows this bank is the one most likely to fail."

      Whether that goes into the story or into an explainer box is really a matter of style, as is how much detail goes into it. Do we want our work to be reproducible? Should we?

      Reg

  2. […] So what can we do to combat unconscious bias? It’s tough.  We can take the IATs until we’re blue in the face, but it won’t change very much.  Still, it’s important to know what our biases are; awareness is a good first step, and helps us be more conscious about compensating for our biases. We can get out more, and talk to, and get to know, a broader range of people; the more women we see as virtuoso musicians, the weaker the mindbug that virtuoso = male becomes. We can depend less on human testimony – which is often flawed anyway – and expand out to look at documents and data – also flawed in their own way, but at least in different ways. […]

