Belatedly – very belatedly – I just wanted to point out Andreas Graefe’s Guide to Automated Journalism, a report published a month ago for Columbia University’s Tow Center for Digital Journalism.
It’s a smart summary (full disclosure: I’m quoted in it) of what’s going on in the field, and it flags a couple of key questions that the industry will have to grapple with as automations – or machine-generated stories – play a bigger role in journalism.
As Andreas notes in the executive summary:
Automated journalism will substantially increase the amount of available news, which will further increase people’s burden to find content that is most relevant to them.
An increase in automated—and, in particular, personalized—news is likely to reemphasize concerns about potential fragmentation of public opinion.
Little is known about potential implications for democracy if algorithms are to take over part of journalism’s role as a watchdog for government.
All true, and all important questions. Which is why, in many ways, the real path forward for automation is – as with all disruptive innovations – to start in places that existing journalism doesn’t serve well, or doesn’t serve at the scale it should.
Andreas points out, for example, that automated journalism is highly dependent on the quality and structure of data, which is one reason it’s flourished in the world of finance and sports, and that data just isn’t as readily available (or accurate) elsewhere. True enough – but what then about journalist-collected or -created data, built from their daily reporting? That’s essentially what PolitiFact and Homicide Watch (and to some extent, Connected China) did, and in the process introduced – in a limited way – machine-generated content to new fields.
Likewise, while it’s true that machine-generated stories aren’t the most compelling narratives out there – and hence don’t compete well against human-written prose – it’s also true that there are tons of undercovered areas where human stories are few and far between.
Similarly, while human journalists have traditionally only covered earthquakes that exceeded a certain magnitude or left significant damage, (the LA Times’) Quakebot provides comprehensive coverage of all earthquakes detected by seismographic sensors in Southern California.
And covering all earthquakes – like covering all homicides in DC – can be a real public service. Yes, that does add to the cacophony of stories that readers have to wade through to find what interests them; but on the other hand, it can certainly also be argued that the old model of grizzled editors deciding what was newsworthy (and what wasn’t) also left huge swathes of their communities under-informed.
There are clearly issues that, as a community, all readers should be informed about and ideally debate; but there are also lots of issues where individual information is more important – how my child’s school is performing, what’s happening on my street, how my congressman is voting, how my investments are faring (not to mention, is there an earthquake near me). And that’s where personalized news-on-demand can help fill a core public information need. It doesn’t have to be automated, of course, but then again that’s the only way such stories (or visualizations, or whatever) can be created at scale and moderate cost.
That’s where, as Andreas also notes, automated journalism can really grow. And that includes creating stories with different angles or tones to suit different readers, and also stories that allow for hypothetical scenarios.
Algorithms could also answer what-if scenarios, such as how well a portfolio would have performed if a trader had bought stock X as compared to stock Y. While algorithms for generating news on demand are currently not yet available, they will likely be the future of automated journalism.
At least, that’s one of the likely directions for automated journalism in the near future – beyond the obvious uses for speed (which news organizations like Bloomberg and Reuters already deploy extensively) and scale (à la the AP’s move into automated company earnings stories).
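To make the what-if idea from the report concrete, here is a minimal sketch of the arithmetic behind it, plus a one-sentence story built from the result. Everything in it is my own illustration, not anything from the report or from a real system: the function names, the (buy price, current price) input format, and the decision to ignore dividends, fees and taxes are all assumptions.

```python
def final_value(amount_invested, buy_price, current_price):
    """Value today of `amount_invested` put into a stock at `buy_price`.

    Assumption: no dividends, fees or taxes; fractional shares allowed.
    """
    shares = amount_invested / buy_price
    return shares * current_price


def what_if_sentence(amount, name_x, prices_x, name_y, prices_y):
    """Compare two hypothetical purchases and describe the outcome.

    `prices_x` and `prices_y` are (buy_price, current_price) pairs,
    a hypothetical input format chosen for this sketch.
    """
    value_x = final_value(amount, *prices_x)
    value_y = final_value(amount, *prices_y)
    # Decide which stock to name as the better performer.
    if value_x >= value_y:
        better, worse = name_x, name_y
    else:
        better, worse = name_y, name_x
    gap = abs(value_x - value_y)
    return (
        f"${amount:,.0f} in {name_x} would now be worth ${value_x:,.2f}, "
        f"versus ${value_y:,.2f} in {name_y}: "
        f"{better} beat {worse} by ${gap:,.2f}."
    )


print(what_if_sentence(1000, "X", (50, 75), "Y", (40, 50)))
```

The point is less the maths than the shape: a reader supplies the scenario, and the sentence is generated on demand from the data rather than written in advance.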
Where automation leads after that is an open question. Disruptive technologies tend to make inroads in areas where existing technologies – in this case, humans – aren’t competitive or aren’t interested. But over time, as capabilities improve, anything is possible. Certainly Andreas notes one example where automations are helping journalists find possible stories by trawling through data in a way few reporters are able to do. At the LA Times:
…the platform uses data provided by the L.A. Police and County Sheriff’s Departments to automatically generate warnings if crime reports surpass certain predefined thresholds. For example, the system triggers a crime alert for a certain neighborhood if a minimum of three crimes is reported in a single week, and if the number of reported crimes in that week is significantly higher than the weekly average of the previous quarter.
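The rule described in that passage is simple enough to sketch in a few lines. What follows is my own illustration, not the LA Times’ code. The report doesn’t say how “significantly higher” is measured, so I’ve assumed it means at least two standard deviations above the previous quarter’s weekly average; the function name, parameters and sample numbers are likewise hypothetical.

```python
from statistics import mean, stdev


def should_alert(current_week, previous_weeks, min_crimes=3, z_threshold=2.0):
    """Return True if this week's crime count warrants an alert.

    current_week   -- number of crimes reported this week
    previous_weeks -- weekly counts for the previous quarter
    min_crimes     -- minimum count before any alert (three, per the report)
    z_threshold    -- ASSUMPTION: how many standard deviations above the
                      quarterly average counts as "significantly higher"
    """
    if current_week < min_crimes:
        return False
    if len(previous_weeks) < 2:
        # Not enough history to estimate the spread.
        return False
    baseline = mean(previous_weeks)
    spread = stdev(previous_weeks)
    if spread == 0:
        # Every prior week was identical, so any increase stands out.
        return current_week > baseline
    return (current_week - baseline) / spread >= z_threshold


# Thirteen made-up weekly counts standing in for the previous quarter.
last_quarter = [2, 3, 2, 4, 3, 2, 3, 2, 3, 4, 2, 3, 3]
print(should_alert(9, last_quarter))  # well above the usual range
print(should_alert(3, last_quarter))  # meets the minimum, but not unusual
```

A reporter still has to decide whether the spike is a story; the machine just makes sure someone looks.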
I suspect there’s a bright future in the cybernetic newsroom, where machines and humans work together to find insights and create stories at scale. But we have to embrace those capabilities and rethink how we work – and what our mission is – if we’re to make the most of the machines already in our midst.