Just an entirely self-serving shout-out to the nice CJR piece by Jonathan Stray about some of the innovations going on at Reuters, including our Automation For Insight project and Reuters News Tracer – a cool new tool that detects newsworthy events on social media and assigns a confidence score assessing how credible they are.
And as a two-fer, we got a nice piece about News Tracer in Nieman Lab as well.
Not bad for a single day.
(And as a complete side note, if you haven’t been reading Jonathan’s blog, you should. There’s some really good stuff there.)
Basically – and you can read the pieces for more description – News Tracer finds clusters of tweets, cleans out spam and other dross, figures out which clusters are “newsworthy,” at least as mainstream news organizations define the term, separates assertions of opinion from assertions of fact, and then works out a credibility score for each cluster.
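For the programmers in the audience, here is a toy sketch of the shape of that pipeline. To be clear, this is not how News Tracer actually works: every name, keyword list, and threshold below is made up for illustration, the clustering assumes each tweet already carries a topic label, and the "credibility" score is just the share of factual tweets that come from verified accounts. The real system relies on far richer signals than this.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical keyword lists, standing in for real spam and opinion classifiers.
SPAM_MARKERS = ("buy now", "click here")
OPINION_MARKERS = ("i think", "i believe", "in my opinion")


@dataclass
class Tweet:
    text: str
    author_verified: bool
    topic: str  # assumed to be precomputed; real clustering is much harder


def is_spam(tweet: Tweet) -> bool:
    """Crude spam check: does the text contain a known spam phrase?"""
    return any(marker in tweet.text.lower() for marker in SPAM_MARKERS)


def is_opinion(tweet: Tweet) -> bool:
    """Crude opinion check: does the text contain an opinion phrase?"""
    return any(marker in tweet.text.lower() for marker in OPINION_MARKERS)


def cluster_by_topic(tweets: list[Tweet]) -> dict[str, list[Tweet]]:
    """Group tweets by their (assumed) topic label."""
    clusters: dict[str, list[Tweet]] = defaultdict(list)
    for tweet in tweets:
        clusters[tweet.topic].append(tweet)
    return dict(clusters)


def trace(tweets: list[Tweet], min_cluster_size: int = 2) -> dict[str, float]:
    """Return a toy credibility score (0.0 to 1.0) per newsworthy topic."""
    # Step 1: clean out spam before clustering.
    clean = [t for t in tweets if not is_spam(t)]
    scores: dict[str, float] = {}
    # Step 2: find clusters.
    for topic, members in cluster_by_topic(clean).items():
        # Step 3: "newsworthy" here just means enough people are talking.
        if len(members) < min_cluster_size:
            continue
        # Step 4: keep assertions of fact, drop assertions of opinion.
        facts = [t for t in members if not is_opinion(t)]
        if not facts:
            continue
        # Step 5: score the cluster by the share of verified authors.
        scores[topic] = sum(t.author_verified for t in facts) / len(facts)
    return scores
```

Even in a toy like this, every step forces a decision you have to write down: what counts as spam, how many tweets make a cluster worth looking at, which accounts you trust.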
Loads of kudos to the Thomson Reuters R&D team, and especially Sameena Shah, who led the development team that solved a whole host of very interesting algorithmic challenges over a two-year period. As Jonathan notes:
Newsroom standards are rarely formal enough to turn into code. How many independent sources do you need before you’re willing to run a story? And which sources are trustworthy? For what type of story? “The interesting exercise when you start moving to machines is you have to start codifying this,” says Chua. Much like trying to program ethics for self-driving cars, it’s an exercise in turning implicit judgments into clear instructions.
Sameena’s team did really smart work figuring out – with help from the newsroom – what “newsworthiness” means, and also how to pull together a basket of factors to help assess credibility. It’s a never-ending iterative process, of course, but they’ve built up a very impressive capability that extends the reach of the newsroom, improves its speed, and frees reporters up to do more value-added work.
What’s not to like?