AI in the newsroom

Conversations with Data: #55



From data mining for investigative journalism to automated content for finance and sports coverage, journalists are no strangers to automation and machine learning. But what are some of the opportunities and challenges that come with A.I. making its way into the newsroom?

To better understand this, we spoke with Professor Nicholas Diakopoulos, an Assistant Professor in Communication Studies and Computer Science (by courtesy) at Northwestern University where he directs the Computational Journalism Lab. He explains how A.I. can assist journalists in discovering, composing and distributing stories and talks about his new book, Automating the News: How Algorithms are Rewriting the Media.

You can listen to our entire podcast on Spotify, SoundCloud or Apple Podcasts. Alternatively, read the edited Q&A with Professor Nicholas Diakopoulos below.

What we asked

Let's start with the basics. How do you define an algorithm?

A straightforward way to define an algorithm is as a series of steps that are undertaken in order to solve a problem or to accomplish a defined outcome. The easiest way to think about this is to compare a recipe to an algorithm. The recipe defines the ingredients and a set of steps that show you how to combine them in order to achieve that delicious dinner that you want to cook.

Of course, computer science has far more complex algorithms. In computational journalism, we are often interested in digital algorithms that run on computers and allow us to scale up information workflows, or information recipes: how to structure the transformation of information, from making sense of data to visualising and publishing it. One way to think of computational journalism is to ask what the algorithms are for producing news information.
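To make the "information recipe" idea concrete, here is a minimal Python sketch of a two-step workflow that turns data into publishable sentences. The records, field names and wording are invented for illustration and do not come from the interview or from any real newsroom system.

```python
# Hypothetical input: two months of unemployment rates for two made-up cities.
RECORDS = [
    {"city": "Northtown", "month": "2020-07", "rate": 5.8},
    {"city": "Northtown", "month": "2020-08", "rate": 6.1},
    {"city": "Southport", "month": "2020-07", "rate": 4.9},
    {"city": "Southport", "month": "2020-08", "rate": 4.5},
]


def analyse(records):
    """Step 1: make sense of the data by computing the latest change per city."""
    by_city = {}
    for row in sorted(records, key=lambda r: r["month"]):
        by_city.setdefault(row["city"], []).append(row["rate"])
    return {
        city: rates[-1] - rates[-2]
        for city, rates in by_city.items()
        if len(rates) >= 2
    }


def compose(changes):
    """Step 2: turn each finding into a sentence that could be published."""
    sentences = []
    for city, change in sorted(changes.items()):
        delta = round(change, 1)  # avoid floating-point noise in the text
        if delta > 0:
            sentences.append(f"Unemployment in {city} rose by {delta:.1f} points.")
        elif delta < 0:
            sentences.append(f"Unemployment in {city} fell by {abs(delta):.1f} points.")
        else:
            sentences.append(f"Unemployment in {city} was unchanged.")
    return sentences


for sentence in compose(analyse(RECORDS)):
    print(sentence)
```

Like a recipe, the same steps can be rerun unchanged whenever new data arrives, which is what allows this kind of workflow to scale.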

How would you define A.I.?

For A.I., the meaning seems to change over time, depending on who you're talking to. I think a fair definition of A.I. is a computer system that can perform a task that would typically require some level of human intelligence. The current wave of hyped-up A.I. technology is based on machine learning, where a system essentially figures out how to accomplish a task based on a bunch of data that it's been trained with.

Let's take Trint, for example. It's an automated transcription service for transcribing and interpreting audio material. These types of algorithms are trained on hours and hours of recorded audio, together with manually typed transcripts. Based on those two types of data, the algorithm figures out the mapping between the sound that someone makes and the word that you would use to transcribe that sound. If you give the computer enough examples of that, then it can do a pretty good job of transcribing a new piece of audio that you feed it.

In your new book, Automating the News: How Algorithms Are Rewriting the Media, you explain how A.I. is making its way into the newsroom. Tell us more about this.

There are so many facets of journalism which are touched by A.I. and automation right now. Some journalists are using data mining to find interesting new stories from large datasets. These technologies can also be used to automatically generate content like written articles or videos. Another use case is to implement things like social bots or chatbots where people can interact and chat with A.I. systems to get information.

Given all of the content produced on a daily basis from a large newsroom, journalists might also want to use algorithms to figure out how to optimise click-through rates or distribution patterns around that content. To boost audience views, algorithms can certainly help determine which headline is producing more engagement and amplify the amount of traffic that content can receive online.
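As a rough illustration of how a headline comparison might work, the Python sketch below computes click-through rates for two headline variants and applies a standard two-proportion z-test to check whether the difference is likely to be more than chance. The click and view counts are made up, and real newsroom tools may use different methods.

```python
from math import erf, sqrt


def click_through_rate(clicks, views):
    """Share of readers who saw the headline and clicked on it."""
    return clicks / views


def two_proportion_z(clicks_a, views_a, clicks_b, views_b):
    """z statistic for the difference between two click-through rates."""
    rate_a = clicks_a / views_a
    rate_b = clicks_b / views_b
    pooled = (clicks_a + clicks_b) / (views_a + views_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / views_a + 1 / views_b))
    return (rate_a - rate_b) / std_err


def two_sided_p_value(z):
    """Probability of a gap at least this large if the headlines performed equally."""
    normal_cdf = 0.5 * (1 + erf(abs(z) / sqrt(2)))
    return 2 * (1 - normal_cdf)


# Hypothetical test: each headline was shown to 2,000 readers.
z = two_proportion_z(clicks_a=120, views_a=2000, clicks_b=90, views_b=2000)
p = two_sided_p_value(z)

print(f"Headline A CTR: {click_through_rate(120, 2000):.1%}")
print(f"Headline B CTR: {click_through_rate(90, 2000):.1%}")
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the better-performing headline is not ahead by luck alone, although engagement is only one of the things an editor might weigh.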

How are investigative journalists using A.I. in their research and reporting?

One of the areas I'm pretty excited about, and where I've seen it used effectively in journalism, is increasing the scale, the scope and the quality of investigative journalism. There are plenty of examples now of investigative journalists using machine learning approaches to define a pattern in a data set that they think is newsworthy. They then use that machine learning model to find other patterns in the data that are similar to the one that they want to find. That helps these journalists cover a lot more ground, expanding the scale and the scope of the investigation.

Can you share a specific example of this?

A few years ago, The Atlanta Journal-Constitution used this approach to try to find doctors who had been accused or even found to be guilty of sexual misconduct in their practice, but who were still practising medical doctors. Once the journalists knew that there was this pattern, that there were these types of doctors out there, they could look for that pattern in other documents that they could scrape up. They were then able to take what might have been a regional or state level story and turn it into a national-level investigation. So we definitely see the power of A.I. in those types of contexts.
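The general technique described here, labelling a few examples of a newsworthy pattern and then letting a model surface similar documents, can be sketched in Python with a small naive Bayes text classifier. This is not the method The Atlanta Journal-Constitution used; the documents below are invented, and naive Bayes is simply one common choice for this kind of task.

```python
from collections import Counter
from math import log


def tokenize(text):
    """Lower-case the text and strip basic punctuation from each word."""
    words = (w.strip(".,;:!?").lower() for w in text.split())
    return [w for w in words if w]


def train(labelled_docs):
    """Count word frequencies per label (a multinomial naive Bayes model)."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_match in labelled_docs:
        word_counts[is_match].update(tokenize(text))
        doc_counts[is_match] += 1
    vocab = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocab


def score(text, model):
    """Log-odds that a document matches the pattern; higher means more similar."""
    word_counts, doc_counts, vocab = model
    totals = {label: sum(word_counts[label].values()) for label in (True, False)}
    log_odds = log(doc_counts[True] / doc_counts[False])
    for word in tokenize(text):
        if word not in vocab:
            continue  # ignore words never seen during training
        # Add-one smoothing keeps unseen word/label pairs from zeroing the odds.
        p_match = (word_counts[True][word] + 1) / (totals[True] + len(vocab))
        p_other = (word_counts[False][word] + 1) / (totals[False] + len(vocab))
        log_odds += log(p_match / p_other)
    return log_odds


# Hypothetical, hand-labelled examples: True marks the pattern of interest.
LABELLED = [
    ("Board places physician on probation after misconduct finding; licence remains active", True),
    ("Physician reprimanded for misconduct and permitted to continue practising", True),
    ("Licence renewed following routine continuing education review", False),
    ("Board approves change of practice address for physician", False),
]

# Hypothetical unlabelled documents, such as ones scraped from other sources.
UNLABELLED = [
    "Routine renewal of licence approved after address change",
    "Probation ordered after misconduct finding; physician may continue practising",
]

model = train(LABELLED)
ranked = sorted(UNLABELLED, key=lambda doc: score(doc, model), reverse=True)
for doc in ranked:
    print(f"{score(doc, model):+.2f}  {doc}")
```

The ranking only tells a reporter where to look first. Each flagged document would still need to be read and verified by a person before anything is published.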

How else can A.I. be used in the newsroom?

A.I. can also have a lot of value in reducing repetitive work. For example, The New York Times and The Washington Post have been using machine learning technologies to help with content moderation. They get thousands or millions of comments per month submitted to their websites. A.I. helps filter through them and potentially weed out comments that are inappropriate or bring down the level of the discourse on the site. Another area where we see A.I. bringing some value to journalism is its ability to increase the speed of information. This is particularly valuable for breaking news or finance.


Professor Nicholas Diakopoulos is an Assistant Professor in Communication Studies and Computer Science (by courtesy) at Northwestern University where he directs the Computational Journalism Lab. His research focuses on computational journalism, including aspects of automation and algorithms in news production, algorithmic accountability and transparency, and social media in news contexts. He is author of the book, Automating the News: How Algorithms are Rewriting the Media, published by Harvard University Press in 2019.

What is the biggest misconception people have about A.I. in journalism?

The biggest misconception that I encounter over and over again is that journalists think A.I. is going to take their job. Having researched this for several years now, I just haven't found that to be the case. Instead, what I found is that A.I. creates new jobs and new opportunities rather than destroying them. But it's also changing jobs. For instance, to be an effective journalist you might need to know how to work with data or have the ability to learn new tools. A.I. can't do entire jobs. It can usually do little tasks or pieces of jobs. Even if it could come along and do 10% of someone's job, that's not going to take away their entire job. It just means that they have 10% of their hours freed up to do other things that they couldn't do before. Hopefully, A.I. is going to make journalists more effective and efficient and augment their capabilities.

Lastly, what is next for A.I. in the media?

One thing that I'm pretty excited about is the new class of technology related to synthetic media. Most people have probably heard of deep fakes -- automatically generated images and videos that are created using machine learning models. But fewer people may realise that the same type of technology can also be used to generate text, whether that's news articles, comments or other forms of text. I think it's interesting to think about how synthetic media could be adapted to journalistic requirements. It's going to turbocharge automated content production, both in positive ways, with news organisations being able to churn out a lot more data-driven stories, and potentially in negative ways, by creating new opportunities for misinformation.

Latest from

We are excited to announce that the News Impact Summits are back for a 2020 online edition! Organised by the EJC and powered by Google News Initiative, the events will cover three topics that are driving innovation in journalism: Data Journalism, Audio & Voice and Audience. The Data Journalism stream will take place on 3-5 November. Register here!


Our next conversation

In the next episode of our Conversations with Data podcast, we will be speaking with Professor Charlie Beckett from the London School of Economics (LSE). As the founding director of Polis, LSE's media think-tank, he also leads the Polis JournalismAI project, a global initiative aiming to inform media organisations about the potential offered by A.I.-powered technologies. In addition to discussing automation in news production, we'll also talk about the findings from Polis' global survey of journalism and artificial intelligence.

As always, don’t forget to let us know what you’d like us to feature in our future editions. You can also read all of our past editions here.


Tara from the EJC Data team,

bringing you, supported by Google News Initiative.

P.S. Are you interested in supporting this newsletter as well? Get in touch to discuss sponsorship opportunities.
