Name: DataJournalism.com

Imagine you’re a business reporter with a couple of hours to file a 500-word article about falling oil prices and the impact on multiple regions. The words are there, but you’re spending a lot of time pulling data. Too much, in fact, to do an interview.

Imagine you’re a local reporter for a regional news outlet tasked with covering multiple city council meetings that are happening on the same evening. How can you possibly be in three places at once?

Imagine you’re on the politics desk for a national newspaper and notice a peculiar phrase used by a politician at a press conference. You’re trying to get a sense of whether this is a dog whistle or a typo. How do you verify this?

In each of these scenarios, existing data-driven tools and software can help alleviate the pressure points for journalists in their day-to-day reporting.


While investigative journalists are doing innovative reporting with machine learning, the uptake has been uneven in journalism more generally, reported Charlie Beckett, professor of practice and director of the JournalismAI project at London School of Economics. Many services developed using statistical modelling, sometimes called machine learning or “AI-powered,” are available to newsrooms from third-party vendors.

In 2019, Beckett led a survey of 74 journalism organisations from around the world and found most newsrooms trialling AI are in the northern hemisphere. Reasons for the slow onboarding are varied, but many are fearful of job loss or deskilling, while smaller newsrooms face challenges related to skill and resources.

While such fears are not without precedent, there’s optimism. In a podcast for Conversations with Data in July 2020, Beckett said, “If you want more time to get on with your journalism, then you better be in a newsroom where the boring bits are being looked after by a machine.”

How to leverage AI-powered services

Find out who is sharing what

For journalists concerned with what is being shared online, the biggest players in Big Tech don’t make social analytics easy to come by. The main service available, CrowdTangle, is owned by Meta, the company formerly known as Facebook. Familiar to many audience development editors and social media teams, the tool can also help journalists identify stories or stamp out misinformation.

Using CrowdTangle, journalists can identify engagement with content, including which social media accounts have shared a specific URL hyperlink or keyword across social media (Facebook, Twitter, Reddit, and Instagram). Also available to the public is a hub of Live Displays, which offer a real-time view of trends on social media.

CrowdTangle is a service from Meta that provides analytics about Instagram, Facebook, and Reddit. Above is a screenshot showing the tool in action.

Reporters and newsrooms can gain special access to more features from CrowdTangle by filing an access request. To get up to speed on using CrowdTangle for research, journalists can check out a training programme from First Draft, a nonprofit that aims to protect communities from misinformation. Given how much a journalist's reporting relies on social media, such a tool is essential for verification and delivering fact-based reporting at speed.

The takeaway? Use it! It's free, and the barriers to entry are low. But like any data-driven service, the product won't give all the answers. CrowdTangle's metrics were decided by the parent company Meta. The Markup, an investigative newsroom, recently pulled back the curtain on CrowdTangle in "How We Built A Facebook Inspector".

They show that Meta does not provide impressions data, that is, the number of times content is shown to users, which limits what journalists can say about the information. In addition, keep in mind that reporting on certain keywords or trends can inadvertently amplify them. Ethically speaking, this amplification could be problematic as it may unnecessarily spread uncertainty or fear to audiences.

Transcribe audio

There is a range of free and paid transcription services newsrooms can use to report on live events such as city hall meetings, especially if they are online. Otter.ai uses artificial intelligence to provide real-time transcription and notes in easy-to-use file formats and direct language. Trint offers similar services. Reporters can upload a recorded audio file from Zoom or another video conferencing service, and the tools transcribe the file into written text. Reporters can listen to the audio file and edit the transcription within the document for accuracy. Files can be shared with team members or exported. Trint is available in 14 languages.

Above is a screenshot showing the Trint display for editing an audio file.

According to Aimee Rinehart, the AI program manager at Associated Press, these services may be helpful in places with a dearth of local media.

Rinehart referenced a conversation with Laura Frank, Executive Director of CoLab, about what small newsrooms are experiencing in southwest Colorado, where one news reporter is responsible for covering six counties.

“That’s a tremendous task,” said Rinehart. “AI can help. One powerful thing that AI can do is write up and distribute summaries of city council meetings. For instance, provide a transcript, a summary. The public needs information about civic life.”

“You look at things like voting habits, and they happen less in news deserts. The public needs information about civic life,” said Rinehart.

The takeaway: Transcription services are already a part of the workplace at many news media organisations. They save journalists time transcribing interviews, podcasts, or live meeting notes. In the case of “news deserts,” transcription bots could feasibly support newsrooms in doing the impossible: sending a reporter to three places at once.

Costs range from free to enterprise rates in hundreds of US dollars. While versatile and already a part of many newsrooms, the transcripts may require human-powered copyediting to be publishable.

Services and pricing: Otter.ai provides real-time transcription and note-taking in English. Pricing plans include a free plan to transcribe live; plans with more services, including transcribing files and using Otter.ai with third-party applications such as Zoom, start at $20 USD per month.

Trint provides transcription services in 54 languages on mobile and browsers. There’s no free version, but Trint does offer a 7-day free trial. Pricing plans for one user are $48 USD per month, while plans for up to 50 users are $68 USD per month.

Automate parts of newsmaking

When freelance journalist Samantha Sunne worked at Reuters in the mid-2010s, she was tasked with writing oil industry reports. She spent a good part of her day tracking prices and preparing reports for a subset of subscribers involved in the oil industry.

“’Oil went up by a half-cent, Oil went down by a cent,’ that kind of thing,” said Sunne. “If I have a robot making the lead for the oil price change reports and a script inserting the data, then I have time to do an interview.” Fast forward a decade, and according to Beckett, many larger newsrooms have smart automation services in place that write copy on topics such as pricing or sports.

For instance, The Associated Press has partnered with Automated Insights, a natural language generation platform, to automate earnings reports and sports articles.
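The kind of script Sunne describes, one that writes the lede and inserts the data, can be approximated with a simple template. The sketch below is only an illustration under stated assumptions: the function name `price_change_lede`, its parameters, and the sentence wording are hypothetical, and it does not represent how Reuters, the AP, or Automated Insights generate copy.

```python
def price_change_lede(commodity: str, previous: float, current: float,
                      unit: str = "per barrel") -> str:
    """Return a one-sentence lede describing a price move.

    Hypothetical helper for illustration; prices are assumed to be in US dollars.
    """
    if previous <= 0:
        raise ValueError("previous price must be positive")

    change = current - previous
    if change == 0:
        return f"{commodity} was unchanged at ${current:.2f} {unit}."

    # Pick the verb from the sign of the move, then report the absolute size.
    direction = "rose" if change > 0 else "fell"
    pct = abs(change) / previous * 100
    return (f"{commodity} {direction} ${abs(change):.2f}, or {pct:.1f}%, "
            f"to ${current:.2f} {unit}.")


if __name__ == "__main__":
    # Made-up figures, not real market data.
    print(price_change_lede("Brent crude", 80.00, 78.40))
    # Brent crude fell $1.60, or 2.0%, to $78.40 per barrel.
```

A template like this only covers the routine sentence; a reporter or editor would still need to check the input data and supply the context around it.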


The AP is also developing services in-house. Currently available is AP VoteCast, which provides localised coverage of election day with automated stories and graphics.

The AP is currently at work on more scripting services, said Troy Thibodeaux, AP's director of digital news, in an email. The aim is to make it easier for partner newsrooms to auto-generate copy for specific states using up-to-date datasets.

AI-powered software can be a helpful research tool. This is the hope of former journalist and author Maria Amelie, CEO and co-founder of Factiverse. This AI-powered browser add-on (try the prototype) fact-checks claims for journalists. Use the search bar to check specific claims or create a profile to fact-check an entire article. The source data is culled from the International Fact-Checking Network (IFCN). Amelie's experience in journalism motivated her to develop the product.

"Journalists are spending hours on research to avoid making mistakes. There's a great deal of pressure on newsrooms to deliver many articles and to be fast. But there's so much content that they need to review -- millions and millions -- so our task is to make some of that easier," said Amelie.

The takeaway: In general, automation services from third-party vendors cost money, and prices vary depending on the services. But investments can relieve journalists from tedious and time-consuming work. "On one hand, people don't want to get automated out of a job, but there's a place where drudgery can be relieved by AI," said Sunne.

AP leverages machine learning, among other techniques, to automate localised coverage of elections for newsrooms with AP VoteCast.

Keeping robot writers at bay

AI-powered robot writers stoke both fear and excitement. In 2012, Wired wrote that, according to Kristian Hammond, Chief Scientist at Narrative Science, an AI and natural language processing software developer, 90% of news would be written by robots in 2026. A decade on, does automated writing have a place in newsrooms?

Abhishek Prasad, Head of Insights & AI and Automation at the HT Media Group, a mass media company based in Delhi, India, has experience trying to incorporate robot writers into newsmaking. “Believe me, we tried it, and the technology is absolutely not there,” he said.

I piloted two robot writers to better understand the tools, which promised to turn a blank page into copy ready to edit. Given I was working on this article, the proposal was appealing, and I popped these keywords into the online AI writer Rytr.me, which costs $29 USD per month and offers a free trial:

Role of AI in the future of journalism. Artificial intelligence, bias, public good, moral compass, practical applications, diversity, clarity

Out came copy—not unconvincing—but inaccurate:

“AI can take away all the bias from an article by detecting the correct keywords and synonyms. They also have a good moral compass to know what is in the public’s interest and what not to report. It is important to note that AI writers do not have a sense of bias, which is something that has been a concern when it comes to journalism in the past. With AI writers, there is no need to worry about bias in any shape or form.”

Above is a screenshot of the Rytr.me settings and copy from fall 2021 on the topic of “The Role of AI and the Future of Journalism.” Rytr is available in thirty languages and can write content for a variety of settings, such as blogs, emails, and social media posts.

It turns out this robot isn’t a terrible writer, but a terrible reporter. Most of the above is inaccurate. But the copy could feasibly fool a distracted reader. And it doesn’t take an ethicist to say that’s problematic.

Next, I tried Copy.Ai using the same keywords. This company was recently awarded $11 million USD in Series A venture capital. The robot writer costs $35 USD/month to use and offers a one-week free trial.

Pictured above is a screenshot of the Copy.ai settings and copy on the topic of “How to use AI for journalism.” The software offers several settings to adjust and is available in 25 languages.

The software pushed out several paragraphs of copy, including a section referencing Microsoft and Associated Press, but something seemed off. There was no context.

I stopped there to do an internet search, which led me to a 2017 Poynter piece by Sunne.

Above is a screenshot of copy written by Copy.ai.

Her story was summarised in a line, without references. Where did it come from? Who can use this writing?

I sent an email to Chris Lu, CEO of Copy.ai, asking about the training data and the rights to re-use. His reply was enthusiastic:

"All of our content is AI-generated! You don't need to credit Copy.ai at all, and it's your work, especially if you make edits after generating it with Copy.ai!" He confirmed by email that their robots are trained using about 10% of the internet, including Wikipedia articles.

"Those robots are straight up stealing," Sunne said jokingly in an email when I contacted her. Thieves, perhaps, but sloppy ones. It wasn't any December, but December 2017; AP didn't "join forces" with Microsoft but used a Microsoft application as a part of a "non-financial collaboration." The pilot is also over.

The takeaway: Robot writers can stoke the imagination with copy, but results can be sloppy and require additional research to provide necessary nuance and fact-checking.

Keep in mind robot writers could be misused and amplify the production and circulation of false or misleading information online, which impacts journalism and society.

Push for robot-generated copy to be labelled, and for AI to be transparent in general, as the Council of Europe demands in a June 2021 declaration.

As long as robots are one tool in the toolbox and not responsible for the ethics statements that will guide their adoption, they can, as Beckett hopes, allow journalists to do more creative work. This includes using machine learning to hold governments and corporations to account.

Will this headline make you mad?

Statistical modelling has been used for more than a decade to predict which headlines will garner clicks, using the logic of Search Engine Optimisation, and more recently, emotional resonance to predict engagement on social media platforms. Plugging headlines into an analyser is a low-stakes way to gauge the grab of a headline for a general audience.

Advanced Marketing Institute’s Headline Analyzer, for example, gives a headline’s “emotional marketing value” a percentage score from 1 to 100 but offers no feedback on what is lacking, while the Headline Analyzer Tool evaluates headlines on SEO, readability, and sentiment.
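To make the idea of a percentage score concrete, here is a toy sketch. It is not the method used by either analyser, whose criteria are not described here. It assumes a small, hypothetical word list (`EMOTIONAL_WORDS`) and a hypothetical function `emotional_share` that reports the share of headline words found in that list.

```python
import re

# Hypothetical word list for illustration only; real analysers apply their own criteria.
EMOTIONAL_WORDS = {"shocking", "secret", "fear", "love", "crisis", "win", "mad"}


def emotional_share(headline: str) -> float:
    """Return the percentage of words in the headline that appear in EMOTIONAL_WORDS."""
    # Lowercase the headline and keep only runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", headline.lower())
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in EMOTIONAL_WORDS)
    return round(100 * hits / len(words), 1)


if __name__ == "__main__":
    print(emotional_share("Will this headline make you mad?"))  # 16.7
```

A score like this reflects nothing more than the word list behind it, which is one reason a single number without feedback is of limited use to an editor.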

The takeaway: For newsrooms interested in using AI but unsure of the next steps, consider the specific problems they are trying to solve. The answer to the question can guide the next steps--whether it’s hiring someone or using AI-powered software. Headline analyzers are an example of a tool that might be helpful, depending on the specific needs of the newsroom.

From robots to infrastructure

Artificial intelligence, when defined as an autonomous, sentient system, is unlikely. But automation using machine learning is already a part of daily life. Investigative journalists are using machine learning to hold companies and governmental organisations accountable.

The tools and services described in this article are already inside newsroom editorial processes. What the future holds is perhaps less about whether or not robots will be a part of journalism than about how definitions of what is abnormal--or magical--will shift as norms change.

In the closing session for the AI Journalism Festival on Dec. 1, 2021, Agnes Stenbom, Responsible Data & AI Specialist at Schibsted Group, said discussions about applications of AI will soon be obsolete. Just as electricity seemed magical when it was first adopted and is now a banal part of everyday life, so goes AI.

“So we need to invest in the groundwork, to lay a solid foundation,” Stenbom said. “It will be central for news organisations to prepare their people -- not just their tech people -- for the change toward AI as infrastructure.”


In other words, in a decade, there won’t be articles bemoaning how the robot writers have stolen reporting jobs. But that won’t be because machine learning and automation aren’t involved in newsmaking. They will be.

But as automation forms infrastructure, the work of the journalist will shift as well. In the meantime, Stenbom suggests newsrooms “vocalise our values and explicitly state and plan for the direction we want to go.”

Sabin Muzaffar, the Founder and Executive Editor of Ananke Magazine, a digital platform for women across the MENA & Sub-continent regions, has only recently begun incorporating AI into newsmaking and production, as shared in a panel on AI and small newsrooms at the AI Journalism Festival.

“As far as my experience is concerned, everyone in the newsroom needs to have ownership [over the use of AI]. Only then can you move positively forward. And you learn every step of the way,” she said.

Thanks to Tadayoshi Kohno, Professor in the University of Washington Department of Computer Science & Engineering, for interviewing with me for background research for this story.
