Data Journalism with Impact
Written by: Paul Bradshaw
If you’ve not seen Spotlight, the film about the Boston Globe’s investigation into institutional silence over child abuse, then you should watch it right now. More to the point, you should watch right through to the title cards at the end.1
A list scrolls down the screen. It details the dozens and dozens of places where abuse scandals have been uncovered since the events of the film, from Akute, Nigeria, to Wollongong, Australia. But the title cards also cause us to pause in our celebrations: one of the key figures involved in the scandal, it says, was reassigned to “one of the highest ranking Roman Catholic churches in the world.”
This is the challenge of impact in data journalism: is raising awareness of a problem “impact”? Does the story have to result in penalty or reward? Visible policy change? How important is impact? And to whom?
These last two questions are worth tackling first. Traditionally impact has been important for two main reasons: commercial, and cultural. Commercially, measures of impact such as brand awareness and high audience figures can contribute directly to a publication’s profit margin through advertising (increasing both price and volume) and subscription/copy sales2. Culturally, however, stories with impact have also given news organisations and individual journalists ‘bragging rights’ among their peers. Both, as we shall see, have become more complicated.
Measurements of impact in journalism have, historically, been limited: aggregate sales and audience figures, a limited pool of industry prizes and the occasional audience survey were all that publishers could draw on.
Now, of course, the challenge lies not only in a proliferation of metrics, but in a proliferation of business models, too, with the expansion of non-profit news provision in particular leading to an increasing emphasis on impact and discussion about how that might be measured3.
Furthermore, the ability to measure impact on a story-by-story basis has meant that it is no longer only editors who are held responsible for audience impact, but journalists too.
Measuring impact by the numbers
Perhaps the easiest measure of impact is sheer reach: data-driven interactives like the BBC’s ‘7 billion people and you: What's your number?’ engaged millions of readers in a topical story; while at one point in 2012 Nate Silver’s data journalism was reaching one in five visitors to the New York Times.4
Some will sneer at such crude measures — but they are important. If journalists were once criticised for trying to impress their peers at the expense of their audience, modern journalism is at least expected to prove that it can connect with that audience. In most cases this proof is needed for advertisers, but even publicly-funded universal news providers like the BBC need it too, to demonstrate that they are meeting requirements for funding.
Engagement is reach’s more sophisticated relation, and here data journalism does well too: at one editors’ conference for newspaper publisher Reach, for example, it was revealed that simply adding a piece of data visualisation to a page can increase dwell time (the amount of time a person spends on a page) by a third. Data-driven interactivity can transform the dullest of subjects: in 2015 the same company’s David Higgerson noted that more than 200,000 people put their postcodes into an interactive widget by their data team based on deprivation statistics — a far higher number, he pointed out, “than I would imagine [for] a straight-forward ‘data tells us x’ story”5.
Engagement is particularly important to organisations that rely on advertising (rates can be increased where engagement is high), but also to those for whom subscriptions, donations and events are important: these tend to be connected with engagement too.
The expansion of non-profit funding and grants often comes with an explicit requirement to monitor or demonstrate impact which is about more than just reach. Change and action in particular, whether political or legal, are often referenced. The International Consortium of Investigative Journalists (ICIJ), for example, highlights the impact of its Panama Papers investigation in the fact that it resulted in “at least 150 inquiries, audits or investigations … in 79 countries”, alongside the more traditional metric of almost 20 awards, including the Pulitzer Prize.6 In the UK a special place is reserved in data journalism history for the MPs’ expenses scandal. This not only saw The Telegraph newspaper leading the news agenda for weeks, but also led to the formation of a new body: the Independent Parliamentary Standards Authority (IPSA). The body now publishes open data on politicians’ expense claims, allowing them to be better held to account and leading to further data journalism.
But policy can be much broader than politics. The lending policies of banks affect millions of people, and were famously held to account in the late-1980s in the US by Bill Dedman in his Pulitzer-winning “Color of Money” series of articles. In identifying racially divided loan practices (“redlining”) the data-driven investigation also led to political, financial and legal change, with probes, new financing, lawsuits and the passing of new laws among the follow-ups.7
Fast-forward 30 years and you can see a very modern version of this approach: ProPublica’s Machine Bias series shines a light on algorithmic accountability, while the Bureau Local tapped into its network to crowdsource information on algorithmically targeted ‘dark ads’ on social media.8 Both have helped contribute to change in a number of Facebook’s policies, while ProPublica’s methods were adopted by a fair housing group in establishing the basis for a lawsuit against the social network.9 As the policies of algorithms become increasingly powerful in our lives (from influencing the allocation of police to Uber pricing in non-white areas), holding these to account is becoming as important as holding more traditional political forms of power to account.10
What is notable about some of these examples is that their impact relies upon — and is partly demonstrated by — collaboration with others. When the Bureau Local talk about impact, for example, they refer to the numbers of stories produced by members of its grassroots network, inspiring others to action, while the ICIJ lists the growing scale of its networks: “LuxLeaks (2014) involved more than 80 reporters in 26 countries.11 Swiss Leaks (2015) more than 140 reporters in 45 countries”. The figure rises to more than 370 reporters in nearly 80 countries for the Panama Papers investigation: 100 media organisations publishing 4,700 articles.12
What’s more, the data gathered and published as a result of investigations can become a source of impact itself: The Offshore Leaks database, the ICIJ points out, “is used regularly by academics, NGOs and tax agencies”.
There is something notable about this shift from the pride of publishing to winning plaudits for acting as facilitators and organisers and database managers. As a result, collaboration has become a skill in itself: many non-profit organisations have community or project management roles dedicated to building and maintaining relationships with contributors and partners, and journalism training increasingly reflects this shift too.
Some of this can be traced back to the influence of early data journalism culture: writing about the practice in Canada in 2016, Alfred Hermida and Mary Lynn Young noted “an evolving division of labour that prioritizes inter-organisational networked journalism relationships”.13 And the influence was recognised further in 2018 when the Reuters Institute published a book on the rise of collaborative journalism, noting that “collaboration can become a story in itself, further increasing the impact of the journalism”.14
Changing what we count, how we count it, and whether we get it right
Advanced technical skills are not necessarily required to create a story with impact. One of the longest-running data journalism projects, the Bureau of Investigative Journalism’s Drone Warfare project, has been tracking US drone strikes for over 5 years.15 Its core methodology boils down to one word: persistence.16 On a weekly basis Bureau reporters have turned ‘free text’ reports into a structured dataset that can be analysed, searched, and queried. That data, complemented by interviews with sources, has been used by NGOs, and the Bureau has submitted written evidence to the UK’s Defence Committee17.
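To make the idea of turning ‘free text’ into a structured dataset concrete, here is a minimal sketch in Python. It is not the Bureau’s method, which relied on reporters reading and recording each report by hand. The two example reports, the sentence pattern, and the column names (`min_killed`, `max_killed`) are all invented for illustration, and real reports are far less regular than this.

```python
import csv
import io
import re

# Invented example reports for illustration only; not real Bureau data.
REPORTS = [
    "2015-03-02: strike reported in District A, 3-5 people killed",
    "2015-03-09: strike reported in District B, 2 people killed",
]

# Assumed sentence pattern: date, place, then a single figure or a low-high range.
PATTERN = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}): strike reported in (?P<place>[^,]+), "
    r"(?P<low>\d+)(?:-(?P<high>\d+))? people killed"
)


def parse_report(text):
    """Turn one free-text report into a dict, or None if it does not match."""
    m = PATTERN.match(text)
    if m is None:
        return None
    low = int(m["low"])
    # When only one figure is given, use it for both ends of the range.
    high = int(m["high"]) if m["high"] else low
    return {
        "date": m["date"],
        "place": m["place"],
        "min_killed": low,
        "max_killed": high,
    }


# Keep only the reports that fit the pattern; the rest need a human reader.
rows = [r for r in map(parse_report, REPORTS) if r is not None]

# Write the structured rows as CSV so they can be sorted, filtered and queried.
buffer = io.StringIO()
writer = csv.DictWriter(
    buffer, fieldnames=["date", "place", "min_killed", "max_killed"]
)
writer.writeheader()
writer.writerows(rows)
print(buffer.getvalue())
```

Recording a minimum and a maximum, rather than a single number, is one way of keeping the uncertainty of the original reports visible in the dataset. Anything that fails to match the pattern is dropped here, so in practice those cases would need to be set aside for manual checking rather than silently ignored.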
Counting the uncounted is a particularly important way that data journalism can make an impact — indeed, it is probably fair to say that it is data journalism’s equivalent of ‘giving a voice to the voiceless’. The Migrants Files, a project involving journalists from over 15 countries, was started after data journalists noted that there was “no usable database of people who died in their attempt to reach or stay in Europe”.18 Its impact has been to force other agencies into action: the International Organization for Migration and others now collect their own data.
Even when a government appears to be counting something, it can be worth investigating. While working with the BBC England Data Unit on an investigation into the scale of library cuts, for example, I experienced a moment of panic when I saw that a question was being asked in Parliament for data about the issue.19 Would the response scoop the months of work we had been doing? In fact, it didn’t — instead, it established that the government itself knew less than we did about the true scale of those cuts, because they hadn’t undertaken the depth of investigation that we had.
And sometimes the impact lies not in the mere existence of data, but in its representation: one project by Mexican newspaper El Universal, Ausencias Ignoradas (Ignored Absences), puts a face to over 4,500 women who have gone missing in the country in a decade.20 The data was there, but it hadn’t been broken down to a ‘human’ level. Libération’s Meurtres conjugaux, des vies derrière les chiffres does the same thing for domestic murders of women, and Ceyda Ulukaya’s Kadin Cinayetleri project has mapped femicides in Turkey.21
When data is bad: impacting data quality
Some of my favourite projects as a data journalist have been those which highlighted, or led to the identification of, flawed or missing data. In 2016 the BBC England Data Unit looked at how many academy schools were following rules on transparency: we picked a random sample of 100 academies and checked to see if they published a register of all their governors' interests, as required by official rules. One in five academies failed to do so, and as a result the regulator took action against those we’d identified.22 But were they serious about ensuring this would continue? Returning to the story in later years would be important in establishing whether the impact was merely short-term, or more systemic.
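The sampling approach described above can be sketched in a few lines of Python. This is an illustration rather than the code used by the BBC England Data Unit: the list of 5,000 placeholder academy IDs, the seed, and the list of check results are all assumptions, with the results simply set to reproduce the ‘one in five’ figure. The margin of error uses a simple normal approximation, which is only a rough guide for a sample of this size.

```python
import math
import random


def draw_sample(population, k, seed=None):
    """Return k items chosen at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def failure_rate(results):
    """Share of checked items that failed, with a rough 95% margin of error.

    `results` is a list of booleans: True means the item failed the check.
    """
    n = len(results)
    p = sum(results) / n
    margin = 1.96 * math.sqrt(p * (1 - p) / n)
    return p, margin


# Hypothetical register: placeholder IDs stand in for real school names.
academies = [f"academy-{i:04d}" for i in range(1, 5001)]

# Fixing the seed means colleagues can reproduce exactly the same sample.
sample = draw_sample(academies, 100, seed=2016)

# Hypothetical outcome of manually checking each sampled school's website:
# 20 of the 100 did not publish a register of governors' interests.
checked = [True] * 20 + [False] * 80
rate, margin = failure_rate(checked)
print(f"{rate:.0%} failed, plus or minus {margin:.1%}")
```

Drawing the sample at random, rather than picking schools by hand, is what allows a finding about 100 academies to say something about academies as a whole. Reporting the margin of error alongside the headline figure shows readers how much the true proportion might differ.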
Sometimes the impact of a data journalism project is a byproduct — only identified when the story is ready and responses are being sought. When the Bureau Local appeared to find that 18 councils in England had nothing held over in their reserves to protect against financial uncertainty, and sought a response, it turned out the data was wrong.23 No-one noticed the incorrect data, they reported. “Not the councils that compiled the figures, nor the Ministry of Housing, Communities and Local Government, which vetted and then released [them]”. Their investigation has added to a growing campaign for local bodies to publish data more consistently, more openly, and more accurately.
Impact beyond innovation
As data journalism has become more routine, and more integrated into ever more complex business models, its impact has shifted from the sphere of innovation to that of delivery. As data editor David Ottewell wrote of the distinction in 2018:
“Innovation is getting data journalism on a front-page. Delivery is getting it on the front page day after day. Innovation is building a snazzy interactive that allows readers to explore and understand an important issue. Delivery is doing that, and getting large numbers of people to actually use it; then building another one the next day, and another the day after that.”24
Delivery is also, of course, about impact beyond our peers, beyond the ‘wow’ factor of a striking datavis or interactive map — on the real world. It may be immediate, obvious and measurable, or it may be slow-burning, under the radar and diffuse. Sometimes we can feel like we didn’t make a difference — as in the case of the Boston Globe’s Catholic priest — but change can take time: reporting can sow the seeds of change, with results coming years or decades later. The Bureau Local and BBC do not know if council or schools data will be more reliable in future — but they do know that the spotlight is on both to improve.
Sometimes shining a spotlight and accepting that it is the responsibility of others to take action is all that journalism can do; sometimes it takes action itself, and campaigns for greater openness. To this data journalism adds the ability to force greater openness, or create the tools that make it possible for others to take action.
Ultimately, data journalism with impact can set the agenda. It reaches audiences that other journalism does not reach, and engages them in ways that other journalism does not. It gives a voice to the voiceless, and shines a light on information which would otherwise remain obscure. It holds data to account, and speaks truth to its power.
Some of this impact is quantifiable, and some has been harder to measure — and any attempt to monitor impact should bear this in mind. But that doesn’t mean we shouldn’t try...
Alan Rusbridger, ‘Alan Rusbridger: Who Broke the News?’, The Guardian, 31 August 2018.
Christoph Schlemmer, ‘Speed is Not Everything: How News Agencies Use Audience Metrics’, Reuters Institute for the Study of Journalism, (2016).
Elia Powers, ‘Selecting Metrics, Reflecting Norms’, Digital Journalism 6:4, (2018), pp. 454-471.
Dylan Byers, ‘20% of NYT Visitors Read 538’, Politico, 11 June 2012.
David Higgerson, ‘How Audience Metrics Dispel The Myth That Readers Don’t Want To Get Involved With Serious Stories’, 14 October 2015.
Will Fitzgibbon and Emilia Diaz-Struck, ‘Panama Papers Have Had Historic Global Effect – And the Impacts Keep Coming’, ICIJ, 1 December 2016.
Maeve McClenaghan, ‘Campaigners Target Voters With Brexit “Dark Ads”’, The Bureau of Investigative Journalism, 18 May 2017.
Mary Clare Jalonick, ‘Facebook Vows More Transparency Over Political Ads’, The Seattle Times, 27 October 2017.
Maurice Chammah, ‘Policing The Future’, The Marshall Project, 2 March 2016.
Jennifer Stark, ‘Investigating Uber Surge Pricing: A Data Journalism Case Study’, Global Investigative Journalism Network, 2 May 2016.
Mar Cabra, ‘How ICIJ went from having no data team to being a tech-driven media organisation’, ICIJ, 29 November 2017.
Uri Blau, ‘How Some 370 Journalists in 80 Countries Made the Panama Papers Happen’, Nieman Reports, 6 April 2016.
Alfred Hermida and Mary Lynn Young, ‘Finding The Data Unicorn’, Digital Journalism 5:2, (2016), pp. 159-176.
Richard Sambrook, ‘Global Teamwork: The Rise of Collaboration in Investigative Journalism’, ed. by Richard Sambrook, Reuters Institute for the Study of Journalism, 2018.
BBC, ‘Libraries Lose a Quarter of Staff As Hundreds Close’, BBC News, 29 March 2016.
Maria Crosas Batista, ‘How One Mexican Data Team Uncovered the Story of 4,000 Missing Women’, Online Journalism Blog, June 2016.
Data Driven Journalism, ‘Kadincinayetleri.org: Putting femicide on the map’, Data Driven Journalism, 10 February 2016.
BBC, ‘Academy Schools Breach Transparency Rules’, BBC News, 18 November 2016.
Gareth Davies, ‘Inaccurate and Unchecked: Problems with Local Council Spending Data’, The Bureau of Investigative Journalism, 2 May 2018.
David Ottewell, ‘The Evolution of Data Journalism’, Towards Data Science, 28 March 2018.