In November and December 2021, we surveyed over 1500 people involved in data journalism worldwide in what we believe is the biggest effort to measure the state of data journalism to date. The results provide an excellent overview of the various aspects of the field: the community’s demographics, the skills and tools, and the impact of the pandemic.


Key Takeaways

Have a look at a few of our main takeaways from the survey.


Our Mission

Data journalism is now an established part of the media ecosystem, with many newsrooms having a dedicated data team and others looking to create one. The formalisation of the field is less than a decade old, and the practices, skill sets, and technologies used are rapidly evolving, as discussed in our latest edition of the Data Journalism Handbook.

We think the established status of the field means that data journalism deserves to be studied, mapped, and taken seriously. At the same time, its rapid evolution calls for regular snapshots to understand how data journalism is conducted and how it changes over time.

Another exciting reason to measure the field is the imaginative, collaborative, problem-solving nature of data journalism. It is made by individuals who blend different data sources, analysis tools, and visualisation techniques to create powerful storytelling. So we ask: what is possible, and how are things being done?

Yet we also acknowledge the challenge of upskilling beyond the realm of journalism. This includes learning statistics, data visualisation, and programming, and, beyond that, keeping up with the pace of evolving work practices. As time progresses, some tools gain adoption while others are only just emerging. Team structures change, and new job opportunities arise. All the while, data journalists are affected by the same struggles as other media players: shrinking resources, time scarcity, and waning public trust in journalism.

Within this complex landscape, the COVID-19 pandemic has brought new challenges to data journalists, but it has also put them in the spotlight thanks to a wider audience. In the survey, we ask what impact the pandemic has had on data journalists’ work practices.

These reflections led us to launch the State of Data Journalism 2021. At present, the field lacks a regular and systematic approach that can help us make sense of the role, modus operandi, and industry composition of data journalism. Previous efforts include a 2017 survey by Heravi and Lorenz [1] and a Google News Lab report [2] from the same year.

These studies generated useful and unique insights. Yet much has happened since 2017. We build on the lessons learned from these authors to create a survey that poses new, relevant questions, helping us understand the field as it stands in 2021.


Methodology

The State of Data Journalism 2021 Survey comprised a total of 63 questions and had 1594 respondents, of whom 1285 were included in the analysis. The survey was organised in several sections, ranging from demographics to work characteristics, challenges and opportunities, and the impact of the COVID-19 pandemic on data journalism. It was available in four languages: English, Italian, Arabic, and Spanish.

1. Population of interest and sampling strategy

The population of interest was the global community of individuals involved in data journalism. Targeted respondents included full-time and part-time employed data journalists, as well as freelancers; data editors and team leads; trainers, faculty members, educators; and students.

When it came to a sampling strategy, we faced the same issue as Heravi and Lorenz [1]: we had no information on the parameters of the global data journalism population. Since drawing a random or representative sample was not possible, we followed their approach of reaching as widely as we could through a variety of influential channels. Settling for a margin of error of 0.05 and a confidence level of 95%, we estimated that we needed a minimum of 350 respondents. We believe our analysis sample size of 1285 provides us with stable estimates.
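The text does not say which formula or population size produced the 350 figure. As a rough illustration only, the sketch below uses Cochran's formula for a proportion, a common textbook approach, with the most conservative assumption p = 0.5. The function names and the optional `population` parameter are our own, not part of the survey's method. For a very large population the formula gives about 385 respondents; a finite-population correction lowers that number, which may explain a figure closer to 350. The margin-of-error function assumes simple random sampling, which this survey's outreach-based sample does not satisfy, so its result is indicative at best.

```python
import math
from statistics import NormalDist
from typing import Optional


def z_score(confidence: float) -> float:
    """Two-sided critical value of the standard normal distribution."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def min_sample_size(margin: float,
                    confidence: float = 0.95,
                    p: float = 0.5,
                    population: Optional[int] = None) -> int:
    """Cochran's minimum sample size for estimating a proportion.

    `population` is a hypothetical parameter: when given, the
    finite-population correction is applied.
    """
    z = z_score(confidence)
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


def margin_of_error(n: int, confidence: float = 0.95, p: float = 0.5) -> float:
    """Margin of error for a proportion under simple random sampling."""
    return z_score(confidence) * math.sqrt(p * (1 - p) / n)


print(min_sample_size(0.05))             # 385 for a very large population
print(round(margin_of_error(1285), 3))   # 0.027
```

Under these assumptions, 1285 responses would correspond to a margin of error of roughly 2.7 percentage points at 95% confidence.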

As an additional check, we used Diffbot’s Knowledge Graph to compare the properties of what it identifies as the current worldwide data journalism population against those of our sample. We obtained a very similar gender distribution, with theirs being 63% male and 37% female.

2. Outreach strategy and incentives

The survey was open between November 8 and December 31, 2021. Participation was encouraged through various communication channels to minimise the bias introduced by targeting only one online community. We used direct mailing and social media promotion, and asked the DataJournalism.com and European Journalism Centre networks for help in spreading the word.

To thank survey respondents and to incentivise participation, we offered a selection of rewards, which have now been distributed. The prize drawing included one trip to the International Journalism Festival 2022 in Perugia, Italy, Amazon vouchers, and digital goodies.

3. Survey logic

To minimise survey length while maximising survey inclusion, questions targeting a specific subgroup were shown only to respondents in that subgroup, but questions about journalistic practices were left open to all (this was done to reflect that students, educators, and editors might be involved in producing and publishing data journalistic work from time to time). The survey included a mixture of mandatory and non-mandatory questions.

4. Data Cleaning

For the purpose of the analysis, only complete answers were considered. An answer counts as complete when the respondent finished the survey, meaning they clicked on “Submit” at the end of the questionnaire.

We found 13 duplicate names, an indication that some people had filled in the survey twice. Whenever the name was the same but the email address was different, we looked for additional information in the completed questionnaire (such as company, job role, and skills) to determine whether it was the same person. Whenever the email address was the same, we concluded directly that the same person had taken our survey twice. Out of 13 duplicate names, we removed 11 duplicate questionnaires, randomly selecting which one to keep for each person.
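The cleaning steps described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code used for the survey: the field names (`name`, `email`, `company`, `job_role`, `completed`) are hypothetical, and the comparison of company and job role is an automated stand-in for what the text describes as a manual review of additional information.

```python
import random
from collections import defaultdict


def same_person(a: dict, b: dict) -> bool:
    """Decide whether two responses with the same name are one person."""
    if a["email"].strip().lower() == b["email"].strip().lower():
        return True  # same email: treated directly as the same person
    # Hypothetical stand-in for the manual review described in the text.
    return (a["company"].lower(), a["job_role"].lower()) == \
           (b["company"].lower(), b["job_role"].lower())


def clean_responses(responses: list, seed: int = 0) -> list:
    """Keep complete responses only, with one response per person."""
    rng = random.Random(seed)
    by_name = defaultdict(list)
    for r in responses:
        if r["completed"]:  # the respondent clicked "Submit"
            by_name[r["name"].strip().lower()].append(r)

    kept = []
    for group in by_name.values():
        persons = []  # each entry holds all responses from one person
        for r in group:
            for person in persons:
                if same_person(person[0], r):
                    person.append(r)
                    break
            else:
                persons.append([r])
        # Randomly select which response to keep for each person.
        kept.extend(rng.choice(person) for person in persons)
    return kept


sample = [
    {"name": "Ana", "email": "ana@example.org", "company": "X", "job_role": "editor", "completed": True},
    {"name": "Ana", "email": "ANA@example.org", "company": "X", "job_role": "editor", "completed": True},
    {"name": "Ben", "email": "ben@a.example", "company": "A", "job_role": "reporter", "completed": True},
    {"name": "Ben", "email": "ben@b.example", "company": "B", "job_role": "student", "completed": True},
    {"name": "Cy", "email": "cy@example.org", "company": "C", "job_role": "trainer", "completed": False},
]
print(len(clean_responses(sample)))  # 3
```

In the sample data, the two "Ana" responses collapse into one, the two "Ben" responses are kept as different people, and the incomplete response is dropped.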

5. Survey Metadata

  • The median time spent on the survey was 16 minutes.
  • Eighty-four percent of respondents took the survey in English, followed by 7% who took it in Italian, 6% in Spanish, and 3% in Arabic.
  • Forty-eight percent found the survey to be a good length, whereas 41% found it a bit long. We appreciate the feedback and will take this into account for future survey editions.

Cited work

  1. Heravi, Bahareh R., and Mirko Lorenz. “Data Journalism Practices Globally: Skills, Education, Opportunities, and Values.” Journalism and Media 1, no. 1 (2020): 26-40.
  2. Rogers, Simon, J. Schwabish, and D. Bowers. “Data journalism in 2017.” Google News Lab (2017).


Thank you

A warm thank you to all of the people involved in data journalism who took the time to take our survey.

A heartfelt thank you to all those who helped us craft and polish our survey before launch.

And to our supporters, for offering tools, goodies, and insight.


Contact

If you have any questions about the survey and the results, you can reach us at [email protected].

To share your thoughts with the community, head over to our Discord community, where we meet regularly to discuss all things data journalism.