Skills and Tools
Skill level
Data journalists' self-assessed skill levels vary by area. Journalism is where they feel they most excel, with 60% considering themselves advanced. For data analysis, by contrast, 21% of those surveyed describe themselves as advanced.
When it comes to data skills, just under half consider themselves to be at an intermediate level for data analysis and data visualisation. Together with statistics, these are the areas in which fewer than 1 in 10 data journalists consider themselves to have no skills. Data wrangling, scraping, and machine learning are the areas where over half of data journalists consider themselves to have no skills at all or view themselves as novices. For machine learning, only 22% consider themselves to be at an intermediate or advanced level.
Programming
As we see in the next section on work practices, programming is not a mainstream task for data journalists (chosen by 29% of respondents). We asked respondents who indicated that they code or develop data-driven applications for their work a couple of questions about their programming practices.
The most used programming language is Python (63%), followed by HTML/CSS (51%) and R (46%). Both R and Python are extremely popular for statistical analysis: the former was created with that purpose in mind, while the latter is a general-purpose programming language. Python's lead over R also reflects the wider user communities of the two languages, with Python being more widely known than R. JavaScript and SQL are less common but are still used by 37% and 33%, respectively, of data journalists who use programming languages for work.
In terms of frequency, a majority use programming daily (59%). Programming experience ranges from more than 16 years (18%) to 3 to 5 years (28%) or 1 to 2 years (18%). This reveals that data journalism includes a mixture of veterans and individuals who have been programming for a much shorter time.
Once again, the self-taught path is the most common learning mode (81% choosing one or both), followed by higher education (36%). Workplace training (27%) and bootcamps (27%) are the least common routes to learning how to code.
Graphic tools
We built on a question from a Google News Lab report (Rogers, Schwabish, and Bowers, 2017) exploring data visualisation tools in news organisations. The largest share of data journalists (33%) work in companies that rely mostly on external software to produce visualisations. Our findings show that in-house software is the reality for only around 11% of data journalists, which is in stark contrast with the 2017 findings from the Google News Lab report, although that report focused only on mapping tools.
Whether this difference stems from the cost of in-house tools or from the development of better, more customisable, and more affordable external software remains unknown. Even very large companies of 500+ people do not opt for in-house tools much more often than medium-sized companies of 50 to 99 people, suggesting that price might not play a role in determining which kind of visualisation tool is chosen.
Tools
If you are a data journalist, chances are that you will have had to use Excel and/or Google Sheets, the two best-known spreadsheet applications in existence. Largely uncontested as tools for data storage, manipulation, and analysis, the former is used by 3 in 4 data journalists and the latter by 3 in 5.
The third, fourth, and fifth most popular tools are software that facilitates data visualisation: Datawrapper (37%), Flourish (32%), and Tableau (27%). Programming languages Python (26%) and R (20%) appear again, followed by OpenRefine (15%).
The full list of tools used by data journalists for data analysis and data visualisation is undoubtedly much longer (for example, mapping software or network analysis software). Yet the picture that emerges here is of a fairly undiversified landscape in terms of the tools most used globally.
Training / upskilling / demanded training
In the survey, our aim was to find out what topics data journalists received training in, what they would like to upskill on, and what areas educators and trainers were asked to deliver training in.
Data visualisation and data analysis are the two areas where more than half of data journalists would like to upskill despite just under half having received training. While machine learning is what data journalists have been trained the least in (15%), it is also the topic most data journalists want to upskill in (52%). There appears to be a disconnect between the areas in which data journalists wish to be trained and the ones in which they actually receive training, with the technical areas of statistics, wrangling, and scraping being the most prominent. The areas in which educators see demand for training fall in between the other two.
Cited work
- Rogers, Simon, J. Schwabish, and D. Bowers. “Data journalism in 2017.” Google News Lab (2017).