The early 2000s have seen a steady rise in the use of data for investigative reporting and storytelling. Despite newsrooms' growing interest in data journalism, and its emergence as an academic discipline in its own right, systematic research into the domain remains scarce, resulting in a divide between academic and industry practices.
To bridge this gap, we spoke to Megan Lucero, Data Journalism Editor of The Times and The Sunday Times in London.
This interview comprises part of a larger project on the global state of data journalism, carried out by Bahareh Heravi from the School of Information and Communication Studies at University College Dublin, and Mirko Lorenz, an information architect and co-founder of Datawrapper.
Could you briefly introduce your background, current role, and organisation?
I am the data journalism editor of The Times and The Sunday Times in London. I originally started at The Times as an intern five years ago, and joined the first data journalism team two years later, when we set up a digital team that works across both outlets. The role has since evolved into running the data team as data editor.
We are a computational investigative team: a statistician and programmer-journalists all working together to find stories in data that couldn't be found without computing. My role as editor is to come up with story ideas, pitch them to the newsdesk, investigate them, and coordinate the different skill sets, timescales, and roles needed to surface findings from the data.
How would you describe ‘data journalism’ to someone who has never heard the term?
The best way to describe data journalism is to think of it like any other form of journalism. I believe that 'data journalism' is a redundant term -- data journalism is simply journalism. Data is just a tool that aids the process. 'Data journalism' is very similar to saying pen-driven, typewriter-driven, or computer-driven journalism. It is something that aids the process. Look at data and the digital sphere as your pool to swim in. Find ways to navigate it, mine it, and analyse it to locate stories and drive your journalism. That's what my team does.
What do you think, hope, or believe are the main advantages of data journalism?
For me it is not about thinking, or hoping; I know that data journalism is on the rise. It is the future of journalism. Data is the source of stories. Organisations release records, digitise, and produce data every single day. We tap in and out of the tube. When we walk down the road, CCTV captures us. Data from every minute of our lives is being captured -- that's a lot of data. There are stories upon stories buried in it. That's where we need to go.
Think of it like the way that newsrooms send their journalists to where the stories are. For example, when the war in Iraq started, newspapers sent reporters to Iraq; they sent them into war. Court reporters are sent to court hearings because they know there is a story there. Similarly, we send our journalists into the pool of data. We know that is where government bodies and organisations are producing stories. To mine that data properly, we need to send journalists there, skill them up, and provide them with people to collaborate with.
In your opinion, what is the most important aspect of data journalism?
Speaking at this moment, at the end of 2016, the most important thing is collaboration between traditional investigative reporting in newsrooms and new technological advances. The most important thing is learning to bridge the two. My fear is that we are going to enter a world in which we skill up programmers to become journalists, but they are not trained in how to find the story or how to find the really good lead, and we lose the journalism element of it. My other fear is that we let the really great investigative journalists die out and fall away. As those techniques fall away, so too do the intuition, the ability to listen and to hear, to talk to sources, and to know when those fluctuations happen. It is really important that the two always remain bridged; otherwise, we are going to have issues on both sides, in technological advancement and in traditional journalism.
What is the relevance of data, statistics, visualisations, and coding in newsrooms?
They are all relevant, in completely different ways. Data is a very broad term. I just spoke about the importance of data as a source and the space we have to navigate to find stories. Having a good understanding of statistics is increasingly important as we handle more numerical, tabulated, or list-form information. To avoid jumping to spurious conclusions, I would always recommend Stats 101. Every journalist should be able to do that.
Visualisation has to play a very specific role. I am very much against visualisations that move just to move. You should only visualise in a technological way if it pushes you to see something that you can’t in a static way. I think sometimes people just assume you can tack a visualisation on to any kind of story... sometimes it is TV, sometimes it is radio, sometimes it is written. Sometimes the best way to tell a story is on camera, sometimes the best way to tell a story is in written word, sometimes it is in a visualisation. Think of it as a new medium.
Coding in newsrooms is a completely different element. It is important for all journalists to have a basic understanding of HTML and coding, because their computers are based on it. It is useful to understand how to get that information in case you need it. However, it is more important to me that journalists know what is possible, not necessarily how to do it. Although it is important for programmers and journalists to work together, I wouldn’t say you need an entire newsroom of programmers. Having some very skilled programmers is going to be an important aid for journalists, but it’s even more key for journalists to know what is possible.
If you were the editor-in-chief, what would be your first move to bolster data journalism?
That’s a tough question!
A normal expectation for any newsroom is that you have to know your beat well. Nowadays that means knowing when the latest data release for your beat is dropping. If you are a crime reporter, you need to know when the official statistics drop, how to collect local crime reports, and how to get that data quickly. That would be one of the first things I would say: knowing your beat means knowing the datasets, and knowing when they are published. I would want to equip my newsroom with the means to do that.

The problem is, as I mentioned, that bridging old-school journalism and the newer form can be intimidating and awkward if you force it, so a pilot program would be a helpful start. I would partner a handful of journalists who are open to the idea with a data journalist. Together they would look through the datasets of their beat, write scrapers to monitor relevant sites, write bots to notify them when things are updated, build a collection of historical databases relevant to their newsbeat, and work together over several weeks or months. Find ways to aid the process: for example, if a journalist is doing an action repeatedly, help them write the code to replace the manual work. I would do an audit of the journalistic process, and be very clear that it is about helping the journalists and the newsroom get stories.

In short, it would be important to say that knowing the data on your beat is expected of you, and then to go a step further and back that up with support, through a pilot program or by partnering a digital team with journalists.
How should we teach data journalism to those who are younger and older, less experienced and more experienced?
You don’t need a different technique for different types of journalists. You want to prove to journalists that technology and computing can make their jobs easier, make them better journalists, get stories no one else has, and save them time. Whether you are a new journalist or an experienced one, with or without a solid understanding of data, those are the key principles in selling digital to a journalist. Increasingly, journalists do not have time, whether they are young or more experienced. Then again, what a journalist values and needs will vary from person to person. The fundamental principle is to listen, and to help write a script that fulfils the actions they are repeating over and over again. Show them the datasets that they are expected to know about. Ask: what are your fears? What do you understand or not understand? How can we make this easier? Help them understand how computational methods, scrapers, bots, analysis, and statistics get stories that they never would have before. I think every journalist would be open to that, but the application to each person is different.
Have you made any surprising observations in the past few years? For example, were there any unexpected areas of application or unexpected barriers to adoption?
Of course, there are always unexpected areas of application. There was a journalist who told me when I started that Twitter is a joke and no one would ever use it. Today, that same journalist is the biggest tweeter and uses Twitter for everything: getting sources, getting quotes, tweeting stories, and writing stories about social media. So, I guess I am constantly surprised by people who were previously sceptics.
On to your second question about barriers to adoption. I had this naive belief that if you applied the concept I just described, telling journalists that I am just trying to make your life easier, to help you get stories you could never get before, get them quicker than anyone else, and get the ones no one else would, then adoption would follow naturally. You would think every journalist would be jumping up and down, wanting to be trained and to be paired with data journalists. To me that seems simple, because being a great journalist means being resourceful: you have to have the best contacts, know everyone in the industry, and be a go-to. You would think one of your best contacts would be your data. So I was quite surprised that, for a lot of people, the uptake wasn't actually that big.
But, like I said, all these things take time. We are working in an industry of change. Yes, I am working in data journalism, but ultimately I am in the industry of industry change. It’s about changing the industry of journalism. And changing any kind of industry is very difficult. People are obviously very fearful of what this all means, and what their jobs are, and the fear of automation of their work.
There is a lot to learn about human experience, human fear of change and technology, and the question of who chooses to stay and adapt.
I think the last thing I have learnt is that the UK has a long way to go in actually making data available. There have been a lot of advances, and a lot of fighting to open data up and maintain freedom of information (FOI). But when you look at the US, we are still very far behind in terms of open data and how data is published. The more stories I work on and the more datasets I look at, the clearer it becomes that the government bodies producing this data have not made it systematic. The data is so messy that there is no way they could be analysing it to inform policy. As journalists, we are in a way doing the work for them.
As not just a journalist but also a citizen of a globalised state, I find it interesting that we need this field. The work I do to advance journalism through technology needs to happen in every element of our society. These kinds of technological advances need to happen everywhere -- in government, the public sector, and the private sector -- for journalism to improve as well. This was shown by Barack Obama's policies in the US, such as creating the roles of chief technology officer and chief information officer, and requiring every government body to publish data in machine-readable formats.
These changes have led to a lot of very interesting open data projects, for instance at the New York Times and the Washington Post, because the US has to release its data in an open, consistent, and comparable format. I am very surprised by how long it is taking to make this happen in the UK.