AMA with Steve Doig

Conversations with Data: #41

Hi there! Can you believe it’s December already? With only one edition left before the end of the year, we thought we’d give you some lessons to ponder over the holiday season from one of the field’s greats.

That’s right, in this 41st edition of Conversations with Data, we let you loose to question renowned data journalist, now professor, Steve Doig.

With over 20 years of experience teaching budding journalism students at Arizona State University, and another 20 years pioneering data work at the Miami Herald before that, he’s got plenty of tips and tricks up his sleeve.

What you asked

What was your first data-based story and what did you learn from it?

“Hmm, I guess it might be from when I was covering the Florida Legislature in the Herald's state capital bureau. One task when doing a story that involved a roll call vote was to write out a ‘How They Voted’ sidebar list of names of who voted for and against some measure. Often in the story there also would be mention of some simple metric like how the parties had split in the vote. I realised my then-new IBM-PC could help me do that better.

I had started teaching myself to write programs in BASIC, so I conceived and built, over the course of a week or so, a clunky program that let me quickly mark in a table how each lawmaker had voted, and then with a press of a button would generate the sidebar. But even better, I had included what I'd call political demographics about each lawmaker, including such categorical data as party, gender, race, rural vs urban, state region, leadership vs. rank-and-file, and so on. So my program also would generate cross-tabs on each of those categories, which often would reveal more interesting explanatory patterns than simple party breakouts.

What I learned is that the computer can be a great tool for handling repetitive tasks, like typing out that required sidebar or looking for interesting patterns. I also learned that new tools come along quickly, including the simple off-the-shelf database program I discovered a few months later that made my roll call analysis program obsolete!”
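For readers curious what a roll call tool like this might look like today, here is a minimal sketch in Python. It is not Steve's BASIC program: the lawmakers, votes, field names and the `sidebar` and `crosstab` helpers are all made up for illustration. It only shows the two ideas he describes, a ‘How They Voted’ list and cross-tabs by category.

```python
from collections import Counter

# Hypothetical roll call: (name, party, region, vote). All values are invented.
ROLL_CALL = [
    ("Lawmaker A", "D", "urban", "yes"),
    ("Lawmaker B", "D", "rural", "no"),
    ("Lawmaker C", "R", "rural", "no"),
    ("Lawmaker D", "R", "urban", "yes"),
    ("Lawmaker E", "R", "rural", "no"),
]

# Position of each categorical field in a row (an assumption of this sketch).
FIELDS = {"party": 1, "region": 2}
VOTE = 3


def sidebar(rows):
    """Group lawmaker names by how they voted, for a 'How They Voted' list."""
    grouped = {}
    for row in rows:
        grouped.setdefault(row[VOTE], []).append(row[0])
    return grouped


def crosstab(rows, field):
    """Count votes for each value of one category, such as party or region."""
    idx = FIELDS[field]
    return Counter((row[idx], row[VOTE]) for row in rows)


if __name__ == "__main__":
    for vote, names in sidebar(ROLL_CALL).items():
        print(vote.upper() + ": " + ", ".join(names))
    for field in FIELDS:
        print("By " + field + ":")
        for (value, vote), count in sorted(crosstab(ROLL_CALL, field).items()):
            print("  " + value + " / " + vote + ": " + str(count))
```

In this invented data the vote splits more cleanly by region than by party, which is the kind of pattern a cross-tab can surface that a simple party breakout would miss.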

Since that start to your data journalism career, what is the best advice you’ve received about data storytelling?

“I would say it was from an editor who made me do more reporting for an early numbers-heavy story about poverty. "This story needs fewer numbers and more voices," she said. "Find more people who are affected by those numbers, and get them into the story." I quickly learned that sometimes a good data story might hang on just one number pulled out of a large data analysis; the people, not just a table of numbers, usually are the real story.”

On the topic of editors: How can you convince your editor that you -- a ‘normal’ journalist with a personal interest in data -- need time to clean the data, and it can turn out to be useless? The transition from ‘personally interested’ to ‘data journalist’ can be challenging in these times of overwork.

“I feel your pain. Good for you for wanting to add data skills to your reporting toolbox, but the reality these days is that you almost certainly will have to invest your own time and own money into developing a skill level high enough to impress the boss. That in fact was my own career path at the Miami Herald, a very competitive newsroom where you needed a superpower to stand out. In about 1981, I bought an Atari 800 computer to play with at home, but began to realise it could help me at work. Before long, I bought my first PC and my first spreadsheet (Lotus 1-2-3) and began adding some data analysis to daily stories. My bosses began to notice, and encouraged me to do more. Data work became my superpower.

So I suggest you start with small stories that you can present as ready-to-go, even if you have to spend time at night and on days off working with the data. It would be hard to persuade editors who may not have data skills themselves to give you, with no real track record of doing data projects, time to clean up a large and messy dataset, do a major analysis, and perhaps discover that there's no real story there. I have my students write pitch memos to a hypothetical editor describing the story, including what has been found in a preliminary analysis, a sample lede, and perhaps a few bullet points. But happily, there are lots fewer editors these days who think that data journalism means you just hit a few keys and stories magically appear.”

Great answer! This is something Maud Beelman and Jennifer LaFleur touch on in our guidance for editors as well. Now to another challenge faced by one of our readers: I tell stories through data visualisation and infographics with less text because of my graphics design background. Can I still call myself a data journalist?

“Yes, of course you are a data journalist. Good data viz is just another way of telling a story, no worse or better in the hands of a good journalist than text or videos or podcasts or other media. For some (many?) data-heavy stories, good infographics in fact can be the most effective way to tell such stories. If you are trained in graphics design, you probably know to avoid ‘chart junk’ and other data graphics sins deplored by Edward Tufte.”

What have you learnt during your transition from a practicing data journalist to a data journalism professor?

“The most important thing I learned is that expertise is necessary, but not sufficient, for being a good teacher. When I became a professor, I was accustomed to teaching newsroom pros how to do simple things with a spreadsheet, usually because someone would come to me and say "I have this data, and I want to use it to answer these questions, but I don't know how to get those answers". With young students, though, I learned I first had to show them how to think about data as a source of stories. I would have them do exercises of taking a dataset, looking at the variables in it, and then coming up with lists of interesting questions the data could -- and couldn't -- answer.”

So what do you think are the most important skills for data journalism students today?

“You probably want me to tell you which specific tools you should master; okay, Excel is the gateway drug into data journalism. But I think the most important basic skill is the mental agility to learn new tools and techniques, and to realise there is no single correct tool as long as whatever you use produces a correct answer. Consider programming languages: These days there are camps of journalists who do data cleanup and analysis with Python or SQL or R or SAS, all of which can do the job. But many of those journalists, at least the older ones, may have started with Visual BASIC or Perl or using now-extinct database programs like dBase or Paradox or Reflex or FileMaker. And five years from now, you may be among a contingent using tools that haven't even been invented yet. Furthermore, the skills and tools you wind up using will depend on what branch of data journalism you want to pursue, whether it is analysis of data for investigative projects, or design of front-facing interactive web graphics, or development of back-end newsroom systems. New ways of doing all those things are steadily emerging, and you need to be ready to adopt those that offer value. I'll add that college taught me NONE of the tools I use today, but college did teach me how to learn new things.”

For more from Steve, check out our video course, Doing Journalism with Data: First Steps, Skills and Tools, or our interview with him in Data journalism in disaster zones.

ICYMI: other happenings on

Even good predictions are hard to communicate to readers. But bad predictions, especially in high-stakes situations (such as elections or financial recessions), can be more than confusing and misleading -- they can be dangerous. From The Economist’s G. Elliott Morris, our recent Long Read, The dos and don'ts of predictive journalism, draws on examples from political journalism to describe some guidelines for good predictive journalism.


Our next conversation

This year, we’ve aimed to make our conversations a little more global, travelling across Africa and Asia to explore how data journalism is practiced in different contexts. Following on with this journey, for our final edition of 2019, we’ll be heading down to Latin America as well. Joining us all the way from Argentina, we’re excited to have the brilliant data team from La Nacion with us for our next AMA. Comment to submit a question!

As always, don’t forget to comment with what (or who!) you’d like us to feature in our future editions.

Until next time,

Madolyn from the EJC Data team
