Mastering data for better business journalism

A beginner's guide to using data for economic reporting

If money makes the world go round, business journalists communicate and explain the dizzying spins that affect everyone.

Their reporting underpins almost every part of society. There's no shortage of stories about how multinationals make their billions, not-for-profits fund activities, or people invest their money. What journalists add is the critical bridge between complex issues and how people understand the impacts on their lives.

Business reporters who cast these stories in understandable and accessible ways help their audiences make better decisions. Business journalism is one of the most demanding and dynamic fields in the profession, and it can't be done well without an excellent grasp of data.

As a business journalist, you can find yourself doing anything from a quick turnaround on earnings reports to filing briefs on corporate comings and goings for a trade publication or interviewing a CEO for a profile piece.

Maybe you follow a few specific companies with a contact list of insiders who can give you scoops. Or perhaps you are chasing a long-form piece with character arcs and narrative plotting and pacing.

No matter how you cover business -- text, short social media posts, video, podcasts -- data journalism should be in your tool kit.

You've almost certainly been working with data already: industry or government statistics, U.S. Securities and Exchange Commission filings, projections from market analysts, and more. This work probably doesn't seem like "data journalism" because the volume of information is low, or because the analysis seems nowhere near as complex as what you imagine data journalists do.

But it's all a matter of degree. Business journalists typically look for some numbers for their stories because they want to compare and contrast things. How company A differs from B. The valuation of a startup and what it would need to achieve to make that number seem reasonable. Consumer trends and financial pressures. All numbers.

Reporters and editors do this without a second thought because it's just part of the work. Large scale or small, though, it's all data journalism, translating information into words or images. Learning more about it, and even collaborating with colleagues who focus on the data aspect, can enrich your work. The more you understand, the better your handicraft will be. It can also be easier than it seems.

Beyond narrative alone

Data journalism relies heavily on maths—essentially specialised languages for conveying certain types of relationships and truths—and technology that allows the storage, manipulation, and analysis of all sorts of information. These fields, different from spoken and written language, can help add power to reporting.

Journalism has an affinity for such traditional story narrative elements as characters, plots, development, and emotional hooks. While fine, the approach has limitations. Reporters and editors might pass over stories that lack the inherent "dramatic" elements but nonetheless are important for an audience.

Data and, by extension, technology and mathematics may seem, through exposure at school, cryptic, cold, and cruel businesses. But, as types of languages and skills, they aren't any more so than the study of music or German or automobile repair.

Data and analysis can mix with the narrative impulse of many journalists. Not as a replacement, as most people won’t take up a quasi-mathematical treatise along with descriptions of the data structures for their entertainment reading. Instead, aim to bolster your current information sources and even story concepts.

Framing coverage

Before considering how data journalism can help, it's best to start with some analytic thought about your coverage. Glueing data analysis and visualisations onto stories doesn't make much sense if they aren't compatible or needed.

Just as you learn to ask who, what, when, where, how, and why in basic reporting, start with some questions. Here are a few examples, although don't treat them as limits:

  1. What is the nature of my beat and where does it intersect with information?
  2. How do people in the industry measure their businesses' performance?
  3. Are the measures they use reasonable?
  4. Is there information that might illuminate aspects of what I'm trying to describe?
  5. Can data support or refute claims that people I've interviewed are making?
  6. What information puts a company or person into a larger context?
  7. Can I generalise the specific, finding larger frameworks of data that extrapolate from an example and find a matching trend?

I have been a business reporter for more than 25 years, covering everything from startup issues to multinational controversies. Here are some examples where I've asked such questions and found answers and applications in my own work:

Are the measures they use reasonable?

While interviewing a CEO for a company profile, I got into a discussion that turned heated on the CEO's part.

The business repeatedly used pro forma financials—presentation of results that eliminate one-time gains or losses to show how the ongoing business was doing. Pro forma statements can provide insight or cover shortcomings, depending on how they're used and understood. But this company used them every quarter, which means what should have been exceptions were really usual and expected conditions.

I insisted on discussing standard accounting treatments (called GAAP, or generally accepted accounting principles, in the U.S. and IFRS, for international financial reporting standards, in most of the rest of the world). Such rules allow comparison of companies on an even basis and prevent executives from using accounting as a way to hide the truth of corporate performance.

The approach told me something important about the company's performance and what it was trying to achieve, which was to create a pretty picture that didn't really exist. It also enraged the CEO, who saw a carefully cultivated picture begin to crack.

Is there information that shows aspects of what I'm trying to describe?

A piece I wrote for Fortune explained why life could be so expensive even though overall inflation was at historical lows. To tell the story, I pulled together sources of information about the growth of U.S. inflation (the consumer price index), per-capita disposable income (money left after paying taxes), and costs of homeownership, rent, health, and school and childcare.

I assembled the columns of numbers in a spreadsheet and then indexed each. That is, for every category, I divided the values of all months by the value of the first. By making every month a multiple of the first, I could show growth over time as a series of percentages of that first value.
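The indexing step described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the spreadsheet used for the article: the function name `index_to_first`, the category names, and every number below are made up for the example.

```python
def index_to_first(values):
    """Express each value as a percentage of the first value in the series."""
    if not values:
        raise ValueError("series is empty")
    base = values[0]
    if base == 0:
        raise ValueError("first value is zero, so the series cannot be indexed")
    return [value * 100 / base for value in values]


# Made-up monthly figures, for illustration only (not the article's data).
series = {
    "disposable_income": [200.0, 202.0, 205.0, 210.0],
    "rent": [50.0, 52.0, 55.0, 60.0],
}

# Index every category against its own first month.
indexed = {name: index_to_first(values) for name, values in series.items()}

for name, values in indexed.items():
    print(name, values)
```

Because every series starts at 100, categories measured in very different units can share one chart, and the gap between lines shows which grew faster.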

Then I put everything into a graph format to make the comparisons easy to see. Disposable income always lagged far behind everything else.

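[Figure: graph of the indexed series, comparing growth in disposable income with inflation and the costs of homeownership, rent, health, and school and childcare.]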

Can data support or refute claims that people I've interviewed are making?

A PR person described a client of his as a Fortune 500 company. The business was in a segment of technology I had covered in some depth and yet I had never heard of the name. A quick browser excursion to Fortune's site let me search through the current Fortune 500 list of the largest public corporations. Surprise, surprise, there was no listing for the company.

When I challenged the PR person, he said it was a "Fortune 500 type company." I sent the email to the trash. But imagine the devilry that would have arisen had someone quoted the assertion without checking data to verify. This is an example of how data journalism can do critical background work invisible to the audience.

What information puts a company or person into a larger context?

Looking back at the effects of the global financial crisis on wages for Forbes.com, I wanted to move beyond mean and median representations of data, which often obscure a fuller reality.

Means and medians do offer one way to categorise a body of data, but they do so by eliminating an understanding of how things may vary in different circumstances. If you have €20 and a drinking companion has none, there's little doubt of who will have to pick up the bar tab, even though the average amount each of you has is €10.
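The bar-tab example can be checked with Python's standard `statistics` module. The sketch below compares the pair from the text with a second, hypothetical pair in which both people hold €10; the helper name `summarise` is an assumption made for the illustration.

```python
from statistics import mean, median


def summarise(amounts):
    """Return the mean and median alongside the lowest and highest values."""
    return {
        "mean": mean(amounts),
        "median": median(amounts),
        "min": min(amounts),
        "max": max(amounts),
    }


# The pair from the text: one person has 20 euros, the other has none.
uneven = summarise([20, 0])
# A hypothetical contrasting pair: both people have 10 euros.
even = summarise([10, 10])

print("uneven:", uneven)
print("even:  ", even)
```

Both pairs report the same mean and the same median, so neither summary reveals who can pay. Only the minimum and maximum, or the full breakdown, show the difference.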

A web search revealed that the Federal Reserve Bank of Atlanta had broken out wage growth by wage size, skill level, and full- or part-time status. Downloading the data, I was able to create a set of graphs, like the one below. The graphs painted a picture of how income inequality increased and, even as overall wages began to recover, some categories of workers had lost more than they would regain.

[Figure: one of the graphs of wage growth built from the Federal Reserve Bank of Atlanta data.]

Role of data in business journalism

All journalism serves to answer questions that the public might have. Some outlets, such as Nate Silver's FiveThirtyEight, The Upshot at the New York Times, and the Guardian's data journalism blog, regularly use data as the main tool in business journalism. An intriguing new not-for-profit venture, The Markup, develops and builds its own datasets to report on large high-tech companies.

Data sets can become characters in their own right. A story might address a curious pattern someone noticed in information, accompanied by explorations of how the results came to be and why they are relevant to readers.

Or the story could be about the existence and use of the data itself, as in the piece that Olivia Solon and Cyrus Farivar did for NBC News about how Facebook has used its data "to fight rivals and help friends."

Data can also support more traditional coverage, whether inverted pyramid hard news coverage or a feature. When LinkedIn filed to go public, I pulled together a story for CBS Interactive, using data from web searches and financial filings, to compare long-standing claims of profitability with the reality of being in the red for much of that time.

An important rule of thumb is to let data highlight and amplify the answer to an inquiry, or even the existence of the question itself. Avoid using data for its own sake or you risk losing your readers. Before anything else, though, there is preparatory work.

Statistics, studies, and polls

Forget performing mathematical calculations. Start with a consideration of what a given set of statistics claims to show. Who released the numbers? How were they gathered? Are you looking at a collection of data over time, like from a government agency? Or is this a study or poll that requires additional information to understand its validity and limitations?

Never be satisfied with a summary of data. Get the whole study. Years ago, I wrote an entire piece about how a "statistic" about 14% of all laptops being stolen was utter hogwash. A combination of interviews, data, and some easy analysis set up the entire story:

· The company offering the number to every reporter available had a business in computer insurance.

· Because it only looked at its own customer base, the subjects seemed more likely to lose or damage a machine. Otherwise, why buy insurance?

· The insurer did not release the details of how it arrived at its numbers, offering only a figure for how many claims it had. Reporters used estimates from market analysts to calculate a percentage of loss and did not consider how the company framed things to look particularly favourable for its interests.

· Law enforcement agencies and the broader insurance industry did not track laptop loss or theft at all, so there was nothing comparable.

· If you calculated the cost of laptops, the loss rate, and the price of the policies, you would see that the company should have been out of business almost immediately.
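The last check in the list is simple arithmetic, sketched below in Python. Only the 14% figure comes from the text, and treating it as the annual loss rate among the insurer's customers is an assumption. The laptop price and the premium are hypothetical placeholders, not figures from the original story.

```python
def expected_payout(loss_rate_percent, replacement_cost):
    """Average claim cost per insured laptop per year, ignoring overheads."""
    return loss_rate_percent * replacement_cost / 100


LOSS_RATE_PERCENT = 14      # the claimed "statistic" from the text
REPLACEMENT_COST = 1500.0   # hypothetical laptop price
ANNUAL_PREMIUM = 100.0      # hypothetical policy price

payout = expected_payout(LOSS_RATE_PERCENT, REPLACEMENT_COST)
shortfall = payout - ANNUAL_PREMIUM

print("Expected payout per policy:", payout)
print("Premium collected per policy:", ANNUAL_PREMIUM)
print("Loss per policy if the claim were true:", shortfall)
```

Under these placeholder numbers the insurer would pay out more than twice what it collects on every policy. Substituting real prices is what turns this into a reporting check: if the shortfall stays large, the claimed loss rate cannot describe the insurer's own book of business.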

Never underestimate how much a company is willing to misuse you. After this article appeared, I noticed that virtually all use of the "statistic" suddenly disappeared from business and tech coverage.

There are many resources for learning how to judge whether what you are given seems reasonable. I wrote a piece for the Reynolds National Center for Business Journalism about how to assess a survey. Other resources include the American Association for Public Opinion Research, the Pew Research Center, the International Center for Journalists, and the Poynter Institute. Meanwhile, NICAR-Learn also recently made its data journalism videos free for one year.

Checking metrics

Metrics are measurements of ongoing activity, like production figures of a company's blue spanner line and a country's unemployment rate when half of the blue spanner makers are laid off.

They may be the product of regular data collection or something pulled together for an article when journalists round up information and then categorise and count it. Public companies throw off never-ending compilations of metrics as required by government agencies.

As with statistics, do not automatically take them at face value. Information handed to you might be correct or arranged to create an effect, as with the pro forma financials mentioned earlier. Also, historical trends can lead to wrong conclusions. Pointing to sales history as a guide to how a new product will do helps not at all if the product and sales territory are outside of what the company has previously done.

Estimates and projections

Everyone in business or investing wants to know the future. Estimates and projections try to scratch that itch. There was a stretch of time during the big slide this winter when I was monitoring the major U.S. stock indices for Fortune. I kept calculations in a set of spreadsheets, regularly updated to incorporate new information, until finally conditions were right for an article on how the Dow lost all the gains it had made since Donald Trump's inauguration on January 20, 2017.

Warning: not all editors are comfortable with such work. I recently had an editor insist that I take out similar types of modelling in a piece I was working on and only use a number provided by another source. Ironically, the final choice of citation was an article in a publication where a data journalist had independently used the same approach I was using.

More often, third parties offer their projections: for example, how quickly consumers will adopt 5G telephone service, or where the stock market might be in six months. Projections are almost always wrong. If they were regularly correct, the people who make them would find more effective ways to profit from their prognostic talents. That is not to suggest you should always avoid estimates. But consider the background of who creates them, their potential motivations, and the history of their previous estimates. Always take such guesses with a grain of salt and remember that data are often uncertain. Recognise the limits of the data while employing them.

Data sources

You must obtain data before incorporating it. This is both easy and challenging. The easy part is getting stacks of information from many governments. Data on economics and commerce is omnipresent through official agencies. Many countries have clearinghouses that generate and disseminate data. Additional government agencies often have their own.

Take the U.S. as an example. There are tremendous data resources at all the cabinet-level agencies as well as regulatory bodies and the Federal Reserve and its regional banks. Similarly, you may find data, though not as much, at state and local levels. You will also find data available through universities, thinktanks, corporations, political groups, industry groups, lawsuit court filings (an underappreciated resource), international institutions, polling firms, and analysts, to name a few potential sources.

However, before you embrace that cacophony of information, take a moment to consider a point made by Liliana Bounegru in the Harvard Business Review. Reliance on existing data sets can "exacerbate the tendency to amplify issues already considered a priority, and to downplay those that have been relegated or which aren't on the radar screens of major institutions."

That reliance falls short in two major ways. First, when everyone uses the same data, it can become challenging to find a story not available to everyone else. Second, neglected people and issues get passed over. One way to get past this is to start pulling together data from different sources to create a fuller view. A personal example came after the Wall Street Journal ran its articles on public companies that got federal COVID-19 financial relief intended for small businesses.

The Journal sorted through U.S. Securities and Exchange Commission financial filings to identify public companies that had mentioned the U.S. Paycheck Protection Program, the small business loan program that was part of the government's response to the pandemic-fueled economic crisis, then put together its list.

It can be a big task, but one made easier if you use the full-text search at the SEC's EDGAR site or a third-party SEC data search engine like SEC Info that allows this type of broad searching across filings. I used the latter last October when writing for Fortune about mortgage-backed bonds that included buildings with WeWork as a major tenant. A search across the body of filings for the term WeWork turned up each prospectus that had to list the largest tenants in the buildings whose mortgages were included in the bond.

Getting the tools you need

You do not have to be a maths whiz to do much of this, but it greatly helps to improve your understanding of what you are looking at.

Much of the analysis I have mentioned required a combination of some technical skills, patience, and fundamental maths: addition, subtraction, multiplication, division, and working with and calculating percentages and fractions. Additionally, a grasp of basic probability and statistics helps identify weaknesses in data analysis and source material.

There are many free and low-cost online courses where you can brush up skills and gain new ones. For example, the Knight Center for Journalism at the University of Texas at Austin periodically has offerings in data journalism and data visualisation. The Poynter Institute has self-directed courses. The Reynolds National Center for Business Journalism at Arizona State University has video workshops. And, of course, there is useful material at DataJournalism.com.

If you have the chance to work with a data journalist on a project, that person can also potentially provide help and bolster those areas where you might be weak. You will also want tools to make work easier. Some of the basics for analysis are calculators and spreadsheets. Databases can help, but are more complicated; if you do not have experience, find someone who does.

Sometimes more specific statistical analysis is helpful. Microsoft Excel has many applicable functions, but you need to understand what they do and how they work. For advanced statistical tools, take a class or find someone who already knows how to use them.

Chances are you'll also want data visualisation tools, which help build images that often portray data more clearly. Do not expect to become an expert in any of this overnight. But, on the brighter side, toss the notion of having to achieve some acknowledged level of ability before starting to incorporate data.

Instead, start from where you are. Look for ways to incorporate data into developing stories or see what ideas data itself might generate. Over time, your data work will become better, enriching your business reporting.
