How data can power public health investigations — through collaboration

Name: DataJournalism.com

How collaboration makes sense of data and drives investigations

29 November 2022

By Betsy Ladyzhets, Dillon Bergin

In the summer of 2021, the Documenting COVID-19 project published an article with The Kansas City Star about an elected coroner in Macon County, Missouri, who told us he routinely went against CDC guidance and wrote down causes of death that excluded COVID-19 if it “pleases the family.”

The story went viral and was picked up by multiple national outlets. After it was published, a professor at the Boston University School of Public Health reached out to our team to say that he had been studying the potential undercount of COVID deaths across the country. He too was concerned that the anecdote from the Macon County coroner was part of a much larger problem, one that showed itself in his analysis of mortality data at the county level.

When our team first spoke with that professor, Andrew Stokes, he had almost as many questions for us as we did for him. His team had been working for months on a statistical model that led them to believe that something about the country’s death investigation system was resulting in gaps in the expected number of COVID deaths.

Over the next several months, we worked with Stokes’s Boston University team to find counties across the country whose trends in deaths during the pandemic raised concerns that a significant number of COVID deaths were being missed in official death tolls. At the same time, we began working with the USA TODAY network and local reporters, including those from hard-hit states like Missouri, Louisiana and Mississippi. In a follow-up project coming this year, we have continued to collaborate with Stokes and his team, along with local reporters from states with notable demographic disparities in COVID-19 deaths.

This reporting has required reporters and editors across seven newsrooms, a close working relationship with a team of demographers at Boston University, and feedback from several other experts throughout the process. Reporters asked questions and re-asked questions, assessed questions against the data and interviews, took findings back to experts, and started the process over again.

It took time and significant resources, but in the end, we were able to tell a story that we couldn’t have told without collaboration. In this article, we’re going to share some of the things we learned during this project, from deciding if the data is the right foundation for a collaborative project, to working alongside academic researchers and reporters in other newsrooms.

Why this data made sense for collaboration

The COVID-19 pandemic forced an exceptional scenario in journalism: Every newsroom across the globe was working on the same story, and every reporter needed the most robust, current data to understand that story. As governments failed to quickly provide this data in the early days of the pandemic, groups of journalists, scientists and citizens stepped in to provide it. These collaborations were not just a matter of preference but of necessity.

Numerous data-focused collaborations sprang up in the first year of the pandemic. They ranged from the well-known, hundreds-of-volunteers-strong COVID Tracking Project, housed at The Atlantic magazine and supported through foundation grants, to smaller projects monitoring COVID-19 cases in schools, state and local pandemic policies, contact tracing efforts and many other metrics.

Some initiatives, like the New York Times COVID-19 dashboard and the Bloomberg vaccination tracker, were housed entirely within journalism organizations and took advantage of existing infrastructure and resources. Others were decidedly more “bootstrapped,” running on simple spreadsheets or data visualization platforms.

For all of these initiatives, a similar pattern arose:

The more people you have collaborating on a dataset, the more capacity you build for catching errors, identifying nuance and communicating data findings to a wide audience. Without careful setup at the beginning of a collaboration, however, such communication can get unwieldy. Projects can languish with unclear goals and wonky data limitations, and important insights may not make it to the people who need them most.

The Documenting COVID-19 project, a collaborative open-records initiative sponsored by Columbia University's Brown Institute for Media Innovation and MuckRock, learned these lessons through its Uncounted investigation. The investigation relied on a combination of national-level data analysis and local reporting on the ground to reveal that short-staffed, undertrained and overworked coroners and medical examiners were nowhere near unified in how they investigated possible deaths from COVID-19. As we prepare to publish a second national story from this project, we’re also pursuing forthcoming projects that will leverage public records to identify more detailed trends in the death investigation systems of specific states.

The far and the near

Across the world, scientists, journalists and state health organizations have estimated the “true toll” of the pandemic using a metric called “excess deaths.” Excess deaths are the number of deaths in a given time period that exceed what would be considered normal in any other year. In the case of COVID-19, researchers hypothesize that some of these excess deaths are, in fact, COVID-19 deaths that did not get counted correctly, or otherwise occurred because of the pandemic’s social and economic upheaval. This could include people who died during COVID-19 surges because they were unable to receive medical care for other conditions, for example.
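The arithmetic behind the metric can be sketched in a few lines of Python. This is a minimal illustration, not the model used by Stokes's team or any official method: it assumes the "normal" number of deaths is simply the average of a few pre-pandemic years, while real excess-death models typically also adjust for population change and long-term trends. The county figures and the names `excess_deaths`, `baseline_2015_2019`, `observed_2020` and `reported_covid_2020` are made up for the example.

```python
from statistics import mean


def excess_deaths(observed, baseline_years):
    """Return observed deaths minus the average of the baseline years."""
    expected = mean(baseline_years)
    return observed - expected


# Hypothetical county (made-up numbers, not from the investigation).
baseline_2015_2019 = [410, 395, 420, 405, 415]  # deaths per year before the pandemic
observed_2020 = 480                              # all-cause deaths in 2020
reported_covid_2020 = 40                         # deaths officially attributed to COVID-19

excess = excess_deaths(observed_2020, baseline_2015_2019)
unexplained = excess - reported_covid_2020

print(f"Excess deaths: {excess}")
print(f"Excess deaths not attributed to COVID-19: {unexplained}")
```

In this made-up county, the baseline average is 409 deaths, so 2020 shows 71 excess deaths, of which 31 are not explained by reported COVID-19 deaths. A gap like that is a starting point for reporting, not proof of an undercount.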

While many people first heard the term “excess deaths” during the pandemic, the metric has a long history in public health research as a way to calculate the broadest possible impact of major health events. Researchers even use excess deaths to look back into history at events like the 1918 flu pandemic.

The Uncounted project focuses on excess deaths in the U.S., but these data are available from almost every country. The Economist, which tracks this metric globally, has compiled sources for 117 countries around the world.

As we investigated excess American deaths, our main question wasn’t just whether COVID deaths were being undercounted, but how. We also knew that any answer to that question would have to account for how undercounting happened in different regions of the country. We needed to pair questions about the larger scheme with knowledge of the specifics of public health in a given county. We were searching for “the far and the near,” and that meant finding reporters who wanted to collaborate and knew which stones to turn over, so that our whole team could compare data to lived experiences in that area.

Death data originates at the local level

This approach turned out to be even more essential for this project because mortality data is shaped by how and where it is recorded. Like the overall public health system in this country, death investigations in the United States are a patchwork system. Some deaths are investigated with state-of-the-art technology and expertise, while others don’t go beyond a phone call with the family.

When someone dies in a hospital or health care facility, death investigations can be straightforward and are mostly standardized.

When a doctor isn’t present, a separate system often comes into play — the death investigation system of coroners and medical examiners. In the United States, the training, expertise and resources of a coroner or medical examiner in one county can be wildly different from the person investigating deaths in the county next door. This changes the quality of the data, and makes comparing data from different counties a complicated task.

Questions to consider

Does your data tell you about the far or the near? Are you the right person to explain those angles, or could you tell a fuller story with the help of someone who knows that angle more intimately?
Is there an expert that you could develop a mutually beneficial relationship with?
Does the story the data tells end with questions that someone else could answer?
Does the data connect different areas or issues?

A quick dive into the data behind mortality statistics

Where do you find data about death?

Our investigation started with data about death: who is dying and how. Most of the time, the public sees this data in big picture mortality statistics, like the percentage of people who died from cancer in 2022, or from homicides in 2021. Reporters are more interested in where this all starts. So, where do you find data beyond those big national statistics, about what people die of at a local level? The simple answer is the basic unit of this data: the death certificate. The more complicated answer is that, in the United States, how information reaches death certificates and how it is shared afterwards is highly variable.

Most states in the U.S. don’t consider death certificates public record, though the implications for public health and safety make death certificates the type of document that needs more sunlight.

The wonder of WONDER

In the United States, the next best way to get data about death at a local level is to use WONDER, a query portal offered by the Centers for Disease Control and Prevention. WONDER stands for Wide-ranging Online Data for Epidemiologic Research. After information is recorded on a death certificate, the data is sent on to the CDC, where it is entered into WONDER’s provisional mortality database. For more details about WONDER, see our reporting recipe here.
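Once a query has been exported from the portal, the file usually needs some cleanup before analysis. The sketch below is an illustration under stated assumptions, not a description of our pipeline: it assumes the export is tab-delimited text with a leading "Notes" column, that small counts are replaced with the word "Suppressed", and that a "---" line separates the data rows from footer notes. Check these assumptions against your own file. The sample rows, county names and county codes are invented, and `parse_wonder_export` is a hypothetical helper name.

```python
import csv
import io

# Invented sample that mimics the ASSUMED layout of an exported query result.
SAMPLE_EXPORT = (
    '"Notes"\t"County"\t"County Code"\t"Deaths"\n'
    '\t"Example County A, MO"\t"00001"\t"152"\n'
    '\t"Example County B, MO"\t"00002"\t"Suppressed"\n'
    '"---"\n'
    '"Dataset: example footer text"\n'
)


def parse_wonder_export(text):
    """Parse data rows, stopping at the footer and keeping suppressed counts as None."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    rows = []
    for fields in reader:
        if not fields:
            continue  # skip blank lines
        if fields[0] == "---":
            break  # everything after this marker is assumed to be footer notes
        record = dict(zip(header, fields))
        deaths = record["Deaths"]
        # A suppressed count is unknown, not zero, so store None rather than 0.
        record["Deaths"] = int(deaths) if deaths.isdigit() else None
        rows.append(record)
    return rows


rows = parse_wonder_export(SAMPLE_EXPORT)
for row in rows:
    print(row["County"], row["Deaths"])
```

Keeping suppressed values as `None` rather than zero matters for small counties: treating an unknown count as zero would make a county look as if it had no deaths in that category.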

Making sense of data in collaboration with experts

Taking advantage of different skillsets

When a data journalist reaches out to an expert, the process is typically rather one-sided. The journalist comes with endless questions and relies on the expert to guide them through analysis; the expert is primarily donating their time and skills in service of a resulting story.

Successful collaborations like Uncounted start from a different mindset, in which both journalists and experts seek to have their work elevated through partnership. While Stokes and his team provide us with novel data analysis, our reporting provides the demography team with new ideas for research and connects the academic work to people’s experiences.

For example: Stokes and his team have used the CDC’s mortality data to demonstrate that people who die at home are more likely to have their deaths attributed to nonspecific causes than those who die at the hospital. As reporters, we can talk to the coroners who fill out those certificates of people dying at home and learn about the resource limitations they face in determining specific causes of death.

Making time for thorough collaboration

In-depth collaboration takes much more time than a simple Q&A about methodology. At different points in reporting Uncounted stories, we have had weekly meetings with Stokes, along with hundreds of emails and Slack messages exchanged between teams. The meetings often include progress updates, discussions of new findings, and decisions about the best way to visualize an important result in the final story.

Close collaborations with experts can be particularly valuable in the final stage of a story’s production, when data get fact-checked and headlines are determined. Experts can ensure that key statistics are represented accurately and answer last-minute questions from editors, though reporters might have to spend time translating from academic jargon to more accessible language.

Finding human stories behind the data in collaboration with other reporters

Balancing local- and national-level reporting

Different types of journalists bring different skills and contexts to a project, which can enhance a collaboration if organized carefully. With Uncounted and other large stories at the Documenting COVID-19 project, we’ve found it particularly valuable to pair up with local reporters: while we offer expertise in data, investigative, or beat reporting, local reporters bring invaluable knowledge of their communities.

As specialist reporters, we often take the lead on analyzing data and talking to experts, using the results to prepare memos; these memos provide key questions that our partners can pose to local sources. For example, an analysis of CDC mortality data for a particular county may lead us to an unusual finding about deaths from ill-defined or nonspecific causes. A local reporter can then question their county’s coroner or public health officials about the finding. “Is it possible some of these deaths were actually from COVID-19?” they might ask. After receiving a response, we can interpret it together.
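The kind of check that can feed such a memo is sketched below. It is a simplified illustration, not our actual analysis: it assumes deaths are already tallied by ICD-10 code, treats any code beginning with "R" (the ICD-10 chapter for symptoms and ill-defined conditions, R00 to R99) as nonspecific, and uses an arbitrary threshold to decide what counts as unusual. The county numbers and the names `ill_defined_share` and `FLAG_THRESHOLD` are hypothetical.

```python
def ill_defined_share(deaths_by_code):
    """Return the fraction of deaths coded to ICD-10 chapter R (ill-defined causes)."""
    total = sum(deaths_by_code.values())
    ill_defined = sum(
        count for code, count in deaths_by_code.items() if code.startswith("R")
    )
    return ill_defined / total if total else 0.0


# Made-up death counts by ICD-10 code for one hypothetical county.
county = {
    "before": {"I25": 120, "C34": 80, "J44": 40, "R99": 10},
    "during": {"I25": 130, "C34": 78, "J44": 45, "R99": 32, "U07": 35},
}

FLAG_THRESHOLD = 0.03  # arbitrary cutoff chosen for this example only

before = ill_defined_share(county["before"])
during = ill_defined_share(county["during"])
flagged = (during - before) > FLAG_THRESHOLD

print(f"Ill-defined share before the pandemic: {before:.1%}")
print(f"Ill-defined share during the pandemic: {during:.1%}")
print("Worth a question for the coroner" if flagged else "No unusual change")
```

A flag like this does not say why the share changed. It only tells a local reporter which county, and which category of deaths, to ask about.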

Communicating frequently, remaining flexible

Regular meetings with journalism collaborators, like those with experts, are valuable for staying on the same page about story progress. These meetings might include anything from figuring out a story’s overall angle to nitty-gritty details like which types of charts work best in a partner’s CMS. We also check in with partners using email, phone calls or a shared Slack server, depending on their preference.

Early in the collaboration process, we make sure everyone involved in the project understands overall objectives and is comfortable with the planned timeline. Plans need to be flexible, though, especially when you’re working with local reporters who have limited bandwidth. Journalists managing daily news deadlines might need more time to complete an enterprise project – especially true for health reporters on call when new COVID-19 surges hit. Similarly, covering a challenging topic like the pandemic itself requires flexibility and empathy; reporters on either side of the partnership might need to take a step back due to burnout from years of covering this crisis.

Another crucial thing to discuss is the story’s editing process: Which editors will take on which sections of the story, or which stages of the draft? Who will fact-check data points? Which publication’s style guide will you use when there are conflicts? Who will decide when the story is finally ready for publication? All these questions may sound overly tedious at the start of a project, but anticipating these issues early on can save headaches later.

Questions to consider

What skillsets do different members of the collaboration team bring to the table, and how might they be used for this project?
How much time could reporters on the collaboration team feasibly spend on this project, and how much time might it take to complete?
What timeline and story format (length, multimedia components, etc.) are feasible based on other deadlines and commitments?
Who is responsible for different aspects of reporting and editing? What is the hierarchy of edits, and who makes final decisions?
What audiences are served by the different partner outlets, and how can you ensure the final story meets their needs?
What are the team’s preferences for communication? (Email, Slack, regular meetings, all three?)

Sharing final stories with different audiences

The collaborative work doesn’t actually end when a story is published: coordinating with partners on sharing final stories can help your work reach the widest audience possible. We always make sure stories are published on MuckRock’s site on the same day as their publication on partner sites, and even try to coordinate with experts on releasing data findings in academic formats like preprints. Writing social media posts in advance, ensuring everyone has access to graphics, and tagging each other can all help with a unified marketing campaign for the story. And if you’re planning follow-ups, sharing reader responses across outlets can also be incredibly valuable.

The value of collaboration is worth the hassle

Working on collaborative journalism projects can mean endless hours in meetings and email threads, haggling over basic style choices or going over data points numerous times. But the effort is worth it to produce truly unique stories that couldn’t come from any one partner.

This is especially true for local newsrooms. In the U.S., many local publications are shrinking: few are able to dedicate time and resources to big investigative projects or to complicated beats like science and health. The Documenting COVID-19 project offers these newsrooms assistance with specialized reporting tasks; we help them produce high-impact enterprise stories while maintaining capacity for day-to-day news.

As we continue new projects in this collaborative model, we’re inspired by other organizations that work similarly, like the ProPublica Local Reporting Network, the nonprofit environmental newsroom Floodlight, and the international project Unbias the News. We hope our work can be a resource for data journalists interested in trying this model, as we prioritize sharing skills instead of competing for scoops.
