
Simulating a pandemic

The backstory behind The Washington Post's most-read article

When Harry Stevens joined The Washington Post as a graphics reporter in September 2019, he never imagined a story he’d publish six months later would become one of the most viewed articles ever on the newspaper’s website. The interactive piece he built circulated around the world, showing how a disease could spread under a number of different scenarios, including one in which people practise social distancing.

His motivation for the piece was simple. “Because social distancing was a relatively new phrase for most people, I wanted to show how a disease like COVID-19 spreads,” says Harry. The simulation showed that social distancing was the most effective way to flatten the epidemiological curve of a disease, even more so than a forced quarantine like the one China imposed.

After the former U.S. President Barack Obama tweeted the visualisation out to his millions of followers, it wasn’t long before Harry’s coronavirus simulator caught the attention of public figures throughout the world. “I saw the Venezuelan dictator Nicolas Maduro sharing it on state television,” says Harry.

Even celebrities joined in: Shakira shared a video on Instagram and Twitter referencing the simulation while asking her fans to stay home. The message? Practising social distancing could have a major impact on slowing the spread of the virus by flattening the epidemiological curve. The graphic explained what public officials couldn’t with words alone. In a bid to democratise information during the pandemic, The Washington Post decided to lift its paywall for certain COVID-19 stories. Fortuitously, Harry’s piece was one of them. This, no doubt, contributed to the 27,000 likes and 96,000 shares of the article on The Washington Post’s Facebook page.

Gone global

While Harry says he received praise from mathematicians and scientists, readers also reached out to tell him how his piece brought a sense of hope to a woefully uncertain situation. “There’s definitely been an emotional response to this piece,” says Harry. “This is a very anxious time for a lot of people. But when you see that you can change the outcome of this by modifying your own behaviour, it gives you a sense of control.”

Soon readers reached out wondering if it could be translated into other languages. “We had a lot of people saying, ‘I want to share this with my parents, but they don't speak English. Can I translate it into Romanian or whatever language they speak?’” he says. The story is now available in 13 different languages, thanks to readers who volunteered to translate it themselves; the newspaper then proofread each translation.

The kernel of an idea

Like many data stories, the idea for Harry’s coronavirus simulator piece came to him at a pitch meeting while brainstorming with fellow reporters. With a background in frontend design and web development, Harry remembered some JavaScript code on network detection that he’d developed a year earlier and wondered how it might apply to COVID-19. “I had the code sitting around and I showed it to the team,” he says. “I suggested we might repurpose it to simulate how things spread through a network and how social distancing works.”


After a nod from his editor, he created a number of prototypes. As he began to tinker with the design of balls bouncing around the screen, he knew the data for the piece would determine everything. “At first I wanted to use real-life data from COVID-19 and simulate the actual virus,” he says. But after a conversation with Lauren Gardner, an associate professor in the Department of Civil and Systems Engineering at Johns Hopkins Whiting School of Engineering, he realised it would be impossible to accurately represent COVID-19’s spread in real life. As a professional forecaster of outbreak trajectories, she explained the process of running models that project COVID-19 cases and deaths. It required a team of PhDs to run computationally intensive mathematical models on supercomputers for hours and hours. Even then, she warned him, much uncertainty remained in the results.

While he received some criticism from readers about the simulation not showing how COVID-19 would unfold, he explained that doing so would be impossible. Others wondered why he didn't have some of the balls (representing people) die off. But the simulation was intended to show how networks interact and the exponential nature of their growth, not to forecast the disease. “The point is that there is no way that I could simulate COVID-19 in the real world. That's why I made a fake disease and made it clear in the piece that it's a fake disease called simulitis,” says Harry.

To explain how a virus similar to COVID-19 could spread exponentially, with or without social distancing or quarantine, he decided to create coloured balls bouncing against each other, representing sick, healthy, and recovered individuals. He used randomised data for his fake disease simulitis, not COVID-19 data, and adjusted the prototype to fit that randomised data. “I think that even though it was so simple, it still mimicked the growth curve that we see in the real data,” says Harry.


The visualisation showed the spread of simulitis over a number of different scenarios. He used different coloured dots representing healthy, sick, and recovered people bouncing around the screen. The outcome of the spread was shown through the following scenarios: a) no quarantine or social distancing measures; b) an attempted quarantine; c) moderate social distancing; and d) extensive social distancing. Social distancing proved to be the most effective measure, even more so than a forced quarantine of the kind imposed in China.
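The mechanics described above can be sketched in a few dozen lines of JavaScript. This is a minimal illustration under stated assumptions, not the Post's code: every name and number here (`runSimulation`, `stationaryShare`, the box size, the recovery time) is hypothetical, the balls pass through rather than bounce off each other, and the quarantine wall is not modelled. Social distancing is approximated by keeping a share of the balls still.

```javascript
// Illustrative sketch only. All names and constants are assumptions.

// Small seeded pseudo-random generator so that runs are repeatable.
function makeRng(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function runSimulation({
  people = 200,        // number of balls
  stationaryShare = 0, // share of balls that stay put ("social distancing")
  steps = 600,         // number of animation frames to simulate
  recoverySteps = 150, // frames a ball stays sick before recovering
  seed = 1,
} = {}) {
  const rng = makeRng(seed);
  const width = 400, height = 300, radius = 3, speed = 2;

  // Place every ball at a random position with a random heading.
  const balls = Array.from({ length: people }, () => {
    const angle = rng() * 2 * Math.PI;
    const moving = rng() >= stationaryShare;
    return {
      x: radius + rng() * (width - 2 * radius),
      y: radius + rng() * (height - 2 * radius),
      vx: moving ? Math.cos(angle) * speed : 0,
      vy: moving ? Math.sin(angle) * speed : 0,
      state: "healthy",
      sickFor: 0,
    };
  });
  if (balls.length > 0) balls[0].state = "sick"; // patient zero

  const count = (state) => balls.filter((b) => b.state === state).length;
  let peakSick = count("sick");

  for (let step = 0; step < steps; step++) {
    // Move, bounce off the walls, and let sick balls recover over time.
    for (const b of balls) {
      b.x += b.vx;
      b.y += b.vy;
      if (b.x < radius || b.x > width - radius) b.vx = -b.vx;
      if (b.y < radius || b.y > height - radius) b.vy = -b.vy;
      if (b.state === "sick" && ++b.sickFor >= recoverySteps) b.state = "recovered";
    }
    // A healthy ball becomes sick when it touches a sick one.
    const newlySick = [];
    for (let i = 0; i < balls.length; i++) {
      for (let j = i + 1; j < balls.length; j++) {
        const a = balls[i], b = balls[j];
        if ((a.state === "sick") === (b.state === "sick")) continue;
        const other = a.state === "sick" ? b : a;
        if (other.state !== "healthy") continue;
        const dx = a.x - b.x, dy = a.y - b.y;
        if (dx * dx + dy * dy <= (2 * radius) ** 2) newlySick.push(other);
      }
    }
    for (const b of newlySick) b.state = "sick";
    peakSick = Math.max(peakSick, count("sick"));
  }

  return {
    peakSick,
    healthy: count("healthy"),
    sick: count("sick"),
    recovered: count("recovered"),
  };
}
```

Running `runSimulation({ stationaryShare: 0 })` and `runSimulation({ stationaryShare: 0.75 })` with the same seed gives two `peakSick` values to compare, which is roughly the comparison the free-for-all and social distancing scenarios invite. The exact numbers depend entirely on the made-up constants.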

The one part of the story that did use real-life COVID-19 data was the exponential curve of confirmed cases in the United States. This was necessary to set the scene and show the steep growth curve of the disease in the country. Using the data set from Johns Hopkins University, the epidemiological curve showed confirmed cases from the first detected case in the country on 22 January 2020 to 13 March 2020, the day before he published the story.


For his graph, he chose the COVID-19 data set from Johns Hopkins University due to its accuracy in data collection. “In the US, they've been very carefully collecting data. Because there's no central repository of all cases and deaths in the United States. You can't just go to a CDC website and get that,” explains Harry. “Johns Hopkins has been contacting all of the states, counties and collecting their data and putting it in a central database.”

Behind the design

Amongst the hundreds of messages from readers, a vast number of requests came through asking how he technically designed the story. While The Washington Post doesn’t share its code on GitHub, the original experimental code that inspired his story is published online here. “I repurposed a lot of that code, so it shouldn't be too hard for someone who knows JavaScript to also spin up a simulation from it,” says Harry.


For the graphic at the top of the story, he used D3.js, a JavaScript library for manipulating documents based on data. To develop the simulations, he used Geometric.js, a library designed for computational geometry. Instead of rendering the simulations in SVG, which could have slowed the loading of the webpage, he opted for the Canvas API, which delivered a smoother experience for the user.
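One reason Canvas tends to suit this kind of animation is that SVG keeps one DOM element per ball, whereas a canvas is a single bitmap that is simply repainted every frame. The sketch below shows what such a repaint could look like with standard Canvas 2D calls. The function name `drawPeople`, the `COLOURS` palette, and the shape of the `people` array are assumptions made for illustration, not details taken from the Post's piece.

```javascript
// Hypothetical palette: the colours used in the original piece are not known here.
const COLOURS = { healthy: "#aed6dc", sick: "#c0562f", recovered: "#cb8ac0" };

// Repaint every person as a filled circle on a 2D canvas context.
// `people` is assumed to be an array of { x, y, state } objects.
function drawPeople(ctx, people, width, height, radius = 3) {
  ctx.clearRect(0, 0, width, height); // wipe the previous frame
  for (const p of people) {
    ctx.beginPath();
    ctx.arc(p.x, p.y, radius, 0, 2 * Math.PI);
    ctx.fillStyle = COLOURS[p.state];
    ctx.fill();
  }
}

// In a browser this could be driven from an animation loop, for example:
//   const canvas = document.querySelector("canvas");
//   const ctx = canvas.getContext("2d");
//   function frame() {
//     /* update positions here */
//     drawPeople(ctx, people, canvas.width, canvas.height);
//     requestAnimationFrame(frame);
//   }
//   requestAnimationFrame(frame);
```

Because the function only touches the context it is handed, it can be exercised outside a browser by passing in a stand-in object that records the calls.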

For the exponential curve showing real data from COVID-19, he wrote a web scraper that pulled the data set in from Johns Hopkins University’s GitHub page. Every couple of days while designing the interactive, he would rerun the scraper to see if the curve had flattened. Instead, Harry found the curve grew steeper and steeper by the day.
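The article does not say how the scraper worked, so the following is only a sketch of the parsing step such a tool might need. It assumes the data arrives as CSV text in a wide layout, with one row per region, the columns `Province/State`, `Country/Region`, `Lat`, `Long`, and then one column per date. The function names and the sample rows are invented for illustration and are not real case counts.

```javascript
// Split one CSV line, honouring commas inside double-quoted fields.
// Simplification: escaped quotes ("") inside a field are not preserved.
function splitCsvLine(line) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (const ch of line) {
    if (ch === '"') inQuotes = !inQuotes;
    else if (ch === "," && !inQuotes) { fields.push(field); field = ""; }
    else field += ch;
  }
  fields.push(field);
  return fields;
}

// Sum the per-region rows of one country into a single daily series.
function casesByDate(csvText, country) {
  const [headerLine, ...rows] = csvText.trim().split(/\r?\n/);
  const header = splitCsvLine(headerLine);
  const countryCol = header.indexOf("Country/Region");
  const firstDateCol = header.indexOf("Long") + 1; // dates follow the Long column
  const dates = header.slice(firstDateCol);
  const totals = dates.map(() => 0);
  for (const row of rows) {
    const fields = splitCsvLine(row);
    if (fields[countryCol] !== country) continue;
    dates.forEach((_, i) => {
      totals[i] += Number(fields[firstDateCol + i]) || 0;
    });
  }
  return dates.map((date, i) => ({ date, cases: totals[i] }));
}

// Made-up sample rows, used only to show the expected shape of the input.
const sample = [
  "Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20",
  "Region A,Exampleland,0,0,1,2,4",
  "Region B,Exampleland,0,0,0,1,3",
  '"Island, North",Elsewhere,0,0,5,5,5',
].join("\n");

const series = casesByDate(sample, "Exampleland");
console.log(series);
```

The resulting array of `{ date, cases }` objects is the kind of structure a charting library such as D3.js can plot as a line, and rerunning the parse on fresh data is enough to see whether the curve is bending.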

Data journalism takes centre stage

By encouraging behaviour change in a global health crisis, the article has served as an exemplary case study for the wider data journalism community. Factual information has never been more in demand, but now there’s an even greater need to make data meaningful to audiences. And that applies to coverage of COVID-19 and beyond.

For Google News Lab’s data editor Simon Rogers, a leading voice in data journalism, the impact of the story is clear: “Data journalism has had these moments that have made it more important. How many people now are looking at epidemiological curves and understanding them now because of data journalists?”


Simon isn’t alone. Amanda Makulec, a senior data visualisation lead at Excella and the Data Visualisation Society’s operations director, believes Harry’s story is a shining example of responsible data storytelling. “It belongs in a top-five hall of fame for being an illustration of data that helped inspire an entire country, and really the world, to make certain choices that are really hard to make around limiting your activity,” she says.

She argues that the power of data visualisation lies in helping people understand complex concepts. In a recent Fast Company article she penned, she warns readers to be wary of misleading charts, graphs, and maps, while also offering advice on how audiences can interpret data visualisations as intended.

She also highlights the need for people to create responsible designs that add value and make an impact: “How are people going to see it and recognise it in these uncertain times? You don't want to have put something out there that can mislead somebody. And right now, misleading information isn't what we need in the public sphere.”

As for what journalists and designers can learn from this pandemic, her answer is simple: “My hope is that one of our biggest learnings is to continue to focus on iterating on how we illustrate uncertainty better.”
