Some Favorite Examples
Written by: Angelica Peralta Ramos, Simon Rogers, Steve Doig, Sarah Slobin
We asked some of our contributors for their favorite examples of data journalism and what they liked about them. Here they are.
Do No Harm in the Las Vegas Sun
My favourite example is the Las Vegas Sun’s 2010 Do No Harm series on hospital care (see Figure 5). The Sun obtained more than 2.9 million hospital billing records through a public records request and analyzed them, revealing more than 3,600 preventable injuries, infections and surgical mistakes. The reporters identified more than 300 cases in which patients died because of mistakes that could have been prevented. The series contains several elements, including: an interactive graphic that lets the reader see, hospital by hospital, where surgical injuries happened more often than would be expected; a map with a timeline that shows infections spreading hospital by hospital; and an interactive graphic that allows users to sort data by preventable injury or by hospital to see where people are getting hurt. I like it because it is very easy to understand and navigate, and users can explore the data in a very intuitive way. It also had a real impact: the Nevada legislature responded with six pieces of legislation. The journalists involved worked very hard to acquire and clean up the data. One of them, Alex Richards, sent data back to hospitals and to the state at least a dozen times to get mistakes corrected.
Government Employee Salary Database
I love the work that small independent organizations are doing every day, such as ProPublica or the Texas Tribune, which has a great data reporter in Ryan Murphy. If I had to choose, I’d pick the Government Employee Salary Database project from the Texas Tribune (Figure 6). This project collects 660,000 government employee salaries into a database that users can search and that helps generate stories. You can search by agency, name or salary. It’s simple and meaningful, and it makes previously inaccessible information public. It is easy to use and automatically generates stories. It is a great example of why the Texas Tribune gets most of its traffic from its data pages.
Full-text visualization of the Iraqi War Logs, Associated Press
Jonathan Stray and Julian Burgess’ work on the Iraq War Logs is an inspiring foray into text analysis and visualization, using experimental techniques to gain insight into themes worth exploring further within a large textual dataset (Figure 7).
Using text-analytics techniques and algorithms, Jonathan and Julian created a method that shows, in visual form, clusters of keywords contained in thousands of US government reports on the Iraq war leaked by WikiLeaks.
The methods have limitations and the work is experimental, but the approach is innovative. Rather than trying to read all the files, or searching the War Logs for particular keywords based on a preconceived notion of what may be found, this technique calculates and visualises the topics and keywords of particular relevance.
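To give a feel for how keywords of particular relevance might be calculated, here is a minimal sketch using TF-IDF, a common term-weighting scheme. This is an illustration only: the text above does not say which techniques Stray and Burgess actually used, and the sample reports, the function names and the `top_n` parameter are all invented for this example.

```python
import math
import re
from collections import Counter

# Hypothetical stand-ins for report summaries; not taken from the War Logs.
REPORTS = [
    "patrol found roadside bomb near checkpoint",
    "roadside bomb detonated near convoy",
    "detainee transferred to police custody",
    "police reported detainee released from custody",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(docs, top_n=3):
    """Return the top_n highest-scoring TF-IDF terms for each document."""
    tokenized = [tokenize(doc) for doc in docs]
    # Document frequency: in how many documents does each term appear?
    df = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(docs)
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # A term scores high when it is frequent here but rare elsewhere.
        scores = {
            term: (count / len(tokens)) * math.log(n_docs / df[term])
            for term, count in tf.items()
        }
        # Sort by score (descending), then alphabetically to break ties.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        results.append(ranked[:top_n])
    return results


if __name__ == "__main__":
    for report, keywords in zip(REPORTS, tfidf_keywords(REPORTS)):
        print(keywords, "<-", report)
```

Terms that appear in every document score zero, so common words drop out without a preconceived keyword list. Grouping documents by their shared high-scoring terms would be one possible next step towards the kind of clusters described above.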
With increasing amounts of data, both textual (emails, reports, etc.) and numeric, coming into the public domain, finding ways to pinpoint key areas of interest will become more and more important. It is an exciting sub-field of data journalism.
Murder Mysteries
One of my favorite pieces of data journalism is the “Murder Mysteries” project by Tom Hargrove of the Scripps Howard News Service (Figure 8). Using government data and public records requests, he built a demographically detailed database of more than 185,000 unsolved murders, and then designed an algorithm to search it for patterns suggesting the possible presence of serial killers. This project has it all: hard work gathering a database better than the government’s own, clever analysis using social science techniques, and interactive presentation of the data online so readers can explore it themselves.
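The text does not describe how Hargrove's algorithm works, so the following is only a hypothetical sketch of one simple way to search case records for patterns: group cases that share a set of attributes and flag unusually large groups for a reporter to examine. The records, field names and the `min_size` threshold are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical records; the field names and values are invented for illustration.
CASES = [
    {"county": "A", "victim_sex": "F", "method": "strangulation", "year": 2001},
    {"county": "A", "victim_sex": "F", "method": "strangulation", "year": 2003},
    {"county": "A", "victim_sex": "F", "method": "strangulation", "year": 2004},
    {"county": "A", "victim_sex": "M", "method": "firearm", "year": 2002},
    {"county": "B", "victim_sex": "F", "method": "strangulation", "year": 2003},
]


def flag_clusters(cases, keys=("county", "victim_sex", "method"), min_size=3):
    """Group cases by shared attributes and return groups of at least min_size."""
    groups = defaultdict(list)
    for case in cases:
        # The tuple of attribute values acts as the group's signature.
        signature = tuple(case[key] for key in keys)
        groups[signature].append(case)
    return {
        signature: members
        for signature, members in groups.items()
        if len(members) >= min_size
    }


if __name__ == "__main__":
    for signature, members in flag_clusters(CASES).items():
        years = sorted(case["year"] for case in members)
        print(signature, "->", len(members), "cases in years", years)
```

A flagged group is only a lead for further reporting, not evidence of a link between cases.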
Message Machine
I love ProPublica’s Message Machine story and nerd blog post (Figure 9). It all got started when some Twitterers expressed their curiosity about having received different emails from the Obama campaign. The folks at ProPublica noticed, and asked their audience to forward any emails they got from the campaign. The presentation is elegant: a visual diff of several different emails that were sent out that evening. It’s awesome because they gathered their own data (admittedly a small sample, but big enough to tell the story). But it’s even more awesome because they’re telling the story of an emerging phenomenon: big data used in political campaigns to target messages to specific individuals. It is just a taste of things to come.
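For readers unfamiliar with the term, a diff lists the lines that differ between two versions of a text. The sketch below uses Python's standard `difflib` module to compare two email variants. It is not ProPublica's code, and the email wording and the `diff_emails` helper are invented for illustration.

```python
import difflib

# Hypothetical email variants; the wording is invented, not taken from the campaign.
EMAIL_A = """Friend,
Can you chip in $25 today?
Thanks for your support."""

EMAIL_B = """Friend,
Can you chip in $50 today?
Thanks for your support."""


def diff_emails(a, b):
    """Return the lines that differ between two emails as (marker, text) pairs."""
    changes = []
    for line in difflib.ndiff(a.splitlines(), b.splitlines()):
        # ndiff prefixes each line with "- ", "+ ", "  " or "? ".
        # Keep only lines unique to one side; skip shared and hint lines.
        if line.startswith(("- ", "+ ")):
            changes.append((line[0], line[2:]))
    return changes


if __name__ == "__main__":
    for marker, text in diff_emails(EMAIL_A, EMAIL_B):
        print(marker, text)
```

For a side-by-side view closer to a visual diff, `difflib.HtmlDiff` in the same module can render two sets of lines as an HTML table.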
Chartball
One of my favourite data journalism projects is Andrew Garcia Phillips’ work on Chartball (Figure 10). Andrew is a huge sports fan with a voracious appetite for data, a terrific eye for design and the capacity to write code. With Chartball he not only visualises the sweep of history but also details the successes and failures of individual players and teams. He provides context, he creates inviting graphics, and his work is deep and fun and interesting, and I don’t even care much for sports!