Humanising climate data for solutions reporting

Conversations with Data: #82

Welcome back to the Conversations with Data newsletter!

We hope you had a relaxing and restful summer and are ready to dive back in! We are excited to bring back our monthly live Discord chats. Our next live conversation will take place next Wednesday at 3 pm CEST with Code for Africa. Add the Discord chat to your calendar.

Now on to the podcast!

Climate data journalism offers a vital opportunity for organisations and people looking to spur action, find solutions and hold power to account. But what do data journalists need to be mindful of when covering this pressing issue?

To better understand this, we spoke with data journalist Clayton Aldern from Grist, a nonprofit, independent media organisation dedicated to telling stories of climate solutions and a just future. In this episode, he explains how climate investigations can rely on data analysis and machine learning for solutions reporting. We also hear about his go-to coding languages and digital tools for wrangling climate data.

You can listen to the entire podcast on Spotify, SoundCloud, Apple Podcasts or Google Podcasts. Alternatively, read the edited Q&A with Clayton Aldern below.

What we asked

Tell us about Grist and its approach to solutions journalism when covering climate justice reporting.

There is often this tension in media between solutions journalism and perhaps the more traditional vein of journalism known as accountability reporting. I don't think there needs to be that tension there. At Grist, we're interested in illustrating what a better future looks like. By extension, we're interested in people reaching out toward that future, realising that it's possibly within grasp if one indeed puts a little effort in. So, I think where accountability reporting or investigative journalism and solutions journalism meet is indeed at that point of desire.

Talk to us about the data-led investigative piece you worked on covering abandoned wells that is now a finalist in the ONA Awards.

This piece was a collaboration with my Grist colleague Naveena Sadasivam and The Texas Observer's Christopher Collins. During the middle of the investigation, I joined them to provide some data support, which became an excellent analytical and visual aid to the piece. But the two of them were responsible for the deep investigative dives here looking at abandoned oil and gas wells in the Permian Basin region of the United States.

As oil and gas companies weathered volatile oil prices last year, many halted production. More than 100,000 oil and gas wells in Texas and New Mexico are idle. Of these, there are about 7,000 "orphaned" wells that the states are now responsible for cleaning up. But statistical modelling by Grist and The Texas Observer suggests another 13,000 wells are likely to be abandoned in the coming years. A conservative estimate of the cleanup cost? Almost $1 billion. And that doesn't consider the environmental fallout.

Clayton Aldern is a climate data reporter at Grist. He is also a research affiliate of the University of Washington's Center for Studies in Demography and Ecology, and with the housing scholar Gregg Colburn, he is the author of the forthcoming book Homelessness is a Housing Problem.

How did you use machine learning in this story?

This is a perfect problem for machine learning classification. There are two classes of non-producing wells: inactive wells, which are simply not producing at a given time, and orphaned wells proper, which have been inactive for a number of years or even up to a decade. There is a clear distinction between these two as categorised by the states. We looked at the data to see if we could build a model that distinguished between orphaned and inactive wells.

It's important to remember there are a whole lot of characteristics that describe wells. There's the depth, the type and the region. You can imagine building a logistic regression model that seeks to distinguish between the two classes, as categorised by the state, and then asking how well it does on a chunk of the data it's never seen before. We found that there's this subpopulation of inactive wells that are statistically indistinguishable from orphaned wells.
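For readers who want to see the shape of that approach, here is a minimal sketch in Python. It is not the Grist and Texas Observer model: the wells are synthetic, the two features (years idle and well age) are hypothetical stand-ins for the characteristics Clayton mentions, and the 0.5 cut-off is an arbitrary assumption. It fits a logistic regression on one portion of the data, scores it on a held-out portion, and then lists the inactive wells that the model scores as looking like orphaned ones.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_wells(n, seed=0):
    """Made-up wells as (features, label); label 1 = orphaned, 0 = inactive."""
    rng = random.Random(seed)
    wells = []
    for _ in range(n):
        orphaned = 1 if rng.random() < 0.3 else 0
        # Hypothetical standardised features: orphaned wells skew longer idle and older
        years_idle = rng.gauss(1.5 if orphaned else -0.5, 1.0)
        well_age = rng.gauss(1.0 if orphaned else -0.3, 1.0)
        wells.append(([years_idle, well_age], orphaned))
    return wells


def predict_proba(weights, bias, features):
    # Estimated probability that a well belongs to the orphaned class
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def fit_logistic(data, epochs=300, lr=0.5):
    """Fit logistic regression by batch gradient descent on the log loss."""
    n_features = len(data[0][0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            for j, x in enumerate(features):
                grad_w[j] += error * x
            grad_b += error
        weights = [w - lr * g / len(data) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(data)
    return weights, bias


def accuracy(weights, bias, data):
    hits = sum((predict_proba(weights, bias, f) >= 0.5) == bool(y) for f, y in data)
    return hits / len(data)


wells = make_synthetic_wells(1000, seed=42)
# The wells are generated in random order, so a simple slice gives a random split
train, test = wells[:800], wells[800:]

weights, bias = fit_logistic(train)
test_accuracy = accuracy(weights, bias, test)

# Held-out wells labelled inactive that the model nonetheless scores as orphaned
at_risk = [f for f, y in test if y == 0 and predict_proba(weights, bias, f) >= 0.5]
inactive_total = sum(1 for _, y in test if y == 0)

print(f"held-out accuracy: {test_accuracy:.2f}")
print(f"inactive wells that look orphaned: {len(at_risk)} of {inactive_total}")
```

The wells collected in `at_risk` are the analogue of the subpopulation described above: still officially inactive, but not separable from orphaned wells on the features the model sees. With real state data, the feature list, the split and the threshold would all need to be chosen and justified for that dataset.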

What has been the impact of this investigation?

Well abandonment and well plugging are hot-button issues in the United States right now. If you think about the infrastructure negotiations, for example, it's come up again and again. Why is that? There are many wells in the country, and there aren't great rules on the books. Some of these plugging-cost and bonding formulas were written decades ago. Since then, drilling technologies have really improved, allowing for more of this activity. This isn't just an issue for Texas and New Mexico. It's also an issue for California, Colorado, and the Dakotas, where there's a lot of shale drilling.

As soon as we published this investigation, many advocates, environmentalists and environmental lawyers and indeed oil and gas industry folks came out of the woodwork and said, "Hey, we're working on this issue in this other state. Can you apply your model to that state? We would like to know what the coming wave of abandonment looks like in our environment." This is an open source model that anybody can use. It takes a little bit of technical know-how. And obviously, you need a data set. But this analysis is replicable in other contexts.

Tell us about your coding skills. What tools do you rely on regularly?

With my background in computational neuroscience, I learned Matlab. I don't use any Matlab these days, but it was my entree to programming. I'd say I spend about a third of my day in R for most of my scripting. I use Python for web scraping. Then there's room for some D3. Those are the Three Musketeers of data journalism: Python, R and D3. At Grist, we run our website on WordPress. Unfortunately, there are no great D3 and WordPress integrations. That's why we use another JavaScript library called amCharts, which makes responsive, interactive design easy. For most of my exploratory work, I'm a big fan of Tableau.

Finally, what advice do you have for journalists looking to specialise in climate data reporting?

My advice for folks looking at the climate crisis is not to forget where you've come from. If you come from a traditional journalism background where you wrote about people, you ought to be doing the same thing when you're writing about the climate crisis. I think there's this tendency to imagine data journalism as a kind of island that sits off the coast of journalism. Some see it as this special thing that the programmers do, and then you end up with these nice, immersive visual features. That's all well and good, but you still want to use your traditional investigative reporting toolkit.

You should still be doing archival research, filing public records requests, and most importantly, you should still be talking to people. Remember to examine how the dataset impacts communities. The climate crisis is a story of environmental injustice and climate injustice. If you are missing those angles, you are either missing the lead or burying it.

Join our live Discord chat on 29 September at 3 pm CEST.

Our next conversation will be a live Discord chat on Wednesday, 29 September at 3 pm CEST with Tricia Govindasamy and Jacopo Ottaviani from Code for Africa. Tricia is an award-winning South African scientist specialising in Geographic Information Systems and data science. Jacopo is a computer scientist and senior strategist who works as Code for Africa's Chief Data Officer. We will discuss what the data journalism and media landscape looks like in Africa. We will also hear about some regional data projects related to climate change and gender data. Add it to your calendar here.

Latest from

Our latest long read article features case studies examining how a team of researchers, journalists and students used digital recipes to delve into COVID-19 conspiracy content sold on Amazon, the world's largest online retailer. Written by Jonathan Gray, Marc Tuters, Liliana Bounegru and Thais Lobo, this walk-through piece serves as a useful guide for researchers and journalists seeking to replicate a similar collaborative investigation using digital methods. Read the full article here.

News Impact Summit is happening tomorrow! Register today!

Want to make your journalism more diverse and inclusive? Join us tomorrow, Thursday, 23 September, for the European Journalism Centre's News Impact Summit. Learn from industry experts about innovative approaches to bringing about equality in the newsroom. Register here.

As always, don't forget to let us know who you would like us to feature in our future editions. You can also read all of our past editions here or subscribe to the newsletter here.


Tara from the EJC data team,

bringing you, supported by Google News Initiative.

PS. Are you interested in supporting this newsletter or podcast? Get in touch to discuss sponsorship opportunities.
