Investigating crime and corruption with data

Conversations with Data: #69



Welcome to our latest Conversations with Data newsletter.

In this week's Conversations with Data podcast, we caught up with investigative journalist Pavla Holcová. She is a regional editor for Central Europe at the Organized Crime and Corruption Reporting Project (OCCRP) and the founder of an independent Czech outlet. She talks to us about using data to investigate the February 2018 murder of her colleague, Slovak investigative journalist Ján Kuciak, and his fiancee Martina Kušnírová.

You can listen to the entire podcast on Spotify, SoundCloud, Apple Podcasts or Google Podcasts. Alternatively, read the edited Q&A with Pavla Holcová below.

What we asked

What led you to become an investigative journalist?

I used to work in Cuba with human rights defenders and independent journalists. We conducted training there to help journalists understand the difference between fact-based reporting and opinion-based reporting. One day we were detained by the Cuban police and ended up in the same jail cell as investigative journalist Paul Radu, the executive director of OCCRP. Because we knew we couldn't talk about the job we were doing in Cuba, we started to talk about his job. He explained how he was an investigative journalist doing cross-border project journalism. And that was the moment, in a Cuban jail, when I decided that once I quit my job, I would like to do this. That was in 2010, and I founded our publication in 2013. A year later it joined OCCRP, where I am now also a regional editor for Central Europe.


What happened to investigative journalist Ján Kuciak and his fiancee in February 2018?

Ján Kuciak and his fiancee were shot dead on the evening of the 21st of February 2018. It happened a couple of weeks before their wedding. The police started to investigate and discovered that he was shot because of the work he had done. As an investigative journalist, he had published a couple of stories that uncovered the corrupt nature of the system in Slovakia, particularly the judiciary. I'm talking about tax fraud and about a group of people who were basically untouchable. They were never prosecuted. Ján exposed how that was possible and how the system of corruption in Slovakia worked.

How did you and Ján work together?

A year before he was shot, we decided to work on a story together that exposed the ties of the Italian mafia to the then prime minister of Slovakia, Robert Fico. We were about a week away from publishing the story when he was shot. So the first lead was that the Italian mafia ordered his killing. But later it turned out that the lead suspect, and the person with the highest motivation to silence Ján and his investigative work, was Marian Kočner. He was a well-known Slovak businessman who felt threatened by Ján's reporting. The court first ruled that a lack of evidence prevented him from being sent to prison. But there's an appeal and the case is still ongoing. He is currently serving a 19-year prison sentence for another conviction -- the forging of promissory notes.

What did Ján's death reveal about Slovakia?

This case revealed how the system truly worked and how corrupt the system was. We are now able to prove it and expose it because we got access to 70 terabytes of data that we now call the Kočner Library. It's the evidence that was collected by Slovak police and it includes the files and all the annexes regarding the murder of Ján Kuciak and his fiancee Martina Kušnírová.

It was a very special situation in Slovakia. This was because in 2020 elections were happening in the country. It meant the political party that created the corrupt system could potentially not win reelection. We obtained those 70 terabytes of data in late November 2019, and we knew at the time that we only had about four months to work with the data and to start this project.

How did you obtain that 70 terabytes of data?

We did not ask for permission from the police. We got it from a source. We got it legally and it was not a leak. We know who the source is, although we won't reveal the source's identity.


What stories came out of this data?

The stories from the data set really changed Slovakia. The impact was huge. First, we described to the public how Ján and Martina were murdered. Then we exposed the world of Marian Kočner -- how he used honey traps and blackmail to escape justice. He was bribing the general attorney and had good connections to the police force. They were giving him information and neglecting his cases. He was even telling corrupt judges how to rule in his cases. Those people are now in jail because of the information from this data set and the public pressure. A big revolution is happening in Slovakia. It's not so visible now because of the coronavirus situation, but it is changing completely.

What kind of journalist was Ján Kuciak?

He was one of the most modest journalists I've ever worked with. We were not just colleagues, we were also good friends. So for me, his death was a shock. But in Slovakia, before he was murdered, he was not well known. He was only 27 years old and he was much more into analysing the data than relying on secret sources and then publishing what they told him. He was much more into using the public registries, collecting the data, putting them together, analysing them and then connecting the dots. He was also helping fellow journalists. If someone from another news outlet had a problem understanding a big financial scheme, he was the one who helped them piece it together.


A memorial for Ján Kuciak and Martina Kušnírová next to the statue of Saint John Paul II in Prešov. Image by Petinoh, own work, CC BY-SA 4.0.

How did privacy concerns impact the stories you told from the data?

That was actually my role -- to make sure that we didn't expose random bystanders in the conversations or the victims of Marian Kočner. One of the criteria for being selected to the team was that the journalists had to meet the highest ethical standards. They needed to take into account that we might expose people who were not part of the system and that they shouldn't do that. But we also needed to coordinate with the authorities to ensure that we didn't jeopardise the police investigation. So before we started to work on a story, it was my task to coordinate with the lawyers and with the police to check that we wouldn't share information that shouldn't be revealed.

What about the safety of the journalists working on this?

We definitely needed to consider personal safety. One of the team members received a bullet in his mailbox. He got police protection and everyone was alerted to what was happening. But in the end, we didn't find who sent him the bullet. The team did manage to keep their sanity and work-life balance in spite of this.

We also established a safe room in Bratislava, the only place where journalists could access the data. We provided a set of computers that were stripped of all functionalities and could only work if the librarian started the system, which connected to the data. Anyone who wanted to work in the library needed to come to that location, download the data onto an encrypted USB stick and bring it home to work on. This meant that if anyone stole the stick, they would not be able to access the data.

How does OCCRP hope to keep the memory of Ján Kuciak alive?

We shouldn't just remember him once a year. We should always remember that his murder changed Central Europe in an unprecedented way. The impact of his murder was huge. We want to finish all of the stories he started and we are still not done with the data. Following in the steps of journalists who are threatened or who have been murdered, and finishing their stories, will show that killing someone doesn’t mean killing their stories.

Latest from

The News Impact Summit has released a training video introducing the R programming language, presented by data journalism trainer Jonathan Stoneman. Master the basics, such as how to set up the R console and import data. You'll also learn how to pivot and filter data and use the R package ggplot2. Watch the full video here.

Wikidata can be a useful resource for journalists digging for data on a deadline. Monika Sengul-Jones explains the joys and perils of using the searchable data trove for your next story. Read the long read here.


Our next conversation

Our next Conversations with Data podcast will feature investigative data journalist Matthew Kauffman from Solutions Journalism Network. He leads a data reporting project helping newsrooms pursue solutions reporting by identifying positive deviants -- outliers in data that might point to places that are responding to social problems.

Before joining the Solutions Journalism Network, Matthew spent 32 years writing stories for the oldest newspaper in the United States, The Hartford Courant. He is also a two-time Pulitzer Prize finalist and teaches data journalism. He will speak with us about the power of combining data with solutions journalism to tell compelling stories that engage audiences.

As always, don’t forget to let us know what you’d like us to feature in our future editions. You can also read all of our past editions here and subscribe to this newsletter here.


Tara from the EJC Data team,

bringing you, supported by Google News Initiative.

P.S. Are you interested in supporting this newsletter as well? Get in touch to discuss sponsorship opportunities.
