Name: DataJournalism.com

Social data reporting: AMA with Lam Thuy Vo

Conversations with Data: #11

What do you remember most about 2012? Some of you might think back to the London Olympics, or perhaps Gangnam Style… but we’ll always remember it as the year that the European Journalism Centre launched the Data Journalism Handbook.

Flash forward six years and we’re producing a second edition for beta release. But we’re a wee bit too excited to wait until then. In anticipation, we’ve already published an exclusive preview chapter on algorithmic accountability.

And now, this edition of Conversations with Data brings you an 'ask me anything' with the author of the Handbook’s social media chapter, BuzzFeed’s Lam Thuy Vo. She answers your questions on getting started with user-generated content, privacy, combating fake news, and more.

What you asked

What makes social reporting different from other types of data journalism?

Lam: "There’s a tendency among some journalists to examine the social web as a one-to-one representation of society itself. It’s a natural inclination: the way we see information come in on newsfeeds, timelines and comment strings makes it seem continuous and like a representation of the people who compose our actual immediate environment.

But social media data is odd: while there’s a rigidity to its format that’s akin to that of other data sets (date times, content categorisation like text, video or photos, etc.), it comes with all kinds of irregularities related to who posted the content."

Can you give us an example?

"Take the data from a Facebook group, for instance. Even if a group has thousands of followers, only a fraction of them may actually actively react or comment on content, let alone post. The posts may come in spurts or fairly frequently without much of a regularity to them -- and this data may change at any moment as more people comment or react to the posts. All this makes for highly unwieldy data that needs to be interpreted with care and caution."
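As a rough illustration of the point about participation, the sketch below computes what share of a group's members interacted at all, and how posts are spread across days. The field names, the member count and the sample records are all made up for illustration; a real export from a platform will look different.

```python
from collections import Counter
from datetime import date

# Hypothetical export: one record per interaction in a group.
interactions = [
    {"user": "a", "kind": "post", "day": date(2018, 9, 1)},
    {"user": "a", "kind": "comment", "day": date(2018, 9, 1)},
    {"user": "b", "kind": "reaction", "day": date(2018, 9, 1)},
    {"user": "a", "kind": "post", "day": date(2018, 9, 7)},
    {"user": "c", "kind": "reaction", "day": date(2018, 9, 7)},
]
member_count = 1000  # assumed group size, not a real figure


def active_share(records, members):
    """Fraction of members who interacted at least once."""
    return len({r["user"] for r in records}) / members


def posts_per_day(records):
    """Count posts (not comments or reactions) per calendar day."""
    return Counter(r["day"] for r in records if r["kind"] == "post")


share = active_share(interactions, member_count)
daily = posts_per_day(interactions)
print(f"{share:.1%} of members were active")
print(dict(daily))
```

Because the underlying data can change as people keep commenting and reacting, numbers like these are best read as a snapshot taken at collection time.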

In addition to this risk of misinterpretation, social data is inherently personal, bringing with it privacy and ethical challenges. How can journalists address these?

"First, there’s an approach I often refer to as a quantified selfie -- an examination of a person’s data with their permission and also with their help of interpretation. Since social media data is by nature highly subjective, doing stories about an individual is likely best done with their help to interpret these stories."

Image: From Lam’s quantified selfie project, which used most played songs to showcase the emotional impact of moving to New York.

"When working with content created by everyday people, journalists should also be very conscientious about people’s privacy and what the amplification of social media posts could mean to them. BuzzFeed News’ ethics guide contains very helpful language on the subject: 'We should be attentive to the intended audience for a social media post, and whether vastly increasing that audience reveals an important story -- or just shames or embarrasses a random person'."

What are some other challenges?

"What’s particularly difficult is that social data can be very ephemeral. For one story, for instance, Craig Silverman, Jane Lytvynenko, Jeremy Singer-Vine and I looked into how popular hyperpartisan news organisations were as compared to their more mainstream counterparts. The hardest part was that during the reporting, data parsing (more than four million Facebook posts) and analysis, Facebook pages included in the analysis kept being deleted. Not only did we have to deal with large amounts of data that would strain my computer, this data set was also a constantly moving target."
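One way to cope with a data set that keeps changing is to record, on each collection run, which pages were still reachable, and then compare runs. The sketch below is a minimal illustration of that idea and not the method used for the story; the page identifiers and dates are invented.

```python
from datetime import date

# Hypothetical snapshots: collection date -> set of page IDs found that day.
snapshots = {
    date(2018, 3, 1): {"page_a", "page_b", "page_c"},
    date(2018, 3, 8): {"page_a", "page_c"},
    date(2018, 3, 15): {"page_a", "page_c", "page_d"},
}


def diff_snapshots(snaps):
    """Yield (date, disappeared, appeared) for each consecutive pair of snapshots."""
    days = sorted(snaps)
    for prev, curr in zip(days, days[1:]):
        yield curr, snaps[prev] - snaps[curr], snaps[curr] - snaps[prev]


changes = list(diff_snapshots(snapshots))
for day, gone, new in changes:
    print(day, "disappeared:", sorted(gone), "appeared:", sorted(new))
```

Keeping dated snapshots like this makes it possible to state exactly which pages an analysis covered on a given day, even if some of them are later deleted.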

Following on from that example, how do you think journalists can use social media data to combat fake news?

"One very important part of combating the spread of misinformation and false, made-up stories is to report on the subject and find ways to hold companies accountable while also educating news consumers."

"...there’s a huge need for people to learn about the fallacies of how we see the world through social media. Explanatory reporting in that realm can be hugely impactful. From the distortion of information through filter bubbles to issues surrounding troll attacks, through examples of other individuals we can hopefully raise a level of scepticism in people to prevent them from sharing information in reactionary and emotional ways, and to encourage a more critical reading of the information they encounter online."

Image: A screenshot from Lam’s trolling visualisation, illustrating to audiences what a Twitter attack feels like.

Finally, do you have any advice for journalists who are new to user-generated content?

"While each story is unique, I’d say there is one good guideline I’ve picked up: be specific in your stories -- people can get lost in data stories that are too large in scope. For example, it’s more definitive, and hard enough, to do a story about one particular group of politicians who are Facebook users and are spreading hate speech than to try and prove the spread of hate speech among an entire part of society."

Our next conversation

While social media data can usually be obtained via an API, there are often times when you’ll have to scrape it yourself. For our next conversation, we want to feature your tips for these situations.

Until next time,

Madolyn from the EJC Data team
