ProPublica is a nonprofit newsroom that investigates abuses of power.
Today we’re launching a guidebook on how newsrooms can collaborate around large datasets.
Since our founding 11 years ago, ProPublica has made collaboration one of the central aspects of its journalism. We partner with local and national outlets across the country in many different ways, including reporting stories together, sharing data and republishing our work. That’s because we understand that in working together, we can do more powerful journalism, reach wider audiences and have more impact.
In the last several years, we’ve taken on enormous collaborations, working with hundreds of journalists at a time. It started in 2016 with Electionland, a project to monitor voting problems in real time during the presidential election. That project brought together more than 1,000 journalists and students across the country. Then we launched Documenting Hate in 2017, a collaborative investigation that included more than 170 newsrooms reporting on hate crimes and bias incidents. We ran Electionland again in 2018, this time with around 120 newsrooms.
In order to make each of these projects work, we developed software that allows hundreds of people to access and work with a shared pool of data. That information included datasets acquired via reporting as well as story tips sent to us by thousands of readers across the country. We’ve also developed hard-won expertise in how to manage these types of large-scale projects.
Thanks to a grant from the Google News Initiative, we’ve created the Collaborative Data Journalism Guide, a guidebook to collaborative data reporting, which we’re launching today. We’re also developing an open-source version of our software, which will be ready this fall (sign up here for updates).
Our guidebook covers:
- Types of newsroom collaborations and how to start them
- How a collaboration around crowdsourced data works
- Questions to consider before starting a crowdsourced collaboration
- Ways to collaborate around a shared dataset
- How to set up and manage workflows in data collaborations
The guidebook represents the lessons we’ve learned over the years, but we know it isn’t the only way to do things, so we made the guidebook itself collaborative: We’ve made it easy for others to send us input and additions. Anybody with a GitHub account can send us ideas for changes or even add their own findings and experiences (and if you don’t have a GitHub account, you can do the same by contacting me via email).
We hope our guide will inspire journalists to try out collaborations, even if it’s with just one or two partners.