In 2019, I found myself sitting in a dim, rarely frequented bar in Belgrade, Serbia, listening to a discussion I couldn’t really follow. The debate took on a religious fervor I hadn’t expected for the evening — the topic was R vs. Python.
Both are programming languages used to analyze data. On that night in Belgrade, several up-and-coming data journalists from DW Akademie’s newly founded “Dataship – The Data Journalism Fellowship” were sharing their opinions on which programming language works best for data-driven journalism.
What is data journalism?
Data journalism is often defined as the journalistic processing of structured or unstructured data that is relevant to the public. Data journalists either work with an existing dataset or gather their own data. Mysterious whistleblowers are rare, which means journalists often spend day after day scraping together data themselves, using Python or R, for example, and combining spreadsheets and classical investigative work with open data.
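To make the scraping step concrete, here is a minimal sketch in Python using only the standard library. The HTML table and its values are invented for illustration; real projects typically deal with far messier pages.

```python
# A minimal sketch of table scraping with Python's standard library.
from html.parser import HTMLParser

class TableScraper(HTMLParser):
    """Collects the text of each <td>/<th> cell, row by row."""
    def __init__(self):
        super().__init__()
        self.rows, self._row, self._in_cell = [], [], False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th"):
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "tr" and self._row:
            self.rows.append(self._row)
        elif tag in ("td", "th"):
            self._in_cell = False

    def handle_data(self, data):
        if self._in_cell:
            self._row.append(data.strip())

# Hypothetical snippet standing in for a downloaded page.
html = """
<table>
  <tr><th>City</th><th>PM2.5</th></tr>
  <tr><td>Nairobi</td><td>17</td></tr>
  <tr><td>Belgrade</td><td>25</td></tr>
</table>
"""
scraper = TableScraper()
scraper.feed(html)
print(scraper.rows)
# [['City', 'PM2.5'], ['Nairobi', '17'], ['Belgrade', '25']]
```

From rows like these, the data can be written to a spreadsheet and merged with open data sources.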
Data journalism is not only about evaluating data, but also about processing it. Many data journalists I spoke with emphasized the difference between simply adding another infographic to a story and genuinely visualizing the data. Many prefer the term data-driven journalism, since the story comes from the data, which is first statistically evaluated and then graphically presented for a wider audience. For Kevin Odanga Madung, a Dataship fellow and co-founder of data analytics organization Odipo Dev, it’s the element of curiosity that makes data journalism especially interesting: “The beauty [is] that you follow the data and find a story.”
Data journalism as teamwork
Former US President Barack Obama’s 2009 Open Government Directive and WikiLeaks’ publication of Iraq and Afghan War documents in 2010 played a large role in cementing data journalism in newsrooms. In Europe, the British newspaper the Guardian was the pioneer, launching its Datablog in 2009. In the US, several newsrooms, beginning with the New York Times, have established data teams or employ an increasing number of full-time data scientists.
Many newsrooms used to have just one or two journalists carrying the data load, but now the research and work are being shared around. Data journalism has finally become teamwork. In investigative reporting, there are many steps which often require varying expertise. Collaborations can stretch across numerous disciplines, bringing together journalists and open-data activists, graphic designers, data scientists or programmers to support each other in tackling this growing, ever-evolving realm of modern technology.
More data, more access
Data journalism is innately rooted in technology and morphs along with tech developments — it requires competence in both technology and journalism. Open data and open source software have had a huge impact on modern journalism. There is more data and increasing access, which is critical to successful data journalism.
Technological advancements in programming and machine learning have not made journalists obsolete, but rather more crucial than ever, since their role demands an understanding of data and the ability to critically assess and analyze it. In today’s newsrooms, artificial intelligence (AI) helps process documents by clustering data on similar topics, classifying it into newsworthy groups, and cleaning and standardizing data sets. Algorithms can sort through documents, identify patterns and extract data, which can help predict which documents might be newsworthy.
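The clustering idea can be illustrated with a toy sketch (this is not any newsroom’s actual pipeline): represent each document as a word-count vector and group the documents whose vocabularies overlap most, measured by cosine similarity.

```python
# Toy illustration of topic clustering: pair the two documents
# whose word-count vectors are most similar (cosine similarity).
from collections import Counter
import math

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

# Invented headlines for illustration.
docs = [
    "drought hits california farms",
    "california farms face water drought",
    "election results announced in kenya",
]
vectors = [Counter(d.split()) for d in docs]

# Find the most similar pair -- a crude stand-in for
# grouping documents under one topic.
best = max(
    ((i, j) for i in range(len(docs)) for j in range(i + 1, len(docs))),
    key=lambda p: cosine(vectors[p[0]], vectors[p[1]]),
)
print(best)  # (0, 1): the two drought stories cluster together
```

Production systems use far richer representations, but the underlying principle of measuring textual similarity is the same.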
Using data to reveal untold stories
Data journalism opens the door to a new world of fascinating technological possibilities.
Dataship fellow Claudia Báez, co-founder of Colombian investigative website Cuestión Pública, said that data journalism is “a really powerful tool for doing great journalism together with classical investigative journalism and new techniques in the public interest.”
Some of the latest trends in data journalism include sensor journalism and the processing of geographical data. As its name indicates, sensor journalism focuses on processing data collected by sensors. While it might not sound that spectacular, sensors actually offer a great opportunity: they are relatively cheap and journalists can deploy them exactly where it makes sense. This makes the journalist more independent of measurements published by local authorities or industries. Another asset of sensor journalism is its benefit for local journalism. The data collected can contribute to other data sets, be used as evidence or feed into investigative reporting. The data fellowship team from Kenya collected air pollution data at main crossroads, residential areas and playgrounds in the Kenyan capital Nairobi. You can check out their project here.
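A typical first step with sensor data is aggregation, for example averaging readings per measurement site. The sketch below uses invented values and column names (the fellowship team’s actual tooling and data format are not described here):

```python
# Illustrative sketch: average air-quality readings per location
# from CSV-formatted sensor logs. Values and headers are invented.
import csv
import io
from collections import defaultdict

raw = """location,pm25
Crossroads A,41
Crossroads A,37
Playground B,12
Playground B,14
"""

readings = defaultdict(list)
for row in csv.DictReader(io.StringIO(raw)):
    readings[row["location"]].append(float(row["pm25"]))

averages = {loc: sum(vals) / len(vals) for loc, vals in readings.items()}
print(averages)
# {'Crossroads A': 39.0, 'Playground B': 13.0}
```

Even this simple aggregation already supports comparisons between sites, which is often the core of a local sensor story.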
Doors opening but barriers remain
We have seen extremely interesting developments in geographical data, which have enabled satellite images to become the basis for reporting on environmental crises such as droughts and forest fires. Satellite images helped to expose the luxury compounds of former Botswana President Ian Khama, which he was building with public money. Another piece from the Center for Investigative Reporting shed light on the “Wet Prince of Bel-Air,” revealing that a certain Hollywood producer was still watering his acres of land during a drought in California.
You can examine isolated or remote communities without ever setting foot there or disrupting residents’ lives. Many journalists use open source satellite data, such as NASA’s, for stories on land fallowing, groundwater aquifers and floods. Such data can also visualize human migration flows and refugee routes, as Reuters did on a large scale for the Rohingya refugees, who were forced to flee from Myanmar to neighboring Bangladesh following a military crackdown that began in 2016. It can even show gentrification patterns in cities and the effects of deregulation on land.
While data journalism provides opportunities that would otherwise not be possible, significant barriers remain. Projects are often time-consuming, and audience interest can be disproportionate to the time and effort invested, especially if journalists present data in ways that are hard for the general public to understand. One of the main criticisms of data journalism is that it is elitist, since funding and access to new technologies are an essential part of the practice.
DW Akademie’s Dataship
Restricted access to data and barriers to its usability limit data journalism’s expansion throughout the world, especially in the Global South. This is why, in 2018, DW Akademie — together with Deutsche Gesellschaft für Internationale Zusammenarbeit (GIZ) — launched “Dataship – The Data Journalism Fellowship.” The 12-month fellowship program was aimed at fostering data journalism throughout the world, with a focus on increasing local data journalists’ access to international professional networks.
True to its mission of supporting the development of quality journalism worldwide, DW Akademie is committed to contributing to the advancement of data journalism, especially in countries where the practice has yet to become commonplace. The program therefore aimed to support 15 carefully selected young data journalists from non-OECD countries with outstanding projects.
Which brings us back to the rarely frequented bar in Belgrade, where the Dataship fellows started their journey following data and finding their stories.
Bahia Albrecht is a co–project manager at DW Akademie. She coordinated the Dataship Fellowship Program from 2018 to 2020.
You can read about some of the Dataship projects in the links below.