Television As Data: Mapping 6 Years of American Television News

What happens when we think of all information as data? Imagine taking all of the closed captioning of American television news over the last 6 years and running it through incredibly sophisticated natural language understanding algorithms that sift through it for any mention of a location on Earth and then put all of those mentions on a map. What would that map look like, and what might it tell us about what we see when we turn on the television at night?

Facebook raised an international furor last month when reports emerged alleging that it may have underrepresented conservative news in its Trending Topics module. Yet, the story that the media missed was not one of conservative versus liberal bias, but rather that Facebook, like most American news organizations, has a massive Western bias in the stories it links to.

The map below, which accompanied my in-depth look last month at Facebook’s global news sourcing operation, shows just how geographically skewed Facebook’s Trending Topics is. Each country is colored from white (zero) to dark red (high density) by the percentage of news outlets monitored by Facebook that are based in that country. Immediately clear is that Africa and the Middle East, Central and Eastern Europe and portions of Latin America are completely absent, while the rest of the world outside a handful of nations is starkly underrepresented. In short, the only news about Africa that would have appeared in Facebook’s Trending Topics feature was that which a news outlet elsewhere in the world found of interest about the continent. Indeed, the New York Times in 2011 quoted Facebook’s founder as having once said “a squirrel dying in your front yard may be more relevant to your interests right now than people dying in Africa.”

Yet, Facebook is far from alone in its Western bias. One of the most important facts of life to recognize when turning to the news is that no single media outlet perfectly covers the entire planet. Rather, media outlets are themselves reflections of the distinct interests of their respective readership. Outlets will therefore naturally cover events in their own backyards far more intensely than those occurring in a faraway land with few cultural or linguistic ties.

Commentators today talk about a “divided media” and how society is fragmenting as media outlets increasingly narrow their focus, arguing that this is somehow a novel feature of an Internet-driven press. The Associated Press wrote earlier this week that “In a simpler time, [Americans] might have gathered at a common television hearth to watch Walter Cronkite deliver the evening news. But the growth in partisan media over the past two decades has enabled Americans to retreat into tribes of like-minded people who get news filtered through particular world views. Fox News Channel and Talking Points Memo thrive, with audiences that rarely intersect. What’s big news in one world is ignored in another. Conspiracy theories sprout, anger abounds and the truth becomes ever more elusive.”

Yet, such sweeping statements presume that there was once an unbiased and evenhanded media that gave us a truthful and uniform look at everything happening around the world, and that when Americans gathered around the television to watch Walter Cronkite recite the major events of the day, those events covered every country, not just the United States and its cultural neighbors. As the map above shows, this is simply not the case today, and looking back over the last 200 years, it was never the case.

This raises the question of what television looks like today. When a typical American turns on their television at night to watch the news, what parts of the world are they hearing about?

The Internet Archive’s Television News Archive has been archiving major national and select local American television news programming over the past six years. The raw closed captioning of all of these shows is processed each morning on the Archive’s servers using its Virtual Reading Room to create a codified metadata annotation of what’s trending on American television news. Part of this processing includes running incredibly sophisticated algorithms that are able to recognize mentions of locations as obscure as remote rural hilltops across the world, disambiguate which hilltop a mention refers to, convert that textual mention to a mappable geographic coordinate and output a final geographic representation.

All of this codified metadata is freely available both as CSV files and as a public dataset in Google’s BigQuery database. This means that a single line of SQL can process all 6 years of data and generate a final histogram in just 2.2 seconds, with the final map rendered by CartoDB in just a few seconds more. The US was removed from the map since the intensity of domestic coverage skewed the scale.
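To make the histogram step concrete, here is a minimal Python sketch of the same kind of aggregation: count how often each country is mentioned, drop the US so it does not dominate the color scale, and express each remaining country as a share of the total. This is an illustration only, not the SQL that was actually run; the function name and the use of two-letter country codes as input are assumptions made for the example.

```python
from collections import Counter


def country_shares(country_codes, exclude=("US",)):
    """Return each country's share of all mentions, ignoring excluded codes.

    `country_codes` is assumed to be an iterable with one country code per
    location mention (a hypothetical input format for this sketch).
    """
    counts = Counter(code for code in country_codes if code not in exclude)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Normalize to fractions so the map can be colored on a common scale.
    return {code: n / total for code, n in counts.items()}


# Toy input: the US mentions are dropped before normalizing.
shares = country_shares(["US", "RU", "RU", "CN", "US", "EG"])
```

Removing the US before normalizing, rather than after, is what keeps the remaining countries spread across the full white-to-red range.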

From the map below it is clear that American television news over the last 6 years has heavily emphasized Russia, China and the Middle East. Central Africa is poorly represented, as is Central Asia, while Eastern Europe, Latin America, Northern Africa and South Asia are also relatively sparsely discussed. The three countries in Africa with the greatest coverage are Egypt and Libya, both of which are currently experiencing heavy conflict, and South Africa, which has strong economic and cultural ties to the US. In Latin America, only Mexico and Brazil receive extensive coverage.

Yet, this map reflects all mentions of each country, including both mentions of particular cities and locations within the country and simple mentions of the country itself, such as “political unrest in Burundi.” In contrast, the map below displays every subnational location (ranging from a city to a hilltop to a major building) that was mentioned at least 5 times during the same 6-year period. Here the map looks very different, reflecting the enormous density of individual location mentions within the US and high densities in a handful of countries around the world, especially those experiencing active conflict and those with strong US cultural or trade ties. (Note that coordinates were rounded to three decimal places, and when you mouse over a dot it will display the longest location name at that location, so you may sometimes see a rarer location name appear as the mouseover hint.) Also, keep in mind that the very high levels of transcription error in modern television closed captioning mean these results are by no means perfect, but they at the very least reflect a macro-level view of what we see on TV.
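The rounding, thresholding and labeling rules described in the note above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the code behind the map: the input format (name, latitude, longitude tuples) and the function name are hypothetical, and the sketch assumes the 5-mention threshold is applied after coordinates are rounded, which the description does not specify.

```python
from collections import defaultdict


def map_points(mentions, min_count=5, precision=3):
    """Group location mentions into map dots keyed by rounded coordinates.

    `mentions` is assumed to be an iterable of (name, lat, lon) tuples,
    one per mention (a hypothetical format for this sketch).
    """
    points = defaultdict(lambda: {"count": 0, "label": ""})
    for name, lat, lon in mentions:
        # Nearby coordinates collapse into one dot after rounding.
        key = (round(lat, precision), round(lon, precision))
        point = points[key]
        point["count"] += 1
        # The mouseover label is the longest name seen at this dot,
        # which is why a rarer name can win over a common one.
        if len(name) > len(point["label"]):
            point["label"] = name
    # Keep only dots mentioned at least `min_count` times.
    return {k: v for k, v in points.items() if v["count"] >= min_count}


sample = (
    [("Aleppo", 36.2021, 37.1343)] * 3
    + [("Aleppo, Halab, Syria", 36.2019, 37.1344)] * 2
    + [("Lusaka", -15.4167, 28.2833)]
)
dots = map_points(sample)
```

In this toy input, the two spellings of Aleppo land on the same dot and together clear the threshold, while the single Lusaka mention is dropped. The dot is labeled with the longer, less frequent name, the same effect the note warns about.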

Overall, relatively few countries have large numbers of city- or landmark-level locations mentioned frequently on American television news over the past six years. In other words, while American news will frequently mention individual cities in Syria and Afghanistan, Angola and Zambia are most frequently mentioned at the country level. Television stations are simply estimating that the typical American is likely familiar with a few major cities in Syria from hearing about them on a daily basis, while most Americans would struggle to name a single major city in Zambia. This is by no means limited to television; even major print outlets like the New York Times do the same.