Data Shows Search Engines Perpetuate Structural Racism
The worst of the real world has made its way online.
By Charlton McIlwain, New York University
The racial inequalities afflicting Americans and our society today are in many ways a result of spatial segregation. White people and nonwhite people tend to live in different neighborhoods, go to different schools and have dramatically different economic opportunities based on their race. That physical manifestation of structural racism has been true historically in this country, and is still the case today.
Today’s internet is built on a similar spatial logic. People travel from website to website in search of content in the same way they travel from neighborhood to neighborhood looking for stuff to do and people to hang out with. Websites accrue and compound value as visitor traffic and site visibility increase.
But there is a crucial difference: Internet users have – more or less – complete freedom to travel where they choose. Websites can’t see the color of a user’s skin and police incoming traffic in the same way human beings can and do in geographical spaces. Therefore, it’s easy to imagine that the internet’s very structure – the social environments it produces and the new economies it births – might not be racially segregated the way the physical world is.
And yet the internet does in fact appear to be segregated along racial lines. My research demonstrates that websites focusing on racial issues are visited less often, and are less visible in search result rankings, than sites with different, or broader, focuses. This phenomenon is not based on anything that individual website producers do. Rather, it appears to be a product of how users themselves find and share information online, a process mediated mostly by search engines and, increasingly, social media platforms.
Exploring Online Racism
Words like “racist” and “racism” are loaded terms, primarily because people almost always associate them with individualized moral and cognitive failures. In recent years, though, the American public has become increasingly aware that racism can apply to cultures and societies at large.
My work looks for online analogues of this systemic racism, in which subtle biases permeate society and culture in ways that yield overwhelming advantages for whites, at the expense of nonwhites. Specifically, I am trying to determine whether the online environment, one completely constructed by humans, systematically produces advantages and disadvantages along racial lines – whether intentionally or inadvertently.
This is a difficult question to approach, but I begin by assuming that today’s technological systems have developed within a culture and society that is systemically and structurally racist. This makes it possible – even likely – that existing biases operate in similar ways online.
In addition, the historical geographical configurations that produced and perpetuated racial inequality provide a useful guide to investigating what systemic racism might look like online. The online landscape, and how people travel through it, are both important to understanding this picture.
Mapping the Online Landscape
First, I wanted to look at the map – how the web itself is structured by website producers. I analyzed what Alexa.com characterizes as the internet’s top 56 African-American sites using a software program called Voson. Voson crawls the web to identify what websites the source sites link to, and what sites link to the source sites.
Then I set out to determine the racial content, if any, of each of those thousands of websites, to begin measuring any inequalities that might exist in the online landscape.
Measuring spatial inequality offline typically involves measuring attributes of the people who live in a specific geographic location. For example, ZIP code 65035 designates a “white” neighborhood because 99.5 percent of the people residing there (Freeburg, Missouri) are white, according to U.S. census data. By contrast, ZIP code 60619, an area in Chicago, would be considered “nonwhite,” because only 0.7 percent of its residents are white.
To make this type of distinction between websites, I relied on website metatags – website producers’ descriptions of the site coded to be picked up by and reflected in search engine results. I designated as “racial” websites with metatags including terms such as “african american,” “racism,” “hispanic,” “model minority” and “afro.” Sites without those terms in their metatags I designated “nonracial.”
By using website metatags, I was able to distinguish between racial and nonracial sites (and the segregated traffic between them) based on whether the site’s producers themselves define the site’s identity in racial terms.
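As a rough illustration of the metatag labeling described above, the Python sketch below reads a page's description and keywords meta tags and labels the page "racial" if any listed term appears in them. It is not the author's actual code. The term list holds only the five examples quoted in the article (the full list is not given), the choice of which meta tags to read is an assumption, and `MetaTagCollector` and `classify_site` are hypothetical names.

```python
from html.parser import HTMLParser

# Only the five example terms quoted in the article; the full list is not published there.
RACIAL_TERMS = ("african american", "racism", "hispanic", "model minority", "afro")


class MetaTagCollector(HTMLParser):
    """Collects the content of description and keywords meta tags (an assumed choice)."""

    def __init__(self):
        super().__init__()
        self.contents = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        if name in ("description", "keywords"):
            self.contents.append((attr.get("content") or "").lower())


def classify_site(html: str) -> str:
    """Return "racial" if any term appears in the collected meta tags, else "nonracial"."""
    collector = MetaTagCollector()
    collector.feed(html)
    text = " ".join(collector.contents)
    return "racial" if any(term in text for term in RACIAL_TERMS) else "nonracial"
```

Note that this sketch uses plain substring matching, so a short term such as "afro" also matches inside longer words. A real analysis would need to decide whether that is acceptable.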
Understanding Online Navigation
Once I had labeled each site as racial or nonracial, I looked at the links website producers created between them. There were three possible types of links: between two racial sites, between two nonracial sites, or between a racial site and a nonracial one.
How many of each type of link the data contained would reveal whether bias influenced website producers’ decisions. If there were no bias, the number of links would be proportional to the number of each type of site in the data set. If there were bias, the numbers of links would be disproportionately high or low.
While I found slight differences between the ideal theoretical proportions and the actual number of links, they were not significant enough to indicate that any segregation in people’s internet behavior is caused by web producers. People who travel the web just clicking links on websites at random would not arrive at racial or nonracial sites substantially more or less often than the number of such sites would predict. But people don’t just follow links; they exercise their preferences when navigating the web.
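The no-bias baseline for links can be made concrete with a small Python sketch. It assumes one possible reading of "proportional": each end of a link lands on a racial site with probability equal to the share of racial sites, independently of the other end. Under that assumption the expected shares of the three link types follow directly, and a chi-square statistic measures how far observed counts stray from them. The article does not say which statistical test was used, and every number in the example is invented for illustration.

```python
def expected_link_shares(n_racial: int, n_nonracial: int) -> dict:
    """Expected share of each link type if link endpoints ignore site labels."""
    p = n_racial / (n_racial + n_nonracial)  # share of racial sites
    q = 1 - p                                # share of nonracial sites
    return {
        "racial-racial": p * p,
        "nonracial-nonracial": q * q,
        "mixed": 2 * p * q,  # racial -> nonracial plus nonracial -> racial
    }


def chi_square(observed: dict, shares: dict) -> float:
    """Chi-square statistic of observed link counts against the expected shares."""
    n = sum(observed.values())
    return sum((observed[k] - n * shares[k]) ** 2 / (n * shares[k]) for k in shares)


# Hypothetical numbers, not the study's data.
shares = expected_link_shares(n_racial=200, n_nonracial=800)
observed = {"racial-racial": 45, "nonracial-nonracial": 630, "mixed": 325}
print(shares)
print(chi_square(observed, shares))
```

With three link types and shares fixed in advance, the statistic has 2 degrees of freedom, so values below roughly 5.99 would not count as significant at the 5 percent level.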
For my second inquiry, I wanted to find out how people actually move between websites. I looked at the same 56 sites as for the previous analysis, but this time used Similarweb, a prominent web traffic metrics site. For each site, Similarweb produces data showing what websites people came from and what websites people navigated to next. I characterized those sites, too, as “racial” or “nonracial,” and identified three types of paths people took when clicking: between two racial sites, between two nonracial sites, or between a racial site and a nonracial one.
In this analysis, the number of clicks between different types of sites would reveal whether bias influenced users’ decisions. I found significantly more clicks between nonracial sites, and fewer clicks between racial and nonracial sites. That indicates that users are going out of their way to visit nonracial sites.
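The tallying step behind this comparison might look like the Python sketch below, which sorts click paths into the three types named above and sums the clicks in each. The input format (source site, destination site, click count) is an assumption, since the article does not describe Similarweb's data layout, and the site names and counts are hypothetical.

```python
from collections import Counter


def path_type(src_label: str, dst_label: str) -> str:
    """Name the path type from the labels of its two endpoints."""
    if src_label == dst_label:
        return f"{src_label}-{dst_label}"
    return "mixed"


def tally_paths(paths, labels) -> Counter:
    """Sum clicks per path type; `paths` holds (source, destination, clicks) tuples."""
    counts = Counter()
    for src, dst, clicks in paths:
        counts[path_type(labels[src], labels[dst])] += clicks
    return counts


# Hypothetical sites and click counts, for illustration only.
labels = {
    "site-a.example": "racial",
    "site-b.example": "racial",
    "site-c.example": "nonracial",
    "site-d.example": "nonracial",
}
paths = [
    ("site-a.example", "site-b.example", 120),
    ("site-c.example", "site-d.example", 900),
    ("site-a.example", "site-c.example", 60),
    ("site-d.example", "site-b.example", 40),
]
print(tally_paths(paths, labels))
```

The resulting counts could then be compared against a no-bias baseline to judge whether users favor one path type.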
Capitalizing on Search Engines
This gets us closer to the whole story when it comes to segregated traffic patterns and potential inequalities along racial lines. My data also showed that nonracial sites rank significantly higher in search results, and therefore likely enjoy greater visibility, than racial sites. The racial sites are less visible, get less traffic and therefore likely reap fewer benefits from visibility (such as advertising revenue or higher search engine rankings).
It might be tempting to suggest that this merely reflects user preferences. That could be true if users knew which websites they wanted to visit and then navigated directly to them. But usually, users don’t. It’s much more likely that people type a word or phrase into a search engine like Google. In fact, direct traffic accounts for only about one-third of the traffic flow to the web’s top sites. To quote a conclusion from search optimization firm Brightedge, “overwhelmingly, organic search trumps other traffic generators.”
While more research is of course necessary, my work so far suggests that in conjunction with users’ preferred choices to navigate to nonracial sites more than racial sites, search engines do something with a similar effect: Nonracial sites rank significantly higher than racial sites. That can give racial sites less traffic and less financial support in the form of advertising revenue.
In both of these situations, people and search engines steer traffic in ways that give advantages to nonracial websites and disadvantages to racial sites. This approximates what, in the offline world, is called systemic, structural racism.
By Charlton McIlwain, Associate Professor of Media, Culture, and Communication, New York University
This article was originally published on The Conversation.