Chapter Seven: The Power Chapter

Examining how power is wielded through data means participating in projects that wield it back. The projects we discuss in this chapter deal openly and explicitly with questions about power, and name the structural forces like sexism and racism that lead to power imbalances.
Published Nov 01, 2018

This chapter is a draft. The final version of Data Feminism will be published by the MIT Press in 2019. Please email Catherine and/or Lauren for permission to cite this manuscript draft.



In 1970, the Detroit Geographic Expedition and Institute released a provocative map, titled "Where Commuters Run Over Black Children on the Pointes-Downtown Track". The map starkly shows where many Black children were killed. On one corner alone, six children were killed by white drivers over the course of six months. Just gathering the data to document what the community already knew to be true posed a difficult problem. No one was keeping detailed records of these deaths, let alone making them publicly available. The data were collected and published only because of an unlikely collaboration between low-income, urban Black youth, led by Gwendolyn Warren, and white male academic geographers.

Gwendolyn Warren was the Administrative Director of the Detroit Geographic Expedition and Institute, a collaboration between Black youth in Detroit and white academic geographers that lasted from 1968-1971. The group worked together to map aspects of the urban environment related to children and education. Warren also worked to set up a free school where youth could take college classes in geography for credit.

Credit: The Detroit Geographic Expedition and Institute

Source: Gwendolyn Warren, "Field Notes III: The Geography of Children" (1970)

Permission: Pending

Contrast this map with a map made thirty years prior by the (all white and male) Detroit Chamber of Commerce and the (all white and male) Federal Home Loan Bank Board. This map set the stage for "redlining", a discriminatory practice of rating the risk of home loans in particular neighborhoods based on residents' demographics (their race, not their creditworthiness). Redlining began as a visual technique of red shading for all the neighborhoods in a city that were deemed "undesirable" for granting loans. All of Detroit's Black neighborhoods in 1940 fall in red areas on this map. Denying loans to Black residents set the stage for the decades of structural racism and blight that were to follow.

This is a "redlining" map of Detroit published in 1939. It was created as a collaboration between the (all white and male) Detroit Chamber of Commerce and the (all white and male) Federal Home Loan Bank Board, and its red areas mark the neighborhoods that these institutions deemed "high-risk" for bank loans. Paul Szewczyk, a local historian, has demonstrated how all of Detroit's majority African American neighborhoods were colored red. Detroit was not an isolated case: redlining was a standard practice in virtually all of America's major cities. It was a scalable, "big data" approach to systematic discrimination under the guise of data and objectivity.

Credit: The Detroit Chamber of Commerce and the Federal Home Loan Bank Board

Source: https://detroitography.com/2014/12/10/detroit-redlining-map-1939/

Permissions: Pending

Both of these maps use straightforward cartographic techniques (aerial view, legends and keys, color and shading) to indicate different characteristics. But what is starkly, undeniably different about the two maps is the worldviews of the makers and their communities. In the second map you have the racist, male-dominated city and federal institutions seeking to further institutionalize segregation and secure white wealth. Black neighborhoods were deemed to pose a "high risk" to the financial solvency of white institutions, so redlining maps became a way to systematically and "scientifically" protect white resources. These institutions succeeded, in no small part because of maps like this one. In contrast, in the first map you have a community that had recently learned the cutting-edge geographic techniques of their era and decided to take action against the same structures of power that created the second map. One is a map of securing power and the other is a map contesting power.

Who makes maps and who gets mapped? The DGEI map is, unfortunately, a rare instance in which communities of color, led by a young Black woman, determined what they wanted to map. It is more frequently the case that communities of color are mapped by institutions in power, whose worldviews and value systems may differ vastly from those of the community. One of the most dangerous outcomes of this imbalance of power – in evidence in this example of harm that was inflicted on people systematically for decades using maps and data – arises when those institutions in power obscure their political agendas behind a veil of objectivity and technology.

This veil is not just a historical phenomenon. One can make a direct comparison between yesterday's redlining maps and today's risk assessment algorithms. The latter are used in many locales to inform whether a person who has been detained should be considered at low or high risk of committing a future crime. Risk assessment scores can affect whether a person is let out on bail and what kind of sentence they receive – they have the power to set you free or lighten your sentence. 

The issue is that different bodies are differently weighted by the risk assessment algorithm. For example, in 2016 Julia Angwin led a team at ProPublica to investigate one of the most widely used risk assessment algorithms, created by the company Northpointe (now Equivant). Her team found that white defendants were more often mislabeled as low risk than Black defendants, and conversely, that Black defendants were mislabeled as high risk more often than white defendants. Digging further into the details, the journalists uncovered a 137-question worksheet that detainees fill out. Their answers feed into the software and are compared with other data in order to spit out the risk assessment score for the individual. While the questionnaire does not ask directly about race, it asks questions that are direct proxies for race, like whether you were raised by a single mother, whether you have friends or family who have been arrested, and whether you have ever been suspended from school. In the US context, each of those data points has been demonstrated to occur disproportionately for Black people – 67% of Black kids grow up in single parent households, for example, whereas the rate is only 25% for white kids. So, while the algorithm creators claim that it isn't considering race, it is considering race by proxy and using that information to systematically disadvantage Black and brown people.
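To make the idea of group-specific mislabeling concrete, here is a minimal sketch of how one might compare error rates across two groups. The records below are invented toy data, and the group labels "A" and "B" and the function name are hypothetical; this is not ProPublica's data set or its actual methodology.

```python
from collections import defaultdict

# Invented toy records: (group, predicted_high_risk, actually_reoffended).
# These numbers are hypothetical and chosen only for illustration.
records = (
    [("A", True, False)] * 2 + [("A", False, False)] * 3
    + [("A", True, True)] * 4 + [("A", False, True)] * 1
    + [("B", True, False)] * 1 + [("B", False, False)] * 4
    + [("B", True, True)] * 3 + [("B", False, True)] * 2
)

def error_rates_by_group(records):
    """Return accuracy, false positive rate, and false negative rate per group."""
    counts = defaultdict(lambda: {"tp": 0, "fp": 0, "tn": 0, "fn": 0})
    for group, predicted_high, reoffended in records:
        if predicted_high and reoffended:
            counts[group]["tp"] += 1
        elif predicted_high and not reoffended:
            counts[group]["fp"] += 1  # mislabeled as high risk
        elif not predicted_high and reoffended:
            counts[group]["fn"] += 1  # mislabeled as low risk
        else:
            counts[group]["tn"] += 1

    rates = {}
    for group, c in counts.items():
        did_not_reoffend = c["fp"] + c["tn"]
        reoffended = c["fn"] + c["tp"]
        total = did_not_reoffend + reoffended
        rates[group] = {
            "accuracy": (c["tp"] + c["tn"]) / total,
            "false_positive_rate": c["fp"] / did_not_reoffend if did_not_reoffend else None,
            "false_negative_rate": c["fn"] / reoffended if reoffended else None,
        }
    return rates

rates = error_rates_by_group(records)
for group, r in sorted(rates.items()):
    print(group, r)
```

In this toy data both groups are scored with the same overall accuracy, yet group A is more often wrongly labeled high risk and group B is more often wrongly labeled low risk. A single accuracy number can hide exactly the kind of disparity described above.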

Northpointe’s risk assessment algorithm is called COMPAS (Correctional Offender Management Profiling for Alternative Sanctions). Its scores are derived from a defendant's answers to a 137-question survey about their upbringing, personality, family and friends, including many questions that can be considered proxies for race, such as whether they were raised by a single mother. Note that evidence of family criminality would not be admissible in a court case for a crime committed by an individual, but here it is used as a factor in making important decisions about their freedom.

Credit: The Northpointe risk-assessment survey, sourced by ProPublica.

Source: https://assets.documentcloud.org/documents/2702103/Sample-Risk-Assessment-COMPAS-CORE.pdf

The redlining map and the Northpointe risk assessment algorithm have a lot of similarities. Both use the cutting-edge technologies of their time and aggregated data about social groups so that institutions can make decisions about individuals – Should we grant a loan to this person? What's the risk that this person will re-offend? Both use past data to predict and constrain future individual behaviors. Note that the past data in question (like segregated housing patterns or single parentage) are a product of structurally unequal conditions among social groups, and yet the technology treats those data as a causal element that will influence an individual's future behavior. Effectively this constitutes a demographic penalty that follows individuals through their lives and limits their future potential – Live in a Black neighborhood? Then you don't get a loan. Raised by a single mom? Then you can't be freed on bail because you are a flight risk. And the kicker is that because of their use of tech and data, both of these racist data products have the appearance of neutrality. Scholar Ruha Benjamin has a term for this – "the New Jim Code" – a situation which combines software code and imagined objectivity to contain and control Black and brown people.

What's the alternative? Let us for a moment imagine a completely different set of values to encode in our data products. The values in evidence in redlining maps and risk assessment algorithms are about preserving a race- and class-based status quo. White, wealthy men working in powerful institutions adopt a focus on risk – a single loan in default threatens to decrease the wealth of their institution, and the data and computational systems are mobilized to avoid this possibility at all costs. But instead of penalizing people for their statistical affiliation with specific race, gender and class demographics, we could imagine an alternative approach grounded in equity and demographic healing. A system could mobilize the same data – say, zip code and neighborhood demographics – to determine where more strategic investment was needed to counteract the toxic effects of structural inequality. And when people applied for loans, the red color in certain neighborhoods would indicate their higher need and place them higher up in the priority line for individual loans.1 The values in our alternate world are not about preserving the dominance of certain institutions and elite people but about equalizing the effects of structural inequality. Sharing power and wealth could easily be hardcoded into the computational systems of the future. The data and technology would remain almost the same but the values driving their use (and the people who derive benefit from their use) would be almost exactly opposite. But this alternate world won't happen of its own accord. As Frederick Douglass stated in 1857, and as Yeshimabeit Milner recently reminded Data for Black Lives members: "Power concedes nothing without a demand."
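The claim that the same data can serve opposite values can be illustrated with a small sketch. Everything here is hypothetical: the applicant names, the `redlined` flag, and both policy functions are invented for illustration and do not describe any real lending system.

```python
# Hypothetical applicants: (name, lives_in_redlined_neighborhood).
applicants = [
    ("Ana", True),
    ("Ben", False),
    ("Cleo", False),
    ("Dev", True),
    ("Eli", False),
]

def status_quo_policy(applicants):
    """Risk-averse values: anyone from a red-shaded neighborhood is denied."""
    return [name for name, redlined in applicants if not redlined]

def equity_policy(applicants):
    """Equity values: the same flag moves applicants to the front of the line.

    sorted() is stable, so the original order is kept within each group.
    """
    ordered = sorted(applicants, key=lambda a: not a[1])  # redlined first
    return [name for name, _ in ordered]

approved = status_quo_policy(applicants)
priority_line = equity_policy(applicants)
print("Status quo approves:", approved)
print("Equity priority order:", priority_line)
```

The input data are identical in both functions. Only the rule applied to the `redlined` flag differs, which is where the values of the system's designers enter.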

What is the demand? Which demands and on behalf of whom? In order to formulate those demands it is important to do two things: examine how power is currently wielded with and through data and, in parallel, imagine and model how things could be different. Examining intersecting dimensions of power has long been part of a feminist toolkit. Back in 1977, the Combahee River Collective, the famed Black lesbian activist group out of Boston, urgently advocated for "the development of integrated analysis and practice based upon the fact that the major systems of oppression are interlocking."

Examining how power is wielded through data means doing projects that wield it back, like Warren's map and ProPublica's Machine Bias story. These projects deal openly and explicitly with who has power and who doesn't, and they name the structural conditions like racism and sexism that underlie those facts. This work involves lifting the veil of what Benjamin calls the "imagined objectivity" of code and exposing the differential harms and benefits resulting from the deployment of data science. Good work in this vein is emerging from spaces like activism, journalism,2 machine learning,3 and law.

But data science and visualization work that examines power still mostly happens around the margins of the field, for three reasons. First, unless you work in an accountability field (such as journalism or law), there typically isn't funding or other professional incentives for such work. Corporations typically want to visualize their supply chain, not their sexism.

Second, the people who have access to data and to the technical skills to work with it are those who have the greatest stake in reproducing the status quo. The elephant in the server room, only very occasionally acknowledged, goes back to one of the issues that we raised in Bring Back the Bodies: that women and people of color are not well-represented in the fields of data science and visualization, and the problem is getting worse. In the graphic below, you can see that female graduates in Computer/Information Science in the US peaked in the mid-1980s at 37%. We have seen a slow decline in the years since then. The rate of female graduates in 2010-11 fell below the rate of female graduates in 1974-5. What this means is that the most highly-touted methods of producing knowledge and deriving insight in the age of big data and artificial intelligence are being designed and deployed primarily by the people with the most privilege.

One would expect female graduates in Computer/Information Science to be around 50%, but actual rates have never come close to half. Female graduates in the US peaked in the mid-1980s at 37% and we have seen a slow decline in the years since then. The rate of female graduates in 2010-11 (17.6%) is now below the rate of female graduates in 1974-5.

Credit: Graphic by Catherine D’Ignazio

Source: Catherine D’Ignazio


[ DRAFT IMAGE: A redesigned version of a pie chart included in the AAUW report, "COMPUTING WORKFORCE, BY GENDER AND RACE/ETHNICITY, 2006–2010" which juxtaposes the statistics on women in computing with women in the overall population ]

Credit: Graphic by Catherine D’Ignazio

Source: Catherine D’Ignazio


Relatedly, the third and final reason that examining power is the exception rather than the norm is that, as feminist sociologist Michael Kimmel says, "privilege is blind to those who have it."

What does this mean? If you remember Kimmel's colleague's powerful statement from Chapter X, it went like this. His African-American colleague said, "When I look in the mirror I see a Black woman. When a white woman looks in the mirror she sees a woman." And Kimmel, a white man, rejoins, "And when I look in the mirror, I see a human being." For people in the dominant group, their gender, race, sexuality or class is so normalized that it is invisible. It is not seen as a marker of difference, but rather simply "the way things are". Take enough of those privileged individuals and put them together collectively at the helm of data science and algorithm development and you have a major structural deficiency. This basic imbalance of power remains mostly unacknowledged – except when it reveals itself in surprising and uncomfortable ways.

For example, Joy Buolamwini, a Ghanaian-American graduate student at MIT, was working on a class project using facial analysis technology. These are software packages that will detect a face in an image, similar to when your phone camera draws outlines around the faces that it "sees" in the picture. But there was a problem – the software couldn't "see" Buolamwini's dark-skinned face. It had no problem seeing her lighter-skinned collaborators. When she drew a face on her hand and put it in front of the camera, it detected that. And then when Buolamwini put on a white mask, essentially going in "white face," the system detected the mask's facial features perfectly. Digging deeper into the code and benchmarking data behind these systems, Buolamwini discovered that the data set on which many of the facial recognition algorithms are tested contains 77.5% male faces and 83.5% white faces. When she did an intersectional breakdown of a separate test dataset – looking at gender and skin type together – only 4.4% of the faces in that data set were female and dark-skinned. In their evaluation of three commercial systems, Buolamwini and Timnit Gebru showed that darker-skinned females were up to forty-four times more likely to be misclassified than lighter-skinned males. No wonder the software was failing on faces like Buolamwini's if both the training data and the benchmarking data relegate women of color to a tiny fraction of the overall data set.
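An intersectional breakdown of this kind can be sketched in a few lines. The face counts below are invented for illustration and are not the figures from Buolamwini's benchmarks; the `share` helper is a hypothetical name.

```python
from collections import Counter

# Hypothetical benchmark of 100 labeled faces: (gender, skin_type).
# These counts are made up and are not taken from any real data set.
faces = (
    [("male", "lighter")] * 60
    + [("female", "lighter")] * 20
    + [("male", "darker")] * 15
    + [("female", "darker")] * 5
)

def share(faces, predicate):
    """Fraction of faces for which predicate(gender, skin_type) is true."""
    return sum(1 for gender, skin in faces if predicate(gender, skin)) / len(faces)

# Single-axis views: each looks at one dimension of identity at a time.
female_share = share(faces, lambda g, s: g == "female")
darker_share = share(faces, lambda g, s: s == "darker")

# Intersectional view: gender and skin type considered together.
darker_female_share = share(faces, lambda g, s: g == "female" and s == "darker")
intersection_counts = Counter(faces)

print(f"female: {female_share:.0%}, darker-skinned: {darker_share:.0%}")
print(f"female and darker-skinned: {darker_female_share:.0%}")
print(intersection_counts.most_common())
```

In this toy data set, each single-axis share looks small but not negligible, while the intersectional group is far smaller than either. Counting along one axis at a time would miss how thinly that group is represented.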

Joy Buolamwini found that she had to put on a white mask in order for the facial detection program to "see" her face. Buolamwini is now founder of the Algorithmic Justice League (AJL), https://www.ajlunited.org/.

Credit: Photo by Joy Buolamwini

Source: Photo by Joy Buolamwini

Permissions: Pending

As she tells it, "I didn't start out on a mission for social justice," but after seeing the need for more fairness and accountability, Buolamwini has now gone on to launch the Algorithmic Justice League (AJL) – an organization that works to highlight and address algorithmic bias. Buolamwini and the AJL have done art projects, written research papers, taken to the media to call for a moratorium on facial analysis and policing, and they are even advising on legislation and professional standards for the field of computer vision.

But imagine, for a moment, a world where female, Black and brown engineers are the ones designing the computer vision training data sets and algorithms in the first place. A world where people like Joy Buolamwini and Gwendolyn Warren are the norm, not the exception. Would such a system have worked from the start? Not necessarily, says Buolamwini. "No technologist works in isolation – we rely on the libraries and datasets developed by the community over time." And these datasets have what she has termed "power shadows" – they reflect the structural inequality of the world they draw from. So when it is easiest to collect faces of powerful public figures for your benchmarking data, those datasets will contain power shadows: disproportionate male and white representation.

So what does "working" mean if you want to make data products that are anti-racist and anti-sexist? On the one hand, the software did "work". It was pretty good at detecting the faces of the white men who dominated the data set, which was 77.5% male and 83.5% white. But Buolamwini likes to remind her audiences that Europeans are less than 10% of the world's population, so it didn't work for the majority of the global population. And even so, "it's not just about creating accurate algorithms but creating equitable systems," she says. We can't just build more precise surveillance apparatuses; we also need to look at the deployment, governance, use and impacts of these technologies: "Communities, not companies, should determine whether and how this technology is used by law enforcement."

Where the technology did "work," we might say, is in accurately reflecting back to Buolamwini the biases of the people in power towards Black women. In that sense, it faithfully reinforced the racist messages Black people receive all the time that their lives as well as their voices, bodies, and representations do not matter. bell hooks referred to this phenomenon as "representational harms." Specifically writing about data, artist Mimi Onuoha has called this phenomenon "algorithmic violence" and data ethicist Anna Lauren Hoffmann has used the term "data violence" for the way in which it participates in (and legitimates) the circulation of damaging narratives and ideas about particular groups of people. This is the harm that occurs with imagined objectivity – when software engineers wield data "neutrally" (in an attempt to wiggle out of having to deal with squishy things like values) they build things that support the existing status quo. And that status quo is ugly – it is racist, patriarchal, heteronormative and more.

In fact, one of the structural forces that software engineers and data scientists need to contend with is that data is by and large a tool of management, wielded by those institutions in power, like the Detroit Chamber of Commerce in the 1940s, that have a vested interest in maintaining the ugly status quo because they benefit from it. Joseph Weizenbaum, artificial intelligence trailblazer and creator of the famous ELIZA experiment in the 1960s, looked back on the history of computing and put it like this: "What the coming of the computer did, 'just in time,' was to make it unnecessary to create social inventions, to change the system in any way. So in that sense, the computer has acted as fundamentally a conservative force, a force which kept power or even solidified power where it already existed."

The first step to pushing back against this fundamentally conservative force is to understand that the single most damaging thing one can do to uphold the oppressive order of the world is to claim to have no values and no politics, and to claim that one's work with data is neutral. This is Haraway's god trick and Benjamin's imagined objectivity: the veil at work to obscure power differentials. This neutrality narrative would be item #1 in the BuzzFeed listicle "Things Straight White Men Tell Themselves to Stay on Top."

The second step is to begin to understand the ways that privilege – and oppression, its counterpoint – manifest themselves in data science. Privilege and oppression are complicated and there are "few pure victims and oppressors," as sociologist Patricia Hill Collins notes. A helpful way to start to grasp these functionings is through Collins' concept of the matrix of domination. As we described at the outset of this book, a core distinguishing feature of contemporary feminism is its insistence on intersectionality – the idea that we must take into account not only gender but also race, class, sexuality and other aspects of identity in order to fully understand and resist how power operates to maintain an unjust status quo. Collins' matrix of domination describes the overall social organization of those intersecting oppressions. She outlines four major domains in which the matrix of domination operates: the structural domain, the disciplinary domain, the hegemonic domain, and the interpersonal domain. "Each domain serves a particular purpose," writes Collins.

The structural domain is that of laws and policies and schools and institutions – it organizes and codifies oppression. If we take the example of voting, most US states prohibited women from voting in elections until the 1910s. Even after the passage of the Nineteenth Amendment in 1920, many state voting laws included literacy tests and other ways to specifically exclude women of color,4 so it wasn't until the Voting Rights Act of 1965 that all Black and brown women were enfranchised. The disciplinary domain administers and manages oppression through bureaucracy and hierarchy (rather than explicit laws). In our voting example, this might take the shape of a company prohibiting factory workers from leaving early to vote or penalizing workers who distribute information about voting.

Neither of these domains is possible without the hegemonic domain, which deals with culture, media, and ideas. Discriminatory policies and practices in voting can only happen in a world that widely circulates oppressive ideas about who "counts" as a citizen. For example, an anti-suffrage pamphlet from the 1910s proclaimed that "You do not need a ballot to clean out your sink spout." This and other such memes of the era reinforced pre-existing societal notions that a woman's place is in the domestic arena, outside of public life. And the final part of the matrix of domination is the interpersonal domain, which influences the everyday lived experience of individuals. For example, what would it feel like to be the butt of jokes made by males in your family as they read that pamphlet? What did it feel like to wait in line for twelve hours to cast your vote, knowing that the system was deliberately trying to screw you out of a voice?

If you are a Black woman in the US, you are intimately familiar with the matrix of domination because you brush up against it in everyday encounters. Writes Collins, "Oppression is not simply understood in the mind—it is felt in the body in myriad ways. Moreover, because oppression is constantly changing, different aspects of an individual U.S. Black woman’s self-definitions intermingle and become more salient: Her gender may be more prominent when she becomes a mother, her race when she searches for housing, her social class when she applies for credit, her sexual orientation when she is walking with her lover, and her citizenship status when she applies for a job. In all of these contexts, her position in relation to and within intersecting oppressions shifts." In each of these cases, the woman is made aware of her differences and her subjugated position in relation to a dominant norm. This experience is an essential form of data – lived experience as primary source knowledge.

But let's imagine for a moment you are a straight, white, middle-class, cisgender male U.S. citizen. Your body doesn't change in childbirth and breastfeeding so you don't think about workplace accommodations. You look for a home or apply for a credit card and people are eager for your business. People smile or don't look twice when you hold your girlfriend's hand in public. You present your social security number for jobs as a formality, but it never hinders an application from being processed or brings unwanted attention. The ease with which you traverse the world is invisible to you because it is quite simply the way things are and you imagine they are the same for everyone else. This is what it means to be blind to your own privilege – despite having the best education, the most elite among us are pathetically deficient when it comes to recognizing injustice, across all of the domains in the matrix of domination. They lack the lived experience – the undeniable data of lived experience – that reminds them every day that their bodies, their sexuality, and/or their race depart from a desired norm.

Projects that reveal those norms often focus on the absences and silences – those who are purposefully omitted or simply forgotten because of who has consolidated privilege and power. We’ve already introduced you to the work of artist and designer Mimi Onuoha in Chapter One. Her project, Missing Data Sets, if you recall, is a list she maintains of issues and events that go uncounted. Her missing data sets name important phenomena about which you would expect institutions to collect systematic information, such as police killings, hate crimes, sexual harassment, and Caucasian children adopted by people of color.5

Missing Data Sets, by Mimi Onuoha, 2015 - present, is a list of data sets that are not collected because of bias, lack of social and political will, and structural disregard (https://github.com/MimiOnuoha/missing-datasets).

Credit: Photo by Mimi Onuoha

Source: Mimi Onuoha

Permissions: Pending


Onuoha exhibits Missing Data Sets as an empty set of tabbed file folders in art exhibitions. The viewer can browse the files and open the folders to reveal that there are no papers inside. What should be there, in the form of paper records, is "missing" – absent not because the topics are unimportant, but because of bias, lack of social and political will, and structural disregard. As Onuoha says, "That which we ignore reveals more than what we give our attention to. It’s in these things that we find cultural and colloquial hints of what is deemed important. Spots that we've left blank reveal our hidden social biases and indifferences."

What is to be done about missing data sets? Taking a feminist perspective in this unequal ecosystem can mean pointing at their absence, as in the case of Onuoha. Or, sometimes, it means walking straight onto the unequal playing field and collecting the missing data yourself, because somebody has to do it.

This is exactly what pioneering data journalist and civil rights advocate Ida B. Wells did as early as 1895, when she assembled a set of statistics on the epidemic of lynching that was sweeping the United States at the time. It is also what Princesa, the anonymous Mexican woman whom we introduced in Bring Back the Bodies, has been doing for the past three years. She has logged 2,355 cases of femicide since 2016,6 and her work provides the most accessible information on the subject for journalists, activists and victims' families seeking justice.

Femicide is a term first used publicly by feminist writer and activist Diana Russell in 1976 while testifying before the first International Tribunal on Crimes Against Women. Her goal was to situate the murders of women in a context of unequal gender relations. In this context, men use violence to systematically dominate and exert power over women. Indeed, the research bears this out. While male victims of homicide are more likely to have been killed by strangers, a 2008 report notes a “universal finding in all regions” that women and femmes are far more likely to have been murdered by someone they know. Femicide includes a range of gender-related crimes, including intimate and interpersonal violence, political violence, gang activity, and female infanticide. While such deaths are often depicted as isolated incidents, and treated as such by authorities, those who study femicides characterize them as a pattern of underrecognized and under-addressed systemic violence.

Femicides in Mexico rose to global visibility in the mid-2000s with widespread media coverage about the deaths of poor and working-class women in Ciudad Juárez. A border town located across the Río Grande from El Paso, Juárez is home to more than 300 maquiladoras – factories that employ many women to assemble goods and electronics, often for low wages and in substandard working conditions. Between 1993 and 2005, nearly four hundred women were murdered in the city, around a third of them in a brutal or sexual manner. A conviction was made in only three of those deaths. When alleged perpetrators were arrested, they were often tortured into confessions by police, casting doubt on the investigations. Activist groups like Ni Una Más (Not One More) and Nuestras Hijas de Regreso a Casa (Our Daughters Back Home) were formed in large part by mothers who demanded justice for their daughters, often at great personal risk to themselves.7 These groups succeeded in gaining the attention of the Mexican State, which established a Special Commission on Femicide chaired by politician Marcela Lagarde. After three years of investigating, the Commission found in 2006 that femicide was indeed occurring and that the Mexican State was systematically failing to protect women and girls from being killed. Moreover, Lagarde suggested that femicide be considered "a crime of the state which tolerates the murders of women and neither vigorously investigates the crimes nor holds the killers accountable."

Despite the Commission's work and the fourteen volumes of detailed accounts and statistics about femicide, a 2009 ruling against the Mexican state by the Inter-American Court of Human Rights, a United Nations Symposium on Femicide in 2012, and the fact that sixteen Latin American countries have now passed laws defining femicide – despite all this, deaths in Juárez have continued to rise and the toll is now more than 1500. Three hundred women were killed in Juárez in 2011 alone, and only a tiny fraction of those cases have been investigated. The problem extends beyond Ciudad Juárez in the state of Chihuahua to other states in the nation such as Chiapas and Veracruz.

While there is increasingly a legal and analytical basis for characterizing deaths as femicides, there is still a great deal of missing data. In a report titled Strengthening Understanding of Femicide, the authors state that "instances of missing, incorrect, or incomplete data mean that femicide is significantly underreported in every region." In the case of femicides, as in so many cases of data collected (or not) about women and marginalized groups, the collection environment is compromised. Lagarde's very definition of femicide includes the fact that the State – composed mainly of privileged men who have a vested interest in maintaining a gendered order – is complicit through indifference and impunity, so how could data be reliably collected?

This circles us back to a point we first made in Chapter One, and elaborated in Unicorns, Ninjas, Janitors, and Rock Stars, about how collecting large amounts of data is costly and resource-intensive. Only government states, corporations, and some elite institutions have those resources, so data collection efforts tend to be driven by their values and priorities. Not surprisingly, those institutional actors can be compromised by their own privilege, and their interest in maintaining the status quo. In the case where a government state is itself the bad actor, there can be no other authority with enough resources or channels of influence to shift collection practices. This is especially true in the case of femicides, in which collecting high-quality data would rely on shifting policy for local law enforcement and medical examiners, the entities that log homicide information.

But as data journalist Jonathan Stray asserts, "Quantification is representation." Looking at U.S. census data prior to 1970, he explains, you might come to the conclusion that there were no Latinx people living in the United States. This is not true, of course. There were actually already millions of Latinx people living in the U.S. But 1970 was the first year that “Hispanic” was included as an ethnic category on the census. Prior to that, it would have been hard to know anything about Latinx people as a group because the federal government was simply not collecting any information about them. So when the category was added to the census, most Latinx people were pleased to see it. It meant that they mattered.

But the inverse of the “quantification is representation” equation is also true: if data is not collected on a particular group, or on a particular issue, then institutions in power can pretend that the issue doesn't exist. Similar to the case of universities and sexual assault statistics, as discussed in The Numbers Don't Speak for Themselves, no Mexican state wants to have high rates of femicide. It is into this lack of government will that Princesa, who recently has spoken out in public under her given name María Salguero, has inserted her map of femicides. Salguero studied Geophysical Engineering at Mexico's Instituto Politécnico Nacional. She learned her mapping and journalism skills from attending trainings with Chicas Poderosas, a Latin American feminist group that focuses on training cis and trans women in data storytelling. The femicides map takes two forms. One, depicted in Figure 7.08a, is a point map where Salguero manually plots a pin for every femicide that she collects through media reports or through crowdsourced contributions. The other visualization, seen in Figure 7.08b, consists of the same data in a dashboard format, with gender-related killings grouped as smaller or larger bubbles for different geographies depending on their incidence. One of her goals is to "show that these victims had a name and that they had a life. They weren't statistics," so Salguero logs as many details as she can about each death. These include name, age, relationship with the perpetrator, mode and place of death, whether the victim identified as transgender, as well as the full content of the news report which served as the source. It can take her three to four hours a day to do this unpaid work (see Show Your Work for a further discussion on labor, gender, and data). She takes breaks to preserve her mental health, and she typically has a backlog of a month's worth of femicides to add to the map.



María Salguero's map of femicides in Mexico 2016-present. Map extent along with a detail of Ciudad Juárez with a focus on a single report of an anonymous transgender femicide. She crowdsources points on the map based on reports in the press and reports from citizens to her.

Credit: María Salguero.

Source: https://feminicidiosmx.crowdmap.com/ and https://www.google.com/maps/d/u/0/viewer?mid=174IjBzP-fl_6wpRHg5pkGSj2egE&ll=21.347609098250942%2C-102.05467709375&z=5.


While media reports and crowdsourcing are imperfect ways of collecting information, this map – created and maintained by an individual – fills a vacuum created by the government's deflection of responsibility. Mexico's National Health Information System (SINAIS) logs national homicide data, but it records only the name, the location, and how the person died. To count a death as a femicide, you must also know the circumstances of the death as well as the relationship between the perpetrator and the victim. Various federal agencies point fingers in different directions regarding femicide data collection. In 2017, the Federal Institute for Access to Public Information and Data Protection (INAI) – led by Commissioner Ximena Puente de la Mora – ordered the National Commission for Human Rights (CNDH) to turn over statistics about femicides for 2015 and 2016. The CNDH declared itself unable to provide such information and referred the request to two other federal agencies, neither of which collects data about femicides.

In the meantime, Salguero's femicides map provides the most authoritative source of data on femicides at the national level. It has been featured in national Mexican media outlets and used to help find missing people. Salguero herself has testified before the Mexican Senate. Though Salguero is not affiliated with a specific group, she makes the data available to activist groups for their efforts. And parents of victims have called her to give their thanks for making their daughters visible. The urgency of the problem makes the labor worthwhile. Princesa affirms, "this map seeks to make visible the sites where they are killing us, to find patterns, to bolster arguments about the problem, to georeference aid, to promote prevention and try to avoid femicides." 

How might we explain the missing data around femicides in relation to the four domains of power that constitute Collins' matrix of domination? The most grave and urgent manifestation is in the interpersonal domain, where women are victims of extreme violence and murder at the hands of men. And although the structural domain – law and policy – has recognized femicide, there are no specific policies implemented in order to ensure adequate information collection, either by federal agencies or local authorities. Thus, the disciplinary domain, where law and policy are enacted, is characterized by deferral of responsibility, failure to investigate, and victim blaming, precisely because there are no consequences in the structural domain.

And none of this would be possible without the hegemonic domain – the realm of media and culture – that presents men as dominant and women as subservient; men as public, women as private; with any challenge to this gendered order of operations perceived as a grave transgression, deserving of punishment. Indeed, government agencies have used their position to publicly blame victims. Following the femicide of 22-year-old Mexican student Lesvy Osorio in 2017, as Maria Rodriguez-Dominguez reports, the Public Prosecutor's Office of Mexico City shared on social media that the victim was an alcoholic and drug user who had been living out of wedlock with her boyfriend. Here was the office that was supposed to be investigating the murder, and instead of doing their job they turned to social media to imply that Osorio was a degenerate. This led to public backlash and the hashtag "#SiMeMatan (If they kill me)" and tweets such as "#SiMeMatan it’s because I liked to go out at night and drink a lot of beer."

This is the data collection environment for femicide information and it is characterized by extremely asymmetrical power relations, where those with power and privilege are the only ones who can actually collect the data but they have overwhelming incentives to ignore the problem, precisely because addressing it poses a threat to their dominance. Here it is important to note that data on femicides is not an isolated case. It is an expected outcome and regular feature of an unequal society, in which a gendered, racialized order is maintained through willful disregard, deferral of responsibility and organized neglect for data and statistics about those bodies who do not hold power. For example, doctoral student Annita Lucchesi has created "The Missing and Murdered Indigenous Women Database" which tracks indigenous women who are killed or disappear under suspicious circumstances in the US and Canada. She thinks approximately 300 indigenous women per year are killed but the exact number is unknown because nobody (other than Lucchesi) is actually counting. Other examples in the US context include police killings of unarmed Black and brown people,8 maternal mortality statistics, and people killed by US drones.

What is to be done? It's important to remember that asymmetrical power relations don't mean absolute power. And it's also important to remember that States and entities with power are not monolithic. There are plenty of public servants – women and men and others – in Mexico advocating internally for better data collection around femicides, like Ximena Puente de la Mora from INAI who initiated the femicides data request.

Crowdsourced data collection efforts that count and measure the extent of structural oppression can be a first step towards demanding public accountability. This is an important, urgent role for data journalism in the 21st century.9 As we discussed in Bring Back the Bodies, ProPublica has an on-going investigative series about "Lost Mothers" – mothers in the US who lose their lives in childbirth due to poor care and preventable causes. One of their findings was that there was no comprehensive federal data on maternal mortality, so ProPublica began crowdsourcing stories of individuals to attempt to count the phenomenon. Their database and their reporting have spurred the creation of more than 35 state-level review committees that are investigating maternal mortality in their states, as well as a proposed bill in Congress to allocate $12.5 million to the Centers for Disease Control and Prevention to undertake better data collection.

But, at the same time, we also have to work on dismantling the consolidated power and privilege that organize the matrix of domination.

Could we statistically model oppression? It's a provocative question and one that Google researcher Margaret Mitchell has been investigating at the level of collective human speech. She describes how, in speech patterns, people use unqualified nouns for the "default" case of something. For example, bananas that are green are modified with "green bananas" or "unripe bananas" to indicate that they depart from the ready-to-eat yellow banana. But nobody needs to say "yellow banana" because it is implied by our shared concept of banana. This is called "reporting bias" in artificial intelligence research. So, studying the adjectives that modify "banana" in large data sets can actually tell us a lot about what people's default idea of bananas is in a particular culture. And when applied to humans, the "default case" reveals a lot about our collective norms and biases. For example, a doctor who is female is more typically qualified as a "female doctor" in human speech because it represents a departure from a perceived norm of doctors being male. So if "female doctor" is used in speech patterns for a particular culture, we might be able to infer that the social norms for that culture are patriarchal and thus pay special attention to the ways in which women are oppressed. Of course, this only works with those ideas that make it into human speech. As we have already outlined, there are many important issues related to cis and trans women, such as sexual assault, about which people are almost completely silent.
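To make the idea of counting modifiers concrete, here is a toy sketch in Python. It is not Mitchell's method, and the example sentences are invented for illustration. It simply counts the word that appears immediately before a target noun; a real analysis would likely use part-of-speech tagging on a much larger corpus so that only adjectives are counted.

```python
import re
from collections import Counter


def modifier_counts(sentences, noun):
    """Count the word that immediately precedes `noun` in each sentence.

    This is a crude stand-in for "adjectives that modify the noun":
    the preceding word may be a number or an article rather than an adjective.
    """
    counts = Counter()
    for sentence in sentences:
        # Lowercase and split into simple word tokens.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        # Walk over adjacent word pairs and record the word before the noun.
        for prev, word in zip(tokens, tokens[1:]):
            if word == noun:
                counts[prev] += 1
    return counts


# Invented sentences, used only to illustrate the counting step.
corpus = [
    "She bought green bananas at the market.",
    "The unripe bananas sat on the counter.",
    "He ate two bananas for breakfast.",
    "Green bananas take a week to ripen.",
]

counts = modifier_counts(corpus, "bananas")
for word, n in counts.most_common():
    print(word, n)
```

In this toy corpus "green" and "unripe" show up as modifiers while "yellow" never does, which mirrors the point about defaults going unstated. The same counting could be pointed at a noun like "doctor", with the same caveat that silence in the data is not evidence of absence.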

Or perhaps we need to start looking at privilege as an ethical and legal liability and start quantifying it. Anti-racist feminists have long opposed quantifying privilege at the scale of the individual body (which can lead to something Roxane Gay calls "the oppression olympics" – competition for who is most oppressed). However, building off of recent calls for monitoring Big Tech with things like Sasha Costanza-Chock's "Intersectional Media Equity Index," one could fairly easily quantify the collective privilege of an organization and then create a prediction score for just how likely that institution is to create racist, sexist data products that lead to harmful impacts for users as well as legal and public relations disasters for the firm. Such a score could incorporate demographic information for firm ownership, leadership, employees (with a special focus on the demographics of those who are producers of data products for the company) and users. It could consist of a grade from 0-100, where 0 signifies perfect alignment between the firm and its users and 100 signifies a high risk of discrimination because of misalignment between the firm and its users. This privilege hazard score would measure just how much or how little the firm was influenced by those who already have the most privilege and power, and conversely, just how likely it would be to produce discriminatory "mistakes" and oversights. Consequently, the media might be less surprised when Google, whose board consists of 82% white men, creates image classification algorithms that only show white men in image searches for "CEO". Or when the Mexican State, composed of X% rich men, is complicit in the murders of its working class women and girls. Such discriminatory outputs would have been entirely expected based on their privilege hazard score. As discussed in What Gets Counted Counts, there is an explicit politics of being counted here. Quantification can operate as a kind of sousveillance – "watching from below" – where the Great Quantifiers like Google and Amazon and even whole nation-states are quantified and predicted right back.
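One hypothetical way to turn the 0-100 grade into arithmetic is sketched below. This is an assumption for illustration only: it is not the Intersectional Media Equity Index, it has not been validated, and the function name `hazard_score`, the group labels, and the shares are all invented. The sketch measures misalignment as the total variation distance between the demographic shares of a firm and those of its users, scaled so that identical distributions score 0 and completely non-overlapping ones score 100.

```python
def hazard_score(firm_shares, user_shares):
    """Return a 0-100 misalignment score between two demographic distributions.

    Each argument maps a group label to its share of the population (shares
    are assumed to sum to 1). The score is the total variation distance,
    i.e. half the sum of absolute differences in shares, multiplied by 100.
    """
    groups = set(firm_shares) | set(user_shares)
    distance = 0.5 * sum(
        abs(firm_shares.get(g, 0.0) - user_shares.get(g, 0.0)) for g in groups
    )
    return round(100 * distance, 1)


# Placeholder groups and shares, invented for illustration.
firm = {"group_a": 0.8, "group_b": 0.2}
users = {"group_a": 0.5, "group_b": 0.5}

score = hazard_score(firm, users)
print(score)
```

A fuller version might compute this separately for ownership, leadership, employees, and data-product teams and then combine the results, but how to weight those tiers, and which demographic categories to use at all, are open questions that this sketch does not settle.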

But let us return to the Frederick Douglass quote, "Power concedes nothing without demand." So far, we have discussed the feminist project of examining power – interrogating how power works through data to prioritize some bodies over other bodies and to secure the wealth and status of the dominant group. Buolamwini's algorithmic auditing quantified exactly how much facial recognition software was failing women of color. Princesa's map exposes the fact that gender-based killings are rampant and going untracked by the powers that be. Many important efforts to redress data discrimination and algorithmic bias are working in this mode of examining power. But in order to truly formulate demands, a feminist approach additionally requires imagining and modeling power differently to achieve equity. Equity is justice of a specific flavor, and it is slightly different than equality. Fighting for a world which treats everyone equally means that those who start out with more privilege will get further, achieve more and stay on top. Fighting for a world which treats everyone equitably means taking into account present power differentials and distributing resources accordingly. More simply said, equality upholds patriarchy and white supremacy. Equity dismantles them.

So how might data be used not only to examine power but also to transform gendered power relations? To support self-determination of marginalized groups? What does a society that values data and equity look like and feel like?

Let's circle back to present-day Detroit. At the end of 2017, the Detroit Digital Justice Coalition and the Detroit Community Technology Project published a collaborative report entitled Recommendations for Equitable Open Data. It was the result of two years of research, conversations and explorations about the city's Open Data Portal. The report is specific about what equitable open data is and who it benefits: "[W]e mean accountable, ethical uses of public information for social good that actively resist the criminalization and surveillance of low income communities, people of color and other targeted communities." Note here how the authors named and made explicit whose perspectives they were centering and why – these communities have been historical targets of discriminatory institutional practices. We saw this targeting explicitly in the redlining map introduced earlier in the chapter. The report goes on to outline seven recommendations for the City of Detroit to adopt to make their open data practices more equitable and more likely to benefit people of color and low-income communities. These include "Protect the people represented by the numbers", "Engage residents offline about open data" and "Prioritize the release of new datasets based on community interest." These are concrete demands, offered to improve the use and benefits of open data for the people who are most often left out of open data conversations.

So, following Collins, there is a matrix of domination with four different domains of power. Examining that power using data-driven methods is an important step towards challenging that matrix, particularly in egregious cases like femicides where there is a violent, unjust status quo. Additionally, we have a responsibility to create space for women, people of color, queer and trans folks and others to imagine and dream power differently – to model better and beautiful futures where all can thrive – something which we will address further in Teach Data Like an Intersectional Feminist! But it's hard to see the contours of the matrix of domination, let alone empower others or imagine things differently, when you are the recipient of a lot of benefits from it. When the system works for you, you are able to set racism and sexism and other oppressive forces aside and you will experience little penalty for such ignorance.10

So what is to be done when you are in a position of power and privilege? Most people working in data science, visualization, machine learning and statistics have significant privilege and power accumulated through their education and their institutional connections, as well as their race, gender and ability. Can you use your power and privilege for "good", even though we have explored how much of a hazard it is for your ability to accurately apprehend the injustice of the world? Emphatically, unequivocally "Yes!", with some caveats and elaborations.

The feminist grounding for navigating this quandary is called an "ethics of care", which we introduced in Show Your Work. While there are many contemporary discussions about data ethics, most derive from a version of moral reasoning introduced by Immanuel Kant in the 18th century, which prioritizes abstract dilemmas, rules and obligations, and universal application. In these conceptions, the focus is on an individual, independent human actor, and their relationships with others are conceived as contractual, business-like negotiations among equals. It is important to note that Kant based morality on reasoning, believed women to be incapable of reason, and thus concluded that women could never be full moral persons, i.e. were not fully human.11 This relates back to the "master narrative" we described in On Rational, Scientific, Objective Viewpoints from Mythical, Impossible, Imaginary Standpoints, which valorizes reason and (supposed) impartiality over all other ways of knowing and asserts the superiority of males in that capacity. More recently, technical folk are digging this approach because this kind of blanket ethical logic is easy to code into large systems. But it's important to note that this approach was explicitly designed to exclude half of humanity.

On the other hand, a feminist ethics of care prioritizes responsibilities, issues in context, and, above all else, relationships. Feminist philosopher Alison Jaggar has detailed numerous ways that traditional ethics has failed women. Masculine ethical approaches have systematically shown less concern for women's issues as opposed to men's issues, have devalued ethical quandaries in the "private" realm (the realm of housework, family and children), and have valued traditionally masculine traits like independence, autonomy, universality and impartiality, over traditionally feminine traits like interdependence, community, relationships, and responsibility. While the central unit of analysis in Kantian ethics is the individual human, an ethics of care focuses on the relationship between two or more things (possibly human, possibly not), and the ways that they are bound together by that relationship. Which is to say that rather than valuing impartiality, an ethics of care prioritizes intimacy and honors the deep, emotional, personal investment that comes with being responsible for the well-being of another, whether that is a child or the environment. And vice versa – the ways that your own well-being is tied up in how a child or the environment cares back towards you and nurtures you. This kind of situated ethics is not as easy to encode into large computational systems, but we shouldn't rule it out as impossible until someone has actually tried.

What does a feminist ethics of care mean for those of us who work every day with data science, journalism or visualization and enjoy some relatively high degree of privilege? First, accept that your privilege and power are not just an asset, but also a liability. They structure what you and your institutions see in the world and also what (and who) you and your institutions disregard about the world. The antidote to your privilege deficiency is to establish meaningful, authentic, on-going relationships across power differentials (whether based on gender, race, class, technical knowledge, ability, etc) – and to listen deeply to those new friends. This sounds simple, but it is hard, both at the individual level and at the institutional level, because it involves a reorganization of priorities and revaluation of the metrics of success.

Relationships in an ethics of care are a two-way street. For this reason, it's also important to reframe "doing good" with data as something more akin to "doing equity" or "doing co-liberation" with data to remove some of its paternalistic overtones. All too often, well meaning "help" is conceived as saving unfortunate victims from their own technological ignorance. In presenting the origin story of the Detroit Geographic Expedition and Institute, Gwendolyn Warren reflected on the ignorance of the white male academics her community worked with, "We had this group of geographers, one of whom lived in the neighborhood, who decided that they were going to 'discover us'. They were going to go and explore the 'hood and discover us. And show us how to make change[...] There was no way in hell they were going to save us, but they didn't know it."

Whereas an act of data service performed by a technical organization for a community-based group is often framed as charity, an ethics of care would frame it as one step in deeper relationship building and broader demographic healing. There is a famous saying from aboriginal activists that goes like this,

If you have come here to help me, you are wasting your time. But if you have come because your liberation is bound up with mine, then let us work together. 

Following a logic of co-liberation leads to different metrics of success. The success of a single project would not only rest on whether the database was organized according to spec or whether the algorithm was able to classify things properly, but also on how much trust was built between institutions and communities, how effectively those with power and resources shared their power and resources, how much learning happened in both directions, how much the people and organizations were transformed in the process, and how much inspiration for future work, together, was co-conspired.

Likewise, data projects undertaken by technical folk with an ethics of care would openly acknowledge and account for power differentials by explicitly prioritizing whose voices matter most in the process as input. We saw this in the Detroit Equitable Open Data Report – the authors prioritized the needs of communities that are targeted for surveillance – those who stand to experience the least benefits and the most harm from open data. By prioritizing the needs of those at the margins, we create a system that works for everyone. In some situations, this means working your absolute hardest to establish authentic relationships that did not previously exist. For example, for the past five years, Catherine has been co-leading a feminist hackathon project called Make the Breast Pump Not Suck. The first version of the hackathon took place in 2014 and focused primarily on the product design and experience of using a breast pump.12

But after a couple of years, it was clear that the innovations emerging in the breast pump space were primarily for white knowledge workers – the smart pumps were coming in at $400, $500 and $1000, not covered by insurance and thus only accessible to those with disposable income. So, in organizing the second Make the Breast Pump Not Suck Hackathon in 2018, our leadership team decided to center the voices of mothers of color, low wage workers, and queer parents because those are the groups that face the most barriers to breastfeeding in the US context. We invited members of those groups as hackers – and we also put into place an Advisory Board composed primarily of high-profile advocates of color who work directly with community organizations. This Board caught multiple oversights of the majority-white leadership team, and shifted the project in significant ways. Everyone was paid for their time. In On Rational, Scientific, Objective Viewpoints from Mythical, Impossible, Imaginary Standpoints, we discussed "design from the margins" as an underlying principle of feminist human computer interaction. This additional layer might be characterized as "governance from the margins." It functioned as an accountability mechanism to simultaneously check the leadership team's privilege and prevent us from doing harm, and also to deepen emerging relationships across race, social capital and technical knowledge.

But for this to work, those who are doing co-liberation with data have to trust that the people who experience the most harms from a social issue have the best ideas for reimagining it. As Kimberly Seals Allers, one of our Advisory Board members, said at her keynote, "whatever the question, the answer is in the community." And while the emphasis of data projects is often to develop a time-bounded thing – a database, an algorithm, a model, a visualization – it's important to remember that the longer-term goal is to build meaningful, authentic, on-going relationships across differences in power and privilege in order to transform yourself and your institution and the world.

Discussions

Labels
Lauren Klein: LK note to self: Need a better cite on this.
Catherine D'Ignazio: This is from Lilla Watson - https://en.wikipedia.org/wiki/Lilla_Watson - I changed this back to "aboriginal activists" because that's how she wants it to be credited
?
Os Keyes: advising != governance
?
Os Keyes: One’s duty is not completed by recognising the existence of The Proles.
?
Os Keyes: It’s never been done; Sasha’s work is excellent but untested. This feels like a dangerous thing to imply is a shoe-in.
Lauren Klein: Good point.
?
Os Keyes: Also it wasn’t legal even on paper for native americans to vote until 1962. The “women of color” statements seem to be positioning it as “black people and others”: it would be good to recognise specific issues that indigenous folk face.
Lauren Klein: Noted. We’ll address this with more nuance in the revision.
Os Keyes: “Oppressive”? If you’re going to call out specific intersects then great but race and gender have consistently been the examples; aspects like disability being incorporated more would be good, but if not, generalisability would be good.
Lauren Klein: Updated but also keeping this comment as a reminder throughout.
Os Keyes: In both cases where you provide these examples, you use gender data. Can I suggest mixing it up?
Lauren Klein: Good idea.
Os Keyes: Again, this is not the only pair of deficits
Lauren Klein: Thanks for this important reminder.
Nikki Stevens: This is a great story of transitioning to more equitable methods.
Nikki Stevens: I’m concerned that this recommendation puts the burden of education on those on the opposite side of the “power differentials”, as well as strips them of their agency. I’m sure it wasn’t the intention, but it feels like exploitation here - a dressed up version of “i have a Black friend”
Lauren Klein: Thanks for this comment. We’ll think about how to revise this claim.
Nikki Stevens: again a false binary.
Lauren Klein: Thanks for pointing this out. It’s a problem with a lot of feminist ethics more generally as well.
Nikki Stevens: this language reinforces the gender binary.
Lauren Klein: Noted.
Nikki Stevens: was race a factor? are stats same for all races?
Nikki Stevens: here again - the data excludes gnc folks. additionally, race isn’t mentioned.
Pratyusha Kalluri: Want to leave this comment somewhere: there’s a few places in the book where the second person is used to refer to a presumed straight white cis male character relatively unaware of social justice issues. I would be mindful of whether doing this subtly/slowly prioritizes that more ignorant reader over the many other folks reading, and sends the wrong message — that a book to data scientists is not a book to WOC etc.
Lauren Klein: Thanks for this. We need to attend more to our imagined readership, for sure.
Elizabeth Losh: Agree with others that this example seems repetitive.
Lauren Klein: Thanks. We need to decide where it goes.
Elizabeth Losh: See earlier argument about not amplifying Kimmel.
Lauren Klein: Good point!
Elizabeth Losh: With the rise of new data science programs, it might be useful to look specifically at disparities in gender in this relatively recent field.
Elizabeth Losh: Good point about how data is read. You might want to think more about how to highlight data literacy issues for classes that teach the book. Obviously that is much of the work of the next chapter.
Margaret Pearce: See also projects combining crowdsourced accounts of massacres + archives: by 1) Lyndall Ryan at c21ch.newcastle.edu.au/colonialmassacres/map.php and 2) Judy Watson at namesofplaces.maps.arcgis.com/apps/MapJournal/index.html?appid=1fca23b6fd87494e8f98ff2e29c71b4b
Margaret Pearce: [removed my comment]
Margaret Pearce: I don’t understand what the “missing bodies” bar means in the “Who Is Missing?” graph…? And why bar sections don’t add up to 100%?
Margaret Pearce: Also, I would recommend sorting the bars from hi to lo, or lo to hi, so we can compare the heights relative to each other.
Ronald Morrison: as well as relinquishing an authoritative claim to “knowing”
Lauren Klein: This is key. Thank you!
Ronald Morrison: This is true but not universally; the politics, operational goals, and collector matter severely. Simone Browne’s work has been great for showing the ways that collection and representation through data have their own violences. Dorothy Roberts’s work also comes to mind.
Lauren Klein: Great point. We talk about this in the counting chapter but there is overlap with this discussion. At the least, we should figure out a way to cross-reference.
Ronald Morrison: Very crucial to call out. Especially appreciate the focus on the politics of collection.
Ronald Morrison: This is a nice key phrase.
Ronald Morrison: Very key quote from Collins, often overlooked. Thank you for including it.
Surya Mattu: Not sure if this is helpful at this stage of the process, but another aspect that may be relevant to the power conversation is the role of wealth. I think this plays out in two ways: (1) thinking of data privacy as a luxury that one needs to pay a premium for, and (2) the asymmetric relationship between those who generate data and those who get to monetize it. The example that came to mind was fitness trackers, in particular the way insurance companies are using them. John Hancock announced that they will no longer offer policies that don’t include fitness trackers. I thought of it because of the discussion on data collection and who it represents, and also imagined objectivity. This practice by health insurance companies clearly conflates what a fitness tracker can measure (accelerometer + some fancy algorithms) with a person’s health. It also makes customers pay extra for the premium of keeping their data private. For more context: https://www.bbc.com/news/technology-45590293 This chapter is already really polished and I don’t think you necessarily need to add this; it just felt worth mentioning.
Ronald Morrison: Yes, I generally very much agree with this statement, but working from the facial recognition example this feels a little disjointed. Had Buolamwini found that her face was originally recognized would the representational harm be elided, particularly when there is still such over policing and surveilling of black and brown bodies? Just want to be sure not to set the bar for equity in data by equal representation in being collected, which feels like a potential false equivalency.
Lauren Klein: Such a good point. Thank you.
Ronald Morrison: Yes!
Ronald Morrison: Can we please start a working group to envisioning some of this? Especially outside of the rote conversation on ethics within algorithmic programming.
Catherine D'Ignazio: YES!
Ronald Morrison: The creation of these proxies could be a nice tie in to the parallel creation of racialized imaginaries premised by the segregationist practices laid out earlier by redlining. George Lipsitz’s “How Racism Takes Place” lays this argument out well.
Ronald Morrison: In line with the earlier comment, not just about protecting financial resources but making explicit the material benefits of whiteness and a differentiation of value to be able to allow for white European assimilation into whiteness while preserving the hierarchy of racial capitalism. See Cedric Robinson.
Lauren Klein: The nod to Cedric Robinson is right on here. Thank you.
Ronald Morrison: I think it’s also important to mention some of the socio-cultural attachments to financial solvency embedded in such spatial segregationist practices. Particularly the ways that the racializing of space under redlining created wholly different imaginaries of space and identity, equating blackness with abjection (sequestered to the inner city) and a newly expanded category of post-war whiteness (attached to suburban American dreams and hegemonic notions of family and the deserving citizen).
Lauren Klein: Such a good point. Thank you!
Yanni Loukissas: Not quite sure what you mean here
Yanni Loukissas: Read “The Omnivore’s Neighborhood? Online restaurant reviews, race, and gentrification” for another example of how privilege is reproduced through data and how little technical skill is necessary to do so: https://journals.sagepub.com/doi/abs/10.1177/1469540515611203?journalCode=joca
Catherine D'Ignazio: Fascinating - I’ve never heard of “discursive redlining” before - thanks for this reference
Surya Mattu: A thing worth noting is that Equivant didn’t account for the fact that African-Americans are more likely to be arrested by the police regardless of whether they committed a crime or not. The system sort of makes an assumption that if you have been arrested you are probably at higher risk. This also reinforces existing racial inequality. Not sure if you need to add that nuance but posting here for posterity.
Catherine D'Ignazio: Super helpful nuance - thanks Surya!
Patricio Davila: It would be great to see more exploration of the uses of not being counted. This would prompt more thinking about who is being counted for whom and what purpose. Subaltern counterpublics may not want to be quantified by hegemonic actors. Visibility/legibility by the state or private sector may not be desired at all for liberation. How does data for counterpublics take form? I think of zines as poor vehicles for mass communication but often essential means for forming community.
Lauren Klein: Thanks so much for this. Curious if you read our “What Gets Counted” chapter and if so, what you thought.
Patricio Davila: This may be explored already in the Labour chapter but it would be great to see some more connections made between power, labour and participation. What are the relationships between researchers and the communities they seek to represent? Participatory mapping (e.g. like that done by Iconoclasistas) takes this head-on by facilitating the generation of data and its representation in maps.
Patricio Davila: I think this point (governance from the margins) may need more expansion in terms of power. Power, among other things, is participation in decision-making. Methods such as Participatory Action Research have made attempts to acknowledge and de-centre the power dynamic inherent in the researcher/designer – participant relationship. Participatory design methods have also made some attempts at addressing this issue of power in the design process.
Lauren Klein: Thanks for this comment.
Patricio Davila: You may be interested in looking at (or even referring to) this framing of “accomplices not allies”. It is an extension of the co-liberation logic.
Patricio Davila: http://www.indigenousaction.org/accomplices-not-allies-abolishing-the-ally-industrial-complex/
Nicole S.: by Immanuel Kant in the 18th century, which prioritizes abstract dilemmas,
Lauren Klein: Nerd note that we might want to update this to Rawls.
Nicole S.: But let's imagine for a moment you are a middle-class, straight, white, male US citizen.
Yanni Loukissas: One thought is to examine how race and gender are locally produced or characterized in specific data settings—and how privilege then arises in those settings—rather than treating these as objectively existing categories.
Nicole S.: I noticed that at some points in this book, God is written in capital letters, or sometimes the phenomenon is written as “God trick,” “god trick” God trick, or god trick. I suggest picking one way and sticking with it throughout the book.
Nicole S.: the structural domain, the disciplinary domain, the hegemonic domain, and the interpersonal domain.
Nicole S.: Which chapter is this? Or is this a reference to a chapter in another book? Either way, this is a bit unclear.
Ksenia Gueletina: Purely structural comment: might want to split this idea into two sentences; it took quite a few attempts to grasp the idea.
James Scott-Brown: The numbers that you present are for ‘Computer/Information Science’, which is quite a different category (e.g. it omits statistics, and includes areas of computer science other than data science and vis). There was a Diversity Panel at IEEE VIS 2017 (https://www.youtube.com/watch?v=U4hXNhth5sY), and a followup book is in preparation (in the Morgan & Claypool Synthesis Lectures on Visualization series). These might be good sources for statistics more tightly focused on ‘visualization’. Elijah Meeks’ survey of Data Visualization practitioners in industry (https://github.com/emeeks/data_visualization_survey) also gives some demographic information about diversity. A visualization of the 142 female respondents was longlisted for the Kantar Information is Beautiful Awards: https://www.informationisbeautifulawards.com/showcase/3269-the-women-of-data-viz
Lauren Klein: Thanks for these references.
Christopher Linzy: The negative connotations of conspired/co-conspirator suggest using a different word here. Perhaps “co-developed”, “co-revealed”, or “co-created”.
Lauren Klein: “Conspirators” does have its own tradition in resistance work, with positive connotations. In either case, Patricio’s references will help to build this out.
Christopher Linzy: I think this phrasing may inadvertently emphasize/agree with the “females are incapable of reason” line from Kant. I suggest adding emphasis that the exclusion is due to the assumption, not the reality, that women are less capable of reason.
Christopher Linzy: I assume this is a placeholder for the actual percentage, to be added later, but I figured I would tag it to be sure.
James Scott-Brown: Refer back to the earlier description of Joy’s work in Chapter 1.
Yanni Loukissas: Yeah, this reads a bit like a retelling of the story you introduced earlier.
James Scott-Brown: I don’t think that Figure 7.08b shows a dashboard: it is a map with an information panel that provides additional details of a selected feature. A dashboard would typically consist of a tiled layout of several small charts or numbers.
Patricio Davila: Agreed. It seems that this is the same map with the legend/info panel expanded or collapsed. The image for figure 7.08a may be the incorrect one.
James Scott-Brown: In this footnote you could state that employers with more than 50 employees are legally required to provide a space that is shielded from view (and which cannot be a bathroom) each time that an employee needs to express breast milk. https://www.dol.gov/whd/nursingmothers/faqBTNM.htm
James Scott-Brown: In Chapter 4, you quote Michael Kimmel as saying: "privilege is invisible to those who have it". However, in chapters 4, 7, and 8 you also quote this as "privilege is *blind* to those who have it": this is a comment that makes less sense (either the privilege is invisible, or people are blind to it).
Lauren Klein: Yes. This whole quote should be subbed with a better one.
James Scott-Brown: C.f. Cathy O’Neil’s term “math-washing”, which is more evocative of algorithms having the appearance of neutrality (but less evocative of the racial aspect).
Shannon Mattern: I love this. But how would you respond to someone who asks: “but how do you measure this?” :)
Shannon Mattern: Fantastic discussion of ethics + care in preceding paragraphs
Shannon Mattern: What are the politics of turning quantification back on those organizations that abuse it — or using quantification to critique its own misuse?
Shannon Mattern: This entire femicide data discussion is fantastic
Nicole S.: I agree fully!
Shannon Mattern: Maybe you could unpack this a bit. To what does he attribute this conservatism?
Shannon Mattern: Excellent