The data science industry is lined with biases and practices that devalue the presence and successes of women and people of color.
Like the tech industry in general, the field of data science has a problem: Research suggests only 15 percent of data scientists are women, and fewer than 3 percent are women of color. For a long time, experts have described this as a “leaky pipeline” problem: From middle school through graduate school, girls and women are said to “leak” out of computer science, math and other fields that typically lead to careers in data science.
Women who have succeeded in these fields think otherwise. The pipeline isn’t leaky so much as it’s toxic, they say, lined with practices that can devalue the presence and successes of women and people of color.
With degrees in both mathematics and computer science, Dr. Fatima Abu Salem, a computer science professor and data scientist in Beirut, Lebanon, has faced these problems head-on. According to Abu Salem, sexism pervades data science, shaping how work is assessed, how results are perceived, and which projects are prioritized over others.
In academia, research shows widespread evidence of systemic bias against women and people of color. Fifty percent of women in STEM report they have experienced discrimination on the job, compared with 41 percent of women in other fields who report workplace prejudice.
Abu Salem says male colleagues often offer editorial comments on her writing as they review her research articles: “They do not debate the content; they debate the way I express the ideas.”
Abu Salem says reviewers have criticized her writing as “flowery,” “dramatic” and inappropriate for communicating about data. In doing so, they repeat a long history of women’s perspectives and research not being considered sufficiently objective. “Because there haven’t been enough women, men set the standards for us, and we have to abide,” Abu Salem said.
According to Abu Salem and other women in the industry, data science needs diversity to offer fresh, innovative ideas to research and development. Women often focus on projects oriented toward the public good rather than those primarily intended to generate profit.
As Dr. Margot Gerritsen, a professor in the Department of Energy Resources Engineering at Stanford University and co-founder of the Women in Data Science Initiative (WiDS), recently put it, “Every time you design a product, you reflect the culture in which it was created.” When that culture is lacking in diversity, it designs products and projects that are limited in their scope and reach.
Abu Salem often invites students to join her research team, which focuses on research for the public good. She’s noticed a pattern.
“I throw these topics out every year, and it’s hardly ever a group of male students who comes forward,” she said. “They’re interested in self-driving cars and back-end, front-end design and data engineering.”
In contrast, women working in data science often focus on tackling social issues that don’t come with the money and prestige that other data science research does. Some of Abu Salem’s most recent work in artificial intelligence, for example, explores dementia and birth defect prediction, fake news and bias detection, and predictions of healthcare demand and refugee mobility.
Essentially, Abu Salem says the way data science works now sets women up for failure: “The reality is, yeah, they’re not good enough for that kind of system because it’s not meant for them.”
The industry’s failure to diversify, along with the education students receive in math and computer science, creates an environment where diversity is discouraged. Abu Salem says it’s no wonder women steer clear of data science.
“I don’t think people realize how serious it is for women scientists to have to come and put on all the stylistic attitudes of how men work,” she said. “This alienates a lot of women who can’t identify with the rules of this game.”
If data is going to serve a diverse range of citizens and consumers rather than a small subset, it’s imperative that the rules of the game change.