In the wake of the 2008 banking crisis, Cathy O’Neil, a former Barnard College math professor turned hedge fund data scientist, realized that the algorithms she once believed would solve complex problems with pure logic were instead creating them at great speed and scale.
Now O’Neil—who goes by mathbabe on her popular blog and 11,000-follower Twitter account—works at bringing to light the dark side of Big Data: mathematical models that operate without transparency, without regulation, and—worst of all—without recourse if they’re wrong. She’s the founder of the Lede Program for Data Journalism at Columbia University, and her bestselling book, Weapons of Math Destruction (Crown, 2016), was long-listed for the 2016 National Book Award.
We asked O’Neil about creating accountability for mathematical models that businesses use to make critical decisions.
Q: If an algorithm applies rules equally across the board, how can the results be biased?
A: Algorithms aren’t inherently fair or trustworthy just because they’re mathematical. “Garbage in, garbage out” still holds.
There are many examples: On Wall Street, the mortgage-backed security algorithms failed because they were simply a lie. A program designed to assess teacher performance based only on test results fails because it’s just bad statistics; moreover, there’s much more to learning than testing. A tailored advertising startup I worked for created a system that served ads for things users wanted, but for-profit colleges used that same infrastructure to identify and prey on low-income single mothers who could ill afford useless degrees. Models in the justice system that recommend sentences and predict recidivism tend to be based on terribly biased policing data, particularly arrest records, so their predictions are often racially skewed.
Does bias have to be introduced deliberately for an algorithm to make skewed predictions?
No! Imagine that a company with a history of discriminating against women wants to get more women into the management pipeline and chooses to use a machine-learning algorithm to select potential hires more objectively. They train that algorithm with historical data about successful hires from the last 20 years, and they define successful hires as people they retained for 5 years and promoted at least twice.
They have great intentions. They aren’t trying to be biased; they’re trying to mitigate bias. But if they’re training the algorithm with past data from a time when they treated their female hires in ways that made it impossible for them to meet that specific definition of success, the algorithm will learn to filter women out of the current application pool, which is exactly what they didn’t want.
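The mechanism described here can be illustrated with a deliberately tiny sketch. It assumes a toy "model" that does nothing more than learn, per group, how often past hires met the company's definition of success. The numbers are made up for illustration, and `learn_success_rates` is a hypothetical helper, not any real hiring system; real models usually pick up the same pattern through proxy features rather than a gender column.

```python
from collections import defaultdict

def learn_success_rates(records):
    """Learn the historical 'success' rate for each group.

    records: iterable of (group, succeeded) pairs, where succeeded means
    "retained for 5 years and promoted at least twice".
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for group, succeeded in records:
        totals[group] += 1
        successes[group] += int(succeeded)
    return {group: successes[group] / totals[group] for group in totals}

# Illustrative, invented history: women were rarely retained and promoted,
# so few of them meet the chosen definition of success.
history = (
    [("man", True)] * 40 + [("man", False)] * 60
    + [("woman", True)] * 5 + [("woman", False)] * 95
)

rates = learn_success_rates(history)

# Ranking applicants by learned rate pushes women to the bottom,
# even though no line of this code was written with that intent.
ranking = sorted(rates, key=rates.get, reverse=True)
```

Nothing in the code mentions discrimination. The skew comes entirely from the historical labels, which is the point being made above.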
I’m not criticizing the concept of Big Data. I’m simply cautioning everyone to beware of oversized claims about and blind trust in mathematical models.
What safety nets can business leaders set up to counter bias that might be harmful to their business?
They need to ask questions about, and support processes for, evaluating the algorithms they plan to deploy. As a start, they should demand evidence that an algorithm works as they want it to, and if that evidence isn’t available, they shouldn’t deploy it. Otherwise they’re just automating their problems.
Once an algorithm is in place, organizations need to test whether their data models look fair in real life. For example, the company I mentioned earlier that wants to hire more women into its management pipeline could look at the proportion of women applying for a job before and after deploying the algorithm. If applications drop from 50% women to 25% women, that simple measurement is a sign something might be wrong and requires further checking.
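That before-and-after measurement can be sketched in a few lines. This is a minimal illustration, not a complete fairness audit: `PoolSnapshot`, `share_dropped`, and the 20% threshold are assumptions introduced here, and the counts simply mirror the 50%-to-25% example.

```python
from dataclasses import dataclass

@dataclass
class PoolSnapshot:
    """Counts for one measurement period (hypothetical structure)."""
    women: int
    total: int

    @property
    def share(self) -> float:
        if self.total <= 0:
            raise ValueError("total must be positive")
        return self.women / self.total

def share_dropped(before: PoolSnapshot, after: PoolSnapshot,
                  max_relative_drop: float = 0.2) -> bool:
    """Return True if women's share fell by more than max_relative_drop,
    measured relative to the share before deployment.

    The 0.2 default is an arbitrary illustrative threshold.
    """
    if before.share == 0:
        return False  # nothing to compare against
    relative_drop = (before.share - after.share) / before.share
    return relative_drop > max_relative_drop

before = PoolSnapshot(women=500, total=1000)  # 50% women before deployment
after = PoolSnapshot(women=250, total=1000)   # 25% women after deployment

needs_review = share_dropped(before, after)
```

A flag from a check like this is only a prompt for further investigation; it does not by itself show what caused the drop.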
Very few organizations build in processes to assess and improve their algorithms. One that does is Amazon: Every single step of its checkout experience is optimized, and if it suggests a product that I and people like me don’t like, the algorithm notices and stops showing it. It’s a productive feedback loop because Amazon pays attention to whether customers are actually taking the algorithm’s suggestions.
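The kind of feedback loop described here can be sketched as follows. This is a hypothetical illustration, not Amazon's actual system: the `SuggestionTracker` class and both thresholds are assumptions made for the example.

```python
from collections import defaultdict

class SuggestionTracker:
    """Track whether suggestions are accepted and retire the ones that are not."""

    def __init__(self, min_impressions: int = 20, min_accept_rate: float = 0.05):
        # Both thresholds are illustrative assumptions.
        self.min_impressions = min_impressions
        self.min_accept_rate = min_accept_rate
        self.shown = defaultdict(int)
        self.accepted = defaultdict(int)

    def record(self, item: str, accepted: bool) -> None:
        """Log one impression of a suggestion and whether it was taken."""
        self.shown[item] += 1
        if accepted:
            self.accepted[item] += 1

    def should_show(self, item: str) -> bool:
        """Keep showing an item until there is enough evidence it is unwanted."""
        shown = self.shown[item]
        if shown < self.min_impressions:
            return True  # not enough data yet
        return self.accepted[item] / shown >= self.min_accept_rate

tracker = SuggestionTracker()
for _ in range(20):
    tracker.record("unpopular-item", accepted=False)
for i in range(20):
    tracker.record("popular-item", accepted=(i % 2 == 0))
```

The design choice that matters is that outcomes are recorded at all: without the `record` step, the algorithm has no way to learn that its suggestions are being ignored.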
You repeatedly warn about the dangers of using machine learning to codify past mistakes, essentially, “If you do what you’ve always done, you’ll get what you’ve always gotten.” What is the greatest risk companies take when trusting their decision making to data models?
The greatest risk is to trust the data model itself not to expose you to risk, particularly legally actionable risk. Any time you’re considering using an algorithm under regulated conditions, like hiring, promotion, or surveillance, you absolutely must audit it for legality. This seems completely obvious; if it’s illegal to discriminate against people based on certain criteria, for example, you shouldn’t use an algorithm that does so! And yet companies often use discriminatory algorithms because it doesn’t occur to them to ask about it, or they don’t know the right questions to ask, or the vendor or developer hasn’t provided enough visibility into the algorithm for the question to be easily answered.
What are the ramifications for businesses if they persist in believing that data is neutral?
As more evidence comes out that poorly designed algorithms cause problems, I think that people who use them are going to be held accountable for bad outcomes. The era of plausible deniability for the results of using Big Data—that ability to say they were generated without your knowledge—is coming to an end. Right now, algorithm-based decision making is a few miles ahead of lawyers and regulations, but I don’t think that’s going to last. Regulators are already taking steps toward auditing algorithms for illegal properties.
Whenever you use an automated system, it generates a history of its use. If you use an algorithm that’s illegally biased, the evidence will be there in the form of an audit trail. This is a permanent record, and we need to think about our responsibility to ensure the algorithm is working well.