Copyright © 2016. All rights reserved.
Published in the United States by Crown, an imprint of the Crown Publishing Group, a division of Penguin Random House LLC.
CROWN is a registered trademark and the Crown colophon is a trademark of Penguin Random House LLC.
Library of Congress Cataloging-in-Publication Data
Name: O'Neil, Cathy, author.
Title: Weapons of math destruction: how big data increases inequality and threatens democracy. | First edition. | Crown Publishers
Identifiers: LCCN 2016003900 (print) | LCCN 2016016487 (ebook) | ISBN 9780553418811 (hardcover) | ISBN 9780553418835 (pbk.) | ISBN 9780553418828 (ebook)
Subjects: LCSH: Big data--Social aspects--United States. | Big data--Political aspects--United States. | Social indicators--Mathematical models--Moral and ethical aspects. | Democracy--United States. | United States--Social conditions--21st century.
Classification: LCC QA76.9.B45 064 2016 (print) | LCC QA76.9.B45 (ebook) | DDC 005.7--dc23
LC record available at https://lccn.loc.gov/2016003900
Ebook ISBN 9780553418828
International Edition ISBN 9780451497338
Cover design by
INTRODUCTION
CHAPTER 1 BOMB PARTS: What Is a Model?
CHAPTER 2 SHELL SHOCKED: My Journey of Disillusionment
CHAPTER 3 ARMS RACE: Going to College
CHAPTER 4 PROPAGANDA MACHINE: Online Advertising
CHAPTER 5 CIVILIAN CASUALTIES: Justice in the Age of Big Data
CHAPTER 6 INELIGIBLE TO SERVE: Getting a Job
CHAPTER 7 SWEATING BULLETS: On the Job
CHAPTER 8 COLLATERAL DAMAGE: Landing Credit
CHAPTER 9 NO SAFE ZONE: Getting Insurance
CHAPTER 10 THE TARGETED CITIZEN: Civic Life
The small city of Reading, Pennsylvania, has had a tough go of it in the postindustrial era. Nestled in the green hills fifty miles west of Philadelphia, Reading grew rich on railroads, steel, coal, and textiles. But in recent decades, with all of those industries in steep decline, the city has languished. By 2011, it had the highest poverty rate in the country, at 41.3 percent. (The following year, it was surpassed, if barely, by Detroit.) As the recession pummeled Reading's economy following the 2008 market crash, tax revenues fell, which led to a cut of forty-five officers in the police department, despite persistent crime.
Reading's police chief had to figure out how to get the same or better policing out of a smaller force. So in 2013 he invested in crime prediction software made by PredPol, a Big Data start-up based in Santa Cruz, California. The program processed historical crime data and calculated, hour by hour, where crimes were most likely to occur. The Reading policemen could view the program's conclusions as a series of squares, each one just the size of two football fields. If they spent more time patrolling these squares, there was a good chance they would discourage crime. And sure enough, a year later, Chief Heim announced that burglaries were down by 23 percent.
Predictive programs like PredPol are all the rage in budget-strapped police departments across the country. Departments from Atlanta to Los Angeles are deploying cops in the shifting squares and reporting falling crime rates. New York City uses a similar program, called CompStat. And Philadelphia police are using a local product called HunchLab that includes risk terrain analysis, which incorporates certain features, such as ATMs or convenience stores, that might attract crimes. Like those in the rest of the Big Data industry, the developers of crime prediction software are hurrying to incorporate any information that can boost the accuracy of their models.
If you think about it, hot-spot predictors are similar to the shifting defensive models in baseball that we discussed earlier. Those systems look at the history of each player's hits and then position fielders where the ball is most likely to travel. Crime prediction software carries out similar analysis, positioning cops where crimes appear most likely to occur. Both types of models optimize resources. But a number of the crime prediction models are more sophisticated, because they predict progressions that could lead to waves of crime. PredPol, for example, is based on seismic software: it looks at a crime in one area, incorporates it into historical patterns, and predicts when and where it might occur next. (One simple correlation it has found: if burglars hit your next-door neighbor's house, batten down the hatches.)
Predictive crime models like PredPol have their virtues. Unlike the crime-stoppers in the dystopian movie Minority Report (and some ominous real-life initiatives, which we'll get to shortly), the cops don't track down people before they commit crimes. The UCLA anthropology professor who founded PredPol stressed to me that the model is blind to race and ethnicity. And unlike other programs, including the recidivism risk models we discussed, which are used for sentencing guidelines, PredPol doesn't focus on the individual. Instead, it targets geography. The key inputs are the type and location of each crime and when it occurred. That seems fair enough. And if cops spend more time in the high-risk zones, foiling burglars and car thieves, there's good reason to believe that the community benefits.
But most crimes aren't as serious as burglary and grand theft auto, and that is where serious problems emerge. When police set up their PredPol system, they have a choice. They can focus exclusively on so-called Part 1 crimes. These are the violent crimes, including homicide, arson, and assault, which are usually reported to them. But they can also broaden the focus by including Part 2 crimes, including vagrancy, aggressive panhandling, and selling and consuming small quantities of drugs. Many of these "nuisance" crimes would go unrecorded if a cop weren't there to see them.
These nuisance crimes are endemic to many impoverished neighborhoods. In some places police call them antisocial behavior, or ASB. Unfortunately, including them in the model threatens to skew the analysis. Once the nuisance data flows into a predictive model, more police are drawn into those neighborhoods, where they're more likely to arrest more people. After all, even if their objective is to stop burglaries, murders, and rape, they're bound to have slow periods. It's the nature of patrolling. And if a patrolling cop sees a couple of kids who look no older than sixteen guzzling from a bottle in a brown bag, he stops them. These types of low-level crimes populate their models with more and more dots, and the models send the cops back to the same neighborhood.
This creates a pernicious feedback loop. The policing itself spawns new data, which justifies more policing. And our prisons fill up with hundreds of thousands of people found guilty of victimless crimes. Most of them come from impoverished neighborhoods, and most are black or Hispanic. So even if a model is color blind, the result of it is anything but. In our largely segregated cities, geography is a highly effective proxy for race.
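The loop can be made concrete with a toy simulation. This is only a sketch under loud assumptions, not PredPol's actual method: the two neighborhoods, the 7-to-3 patrol split, and the one-record-per-shift rate are all hypothetical values chosen for illustration. Both neighborhoods have exactly the same underlying rate of nuisance offenses. The only difference is a slightly higher starting count of recorded incidents in neighborhood A.

```python
def simulate_feedback(initial_records, rounds=10, hot_shifts=7, other_shifts=3,
                      recorded_per_shift=1):
    """Toy model of two neighborhoods with the SAME true nuisance-offense rate.

    Every value here is a hypothetical assumption. An offense is recorded
    only when a patrol is present to see it, so records track patrol
    presence rather than underlying behavior.
    """
    records = list(initial_records)
    history = [tuple(records)]
    for _ in range(rounds):
        # The "model" flags whichever neighborhood has more recorded incidents.
        hot = 0 if records[0] >= records[1] else 1
        for i in range(2):
            # The flagged neighborhood receives more patrol shifts.
            shifts = hot_shifts if i == hot else other_shifts
            # Each shift records the same number of offenses in either place.
            records[i] += shifts * recorded_per_shift
        history.append(tuple(records))
    return history


history = simulate_feedback((52, 48))
for round_number, (a, b) in enumerate(history):
    print(f"round {round_number:2d}: A={a:3d}  B={b:3d}  gap={a - b}")
```

Starting from recorded counts of 52 and 48, the gap widens every round and reaches 122 versus 78 after ten rounds, even though behavior in the two neighborhoods never differed. The extra records in A come entirely from the extra patrols sent there.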
If the purpose of the models is to prevent serious crimes, you might ask why nuisance crimes are tracked at all. The answer is that the link between antisocial behavior and crime has been an article of faith since 1982, when a criminologist named Kelling teamed up with a public policy expert, Wilson, to write a seminal article in the Atlantic Monthly on so-called broken-windows policing. The idea was that low-level crimes and misdemeanors created an atmosphere of disorder in a neighborhood. This scared law-abiding citizens away. The dark and empty streets they left behind were breeding grounds for serious crime. The antidote was for society to resist the spread of disorder. This included fixing broken windows, cleaning up graffiti-covered subway cars, and taking steps to discourage nuisance crimes.
This thinking led in the 1990s to zero-tolerance campaigns, most famously in New York City. Cops would arrest kids for jumping the subway turnstiles. They'd apprehend people caught sharing a single joint and rumble them around the city in a paddy wagon for hours before eventually booking them. Some credited these energetic campaigns for dramatic falls in violent crimes. Others disagreed. The authors of the bestselling book Freakonomics went so far as to correlate the drop in crime to the legalization of abortion in the 1970s. And plenty of other theories also surfaced, ranging from the falling rates of crack cocaine addiction to the booming 1990s economy. In any case, the zero-tolerance movement gained broad support, and the criminal justice system sent millions of mostly young minority men to prison, many of them for minor offenses.
But zero tolerance actually had very little to do with Kelling and Wilson's "broken-windows" thesis. Their case study focused on what appeared to be a successful policing initiative in Newark. Cops who walked the beat there, according to the program, were supposed to be highly tolerant. Their job was to adjust to the neighborhood's own standards of order and to help uphold them. Standards varied from one part of the city to another. In one neighborhood, it might mean that drunks had to keep their bottles in bags and avoid major streets but that side streets were okay. Addicts could sit on stoops but not lie down. The idea was only to make sure the standards didn't fall. The cops, in this scheme, were helping a neighborhood maintain its own order but not imposing their own.
You might think I'm straying a bit from PredPol, mathematics, and WMDs. But each policing approach, from broken windows to zero tolerance, represents a model. Just like my meal planning or the U.S. News Top College ranking, each crime-fighting model calls for certain input data, followed by a series of responses, and each is calibrated to achieve an objective. It's important to look at policing this way, because these mathematical models now dominate law enforcement. And some of them are WMDs.
That said, we can understand why police departments would choose to include nuisance data. Raised on the orthodoxy of zero tolerance, many have little more reason to doubt the link between small crimes and big ones than the correlation between smoke and fire. When police in the English county of Kent tried out PredPol, in 2013, they incorporated nuisance crime data into their model. It seemed to work. They found that the PredPol squares were ten times as efficient as random patrolling and twice as precise as analysis delivered by police intelligence. And what type of crimes did the model best predict? Nuisance crimes. This makes all the sense in the world. A drunk will pee on the same wall, day in and day out, and a junkie will stretch out on the same park bench, while a car thief or a burglar will move about, working hard to anticipate the movements of police.
Even as police chiefs stress the battle against violent crime, it would take remarkable restraint not to let loads of nuisance data flow into their predictive models. More data, it's easy to believe, is better data. While a model focusing only on violent crimes might produce a sparse constellation on the screen, the inclusion of nuisance data would create a fuller and more vivid portrait of lawlessness in the city.
And in most jurisdictions, sadly, such a crime map would track poverty. The high number of arrests in those areas would do nothing but confirm the broadly shared thesis of society's middle and upper classes: that poor people are responsible for their own shortcomings and commit most of a city's crimes.
But what if police looked for different kinds of crimes? That may sound counterintuitive, because most of us, including the police, view crime as a pyramid. At the top is homicide. It's followed by rape and assault, which are more common, and then shoplifting, petty fraud, and even parking violations, which happen all the time. Prioritizing the crimes at the top of the pyramid makes sense. Minimizing violent crime, most would agree, is and should be a central part of a police force's mission.
But how about crimes far removed from the boxes on the PredPol maps, the ones carried out by the rich? In the 2000s, the kings of finance threw themselves a lavish party. They lied, they bet billions against their own customers, they committed fraud and paid off rating agencies. Enormous crimes were committed there, and the result devastated the global economy for the best part of five years. Millions of people lost their homes, jobs, and health care.
We have every reason to believe that more such crimes are occurring in finance right now. If we've learned anything, it's that the driving goal of the finance world is to make a huge profit, the bigger the better, and that anything resembling self-regulation is worthless. Thanks largely to the industry's wealth and powerful lobbies, finance is underpoliced.
Just imagine if police enforced their zero-tolerance strategy in finance. They would arrest people for even the slightest infraction, whether it was chiseling investors on 401(k)s, providing misleading guidance, or committing petty frauds. Perhaps SWAT teams would descend on Greenwich, Connecticut. They'd go undercover in the taverns around Chicago's Mercantile Exchange.
Not likely, of course. The cops don't have the expertise for that kind of work. Everything about their jobs, from their training to their bullet-proof vests, is adapted to the mean streets. Clamping down on white-collar crime would require people with different tools and skills. The small and underfunded teams who handle that work, from the FBI to investigators at the Securities and Exchange Commission, have learned through the decades that bankers are virtually invulnerable. They spend heavily on our politicians, which always helps, and are also viewed as crucial to our economy. That protects them. If their banks go south, our economy could go with them. (The poor have no such argument.) So except for a couple of criminal outliers, such as the occasional Ponzi-scheme master, financiers don't get arrested. As a group, they made it through the 2008 market crash practically unscathed. What could ever burn them now?
My point is that police make choices about where they direct their attention. Today they focus almost exclusively on the poor. That's their heritage, and their mission, as they understand it. And now data scientists are stitching this status quo of the social order into models, like PredPol, that hold ever-greater sway over our lives.
The result is that while PredPol delivers a perfectly useful and even high-minded software tool, it is also a do-it-yourself WMD. In this sense, PredPol, even with the best of intentions, empowers police departments to zero in on the poor, stopping more of them, arresting a portion of those, and sending a subgroup to prison. And the police chiefs, in many cases, if not most, think that they're taking the only sensible route to combating crime. That's where it is, they say, pointing to the highlighted ghetto on the map. And now they have cutting-edge technology (powered by Big Data) reinforcing their position there, while adding precision and "science" to the process.
The result is that we criminalize poverty, believing all the while that our tools are not only scientific but fair.
One weekend in the spring of 2011, I attended a data "hackathon" in New York City. The goal of such events is to bring together hackers, nerds, mathematicians, and software geeks and to mobilize this brainpower to shine light on the digital systems that wield so much power in our lives. I was paired up with the New York Civil Liberties Union, and our job was to break out the data on one of the NYPD's major anticrime policies, so-called stop, question, and frisk. Known simply as stop and frisk to most people, the practice had drastically increased in the data-driven age of CompStat.
The police regarded stop and frisk as a filtering device for crime. The idea is simple. Police officers stop people who look suspicious to them. It could be the way they're walking or dressed, or their tattoos. The police talk to them and size them up, often while they're spread-eagled against a wall or the hood of a car. They ask for their ID, and they frisk them. Stop enough people, the thinking goes, and you'll no doubt stop loads of petty crimes, and perhaps some big ones. The policy, implemented by Mayor Michael Bloomberg's administration, had loads of public support. Over the previous decade, the number of stops had risen by 600 percent, to nearly seven hundred thousand incidents. The great majority of those stopped were innocent. For them, these encounters were highly unpleasant, even infuriating. Yet many in the public associated the program with the sharp decline of crime in the city. The city, many felt, was safer. And statistics indicated as much. Homicides, which had reached 2,245 in 1990, were down to 515 (and would drop below 400 by 2014).
Everyone knew that an outsized proportion of the people the police stopped were young, dark-skinned men. But how many did they stop? And how often did these encounters lead to arrests or stop crimes? While this information was technically public, much of it was stored in a database that was hard to access. The software didn't work on our computers or flow into Excel spreadsheets. Our job at the hackathon was to break open that program and free the data so that we could all analyze the nature and effectiveness of the stop-and-frisk program.
What we found, to no great surprise, was that an overwhelming majority of these encounters, about 85 percent, involved young African American or Latino men. In certain neighborhoods, many of them were stopped repeatedly. Only 0.1 percent, or one of one thousand stopped, was linked in any way to a violent crime. Yet this filter captured many others for lesser crimes, from drug possession to underage drinking, that might have otherwise gone undiscovered. Some of the targets, as you might expect, got angry, and a good number of those found themselves charged with resisting arrest.
The NYCLU sued the Bloomberg administration, charging that the stop-and-frisk policy was racist. It was an example of uneven policing, one that pushed more minorities into the criminal justice system and into prison. Black men, they argued, were six times more likely to be incarcerated than white men and twenty-one times more likely to be killed by police, at least according to the available data (which is famously underreported).
Stop and frisk isn't exactly a WMD, because it relies on human judgment and is not formalized into an algorithm. But it is built upon a simple and destructive calculation. If police stop one thousand people in certain neighborhoods, they'll uncover, on average, one significant suspect and lots of smaller ones. This isn't so different from the long-shot calculations used by predatory advertisers or spammers. Even when the hit ratio is minuscule, if you give yourself enough chances you'll reach your target. And that helps to explain why the program grew so dramatically under Bloomberg's watch. If stopping six times as many people led to six times the number of arrests, the inconvenience and harassment suffered by thousands upon thousands of innocent people was justified. Weren't they interested in stopping crime?
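The arithmetic behind that calculation is easy to check. The sketch below uses only the figures quoted in this chapter: a hit rate of one in one thousand for stops linked to a violent crime, and roughly seven hundred thousand stops. The round figure of 700,000 and the function name are assumptions made for illustration, and the result is an expected count at a constant hit rate, not a number taken from the NYPD data.

```python
def expected_linked_stops(stops, hits_per_thousand=1):
    """Expected number of stops linked to a violent crime at a fixed hit rate.

    Integer arithmetic keeps the result exact. The default of 1 per 1,000
    is the 0.1 percent figure quoted in the text.
    """
    return stops * hits_per_thousand // 1000


stops = 700_000  # assumption: "nearly seven hundred thousand" rounded up
linked = expected_linked_stops(stops)
not_linked = stops - linked

print(f"stops: {stops:,}")
print(f"linked to a violent crime: {linked:,}")
print(f"not linked to a violent crime: {not_linked:,}")
```

At that rate, 700,000 stops yield about 700 linked to a violent crime and 699,300 that are not. Because the hit rate is constant, multiplying the stops by six multiplies both numbers by six, so the count of people stopped without any such link grows about a thousand times faster than the count of significant suspects.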
Aspects of stop and frisk were similar to WMDs, though. For example, it had a nasty feedback loop. It ensnared thousands of black and Latino men, many of them for committing the petty crimes and misdemeanors that go on in college frats, unpunished, every Saturday night. But while the great majority of university students were free to sleep off their excesses, the victims of stop and frisk were boo