Weapons of Math Destruction.
I can’t lay claim to the name; it comes from mathbabe.org. The author’s subtitle is “How Big Data Increases Inequality and Threatens Democracy”. You would think, based on that subtitle, that she doesn’t like Big Data. On the contrary, she likes it, but she argues that it needs to be used effectively.
For instance, in the United States (I’m not sure about Canada), prosecutors plug some data into an application and it tells them how many years the person should be sent away for their crime. In and of itself that isn’t a bad idea, as it would prevent kids from getting away with no jail time because their parents were rich and didn’t care about bringing them up with a proper sense of morals (“affluenza”). It could also stop people on various sports teams from getting much smaller sentences for their crimes because a conviction “could hurt them in the future”.
But the data that goes into the program is probably not the right data. For instance, poor people tend to live in poorer neighbourhoods because they just can’t afford to move. But because they live in poorer neighbourhoods, and because people in those neighbourhoods have a higher tendency to commit further crimes, the program automatically concludes that the person should be sent away for a longer period of time. So, two people commit the same crime, but one lives in a poor neighbourhood and the other in an affluent one, and the one living in the richer neighbourhood is going to get a lighter sentence. Or, even if you both live in affluent neighbourhoods, if your neighbours are less likely to commit crimes than those of the guy two blocks over, you will get a lighter sentence than he does.
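To make the neighbourhood effect concrete, here is a deliberately crude sketch. The function `toy_sentence_months`, its formula and all of the numbers are made up for illustration; this is not how any real sentencing tool works, only the shape of the problem described above.

```python
def toy_sentence_months(base_months, neighbourhood_reoffence_rate):
    """Hypothetical rule: stretch the base sentence by the neighbourhood's
    re-offence rate. Nothing about the individual is used at all."""
    return round(base_months * (1 + neighbourhood_reoffence_rate))

# Same crime, same base sentence; only the (assumed) neighbourhood rate differs.
poor_neighbourhood = toy_sentence_months(24, 0.40)
rich_neighbourhood = toy_sentence_months(24, 0.10)

print("poor neighbourhood:", poor_neighbourhood, "months")
print("rich neighbourhood:", rich_neighbourhood, "months")
```

The only input that differs between the two calls is where the person lives, yet the recommended sentence differs.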
As with the myth about WalMart, diapers and beer (men buying diapers usually buy beer at the same time), there needs to be an understanding of the difference between correlation and causation. Just because two things are correlated does not mean that one causes the other. People need to really think about the data and check whether there actually is a causal link between data points.
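A small simulation can show how two things end up correlated without either one causing the other. This is only a sketch with invented probabilities: a hidden factor `z` (hypothetical, think of it as anything that influences both behaviours) drives both `x` and `y`, while `x` and `y` never influence each other.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
rows = []
for _ in range(20000):
    z = rng.random() < 0.5            # hidden common cause
    p = 0.8 if z else 0.2             # z makes both x and y more likely
    x = 1 if rng.random() < p else 0  # x depends only on z
    y = 1 if rng.random() < p else 0  # y depends only on z, never on x
    rows.append((z, x, y))

def corr_for(subset):
    return pearson([x for _, x, _ in subset], [y for _, _, y in subset])

overall = corr_for(rows)
within_z0 = corr_for([r for r in rows if not r[0]])
within_z1 = corr_for([r for r in rows if r[0]])

print("overall correlation:", round(overall, 3))
print("correlation when z is False:", round(within_z0, 3))
print("correlation when z is True:", round(within_z1, 3))
```

Across the whole data set `x` and `y` look clearly related, but once you compare like with like (holding `z` fixed) the relationship all but disappears. The hard part in real data is that nobody hands you `z`.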
The author of Weapons of Math Destruction believes that Big Data can be used in constructive ways. For instance, in the case of a prisoner who is at a high risk for recidivism, why not target them with counselling and job training while in prison? For neighbourhoods with higher crime rates, put on more foot patrols so that people see the police as part of the neighbourhood. Instead of using Big Data to punish, look at Big Data to help. I know, it’s a heavy concept: helping people.
Even so, you may target someone for assistance, but are you targeting them for the right reasons? Once again, it comes back to causation, not correlation, and I think this is the biggest problem for data warehouses, Big Data repositories, the NSA storage of all your emails, literally anything involving a large amount of data. And this is where Data Scientists come in handy. Through statistical analysis, and sometimes just plain common sense, they weed out the relationships that are merely correlation and dig into the ones that are actually causal.
Data is important, but understanding what your data really says is more important.