Big Data Reveals Crime Hot Spots

Photo: iStockphoto

San Francisco, the lovely City by the Bay, is generally a safe place to live, especially when compared with other cities of similar size. Homicides are about half of what they were a few years ago, and property crimes have continued to decline as well. With a booming tourist economy and a resurgence in tech start-ups, it is no wonder that more and more people are leaving their hearts in San Francisco.

Problem solving with SAP HANA

However, what appears to be a completely positive picture at the aggregate level turns out to be slightly less so when put under the lens of big data analytics. And that is exactly what a group of students from KIPP King High School in San Lorenzo, California, had the opportunity to do. They were working with SAP as part of a program called SAP HANA for Humanity, which enables students to work with data from non-profits or government agencies and help solve some of humanity’s most pressing problems using the SAP HANA in-memory database platform.

They analyzed around 1.5 million crime records from 2003 to 2012 that the city has made publicly available, and overlaid additional data sets (such as the location of liquor stores, the presence of graffiti, and the presence of a police precinct in the neighborhood) to get a more comprehensive picture of the crime situation in San Francisco. Once geo-location was taken into account, several hundred million data points were crunched to reveal the true face of crime and public safety in one of the most iconic cities on the planet.
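To make the overlay idea concrete, here is a minimal sketch of one way to count crime records that fall within a given radius of a point of interest such as a liquor store. It is not the students' SAP HANA implementation, which the article does not describe; the function names, the haversine distance method, and the coordinates are all assumptions made up for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def count_within_radius(center, incidents, radius_miles):
    """Count incidents whose location lies within radius_miles of center."""
    return sum(
        1
        for lat, lon in incidents
        if haversine_miles(center[0], center[1], lat, lon) <= radius_miles
    )


# Hypothetical coordinates, invented for this example (not real crime data).
store = (37.7800, -122.4100)
incidents = [
    (37.7805, -122.4105),  # a few hundred feet away
    (37.7810, -122.4090),  # roughly a tenth of a mile away
    (37.8000, -122.4400),  # a couple of miles away
]

nearby = count_within_radius(store, incidents, radius_miles=0.2)
print(f"Incidents within 0.2 miles of the store: {nearby}")
```

At the scale of 1.5 million records, a spatial index or an in-database geospatial query would likely replace this brute-force loop; the sketch only shows the underlying distance test.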

What the students discovered was at the same time instructive, fascinating, mind-boggling, and downright scary. For example:

  • The 800 block of Bryant Street has had over 30,000 documented instances of crime in the last 10 years, which translates to about one every 3 hours. From a pocket picked to a business burglarized to a hapless tourist assaulted, the list goes on and on. And these are only the crimes that were actually reported to the police department. (In case you are wondering, the next spot on the list belongs to the 800 block of Market Street, which has had about 7,000 crimes, or one roughly every 12 hours.)
  • There are 10 liquor stores that have a vast majority of crime happening within a mere 0.2-mile radius of them. Interestingly, those stores are all within a 2-mile radius of each other (which, given the level of crime in that area, may make you wonder whether this is mere correlation or whether causality can be established).
  • Crime against property remains an ongoing problem. Though violent crime appears to be under control, larceny/theft and vehicle theft account for 27% of the city’s reported crime incidents. (No surprise, then, that the cost of vehicle ownership and insurance is so high in the city.)
  • San Francisco is a sports-loving city, but every time there is a home game of the 49ers or the Giants, there is a 20% spike in crime. Fascinatingly, criminals too appear to have a notion of a workweek, with weekend games resulting in much larger crime spikes than games held on a Monday. Not only that, crime levels varied depending on the opposing team, and a win for the home team typically generated more crime than a loss or a tie.
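The "one every 3 hours" and "every 12 hours" figures above follow directly from the incident counts. As a quick check, the sketch below converts a count over a span of years into an average interval between incidents; the function name and the 365.25-day year are assumptions chosen for illustration, and the counts are the rounded figures quoted in the list.

```python
HOURS_PER_YEAR = 365.25 * 24  # average year length, including leap years


def hours_between_incidents(incident_count, years):
    """Average number of hours between incidents over the given span."""
    return years * HOURS_PER_YEAR / incident_count


# Rounded counts as quoted in the article, over a 10-year span.
bryant = hours_between_incidents(30_000, 10)
market = hours_between_incidents(7_000, 10)

print(f"800 block of Bryant Street: one incident every {bryant:.1f} hours")
print(f"800 block of Market Street: one incident every {market:.1f} hours")
```

Both results land close to the rounded intervals quoted in the list, slightly under 3 hours for Bryant Street and slightly over 12 hours for Market Street.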

In the end, the students found themselves dealing with public data of limited depth. For example, they did not know the identity of the perpetrator, his or her demographic information, prior arrest and conviction records, and so on, all of which would have contributed greatly to developing deeper insights that the city could use to make public policy changes that positively impact the daily life of its residents.

Big data analytics and the neighborhood crime rate

What the students did discover, however, was that big data technologies like SAP HANA can have a real and meaningful impact at a very practical level, and that analytics can reveal layers that would otherwise stay hidden at an aggregate level. For example, knowing which neighborhoods to police at what time of day, determining the optimal route for a police patrol car, ensuring that broken street lamps are promptly fixed and graffiti is cleaned up before it begins to fester: these are all small actions that city hall can take to ensure that San Francisco remains not just an iconic landmark but a safe place to call home for its 800,000 residents.