The Importance of Outliers

by John Hayes, GIS Applications Specialist and Jay Colbert, GIS Project Manager

When delivering data to large groups of people, it is critical that the information be as accurate as possible. At SAVI, we follow as many as five quality control (QC) steps to ensure that users can be confident in the numbers that they use in their grant applications, community assessments, and strategic planning efforts.

Part of the process includes outlier analysis, which helps us identify the values by year that differ the most from the other values for that indicator. Outliers are important for two reasons: 1) they may represent an error in the data that needs to be fixed; 2) they may represent the first indication of an important new trend.
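For readers who want to try this kind of check on their own data, here is a minimal sketch of one common way to flag years that stand apart from the rest of a series. It is not a description of SAVI's actual procedure, which this post does not spell out. The function name, the example values, and the 3.5 cutoff are all assumptions made for illustration, and the scoring method (a modified z-score based on the median and the median absolute deviation) is simply one reasonable choice.

```python
from statistics import median

def flag_outlier_years(values_by_year, threshold=3.5):
    """Return {year: score} for years whose value sits far from the series median.

    Uses a modified z-score built on the median absolute deviation (MAD),
    so a single extreme year cannot hide itself by inflating the spread.
    The threshold of 3.5 is a conventional rule of thumb, not a SAVI setting.
    """
    values = list(values_by_year.values())
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        # More than half the values are identical; this score is undefined.
        return {}
    flagged = {}
    for year, value in values_by_year.items():
        score = 0.6745 * (value - center) / mad
        if abs(score) > threshold:
            flagged[year] = round(score, 1)
    return flagged

# Made-up indicator values for illustration only (not SAVI data).
series = {2007: 5.1, 2008: 5.3, 2009: 5.0, 2010: 5.2, 2011: 9.8, 2012: 5.1}
print(flag_outlier_years(series))
```

A flagged year is only a prompt to look closer. As the examples in this post show, it may turn out to be a data error or the start of a real trend.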

Let’s look at some of the interesting things that have popped up during our data analysis.

This chart shows a data value that our outlier analysis flagged as unusual. 2011 showed a strong increase in the average number of hospitalization days due to injury and poisoning in Marion County. Further investigation into the raw data revealed the case of an unfortunate patient who had supposedly spent more than 40,000 days in the hospital for a broken arm. Obviously, this was a data error, and we were able to fix it.
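To see why a single bad record can move a county-wide average so much, consider the small sketch below. The lengths of stay are invented for illustration, and the 365-day sanity limit is an assumption, not a rule SAVI uses. The sketch also sets the suspect record aside rather than correcting it, which is a simplification of what we did with the real data.

```python
from statistics import mean, median

# Invented lengths of stay in days; the last entry mimics the data-entry error.
stays = [2, 3, 1, 4, 2, 5, 3, 40000]

# Assumed plausibility limit for one hospitalization (hypothetical value).
MAX_PLAUSIBLE_DAYS = 365

suspect = [d for d in stays if d > MAX_PLAUSIBLE_DAYS]
clean = [d for d in stays if d <= MAX_PLAUSIBLE_DAYS]

print("suspect records:", suspect)
print("mean with the error:", mean(stays))
print("mean without it:", round(mean(clean), 2))
print("median with the error:", median(stays))
```

The mean jumps by several orders of magnitude because of one record, while the median barely moves. That contrast is a quick hint that the raw records deserve a closer look.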


After reprocessing the data the chart looked like this…


Other finds are not so easily fixed. In the chart below we found a strong decrease in the number of mental health hospitalizations in Hendricks County in 2011. The data showed a large drop in mental health hospitalizations at one particular hospital in Hendricks County, but we are still working with the data provider to determine whether it is a real trend or a data error. In the interim, we have removed the suspect values from SAVI.


Another possible result in outlier analysis is the identification of new trends. Some examples follow.

One consequence of the recent recession is lower employment. The data highlights this trend in the case of Morgan County, where we can see that there were fewer males employed in 2010 than in 2000.


As Marion County continues to move towards a higher minority population, the data reveals that the white population has declined in recent years. The decrease is even stronger among children, as noted in the following chart.


Home loans in general decreased sharply in recent years with the housing downturn, and home improvement loans were no exception. The chart below illustrates the extent to which these loans fell in the Indianapolis-Carmel MSA. Notice, however, that there was a small increase in 2011, the first since 2006.


Juvenile runaways showed as a strong upward outlier in 2010, but by 2012, the numbers fell to the lowest level since 1999.


In recent years, juvenile arrests in Marion County have decreased dramatically.


While total juvenile arrests have been decreasing in recent years, juvenile arrests for gang-related activity increased sharply in 2011 and 2012. There are many possible reasons for this. One, as reported by WRTV, is that gang activity is expanding and spreading in Indianapolis and surrounding suburbs.


Which of these examples do YOU find most interesting?

