TY - GEN
T1 - Predicting crime reporting with decision trees and the National Crime Victimization Survey
AU - Gutierrez, Juliette
AU - Leroy, Gondy
PY - 2007
Y1 - 2007
AB - Law enforcement uses crime reports to find criminals, prevent further violations, identify the problems that cause crime, and allocate government resources. Unfortunately, many crimes go unreported, which may lead to an incorrect picture of crime and suboptimal responses to the existing situation. Our goal is to use a data mining approach to increase understanding of when crime is or is not reported. Such understanding could lead to new, more effective programs to fight crime or to changes in existing programs. We use the National Crime Victimization Survey (NCVS), which comprises data collected from 45,000 households about incidents, victims, suspects, and whether the incident was reported. We use decision trees to predict whether incidents are reported, and we compare trees built from domain knowledge with trees created automatically. For the automatically created trees, we compare three variable selection methods: two filters (chi-squared and Cramér's V coefficient) and a forward selection wrapper. We found that the automatically constructed decision trees are as accurate as those based on domain knowledge, although they show a different picture. We conclude that decision trees suggest several new hypotheses for criminologists and, because they are constructed automatically and are easy to understand, are practical and useful.
KW - Crime reporting
KW - Data mining
KW - Decision trees
KW - Filters
KW - Law enforcement
KW - National Crime Victimization Survey
KW - Wrappers
UR - http://www.scopus.com/inward/record.url?scp=84870174847&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84870174847&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84870174847
SN - 9781604233810
T3 - Association for Information Systems - 13th Americas Conference on Information Systems, AMCIS 2007: Reaching New Heights
SP - 586
EP - 595
BT - Association for Information Systems - 13th Americas Conference on Information Systems, AMCIS 2007
T2 - 13th Americas Conference on Information Systems, AMCIS 2007
Y2 - 10 August 2007 through 12 August 2007
ER -