Date of Award
Honors College Thesis
Jacob Chapman, Ph.D.
Bernd Schroeder, Ph.D.
Sabine Heinhorst, Ph.D.
Insurance companies lose billions of dollars to fraud. Large monetary losses force insurance companies to increase premiums and/or restrict policies, which negatively affects a company's loyal customers. Although this is a prevalent problem, companies are not urgently working toward bettering their machine learning algorithms. Underskilled workers paired with inefficient computer algorithms make it difficult to accurately and reliably detect fraud.
The goal of this study is to understand the idea of k-Nearest Neighbors (k-NN) and to use this classification technique to accurately detect fraudulent auto insurance claims. Using k-NN requires choosing a k value and a distance metric. The best choice of k value and distance metric will be unique to every dataset. This study aims to break down the processes involved in determining an accurate k value and distance metric for a sample auto insurance claims dataset. Odd k values 1 through 19 and the Euclidean, Manhattan, Chebyshev, and Hassanat metrics are analyzed using Excel and R.
Results support the idea that the best k value and distance metric depend on the dataset being analyzed.
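To make the tuning procedure described above concrete, the following is a minimal Python sketch, not the thesis's Excel and R workflow. It implements the four named distance metrics and a k-NN majority vote, then scores each (metric, k) pair. Several parts are assumptions made for illustration: the function names, the use of leave-one-out accuracy as the scoring rule (the abstract does not say how accuracy was measured), and the Hassanat formula, which is written from the commonly cited per-dimension definition and should be checked against the original reference before reuse.

```python
import math
from collections import Counter


def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    # Sum of absolute coordinate differences.
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    # Largest absolute coordinate difference.
    return max(abs(x - y) for x, y in zip(a, b))


def hassanat(a, b):
    # Assumed definition: each dimension contributes a bounded value in [0, 1),
    # with a shift by |min| when the smaller coordinate is negative.
    total = 0.0
    for x, y in zip(a, b):
        lo, hi = min(x, y), max(x, y)
        if lo >= 0:
            total += 1 - (1 + lo) / (1 + hi)
        else:
            total += 1 - (1 + lo + abs(lo)) / (1 + hi + abs(lo))
    return total


METRICS = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "chebyshev": chebyshev,
    "hassanat": hassanat,
}


def knn_predict(train_X, train_y, query, k, metric):
    # Sort training points by distance to the query and take a majority vote
    # among the k closest labels.
    ranked = sorted(zip(train_X, train_y), key=lambda pair: metric(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def loo_accuracy(X, y, k, metric):
    # Leave-one-out accuracy (an assumed scoring rule): classify each point
    # using all the other points as training data.
    correct = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        if knn_predict(rest_X, rest_y, X[i], k, metric) == y[i]:
            correct += 1
    return correct / len(X)


def grid_search(X, y, ks, metrics=METRICS):
    # Score every (metric name, k) combination.
    return {
        (name, k): loo_accuracy(X, y, k, fn)
        for name, fn in metrics.items()
        for k in ks
    }
```

With a list of numeric feature tuples `X` and a matching list of labels `y`, a call such as `grid_search(X, y, range(1, 20, 2))` covers the odd k values 1 through 19 for all four metrics, and the best pair can be read off with `max(scores, key=scores.get)`. Using odd k values avoids tied votes when there are only two classes, such as fraudulent and legitimate.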
Keywords: machine learning, insurance, fraud, detection, k-NN, distance
Copyright for this thesis is owned by the author. It may be freely accessed by all users. However, any reuse or reproduction not covered by the exceptions of the Fair Use or Educational Use clauses of U.S. Copyright Law or without permission of the copyright holder may be a violation of federal law. Contact the administrator if you have additional questions.
Stout, Alliyah, "Fine-Tuning a 𝑘-Nearest Neighbors Machine Learning Model for the Detection of Insurance Fraud" (2022). Honors Theses. 863.