Colloquium Talk - Cristian Bravo
Room: 248
The Value of Data Variety in Fraud Analytics
Cristian Bravo, University of Southampton, UK
Among the five V’s of Big Data, Variety, or the use of multiple data types, is a key component that opens the door to improved analytics models and new business insights. In this talk, I will focus on the impact of two new data sources, social network data and text-based information, on the problem of fraud detection in the financial industry. The first data type appears naturally in credit card transactions, where merchants and buyers form a social network that provides an uncorrelated and highly useful source of information for isolating the behaviour of fraudsters. By using algorithms such as the one used by Google for ranking websites, we reached accuracies of 99.9% and obtained new information that helps prevent fraud altogether. The second source of information is common in the car insurance industry, where fraud represents between one and five percent of all transactions. By studying written claims and redefining the use of sentiment analysis to detect the writing style and composition of the declarations, we increased the detection rate of fraudulent claims threefold. For each application, this talk will present the models, the challenges, and the main results and business insights we obtained.