Unique Presentation Identifier:

V06

Program Type

Graduate

Faculty Advisor

Dr. Tolga Ensari

Document Type

Presentation


Location

Online

Start Date

29-4-2025 8:00 AM

Abstract

Financial fraud detection systems increasingly rely on machine learning models, yet their vulnerability to data poisoning attacks remains underexplored in real-world financial settings. This study aims to systematically evaluate how malicious data poisoning, through label flipping and feature manipulation, affects the performance of widely used fraud detection models. Focusing on Logistic Regression, Random Forest, and Isolation Forest models, we investigate two key aspects: (1) the existence of a pollution threshold, the critical proportion of poisoned data beyond which model performance degrades sharply, and (2) the feature-level sensitivity of these models, identifying which financial transaction features are most vulnerable to poisoning-based manipulations. Using a publicly available credit card fraud dataset, we will conduct controlled experiments under varying poisoning intensities and attack patterns. Preliminary defenses, such as simple anomaly-based data filtering, will also be evaluated for their effectiveness. The expected findings will not only quantify the risks posed by data poisoning but also provide actionable insights on model selection, feature monitoring, and lightweight defense strategies for improving the robustness of financial fraud detection systems.

Keywords: Data Poisoning, Financial Fraud Detection, Pollution Threshold, Feature Sensitivity, Machine Learning, Robustness
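
The label-flipping attack and the pollution threshold described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the exact procedure, so everything here is an assumption made for illustration: the function names `flip_labels` and `pollution_threshold` are hypothetical, labels are flipped uniformly at random, and the threshold is taken to be the smallest poisoning rate at which a chosen score falls more than `max_drop` (arbitrarily 0.05) below the score on clean data.

```python
import random


def flip_labels(labels, rate, seed=0):
    """Return a copy of binary labels with a fraction `rate` flipped (0 <-> 1).

    Assumption: flips are chosen uniformly at random; a targeted attack
    (e.g. flipping only fraud labels) would select indices differently.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)
    n_flip = round(rate * len(labels))
    flip_idx = rng.sample(range(len(labels)), n_flip)
    poisoned = list(labels)
    for i in flip_idx:
        poisoned[i] = 1 - poisoned[i]
    return poisoned


def pollution_threshold(scores_by_rate, max_drop=0.05):
    """Return the smallest poisoning rate whose score is more than
    `max_drop` below the clean (rate 0.0) score, or None if no rate is.

    `scores_by_rate` maps poisoning rate -> evaluation score and must
    contain the key 0.0. The 0.05 default is an arbitrary illustrative
    tolerance, not a value taken from the study.
    """
    baseline = scores_by_rate[0.0]
    for rate in sorted(scores_by_rate):
        if baseline - scores_by_rate[rate] > max_drop:
            return rate
    return None
```

In an experiment of the kind the abstract outlines, one would call `flip_labels` on the training labels at each poisoning rate, retrain each model, record a score such as recall on clean test data, and pass the resulting mapping to `pollution_threshold`.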


Understanding the Impact of Data Poisoning on Financial Fraud Detection: Thresholds and Feature Sensitivity
