Unique Presentation Identifier
V06
Program Type
Graduate
Faculty Advisor
Dr. Tolga Ensari
Document Type
Presentation
Location
Online
Start Date
29-4-2025 8:00 AM
Abstract
Financial fraud detection systems increasingly rely on machine learning models, yet their vulnerability to data poisoning attacks remains underexplored in real-world financial settings. This study aims to systematically evaluate how data poisoning, through label flipping and feature manipulation, affects the performance of widely used fraud detection models. Focusing on Logistic Regression, Random Forest, and Isolation Forest, we investigate two aspects: (1) whether a pollution threshold exists, that is, a critical proportion of poisoned data beyond which model performance degrades sharply, and (2) the feature-level sensitivity of these models, that is, which financial transaction features are most vulnerable to poisoning-based manipulation. Using a publicly available credit card fraud dataset, we will conduct controlled experiments under varying poisoning intensities and attack patterns. Preliminary defenses, such as simple anomaly-based data filtering, will also be evaluated for effectiveness. The expected findings will quantify the risks posed by data poisoning and provide actionable guidance on model selection, feature monitoring, and lightweight defense strategies for improving the robustness of financial fraud detection systems.
Keywords: Data Poisoning, Financial Fraud Detection, Pollution Threshold, Feature Sensitivity, Machine Learning, Robustness
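To make the two central ideas in the abstract concrete, the following is a minimal sketch of label flipping and of one possible way to read a pollution threshold off a performance curve. It is not the authors' implementation. The function names `flip_labels` and `find_threshold`, the `max_drop` tolerance, and the choice to define the threshold relative to the clean baseline score are all assumptions made for illustration.

```python
import random
from typing import List, Optional, Sequence, Tuple


def flip_labels(labels: Sequence[int], rate: float, seed: int = 0) -> List[int]:
    """Return a copy of binary (0/1) labels with a fraction `rate` flipped.

    Hypothetical helper: simulates a label-flipping poisoning attack
    on a training set without modifying the original labels.
    """
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be in [0, 1]")
    rng = random.Random(seed)
    poisoned = list(labels)
    n_flip = round(rate * len(poisoned))
    # Sample distinct positions so that exactly n_flip labels change.
    for i in rng.sample(range(len(poisoned)), n_flip):
        poisoned[i] = 1 - poisoned[i]
    return poisoned


def find_threshold(
    curve: Sequence[Tuple[float, float]], max_drop: float
) -> Optional[float]:
    """Return the first poisoning rate whose score is more than `max_drop`
    below the clean baseline, or None if no rate crosses that tolerance.

    Assumes `curve` is a list of (poisoning_rate, score) pairs sorted by
    rate, with the first entry measured on unpoisoned data.
    """
    baseline = curve[0][1]
    for rate, score in curve:
        if baseline - score > max_drop:
            return rate
    return None
```

In an experiment of the kind described, one would retrain each model on `flip_labels(y_train, rate)` for a range of rates, record a metric such as recall or F1 on a clean test set, and pass the resulting pairs to `find_threshold`. The tolerance `max_drop` is a free parameter, and other definitions of a sharp degradation (for example, the largest drop between consecutive rates) are equally plausible.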
Recommended Citation
Hong, Linru and Bhandari, Shruti, "Understanding the Impact of Data Poisoning on Financial Fraud Detection: Thresholds and Feature Sensitivity" (2025). ATU Student Research Symposium. 4.
https://orc.library.atu.edu/atu_rs/2025/2025/4
Poster
Understanding the Impact of Data Poisoning on Financial Fraud Detection: Thresholds and Feature Sensitivity