FinTech Trading Surveillance Using LLM-Powered Anomaly Detection with Isolation Forests
Keywords:
FinTech, trading surveillance, large language models, anomaly detection, Isolation Forest, semantic embeddings, insider trading, spoofing, equity markets
Abstract
This work combines semantic embeddings from large language models (LLMs) with Isolation Forest-based anomaly detection to identify insider trading and spoofing in equity and derivatives markets. High-dimensional embeddings from OpenAI's ada and Sentence-BERT models provide semantically rich vector representations of transactional and order book data that preserve latent market patterns and context. Applied to these embeddings, the unsupervised Isolation Forest algorithm can identify trading anomalies that conventional statistical monitoring misses. The system detects subtle abnormalities in trading behavior more effectively, and with higher recall, than z-score-based methods. Automated surveillance aligned with SEC Rule 10b-5 may reduce false positives while uncovering manipulative trading. Empirical investigations show that combining LLM embeddings with ensemble-based anomaly detection improves the scalability and interpretability of financial market surveillance.
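The scoring step described above can be illustrated with a minimal, standard-library sketch of the Isolation Forest idea (Liu et al., 2008). This is not the authors' implementation: the vectors are synthetic Gaussian stand-ins for LLM embeddings of order records, and the function names, the dimensionality (8), the tree count (100), and the subsample size (64) are all illustrative assumptions. A production pipeline would more likely embed real records with a model such as Sentence-BERT and use an established Isolation Forest library.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful binary-search-tree lookup over n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0  # common convention for the two-point case
    harmonic = math.log(n - 1) + 0.5772156649  # approximates the harmonic number H(n-1)
    return 2.0 * harmonic - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    """Recursively split on a random dimension at a random value within its range."""
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:  # nothing left to split on in this dimension
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(point, node):
    """Number of splits needed to reach a leaf, plus a correction for unsplit points."""
    depth = 0
    while "split" in node:
        node = node["left"] if point[node["dim"]] < node["split"] else node["right"]
        depth += 1
    return depth + c_factor(node["size"])


def fit_forest(points, n_trees=100, sample_size=64, seed=0):
    """Build n_trees isolation trees, each on a random subsample of the points."""
    rng = random.Random(seed)
    sample_size = min(sample_size, len(points))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(points, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(point, trees, sample_size):
    """Score in (0, 1]; shorter average paths (easier isolation) give higher scores."""
    mean_path = sum(path_length(point, t) for t in trees) / len(trees)
    return 2.0 ** (-mean_path / c_factor(sample_size))


# Synthetic stand-ins for embeddings (assumption): 200 "normal" vectors plus one
# planted outlier that is far from the bulk in every dimension.
data_rng = random.Random(42)
DIM = 8
vectors = [[data_rng.gauss(0.0, 1.0) for _ in range(DIM)] for _ in range(200)]
vectors.append([6.0] * DIM)

trees, sample_size = fit_forest(vectors)
scores = [anomaly_score(v, trees, sample_size) for v in vectors]
ranked = sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)
print("most anomalous indices:", ranked[:5])
```

In this toy setup the planted outlier (index 200) should rank at or near the top, because random splits isolate it in fewer steps than points in the dense bulk. Real embeddings would need a tuned threshold on the score rather than a fixed top-k cut.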