FinDiff: Diffusion Models for Financial Tabular Data Generation
Journal
4th ACM International Conference on AI in Finance
Type
conference paper
Date Issued
2023
Author(s)
Editor(s)
Timur Sattarov
Abstract
The sharing of microdata, such as fund holdings and derivative instruments, by regulatory institutions presents a unique challenge due to strict data confidentiality and privacy regulations. These challenges often hinder the ability of both academics and practitioners to conduct collaborative research effectively. The emergence of generative models, particularly diffusion models, capable of synthesizing data mimicking the underlying distributions of real-world data presents a compelling solution. This work introduces 'FinDiff', a diffusion model designed to generate real-world financial tabular data for a variety of regulatory downstream tasks, for example economic scenario modeling, stress tests, and fraud detection. The model uses embedding encodings to model mixed modality financial data, comprising both categorical and numeric attributes. The performance of FinDiff in generating synthetic tabular financial data is evaluated against state-of-the-art baseline models using three real-world financial datasets (including two publicly available datasets and one proprietary dataset). Empirical results demonstrate that FinDiff excels in generating synthetic tabular financial data with high fidelity, privacy, and utility.
Language
English (United States)
Keywords
neural networks
diffusion models
synthetic data generation
financial tabular data
HSG Classification
contribution to scientific community
Refereed
Yes
Book title
Proceedings of the Fourth ACM International Conference on AI in Finance
Publisher
Association for Computing Machinery (ACM)
Event Title
ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
Event Location
Brooklyn, USA
Event Date
Nov. 27-29, 2023
Division(s)
Contact Email Address
marco.schreyer@unisg.ch
File(s)
open.access
Name
2309.01472.pdf
Size
3.35 MB
Format
Adobe PDF
Checksum (MD5)
6a9517de7f4c1966525763ff7a2e3369