Ph.D. student at the University of Tübingen, Germany @uni_tue
Research:
Deep Neural Networks and Tabular Data, Explainable Machine Learning
Oct 21, 2022 • 8 tweets • 4 min read
Leave VAEs and GANs behind: LLMs are all you need for tabular data generation!
We introduce a new method GReaT (Generation of Realistic Tabular data), with state-of-the-art generative abilities (see below). How we did it? ↓ (1/n) #tabulardata
(2/n) Tabular data frequently consists of categorical and numerical data. Furthermore, categorical data and feature names typically are words. Thus, it is possible to represent a tabular data sample as a meaningful sentence, e.g., "Age is 42, Education is HS-grad, ..."