Leave VAEs and GANs behind: LLMs are all you need for tabular data generation!
We introduce a new method GReaT (Generation of Realistic Tabular data), with state-of-the-art generative abilities (see below). How we did it? ↓ (1/n) #tabulardata
(2/n) Tabular data frequently consists of categorical and numerical data. Furthermore, categorical data and feature names typically are words. Thus, it is possible to represent a tabular data sample as a meaningful sentence, e.g., "Age is 42, Education is HS-grad, ..."