> This may revolutionize data science: we introduce TabPFN, a new tabular data classification method that takes 1 second & yields SOTA performance (better than hyperparameter-optimized gradient boosting in 1h). Current limits: up to 1k data points, 100 features, 10 classes. 1/6
[Faster and more accurate than gradient boosting for tabular data: Catboost, LightGBM, XGBoost]
Many thanks also for open-sourcing your work and making the Colab notebook; I've been playing around with it a bit.
Edit: spelling
My main observation, just looking at your example pictures, is that its closest competitor is Gaussian Processes, which I've long been a fan of.
Just looking at those pictures, GP and TabPFN look very similar where there is data, but TabPFN is happier to extrapolate while the GP stays localised around the data (look at the top row, for example).
I can't decide whether that's a feature or a bug. I guess it's good to have the choice: either show that you're uncertain in regions where you've never seen data, or extrapolate from what you have seen.
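The "localised around the data" behaviour of GPs can be seen directly in the posterior variance: it shrinks near observed points and reverts to the prior variance far from them. A minimal 1-D GP regression demo in plain NumPy (all names here are mine, not from the paper):

```python
import numpy as np

def rbf(a, b, lengthscale=1.0):
    """Squared-exponential kernel matrix k(a, b) for 1-D input vectors."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / lengthscale) ** 2)

def gp_posterior(X_train, y_train, X_test, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at X_test."""
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    K_s = rbf(X_train, X_test)
    K_ss = rbf(X_test, X_test)
    alpha = np.linalg.solve(K, y_train)
    mean = K_s.T @ alpha
    v = np.linalg.solve(K, K_s)
    var = np.diag(K_ss - K_s.T @ v)
    return mean, var

X = np.array([-1.0, 0.0, 1.0])   # training inputs
y = np.sin(X)                    # training targets
X_new = np.array([0.5, 10.0])    # one point near the data, one far away

mean, var = gp_posterior(X, y, X_new)
# Near the data the variance is small; far away it reverts to ~1
# (the prior variance of the RBF kernel): the GP admits it knows nothing.
print(var)
```

So the GP's reluctance to extrapolate is built in by the kernel: far from the data, predictions fall back to the prior with maximal uncertainty, whereas a method that extrapolates confidently gives up that explicit "I haven't seen this region" signal.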
Would "tabular classification" usually refer to, say, extracting tabular data from a picture into text?
I tried googling and looking through the site, but it wasn't obvious to me what this actually does.