Such a great resource. It's surprisingly easy to build your own massive datasets using it. I re-derived WebText2, one of the datasets in GPT-3's training mix, just on a home machine. And with some image scraping you can build up image datasets for training interesting GAN models.
> the training process they used are not.
Seems like it'd be fairly straightforward to finetune an existing language model: GPT-3 if you've got spare change, GPT-J-6B can be finetuned in Colab for free, and GPT-NeoX-20B could be finetuned cheaply. Use simple concats of AITA posts followed by a top comment, balance for NTA/YTA like the Training Data page mentions, and I'll bet you'd get comparable results.
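The dataset prep step above is tiny. Here's a minimal sketch of what I mean (the `posts` records and their fields are made up for illustration; in practice you'd fill them from scraped AITA threads):

```python
import random

# Hypothetical scraped records: each AITA post paired with its top
# comment and that comment's verdict. Field names are illustrative.
posts = [
    {"post": "AITA for eating the last slice?",
     "top_comment": "NTA, it was up for grabs.", "verdict": "NTA"},
    {"post": "AITA for skipping the wedding?",
     "top_comment": "YTA, you promised you'd go.", "verdict": "YTA"},
    {"post": "AITA for returning the gift?",
     "top_comment": "NTA, it's your money.", "verdict": "NTA"},
]

def build_examples(records, seed=0):
    """Balance NTA/YTA counts, then emit post-then-top-comment concats."""
    by_verdict = {"NTA": [], "YTA": []}
    for r in records:
        by_verdict[r["verdict"]].append(r)
    # Downsample the majority class so both verdicts appear equally often.
    n = min(len(by_verdict["NTA"]), len(by_verdict["YTA"]))
    rng = random.Random(seed)
    balanced = rng.sample(by_verdict["NTA"], n) + rng.sample(by_verdict["YTA"], n)
    rng.shuffle(balanced)
    # Simple concat: post text, blank line, top comment.
    return [f"{r['post']}\n\n{r['top_comment']}" for r in balanced]

examples = build_examples(posts)
print(len(examples))  # 2: one NTA is dropped to match the single YTA
```

Feed strings like these straight into whatever finetuning pipeline your chosen model uses.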
That said, the _idea_ of this bot is really cool and fun.