The Economics of Building ML Products in the LLM Era

(tinyml.substack.com)

3 points · csoham · 3y ago · 4 comments

4 comments

PaulHoule 3y ago

Unfortunately this article doesn't have specific numbers. It also overstates the case for the API and makes fine-tuning seem like more of a leap than it really is.

I know the pointy-haired boss wants to use APIs for everything, even feature flags and user login, but he's been canceled.

It is almost scary how productive I can be working with the new tools for embeddings and neural networks. Regularly I get things done in half an hour that used to take a whole weekend. It used to be a real black art to train networks on a GPU but now I can go down a short checklist and... it's easy.

The worst problem I had with my first "classification based on pretraining" project was that the network trained way too fast. I'd trained plenty of neural networks before and just didn't believe it could learn that fast. I wasted time looking for something wrong but really... it is that fast.

If you don't believe me, look at this paper

https://arxiv.org/abs/2304.01238

They were getting good results at spam classification with just 1000 samples repeated 3 times, and that is very consistent with what I'm seeing with my problems. I used to train networks for between 20 minutes and 20 hours and now it's more like 2 minutes.

So really you can be up and running with Hugging Face transformers in not much longer than it takes to set up API keys.
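To make the "short checklist" concrete, here is a minimal sketch of fine-tuning a pretrained classifier with the transformers `Trainer`. It is an illustration, not the setup from the paper or from my own projects: the model name `distilbert-base-uncased`, the batch size of 16, the output directory, and the helper names `training_steps`, `EncodedTexts`, and `fine_tune` are all assumptions. It also assumes labels are integers 0..k-1.

```python
import math


def training_steps(num_examples: int, epochs: int, batch_size: int) -> int:
    """Optimizer steps for `epochs` full passes over `num_examples` items."""
    return math.ceil(num_examples / batch_size) * epochs


class EncodedTexts:
    """Minimal map-style dataset: tokenized texts plus integer labels."""

    def __init__(self, encodings, labels):
        self.encodings = encodings
        self.labels = labels

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, i):
        item = {key: values[i] for key, values in self.encodings.items()}
        item["labels"] = self.labels[i]
        return item


def fine_tune(texts, labels, model_name="distilbert-base-uncased", epochs=3):
    # Imported here so the helpers above run without the heavy dependencies.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        Trainer,
        TrainingArguments,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    # Assumes labels are integers 0..k-1, so k is the number of distinct labels.
    model = AutoModelForSequenceClassification.from_pretrained(
        model_name, num_labels=len(set(labels))
    )
    # Pad every text to the same length so the default collator can batch them.
    encodings = tokenizer(texts, truncation=True, padding=True)
    args = TrainingArguments(
        output_dir="spam-classifier",  # hypothetical output directory
        num_train_epochs=epochs,
        per_device_train_batch_size=16,  # assumed batch size
    )
    trainer = Trainer(
        model=model, args=args, train_dataset=EncodedTexts(encodings, labels)
    )
    trainer.train()
    return trainer


# 1000 samples repeated 3 times at batch size 16 is only a few hundred steps.
print(training_steps(1000, 3, 16))  # 189
```

Calling `fine_tune(texts, labels)` with a list of strings and a matching list of integer labels downloads the pretrained weights and runs the three passes. The step count is the reason the run finishes so quickly.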

roguefort 3y ago

> So really you can be up and running with Hugging Face transformers in not much longer than it takes to set up API keys.

It is much easier for sure. But not much faster than setting up API keys. Have you even used the OpenAI API? It's crazy simple and easy to build with.

I agree that networks are training faster, and you don't need PhDs to train these networks anymore. But training still has quirks. Understanding the model's behaviour still requires some DS knowledge. People with that knowledge are expensive to hire. APIs will get you in a few minutes what a DS with 1-2 months of work will get you.
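The trade-off here can be put as a break-even calculation. The sketch below is a simplification under stated assumptions: a one-time cost for the in-house work (for example, the 1-2 months of DS time), a flat per-request API price, and a flat per-request cost for running your own model. The function name and all figures are hypothetical placeholders, not real prices.

```python
import math
from typing import Optional


def break_even_requests(
    fixed_cost: float, api_cost: float, own_cost: float
) -> Optional[int]:
    """Requests after which a one-time investment beats paying per API call.

    Returns None when the API is no more expensive per request, since the
    fixed cost is then never recovered.
    """
    saving_per_request = api_cost - own_cost
    if saving_per_request <= 0:
        return None
    return math.ceil(fixed_cost / saving_per_request)


# Placeholder figures for illustration only: 20,000 up front,
# 0.002 per API request, 0.0005 per request on your own model.
print(break_even_requests(20_000, 0.002, 0.0005))
```

Below the break-even volume the API is cheaper under these assumptions, and above it the in-house model is.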

roguefort 3y ago

Unless you are an established organization, I don't see why anyone would move away from APIs or fine-tuning. Custom models are going to be few and far between.

PaulHoule 3y ago

That's another problem with that article. The gap between API and fine-tuning is much smaller than the gap between fine-tuning and developing a foundation model. I would look at BloombergGPT as an example

https://arxiv.org/abs/2303.17564

Here you have a company that can assemble a document collection about the same size as "The Pile", add it to "The Pile", and train a model on the combined data. They're not just a big company; they are in the information business, so it is clear that it's worth it to them.
