Just like open source?
> Training setup and data is completely non trivial for a large language model. To replicate Llama would take hundreds of hours of engineering, at least.
The entire point of releasing the pre-trained weights is that you *don't* have to do this. You just need to fine-tune, which (depending on the task) can be done with very little data, and many open-source toolkits that work with those weights exist to make this trivial.
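To make the "very little data, very little compute" point concrete, here is a minimal NumPy sketch of the idea behind LoRA, one common parameter-efficient fine-tuning technique used by several of those toolkits. All names and sizes are illustrative, not any particular library's API: the pretrained weight `W` stays frozen, and only a small low-rank update `B @ A` is trained.

```python
import numpy as np

rng = np.random.default_rng(0)

d_in, d_out, rank = 512, 512, 8  # illustrative layer sizes and LoRA rank
alpha = 16                       # LoRA scaling hyperparameter

# Frozen pretrained weight (stands in for one layer of a released model).
W = rng.normal(size=(d_out, d_in))

# Trainable low-rank adapter: A is small-random, B starts at zero, so the
# adapted layer initially computes exactly what the pretrained layer did.
A = rng.normal(scale=0.01, size=(rank, d_in))
B = np.zeros((d_out, rank))

def forward(x):
    # Effective weight is W plus the scaled low-rank update B @ A.
    return x @ (W + (alpha / rank) * B @ A).T

# Fine-tuning would update only A and B; W never changes.
trainable = A.size + B.size   # 8 * (512 + 512) = 8192 parameters
frozen = W.size               # 512 * 512     = 262144 parameters
```

With rank 8, the trainable parameters are about 3% of the frozen ones, which is why adapting released weights needs far less data and compute than training them.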