Llama 34B is just small enough (quantized to 4-bit) to fit on a 24GB consumer (or affordable server) GPU.
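A quick back-of-envelope check on that fit, in Python. The bits-per-parameter and overhead numbers are my own rough assumptions for a typical 4-bit quant, not measured figures:

```python
# Rough VRAM estimate for 34B parameters at 4-bit quantization.
# bits_per_param and overhead_gb are assumptions, not measured values.
params = 34e9
bits_per_param = 4.5   # effective size of a typical q4 quant, scales included
overhead_gb = 3.0      # KV cache, activations, CUDA context (rough guess)

weights_gb = params * bits_per_param / 8 / 1e9
print(f"weights ~{weights_gb:.1f} GB, total ~{weights_gb + overhead_gb:.1f} GB")
# weights ~19.1 GB, total ~22.1 GB -> just squeezes into 24 GB
```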
It's also just the right size for llama.cpp inference on machines with 32GB of RAM, or 16GB of RAM plus an 8GB+ GPU taking part of the model.
Basically it's the most desirable size for AI finetuning hobbyists, and the quality jump from Llama v1 13B to Llama v1 33B is huge.
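The 16GB RAM + 8GB GPU case works because llama.cpp can split the layers between the card and system memory. Here's a sketch of that split, reusing the ~19 GB weight estimate from above; the layer count and VRAM budget are assumptions, since the real numbers depend on the specific 34B architecture:

```python
# Splitting a ~19 GB 4-bit model between an 8 GB GPU and system RAM,
# llama.cpp-style (whole transformer layers go to one device or the other).
weights_gb = 19.1
n_layers = 48            # assumed layer count for a 34B model
gb_per_layer = weights_gb / n_layers

vram_budget = 7.0        # leave ~1 GB of the 8 GB card for KV cache etc.
gpu_layers = int(vram_budget / gb_per_layer)
ram_needed = weights_gb - gpu_layers * gb_per_layer

print(f"offload {gpu_layers} layers to GPU, ~{ram_needed:.1f} GB stays in RAM")
# offload 17 layers to GPU, ~12.3 GB stays in RAM -> fits in 16 GB with room for the OS
```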