But the biggest thing is going to be context. While a 10 GB card can run a 9B model with some context, for coding you really want a lot of context.
So paying 200 a year for 1T in context, versus your 32k context: that's the thing I see as being the driver.
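To put rough numbers on why a local card tops out at a small context: the weights take a fixed chunk of VRAM, and whatever is left has to hold the KV cache, which grows linearly with context length. Here's a back-of-the-envelope sketch in Python. The layer count, KV head count, head dimension, 4-bit quantization and 1 GB overhead are all illustrative assumptions, not the real specs of any particular 9B model, so treat the result as a ballpark only.

```python
def kv_cache_bytes(tokens, layers, kv_heads, head_dim, bytes_per_value=2):
    """Approximate KV cache size: one key and one value tensor per layer, per token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * tokens


def max_context_tokens(vram_gb, params_billion, bits_per_weight,
                       layers, kv_heads, head_dim, overhead_gb=1.0):
    """Rough number of context tokens that fit after loading the weights.

    All architecture numbers passed in are assumptions supplied by the caller.
    """
    weights_bytes = params_billion * 1e9 * bits_per_weight / 8
    free_bytes = vram_gb * 1e9 - weights_bytes - overhead_gb * 1e9
    if free_bytes <= 0:
        return 0  # the weights alone do not fit
    per_token = kv_cache_bytes(1, layers, kv_heads, head_dim)
    return int(free_bytes // per_token)


if __name__ == "__main__":
    # Hypothetical 9B model: 4-bit weights, 32 layers, 8 KV heads, head_dim 128,
    # fp16 KV cache, on a 10 GB card.
    print(max_context_tokens(10, 9, 4, layers=32, kv_heads=8, head_dim=128))
```

With those assumed numbers it lands somewhere in the tens of thousands of tokens, which is the same ballpark as the 32k figure above. Change the assumptions (KV cache quantization, a different architecture) and the answer moves, but it stays far below what a hosted model offers.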
Personally, I've found great success using OpenCode, with Opus as my plan agent and omnicoder-9b as my build agent.
Get Opus to plan, switch to omnicoder to build, switch back to Opus to review, and so on.
Works great.