A 5090 has 32GB of VRAM, enough to hold a 32B model entirely in memory at Q6_K quantization.
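As a sanity check, the weight footprint is easy to estimate from the average bits per weight of the quantization format. A minimal sketch (Q6_K averages roughly 6.56 bits per weight; KV cache and activations add a few GB on top, so treat these numbers as approximate):

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate quantized weight footprint in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8

# A 32B model at Q6_K (~6.56 bits/weight): about 26 GB of weights,
# leaving ~6 GB of a 5090's 32 GB for KV cache and overhead.
print(round(weight_size_gb(32, 6.56), 1))  # ~26.2
```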
You can run larger models by splitting the model's layers, keeping some in VRAM and offloading the rest to system RAM. That is slower, but still viable.
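Runtimes like llama.cpp expose this split as a GPU-layer count (the `-ngl`/`--n-gpu-layers` flag). A rough sketch of how you might pick that number, using illustrative (assumed) model sizes and the simplifying assumption that layers are equally sized:

```python
def gpu_layer_count(total_layers: int, model_gb: float,
                    vram_budget_gb: float) -> int:
    """How many transformer layers fit in the VRAM budget,
    assuming roughly equal per-layer size (an approximation)."""
    per_layer_gb = model_gb / total_layers
    return min(total_layers, int(vram_budget_gb / per_layer_gb))

# e.g. a hypothetical 80-layer model quantized to ~40 GB, with
# 20 GB of VRAM left after KV cache: put half the layers on the GPU.
print(gpu_layer_count(80, 40.0, 20.0))  # 40
```

In practice you'd also reserve a few GB of the budget for the KV cache, which grows with context length.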
This means that you can run the Qwen3-Coder-30B-A3B model locally on a 4090 or 5090. That model is a Mixture of Experts model with only 3B active parameters per token, so it generates quickly, but all 30B parameters still have to be loaded. At a mid-range quantization the full model fits in the 24GB of a 3090 as well.
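One caveat with Mixture of Experts models: the full parameter set must be resident in memory, and the active-parameter count only reduces compute per token. A quick bits-per-weight estimate (Q4_K_M averages roughly 4.85 bits per weight; approximate, and ignoring KV cache):

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate quantized weight footprint in GB."""
    return params_billions * bits_per_weight / 8

# All 30B parameters at Q4_K_M: ~18 GB, which fits in a 3090's 24 GB.
print(round(weight_size_gb(30, 4.85), 1))  # ~18.2
```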
The Qwen3-Coder-480B-A35B model is a bigger ask: all 480B parameters must be stored, not just the 35B that are active per token, so even heavily quantized it runs to hundreds of GB. With enough system RAM you could keep a slice of the layers in the VRAM of a 4090 or 5090 and offload the rest, but expect it to be slow.
Yes, it will be slower than running it in the cloud. But you can get a long way with a high-end gaming rig.