The HF demo is very similar to the GitHub demo, so it's easy to try out:
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu128
pip install qwen3-tts
qwen-tts-demo Qwen/Qwen3-TTS-12Hz-1.7B-Base --no-flash-attn --ip 127.0.0.1 --port 8000
That's for CUDA 12.8; change the PyTorch install to match your CUDA version. I skipped FlashAttention since I'm on Windows and haven't gotten FlashAttention 2 to work there yet (I found some precompiled FA3 files[3], but Qwen3-TTS isn't FA3-compatible yet).
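If you're on a different CUDA version, here's a rough sketch of how the index URL in the pip command changes. The function name is just something I made up for illustration, and it only builds the URL string following the cuXYZ naming PyTorch uses; it doesn't check that wheels actually exist for that version, so verify against the PyTorch install page.

```python
from typing import Optional


def pytorch_index_url(cuda_version: Optional[str]) -> str:
    """Build a PyTorch wheel index URL (hypothetical helper, not part of any library).

    cuda_version is a string like "12.8"; pass None for a CPU-only install.
    """
    base = "https://download.pytorch.org/whl"
    if cuda_version is None:
        return f"{base}/cpu"
    # "12.8" -> "cu128": keep major and minor, drop the dot.
    major, minor = cuda_version.split(".")[:2]
    return f"{base}/cu{major}{minor}"


if __name__ == "__main__":
    url = pytorch_index_url("12.8")
    print(f"pip install torch torchvision --index-url {url}")
```

Not every CUDA release has published wheels, so treat the output as a starting point rather than a guaranteed-valid URL.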
[1]: https://github.com/QwenLM/Qwen3-TTS?tab=readme-ov-file#quick...
Try using MPS, I guess. I saw multiple references in the code checking whether the device is not mps, so it seems like it should be supported. If not, fall back to CPU.
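A minimal sketch of that fallback order (CUDA, then MPS, then CPU) using plain PyTorch availability checks. The helper names are mine, not from Qwen3-TTS, and I haven't checked how the demo itself wants the device passed in, so this only shows how to find out what your machine supports.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Return a device name, preferring CUDA, then MPS, then CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device() -> str:
    """Query PyTorch for available backends; report "cpu" if torch is missing."""
    try:
        import torch
    except ImportError:
        return "cpu"
    # Older PyTorch builds may not have the mps backend module at all.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_ok = bool(mps_backend is not None and mps_backend.is_available())
    return pick_device(torch.cuda.is_available(), mps_ok)


if __name__ == "__main__":
    print(detect_device())
```

On an Apple Silicon Mac with a recent PyTorch this should report mps; whether Qwen3-TTS then runs correctly on it is a separate question I can't confirm.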