For Qwen 35B, enabling native MCP on MLX models slows token generation down by 10%.
For Qwen 27B, enabling native MCP on MLX models speeds token generation up by almost exactly 1.5x.
(All tested on an M5 Pro.)
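
The two results above are quoted in different units (a percentage and a ratio), so here is a minimal sketch of how they relate, assuming you have a tokens-per-second reading for each run. The function names and the throughput values are hypothetical placeholders chosen to mirror the claims, not measurements from this test.

```python
def speedup(baseline_tps: float, mcp_tps: float) -> float:
    """Ratio of tokens/sec with native MCP enabled to tokens/sec without it."""
    if baseline_tps <= 0:
        raise ValueError("baseline_tps must be positive")
    return mcp_tps / baseline_tps


def percent_change(baseline_tps: float, mcp_tps: float) -> float:
    """Same comparison expressed as a signed percentage (negative = slower)."""
    return (speedup(baseline_tps, mcp_tps) - 1.0) * 100.0


if __name__ == "__main__":
    # Placeholder throughputs only; substitute your own measured tokens/sec.
    print(f"{speedup(20.0, 30.0):.2f}x")          # a 1.5x speedup
    print(f"{percent_change(40.0, 36.0):+.1f}%")  # a 10% slowdown
```

A 1.5x speedup is the same thing as +50%, and a 10% slowdown is a 0.9x ratio, so the two models differ in both direction and size of the effect.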