https://gr.inc/question/although-a-few-years-ago-the-fundame...
In the dropdown that's set to DeepSeek-R1, switch to the LIMO model (which reportedly exhibits frequent language switching).
I'm not sure about examples of gibberish or totally illegible reasoning. My guess is that since R1-Zero was still trained with a KL penalty, its outputs should all be somewhat legible: the KL penalty discourages the policy from drifting too far from what the base model would say in any given context.
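To make the mechanism concrete, here's a minimal sketch of a per-token KL penalty folded into the RL reward. This is illustrative only, not DeepSeek's actual objective: the function names, the single-sample KL estimator, and the `beta` coefficient are all assumptions for the sketch.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over one logits row.
    m = max(logits)
    z = math.log(sum(math.exp(x - m) for x in logits)) + m
    return [x - z for x in logits]

def kl_penalized_reward(policy_logits, ref_logits, tokens, task_reward, beta=0.05):
    """Sketch: one logits row per generated token; `tokens` are the sampled ids.

    The per-token term log pi(t) - log pi_ref(t) is a single-sample estimate of
    KL(pi || pi_ref). Subtracting beta * KL from the task reward penalizes the
    policy for putting mass where the reference (base) model wouldn't, which is
    what pulls generations back toward base-model-like, legible text.
    """
    kl = 0.0
    for p_row, r_row, t in zip(policy_logits, ref_logits, tokens):
        kl += log_softmax(p_row)[t] - log_softmax(r_row)[t]
    return task_reward - beta * kl
```

If the policy hasn't moved at all (identical logits), the penalty is zero and the reward is just the task reward; the further the policy's distribution drifts from the reference on the tokens it actually samples, the larger the deduction.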