Better HN
I used RL fine-tuning to make an LLM generate ugly and unpythonic FizzBuzz code