That's really interesting, I hadn't thought of that before. To fix it, could you compare the squared magnitude against the squared radius and bump only the borderline cases, or is it more efficient to skip the extra branching?
I just did it across the board; since the error is down in the floating-point noise, I don't know if I'd even trust a comparison to catch it. Plus, the discrepancy between "bumped" and "unbumped" samples might cause some visible artifacts.
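
For anyone following along, here is a rough sketch of the two options in Python. It is an illustration under assumptions, not the actual code being discussed: it assumes the samples are points that nominally sit on a sphere of a given radius and that a "bump" means scaling a point slightly toward the center. The names `on_sphere`, `bump_all`, `classify`, and the `1e-6` bump size are all made up for the example.

```python
import math
import random

def on_sphere(direction, radius):
    """Scale a direction so it nominally lies on a sphere of the given radius."""
    x, y, z = direction
    scale = radius / math.sqrt(x * x + y * y + z * z)
    return (x * scale, y * scale, z * scale)

def bump_all(point, factor=1.0 - 1e-6):
    """Unconditional bump (assumed form): pull every sample inward by the same factor."""
    return tuple(c * factor for c in point)

def classify(point, radius):
    """The branching test from the question: squared magnitude vs. squared radius."""
    d2 = sum(c * c for c in point)
    r2 = radius * radius
    if d2 < r2:
        return "inside"
    if d2 > r2:
        return "outside"
    return "on"

rng = random.Random(0)
radius = 3.0
counts = {"inside": 0, "on": 0, "outside": 0}
bumped_inside = 0

for _ in range(10000):
    d = (rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
    p = on_sphere(d, radius)
    # Rounding decides which side of the surface each unbumped sample lands on.
    counts[classify(p, radius)] += 1
    # The fixed bump is far larger than the rounding error, so it needs no test.
    if classify(bump_all(p), radius) == "inside":
        bumped_inside += 1

print(counts)
print(bumped_inside)
```

The first printout shows how the unbumped samples split across the three classes, which is the rounding-driven result a borderline test would be branching on. The bumped samples all come out on one side because the bump is many orders of magnitude larger than the rounding error, and every sample is moved by the same relative amount, so there is no bumped/unbumped seam.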