I've hit this exact problem with multithreading and setuid in C, and just because it _can_ be managed in C doesn't make it easy or straightforward.
So I echo the sentiment that there needs to be a way to operate at the process level, even if that has some interesting consequences.
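(P.S.: In C, if you're using glibc, it DOES actually patch this issue up on its own using one hell of a nasty hack: its setuid-family wrappers signal every other thread in the process and have each one repeat the syscall itself.)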
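
For anyone who wants to see the difference for themselves, here's a rough sketch (not a definitive test) that changes the effective uid from one thread and reports what a second, already-running thread sees afterwards. It assumes Linux on x86-64 with glibc/NPTL; the helper names `report_euid` and `euid_seen_by_other_thread` are made up for this example, and on older glibc you need to build with `-pthread`.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helpers for illustration; not part of any library. */

static pthread_barrier_t gate;

/* Worker: wait until the other thread has attempted the uid change,
 * then record the effective uid that THIS thread sees. */
static void *report_euid(void *out) {
    pthread_barrier_wait(&gate);
    *(uid_t *)out = geteuid();
    return NULL;
}

/* Change the effective uid from the calling thread, either with the raw
 * syscall (use_raw != 0) or through glibc's seteuid() wrapper, and store
 * in *seen the euid observed by a second thread that was already running.
 * Returns 0 on success, -1 if setup failed or the uid change was refused
 * (for example, a different target uid without root).
 * The caller's euid is restored before returning. */
int euid_seen_by_other_thread(int use_raw, uid_t target, uid_t *seen) {
    pthread_t worker;
    uid_t original = geteuid();
    int rc;

    if (pthread_barrier_init(&gate, NULL, 2) != 0)
        return -1;
    if (pthread_create(&worker, NULL, report_euid, seen) != 0) {
        pthread_barrier_destroy(&gate);
        return -1;
    }

    if (use_raw) {
        /* Bypasses glibc: the kernel changes only the calling thread. */
        rc = (int)syscall(SYS_setresuid, -1L, (long)target, -1L);
    } else {
        /* glibc wrapper: also makes every other thread repeat the call. */
        rc = seteuid(target);
    }

    pthread_barrier_wait(&gate);   /* let the worker take its reading */
    pthread_join(worker, NULL);
    pthread_barrier_destroy(&gate);

    if (rc != 0)
        return -1;

    /* Put the caller's euid back (the saved uid still permits this). */
    if (use_raw)
        syscall(SYS_setresuid, -1L, (long)original, -1L);
    else
        seteuid(original);
    return 0;
}
```

Run as root with a target like 65534 (usually `nobody`, but that's an assumption about your system), the raw variant should leave `*seen` at 0 because only the calling thread changed, while the glibc variant should report 65534 from the other thread too.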