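This is incorrect. Searching text in a variable-length encoding does not require extra branch instructions. If you're searching through UTF-8 text, you can just treat both the needle and the haystack as plain byte strings and run an ordinary byte search. UTF-8 is self-synchronizing: lead bytes and continuation bytes come from disjoint ranges, so a valid needle can never match starting in the middle of a character.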
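To make that concrete, here's a rough sketch in Python. The function name `find_utf8` is made up for illustration, and it assumes both inputs are valid text that encodes cleanly to UTF-8. The search itself is a plain byte search with no per-character decoding; the only UTF-8-aware step is converting the byte offset back to a character index at the end, and that's only needed if you want a character index at all.

```python
def find_utf8(haystack: str, needle: str) -> int:
    """Return the code-point index of the first match, or -1 if absent.

    Hypothetical helper: the search runs on raw UTF-8 bytes.
    """
    hay_bytes = haystack.encode("utf-8")
    needle_bytes = needle.encode("utf-8")

    # Ordinary byte-string search; nothing here knows about UTF-8.
    byte_pos = hay_bytes.find(needle_bytes)
    if byte_pos < 0:
        return -1

    # A match of a valid needle starts on a character boundary, so the
    # bytes before it form valid UTF-8 and can be decoded to count
    # how many code points precede the match.
    return len(hay_bytes[:byte_pos].decode("utf-8"))


if __name__ == "__main__":
    text = "naïve café"
    # Byte offset of the match ("ï" takes two bytes in UTF-8).
    print(text.encode("utf-8").find("café".encode("utf-8")))
    # Code-point offset of the same match.
    print(find_utf8(text, "café"))
```

If a byte offset is all you need (for slicing the encoded buffer, say), you can skip the final decode entirely.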
This isn't counting problems with normalization, of course. You will have to put your needle and haystack into the same normalization form before searching. But you'd have to do that with a fixed-width encoding too.
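For the normalization step, something like the following would do, again as a sketch rather than a definitive implementation. `normalized_find` is a hypothetical name, and the choice of NFC as the default is an assumption; any form works as long as both strings use the same one. `unicodedata.normalize` is from the Python standard library.

```python
import unicodedata


def normalized_find(haystack: str, needle: str, form: str = "NFC") -> int:
    """Byte-search after putting both strings in one normalization form.

    Hypothetical helper. The returned byte offset refers to the
    *normalized* haystack, not the original string. Returns -1 if absent.
    """
    hay_bytes = unicodedata.normalize(form, haystack).encode("utf-8")
    needle_bytes = unicodedata.normalize(form, needle).encode("utf-8")
    return hay_bytes.find(needle_bytes)


if __name__ == "__main__":
    decomposed = "cafe\u0301 au lait"  # "e" followed by a combining acute accent
    precomposed = "caf\u00e9"          # single precomposed "é" code point

    # The raw byte search misses, because the byte sequences differ.
    print(decomposed.encode("utf-8").find(precomposed.encode("utf-8")))
    # After normalizing both sides, the same byte search finds the match.
    print(normalized_find(decomposed, precomposed))
```

Note that the mismatch in the first search has nothing to do with UTF-8 being variable-length: the two strings are different code-point sequences, so they would fail to match in UTF-32 as well.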