The solution to that is simple: don't let the programmer access individual bytes in a Unicode string.
Get rid of indexing into strings and replace it with iterators. Make string handling functions work on code points at the very least, or better yet on grapheme clusters. There's a little more to it than that, but it's a good start.
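To make the iterator idea concrete, here is a rough sketch in Python. The function names code_points and approx_clusters are made up for illustration, and the cluster grouping only glues combining marks onto the preceding base character. That is an approximation I'm assuming is good enough to show the point; real grapheme segmentation follows the rules in Unicode Standard Annex #29 and needs a proper library.

```python
import unicodedata


def code_points(text):
    """Yield each code point of text as a one-character string."""
    yield from text


def approx_clusters(text):
    """Yield a base character plus any combining marks that follow it.

    Hypothetical helper: a rough approximation only, NOT full UAX #29
    grapheme cluster segmentation (no emoji sequences, Hangul, etc.).
    """
    cluster = ""
    for ch in text:
        # A character with combining class 0 starts a new cluster.
        if cluster and unicodedata.combining(ch) == 0:
            yield cluster
            cluster = ""
        cluster += ch
    if cluster:
        yield cluster


# "cafe" followed by U+0301 COMBINING ACUTE ACCENT
word = "cafe\u0301"

utf8_len = len(word.encode("utf-8"))    # 6 bytes
cp = list(code_points(word))            # 5 code points
clusters = list(approx_clusters(word))  # 4 user-visible characters

print(utf8_len, len(cp), len(clusters))
```

The same string gives three different "lengths" depending on the unit you count in, which is why a single integer index into a string is ambiguous and an iterator that names its unit is not.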
Yes, people are still stuck in the ASCII mindset and can't seem to get away from thinking in bytes. But I believe it's the ability to index into strings that's to blame, not the encoding used.