Python has several internal string representations to reduce conversions.
Really, "UTF-16 string" and "UTF-8 string" are code smells. Applications should be working with sequences of characters/code points or sequences of bytes.
Using code unit sequences is... bizarre. (Yes, I know Java, C#, and JS have chosen to do that, but a new language has an opportunity to improve.)
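
To make the distinction concrete, here is a rough Python sketch. The sample string is just my own illustration. It counts the same text three ways: as code points (what Python's str exposes), as UTF-8 code units (bytes), and as UTF-16 code units (what a Java/C#/JS-style length would report).

```python
# Sample text chosen for illustration: 'a', 'é' (U+00E9), and an emoji
# outside the Basic Multilingual Plane (U+1F600).
s = "a\u00e9\U0001F600"

# Python's str is indexed by code point, so len() counts code points.
code_points = len(s)

# UTF-8 code units are bytes: 1 for 'a', 2 for 'é', 4 for the emoji.
utf8_units = len(s.encode("utf-8"))

# UTF-16 code units are 2 bytes each; the emoji needs a surrogate pair.
# "utf-16-le" is used so that no byte order mark is added to the output.
utf16_units = len(s.encode("utf-16-le")) // 2

print(code_points, utf8_units, utf16_units)  # prints: 3 7 4
```

The three numbers disagree for the same text, which is why a length or index measured in code units says more about the encoding than about the string.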