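I don't see what mistake you think Python 3 makes with Unicode. Which encoding Python uses internally to store strings doesn't really matter. What matters is that the encoding in use is always known: text is always Unicode, and any conversion to or from bytes names its encoding explicitly. That was unclear in Python 2, where the str type held raw bytes in no particular encoding, and Python 3 fixed it.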
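To make that concrete, here is a small sketch of what I mean (the sample string and variable names are just mine for illustration). In Python 3, text and bytes are separate types, you pick the encoding at the boundary, and mixing the two fails instead of being silently converted:

```python
# str is a sequence of Unicode code points; how it is stored internally is hidden.
text = "na\u00efve caf\u00e9"      # "naïve café", written with escapes to avoid source-encoding issues

# bytes only appear when an encoding is named explicitly.
data = text.encode("utf-8")

# Decoding with the same encoding gives back the original text.
round_trip = data.decode("utf-8")

# Lengths differ: 10 code points, but 12 bytes, since each accented letter takes 2 bytes in UTF-8.
n_chars, n_bytes = len(text), len(data)

# Mixing str and bytes raises TypeError rather than guessing an encoding.
try:
    text + data
    mixing_allowed = True
except TypeError:
    mixing_allowed = False
```

In Python 2 the equivalent concatenation would have attempted an implicit ASCII conversion, which is where most of the confusing errors came from.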