It seems that putting the allowed character set into the TLD would be a pretty user-friendly way of doing that.
Edit: As an added bonus, TLDs are centrally managed and are already Western/Latin encoded. So why not customize the TLD with a localized abbreviation for the language or TLD type?
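
To make the idea concrete, here is a rough Python sketch of the check a client could run under such a scheme. The TLD-to-script table is invented for illustration (`cyr` is not a real TLD), and matching on Unicode character names is only a cheap stand-in for a proper script lookup, so treat it as a sketch, not a real validator.

```python
import unicodedata

# Hypothetical table: each TLD names the one script its domains may use.
# "cyr" is an invented TLD used purely for illustration.
ALLOWED_SCRIPT = {
    "com": "LATIN",
    "cyr": "CYRILLIC",
}

def label_matches_script(label, script):
    """Return True if every letter in the label belongs to the given script."""
    for ch in label:
        # ASCII digits and hyphens are treated as neutral in any script.
        if ch.isascii() and (ch.isdigit() or ch == "-"):
            continue
        # Heuristic: Unicode names start with the script name,
        # e.g. "CYRILLIC SMALL LETTER A".
        if not unicodedata.name(ch, "").startswith(script):
            return False
    return True

def domain_allowed(domain):
    """Check all labels of a domain against the script implied by its TLD."""
    *labels, tld = domain.lower().split(".")
    script = ALLOWED_SCRIPT.get(tld)
    if script is None:
        return False  # unknown TLD: no character set declared
    return all(label_matches_script(label, script) for label in labels)

# A Cyrillic "a" (U+0430) hidden in a .com name fails the check.
print(domain_allowed("example.com"))
print(domain_allowed("p\u0430ypal.com"))
```

The point is that the user only has to look at the TLD to know which characters can legitimately appear in the rest of the name.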