You do know what’s inside a Unicode string though. They’re not hard to parse. Very easy in fact.
The biggest problem with Unicode parsing is that you don’t know how long a Unicode string is without parsing it, which often leads to parsing it twice. But we’d have this problem even if every code point were a fixed 2 or 4 bytes (like UCS-2 or UTF-32 — and note UTF-16 is itself variable-width thanks to surrogate pairs), because not every code point is a standalone printable character (eg combining accents).
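To make that concrete, here’s a minimal sketch (in Python, purely illustrative) of why length is a parse, not a lookup: you can count code points in UTF-8 by inspecting lead bytes, but you still have to walk every byte, and even then the code-point count doesn’t match what the user sees once combining accents are involved.

```python
def utf8_codepoint_count(data: bytes) -> int:
    # Continuation bytes in UTF-8 look like 0b10xxxxxx;
    # every byte that is NOT a continuation byte starts a new code point.
    return sum(1 for b in data if b & 0xC0 != 0x80)

# 'e' followed by U+0301 COMBINING ACUTE ACCENT:
# one visible glyph, but two code points and three bytes in UTF-8.
s = "e\u0301"
encoded = s.encode("utf-8")

print(len(encoded))                   # 3 bytes on the wire
print(utf8_codepoint_count(encoded))  # 2 code points
print(len(s))                         # 2 in Python too, since len() counts code points
```

So bytes, code points, and glyphs are three different counts, and only the first is free; the same mismatch would exist in a fixed-width encoding.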
…or you allow for every possible combination of base characters, accents, etc. at a fixed width, and suddenly you have a 6- or 8-byte character set that breaks backwards compatibility with ASCII and multiplies global network traffic six-fold or more, despite most of that being empty space, while placing a greater burden on font developers to cater for every subtle variation of glyph.
Sure, things seem messy now, but I honestly think UTF-8 is the least-worst solution to the problem.