Again, that's with text alone.
Imagine you have a lot more computing resources in a multimodal LLM. It sees your request to count the syllables and realizes it can't do that from text alone (hell, I can't either and have to vocalize it). It then routes your request to an audio module and 'says' the sentence, then a listening module that understands syllables 'hears' the sentence and counts them.
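Here's a minimal sketch of that routing idea in Python. Everything in it is hypothetical: the module names are made up, and the vowel-group regex is just a crude stand-in for what a real listening module would do to a waveform.

```python
import re

def tts_synthesize(sentence: str) -> str:
    # Stand-in for a real text-to-speech module: a real one would
    # return a waveform; here the text just passes through.
    return sentence

def count_syllables_from_audio(audio: str) -> int:
    # Stand-in for the listening module: a real one would detect
    # syllable nuclei in audio; this heuristic counts vowel groups.
    return sum(len(re.findall(r"[aeiouy]+", word.lower()))
               for word in audio.split())

def handle_request(sentence: str) -> int:
    # The 'router': the model realizes text alone won't cut it and
    # hands the request off to the audio modules.
    audio = tts_synthesize(sentence)
    return count_syllables_from_audio(audio)

print(handle_request("count the syllables in this sentence"))
```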
This is how it works in most humans. If you do this every day, you'll likely build some kind of mental shortcut to reduce the effort needed, but at the end of the day there's no unsolvable problem on the AI side.