pants2 2y ago

I did see that, though my interpretation is that breathing is included in its voice tokenizer, which helps it understand emotions in speech (the AI can generate breath sounds, after all). Other sounds, like bird songs or engine noises, may not work, but I could be wrong.

CooCooCaCha 2y ago

I suspect that, like images and video, their audio system is or will become more general purpose. For example, it can generate the sound of coins falling onto a table.
