This may be true, but it's not the whole story. It's the reason why the MS Office team bit the bullet and replaced .doc with .docx about 5 years ago: http://en.wikipedia.org/wiki/Office_Open_XML
A .docx is basically XML in a zip file. It's a beast and has lots of compromises for backward compatibility, but as a design starting point, "zipped XML" is far, far better than a binary dump of the in-memory data.
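To make the "zipped XML" point concrete, here is a rough Python sketch using only the standard library. It builds a toy in-memory archive with a `word/document.xml` part and reads the text back out. The helper names (`make_toy_archive`, `extract_text`) are made up for illustration, and the toy archive is deliberately incomplete: a real .docx carries more parts (content types, relationships, styles), so Word would likely refuse to open this one. The reading half should work on a real file's bytes, though.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML main namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def make_toy_archive(text):
    """Build an in-memory zip shaped like a .docx (hypothetical helper).

    Assumption: `text` contains no XML special characters; this is NOT a
    complete, valid .docx, only the same container layout.
    """
    body = (
        f'<w:document xmlns:w="{W_NS}"><w:body><w:p><w:r>'
        f"<w:t>{text}</w:t></w:r></w:p></w:body></w:document>"
    )
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("word/document.xml", body)
    return buf.getvalue()


def extract_text(docx_bytes):
    """Open the zip, parse the main XML part, and join all <w:t> text runs."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as zf:
        root = ET.fromstring(zf.read("word/document.xml"))
    return "".join(t.text or "" for t in root.iter(f"{{{W_NS}}}t"))


data = make_toy_archive("hello")
print(extract_text(data))
```

No format-specific library is needed to get at the content: any zip tool plus any XML parser will do, which is exactly what you can't say about a dump of in-memory structures.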