As far as compressing offsets, I haven't done any specific measurements but my intuition is that snappy (really any compression algorithm) gives us a huge benefit, since all offsets are stored in-order: they tend to have at least 2 prefix bytes in common, so it's highly compressible.
I experimented with mmap'ing all files in stenographer when it sees them, and it turned out to have negligible performance benefits... I think because the kernel already does disk caching in the background.
I think compression is something we'll defer until we have an explicit need. It sounds super useful, but we tend not to really care about data after a pretty short time anyway... we try to extract "interesting" pcaps from steno pretty quickly (based on alerts, etc). It's a great idea, though, and I'm happy to accept pull requests ;)
Overall, I've been really pleased with how doing the simplest thing actually gives us good performance while maintaining understand-ability. The kernel disk caching means we don't need any in-process caching. The simplest offset encoding + built-in compression gives great compression and speed. O_DIRECT gives really good disk throughput by offloading all write decisions to the kernel. More often than not, more clever code gave little or even negative performance gains.