We got it down to an average of ~400ms under load with aggressive full-page caching (including our own hand-rolled component caching) and edge caching of all static assets (freeing up the servers to be optimized for dynamic content).
This should not be considered an endorsement; getting sub-second pages was a downright herculean engineering effort requiring system-level optimizations beyond what should ever be required of a pre-packaged system.
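For anyone unfamiliar with the term, "component caching" generally means storing the rendered HTML of an expensive page fragment (a menu, a sidebar) and reusing it until it expires. What follows is only a minimal sketch of that general idea in Python, not the code we ran: `FragmentCache`, its `render` method, and the `clock` parameter are hypothetical names made up for illustration, and the in-memory dict is an assumption (a real setup would more likely use a shared store).

```python
import time
from typing import Callable, Dict, Tuple


class FragmentCache:
    """Hypothetical in-memory cache for rendered page components."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        # Maps a component key to (expiry timestamp, rendered HTML).
        self._store: Dict[str, Tuple[float, str]] = {}

    def render(self, key: str, build: Callable[[], str]) -> str:
        """Return cached HTML for `key`, calling `build` only on a miss or expiry."""
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh: skip the expensive render
        html = build()
        self._store[key] = (now + self.ttl, html)
        return html
```

The point of the design is that the expensive `build` callable runs at most once per TTL window per key, so repeated requests under load mostly hit the stored fragment.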