We haven't found this to be an issue for Pnut. One of the metric we use for performance is how much time it takes to bootstrap Pnut, and dash takes around a minute which is about the time taken by bash. This is with Pnut allocating around 150KB of memory when compiling itself, showing that Dash can still be useful even when hundreds of KBs are allocated.
One thing we did notice is that subshells can be a bottleneck when the environment is large, and so we avoided subshells as much as possible in the runtime library. Did you observe the same in your testing?