Looking at the fastest Java, Haskell, Racket, OCaml, JavaScript, C#... they're all doing per-node allocation using the standard allocator, and all beating Go. The limit is not just for Go. I don't know why you think that Go is the only one being disadvantaged here.
I believe this is all addressed in my original post: https://news.ycombinator.com/item?id=28293381. If you have specific questions/concerns, I'm happy to address them, but I don't see the point in repeating myself.