Except that the results for these frameworks depend heavily on their configuration, version, coding style, Linux configuration, available memory, number of CPUs, and use case.
It's also worth noting that some frameworks perform better once they are warmed up.
Also, some code behaves differently when you make requests via wrk from a single client versus from several clients in aggregate.
They still use wrk and not wrk2, their error rate is pretty high, and their framework is, well, not always well behaved.
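To illustrate why wrk vs wrk2 matters: wrk2 sends requests at a constant rate and measures latency from the time each request was supposed to be sent, while a closed-loop client only sends the next request once the previous one has returned, so a server stall hides most of the slow requests. Here is a rough toy simulation of that effect. It is not how either tool is implemented, and every number in it (1 ms service time, a 1 second stall, one request every 2 ms) is made up for illustration.

```python
import math

# Toy model: one server, 1 ms per request, frozen during [STALL_START, STALL_END).
# All values are hypothetical and chosen only to make the effect visible.
DURATION_MS = 2000
SERVICE_MS = 1
STALL_START, STALL_END = 100, 1100
INTERVAL_MS = 2  # intended spacing for the constant-rate client


def serve(start):
    """Return the finish time of a request that reaches the server at `start`."""
    if STALL_START <= start < STALL_END:
        start = STALL_END  # the request waits until the stall is over
    return start + SERVICE_MS


def closed_loop():
    """Send the next request only after the previous one has finished."""
    latencies, t = [], 0
    while t < DURATION_MS:
        finish = serve(t)
        latencies.append(finish - t)  # measured from the actual send time
        t = finish
    return latencies


def constant_rate():
    """Requests are scheduled every INTERVAL_MS, whether or not the server keeps up."""
    latencies, free_at = [], 0
    for intended in range(0, DURATION_MS, INTERVAL_MS):
        finish = serve(max(intended, free_at))
        latencies.append(finish - intended)  # measured from the intended send time
        free_at = finish
    return latencies


def percentile(values, p):
    """Nearest-rank percentile."""
    ordered = sorted(values)
    return ordered[math.ceil(p / 100 * len(ordered)) - 1]


closed = closed_loop()
paced = constant_rate()
print("closed loop   p99:", percentile(closed, 99), "ms, max:", max(closed), "ms")
print("constant rate p99:", percentile(paced, 99), "ms, max:", max(paced), "ms")
```

In this toy setup both clients see the same 1001 ms worst case, but the closed-loop p99 stays at 1 ms because only a single request was in flight during the stall, while the constant-rate p99 comes out at 991 ms.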
Besides all that, they are only testing simple cases.
I would never ever trust this site or any result they got.