My one complaint is that there's no benchmark that measures FFI performance. Realistically, if you build a system in Python or Ruby, you're going to be dropping down to C for your hot spots. So scripting-language performance on all these compute-intensive tasks is somewhat beside the point; what you really want to know is how much overhead you'll incur crossing the scripting/C boundary (which, in my experience, can sometimes be large enough to wipe out all the gains of coding in C in the first place).
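To make the boundary cost concrete, here's a rough micro-benchmark sketch (my own, not from any of the Benchmarks Game programs) that times a trivial C call made through Python's ctypes against the equivalent builtin. Because the C function does almost no work, the measured difference is mostly per-call FFI overhead. It assumes a POSIX system where `ctypes.CDLL(None)` exposes libc's `strlen`:

```python
# Micro-benchmark of the Python -> C FFI boundary cost using ctypes.
# strlen does negligible work, so the ctypes timing is dominated by
# argument marshalling and the foreign-call machinery itself.
import ctypes
import timeit

# On POSIX, CDLL(None) opens the global symbol namespace, which includes libc.
libc = ctypes.CDLL(None)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

s = b"hello, world"
N = 100_000

ffi_time = timeit.timeit(lambda: libc.strlen(s), number=N)
py_time = timeit.timeit(lambda: len(s), number=N)

print(f"ctypes strlen: {ffi_time:.4f}s for {N} calls")
print(f"builtin len:   {py_time:.4f}s for {N} calls")
```

The takeaway is the usual one: if each C call does this little work, the crossing cost swamps it, so real FFI wins come from batching lots of work per call.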
Many of the pi-digits programs use GMP.
https://benchmarksgame.alioth.debian.org/u64q/performance.ph...
And to make things more complicated, the performance gain is generally a function of the time spent optimizing the problem, which depends not only on the language's innate speed but also on its expressiveness and the programmer's familiarity with it. Given that you don't have infinite time, there are further tradeoffs here.