The title conflates languages and their implementations; different implementations prioritize different things. They do occasionally test multiple implementations of one language, as with the main Ruby distribution vs JRuby, but the conflation is still annoying.
The second, and I think the largest, issue is that they chose the Computer Language Benchmarks Game as the set of sample programs to test. I do not believe that the kinds of programs in the Benchmarks Game are representative of the broader set of software written in most languages. They tend towards math-y, puzzle-style programs, not CLIs, web applications, GUIs, or anything else.
A very specific issue I have is that TypeScript and JavaScript come out very differently in their analysis, which is confusing given that all JavaScript is valid TypeScript and you would execute it in the same way. This may be an artifact of issue #2: the Benchmarks Game is only as good as the people who wrote the programs, and it is quite possible that the folks who submitted the TypeScript code did less perf work than those who submitted the JavaScript code. Either way, it is a confusing result that is not explained anywhere in the paper.
A final issue (and this is the one I remember least well, so I may be wrong here) is that it is not reproducible. They do not mention the date on which they retrieved the programs from the Benchmarks Game, let alone include the programs' source code, nor did they release the scripts used to collect the data, though they describe them. This means the discrepancies are hard to actually investigate, and it makes the results lower quality than if we were able to independently verify them, let alone update them based on what has changed since 2017, which is an increasingly long time ago.
In short, I do not think this paper is literally useless, but I do not think it demonstrates its central claim very well, and it is difficult to evaluate the actual quality of its results, which makes it a far weaker result than the title would suggest.
A more charitable reading might accept that language names may be used as shorthand for particular language implementations.
In this case:
https://sites.google.com/view/energy-efficiency-languages/se...
~
> representative of the broader set of software written in most languages

To your knowledge, did such a collection of programs — actually shown to meet that criterion — exist?
~
> Typescript and JavaScript are very different in their analysis

Only when we emphasize outliers with arithmetic means, as in "Table 4. Normalized global results for Energy, Time, and Memory".
With medians:
JS 7.25 times slower than C
TS 7.8 times slower than C
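A quick sketch of the mean-vs-median point above. The slowdown ratios here are made-up illustrative numbers, not values from the paper; a single outlier benchmark drags the arithmetic mean well away from the median:

```javascript
// Made-up per-benchmark slowdown ratios (relative to C), with one outlier.
const ratios = [4.0, 6.0, 7.0, 8.0, 9.0, 50.0];

const mean = xs => xs.reduce((a, b) => a + b, 0) / xs.length;

const median = xs => {
  const s = [...xs].sort((a, b) => a - b);
  const mid = s.length / 2;
  return s.length % 2 ? s[Math.floor(mid)] : (s[mid - 1] + s[mid]) / 2;
};

console.log(mean(ratios));   // 14 — dominated by the outlier benchmark
console.log(median(ratios)); // 7.5 — robust to it
```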
~
> all JavaScript is valid TypeScript

Except `--alwaysStrict` and `--use_strict`
So a JavaScript program may have failed as a TypeScript program, and a different program which worked as TypeScript may have been measured.
~
> not reproducible

The authors provided a repo, including test program source code, that is still available 5 years later.
page 3, footnote 1 "The measuring framework and the complete set of results are publicly available at https://sites.google.com/view/energy-efficiency-languages"