I think the result is overcooked. Their Hypothesis 1 is at least open to challenge, in that I don't think there is a widespread reproducibility crisis. I have been unable to reproduce results a couple of times in my career, but each time that was due to deliberate naughtiness on the authors' part or incompetence on mine. Almost always you can reproduce, and when I have run into trouble the authors have almost always helped out (most people are just delighted that you are interested!).

On the other hand, this paper is very useful: I think it will be used to establish better criteria for papers in the future. I often reject papers because they make no claim and have no results, contribution, or conclusions (which makes reviewing them quick, so I really like papers like this!). I think it would be harsh to reject a paper outright because its hardware setup is poorly documented, but it would be reasonable to ask for that change before publication, for example. I agree with the authors that their criteria are useful.
One issue, though: open-sourcing software is a good aspiration, but it's not always possible due to IP and licensing constraints, and in some cases export controls (not only US ones; other jurisdictions impose them too). If the community insists on open source before publication, some important work is not going to get published.