In this case, does "deceivingly robust" mean they look robust but are actually fragile? Or does it mean they look fragile but are actually robust?
This isn't a criticism of you, soundsop. Rather, it's meant to keep pointing at how difficult it can be to deliver a message concisely.
---
edit: sounds like the correct interpretation of the title is "P-hacked hypotheses appear more robust than they are."
For example, they used this quote as an example of "deceptively" meaning "in appearance but not in reality":
> It’s no mystery why images of shocking, unremitting violence spring to mind when one hears the deceptively simple term, “D-Day.” [Life]
But the term "D-Day" is simple. It's deceptive because it might wrongly lead you to think the event it refers to is also simple.
Similarly, if something is "deceptively simple-looking", it really is simple-looking; it's just not simple.
https://languagelog.ldc.upenn.edu/nll/?p=3500
https://brians.wsu.edu/2016/05/25/deceptively/
https://www.academia.edu/37488247/The_Deceptively_Simple_Pro...
Shoot, even Oxford gives exactly opposite definitions of the word:
https://www.oxfordlearnersdictionaries.com/us/definition/eng...
---
I mean, even when I saw the title of the thread, I felt an obligation to click (clickbait, I guess?), because I could have interpreted the title either as a warning about p-hacking or as an endorsement of the practice. In fact, at first glance, I read the title as "P-hacked hypotheses appear less robust than they actually are."
That's kinda ... useful, actually.
It feels like this is sort of the same issue as overfitting in ML. Attempts to use overfit ML models predictively often fail in hilarious ways.
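A toy sketch of that analogy, assuming Python with NumPy (the sample sizes and polynomial degree are arbitrary choices of mine, not anything from the article): fit a flexible model to pure noise and it "explains" the data that produced it almost perfectly, then falls apart on fresh data.

    import numpy as np

    rng = np.random.default_rng(0)

    # Pure noise: x carries no information about y at all.
    x_train, y_train = rng.uniform(-1, 1, 15), rng.normal(0, 1, 15)
    x_test,  y_test  = rng.uniform(-1, 1, 15), rng.normal(0, 1, 15)

    # A high-degree polynomial happily near-interpolates the training noise...
    coeffs = np.polyfit(x_train, y_train, deg=10)

    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse  = np.mean((np.polyval(coeffs, x_test)  - y_test)  ** 2)

    print(train_mse)  # small: the model looks convincing in-sample
    print(test_mse)   # typically huge: it predicts nothing out-of-sample

Same shape as a p-hacked result: near-perfect agreement with the data that produced it, and no power on anything else.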
To insist that p-hacking, by itself, implies pseudo-science is fetishism. There is no substitute for understanding what you are doing and why.
I think actually relying on "conceptual replications" in practice is impossible. If the theory is only coincidentally supported by the original data, the conceptual replication is also more likely to pass the p < .05 threshold by coincidence, in a way that is very difficult to analyze.
The author mentions that problem, but doesn't mention a bigger one: if you think people are unlikely to publish replications using novel data sets, imagine how vanishingly unlikely it is for them to publish failed replications with the original data set! If you read a "replicated" finding of the same theory on the same data set, you can safely ignore it, because 19 other people probably tried other related "replications" and didn't get them to work.
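You can put a rough number on that intuition. A minimal simulation, assuming Python with NumPy/SciPy (the 20 analyses echo the "19 other people" above; the sample size of 30 and the t-test are arbitrary stand-ins): in a null world where nothing is real, a group that quietly tries 20 analyses finds at least one "significant" result about 64% of the time.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    trials, analyses = 10_000, 20  # simulated labs, analyses tried per lab
    hits = 0

    for _ in range(trials):
        for _ in range(analyses):
            # Null world: both samples come from the same distribution,
            # so any p < .05 here is a pure false positive.
            a = rng.normal(0, 1, 30)
            b = rng.normal(0, 1, 30)
            if stats.ttest_ind(a, b).pvalue < 0.05:
                hits += 1
                break  # this lab stops at its first "significant" result

    print(hits / trials)  # ~0.64, matching 1 - 0.95**20

Real re-analyses of a single data set are correlated rather than independent, which is exactly the "very difficult to analyze" part, but the independent case already shows how cheap one "significant" result is.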