I think you've got it - there are many ways for a study to fail, and it's hard, at this time, to validate that a study succeeded other than by having multiple studies confirm the original finding.
Even then, I would worry that the results may be caused by some confounder in the original dataset/design rather than by a real effect you can trust.