I do think it makes sense to stick with the PhD meaning that you're a world-class expert in some area, but if so, then we need to adjust our expectations for what Master's-level work means in the sciences. Right now it seems to just represent a hurdle you need to whiz past on your way to the PhD.
The problem is that grade inflation means the majority of students will fall short of these goalposts. I agree with your assessment that undergrad degrees represent hurdles, regardless of whether a student plans to stay in academia.
My experience of getting a Master's degree was that it was really tough work that required my full dedication for two years. But I had a world-class scientist as an advisor breathing down my neck the whole time and expecting results, and my experience doesn't seem to match that of many other MScs I know. Some departments seem to be "degree factories"; it takes an unreasonable amount of effort to follow up with students in the classical "apprenticeship" tradition described by GP. It would be very strange if every department at every university managed this level of dedication, with student numbers being what they are.
I'm having trouble believing this.
In the UK, most Master's degrees last one year, and several are considered good, such as Imperial's MSc in Machine Learning, Cambridge's MPhil in Machine Learning, Speech and Language Technology, and Edinburgh's MSc in Cognitive Science. Is it really possible to become a "master" of machine learning in one year?
Also, at least in Computer Science, most of the Bachelor's degrees considered good in the UK do not seem to focus at all on replicating science. In fact, for my final-year undergraduate project, I was encouraged to find something novel, and at no point did my supervisor hint towards focusing on replicability.
Is it perhaps more common in the US?
Of course, you can define the term "master" to mean pretty much whatever you like. But I'd say that two years of additional, focused study, when you are already proficient in your field, should be more than enough to achieve a mastery of specific skills and knowledge that is at least at a high national level. I'm from Norway, so the US picture is unknown to me.
Actually, I think farming out replication to undergrads would be an excellent approach. Your final-year undergrad project should be to choose an under-replicated study and repeat it, publishing your findings. Each individual study might be less reliable when done by an undergrad than by a seasoned researcher, but if each study is repeated by, say, five undergrads and two or more of them fail to replicate the results, that would be enough to indicate that the study warrants further attention.
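To put rough numbers on that "2+ of 5 fail" threshold, here's a minimal sketch. The failure rates are purely hypothetical assumptions (15% for a sound study failing a noisy undergrad replication, 80% for a genuinely flawed one), not figures from any actual data; the point is just that the binomial tail separates the two cases pretty well even with unreliable individual replications.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability that at least k of n independent replications fail,
    when each fails independently with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n, k = 5, 2  # five undergrad replications, flag the study if 2+ fail

# Assumed rates, for illustration only:
# a sound study fails a noisy replication 15% of the time,
# a flawed study fails 80% of the time.
print(prob_at_least(k, n, 0.15))  # false alarm rate for a sound study, ~0.165
print(prob_at_least(k, n, 0.80))  # detection rate for a flawed study, ~0.993
```

Under those assumed rates, a sound study gets spuriously flagged about one time in six, while a flawed one is caught over 99% of the time, which is fine given that a flag only means "warrants further attention", not "retract".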
> One student could spend their time working on an important result, gaining tons of insight and expertise, while another could be stuck replicating a task that turned out to be worthless bullshit and have very little to show for it.
The whole point of science is that we don't know what will turn out to be an important result and what will turn out to be worthless bullshit. No study is worth a damn unless it's been replicated, but everyone is too busy trying to land-grab the next little piece of unexplored territory to actually validate anything that came before.
If nothing else, we need to regain the perception that a negative result is just as important as a positive one: to paraphrase Edison, discovering 100 things that don't work is just as important as discovering the one thing that does.