This is practically required by reviewers and editors. If you wade into a topic area, you need to review the field and explain where you fit in, even though you know full well many of those key citations are garbage. You basically need to pay homage to the "ground breakers" who claimed that turf first, even if they did it via fraud. They got there first, got cited by others, and so are now the establishment you are operating under.
And making a negative reference to them is not a trivial alternative. For one thing, you need to be certain, not just deeply suspicious of the paper, which adds work; and taking a stand may bring a fight with reviewers that hurts you anyway.
Even referring to it as “science” is fraudulent. Testable theories and repeatable outcomes, anyone? Time this whole field was defunded.
"It should be noted that the results cannot be estimated using a physician fixed effect due to a numeric overflow problem in Stata 15 which cannot be overcome without changing the assumptions of the logit model."
... The sad part was they didn't even choose a reasonable model in the first place.
The Stata thing was just one of many, many red flags.
Edit: given all this talk of reproducibility, I wonder what percentage of cutting-edge ML research is reproducible (whether because of non-public training sets or the compute required).
Other CS subfields that get a lot of criticism are "network science" and bioinformatics.
Clinical trials can often be flawed, even if the stats are fine, just in how they sample. For example, women are often excluded from trials due to hormonal changes, but how drugs impact women is really important! Participants are also typically drawn from specific locations, and so may not be representative of people with different diets, lifestyles, and environmental factors.
Physics has its own controversies, though not always directly related to replication. For example, Harry Collins recounts the social factors involved in the discovery of gravitational waves: https://blogs.sciencemag.org/books/2017/03/28/harry-collins-...
"If the original study says an intervention raises math scores by .5 standard deviations and the replication finds that the effect is .2 standard deviations (though still significant), that is considered a success that vindicates the original study!"
Why the exclamation point here? The replication study isn't magically more accurate than the original study. If the original paper finds a 0.5 standard deviation effect and the replication study finds a 0.2 standard deviation effect, that increases our confidence that a real effect was measured, but there's no reason to believe the replication is more accurate than the original. Maybe the true effect is smaller than measured, but maybe not. So yes, it should be considered a success.
1. Effect size is the most important thing. The point of the study is (usually) to guide decisions. Sticking with the article's example, let's say combining both studies shows the increase is likely 0.35 standard deviations. Is the intervention still worth the cost? Is it still the best option?
2. If there's enough data (e.g., an observational study) or a good chance of omitted variables, there's going to be a "statistically significant" difference. No matter what's measured. I would bet my life's savings there's a statistically significant difference in profits of New York businesses depending on whether the owner's named Jim or Bob. A replication of the experiment with all Jim and Bob businesses in another state would also guarantee significance. So it's a coin toss whether the second study would "successfully replicate" the same direction of effect.
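To make the point concrete, here's a minimal sketch (my own toy numbers, not real business data): with a large enough sample, a gap worth a hundredth of a standard deviation still clears the p < 0.05 bar.

    # Toy illustration: with enough data, a practically meaningless
    # difference still comes out "statistically significant".
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 500_000  # imagine this many businesses per owner name
    jim = rng.normal(loc=100.0, scale=20.0, size=n)   # profits, arbitrary units
    bob = rng.normal(loc=100.2, scale=20.0, size=n)   # true gap: 0.01 SD

    t, p = stats.ttest_ind(jim, bob)
    print(f"observed gap = {bob.mean() - jim.mean():.2f}, p = {p:.2e}")
    # p lands far below 0.05, yet the effect is useless for any decision.

And with the gap set to exactly zero, roughly one run in twenty would still come back "significant", which is the other half of the problem.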
That doesn’t mean they’re wrong, necessarily. Overcoming inertia is a huge challenge. Daunting, even.
Overall, the condescending trash talking in this article led me to flag it.
An exclamation point as criticism?
>How dare he wear tweed, his argument is invalid.
>The replication study isn't magically more accurate than the original study. If the original paper finds a 0.5 standard deviation effect and the replication study finds a 0.2 standard deviation effect, that increases our confidence that a real effect was measured
It also increases our confidence that the effect is small enough to be ignored. You can't pretend that the two studies are independent of each other. The second is a direct result of the first, and you need to use Bayesian methods to calculate your belief in the result. The questions 'is there an effect?' and 'is the effect size >= 0.5 sd?' give you two vastly different probabilities and vastly different policy responses.
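A quick sketch of what that Bayesian reasoning looks like (the standard errors below are made-up assumptions, just for illustration): pool the two estimates into a posterior, then ask the two questions separately.

    # Combine the two noisy estimates (0.5 and 0.2 SD) under a flat prior with
    # normal likelihoods; standard errors are assumed for illustration only.
    import numpy as np
    from scipy import stats

    estimates = np.array([0.5, 0.2])    # original study, replication (SD units)
    std_errs  = np.array([0.15, 0.10])  # assumed standard errors

    precisions = 1.0 / std_errs**2
    post_mean = np.sum(precisions * estimates) / np.sum(precisions)
    post_sd = np.sqrt(1.0 / np.sum(precisions))

    p_any = 1 - stats.norm.cdf(0.0, post_mean, post_sd)   # "is there an effect?"
    p_big = 1 - stats.norm.cdf(0.5, post_mean, post_sd)   # "is it >= 0.5 SD?"
    print(f"posterior mean = {post_mean:.2f} SD, "
          f"P(effect > 0) = {p_any:.3f}, P(effect >= 0.5) = {p_big:.3f}")

With those assumed numbers the posterior sits around 0.3 SD: "there is an effect" is near certain, while "the effect is as big as originally claimed" is very unlikely, and the right policy response depends on which question you actually care about.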
As in "eats, shoots, and leaves", a little punctuation can totally change the meaning of a sentence. In this case, a period would have expressed agreement while an exclamation point expresses incredulity.
The best sociological research I've read was qualitative, though. Questionable replicability is of course built into this type of research, but it dealt with relevant questions. Most quantitative sociology seems rather irrelevant to me.
Another problem is of course that most quantitative sociologists don't have a clue what they are doing. They don't even know the basics of statistics and then use statistical methods they don't understand. It's some kind of overcompensation, I think. Although psychologists are even worse in this respect. It's really fun to watch a psychologist torturing SAS.
I write this as someone who was originally trained as a sociologist and over the years turned into a data scientist.
I ask because I’m enrolled in a research program in “computational humanities”. My initial feeling towards the program is that it’s kind of a sham.
Computational Humanities seems to be as computational as an accountant using Excel for their work. Not that I particularly mind, I’m not very interested in the computational aspect at all.
Why did you enroll in the program when you're not interested in the computational aspect? Or are you more interested in some kind of grand social theory/philosophy of computation? If you read German, Armin Nassehi "recently" published "Muster. Theorie der digitalen Gesellschaft" (Patterns: Theory of the Digital Society). He is not the first, but I find his stance interesting, based on several interviews; I haven't read the book though. Many sociologists deal with the Internet & AI, but I find those works less inspiring because they usually lack an adequate technical understanding. To me it often feels like bushmen theorizing about empty Coca-Cola bottles (you probably don't know the movie?).
It seems not to be based on actual replication results, but on predicted replication results? But then the first chart isn't even predictions from the market, but just the author's predictions?
The author clearly has a real hatred for practices in the social sciences. But I don't see any actual proof of the magnitude of the problem, the article is mostly just a ton of the author's opinions.
Is there any actual "meat" here that I'm missing? Or is all this just opinions based on further opinions?
Per https://www.replicationmarkets.com/index.php/rules, volunteers are predicting whether 3000 social science papers are replicable. According to the rules, of those 3000 papers, ~5% will be resolved (i.e. attempts will be made to replicate them). According to the article, 175 will be resolved. It's unclear to me who exactly will do that work, but I would guess it's the people behind replication markets dot com (they are funded by DARPA). The rules say that no one knows ahead of time which papers will be resolved, so I assume the ~5% (or 175) will be chosen at random.
The data in the article seems to be based on what the forecasters predicted, not which papers actually replicated (that work hasn't been done yet...or at least hasn't been made public). The author of the article is assuming that the forecasters are accurate. To back up this assumption, he cites previous studies showing that markets are good at this kind of thing.
The tone is ranty but, by participating in the markets, the author is putting his money where his mouth is.
The before curves are Gaussian+ distributed and pessimistic, but the after curves are all distinctly bimodal (or worse). This suggests that one population of participants was made broadly more pessimistic by their surveys and another was made broadly more optimistic.
This could instead be a measurement of how people's trust in science is predicated on how well it matches their own prior beliefs.
+ A sharper eye shows some of the prior curves aren't quite unimodal either. Even in those cases, though, the separation between the modes gets much wider afterward.
The only thing that worries me a little (or a lot, sometimes) is that there doesn't seem to be much "bone" for the meat to hang off of. That is, in physics, if your theory doesn't match experiment it's wrong, whereas in social science you're never going to have a (mathematical) theory like that, so you have to start (in effect) guessing. The data is really muddy, and thanks to recent (good) political developments, whatever conclusions can be drawn from it may not be acceptable in some people's eyes. For example, (apparently) merely commenting on the variability hypothesis can get you fired [https://en.wikipedia.org/wiki/Variability_hypothesis#Contemp...].
I majored in Mathematics, but out of curiosity I took some Psychology modules when I was in university. What I found baffling was their lack of attention to detail. They just seemed to have an intuitive model of their subject and kept reinforcing that intuition while overlooking any details that could have challenged it. Coming from a field where every symbol and punctuation mark matters, I realised that to psychologists the exact details of a curve don't seem to matter much as long as the general trend makes sense.
Someone who really impressed me was Dan Ariely, a behavioural economist. Even though I didn't see any mathematics in his lectures, I loved his approach to the field. I'd be quite happy if more of social science took a similar approach, even if they didn't back it up with rigorous mathematics.
He mentions changing the threshold for significance as a possible tweak, but the issue is something more fundamental. Humans have flaws, like political biases or a tendency to favor one's own hypotheses (confirmation bias). Humans also operate within systems whose incentives can push them away from truth seeking (publication bias). All this exacerbates the fundamental problem that statistical techniques are easy to manipulate. Virtually all academic (university) studies, in their published format, simply lack the necessary information, controls, and processes a reader would need to easily detect flawed statistical claims. Instead a reader has to blindly trust, assuming that data was not selectively included/excluded, that the parameters of the experiment were rigorously (neutrally) chosen, and so on. There is no incentive for the academic world to correct for this; there isn't, for example, a financial consequence for a decision based on bad statistics, as a private company might face.
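For a concrete feel of how easy the manipulation is, here's a toy sketch (my own example, not anything from the article): measure enough pure-noise outcomes and selective reporting will usually hand you something "significant".

    # Researcher degrees of freedom in miniature: 20 outcomes, no real effect
    # anywhere, report only whatever happens to clear p < 0.05.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(42)
    n_subjects, n_outcomes = 40, 20
    treatment = rng.normal(size=(n_subjects, n_outcomes))
    control = rng.normal(size=(n_subjects, n_outcomes))   # identical distribution

    pvals = [stats.ttest_ind(treatment[:, k], control[:, k]).pvalue
             for k in range(n_outcomes)]
    print("p-values below 0.05:", sum(p < 0.05 for p in pvals))
    print("smallest p-value:   ", round(min(pvals), 3))
    # At alpha = 0.05 you expect about one false positive per run of 20 tests,
    # and the published paper only has to mention that one.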
However, the damage has been done, and it doesn't matter if MOST work is done in good faith when the bad work has a big impact. As an example, IATs have been used to make claims about unconscious biases and form the academic basis of books like "White Fragility" by Robin DiAngelo. Quillette wrote about problems with White Fragility and the IAT as early as 2018 (https://quillette.com/2018/08/24/the-problem-with-white-frag...), and others continued to write about it as recently as 2020 (https://newdiscourses.com/2020/06/flaws-white-fragility-theo...). However, few people are exposed to these critical analyses; the flaws in the scientific/statistical underpinnings have not mattered and have not stopped books like White Fragility from circulating by the millions.
We need a drastic rethink of academia, the incentives within it, and the controls that regulate it to stop the problem. Until then, it’s simply not worth taking fields like social science seriously.
Most analyses of the problems in science are really analyses of the problems in academia. There's no iron law that says academia has to be funded at the level it is today, and for most of history it wasn't. And let's recall that these meta-studies are all about science, which is one of the better parts of academia. Once you get outside that into English Lit, gender studies, etc., the whole idea of replicating a paper ceases to be meaningful because the papers often aren't making logically coherent claims to begin with.
A lot of people look down on corporate funded science, but it has the huge advantage that discoveries are intended to be used for something. If the discovery is entirely false the resulting product won't work, so there is a built-in incentive to ensure research is telling you true things. The downside is there's also no incentive to share the findings, but that's what patents are for.
Of course a lot of social psych and other fields wouldn't get corporate funding. But that's OK. That's because they aren't really useful except for selling self-help books, which is unlikely to be a big enough market to fund the current level of correlational studies. That would hardly be a loss, though.
There were scientists who received financial backing from wealthy individuals in a manner not so different from how VCs operate today; Tesla among them.
Regardless, I tend to agree that science that exists for the sake of publishing, because publishing is a requirement of receiving grants, has diluted the respectability of science.
As an amusing nudge, I bet you could do some ML to predict replicability of a paper (per author's suggestion that it's laughably easy to predict) and release that as a tool for authors to do some introspection on their experimental design (assuming they're not maliciously publishing junk).
> I bet you could do some ML to predict replicability of a paper (per author's suggestion that it's laughably easy to predict)
I am betting any such ML system could be gamed and addressing the issue would ultimately still need humans in the loop. For example, what if I am selective with my data, beyond the visibility of ML evaluating the final published paper? I don’t think this is “laughably easy” to predict. It may be easy to spot telltale signs today that predict replicability, but as soon as those markers are understood, I imagine authors will simply squeeze papers through the cracks in a different way.
Another issue is this bit from the author on Twitter:
> Just because it replicates doesn't mean it's good. A replication of a badly designed study is still badly designed. There are tons of papers doing correlational analyses yet drawing causal conclusions, and many of them will successfully replicate. Doesn't mean they're justified.
Just like with the Netflix Prize stuff, where the conclusion was very similar, i.e. just dump in as much data as you can, crank up the ML machinery, and it'll discover the features (better than you can engineer them) and learn what to use for recommendation ranking. And that's basically what we see with GPT-3 too. If there are useful labels in the data, it'll learn them even without supervision, because it has so many parameters that the signal basically sticks.
Get some papers, run them through a supervised training phase where every paper is scored by how retracted/bad/non-replicating it is, and you'll get a great predictor. Then run it to find papers that stick out, have a human look at them, and try to replicate some of them to fine-tune the predictor. Plus continue to feed it new replication results.
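As a rough sketch of that pipeline (the data loading is hypothetical; there's no such labeled corpus here, and a bag-of-words baseline stands in for the heavy ML machinery):

    # Assume `abstracts` is a list of paper abstracts and `replicated` is a 0/1
    # label from replication attempts or retractions; both are hypothetical.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline

    def train_replication_predictor(abstracts, replicated):
        """Fit a text baseline that scores papers by predicted replicability."""
        model = make_pipeline(
            TfidfVectorizer(ngram_range=(1, 2), min_df=5),
            LogisticRegression(max_iter=1000),
        )
        auc = cross_val_score(model, abstracts, replicated,
                              cv=5, scoring="roc_auc").mean()
        model.fit(abstracts, replicated)
        return model, auc

    # Papers the model scores as unlikely to replicate get a human review and,
    # ideally, an actual replication attempt, which then feeds back as a label.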
That said, using an ML system as the gatekeeper as OP suggested is a bad idea, as it'll quickly result in the loss of proxy variables' predictive power.
Though ultimately a GPT-like system has the capacity to encode "common sense".
This is the quiet part which most social scientists, particularly psychologists, don't want to discuss or admit: WEIRD [0] selection bias massively distorts which effects are inherent to humans and which are socially learned. You'll hear people today crowing about how Big Five [1] is globally reproducible, but never explaining why, and never questioning whether personality traits are shaped by society; it's hard not to look at them as we look today at Freudians and Jungians, arrogantly wrong about how people think.
[0] https://en.wikipedia.org/wiki/Psychology#WEIRD_bias
[1] https://en.wikipedia.org/wiki/Big_Five_personality_traits
The Big Five are pretty reproducible in part or in whole, but it's a strawman to say psychologists are "never questioning whether personality traits are shaped by society." That's just not true, nor is it even clear what that question means. Go to Google Scholar and search for "Big Five" and terms like "measurement invariance" or "cultural" or "social" or "societies" and take a look.
The Big Five are meant to be descriptive; the "why" is a different issue. (Just to explain it a different way: say you do unsupervised learning on cat images and find, over and over and over again, across decades and different databases, that the algorithms always return the same 5 types of cats, plus or minus a little. Wouldn't you make a note of it if you were interested in visual types of cats?) And it's important to remember that consensus around the Big Five didn't really form until the 90s (even today I'm not sure there's "consensus" around the Big Five).
I agree that there's a problem with the selection of participants, but the only way to fix that is to increase participation from the scientific community worldwide. And there are whole fields (cultural psychology) dedicated to the problems surrounding this issue.
The Freudian comparison is also worth commenting on in two respects: first, Freudians got in trouble for not pursuing falsifiable empirical research, which is simply not the case for the things you're talking about. Second, everyone loves to hate on Freud, but the basic tenets of unconscious versus conscious processes that sometimes conflict are still a bedrock of neurobehavioral research, including two-system theories ("fast and slow"), which won someone a Nobel prize and are a darling of cognitive researchers. There are legitimate discussions to be had about the utility of two-system theories, but those discussions are far more sophisticated than the criticisms I think you're referring to.
Given these foundational issues, it's folly to try to support Big Five or any other descriptive model just by saying that it's a good fit for the numbers. Any principal component analysis will find something which factors out as if it were a correlative component. This dooms Big Five just as reliably as it dooms g-factors or Myers-Briggs or any other astrology-like navel-gazing.
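A toy illustration of that point (mine, not the parent's data): PCA happily extracts "factors" from pure noise, so a stable-looking factor solution is not, by itself, evidence of underlying psychological structure.

    # 500 fake respondents answering 40 survey items that are pure noise.
    import numpy as np
    from sklearn.decomposition import PCA

    rng = np.random.default_rng(1)
    answers = rng.normal(size=(500, 40))

    pca = PCA(n_components=5)
    pca.fit(answers)
    print("variance 'explained' by five factors:",
          pca.explained_variance_ratio_.round(3))
    # Each component soaks up a bit of variance and has loadings you could
    # happily name, even though the data contain no structure at all.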
(If you want examples of actual five things showing up again and again and again, mathematics has them [3][4][5], but it turns out that when actual five things show up, the reaction is not to serenely admire the correlation but to admit terror before cosmic uncertainty. Psychologists do not seem to go insane and kill themselves the way statistical mechanicians or set theorists have; have they really seen the face of god?)
[0] https://en.wikipedia.org/wiki/Philosophical_zombie
[1] https://en.wikipedia.org/wiki/Dodo_bird_verdict
[2] https://en.wikipedia.org/wiki/Cartesian_theater
[3] https://en.wikipedia.org/wiki/ADE_classification
[4] https://en.wikipedia.org/wiki/Monstrous_moonshine
[5] https://en.wikipedia.org/wiki/Classification_of_finite_simpl...
[6] https://en.wikipedia.org/wiki/Hard_problem_of_consciousness
https://carcinisation.com/2020/07/04/the-ongoing-accomplishm...
> The interesting thing about the Five Factor Model is what it gets away with, in terms of being considered a theory, even though it is not causal, and makes no predictions. What counts as a “replication” of the Five Factor Model, as in Soto (2019), is the following: a correlation is found between one or more factors of the Five Factor Model and some other construct, and that correlation is found again in another sample, regardless of the size of the correlation. In almost all cases, and in 100% of Soto (2019)’s measures, the construct compared to a Big Five factor is derived from an online survey instrument.
> What counts as a “consequential life outcome” is also fascinating. In most cases, the life outcome constructs are vague abstractions measured with survey instruments, much like the Big Five themselves. For instance, the life outcome “Inspiration” is measured with the Inspiration Scale, which asks the subject in four ways how often and how deeply inspired they are. Amazingly, this scale correlates a little bit with Extraversion and with Open-mindedness. Do these personality traits “predict” the life outcome of inspiration? Is “Inspiration” as instrumentalized here meaningfully different from the Big Five constructs, such that this correlation is meaningful?
It seems built into human character to bite off far more than we can chew, as if it were free real estate, and then leverage the social value of holding something others are willing to compete for. I think it amounts to a social survival instinct, and I lament that there's very little chance of discouraging people from doing it, given the potential payoff. If anything, I think it's a failure of institutions to be built to exploit that competition rather than to guard against its excesses.
People who view themselves as rational/technical might be even more prone to not realizing how much they are affected by this? If your self-image is that you are a very rational person (more rational than others), you might be especially prone to denying, and therefore not being aware of, your biases.
Most new social science research is wrong. But the research that survives over time will have a higher likelihood of being true. This is because a) it is more likely to have been replicated, b) it's more likely to have been incorporated into prevailing theory or, even better, to have survived a shift in theory, and c) it is more likely to have informed practical applications or policy, with noticeable effect.
Physics and other hard sciences have a quick turnaround from publication to "established knowledge". But good social science is Lindy. So skip all the Malcolm Gladwell books and fancy psych findings, and prioritize findings that are still in use after 10 or 20 years.
Not if this article is to be believed! He claims that studies that could not be replicated are about as likely to be cited as studies which are. That implies the problem may instead get worse and worse, the structure more and more shaky as time goes on.
Here, the author seems to only look at recent papers, so we don't really get to see how the citation patterns have evolved over 10, 20, or 30 years. But even then, established ideas tend not to be cited at all: the concept of "knowledge spillovers", for example, is common in Economics and other fields, yet the original reference is rarely used. Other times, more established claims get encoded in a book or some work of theory, and people cite the theory rather than the paper that made the original claim.
Social science asks more of us than any other science. Physics demands that we respect electricity and not increase the infrared opacity of the atmosphere. Chemistry requires that we not emit sulfur and nitrogen compounds into the air. But the social sciences will not infrequently call for the restructuring of the whole society.
This is the "problem" with social science, or more properly, with the relationship between the social sciences and the society at large. When we call for "scientific" politics, it is a relatively small ask from the natural sciences, but it is a revolution -- even the social scientists themselves use this word -- when the social sciences are included in the list (Economics is no different). Psychology, as usual, falls somewhere in between.
So the relationship between the social scientists and the politicians may never be as cordial as the relationship between the natural sciences and the politicians. The "physics envy", where social scientists lament that they do not receive the kind of deference that natural scientists do, will have to be tempered by the understanding that the cost of such deference differs widely.
(All of this is ignoring that physics had a 200-year head start)
Meta-science has always been the gift of social science. This will all eventually funnel down elsewhere, just like meta-analysis.
But you're right, in that social science hits very close to home, more so than other sciences. Imagine that it suddenly worked very very well, and someone in the field of neuropsychology could manipulate behavior just like you might a lightbulb. Isn't that what critics are really asking for?
Physics does no such thing. It tells us that increasing the heat retained in the atmosphere increases the planet's surface temperature. It is a descriptive science, not a prescriptive one. Wanting industrial civilization to still be possible in the next century is why you don't increase the infrared opacity of the atmosphere. But that is a value judgment far outside the scope of physics, and one the social sciences claim is theirs by right of ... something.
The metaphors people use to think about the natural world are terrible, or as Carl Sagan put it Demon-Haunted.
The reason why physics, and other hard sciences, are so useful and respected is that you can switch dependent and independent variables around with a lot of success.
If I have the ideal gas law:
PV = nRT
Then I can rearrange it and be fairly confident it still works.
P = nRT/V
If you are an engineer this is a godsend. You want to set a hard value for P but can only directly control V or T? Try the second equation! You have a chance at succeeding without having to spend decades building machines that blow up and kill everyone around them!
Politicians see that and are jealous. Surely if those lame eggheads can get things to work like that we can too. So the social sciences give you equations as well. After a bunch of statistics we see that:
time spent in school = a*wealth - c
We can't control wealth, but we can control how long people spend in school:
wealth = (time spent in school + c)/a
So if we force everyone to stay in school until they are 50 everyone will have 20 million dollars in their bank accounts.
And to anyone who asks how this works, politicians say: Why are you against science and hate poor people?
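A toy simulation of that satire (my own made-up numbers, purely illustrative): in a world where wealth causes schooling, the fitted equation looks great and inverts algebraically, but mandating schooling does nothing to wealth.

    # Wealth is the cause, schooling the effect; then a politician inverts the fit.
    import numpy as np

    rng = np.random.default_rng(7)
    n = 10_000
    wealth = rng.normal(50, 15, n)                   # arbitrary units
    school = 5 + 0.2 * wealth + rng.normal(0, 2, n)  # years of schooling

    a, c = np.polyfit(wealth, school, 1)             # fit: school ~ a*wealth + c
    print(f"fit: school = {a:.2f}*wealth + {c:.1f}, "
          f"r = {np.corrcoef(wealth, school)[0, 1]:.2f}")

    # Invert it and mandate 20 years of school for everyone.
    predicted_wealth = (20 - c) / a
    print(f"wealth 'predicted' by the inverted equation: {predicted_wealth:.0f}")
    print(f"actual mean wealth after the mandate:        {wealth.mean():.0f}")
    # Wealth never depended on schooling in this world, so it doesn't move.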
Causality is not established via tweaking a correlation or regression analysis, and we social scientists should know that.
I am not familiar with this work. What exactly makes a paper predictably replicable?
The story of Millikan's oil drop experiment replications and also James Randi's (and CSICOP's) battle with pseudo-scientists convince me of this.