Comment - The Many Labs Project and the importance of replication in Social Psychology

The current issue of the BPS’s Psychologist magazine features an article on the Many Labs project (the in-press manuscript can be found here: https://openscienceframework.org/project/WX7Ck/files/ManyLabsManuscript.pdf/).

As Honorary Secretary of the Social Psychology Section of the BPS, I have been asked to provide a comment on this initiative.

Below is a more elaborate version of my comments.

There are at least two ways to assess the strength and solidity of an effect: one is to perform a meta-analysis (i.e. to statistically combine the results of a series of studies which included the effect of interest); the other is to reproduce the study experimentally and see whether the same effect appears in a new sample. The present study attempted a large-scale replication of some very popular effects in social psychology.
Replication is at the basis of scientific progress: the fact that we find a certain effect in a certain study does not mean the effect is present in general. It could be an oddity of the sample, or it could be linked to the specific conditions in which the study was run.
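
To make the idea of "statistically combining" results concrete, here is a minimal sketch of one common approach, a fixed-effect meta-analysis with inverse-variance weights. The function name and the numbers are purely illustrative assumptions; they are not taken from the Many Labs data, and this is not the analysis used in the manuscript.

```python
import math


def pool_fixed_effect(effects, std_errors):
    """Pool study effect sizes using inverse-variance (fixed-effect) weights.

    More precise studies (smaller standard error) receive more weight.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical effect sizes (e.g. Cohen's d) and standard errors from three studies
effects = [0.40, 0.10, 0.25]
std_errors = [0.20, 0.10, 0.20]

pooled, pooled_se = pool_fixed_effect(effects, std_errors)
print(f"pooled effect = {pooled:.3f}, standard error = {pooled_se:.3f}")
```

Note that the pooled estimate is pulled towards the second, most precise study, which is why a meta-analysis is only as good as the set of studies it can retrieve.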

Compared with a meta-analysis, the project has the advantage of guaranteeing that standardised procedures were adopted in each replication, although this meant that at times the design had to be over-simplified. It also has the advantage of getting around the ‘file drawer effect’, that is, the fact that most studies that fail to replicate an effect do not get published and are therefore difficult to retrieve for meta-analytic purposes.

This ambitious project shows the importance of having a scientific community which engages in collaborative research and joins forces and resources in the common pursuit of knowledge. Moreover, the transparency with which the data, their origin and their analysis are shared with the public is commendable and should set the standard for future work.
Of course, no study is perfect, and this project too has room for improvement, but it provides some solid ground to build upon.

Some possible areas of improvement are:

– Over-simplification: the effects that were successfully replicated are very basic

– Context in/dependence: the two studies which failed to replicate might depend more heavily on contextual factors (e.g. the lack of support for the flag effect could be due to the ‘Obama effect’ in the US and to the fact that political discourse in general has changed its focus since the original experiment was run; as for the currency study, the recent economic downturn might have weakened the link between money and system justification). This does not mean that the effect does not exist: it could exist under certain circumstances.

– Westernisation: of the 36 samples, only three were non-Western (Turkey, Malaysia and Brazil) and three came from Eastern Europe (two in the Czech Republic and one in Poland), so the applicability of the findings outside the Western world still needs to be addressed

– For the study including the IAT: because the measure is a contrast score, we cannot tell whether the observed difference is driven by a worse evaluation of maths or by a better evaluation of arts, so I am particularly unconvinced of the solidity of the Gender differences in maths attitudes study
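
The point about contrast scores can be illustrated with a deliberately simplified sketch. The real IAT is scored from response latencies, not from separate ratings; the function, the scale and the numbers below are hypothetical and serve only to show why a difference score cannot tell the two explanations apart.

```python
def contrast_score(maths_evaluation, arts_evaluation):
    """Simplified contrast: evaluation of maths relative to arts."""
    return maths_evaluation - arts_evaluation


# Hypothetical evaluations on an arbitrary scale where 5 is neutral.
# Case A: maths is evaluated negatively, arts is neutral.
negative_maths = contrast_score(maths_evaluation=2, arts_evaluation=5)
# Case B: maths is neutral, arts is evaluated positively.
positive_arts = contrast_score(maths_evaluation=5, arts_evaluation=8)

# Both cases produce the same contrast score, so the score alone
# cannot say which of the two evaluations drives the result.
print(negative_maths, positive_arts)
```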