Values in science? Diversity, scientific objectivity and criticism

One of the liveliest discussion topics in the philosophy of science over the last three decades is whether non-epistemic (or non-cognitive) values can have a legitimate role in science; that is, not whether they do intervene but whether they should. It is much less controversial that non-epistemic values influence how scientists choose their topics, the social/institutional organization of scientific activity, or how scientific findings are employed in practical applications or policy-making. By a legitimate or illegitimate intervention people usually mean intervention in the context of theory appraisal (including data collection and analysis strategies), that is, in the scientific evaluation of the empirical support of scientific claims and their acceptance or rejection.

This philosophical discussion is directly related to several issues that the practicing scientist will also be immediately familiar with, such as debates on diversity/inclusivity and representation in science, or research on sensitive topics like race.

A significant number of different arguments have been advanced against the value-free ideal of science (that is, science free of non-epistemic values) and in favor of the intervention of non-epistemic values in science. One of the ‘hottest’ takes on this is what is called the ‘gap’ argument in Elliott’s (2022) Cambridge Element Values in Science. In this post I will try to sketch some preliminary points in defense of the value-free ideal against the gap argument, and do so from the perspective of Popper’s critical rationalism.

The gap argument is chiefly due to Longino (1990). It moves from the problem of the underdetermination of hypotheses by data to the claim that science should not be value-free. The problem of underdetermination refers to the impossibility of conclusively refuting or verifying hypotheses on the basis of empirical evidence alone, because we need to make certain auxiliary or background assumptions that link the data to the hypotheses (such as how best to measure a concept, or how to decide what is causally relevant and irrelevant in an experimental setting). Thus, there is an interpretive Spielraum in deciding what to take as evidence for or against a claim, depending on how we choose our auxiliary or background assumptions. And it is in this choice that we have a potential (for Longino, inevitable) value-encroachment from the non-epistemic realm.

Longino’s suggestion is to abandon the value-free ideal completely and embrace the value-ladenness of the innermost activities of science. This is not an argument against the objectivity of science though. Quoting from Elliott, “Longino argues that the best strategy for maintaining the objectivity and trustworthiness of science is not to try to eliminate values from science but rather to create a social context in which background assumptions and the values associated with them are subjected to critical scrutiny.”

The first claim by Longino is that we should not criticize scientific claims for being value-laden, because this is inevitable.

Firstly, value-freedom in science is not necessarily a descriptive claim. A descriptive claim would say that non-epistemic values never or rarely intervene in scientists’ reasoning. Such a claim should be studied empirically, and I would not be surprised if it turned out to be false today or in some period in the past. But the question itself is not very interesting from a philosophical perspective; it is probably more interesting to a sociologist or historian. Value-freedom is a normative claim, and people who endorse the value-free ideal can easily accept that non-epistemic values do encroach into science. The catch is that they view such encroachments as irrational influences or biases, which are by default detrimental to scientific objectivity. On the basis of the value-free ideal, we acquire a standing to criticize scientists when they make value-laden choices in the research process or in their appraisal of others’ claims. Under the value-laden model, we would lose this standing.

Secondly, value-free science does not mean value-free individuals. It may well be the case that it is absolutely impossible for individuals to be value-free. So, we do not have to assume or require that scientists show a skeptical attitude or impersonal detachment (cf. the Mertonian norm of “disinterestedness”). On the contrary, if we accept that individuals cannot be value-free, then the only way to scientific objectivity, namely through critical social interactions between scientists, can be conceived under the value-free ideal. Critical discussion reveals and problematizes the encroachment of values into scientific reasoning.

A prominent example is Popper’s critical rationalism, which holds that all values introduce bias into epistemic judgments, but that scientific objectivity arises out of a social process of criticism in which such biases are revealed from diverse standpoints. Value-diversity means that even if scientists are blind to their own biases (which is highly probable), they will not be blind to the biases of others. Thus, there will always be others in the scientific community who are not blind to the same biases, provided we have viewpoint diversity in science and no systematic obstacles in the way of criticism. If diversity decreases, or some views are shielded from criticism, the likelihood increases that the biases of certain worldviews will be accepted as part of scientific knowledge through scientific consensus, and scientific objectivity suffers.

What may be described as scientific objectivity is based solely upon that critical tradition which, despite all kinds of resistance, so often makes it possible to criticize a dominant dogma. In other words, the objectivity of science is not a matter for the individual scientist but rather the social result of mutual criticism, of the friendly-hostile division of labour among scientists, of their co-operation and also of their competition.

Popper (1994)

The value-free ideal is the only framework in which we can argue for diversity in science on epistemic grounds. When we take the framework of value-freedom away, diversity can only be valuable on other, non-scientific criteria (e.g., social representation), and we no longer have an argument for what makes science a special institution. Also, if diversity is not defended by virtue of its epistemic value, as an error-probing social mechanism or on some other epistemic ground, the implementation of diversity-promoting practices would be servile to whatever political leanings happen to be popular among scientists at the time.

The second claim by Longino is that scientific objectivity depends (or should depend) on critical scrutiny. This is already a meaningful claim under the value-free ideal, but it is misleading here, because Longino changes the accustomed meanings of both scientific objectivity and critical scrutiny, so that neither is grounded in epistemic values (alone). By critical scrutiny Longino does not (and cannot) mean an assessment of background assumptions based on epistemic values, which would directly undermine the value-ladenness argument and restore value-freedom. Instead, she does (and must) mean an assessment based on non-epistemic values.

Longino’s second claim actually means:

Scientific claims should be criticized based on non-epistemic values.

Since according to Longino we should not criticize scientific claims for being value-laden (which is inevitable), we can criticize them only for being laden with values we do not endorse. Who would do this criticizing, then? We can either appeal to majority or hegemonic values (social, political, ethical) as some sort of common sense, or say that everyone (or every group) who subscribes to divergent values should voice criticism.

Appealing to majority values has some obvious problems:

Values that are hegemonic in a society change over time.

If we take the values that shape the Zeitgeist as our benchmark in appraising scientific claims, our appraisal of scientific claims becomes essentially time- and fashion-dependent. This implies that, with no change to a given theory or set of data, their evidential relationship would change merely as a matter of social change.

Minority values would be disregarded.

This is not a consequence for science itself but for society at large. Science’s value-free ideal means that science is an impartial arbiter of facts to which different social groups can all appeal in resolving inter-group disputes. If the scientific “facts” are necessarily value-laden, being laden with majority or hegemonic values makes science likely an institution of oppression from the viewpoint of minorities. What is worse, minorities lose one of their strongest tools (i.e., scientific facts) to counter oppressive discourses and practices.

The distinction between is and ought would be blurred.

The value-free ideal means that science does not make any normative suggestions. It (ideally) just tells us what is the case, what would be the case if we did X, or how best to achieve Y. It does not tell us what should be the case, whether we should do X, or whether we should aim for Y. These latter questions are for society to decide. Under the value-laden model, we would lose this important distinction.

Appealing to divergent values has other problems:

Scientific objectivity would become parasitic on social consensus.

Decisions to accept or reject scientific claims would require a much broader agreement on the legitimacy of a certain set of values in relation to those claims. Scientific claims may often be highly controversial for a long time until a scientific consensus takes shape on their fate. But when non-epistemic values also take a legitimate seat at the table, the process of reaching scientific consensus would be essentially social/political, and such a consensus is notoriously difficult to reach (without autocratic interventions). If consensus is sought only within groups that comprise scientists with similar views, that would mean (by implication) accepting the legitimacy of “alternative facts” for different political groups.

A further, more problematic implication is that the very distinction between objectivity and consensus becomes vacuous. How can critical scrutiny lead to epistemic objectivity if it does not rely on epistemic values?

The problem of “voice”.

Political representation is already a thorny issue, and many viewpoints constantly get the short end of the stick. Following a value-laden view of science, one strategy to facilitate scientific objectivity (in Longino’s sense) could be to diversify the representation of non-epistemic values in the scientific community in a way that potentially goes well beyond the diversity of the broader society. This may sound like a reasonable approach at first sight, but it comes with the caveat that most decisions to hire, promote, reward or fund scientists would have to become political, rather than based on scientific competence or achievements. To maintain (non-epistemic) viewpoint diversity in science, we would have to actively engineer scientific communities in light of non-epistemic concerns. This would entrust science with a social mission it does not currently have; namely, creating a ‘better’ society within science to solve the problems of the broader society.

At this point we face a still bigger problem: a better society under whose light, for whose problems?

What counts as value diversity, and which values should be included or excluded, is highly context-dependent. The values represented (or that could be represented) in a scientific community in the US would be very different from those in a scientific community in eastern Europe, southeast Asia or Africa. Will we then operate with different senses of objectivity and different social mechanisms for critical scrutiny depending on the context? If so, we can end up with multiple, mutually conflicting scientific truths, and science would possibly lose any claim to universality and authoritativeness. Longino does not argue for such a radical epistemic pluralism, but she opens the way to this implication by not grounding the link between objectivity and critical scrutiny in epistemic values.

To sum it all up, the value-free ideal means neither that scientists are socio-politically, morally, or aesthetically disinterested, nor that they can be. The ideal of value-freedom in science means that what enters the canon as scientific knowledge should be free from the influence of non-epistemic values. One of the best arguments proposed so far for how science achieves (or can achieve) value-freedom, Popper’s critical rationalism, underlines the crucial role of criticism: value-freedom in science is a social achievement that is conditional on viewpoint diversity and on openness to criticism as an institutionally endorsed virtue. If we lose sight of this ideal, any sanctioning of the encroachment of values into science would lead to undesired consequences for everyone, including the critics of the value-free ideal. Last but not least, neither diversity nor criticism in science can preserve their meaning and significance if we abandon the value-free ideal.

References:

Elliott, K. C. (2022). Values in Science. Cambridge Elements in the Philosophy of Science. Cambridge University Press.

Longino, H. E. (1990). Science as social knowledge: Values and objectivity in scientific inquiry. Princeton: Princeton University Press.

Popper, K. R. [1962](1994). “The logic of the social sciences.” In In Search of a Better World: Lectures and Essays from Thirty Years, 64–81. Routledge.


Trust and criticism in science, Part III: Distribution of epistemic labor, responsibility and credit

It is often pointed out that scientific knowledge is social in character. This statement can mean several quite different things, such as that science is laden with political and moral values, that scientific knowledge is socially constructed, or that it is a public good governed by public interests. More often than not this characterization, usually offered by sociologists of science, is intended to indicate an external source of epistemic vulnerability. A less investigated but nowadays quite relevant sense in which science is social is that scientific knowledge production is not the work of individual geniuses but genuinely a collective achievement. This “epistemically” social character of science may go far beyond the cumulative character of scientific knowledge and come into full relief in research collaborations that distribute scientific labor in order to reach certain epistemic ends. This is typically the case when the investigation of a research question far surpasses the limited competence and cognitive capacities of individual scientists. Science is becoming increasingly social in this strictly epistemic sense, as more and more fields come to feature large research collaborations, or “team science”. However, this sense of sociality is not free from critique either, but this time by epistemologists.

Though there have been a few influential and enthusiastic accounts of collaborative science as distributed, social knowledge in the philosophy of science literature, such as those by Ronald Giere and Karin Knorr-Cetina, epistemological treatments of the topic are extremely limited and mostly skeptical. Due to the traditionally individualist perspective of the discipline of epistemology, collectivity is easily interpreted as a source of epistemic vulnerability. So, some doubt that research collaborations can produce knowledge reliably, and others argue that they undermine epistemic responsibility and thus accountability. I will defend, on the contrary, the view that collectives can actually produce knowledge even more reliably and responsibly than individuals do when certain conditions are in place. But we need to adopt a non-individualist epistemology to appreciate the opportunities presented by collaborative science.

Scientific inquiry as distributed cognition

Scientific inquiry has various dimensions, but at bottom it is a highly structured cognitive process. We intuitively think that cognitive processes are realized in the head, so scientific inquiry is something that happens in the head of the individual scientist. But it is often the case that such a complex form of cognition as scientific inquiry is simply impossible without substantial reliance on scientific instruments, computer programs and other experts. Scientists can tackle “big questions” by forming epistemic collectives where all these elements are organized into complex cognitive systems that generate knowledge at a supra-individual level. Such collectives present unique epistemic opportunities and challenges, both of which are due to the fragmented nature of the processes of knowledge production involved. Thus, it is worthwhile to have an illuminative conceptual framework to better analyze how knowledge is produced within research collaborations. The concept of distributed cognition provides us with such a framework.

Distributed cognition describes a situation where multiple agents collectively realize a cognitive task through dynamic interactions with one another and possibly with various artifacts. The task typically surpasses the cognitive capacities of any single individual.

Ron Giere, on the basis of his observations at the Indiana University Cyclotron Facility, and Karin Knorr-Cetina, on the basis of her field research stay at CERN, both described the experiments they examined in terms of distributed cognition. Giere wrote:

In thinking about this facility, one might be tempted to ask, Who is gathering the data? From the standpoint of distributed cognition, that is a poorly framed question. A better description of the situation is to say that the data is being gathered by a complex cognitive system consisting of the accelerator, detectors, computers and all the people actively working on the experiment. Understanding such a complex cognitive system requires more than just enumerating the components. It requires also understanding the organization of the components. And […] this includes the social organization.

In her influential book Epistemic Cultures, Knorr-Cetina similarly emphasized that knowledge is produced not at the level of the individual scientist but at that of the experiment:

The point is that no single individual or small group of individuals can, by themselves, produce the kind of results these experiments are after – for example, vector bosons or the long “elusive” top quark or the Higgs mechanism. It is this impossibility which the authorship conventions of experimental HEP exhibit. They signify that the individual has been turned into an element of a much larger unit that functions as a collective epistemic subject. […] No individual knows it all, but within the experiment’s conversation with itself, knowledge is produced.

In many other fields, from genetics to climate science, large research collaborations are becoming increasingly common. Moreover, they are not unique to certain fields of the natural sciences. There have recently been calls for big team science in psychology, and we have already begun to see various projects as well as standing initiatives that can come under this title. To name a few, ManyLabs[1][2][3][4] denotes several collaborative replication projects where individually produced datasets are pooled together, and the Psychological Science Accelerator (PSA) is a crowdsourced research network consisting of more than 500 laboratories in more than 70 countries that aims to enlarge and diversify samples. PSA bore its first fruit with a registered report, and here is a retrospect on the process. All these examples also deserve closer investigation from the perspective of distributed cognition.

When is scientific knowledge production socially distributed?

Scientific knowledge production is already social in certain epistemically relevant senses even without being a distributed cognitive process, because it involves epistemic dependence on others. First and foremost, scientists rely on the theories, findings, or protocols of many others, past and present. In the simplest possible case, if (i) A knows that p, and I know both (ii) that if p then q and (iii) that A knows that p, then I can be said to know that q through epistemic dependence on A. While A’s evidence for p is a first-order justification for believing that p, my reasons for believing that A knows that p constitute second-order justification. In the scientific context, second-order justification concerns assessments of reliability regarding the data, methods, instruments, or the track record of other experts as informants. It is close to impossible to find any example of scientific inquiry that does not feature similar kinds of epistemic dependence.
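To make the structure of this simple case explicit, we can use a schematic shorthand (merely an informal notation for this post, not a full-blown epistemic logic): writing $K_X\,\varphi$ for “X knows that $\varphi$”, the inference runs roughly as

$$K_A\,p, \qquad K_I(p \rightarrow q), \qquad K_I(K_A\,p) \;\Longrightarrow\; K_I\,q,$$

where getting from $K_I(K_A\,p)$ to $K_I\,p$ assumes that I also know that knowledge is factive ($K_A\,p \rightarrow p$), and the final step is closure of knowledge under known implication.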

We can further say that a considerable portion of all research has a clear social dimension by virtue of relying on technologies of cognitive enhancement, from scientific instruments which render otherwise unobservable phenomena observable to computer software which undertakes complex computations or constructs models out of big data in a humanly impossible way. Usually, the scientist using such tools does not have the competence to produce them or even to scrutinize their reliability. Thus, they rely on other people both for supplying the tools their research depends on and for providing evidence of their reliability.

Neither of these gives us socially distributed knowledge, nor poses any serious challenge to the traditional individualistic conception of knowledge, because both exemplify one-way epistemic dependencies. While I depend on A to know that q, A may not at all be part of the process of coming to know that q; A might even be a long-dead scholar who just established that p. Similarly, a programmer who writes a deep learning algorithm is often not an integral part of the research projects it is used in. In distributed cognitive systems, however, we speak of mutual epistemic dependence relations within a group that is unified around a common epistemic task, so that, as Hardwig put it, “individual researchers are united into a team that may have what no individual member of the team has: sufficient evidence to justify their mutual conclusion”. Here we speak of a unitary cognitive task, such as designing and running an experiment that can adequately test a scientific claim, which is achieved only collectively.

Distributed knowledge implies networks of epistemic dependence, and collaborations are criticized precisely on the grounds that this generates epistemic vulnerability. In the first part of this blog series, I defended the view that epistemic dependence does not necessarily imply epistemic vulnerability, hence it is not necessarily an impediment to knowledge, whether in the form of reliance on artifacts or on other people. I argued that scientific knowledge can also be (and actually is) reliably and responsibly produced on the basis of warranted or rational trust: this is when we lack sufficient evidence for believing a scientific proposition, but accept it on the basis of sufficient second-order justification that the proposition is the outcome of a reliable knowledge-generation process. It is often the case that scientists (let alone lay people) don’t individually have sufficient second-order justification for the reliability of the knowledge-generation process behind a scientific claim but have good reasons to justify reliance on other experts who scrutinized the research process on behalf of the scientific community. An integral part of warranted trust in scientific claims is thus the existence of an efficient and reliable social process of criticism, which Popper emphasized as the core of scientific progress. In the context of research collaborations, networks of trust are the very fabric of scientific inquiry: the individual members by themselves have only partial first-order justification and partial second-order justification. For this trust to be warranted, two conditions must be met: (i) the distributed process of scientific inquiry should be reliable, i.e., get things right sufficiently more often than it errs, and (ii) the collaboration should realize (in parallel to the socially distributed research process) a reliable and efficient socially distributed process of internal criticism, so that there is sufficient (second-order) justification for any member to trust the reliability of the contributions of others. To the extent that these are met, the individual pieces of evidence contributed by the members of the collaboration can cohere into a unified body of sufficient (first-order) justification for the scientific claim put forward by the collaboration, and the collaboration manifests epistemic responsibility as a collective property, in the sense that it is vigilant towards errors and has social and technological means at its disposal to fix them.

Depending on their social organization, research collaborations can indeed be in an even better position to minimize sources of error than individual researchers or small teams. This is because they can complement the traditional forms of scientific quality control with even more rigorous internal review mechanisms. In fact, the social process of criticism can be realized better as a distributed process, just like the distributed research process in collaborations, than via dependence on the sporadic, entirely voluntary and mostly post hoc scrutiny of random peers, as is pretty much the case with traditional peer review. A socially distributed process of criticism would be organized so as to make use of available expertise and resources in the most efficient and effective way, and could do so by relying on the already established social organization of a research collaboration.

Networks of epistemic dependence in research collaborations 

There are many different forms of scientific collaboration depending on the nature of the division of epistemic labor, and one illuminative way to approach these differences is to look at the web of epistemic dependencies that make up the epistemic system underlying the collaboration.

A big portion of scientific collaborations consists of people with overlapping or complementary expertise, for instance multi-author projects within the same field, or interdisciplinary ones that bring together neighboring expertise, such as collaborations of developmental psychologists and cognitive linguists on language acquisition. In the former case the division of epistemic labor is often in terms of human cognitive resources, and each member of the collaboration has the epistemic competence to scrutinize as well as reproduce the epistemic justification other members have for the epistemic output they contribute. Here there is no ineliminable need for warranted trust, because the kind of knowledge produced can in principle be produced by each individual member. In the latter case the division of epistemic labor is along disciplinary lines, but the members of the collaboration often have the appropriate epistemic competence to understand and even to scrutinize, although possibly not to reproduce, the justificatory grounds of the others’ contributions. Here first-order justification, namely scientific evidence, may be distributed, but everyone can have individually derived second-order justification for trusting the parts of evidence provided by the others.

Especially in cases where the division of epistemic labor reflects differentiation in terms of highly divergent areas of expertise, we are faced with the question of whether individual members of the collaboration can be attributed significant epistemic credit and responsibility for the resulting epistemic successes and failures (and hence can be said to know the outcome of the distributed process). This is because typically each agent lacks the competence required to scrutinize or even understand some aspects of the collectively conducted research. In the case of large research collaborations where various experts interact in a systemic way that produces scientific knowledge irreducibly at the system level, we often encounter examples where both first-order and second-order justification are truly distributed. This is because the task of scrutinizing the reliability of evidence can only be collectively achieved. It is especially with respect to such cases of radically distributed scientific cognition that we need the notion of warranted or rational trust.

The questions that are of greatest relevance to scientific practice here are thus how we can assess the reliability of distributed research processes and whether the distribution of epistemic justification poses a problem regarding epistemic credit and responsibility. 

In regard to the first question, the particular organization of the research process around differentiated competences, various instruments and specific social practices that shape the information flow are crucial. In regard to the second, the idea of networks of higher-order (social) justification is quite pertinent. 

Let’s look at a skeptical epistemological take to illustrate the issue more concretely. 

Evaluating reliability and responsibility in research collaborations 

In their paper investigating how interests and values exert an influence in massively distributed epistemic collaborations, Winsberg, Huebner and Kukla argue that accountability becomes a problem because in such collaborations there is no coherent justification to be given for the entire process, hence no single person can be accountable for the whole study. In other words, they argue that the reliability of the research process is undermined because it is distributed, and for this reason it is quite improbable that coherent and adequate second-order justification can be offered. The main reason they give is that collaborative research involves lots of unforced methodological choices, where the degrees of freedom are usually too high to give a rational story based on best epistemic standards in the abstract. So, decisions have to be taken regarding concrete problems, by researchers with different skills, training and methodological standards.

…when a collaboration relies on numerous epistemically substantial contributors who must exercise expert judgment (rather than ‘human computers’ under top-down control), we introduce a role for at least that many sets of interests and goals, each of which will shape the research process in their own way […], unforced methodological choices must be made at every turn, and there is plenty of room for value-laden inductive risk balancing within these choices. Different researchers will use different standards and methodologies, which will be driven by different goals and pressures; but without centralized coordination, there is no built-in guarantee that the justificatory stories that undergird various aspects of a study will form one coherent justification. 

I think this argument might legitimately apply to certain individual cases of collaborative research that face domain-specific challenges (the authors focus on examples from climate science and biomedical research), but it would be a mischaracterization of how research is generally organized in successful research collaborations. More importantly, it rests on a rather traditional, individualistic conception of epistemic responsibility or accountability, which cannot easily accommodate distributed processes of knowledge production. Orestis Palermos, for instance, offers a pertinent account of how epistemic responsibility emerges as a collective property in distributed systems through self-regulation:

The continuous interactions between the members of the group allow them to continuously monitor each other’s performance. In result, if there is something wrong with their distributed process, the group will be alerted to it and respond appropriately. When no alert signals are communicated, the group can responsibly accept the deliverances of its distributed cognitive ability by default.  

The heterogeneity of expertise doesn’t necessarily imply diminished epistemic responsibility, because individuals don’t have to scrutinize all aspects of the research process if this can be realized as a distributed process. Probably the most pertinent example of this would be the collaborations in high energy physics (HEP), where we have massively collaborative experiments that are collectively planned, executed, monitored and analyzed. The collaborators aren’t tools to be governed top-down, because they contribute their expert judgment in all these phases. A coherent justificatory story is indeed given, despite the highly distributed nature of the research, thus despite a lack of “centralized coordination”. Collaborations in HEP are particularly interesting also because one would expect them to suffer from high degrees of epistemic vulnerability due to the high complexity of the technological infrastructure and the heterogeneity of expertise required to investigate the typical research questions. Despite this, they arguably furnish the most prominent and successful examples of collaborative science. 

In another paper, Huebner, Kukla and Winsberg actually include a section on whether and how interests and values may also influence HEP experiments and undermine their reliability:

…consider an anecdote that was relayed to us. There were two major groups looking for the Higgs particle at CERN: ATLAS and CMS. When ATLAS reported data consistent with the observation of the Higgs particle at 2.8 sigma, the buzz around CMS was that they needed to do whatever was necessary to “boost their signal.” This meant shifting their distribution of inductive risks to prevent them from falling too far behind the ATLAS group—toward higher power, at the expense of reliability, or towards a lower probability of missing a Higgs-type event, at the expense of a higher probability of finding a false positive. Hence even here, it seems that we see the influence of local pressures and interests on methodology, in ways that cannot be simply eliminated. 

We can generally say that any bias towards the background or the signal hypothesis might influence error probabilities. But we would arguably need more tangible reasons to think this was actually the case. To begin with, the very existence of two independent collaborations, formed around two detectors, that investigate the same research question in parallel is a methodological feature aimed at assessing the reliability of the findings, in a way that is very similar to replication studies. The two experiments combined Higgs search results on the basis of a unified framework for common methodological tools and information exchange, which was decided upon through lengthy discussions over the minutest details. Thus, as I confirmed in a discussion with Dr. Philip Bechtle from the ATLAS collaboration, any “tampering” with the statistics would easily have been noticed outside of the CMS collaboration. Both teams use the same statistical procedures, developed collectively over the years, and apply the same criteria. The collaboration between the two collaborations was realized in the same way as the collaborations operate internally: through consensus building in planning and distributed processes in implementation and quality control.
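To make the inductive-risk worry in the quoted passage concrete, here is a deliberately simplified numerical sketch; the one-dimensional Gaussian test statistic and all the numbers are assumptions chosen for illustration only, not a description of any actual ATLAS or CMS analysis. It merely shows the trade-off that any choice of decision threshold involves: loosening the cut raises the chance of catching a true signal (power) while also raising the false-positive rate.

```python
# Toy sketch (hypothetical numbers, not CERN code): the trade-off behind
# "boosting the signal" by shifting a decision threshold.
from scipy.stats import norm

BACKGROUND_MEAN = 0.0   # test statistic under the background-only hypothesis (assumed)
SIGNAL_MEAN = 3.0       # test statistic when a signal is present (assumed)
SIGMA = 1.0             # spread of the test statistic (assumed)

for cut in (5.0, 3.0, 2.0):  # "discovery" threshold, in units of sigma
    false_positive = norm.sf(cut, loc=BACKGROUND_MEAN, scale=SIGMA)  # P(exceed cut | background only)
    power = norm.sf(cut, loc=SIGNAL_MEAN, scale=SIGMA)               # P(exceed cut | signal present)
    print(f"cut at {cut} sigma: false-positive rate = {false_positive:.1e}, power = {power:.2f}")
```

The philosophical point at issue is not whether such a trade-off exists (it does for any statistical decision rule), but whether local interests shift it in ways that escape the collaboration’s internal scrutiny.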

The authors also mention the blinding procedures used to reduce the effect of bias on the results. They do not pass judgment on their effectiveness, but take the very existence of such procedures as indicating the ubiquitousness of inductive risk balancing throughout the research process: 

…this structural mechanism is designed to minimize any distortion issuing from the interests of the scientists. Whether or not this technique helps to address this particular problem, it indicates a way in which inductive risk balancing continues to occur in unpredictable and perhaps unrecoverable ways throughout the research process.

Firstly, we have good reason to think that the blinding procedures are effective against bias because, as Dr. Bechtle stated, one both (i) cannot see the data in the signal region and (ii) is responsible towards colleagues for demonstrating complete understanding of all relevant uncertainties in all control and validation regions. Moreover, and more importantly for our discussion, the very function of a blinding procedure is to increase the reliability of a process through distributing it. There isn’t a single case of research that doesn’t involve similar risks to differing extents. The reliability of a particular research design with an integral blinding procedure could only be higher than that of a similar design lacking it. So instead of signaling a liability, it testifies to a potential epistemic virtue of distributed cognition.
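As a purely hypothetical illustration of what such a procedure can look like operationally (a toy sketch, not any collaboration’s actual software), blinding can be as simple as masking the signal region until the analysis is frozen, so that no analyst can tune their choices against the data that will decide the result:

```python
# Hypothetical toy sketch of a blinding step: analysts may inspect control
# regions freely, but signal-region events stay hidden until formal unblinding.
from dataclasses import dataclass

@dataclass
class Event:
    mass: float  # reconstructed invariant mass in GeV (toy variable)

SIGNAL_REGION = (120.0, 130.0)  # assumed mass window around the hypothesized signal

def visible_events(events, unblinded=False):
    """Return the events an analyst is allowed to see while the analysis is blinded."""
    if unblinded:
        return events
    lo, hi = SIGNAL_REGION
    return [e for e in events if not (lo <= e.mass <= hi)]

# Example: before unblinding, the event at 125 GeV is simply not visible.
data = [Event(95.0), Event(125.0), Event(140.0)]
print([e.mass for e in visible_events(data)])        # [95.0, 140.0]
print([e.mass for e in visible_events(data, True)])  # [95.0, 125.0, 140.0]
```

The epistemically interesting feature is that the procedure works by distributing the process: the people who design the analysis and the data that will adjudicate it are kept apart until the collaboration collectively decides to unblind.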

Clearly no procedure is completely proof against mere error or biased management of uncertainties. But it is hard to say this is a more serious issue in collaborative research: arguably, research collaborations like those in HEP are in an even better position to minimize error and bias, since the distributed nature of the research process allows researchers to work in multiple roles and overlap in multiple dimensions, and thus to cross-check each other’s output. In the extremely rare case when a false discovery nonetheless can’t be prevented, large teams can actually be in a much better position to detect it. Dr. Bechtle mentioned, as an informative case of error, the “discovery” of superluminal neutrinos, which was identified and dealt with by the collaboration after the preprint was out. As an (again, rare) example of bias in analyses, he mentioned the “observation” of pentaquark states by several smaller experiments, whereupon a larger collaboration, HERA-B, carried out more rigorous analyses and found no evidence for them.

The authors lastly argue that even though reliability may be less of a problem in HEP than in other fields, there are no good reasons to think that accountability isn’t problematic. Similar concerns about accountability are raised in a paper by Huebner and Bright. They argue that only the collaboration as a whole can be held accountable, because responsibility in the senses of (i) “attributing” a scientific outcome to somebody and (ii) identifying somebody as “answerable” (who is to provide epistemic justification) can be too diffuse. But they point out that it is hard to conceive how a whole collaboration can be punished.

I think we can say that the nature of the continuous information flow (as portrayed by Knorr-Cetina) and of the internal as well as external decision-making, cross-checking and review mechanisms through nested work groups, panels and committees, together with high transparency, allows the members of the collaborations to have good second-order justification to stand behind the findings and conclusions. It is not necessary that everyone in the collaboration is in a position to scrutinize all other contributions; it suffices if each contribution is reliably cross-checked by some. What is needed is transparency about how the web of second-order justifications is organized, and reflection on its reliability.

I also think that what is identified as the most problematic issue regarding the dissipation of accountability is not essentially a concern about research reliability, but about whom to punish in the case of a breach of epistemic responsibility or, much more broadly, of scientific integrity. In the case of HEP experiments, breaches of integrity like fraud are much less of a concern, because quality control occurs at much earlier stages and more rigorously than in any typical research project. In the rare case where such issues emerge, the transparency of the social structure of epistemic dependencies would allow for their timely identification.

Huebner and Bright add that massive collaborations have a serious impact on the checking mechanisms in peer review, because only an extremely limited pool of reviewers can have the required competences. They claim that this makes it very hard to detect and to punish fraud or other problematic research practices. 

The charge of “a serious impact on the checking mechanisms in peer review” should be conjoined with another premise in order to raise deep concern; namely, that external peer review in its traditional form is the best, or only available, social mechanism for scientific quality control. The authors do not explicitly argue for this premise (one of the co-authors even argues elsewhere for the abolition of traditional peer review). It is plausible to think that traditional peer review cannot handle the complexity of research in the case of large collaborations, not only because there are few individuals with the multiple epistemic competences required but, more importantly, because of its own lack of complexity as a social mechanism. It is an empirical question how much peer review really adds when there are much more complex, multi-layered and multi-phase internal review mechanisms. This is actually the case at CERN, where any paper draft goes through several steps of internal peer review before it is submitted for external peer review, such as internal presentations to close colleagues, collaboration-wide open peer review, and publication committee review.

Collective responsibility beyond collective knowledge?

Lastly, a significant portion of the problems relating to scientific quality control, such as publication bias, pertains equally to the community level. These are collective epistemic responsibility problems. Thus, epistemic responsibility should not be confined to scientists, whether individuals or collaborations, but should also be considered at the level of the scientific community.

While distributed processes of knowledge production extend only as far as the cognitive system realizing them does, it is possible to speak of distributed social processes of criticism or scientific quality control beyond cognitive systems as well. Just as reliable quality control within a research collaboration does not necessarily imply that every member equally scrutinizes the contributions of all the others, community-level quality control may well be undertaken by specialized initiatives supported by scientific bodies and journals. A good example is Registered Replication Reports in psychological science, an initiative for conducting multiple preregistered close replications of a selected study under the supervision of scientific journals.

We can further mention some recent suggestions aimed at distributing the social process of criticism, and thereby the epistemic responsibility for quality control, to bigger portions of the scientific community. Open peer review, for instance, is a proposal intended to render reviewers accountable for their judgments on scientific manuscripts as well as to increase the overall reliability of peer review. Open peer commentary and post-publication peer review, which are forms open peer review can take, replace or complement the traditional peer-review process with ongoing, massively distributed criticism by the community.

In closing this post and the three-part series, let me re-state the main point: Good science does not hinge on self-reliance and skepticism, but on warranted trust and collectively organized, distributed criticism. From the employment of complex scientific instruments to large collaborations, scientific inquiry increasingly consists in networks of epistemic dependence. What epistemic responsibility demands of the scientists is not self-reliance in gathering evidence and wide-ranging skepticism against all external epistemic sources, but the collective maintenance of a social process of criticism that can reliably assess the reliability of scientific evidence. Scientific quality control based on a social process of criticism implies not less but often even more epistemic dependence on social mechanisms and technologies of cognitive extension. The earlier we embrace this basic fact, the better suited our perspective would be to deal with the problems of reliability and responsibility in contemporary science. 

Trust and criticism in science, Part II: Technological extension

In the first part I talked about where and when trust might play an ineliminable role in science, in contrast to the prevalent opinion that scientific inquiry essentially requires a skeptical attitude.

I maintained that the problematic aspect of trust is that it generates epistemic vulnerability, which we clearly should not tolerate in the context of scientific knowledge production, and not that it involves epistemic dependence, that is, reliance on external sources such as testimony for justification. I argued that trust can be irrational to the extent that it implies epistemic vulnerability, but it can also be rational if it is based on justified (second-order) beliefs about the trustworthiness or reliability of an epistemic source and if it is constantly calibrated. In the scientific context such beliefs are justified to the extent that the research findings behind a scientific assertion have high inquirability and the social process of criticism, or scientific quality control, is working reliably. I described research findings as inquirable to the extent that the process whereby they are generated is transparent, the data are available, the assumptions regarding the reliability and validity of the measures are visible, the methods are repeatable by independent others, and so on. Inquirability thus allows empirical evidence to be easily subjected to criticism. The social process of criticism, on the other hand, is reliable to the extent that it is generally able to detect errors, evidential inadequacies, weak inferential connections and so on, when such are present. I argued that inquirability facilitates and proliferates criticism, and actual criticism in turn raises the bar on accepted standards for inquirability. Moreover, both the inquirability of scientific research and the actual processes of criticism can be scaffolded by technologies of cognitive extension and distribution of epistemic labor and credit, which imply that we partly rely on external epistemic sources in judging the reliability of (various aspects of) scientific inquiry.

Thus, I concluded, we have to minimize the role of trust in science where it implies epistemic vulnerability, but not necessarily epistemic dependence — because we often need to become less self-reliant in order to make our processes of knowledge production more reliable, hence science as a whole more trustworthy. When the trustworthiness of the scientific enterprise comes into question, what we have to increase is inquirability and criticism, not skepticism per se, so that we can cultivate well-justified, rational trust.

In this part I focus on how technologies of cognitive extension can play a significant role in increasing the inquirability of research and rendering the social process of criticism more efficient and reliable. 


To take off from where I left off in the first part, with a summary description of the current state of affairs, and thus to recap why we should be talking about the issues of trust and criticism in science in the first place: there are increasingly vocal concerns that the quality control mechanism of science in several fields is not working adequately, if not outright broken. The core of this concern is clearly that epistemic vulnerability is prevalent, and thus that trust in scientific findings is not justified.

There are several core factors contributing to the prevalence of epistemic vulnerability where this trust is not adequately justified:

(i) One factor is the increased scale of science. Contemporary science has a quite different outlook than science in early modern and modern times. While in the past scientific production as well as quality control were in the hands of a much smaller group of people, science has now become much more democratic in its production, if not equally so in its criticism. In the past, the role authoritative scientific institutions played in the criticism of scientific claims was quite decisive. For instance, in its earlier days a central mission of the Royal Society (and of similar institutions such as the Accademia del Cimento) was to design and conduct replication studies to assess the credibility of reported findings. Now the actual rate of replications is unsurprisingly much lower due (among other things) to the sheer number of published findings. Another crucial element of scientific criticism, peer review, has never been a fail-safe process for acknowledging solid research and filtering out weak reports, but now we hear more than ever before that peer review does not fulfill its purpose of quality control and should rather be abolished in its traditional form.

(ii) Another factor is the ever-increasing complexity of contemporary science. Today, even in the so-called softer fields of science, the nature of the produced knowledge is often computational and requires highly specialized skill sets for its evaluation. As the technologies of data gathering, data analysis, model building and testing become more complex, novel phenomena enter the purview of science (such as patterns in big data), and it becomes to that extent more difficult to evaluate scientific outputs using naked human cognition. Consequently, the evaluation of scientific findings requires far more resources than in the past, resources which we would need to shift from the domain of discovery but often prefer not to, due to unfavourable incentive structures.

(iii) One other, relatively constant factor is the tragic mismatch between the represented epistemic subject (the model researcher) of most “rationalist” schools of philosophy of science and the actual epistemic subjects, whose decisions are studied more accurately by, for instance, social psychology than by the logic and methodology of science. When the social mechanisms for scientific quality control are inadequate or failing, epistemic weaknesses or faults of individuals become crucial determinants of the epistemic value of scientific claims.

Thus, the original scientific context in which the social practices and technologies of criticism that we still rely on today (such as the traditional peer-review) were developed was quite different than the contemporary one, but we have invested hardly enough resources to address this mismatch.

Technological extension of cognitive skills can indeed help significantly in mitigating the effect of these and similar sources of epistemic vulnerability. Let us begin with a brief exposition of the concept of extended cognition.


In their classic paper on the topic, Andy Clark and David Chalmers define extended cognition in terms of an epistemic parity consideration:

Epistemic action, we suggest, demands spread of epistemic credit. If, as we confront some task, a part of the world functions as a process which, were it done in the head, we would have no hesitation in recognizing as part of the cognitive process, then that part of the world is (so we claim) part of the cognitive process. Cognitive processes ain’t (all) in the head!

Their famous case, Otto’s notebook, illustrates an extended memory retention and retrieval process. Otto suffers from Alzheimer’s disease and always carries with him a notebook to keep all necessary information. He regularly adds new content to the notebook and relies on the information recorded in it whenever he needs to recall something. Clark and Chalmers argue that Otto’s notebook functions just as a biological (onboard) memory functions: he carries it with him always, he consults it whenever he needs to recall a piece of information, and he accepts the information recorded in it more or less automatically, just as we do when we recall from biological memory. Thus, they maintain, when we deny that Otto’s notebook is truly a part of a cognitive process, we show unjustified bias towards processes taking place inside the head.

Their view of extended cognition suggests some conditions for a resource (e.g. artifact) to be a constitutive part of a cognitive process. Clark summarizes these in three items he calls the “trust and glue” conditions:

(i) the resource is reliably available and typically invoked, (ii) any information retrieved or gained via it should be more-or-less automatically endorsed, and (iii) the information contained should be easily accessible when required.

From this perspective, depending on the extent to which we rely on them in realizing cognitive processes, a vast variety of artifacts from implants to digital devices, software and data repositories can become part of a cognitive system extended beyond the boundaries of skin and skull. “In these cases,” Clark and Chalmers argue, “the human organism is linked with an external entity in a two-way interaction creating a coupled system that can be seen as a cognitive system in its own right.”

The functions of cognitive artifacts are not limited to substituting a biological counterpart or allowing us to outsource cognitive labor. Through being coupled with our biological cognitive processes they can alter the nature of the cognitive task, amplify our cognitive performance or enable us to realize novel cognitive processes which lie beyond our mere biological cognitive capacities. 

In numerous rather mundane cases human reasoners rely heavily on environmental resources in realizing cognitive tasks, such as when one randomly re-arranges scrabble tiles in order to better recognize possible meaningful arrangements or uses pen and paper to do complex calculations. As Richard Menary maintains, such cases illustrate a change in the task space through the manipulation of external vehicles, for instance from one chiefly involving verbal memory to one equally involving perception and sensory-motor coordination.

In the context of scientific research there are numerous examples of coupled agent-artifact systems that realize processes of observation, data analysis, modelling, measurement, simulation, etc. that mere human agents would find either extremely difficult or outright impossible to carry out. The way in which supercomputers, space telescopes, advanced algorithms, machine learning and various other technologies are used in scientific discovery often goes beyond the outsourcing of cognitive labor, as is the case when someone constantly uses a calculator to perform even relatively simple arithmetic tasks, and illustrates a form of coupling that makes certain highly difficult or altogether novel kinds of epistemic achievements possible.


Now, can extended cognition yield adequate epistemic justification, and thus (possibly) knowledge? Can someone claim to know via an extended process, such as one involving the calculation of a function using software, just as one can claim to know that there is a cat on the mat because he saw it and has reliable vision? Or does extended cognition involve too high a degree of epistemic dependence and consequently not enough contribution from the agent? Moreover, is such epistemic dependence compatible with epistemic responsibility, or does it imply a high propensity for epistemic vulnerability?

These and similar questions are clearly not only relevant from a purely epistemological perspective but also in relation to scientific criticism, the reliability of processes of scientific inquiry, and the nature of scientific knowledge.

Our conception of knowledge is guided, according to Duncan Pritchard, by an “ability intuition;” namely, believing truly can count as knowledge only if the exercise of a pertinent epistemic ability or disposition on the agent’s part is chiefly or significantly responsible for the belief’s being true. In relation to the case of extended cognition Pritchard argues that an extended cognitive process can count as a genuine ability or disposition only if it is reliable (that is, it yields true beliefs much more often than it errs) and appropriately integrated into the agent’s “cognitive character;” namely, an agent’s “integrated web of stable and reliable belief-forming processes.” Integration of an extended cognitive process into the cognitive character of an agent requires, for Pritchard, that the agent knows the source of the reliability of the process in question and that it is reliable.

In the previous discussion on well-justified, rational trust I maintained that one does not necessarily need to possess first-order evidence in order to know, but can also do so by relying on an external epistemic source if one has second-order evidence that the source has the pertinent first-order evidence. Similarly, I argue that appropriate integration into one’s cognitive character does not necessarily require that one has first-order evidence that an extended cognitive process is reliable and is adequately familiar with the source of its reliability. Especially in the context of scientific inquiry, researchers often do not have adequate familiarity with the theories, models or empirical evidence behind a complex cognitive artifact that they are heavily relying on in their own research. For instance, a linguist whose research depends on a machine learning algorithm to extract information from vast databases of written text does not have to possess the expertise to program, modify or scrutinize it in order to have good reasons to believe that the information extraction process is reliable, as there would often be other experts in her extended network, or possibly even some she is collaborating with, who have good evidence for the reliability of the process and sufficient understanding of the sources of its reliability. Thus her reliance on the cognitive artifact would nonetheless be rational and responsible, although there would be less conscious cognitive engagement or agentive scrutiny on her part.

Moreover, the guiding intuition behind the extended cognition thesis seems to require, for genuine extension or coupling of systems (in contrast to merely using an external tool), a significant degree of epistemic dependence, or simply trust, as it appears in the “trust and glue” conditions. Clark underlines this point in reference to the case of Otto:

As far as that argument goes, it should make no difference at all whether or not Otto is now, or ever was, aware of the source of the reliability of the notebook involving process. Indeed — and here comes the promised dilemma — there is a very real sense in which the more he is aware of such matters, the less the notebook will seem to be playing the same kind of functional role as biological memory. For as we noted, our biological memory is not typically subject to agentive scrutiny as a process at all, much less as one that may or may not be reasonably judged to be reliable by the agent.

Clark argues in favor of sub-personal forms of “epistemic hygiene” (e.g. unconscious meta-cognitive mechanisms) and against agentive vigilance, as he takes only the former to be compatible with genuine incorporation of an artifact into a cognitive system. But in the scientific context a more reliable and appropriate source of justification is simply the tightly connected epistemic network that supports the social process of scientific inquiry. One important aspect of saying that scientific knowledge is social knowledge is that scientific knowledge production involves division of epistemic labor in the form of specialization and collaboration. The incorporation of technologies of cognitive extension into processes of scientific inquiry takes place within a broader context of epistemic dependence on other researchers, as sources of justification as well as information. 


Going back to our discussion of the asymmetrical development of technologies of scientific discovery and technologies of scientific criticism, which I maintained is one of the reasons behind the current prevalence of epistemic vulnerability, let us first look at technologies of scientific communication and its evaluation.

An important portion of the problem of the low rate of replicability[1][2] or credibility[3] of scientific findings in several fields is due to widespread methodological weaknesses[4][5], questionable research practices[6][7] or breaches of scientific integrity[8], together with relatively weak incentives for engaging in peer review and conducting replication studies instead of fast and numerous original studies[9][10]. But a significant role is also played by the fact that the current state of the technologies for increasing the inquirability of scientific outcomes and for facilitating scientific criticism leaves much to be desired.

The main function of a scientific paper, the central element of scientific communication, is to report what has been observed clearly and transparently enough that readers can understand and in principle reproduce the whole observation process for themselves, and can also judge whether they agree with the theoretical consequences drawn on the basis of the observation. However, the more complex the data collection, the data analysis, the models, the computations and the inferential procedures behind a scientific report, the more difficult it becomes for the reader to assess the credibility of the results. In the current state, even thoroughly computational findings and highly complex theoretical inferences are communicated on simple printed paper or, at best, digital paper, the PDF. When the technologies of science communication lag behind those of scientific discovery, we have a potentially disastrous asymmetry, and scientific communication risks failing to fulfill its basic function.

Just as linguistic symbols have worked as cognitive artifacts that changed the way we cognize, and as logical and mathematical notations further radically expanded the scope of what is thinkable, the nature of the cognitive labor that goes into critically evaluating scientific outcomes can be dramatically altered by incorporating interactive diagrams, code and algorithms into scientific publications. Moreover, the inquirability of research outcomes can rise incomparably if all the computational aspects of research are done in computational notebooks, which not only change the whole cognitive task space for the researcher but can also function as a form of scientific publication that ensures transparency of data, code and methods. For instance Mathematica, the computational notebook designed by Stephen Wolfram as early as 1988, offers a whole platform of cognitive artifacts to scaffold computational work for scientists:

In Mathematica you can input a voice recording, run complex mathematical filters over the audio, and visualize the resulting sound wave; just by mousing through and adjusting parameters, you can warp the wave, discovering which filters work best by playing around.

Conducting and publishing research in computational notebooks would enable the reader to see as well as interact with the whole computational process behind the reported findings, thus increasing their inquirability:

To write a paper in a Mathematica notebook is to reveal your results and methods at the same time; the published paper and the work that begot it. Which shouldn’t just make it easier for readers to understand what you did — it should make it easier for them to replicate it.

Fernando Pérez’s open source IPython from the early 2000s (now continued under Project Jupyter, which supports many more languages) carries the same idea much further by turning the notebook into “a computational partner, and as a thinking partner” and the process of using a computational notebook into a veritable process of extended cognition — both for the researcher and for others:

The notebook walks through the work that generated every figure in the paper. Anyone who wants to can run the code for themselves, tweaking parts of it as they see fit, playing with the calculations to get a better handle on how each one works. 

A paper announcing the (first confirmed) detection of gravitational waves, which was additionally published as an IPython notebook, even allowed that

the signal that generated the gravitational waves is processed into sound, and this you can play in your browser, hearing for yourself what the scientists heard first, the bloop of two black holes colliding.
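
To make this more tangible, here is a minimal, purely illustrative sketch of what such a notebook cell might look like in Python. All of it is my own hypothetical example: the signal is a synthetic chirp standing in for the detected waveform, not the actual LIGO data, and the parameters are arbitrary.

# Hypothetical Jupyter cell: synthesize a chirp-like signal, show it, and let the reader hear it.
import numpy as np
import matplotlib.pyplot as plt
from scipy.signal import chirp
from IPython.display import Audio

rate = 4096                                    # samples per second
t = np.linspace(0, 2, 2 * rate)                # two seconds of signal
signal = chirp(t, f0=35, f1=350, t1=2, method="quadratic")  # rising-frequency sweep
signal *= np.hanning(len(signal))              # fade in and out

plt.plot(t, signal)                            # the reader sees the waveform...
plt.xlabel("time (s)")
plt.ylabel("amplitude (arbitrary units)")
plt.show()

Audio(signal, rate=rate)                       # ...and can play it in the browser

In a published notebook the reader could change the frequencies, the duration or the filtering and immediately see and hear the effect, which is exactly the kind of inquirability at issue here.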

Extending scientific cognition can scaffold scientific quality control in various other ways. For instance, if tools for checking errors or inconsistencies in statistical results, such as Statcheck or GRIM, are fully incorporated into the routine practices of data analysis, the resulting extension of the required cognitive processes will increase their reliability significantly without requiring any further expertise in statistics from the researcher. A recent proposal to make hypothesis tests machine-readable further envisions computer-assisted checks of corroboration/falsification and meta-analyses. Such a reform in scientific reporting could both facilitate much more effective and reliable criticism and compel researchers to meet higher demands of transparency, methodological rigor and intersubjective testability.
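
To give a flavor of what such automated checks involve, here is a minimal sketch in Python. It is my own illustration, not the actual Statcheck or GRIM code (Statcheck, for instance, is an R package), and the reported values in the usage lines are made up:

# Sketch of two arithmetic consistency checks of the kind such tools automate.
import math
from scipy import stats

def grim_consistent(reported_mean, n, decimals=2):
    # GRIM logic: with n integer-valued responses, the true mean must be an integer
    # total divided by n; check whether the reported (rounded) mean is reachable.
    target = reported_mean * n
    for total in (math.floor(target), math.ceil(target)):
        if round(total / n, decimals) == round(reported_mean, decimals):
            return True
    return False

def recomputed_p_from_t(t_value, df):
    # Statcheck-style logic: recompute the two-sided p-value implied by a reported
    # t statistic and its degrees of freedom, to be compared with the reported p.
    return 2 * stats.t.sf(abs(t_value), df)

print(grim_consistent(reported_mean=5.19, n=28))   # is this mean arithmetically possible at all?
print(recomputed_p_from_t(t_value=2.20, df=30))    # does the reported p-value match?

Embedded in the routine reporting pipeline, checks of this sort demand no additional statistical sophistication from the individual researcher, which is precisely the point.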

Extended cognition in scientific quality control would imply that, beyond the mere availability of certain tools, research is communicated and evaluated in a way that strongly depends on technical instruments. We could possibly arrive at a point where evaluating scientific outcomes becomes nearly impossible without the adequate technological infrastructure, similarly to Otto’s external, inorganic memory, just as today scientific discovery is nearly impossible without the pertinent technologies.

Thus the way to make science more reliable and credible does not only pass through increasing self-reliance and skeptical vigilance, but possibly also through increasing our epistemic dependence on technological artifacts and on fellow scientists. This second path is also preferable, because it addresses the asymmetry between the incentives for, and technologies of, scientific discovery and scientific criticism more realistically and efficiently, and it fits much better with the inherently social and increasingly extended nature of scientific knowledge.

In the third and last part I will complete the picture I have sketched so far by talking about the distribution of scientific labor and credit over extended social networks of scientific collaboration.

Originally published on medium/science and philosophy

[1] Camerer, C. F. et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637–644. DOI: https://doi.org/10.1038/s41562-018-0399-z

[2] Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

[3] Ioannidis, J. P. A. (2005) Why most published research findings are false. PLoS Med, 2(8): e124.

[4] Ioannidis, J. P. A., Stanley, T. D., & Doucouliagos, H. (2017). The power of bias in economics research. The Economic Journal, 127, F236–F265.

[5] Simmons J. P., Nelson L. D., Simonsohn U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11):1359–1366. DOI: https://doi.org/10.1177/0956797611417632

[6] John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532.

[7] Fraser, H., Parker, T., Nakagawa, S., Barnett, A., & Fidler, F. (2018). Questionable research practices in ecology and evolution. PLoS ONE, 13(7): e0200303.

[8] Fanelli, D. (2009). How many scientists fabricate and falsify research? a systematic review and meta-analysis of survey data. PLoS ONE, 4(5): e5738.

[9] Tiokhin, L., & Derex, M. (2019). Competition for novelty reduces information sampling in a research game — a registered report. Royal Society Open Science, 6: 180934. DOI: https://doi.org/10.1098/rsos.180934

[10] Koole, S. L., & Lakens, D. (2012). Rewarding replications: A sure and simple way to improve psychological science. Perspectives on Psychological Science, 7(6), 608–614. DOI: https://doi.org/10.1177/1745691612462586

Trust and criticism in science, Part I: Critical rationalism instead of organized skepticism

Trust seems to play an ineliminable role in a considerable portion of our ordinary activities of knowledge acquisition. I believe that around 330 million people live in the U.S. not because I have conducted a census myself but on the basis of an internet query. Often such trust is rational, because it is adequately justified even though the justification in question is not of the sort that can count as evidence for the truth of what is believed (my reasons for trusting the search result do not count as evidence regarding the population of the U.S.). I do not have first-order but second-order justification: I have good reasons to believe that the information source (such as the website of the U.S. Census Bureau) has good reasons to assert that around 330 million people currently live in the U.S. Similarly, I trust my physician rationally when I believe her diagnosis of my disease, unless there are good reasons to doubt her expertise or integrity. When I visit a new city, it is often rational to believe the directions given by an ordinary-looking local.

In more general terms, epistemic or intellectual trust implies that one depends on another to know, understand or to be justified in believing something. If one is well justified in believing that the source of information knows, understands or generally has adequate justification for what it asserts, then one is trusting rationally. If the information happens to be true for the reasons held by the source, one can also be said to acquire knowledge by trusting. The second-order justification we have for believing an information source is about the reliability, honesty, competence or overall trustworthiness of that source. The ordinary sense of trust is clearly much broader, but for present purposes I will use the term epistemic trust in this specific sense of a process of belief formation that involves epistemic dependence on assertions by others. Given the overwhelming proportion of indirectly acquired knowledge in our total knowledge base, we can assert without much doubt that epistemic dependence is a central characteristic of ordinary knowledge acquisition.

It is tempting to think that the situation is completely different in the case of scientific knowledge — our most critical and thereby most esteemed epistemic endeavor. I observe that this is reflected in the opinions of my scientist friends whenever I ask them whether trust has any role in the production and dissemination of scientific knowledge. But when pressed to go beyond stating principles and to reflect on the ordinary practices of scientific inquiry, most practicing scientists say that trust indeed plays a very central role. Can individual scientists re-run or scrutinize all previous studies that somehow feature in the theoretical background or even as some of the premises of their new studies? Does an experimenter personally understand, let alone test or verify, all theoretical claims and observation reports that go into establishing the reliability of each experimental instrument (e.g. the optical theory behind lenses and telescopes) or measurement procedure? Can any researcher single-handedly gather and analyze all the data needed in a big research project that typically demands different specializations and hence division of cognitive labor? Imagine working in a research establishment like CERN; can you see for yourself whether each instrument, software package or algorithm is working properly, or can you go through all the calculations and computations to detect errors? How about the work of past scientists — can anyone do without reliance on previous research in discovering new phenomena, which is arguably the key rationale of cumulative science? The answer is generally a big no. Further, the resources we invest in scientific inquiry are too precious to exhaust by re-testing each and every study, by repeating every single observation — especially highly complex and expensive ones. Scientific knowledge is practically impossible without networks of epistemic dependence, hence without trust. By extension we can also speak of placing epistemic trust in instruments of measurement, observation or computation, as we are similarly dependent on them in the context of discovery as well as in that of justification. Evidence of their reliability is usually known to a number of experts, and others rely on their testimony.

The famous sociologist of knowledge Robert Merton counted organized skepticism among the key norms of science.[1] If this norm is understood in a sense close to philosophical skepticism, in light of the above we can see that organized skepticism, thus understood, does not seem to be a suitable prescriptive norm, let alone a descriptive one, because the very notion of skepticism is not compatible with that of epistemic dependence. Philosophical skepticism generally implies suspension of judgment regarding things we think we know or are well justified to believe or act by. As a methodological attitude, skepticism prescribes doubt regarding all “fallible” sources of information and justification, because it seeks certainty. The skeptic typically refuses to believe or act by anything that is uncertain, however well justified otherwise. Epistemic dependence, on the other hand, necessarily implies some degree of uncertainty and fallibility. But this is true for any level of justification in science. Our most successful scientific theories may crumble under the weight of novel evidence, our most esteemed experiments may turn out to contain flaws, our most renowned scientists may be discovered to have made mistakes or even committed fraud; but the progress of scientific inquiry requires that we tentatively trust and act by scientific statements that are well tested and reported by current standards, unless and until we discover signs indicating otherwise.

Further, skeptical inquiry describes at best a superficially social epistemic process, one that is realized individually by many, but not a robustly or substantially social one, as actual scientific inquiry is. Science is social in the robust or substantial sense that processes of discovery and justification (from hypothesis formation to quality control) are distributed; that is, organized into networks of epistemic dependence — thus “inquiry” as a meaningful unit is realized only by collectives, not individuals.

How Merton himself understands organized skepticism, however, is rather different. He describes it from a clearly sociological, not epistemological perspective:

Another feature of the scientific attitude is organized skepticism, which becomes, often enough, iconoclasm. Science may seem to challenge the “comfortable power assumptions” of other institutions, simply by subjecting them to detached scrutiny. Organized skepticism involves a latent questioning of certain bases of established routine, authority, vested procedures, and the realm of the “sacred” generally. It is true that, logically, to establish the empirical genesis of beliefs and values is not to deny their validity, but this is often the psychological effect on the naive mind. Institutionalized symbols and values demand attitudes of loyalty, adherence, and respect. Science, which asks questions of fact concerning every phase of nature and society, comes into psychological, not logical, conflict with other attitudes toward these same data which have been crystallized and frequently ritualized by other institutions. Most institutions demand unqualified faith; but the institution of science makes skepticism a virtue. Every institution involves, in this sense, a sacred area that is resistant to profane examination in terms of scientific observation and logic. The institution of science itself involves emotional adherence to certain values. But whether it be the sacred sphere of political convictions or religious faith or economic rights, the scientific investigator does not conduct himself in the prescribed uncritical and ritualistic fashion. He does not preserve the cleavage between the sacred and the profane, between that which requires uncritical respect and that which can be objectively analyzed.

Being an institution that does not involve any sacred area and does not recognize other institutions’ demands of unqualified faith on topics that can be empirically investigated has very little to do with problems of reliability, credibility, quality control and self-correction in science. In this sociological sense organized skepticism is compatible with epistemic dependence. However, this is not because it makes a contribution to a theory of scientific knowledge that can incorporate epistemic dependence, but simply because this sociological understanding of organized skepticism is broadly irrelevant to the epistemology of science.

What is actually understood by organized skepticism, however, appears to be none of these. For instance, in the context of a study[2] investigating scientists’ subscription to the Mertonian norms, organized skepticism is described (as an item) thus: “Scientists consider all new evidence, hypotheses, theories, and innovations, even those that challenge or contradict their own work.” This is basically an epistemological definition, very similar to that of epistemic responsibility.

Simine Vazire describes organized skepticism in the form of a prescriptive rather than descriptive norm (for she maintains that actually the counter-norm of organized dogmatism prevails in science).[3] The contents of her description involve notions that belong to the epistemology of science, juxtaposed rather eclectically with notions Merton uses in describing organized skepticism. A clear indication is the choice of certain terms (marked in bold):

Merton’s fourth norm, organized skepticism, states that scientists should engage in critical evaluation of each other’s claims and that nothing should be considered sacred. Scientific self-correction relies on this norm because theories or findings that are treated as sacred cannot be corrected. Thus, the push for higher standards of evidence, and for practices such as preregistration, transparency, and direct replication that make it harder to (intentionally or not) exaggerate the evidence for an effect, is in the spirit of the Mertonian norm of organized skepticism. Self-correction requires being willing to put theories and effects to severe tests and accepting that if they do not pass these high bars, we should be more skeptical of them.

The notion of organized skepticism seems nonetheless evocative of some epistemic sensibilities that are central to scientific inquiry, so it is understandable that it resonates well with many people who concern themselves with the epistemic values and norms of science: compared to all other human practices, scientific inquiry requires the highest level of scrutiny and the least dogmatism. But I think this is much better conceived and expressed through “criticism” rather than skepticism. The important difference is that an understanding of scientific rationality based on criticism instead of skepticism is compatible with epistemic dependence. As I argue at the end of this part as well as in the second and third parts, clarifying the nature and role of epistemic dependence in scientific rationality can further help us diagnose and remedy current problems concerning quality, reliability and credibility in science.

In comparison to such everyday processes of knowledge acquisition, what counts as “well-justified,” hence rational, trust within science is rather different. It is not rational for a scientist to rely on somebody’s assertion on the grounds that he or she is an expert (such as the family doctor) or has no obvious incentive to be dishonest (such as the local giving directions). Further, as scientific questions often do not have objective, easily obtainable and unambiguous answers (such as the population of the U.S.), no epistemic authority can ever have the final word. Neither is it rational to place trust in other scientists because they have implicit skills or knowledge (e.g. “flair”), attractive careers, or even a good track record. In the context of scientific inquiry, I can rationally trust an assertion by a researcher only if I can make the judgment that (i) the researcher conducted the inquiry necessary to acquire the kind of evidence required for making that particular assertion, and that (ii) I could in principle have reproduced the evidence had I followed the same procedure, given the same skills and background knowledge as the researcher. These skills and knowledge are in turn of such a nature that they can be acquired by anyone with adequate “general” cognitive skills through education and training. Thus I rationally trust only if the first-order evidence can readily be subjected to criticism in a social process of analysis, methodological evaluation, replication and so on that is on the whole reliable: a social process of criticism that is generally able to detect errors, evidential inadequacies, weak inferential connections and so on, when such are present. This is the only possible sense of trustworthiness in science and of science.

This picture of the social process of science comes close to the perspective Popper called critical rationalism, which holds that scientific objectivity rests not on the skeptical attitude or the impersonal detachment (cf. the Mertonian norm of “disinterestedness”) of individual scientists, but on the social process of criticism:

What may be described as scientific objectivity is based solely upon that critical tradition which, despite all kinds of resistance, so often makes it possible to criticize a dominant dogma. In other words, the objectivity of science is not a matter for the individual scientist but rather the social result of mutual criticism, of the friendly-hostile division of labour among scientists, of their co-operation and also of their competition.[4]

In accordance with this criterion of rationality in trusting assertions by others, we can add that the norm of assertion within science is that the assertion (e.g. any scientific claim or observation report) can be objectively criticized. Thus an assertion is normatively acceptable as a scientific assertion only to the extent that it can be subjected to criticism on empirical, logical or other methodological grounds. The focus is on criticism because all knowledge, and especially scientific knowledge, is fallible; that is, there can never be conclusive evidence for or proof of a scientific statement, and any supporting evidence (unlike criticism) is of little informative value. Thus, the only way forward is to accept scientific theories that survive criticism, until we have different, better methods and tools of criticism at our disposal. Popper identifies the criterion of success as high corroboration, which means that (i) the theory makes intersubjectively testable and highly informative predictions and (ii) these predictions have passed severe tests; that is, tests that they would more probably have failed. In more general terms, what can be objectively (i.e., intersubjectively) criticized can be asserted, and what further survives serious criticism can be tentatively relied on:

What cannot (at present) in principle be overthrown by criticism is (at present) unworthy of being seriously considered; while what can in principle be so overthrown and yet resists all our critical efforts to do so may quite possibly be false, but is at any rate not unworthy of being seriously considered and perhaps even of being believed — though only tentatively.[5]
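
To make the severity requirement in (ii) slightly more concrete, here is a rough formal gloss (a standard reconstruction, not Popper’s own wording): a prediction e of a hypothesis h constitutes a severe test relative to background knowledge b to the extent that e is highly probable given h together with b but improbable given b alone, i.e. to the extent that p(e | h∧b) − p(e | b) is large. A prediction that would be expected anyway, with or without h, cannot seriously test h, however many times it comes out true.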

There are many different properties that make it possible for a process of scientific inquiry to be subjected to criticism. Popper talks about openness to criticism, or readiness to be criticized, as the scientific attitude required for the social process of criticism to work. Openness to criticism is a psychological notion, having to do with the attitude of the scientist. Regarding the nature of scientific claims in particular, we can also talk about intersubjective testability, which requires (for Popper) that claims are falsifiable. Falsifiability has mainly to do with the logical form of scientific statements. From a broader angle, a research process can be subjected to criticism only if there is transparency regarding all relevant aspects and steps of the procedure by which evidence is collected, analyzed, evaluated and interpreted, and if the whole procedure is in principle repeatable. More particularly, we can talk about how actual research outcomes, since these are what gets communicated, can potentially be criticized. This is where the criteria for rational trust are most relevant.

As an analogy, it might be useful to consider the corresponding quality of research outcomes as inquirability. Inquirability is a concept that actually belongs to the Theory of Decision Support System Design for User Calibration[6], as one of its three components (the other two being expressivity and visibility). Decision support systems are considered inquiring systems, and their calibration is “an objective measure of the accuracy of one’s decision confidence.” Decision confidence is the belief in the quality of a decision, and it may obviously reflect or fail to reflect the objective quality of that decision. The theory conceives decisions of objectively high quality in terms of knowledge. In this respect, accurate and well-justified decision confidence can be regarded as metaknowledge. Research outcomes, like decision support systems, often serve as bases for decisions, such as the tentative acceptance, suspension of judgment or rejection of theories, and they have to be assessed, like the calibration of decision support systems, with regard to the accuracy of the beliefs in the quality of such decisions:

[I]nquirability indicate[s] how well the inquiring system is designed for user calibration… Inquirability is a continuum of designs for DSS actions ranging from the servile and illusory that lulls to the contrarian that engages and continuously challenges. Actions that are servile and illusory are designed to please, to unquestioningly provide data that supports the decision-maker’s position(s) and assumption(s). Little, if any, metaknowledge is identified or resolved by a DSS that is servile and illusory. At the other extreme, DSS actions designed to be contrarian, engage and challenge the decision-maker’s positions and assumptions to identify and resolve metaknowledge through the dialectic process of contrasting, debating, and resolving differences… Near the servile end of the inquirability continuum, the actions of a DSS can be designed to generate data that justifies or supports a position, a set of assumptions, or a decision that the user has already made… Servile inquirability fails to inform because it simply presents data that is in accord with the decision-maker’s position. As might be expected, data that agrees with one’s decision does not improve calibration.

Research outcomes can be said to be inquirable to the extent that the justification process that generated them is transparent, the assumptions regarding the reliability and validity of the measures are made visible, the methods are repeatable and so on. Research outcomes that lack inquirability in this adapted sense, like servile and illusory decision support systems that cannot be calibrated, cannot be assessed as to whether trust in them corresponds to high trustworthiness; on the contrary, they often present as findings whatever best justifies the claims of the researcher.[7]

Inquirability and actual criticism (e.g. testing, replication) are closely related. Inquirability facilitates and proliferates criticism, since the more transparent, intersubjectively testable research there is, the more opportunity and incentive there will be for criticism. In turn, the more critical the research culture, scaffolded by more efficient technologies and social mechanisms for criticism, the higher the demands of transparency, methodological rigor, testability and severity of tests that research and its communication will have to meet. Consequently, the reliability of studies will have to increase and average scientific integrity will have to rise. As the trustworthiness of science increases, there are fewer reasons to rationally mistrust scientific claims, and trust in science becomes more rational.

The criteria of rationality in trusting scientific claims are not different in nature for the scientist and the lay person, though there is considerable quantitative difference in the required sensitivity to signs of inadequate honesty, transparency, rigor and reliability — in short, to signs of trustworthiness. There is not a substantial qualitative difference in what it takes to trust rationally in science, because trust in science is rational only as trust in the reliability of the scientific method and the efficiency of the social process of criticism. When lay people trust in science on other grounds, like “science delivers certain, unshakeable truth” or “science has the right method to answer all kinds of important questions,” they are trusting irrationally just like the scientist who blindly accepts the epistemic authority of a renowned expert in the field, or who relies chiefly on unreliable signs of quality such as journal rankings. If there are accumulating signs of diminishing trustworthiness, trust in scientific claims becomes irrational both for the scientist and the lay person. The quantitative difference between expert and lay trust concerns, on the other hand, the lack of skills and background knowledge on the part of the lay person to judge the quality of, or sometimes even to understand scientific justification. In relation, the due epistemic responsibility of being vigilant towards low quality research and lack of scientific integrity falls the most on scientists working in the relevant field and the least on the lay person.

Clearly trust is at least partly blind. But rational trust is adequately and reliably sensitive to signs of lacking trustworthiness, in order to mitigate the vulnerability stemming from epistemic dependence. Systematic search for such signs is a central part of quality control in science. Instead of going skeptical and refusing to rely on second- or higher-order reasons to accept epistemic claims across the board, the scientific community should (and usually does) endeavor to increase this sensitivity as needed.

All this is good and well, but what about the “credibility” crisis that has been haunting several scientific fields for a decade? Meta-scientific studies and the growing volume of self-reflexive discussions addressing a reproducibility[8][9] or a credibility crisis[10] have shed surprising new light on the reliability and credibility of scientific claims. The widespread nature of questionable research practices[11][12] and scientific misconduct[13] has raised many questions regarding the adequacy of the internal quality-control system of science. All this shows that a considerable portion of trust within as well as in science has been and still is irrational.

There has been much discussion of methodological reforms, of changing the incentive structures in science, and of reforms in the social processes of gate-keeping and quality control. One rather neglected dimension in diagnosing the causes behind this plethora of problems, though, is that science has changed dramatically over the centuries in all respects pertaining to discovery, while the various aspects of the social process of criticism, such as scientific communication and peer review, have largely remained the same. While we currently have supercomputers, big data technologies, learning algorithms, machines of hitherto undreamed-of complexity such as the particle accelerator, enormous space telescopes, genome mapping technologies etc., scientists still communicate their results to one another chiefly on digital “paper,” the PDF, and the communication traffic is modulated by only a handful of people per scientific paper, as if research were still being demonstrated and reviewed in small and closed learned societies. Together with the accelerating technological advances, the sheer number of scientists and scientific institutions has become incomparably larger. Over the years we have transitioned from the small-scale, easily reproducible science that was conducted and criticized within small aristocratic circles to massively collaborative, highly technological, highly specialized, inferentially complex science. The mismatch between the original context of various technologies and practices of quality control and the current one clearly leaves room for very weak networks of epistemic dependence, and the resulting epistemic vulnerability seems to have been widely exploited — intentionally or not.

We have invested a lot in the technologies of scientific discovery but close to none in those of scientific criticism. Thus contemporary science can conduct inquiry with super-human powers but has to criticize it in an all too human way.

What we need in the face of a wide credibility crisis might seem to be eliminating credence and adopting a much more skeptical attitude towards science, but from the perspective I have outlined, what is needed might also involve increased epistemic dependence. Counterintuitive though this may sound at first, what I mean is that we can increase the inquirability of research outcomes and facilitate actual criticism through more reliance on technology and on social processes; that is, by extending the cognitive skills required for scientific criticism through technological tools (such as smart mathematical notebooks instead of paper, or software for checking various properties of statistical findings) and by establishing larger collaborative networks in which the various tasks integral to the process of criticism are distributed.

In the second part I talk about the technological extension of cognitive skills, especially in relation to how it might increase the inquirability of research outcomes and the sensitivity of criticism for scientific communication and quality control.

In the third part I will talk about the distribution of cognitive labor and epistemic responsibility, with particular respect to how science could develop wider collaborative networks of criticism and reconceive accountability, integrity, credit and blame in science as applicable (also) to supra-individual entities such as scientific institutions, research groups or scientific communities of entire fields. Hope you stay tuned in!

Originally published on medium/science and philosophy

[1]Merton, R. K. (1973). The Sociology of Science: Theoretical and Empirical Investigations. University of Chicago Press.

[2]Anderson, M. S., Ronning, E. A., Vries, R. D., & Martinson, B. C. (2010). Extending the Mertonian norms: Scientists’ subscription to norms of research. The Journal of Higher Education, 81(3), 366–393. https://doi.org/10.1080/00221546.2010.11779057

[3] Vazire, S. (2018). Implications of the Credibility Revolution for Productivity, Creativity, and Progress. Perspectives on Psychological Science, 13(4), 411–417. https://doi.org/10.1177/1745691617751884

[4] Popper, K. R. [1962](1994). “The logic of the social sciences.” In In Search of a Better World: Lectures and Essays from Thirty Years, 64–81. Routledge.

[5] Popper, K. R. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge. Routledge & Kegan Paul.

[6] Kasper, G. (1996). A Theory of Decision Support System Design for User Calibration. Information Systems Research, 7(2), 215–232. www.jstor.org/stable/23010860

[7] Lakens, D. (2019, November 18). The Value of Preregistration for Psychological Science: A Conceptual Analysis. https://doi.org/10.31234/osf.io/jbh4w

[8] Camerer, C. F. et al. (2018). Evaluating the replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2(9), 637–644. DOI: https://doi.org/10.1038/s41562-018-0399-z

[9] Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716.

[10] Ioannidis, J. P. A. (2005) Why most published research findings are false. PLoS Med, 2(8): e124.

[11] John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532.

[12] Fraser, H., Parker, T., Nakagawa, S., Barnett, A., Fidler, F. (2018). Questionable Research Practices in Ecology and Evolution. PLoS ONE, 13(7): e0200303.

[13] Fanelli, D. (2009) How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLoS ONE, 4(5): e5738.