Barriers to Transparency

“Self-certification” of research artifacts

Most journals have long allowed authors to assert, implicitly or explicitly, that the provided artifacts are sufficient to correctly reproduce results. Many documented problems with such self-certified packages have led an increasing number of venues to verify these assertions during peer review by independently checking that the code, computing environments, and workflows described in a publication actually reproduce the results.

With self-certification, there is no guarantee that the provided artifacts are complete, that they were actually used to obtain reported results, or that they have not been modified (intentionally or unintentionally).

Use of sensitive and proprietary data

Research in the social sciences often relies on access to sensitive or proprietary data that cannot be redistributed and, in many cases, is only accessible to authorized users on access-controlled resources. This includes data collected by researchers and stored on secure infrastructure at their institutions; confidential private-sector, school district, or government administrative data; as well as data from national statistical agencies. The results of research may further be subject to disclosure avoidance processes.

Use of streaming, transient, and ephemeral data

Streaming, transient, and ephemeral data cannot be preserved for privacy reasons, because of terms of use, or because the scale of the data prevents long-term archiving. Examples of such constraints include the GDPR’s right to erasure [33] and Twitter’s terms of use [34].

Use of very large-scale and specialized computational resources

Many researchers rely on large-scale computational resources provided by campus, state, or national cyberinfrastructure. These resources are both access-controlled and time-constrained, in that they are decommissioned after a period of time. It is unlikely that research conducted on these systems can be repeated without considerable time, labor, expertise, and technological resources.

Additional barriers