
TRO Declaration Design: JSON and RDF

TROV 0.1 DRAFT

This document is a draft and subject to revision. Please submit feedback or report issues.

Why TRO declarations use JSON-LD, what each audience gets from the format, and how the JSON Schema constraint makes both JSON and RDF workflows possible from a single file.

| Document Section | Description |
| --- | --- |
| The Core Idea | One file, two perspectives: valid JSON and valid RDF simultaneously |
| What This Means in Practice | What producers, JSON consumers, RDF consumers, and repositories each get |
| The JSON Schema Constraint | How the fixed document structure enables both audiences |
| The @context Is the Bridge | How the JSON-LD context connects the JSON and RDF perspectives |
| Summary | Four-audience comparison table |

The Core Idea

A TRO declaration — the document that describes a Transparent Research Object — is a JSON-LD document. This means it is simultaneously:

- a valid JSON document that can be parsed, validated, and navigated with ordinary JSON tools, and
- a valid RDF graph whose triples can be loaded into a triplestore and queried with SPARQL.

This dual nature is the central architectural decision for TRO declarations: one file, two perspectives.
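A minimal sketch of this dual nature, using only the standard library. The property names, namespace URI, and digest value below are illustrative placeholders, not the normative TROV terms:

```python
import json

# Hypothetical TRO declaration fragment -- property names, the namespace
# URI, and the digest are placeholders for illustration only.
declaration = json.loads("""
{
  "@context": {"trov": "https://example.org/trov#"},
  "@id": "https://example.org/tro/42",
  "@type": "trov:TRO",
  "trov:hasArtifact": {
    "@id": "https://example.org/tro/42/artifact/1",
    "trov:sha256": "ab12"
  }
}
""")

# Perspective 1: plain JSON -- navigate the fixed tree with dict lookups.
digest = declaration["trov:hasArtifact"]["trov:sha256"]

# Perspective 2: RDF -- the @context expands each prefixed key to a full
# URI, so the same key denotes one globally unique vocabulary term.
ctx = declaration["@context"]
prefix, local = "trov:hasArtifact".split(":")
expanded = ctx[prefix] + local
```

The same bytes serve both audiences: a JSON consumer reads `digest` with a path lookup, while an RDF consumer sees a triple whose predicate is the `expanded` URI.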

Note: The TRO Declaration Format defines where terms appear in the JSON-LD document tree. The TROV vocabulary defines the RDF/OWL terms themselves. The TRACE specification encompasses the entire system.


What This Means in Practice

For TRO producers

Trusted Research Systems isolate computational workflows from researcher interaction during execution, guaranteeing that results reflect the submitted code and data without modification. Some TRS implementations go further — operating inside air-gapped enclaves at institutions handling confidential data, where there is no network access, no large dependency trees, and no runtime resolution of remote resources.

A TRO producer writes JSON that follows a schema. The @context block at the top is a fixed header that the producer copies from a template (e.g. one provided in the TRO Declaration Format). The rest of the document is a predictable tree of objects with known property names at known locations.

What producers need:

- a JSON library and a JSON Schema validator
- the schema, field reference, and examples from the TRO Declaration Format

What producers do not need:

- RDF libraries or a triplestore
- network access or runtime resolution of remote resources

For TRO consumers using JSON tools

A consumer who receives a TRO declaration can treat it as a JSON document. Because the TRO Declaration Format constrains the structure, information is in predictable locations in the document tree.

A consumer can extract information using standard JSON tools.
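For instance, extraction reduces to path lookups rather than graph searches. The fragment below is hypothetical — the real property names and paths come from the TRO Declaration Format:

```python
import json

# Hypothetical declaration fragment; property names and nesting are
# placeholders standing in for the paths the real schema fixes.
doc = json.loads("""
{
  "@context": {"trov": "https://example.org/trov#"},
  "trov:hasComposition": {
    "trov:hasArtifact": [
      {"trov:sha256": "aaa", "trov:mimeType": "text/csv"},
      {"trov:sha256": "bbb", "trov:mimeType": "application/json"}
    ]
  }
}
""")

# Because the schema fixes where artifacts live, a consumer addresses
# them with a plain path -- no graph-pattern matching required.
artifacts = doc["trov:hasComposition"]["trov:hasArtifact"]
digests = [a["trov:sha256"] for a in artifacts]
```

The equivalent `jq` one-liner would be a single fixed path expression for every conforming declaration.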

For TRO consumers using RDF tools

The same document loaded into a triplestore (in memory or persistent) becomes a set of RDF triples.

Cross-TRO querying. A consumer who loads multiple TROs from different institutions into the same triplestore can query across them with SPARQL. For example, given an artifact’s SHA-256 digest, a query can trace its production history across institutions — which TRP produced it, what input artifacts that TRP consumed, where each of those inputs came from (another TRP, or an original dataset), and so on transitively back through the full dependency chain, including what transparency attributes were in effect at each step.

These queries work without coordination between the institutions that produced the TROs. The @context maps JSON property names to globally unique URIs, so trov:hasCapability in one institution’s TRO means the same thing as trov:hasCapability in another’s.
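The mechanism behind this coordination-free merging can be sketched without a triplestore: each document carries its own copy of the @context, and expanding a prefixed property name through either context yields the same global URI. All names and URIs below are placeholders:

```python
# Two hypothetical TROs from different institutions, each carrying its
# own copy of the @context (the namespace URI is a placeholder).
tro_a = {"@context": {"trov": "https://example.org/trov#"},
         "trov:hasCapability": {"@id": "https://inst-a.example/cap/1"}}
tro_b = {"@context": {"trov": "https://example.org/trov#"},
         "trov:hasCapability": {"@id": "https://inst-b.example/cap/7"}}

def expand(doc, name):
    """Resolve a prefixed JSON key to its full URI via the document's @context."""
    prefix, local = name.split(":")
    return doc["@context"][prefix] + local

# Both institutions' property names expand to the identical URI, which
# is why a SPARQL query over the merged graph matches triples from both.
uri_a = expand(tro_a, "trov:hasCapability")
uri_b = expand(tro_b, "trov:hasCapability")
```

A triplestore performs exactly this expansion when loading JSON-LD, so the merged graph needs no agreement between producers beyond the shared context.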

Warrant chain validation. The TROV conceptual model defines a warrant chain: TRO attributes are warranted by TRP attributes, which are in turn warranted by TRS capabilities. When the full chain is present, it provides machine-verifiable justification for every transparency claim. In JSON, verifying this chain requires navigating the document tree and matching identifiers. In RDF, it is a graph traversal — SPARQL queries can walk the trov:warrantedBy links, and SHACL shapes can confirm that the chain is structurally complete.
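The graph traversal that a SPARQL property path such as `?x trov:warrantedBy+ ?y` performs can be sketched in a few lines. The attribute and capability identifiers below are invented for illustration:

```python
# Hypothetical warrant edges: a TRO attribute warranted by a TRP
# attribute, which is warranted by a TRS capability.
warranted_by = {
    "tro:attr/internet-isolation": "trp:attr/no-network",
    "trp:attr/no-network": "trs:cap/airgapped-enclave",
}

def warrant_chain(node):
    """Walk trov:warrantedBy links transitively from a starting attribute."""
    chain = [node]
    while node in warranted_by:
        node = warranted_by[node]
        chain.append(node)
    return chain

chain = warrant_chain("tro:attr/internet-isolation")

# Structural completeness -- the kind of condition a SHACL shape would
# assert: every chain must terminate at a TRS capability.
complete = chain[-1].startswith("trs:")
```

In RDF this loop disappears: the property path operator `+` walks the links, and the completeness condition becomes a shape constraint rather than application code.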

Reasoning over collections. A triplestore holding TROs from many sources becomes a knowledge base. Consumers can ask questions that no single TRO can answer: What fraction of TROs were produced under conditions of internet isolation? Which artifacts have been independently produced by different TRSs? Are there TRPs that accessed the same input data but were conducted under different transparency attributes? Do all computations in the transitive provenance of a given set of artifacts meet a specified set of transparency requirements? These aggregate queries are the basis for meta-analyses of computational transparency.

TROV tooling leverages RDF. The trov-validate and trov-report tools we are developing use SPARQL and SHACL internally to validate warrant chains, check structural conformance, and generate human-readable summaries. The tools accept a TRO declaration as input and produce validation results or HTML/PDF reports as output.

For data repositories and aggregators

A repository that archives TROs from multiple sources can implement its functions using JSON tools, RDF tools, or a combination.

Both perspectives let repositories integrate TRO metadata with other vocabularies they already use — e.g. schema.org for describing datasets, W3C PROV for representing provenance, DataCite for facilitating citation — enabling queries that span transparency claims and existing metadata.
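Mixing vocabularies works through the same context mechanism. The record below is hypothetical — the TROV namespace URI is a placeholder, and the schema.org and PROV prefixes simply sit alongside it in one @context:

```python
# Hypothetical repository record combining TROV terms with vocabularies
# the repository already uses (schema.org, W3C PROV). The trov URI is a
# placeholder; schema.org and PROV URIs are the real published namespaces.
record = {
    "@context": {
        "trov": "https://example.org/trov#",
        "schema": "https://schema.org/",
        "prov": "http://www.w3.org/ns/prov#",
    },
    "@id": "https://repo.example/tro/99",
    "schema:name": "Reproducible climate analysis",
    "prov:wasGeneratedBy": {"@id": "https://repo.example/trp/12"},
    "trov:hasCapability": {"@id": "https://repo.example/cap/3"},
}

# All three vocabularies expand through the same @context mechanism,
# producing one graph that spans transparency claims and dataset metadata.
ctx = record["@context"]
expanded = {ctx[k.split(":")[0]] + k.split(":")[1]: v
            for k, v in record.items()
            if ":" in k and not k.startswith("@")}
```

A query over the resulting graph can join on, say, a dataset's `schema:name` and its `trov:hasCapability` in a single pattern.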


The JSON Schema Constraint

The flexibility of RDF is powerful for interoperability but makes validation and tooling harder. If a TRO declaration were unconstrained RDF, a producer could express the same information in many structurally different ways, and consumers would need graph-pattern matching to find anything.

The TRO JSON Schema imposes a fixed document tree structure on top of the RDF vocabulary.

The constraint is more restrictive than what RDF allows. You cannot rearrange the tree, nest things differently, or use blank nodes in creative ways in a TRO declaration. This restriction is what makes JSON Schema validation possible, what makes standard JSON tooling sufficient for most use cases, and what makes TRO declarations from different institutions structurally consistent.
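The kind of structural check this enables can be sketched with a minimal, dependency-free validator. The schema fragment is hypothetical — the real TRO JSON Schema is far richer — but the principle is the same: required keys at fixed locations:

```python
# Hypothetical fragment of the kind of constraint the TRO JSON Schema
# imposes: required keys at a fixed location, checked with a minimal
# hand-rolled validator so the sketch stays dependency-free.
schema_fragment = {
    "required": ["@context", "@id", "@type"],
}

def check_required(doc, schema):
    """Report which required top-level keys are missing from a document."""
    return [key for key in schema["required"] if key not in doc]

good = {"@context": {}, "@id": "https://example.org/tro/1", "@type": "trov:TRO"}
bad = {"@id": "https://example.org/tro/2"}

missing_good = check_required(good, schema_fragment)
missing_bad = check_required(bad, schema_fragment)
```

In practice a producer would run an off-the-shelf JSON Schema validator against the published schema; unconstrained RDF offers no equivalent single-pass check.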


The @context Is the Bridge

The @context block is the mechanism that connects the two perspectives. For JSON producers, it is a fixed header that can be copied from the TRO Declaration Format examples and modified only if adding a namespace prefix. For RDF consumers, it is the mapping that turns JSON property names into globally unique URIs.

A TRO producer who follows the JSON Schema and preserves the @context is producing valid linked data. The transparency claims in that TRO are machine-readable, globally identifiable, and combinable with any other TRO or linked dataset on the web.


Summary

| Perspective | Tools | What you see | What you get |
| --- | --- | --- | --- |
| JSON producer | Any JSON library, JSON Schema validator | A JSON document with a fixed structure | Schema, field reference, examples |
| JSON consumer | jq, JSONPath, any language's JSON parser | A predictable tree you can navigate | Known paths to every piece of information |
| RDF consumer | Triplestore, SPARQL, SHACL | A graph of triples with standard vocabulary | Cross-TRO queries, linked data interoperability |
| Repository | JSON or RDF technologies | A collection of JSON documents or a single graph of triples | Integration with existing metadata and standards |