Documentation Infrastructure#
How the TRACE project’s documentation is built and deployed across GitHub repositories, each with its own build pipeline, and how the results combine into a single public website.
| Document Section | Description |
|---|---|
| Repositories | The repos that contribute to the public site |
| How the Pieces Connect | URL structure showing how the repos map to the public site |
| The Project Website | Jekyll site serving the landing page and top-level navigation |
| The Specification Site | Jupyter Book build stack, deploy pipeline, and downloads |
| The Vocabulary Reference | Widoco-generated documentation from the TROV Turtle ontology |
| Updating After Vocabulary Changes | How vocabulary submodules work and how to trigger a rebuild |
| Adding a Page to the Specification | Steps to create a new document and make it appear on the live site |

For the specification content itself, see TRACE Specifications. For the TROV vocabulary terms, see TROV Vocabulary.
Repositories#
Four repositories under the transparency-certified GitHub organization contribute to the public documentation:
| Repository | Contents | URL |
|---|---|---|
| transparency-certified.github.io | Project landing page and navigation | Served as Jekyll site at transparency-certified.github.io/ |
| trace-specification | Specification documents, conceptual model, examples | Served as Jupyter Book at transparency-certified.github.io/trace-specification/ |
| trace-vocabularies | Build hub: aggregates vocabulary repos as submodules, holds w3id.org redirect config and tests | Served as GitHub Pages at transparency-certified.github.io/trace-vocabularies/ |
| trov | TROV ontology source files: Turtle, SHACL shapes, Widoco config. Submodule of trace-vocabularies. | Served via trace-vocabularies Pages site |

GitHub serves all of these independently under the same domain via its CDN. The domain transparency-certified.github.io serves as the organization’s GitHub Pages site, and repos with Pages enabled are mounted at subpaths matching their repository names.
How the Pieces Connect#
Public URL structure#
The public site at transparency-certified.github.io is assembled by GitHub Pages from independent deployments. Each URL path maps to a different repository’s output:
transparency-certified.github.io/
│
├── / ← transparency-certified.github.io repo
│ Served directly from main branch by Jekyll
│
├── /trace-specification/ ← trace-specification repo (gh-pages branch)
│ └── docs/*.html Built from Markdown sources in docs/ on main
│
└── /trace-vocabularies/ ← trace-vocabularies repo (gh-pages branch)
├── index.html Copied from main during build
├── trov/0.1/ Generated by Widoco from trov submodule
└── trov/prerelease/ Generated by Widoco from trov submodule
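
The mapping in the tree above can be expressed as a small lookup. The following Python sketch is illustrative only: the names `PAGES_MOUNTS`, `ROOT_SITE`, and `source_for` are hypothetical and do not exist in any TRACE repository, and the branch names are simply copied from the tree.

```python
# Hypothetical lookup mirroring the URL tree: path prefix -> (repository, branch).
PAGES_MOUNTS = {
    "/trace-specification/": ("trace-specification", "gh-pages"),
    "/trace-vocabularies/": ("trace-vocabularies", "gh-pages"),
}
# Anything not under a mounted subpath is served by the organization site repo.
ROOT_SITE = ("transparency-certified.github.io", "main")


def source_for(path: str) -> tuple[str, str]:
    """Return the (repository, branch) whose output serves the given URL path."""
    # Normalize so "/trace-specification" and "/trace-specification/" match alike.
    normalized = path.rstrip("/") + "/"
    for prefix, source in PAGES_MOUNTS.items():
        if normalized.startswith(prefix):
            return source
    return ROOT_SITE
```

For example, `source_for("/trace-vocabularies/trov/0.1/")` resolves to the trace-vocabularies repository, while `source_for("/")` falls through to the Jekyll site.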
Source file layout#
To edit content, work with the source files on the main branch of each repository. The build pipelines (described below) generate and deploy the public site automatically.
transparency-certified.github.io/ ← edit here for the project landing page
├── index.html Landing page
├── _includes/ About, contact, and content sections
└── _config.yml Site configuration and navigation
trace-specification/ ← edit here for specification documents
├── docs/ Markdown source files
│ ├── tro-declaration-format.md
│ ├── trov-vocabulary.md
│ └── ...
├── _toc.yml Table of contents (controls sidebar)
├── _config.yml Jupyter Book configuration
├── _static/custom.css CSS overrides (table hover, print layout)
└── .github/workflows/deploy.yml Builds Jupyter Book, deploys to gh-pages
trace-vocabularies/ ← edit here for build infrastructure
├── trov/ → submodule → transparency-certified/trov
├── w3id/ w3id.org redirect config and tests
├── index.html Vocabulary index page (copied to site root)
└── .github/workflows/deploy.yml Builds Widoco docs, deploys to gh-pages
trov/ ← edit here for vocabulary source files
├── 0.1/ (submodule of trace-vocabularies)
│ ├── trov.ttl TROV 0.1 ontology
│ ├── trov-shacl.ttl SHACL shapes
│ └── trov.config Widoco configuration
└── prerelease/
├── trov.ttl Pre-release ontology (2023/05/trov#)
└── trov.config Widoco configuration
Each component is built and deployed independently. The Jekyll site links into the specification via its navigation. The specification documents link to the vocabulary reference where relevant.
The trace-specification and trace-vocabularies repos each deploy to an orphan gh-pages branch containing only generated HTML and a .nojekyll file that tells GitHub Pages to serve files directly.
The Project Website#
The transparency-certified.github.io repository is a standard Jekyll site. GitHub Pages builds it automatically from the main branch using its built-in Jekyll support — no GitHub Action or Docker image is needed.
The site provides a landing page with project description and team information, and navigation that links into the specification and vocabulary sites. Jekyll configuration is in _config.yml, and the site layout is controlled by standard Jekyll conventions (_layouts/, _includes/, etc.).
The Specification Site#
The trace-specification repository contains the specification documents as Markdown files in docs/. The build and deploy process works as follows:
Source#
The main branch contains:
- Markdown source files in docs/
- Jupyter Book configuration: _config.yml (book settings) and _toc.yml (table of contents structure)
- A GitHub Actions workflow: .github/workflows/deploy.yml

The _toc.yml file controls which documents appear in the built site and in what order. It organizes content into parts: TRACE Overview, TRACE Concepts, TRACE Specification, TRACE Tools, and Project Infrastructure. Short sidebar labels can be set with title: entries in the TOC without changing the document headings. Only files listed in _toc.yml are built (only_build_toc_files: true in _config.yml). Markdown files not registered in the TOC are ignored by the build even though they exist in the repository.
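
Because unregistered files are silently skipped, it can be useful to check for them before pushing. The sketch below is one possible approach, not part of the project's tooling: `toc_entries` and `unregistered` are hypothetical helper names, the regular expression is a deliberately simple stand-in for a real YAML parser, and it assumes TOC entries are written without a file extension.

```python
import re
from pathlib import PurePosixPath

# Matches "file: docs/name" and "root: docs/name" lines in _toc.yml text.
# Assumption: entries carry no file extension and no quoting.
TOC_ENTRY = re.compile(r"^\s*(?:-\s*)?(?:file|root):\s*(\S+)", re.MULTILINE)


def toc_entries(toc_text: str) -> set[str]:
    """Collect the extension-less document paths listed in _toc.yml text."""
    return set(TOC_ENTRY.findall(toc_text))


def unregistered(markdown_paths: list[str], toc_text: str) -> list[str]:
    """Return Markdown files that exist but are not listed in the TOC."""
    listed = toc_entries(toc_text)
    return sorted(
        path
        for path in markdown_paths
        if str(PurePosixPath(path).with_suffix("")) not in listed
    )
```

Passing the list of Markdown files under docs/ together with the contents of _toc.yml returns the files that the build would ignore.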
All new documents are written in plain GitHub-Flavored Markdown (GFM). Jupyter Book also supports MyST Markdown (a superset of GFM with Sphinx-specific directives), but MyST syntax renders as raw text when viewed on GitHub. Since the source files are frequently read directly on GitHub, GFM ensures they look correct everywhere — on GitHub, in the Jupyter Book site, in VS Code, and in Pandoc-generated PDFs. A few older documents use MyST features (cross-references, admonitions, glossary); these will be converted to GFM equivalents over time.
Build Stack#
The specification site is built by a pipeline of tools, each handling a different layer:
| Layer | Tool | Role |
|---|---|---|
| Source | GitHub-Flavored Markdown | Document content. What authors write |
| Book engine | Jupyter Book | Reads … |
| Doc generator | Sphinx | Underlying engine that Jupyter Book delegates to. Resolves cross-references, builds the search index, generates HTML. Configured via the … |
| Theme | … | Controls layout, sidebar navigation, dark mode, and toolbar (download, fullscreen, search). Selected by Jupyter Book; options in … |
| Styling | _static/custom.css | Project-specific CSS overrides (table hover color, print layout adjustments) |

Authors interact only with the top layer (Markdown files and the TOC). The lower layers are configured once and rarely need attention. See the source file layout above for the location of each configuration file in the repository.
Build#
On every push to main, the GitHub Action:
1. Checks out the repository
2. Runs jupyter-book build --all . inside the craigwillis/jupyter-book:latest Docker image
3. Produces static HTML in _build/html/

Deploy#
The same GitHub Action uses peaceiris/actions-gh-pages@v3.6.1 to push the built HTML to the gh-pages branch. Each deploy creates a commit with the message deploy: <commit-hash>, where the hash references the main branch commit that was built.
Downloads#
Each page on the built site offers Markdown and PDF downloads via the toolbar. The Markdown download is the original source file. The PDF is generated by the browser’s print rendering, with project-specific print styles in _static/custom.css that hide the auto-generated table of contents and adjust page margins.
The Vocabulary Reference#
The trace-vocabularies repository is a build hub that aggregates vocabulary source repos as git submodules. It runs Widoco to generate HTML documentation and serialization files, then deploys everything to GitHub Pages.
Source#
The trov repository is a submodule of trace-vocabularies. It contains the ontology source files:
- 0.1/trov.ttl: the TROV 0.1 ontology in Turtle format
- 0.1/trov-shacl.ttl: SHACL shape constraints for validation
- 0.1/trov.config: Widoco configuration
- prerelease/trov.ttl: frozen pre-release ontology (2023/05/trov# namespace)
- prerelease/trov.config: Widoco configuration for the pre-release

Future satellite vocabularies will be separate repos added as additional submodules.
Build#
On pushes to main (and on manual dispatch), the trace-vocabularies GitHub Action:
1. Checks out the repository with submodules
2. Runs Widoco for each vocabulary version (currently TROV 0.1 and TROV pre-release)
3. Organizes output under trov/{version}/ to match the GitHub Pages URL structure
4. Renames Widoco's default output files (ontology.*) to vocabulary-specific names (trov.*) so the w3id.org content negotiation rules can find them
5. Verifies all expected files exist before deploying

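
The rename and verification steps above could be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: the function names are hypothetical, and `EXPECTED_SUFFIXES` is a guessed list based on the serializations this page mentions (Turtle and JSON-LD); the real workflow may check a different set.

```python
from pathlib import Path

# Assumed set of serializations to verify; the real workflow's list may differ.
EXPECTED_SUFFIXES = (".ttl", ".jsonld")


def rename_outputs(version_dir: Path, vocab: str = "trov") -> list[Path]:
    """Rename ontology.* files in version_dir to <vocab>.* and return the new paths."""
    renamed = []
    for src in sorted(version_dir.glob("ontology.*")):
        dst = src.with_name(vocab + src.suffix)  # ontology.ttl -> trov.ttl
        src.rename(dst)
        renamed.append(dst)
    return renamed


def missing_outputs(version_dir: Path, vocab: str = "trov") -> list[str]:
    """Return the expected file names that are absent from version_dir."""
    return [
        vocab + suffix
        for suffix in EXPECTED_SUFFIXES
        if not (version_dir / (vocab + suffix)).is_file()
    ]
```

A deploy step would then proceed only when `missing_outputs` returns an empty list for every version directory.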
Deploy#
The action uses peaceiris/actions-gh-pages@v4 to push the built output to an orphan gh-pages branch. GitHub Pages serves the result at transparency-certified.github.io/trace-vocabularies/. Only pushes to main trigger deployment — pull requests build but do not deploy.
A vocabulary index page (index.html in the repo root) is copied into the build output during each deploy. It lists all published vocabularies and versions, and serves as the default landing page for unversioned requests via w3id.org.
Updating After Vocabulary Changes#
Pushing changes to a vocabulary repo (e.g. trov) does not automatically rebuild the site. The trace-vocabularies repo uses git submodules. Each submodule records a specific commit from the vocabulary repo. To pick up new vocabulary content, you advance the submodule pointer and push to trace-vocabularies, which triggers the build workflow.
Advancing the submodule pointer is a deliberate step, not an automatic sync. As a result, the published documentation always corresponds to a known vocabulary commit, and vocabulary authors can push work-in-progress without affecting the live site.
For the specific git commands to advance submodule pointers, pin to a tagged release, or check submodule status, see the trace-vocabularies README.
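
As a rough illustration of what advancing a pointer involves, the Python sketch below assembles a typical sequence of standard git commands. It is an assumption about a common workflow, not a copy of the README: the helper names, the commit message, and the default branch name `main` are hypothetical, and the README remains the authoritative source.

```python
import subprocess


def advance_submodule_commands(submodule: str = "trov", branch: str = "main") -> list[list[str]]:
    """Build the git commands that move a submodule pointer to its latest upstream commit."""
    return [
        # Fetch the submodule's remote and check out its newest commit.
        ["git", "submodule", "update", "--remote", submodule],
        # Stage the changed pointer in the parent repository.
        ["git", "add", submodule],
        ["git", "commit", "-m", f"Advance {submodule} submodule"],
        # Pushing to the parent repo's main branch is what triggers the build.
        ["git", "push", "origin", branch],
    ]


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command, executing it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

The dry-run default prints the commands without touching the repository, so the sequence can be reviewed before anything is committed or pushed.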
w3id.org Namespace Resolution#
TROV namespace URIs (e.g. https://w3id.org/trace/trov/0.1#ArtifactComposition) resolve via redirect rules in a .htaccess file registered at w3id.org. The rules perform content negotiation: RDF tools requesting Turtle get trov.ttl, JSON-LD processors get trov.jsonld, and browsers get the HTML documentation page. The redirect targets are the GitHub Pages URLs served by trace-vocabularies.
The .htaccess source of truth and a Mocha test suite (27 tests, covering all redirect rules in both local Docker and live remote modes) are in trace-vocabularies/w3id/.
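
The content negotiation described above can be modeled in a few lines of Python. This is a simplified sketch for intuition only: the real behavior is defined by the .htaccess rules, the names `FORMATS`, `HTML_FALLBACK`, and `negotiate` are hypothetical, the fallback file name is an assumption, and quality values (`q=`) in the Accept header are ignored.

```python
# Hypothetical model of the redirect rules; the authoritative rules are in .htaccess.
FORMATS = (
    ("text/turtle", "trov.ttl"),
    ("application/ld+json", "trov.jsonld"),
)
# Assumed name for the HTML documentation page served to browsers.
HTML_FALLBACK = "index.html"


def negotiate(accept_header: str) -> str:
    """Pick the file a request would be redirected to, based on its Accept header."""
    # Strip parameters such as ";q=0.9" and compare bare media types.
    accepted = {
        part.split(";")[0].strip().lower()
        for part in accept_header.split(",")
    }
    for media_type, filename in FORMATS:
        if media_type in accepted:
            return filename
    return HTML_FALLBACK
```

An RDF tool sending `Accept: text/turtle` would be directed to the Turtle file, while a browser's HTML-oriented Accept header falls through to the documentation page.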
Why GitHub Pages, not raw GitHub URLs#
The w3id.org .htaccess rules redirect namespace URI requests to files on GitHub Pages. Content negotiation (Turtle vs JSON-LD vs HTML) happens at w3id.org, but GitHub Pages confirms the format by serving files with correct MIME types (text/turtle, application/ld+json, text/html). Raw GitHub URLs (raw.githubusercontent.com) serve everything as text/plain, which well-behaved RDF tools may reject. GitHub Pages is also required for the HTML landing page, which is served as index.html from a directory URL.
Adding a Page to the Specification#
To add a new document to the specification site:
1. Create the Markdown file in docs/. For example, docs/my-new-page.md.
2. Register it in _toc.yml. Add a - file: docs/my-new-page entry under the appropriate part and chapter list. Without this entry, Jupyter Book will not include the page in the build even though the file exists in the repository. Optionally add a title: for a shorter sidebar label.

   For example, to add it under TRACE Specifications:

       - caption: TRACE Specifications
         chapters:
           - file: docs/specifications
           - file: docs/questions
           - file: docs/my-new-page   # ← new entry
             title: My New Page       # ← optional short sidebar label

3. Push to main. The GitHub Action will build the site and deploy the result to gh-pages. The new page will appear at transparency-certified.github.io/trace-specification/docs/my-new-page.html.

The build runs automatically. If the build fails (e.g., due to a syntax error or a broken cross-reference), the gh-pages branch is not updated and the live site remains unchanged. Check the Actions tab on the repository to see build status.
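
The public address of a new page follows directly from its source path. The short Python sketch below shows that mapping; `SITE_BASE` and `public_url` are hypothetical names introduced here, and the sketch assumes the build keeps the docs/ directory layout, as the example URL on this page suggests.

```python
from pathlib import PurePosixPath

# Base address of the specification site, as given on this page.
SITE_BASE = "transparency-certified.github.io/trace-specification"


def public_url(source_path: str) -> str:
    """Map a Markdown source path such as docs/my-new-page.md to its built HTML URL."""
    html_path = PurePosixPath(source_path).with_suffix(".html")
    return f"{SITE_BASE}/{html_path}"
```

For instance, `public_url("docs/my-new-page.md")` yields the address where the page should appear once the deploy succeeds.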