con/serve

Vision

Note on domain examples: The con/serve platform is domain-agnostic – git-annex, DataLad, and the patterns described here work for any field. Many examples reference neuroscience and psychology because that is our primary domain at the Center for Open Neuroscience, but the tools and architecture apply wherever digital artifacts need archiving.

Research generates far more than code and data. Every Slack thread, Zoom recording, GitHub discussion, AI coding session, and conference PDF is a piece of the scholarly record – and most of it is quietly rotting on someone else’s servers.

con/serve extends YODA principles to all digital research artifacts. If it matters to your work, it belongs in a version-controlled, content-addressed repository where you, not a SaaS provider, own the bits.

The core idea is a bidirectional funnel:

  1. Ingest from dozens of sources – messaging platforms, video hosting, code forges, cloud storage, reference managers, AI assistants – into a git-annex / DataLad vault.
  2. Conserve and distribute outward to domain archives, cloud backups, institutional knowledge bases, and web publications.

Diagram: the bidirectional funnel.

INBOUND (archive & import into the vault): Communications (Slack, Matrix, Email); Media (YouTube, Zoom); Code Artifacts (Issues, Wikis); AI Sessions (Claude, Cursor); Publications (Citations, PDFs); Cloud Storage (rclone, 70+ providers).

THE VAULT: git-annex (content-addressed storage); DataLad (dataset management); YODA / STAMPED (organization & principles); Working Surfaces (Hugo, HedgeDoc, LLM agents).

OUTBOUND (publish & distribute from the vault): Domain Archives (OpenNeuro, DANDI, OSF); Cloud Backup (S3, Glacier, Dropbox); Web Publishing (GitHub Pages).

For the full detailed diagram with all tools and connections, see the Architecture section.

See also the closing Beyond YODA slide and video from the YODA & BIDS webinar.
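
The funnel described above can be illustrated with a toy, standard-library-only sketch of content addressing: files are ingested under a key derived from their bytes, then copied outward and verified on arrival. This is not how git-annex or DataLad are implemented; their actual key formats, repository layout, and remote handling are considerably richer. The function names (`content_key`, `ingest`, `distribute`) and the `slack-export.json` file are hypothetical, chosen only for illustration.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def content_key(path: Path) -> str:
    """Derive a key from the file's bytes alone (SHA-256 hex digest)."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Read in 1 MiB chunks so large recordings do not fill memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def ingest(source: Path, vault: Path) -> str:
    """Inbound: copy a file into the vault under its content key."""
    key = content_key(source)
    vault.mkdir(parents=True, exist_ok=True)
    target = vault / key
    if not target.exists():  # identical bytes are stored only once
        shutil.copyfile(source, target)
    return key


def distribute(key: str, vault: Path, remote: Path) -> Path:
    """Outbound: copy one object to a remote and verify it on arrival."""
    remote.mkdir(parents=True, exist_ok=True)
    copy = remote / key
    shutil.copyfile(vault / key, copy)
    if content_key(copy) != key:
        raise IOError(f"checksum mismatch for {key}")
    return copy


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        export = root / "slack-export.json"  # hypothetical inbound artifact
        export.write_text('{"channel": "general", "messages": []}')
        key = ingest(export, root / "vault")
        copy = distribute(key, root / "vault", root / "backup")
        print(key[:12], copy.exists())
```

Because the key depends only on content, ingesting the same file from two sources stores it once, and any copy on any remote can be checked against its own name. That property is what makes a single vault usable as both the destination of ingestion and the source of distribution.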

Explore

Section – What you will find
About – Project vision, principles, and how to contribute
  Architecture – full system diagram (inbound / vault / outbound)
  YODA Principles – how con/serve extends YODA to all artifacts
  STAMPED Framework – guiding principles for artifact management
  Frozen Frontiers – deliberate working boundaries and compressed context
  Privacy & Access – archive aggressively, distribute selectively
  Integration Levels – native-datalad, git-annex, git-only, external
  AI Readiness – ai-ready, ai-partial, ai-manual
  Contributing – how to add tools and content
Tools – Catalog of archival tools organized by artifact type
Infrastructure – Self-hosted services for managing and serving archived artifacts
Concepts – Cross-cutting patterns for ingestion, conservation, and distribution
User Stories – Concrete archival scenarios that drive development, from personal vaults to lab infrastructure
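
The Integration Levels and AI Readiness labels listed above lend themselves to a small typed catalog. The sketch below is one possible way to model them, assuming each catalog entry carries exactly one label of each kind. The label strings come from the list above; the `ToolEntry` fields, the `select` helper, and the three example entries are hypothetical placeholders, not real tools or their real classifications.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional


class IntegrationLevel(Enum):
    NATIVE_DATALAD = "native-datalad"
    GIT_ANNEX = "git-annex"
    GIT_ONLY = "git-only"
    EXTERNAL = "external"


class AIReadiness(Enum):
    AI_READY = "ai-ready"
    AI_PARTIAL = "ai-partial"
    AI_MANUAL = "ai-manual"


@dataclass(frozen=True)
class ToolEntry:
    """One catalog row; the field names are assumptions for this sketch."""
    name: str
    artifact_type: str
    integration: IntegrationLevel
    ai_readiness: AIReadiness


def select(
    entries: Iterable[ToolEntry],
    integration: Optional[IntegrationLevel] = None,
    ai_readiness: Optional[AIReadiness] = None,
) -> List[ToolEntry]:
    """Return entries matching every label that was given (None = any)."""
    return [
        e for e in entries
        if (integration is None or e.integration is integration)
        and (ai_readiness is None or e.ai_readiness is ai_readiness)
    ]


# Placeholder entries only: names and labels are invented for illustration.
CATALOG = [
    ToolEntry("example-chat-exporter", "communications",
              IntegrationLevel.GIT_ANNEX, AIReadiness.AI_READY),
    ToolEntry("example-video-fetcher", "media",
              IntegrationLevel.EXTERNAL, AIReadiness.AI_PARTIAL),
    ToolEntry("example-forge-mirror", "code",
              IntegrationLevel.GIT_ONLY, AIReadiness.AI_READY),
]

if __name__ == "__main__":
    for entry in select(CATALOG, ai_readiness=AIReadiness.AI_READY):
        print(entry.name, entry.integration.value)
```

Using enums rather than free-form strings means a mistyped label such as `"ai-redy"` fails loudly when the catalog is loaded instead of silently creating a new category.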

Recent

Brain Imaging Center

A brain imaging center operating an MRI scanner for multiple research labs. The center collects DICOMs with ReproIn conventions, captures stimuli with ReproStim and behavioral events with CurDes BIRCH, runs weekly phantom QA to monitor scanner health, records environmental conditions, and provides streamlined preprocessing via HPC. Data flows from a single scanner to many labs, each with its own vault, while the center maintains its own operational datasets.

Neuroimaging Lab

A neuroimaging research lab doing MRI experiments with ReproIn-convention DICOMs, ReproStim stimulus capture, CurDes BIRCH behavioral events, Slack for communication, Google Calendar for scheduling, and GitHub for processing code. Data flows through HeuDiConv into BIDS, then MRIQC and fMRIPrep for preprocessing, with results published to OpenNeuro or DANDI.

Software Project

An open-source software project spanning a GitHub organization with dozens of repositories, active issue trackers, GitHub Actions CI, Discussions, and Slack for team communication. Prototypical target: the DANDI project. The goal is to archive all forge artifacts (repos, issues, PRs, Actions logs, Discussions, wikis) with ongoing sync, preserve Slack history, and mirror them to a Forgejo-Aneksajo instance with GitHub authentication.

Annotation Garden

An initiative providing open infrastructure for collaborative annotation of stimuli used in neuroscience research. Uses git branches as stackable annotation layers, BIDS/HED standards for interoperability, and AI-accelerated annotation generation with human refinement. Directly relevant to ReproStim-captured audio/video stimuli and generalizable to any experiment with media needing temporal annotation.

Google Takeout

Google’s official data export service, producing massive archives covering Gmail, Google Photos, Drive, YouTube history, Calendar, Contacts, Location History, and dozens more services. A single Takeout dump is the largest personal data ingestion event most people will ever perform – and a critical starting point for anyone building a personal digital archive.

Personal Archive

The complete personal digital archive: ingesting a Google Takeout dump, organizing photos with AI-assisted browsing, archiving personal Telegram channels, preserving YouTube watch history and subscriptions, and tying it all together in a git-annex vault with browsable frontends.

PhotoPrism

A self-hosted, AI-powered photo management application with face recognition, automatic categorization, map views, and album management. In the con/serve architecture, PhotoPrism serves as a rich visualization frontend over git-annex-tracked photo archives, applying the data-visualization separation principle: photos live in git-annex, PhotoPrism provides the browsable UI.