Tools
This section catalogs tools for ingesting digital research artifacts into version-controlled, content-addressed repositories. Each tool entry includes integration guidance for git-annex and DataLad, an AI readiness assessment, and links to upstream documentation.
Taxonomy
Every tool is classified along four axes:
- Category – the type of artifact the tool handles: Communications | Media | Code Artifacts | Cloud Storage | Publications | Web | AI Sessions.
- Media type – the specific format or platform (e.g., slack, youtube, github-issues). A tool may handle multiple media types.
- Integration level – how deeply the tool integrates with the git-annex/DataLad stack: native-datalad | git-annex | git-only | external (see Integration Levels for definitions).
- AI readiness – how consumable the archived output is for LLM-based workflows: ai-ready | ai-partial | ai-manual (see AI Readiness Levels for definitions).
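The four axes can be read as a small record type. The sketch below is illustrative only and is not part of any tool in this catalog: the names `ToolEntry`, `Category`, `IntegrationLevel`, and `AIReadiness` are hypothetical, as is the example entry. Only the axis values themselves come from the taxonomy above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Category(Enum):
    COMMUNICATIONS = "Communications"
    MEDIA = "Media"
    CODE_ARTIFACTS = "Code Artifacts"
    CLOUD_STORAGE = "Cloud Storage"
    PUBLICATIONS = "Publications"
    WEB = "Web"
    AI_SESSIONS = "AI Sessions"


class IntegrationLevel(Enum):
    NATIVE_DATALAD = "native-datalad"
    GIT_ANNEX = "git-annex"
    GIT_ONLY = "git-only"
    EXTERNAL = "external"


class AIReadiness(Enum):
    AI_READY = "ai-ready"
    AI_PARTIAL = "ai-partial"
    AI_MANUAL = "ai-manual"


@dataclass(frozen=True)
class ToolEntry:
    """Hypothetical record for one catalog entry (assumed shape, not an official schema)."""
    name: str
    category: Category
    media_types: Tuple[str, ...]  # a tool may handle multiple media types
    integration: IntegrationLevel
    ai_readiness: AIReadiness


# Illustrative entry only; the values are not an assessment of any real tool.
example = ToolEntry(
    name="example-slack-archiver",
    category=Category.COMMUNICATIONS,
    media_types=("slack",),
    integration=IntegrationLevel("git-annex"),   # look up by the label used in the taxonomy
    ai_readiness=AIReadiness("ai-partial"),
)
```

Using enumerations for three of the axes means a mistyped label such as `"ai-redy"` fails at construction time with a `ValueError`, while media type stays a free-form string because the set of formats and platforms is open-ended.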
Sections
- Communications – Slack, Telegram, Matrix, Mattermost, email
- Media – YouTube, Zoom, podcasts, image galleries
- Code Artifacts – GitHub issues, PRs, discussions, wikis
- Cloud Storage – Google Drive, Dropbox, S3, and 70+ providers via rclone
- Publications – Scholarly citations, PDFs, reference management
- Web – Web page and site archival
- AI Sessions – Claude Code, Cursor, Entire.io session capture