A self-hosted internet archiving tool that takes URLs and saves them in multiple formats – HTML, PDF, screenshot, WARC, media files – for long-term preservation.
A cloud-native, headless browser-based web crawler that creates high-fidelity WARC archives, capturing JavaScript-rendered content that traditional crawlers miss.
A single-file Python file server supporting HTTP, WebDAV, SFTP, FTP, and TFTP with resumable uploads, content-based deduplication, media indexing, and a full-featured web interface. Useful as an ingestion front-end and as a quick file-sharing tool within research infrastructure.
A self-hosted collaborative Markdown editor for meeting notes, lab notebooks, and documentation. Documents are exported and committed to git for long-term preservation.
A mature, well-established website mirroring tool that creates offline-browsable copies of entire websites, preserving directory structure and link integrity.
Built-in Mattermost export tooling using mmctl to produce JSONL archives of teams, channels, users, and posts, suitable for long-term preservation in git-annex.
A browser extension (and CLI tool) that saves a complete web page – including CSS, images, fonts, and iframes – into a single, self-contained HTML file.
Integration with the Zotero reference manager for synchronizing curated reference collections with DataLad datasets. Exports BibTeX, JSON, and structured metadata for git-tracked bibliography management.
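
As an illustration of the content-based deduplication mentioned for the file server above, the following is a minimal sketch of the general idea: files are identified by a hash of their bytes rather than by name, so identical uploads can be detected wherever they are stored. It is not the file server's actual implementation. The function names `content_digest` and `find_duplicates` and the choice of SHA-256 are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def content_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks to bound memory use."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root: Path) -> dict[str, list[Path]]:
    """Group files under `root` by content digest; keep only groups with more than one file."""
    by_digest: dict[str, list[Path]] = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            by_digest.setdefault(content_digest(path), []).append(path)
    return {digest: paths for digest, paths in by_digest.items() if len(paths) > 1}
```

Calling `find_duplicates(Path("incoming"))` on a hypothetical `incoming` directory would return a mapping from digest to the list of paths sharing that content. A real system might then replace the extra copies with links instead of storing them again.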