A self-hosted internet archiving tool that takes URLs and saves each page in multiple formats – HTML, PDF, screenshot, WARC, media files – for long-term preservation.
A cloud-native web crawler that drives a headless browser to create high-fidelity WARC archives, capturing JavaScript-rendered content that traditional crawlers miss.
A DataLad extension that creates browsable Vue.js web catalogs from dataset metadata, with schema validation and support for arbitrary metadata sources.
A browser extension (and CLI tool) that saves a complete web page – including CSS, images, fonts, and iframes – into a single, self-contained HTML file.