ArchiveBox is an open-source self-hosted web archiving system for the web and the desktop

Keep track of your URLs, pages, and websites with ArchiveBox

What is ArchiveBox?

ArchiveBox is a web-based self-hosted web archiving system that you can use to record and archive online links, web pages, and media pages in a single database.

With ArchiveBox you can save your collection, then share it or keep it private for your own use.

Moreover, it is open source and easy to set up, configure, and use. Anyone can install it and run it on their own server.

ArchiveBox comes with a command-line app that works directly from your terminal, a web application that works in all modern browsers, and a newly released desktop app (in alpha stage) that runs on Windows, Linux, and macOS.

ArchiveBox Features

  • Free & open source, doesn’t require signing up online, stores all data locally
  • Powerful, intuitive command line interface with modular optional dependencies
  • Supports dozens of file formats: audio, video, text, docs, media
  • Easy to set up, configure, and use
  • Comes with strong search functionality
  • Saves and records dozens of URLs (links) separately or in a batch
  • A clutter-free web user-interface
  • Comprehensive documentation
  • Active development
  • Has a rich community
  • Supports tags
  • Screenshot support for links
  • Automatically checks whether links are valid
  • Takes a snapshot of your URLs/links
  • Automatically checks for page errors, stats, and headers
  • Extracts URL headers, metadata, and included media
  • Exports a URL to PDF or as a screenshot image
  • Checks the readability of pages
  • Extracts a wide variety of content out-of-the-box: media (youtube-dl), articles (readability), code (git), etc.
  • Supports scheduled/real-time importing from many types of sources
  • Uses standard, durable, long-term formats like HTML, JSON, PDF, PNG, and WARC
  • Usable as a one-shot CLI, self-hosted web UI, Python API (BETA), REST API (ALPHA), or desktop app (ALPHA)
  • Saves all pages to archive.org as well by default for redundancy (can be disabled for local-only mode)
  • Advanced users: support for archiving content requiring login/paywall/cookies (see wiki security caveats!)
  • SQLite support
  • Import/export options
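
To make the batch-saving feature above more concrete, here is a minimal Python sketch that drives the ArchiveBox command line from a script. It assumes ArchiveBox is already installed, that the `archivebox` command is on your PATH, and that the collection folder has been initialized; the helper names (`build_add_command`, `urls_to_stdin`, `add_batch`) and the `data_dir` parameter are made up for this illustration and are not part of ArchiveBox itself. It relies on `archivebox add` reading URLs from standard input and accepting a `--depth` option; check `archivebox add --help` for what your version supports.

```python
import shutil
import subprocess


def build_add_command(depth=0):
    """Return the argument list for `archivebox add`, with URLs supplied on stdin."""
    # Assumption: depth is a small non-negative integer (0 = only the given URLs).
    if depth < 0:
        raise ValueError("depth must not be negative")
    return ["archivebox", "add", f"--depth={depth}"]


def urls_to_stdin(urls):
    """Turn a list of URLs into newline-separated text, skipping blanks and repeats."""
    seen = []
    for url in urls:
        url = url.strip()
        if url and url not in seen:
            seen.append(url)
    return "\n".join(seen) + "\n"


def add_batch(urls, depth=0, data_dir="."):
    """Send a batch of URLs to ArchiveBox (data_dir is a hypothetical collection folder)."""
    if shutil.which("archivebox") is None:
        raise RuntimeError("archivebox is not installed or not on PATH")
    return subprocess.run(
        build_add_command(depth),
        input=urls_to_stdin(urls),  # URLs are piped in, one per line
        text=True,
        cwd=data_dir,               # run inside the collection folder
        check=True,                 # raise if the command fails
    )
```

Calling `add_batch(["https://example.com", "https://example.org"])` from inside an initialized collection folder would hand both links to ArchiveBox in one go. Keeping the command building separate from the actual run makes it easy to inspect what will be executed before anything is archived.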

Platforms

  1. Web-based self-hosted
  2. Docker
  3. Linux
  4. macOS
  5. Windows

Get ArchiveBox

License

ArchiveBox is released as an open-source project under the MIT license.

Resources