ArchiveBox is an open-source self-hosted web archiving system for the web and the desktop

Keep track of your URLs, pages, and websites with ArchiveBox

What is ArchiveBox?

ArchiveBox is a web-based self-hosted web archiving system that you can use to record and archive online links, web pages, and media pages in a single database.

With ArchiveBox, you can save your collection and either share it or keep it private for your own use.

Moreover, it is open source and easy to set up, configure, and use. Anyone can install it on their own server and start using it right away.

ArchiveBox comes with a command-line app that works directly from your terminal, a web application that works in all modern browsers, and a newly released desktop app (in alpha stage) for Windows, Linux, and macOS.
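
As a rough illustration of the command-line workflow, the sketch below drives the CLI from Python. It assumes the `archivebox` executable is installed and on your PATH, that the `init` and `add` subcommands and the `--depth` option behave as described in the ArchiveBox documentation, and that an initialised collection contains an `index.sqlite3` file. The helper names `build_add_command` and `archive_urls` and the `./archive` directory are hypothetical and used only for illustration.

```python
import shutil
import subprocess
from pathlib import Path


def build_add_command(urls, depth=0):
    """Return the argv list for `archivebox add`; nothing is executed here."""
    if not urls:
        raise ValueError("at least one URL is required")
    if depth < 0:
        raise ValueError("depth must not be negative")
    # Assumption: depth=0 archives only the given pages, depth=1 also follows their links.
    return ["archivebox", "add", f"--depth={depth}", *urls]


def archive_urls(urls, data_dir="./archive", depth=0):
    """Initialise a collection in data_dir if needed, then add the URLs to it."""
    if shutil.which("archivebox") is None:
        raise RuntimeError("archivebox executable not found on PATH")
    data_path = Path(data_dir)
    data_path.mkdir(parents=True, exist_ok=True)
    # ArchiveBox commands operate on the current working directory,
    # so each command is run with cwd set to the collection folder.
    if not (data_path / "index.sqlite3").exists():
        subprocess.run(["archivebox", "init"], cwd=data_path, check=True)
    subprocess.run(build_add_command(urls, depth), cwd=data_path, check=True)
```

Calling `archive_urls(["https://example.com"])` would then create the collection on first use and archive the page. Keeping command construction separate from execution makes the wrapper easy to check without ArchiveBox installed.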

ArchiveBox Features

  • Free & open source, doesn’t require signing up online, stores all data locally
  • Powerful, intuitive command line interface with modular optional dependencies
  • Supports dozens of file formats: audio, video, text, documents, and other media
  • Easy to set up, configure, and use
  • Strong search functionality
  • Saves and records URLs (links) one at a time or in a batch
  • A clutter-free web user interface
  • Comprehensive documentation
  • Active development
  • A rich community
  • Supports tags
  • Screenshot support for links
  • Automatically checks whether links are valid
  • Takes a snapshot of your URLs/links
  • Automatically checks for page errors, stats, and headers
  • Extracts URL headers, metadata, and embedded media
  • Exports URLs to PDF or as a screenshot image
  • Checks the readability of pages
  • Extracts a wide variety of content out-of-the-box: media (youtube-dl), articles (readability), code (git), etc.
  • Supports scheduled/real-time importing from many types of sources
  • Uses standard, durable, long-term formats like HTML, JSON, PDF, PNG, and WARC
  • Usable as a one-shot CLI, self-hosted web UI, Python API (BETA), REST API (ALPHA), or desktop app (ALPHA)
  • Saves all pages to archive.org as well by default for redundancy (can be disabled for local-only mode)
  • Advanced users: support for archiving content requiring login/paywall/cookies (see wiki security caveats!)
  • SQLite support
  • Import/export options
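
For the batch-saving feature, it can help to tidy a list of links before handing it to ArchiveBox. The sketch below is one possible pre-processing step, not ArchiveBox's own validation logic: the function name `clean_url_batch` is hypothetical, and the rule of keeping only `http` and `https` links is an assumption made for this example.

```python
from urllib.parse import urlsplit


def clean_url_batch(lines):
    """Keep well-formed http(s) URLs, dropping blanks, comments, and duplicates."""
    seen = set()
    cleaned = []
    for line in lines:
        url = line.strip()
        # Skip empty lines and lines treated as comments in this sketch.
        if not url or url.startswith("#"):
            continue
        try:
            parts = urlsplit(url)
        except ValueError:
            # urlsplit rejects some malformed inputs, such as a broken IPv6 host.
            continue
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue
        # Preserve the original order while removing exact duplicates.
        if url not in seen:
            seen.add(url)
            cleaned.append(url)
    return cleaned
```

The cleaned list can be written one URL per line to a text file; the ArchiveBox documentation describes passing such a list to `archivebox add` on standard input.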

Platforms

  1. Web-based self-hosted
  2. Docker
  3. Linux
  4. macOS
  5. Windows

License

ArchiveBox is released as an open-source project under the MIT license.

By Hazem Abbas