Ambar: Libre Document Search Engine for Office, Text and PDF Documents

Hazem Abbas

Dec 10, 2022 — 1 min read

Ambar is an open-source document search engine with automated crawling, OCR, tagging and instant full-text search.

Ambar offers a new way to bring full-text document search into your workflow.

Easily deploy Ambar with a single docker-compose file
Run Google-like searches across your documents and the contents of your images
Tag your documents
Use a simple REST API to integrate Ambar into your workflow

Search

Tutorial: Mastering Ambar Search Queries

Fuzzy Search (John~3)
Phrase Search ("John Smith")
Search By Author (author:John)
Search By File Path (filename:*.txt)
Search By Date (when: yesterday, today, lastweek, etc)
Search By Size (size>1M)
Search By Tags (tags:ocr)
Search As You Type
Supported language analyzers: English ambar_en, Russian ambar_ru, German ambar_de, Italian ambar_it, Polish ambar_pl, Chinese ambar_cn, CJK ambar_cjk
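
The operators listed above can be combined into a single query string. The following is a minimal sketch of that idea in Python. The helper names (fuzzy, phrase, field, build_query) are hypothetical and are not part of Ambar, and joining several operators with a space is an assumption; check the search tutorial for the exact rules.

```python
def fuzzy(term: str, distance: int) -> str:
    """Fuzzy search term, written like John~3 in the list above."""
    return f"{term}~{distance}"


def phrase(text: str) -> str:
    """Phrase search: the text wrapped in double quotes."""
    return f'"{text}"'


def field(name: str, value: str) -> str:
    """Field filter such as author:John, filename:*.txt or tags:ocr."""
    return f"{name}:{value}"


def build_query(*parts: str) -> str:
    """Join the non-empty parts with single spaces (assumed combination rule)."""
    return " ".join(p for p in parts if p)


query = build_query(
    phrase("John Smith"),
    field("author", "John"),
    field("filename", "*.txt"),
    "size>1M",
    field("tags", "ocr"),
)
print(query)
```

The resulting string can be typed into the search box as is; each helper only reproduces the syntax shown in the list.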

Crawling

Ambar 2.0 supports only local file system crawling. If you need to crawl an SMB share or an FTP location, mount it using standard Linux tools. Crawling is automatic and needs no schedule, because the crawlers monitor file system events and automatically process new, changed and removed files.
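
To illustrate what "new, changed and removed files" means in practice, here is a small Python sketch. It is not Ambar's crawler: Ambar reacts to file system events, while this example uses a simpler technique, comparing two snapshots of a directory tree. The function names snapshot and diff are hypothetical.

```python
import os


def snapshot(root):
    """Map every file path under root to its (size, modification time)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                # The file disappeared between listing and stat; skip it.
                continue
            state[path] = (st.st_size, st.st_mtime_ns)
    return state


def diff(old, new):
    """Return sorted lists of added, changed and removed paths."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    return added, changed, removed
```

Calling snapshot on a mounted share before and after a change and passing both results to diff yields the three groups a crawler would have to index, re-index or drop.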

Content Extraction

Ambar supports large files (>30MB)

Supported file types:

ZIP archives
Mail archives (PST)
MS Office documents (Word, Excel, PowerPoint, Visio, Publisher)
OCR over images
Email messages with attachments
Adobe PDF (with OCR)
OpenOffice documents
RTF, Plaintext
HTML / XHTML

OCR languages: Eng, Rus, Ita, Deu, Fra, Spa, Pl, Nld

Multithreaded processing

License

Ambar is released under the MIT License.

Resources

https://github.com/RD17/ambar
