Tantivy is a full-text search engine library written in the Rust programming language.
It is closer to Apache Lucene than to Elasticsearch or Apache Solr, in the sense that it is not an off-the-shelf search engine server but rather a crate that can be used to build such a search engine.
Features
- Full-text search
- Configurable tokenizer (stemming available for 17 Latin languages), with third-party support for Chinese (tantivy-jieba and cang-jie), Japanese (lindera, Vaporetto, and tantivy-tokenizer-tiny-segmenter), and Korean (lindera + lindera-ko-dic-builder)
- Fast (check out the benchmark)
- Tiny startup time (<10ms), perfect for command-line tools
- BM25 scoring (the same as Lucene)
- Natural query language, e.g. (michael AND jackson) OR "king of pop"
- Phrase query search, e.g. "michael jackson"
- Incremental indexing
- Multithreaded indexing (indexing English Wikipedia takes < 3 minutes on my desktop)
- Mmap directory
- SIMD integer compression when the platform/CPU includes the SSE2 instruction set
- Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
- &[u8] fast fields
- Text, i64, u64, f64, dates, and hierarchical facet fields
- LZ4 compressed document store
- Range queries
- Faceted search
- Configurable indexing (optional term frequency and position indexing)
- JSON Field
- Aggregation Collector: range buckets, average, and stats metrics
- LogMergePolicy with deletes
- Searcher Warmer API
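To give a feel for what the BM25 scoring item in the list above refers to, here is a minimal sketch of the textbook BM25 formula using only the Rust standard library. It is not Tantivy's or Lucene's actual code: the function names, the parameter names, and the constants k1 = 1.2 and b = 0.75 (commonly cited defaults) are assumptions chosen for illustration, and real implementations may differ in constant factors and in how document lengths are stored.

```rust
/// Smoothed inverse document frequency: ln(1 + (N - n + 0.5) / (n + 0.5)),
/// where N is the number of documents and n the number containing the term.
fn idf(total_docs: u64, docs_with_term: u64) -> f64 {
    let n = total_docs as f64;
    let df = docs_with_term as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of a single query term to a single document.
/// (Hypothetical helper for illustration, not a Tantivy API.)
fn bm25_term_score(
    term_freq: u32,      // occurrences of the term in the document
    doc_len: u32,        // document length in tokens
    avg_doc_len: f64,    // average document length in the index
    total_docs: u64,     // number of documents in the index
    docs_with_term: u64, // number of documents containing the term
) -> f64 {
    const K1: f64 = 1.2; // term-frequency saturation (assumed default)
    const B: f64 = 0.75; // strength of length normalization (assumed default)

    let tf = term_freq as f64;
    // Longer-than-average documents get a larger denominator, hence a lower score.
    let length_norm = K1 * (1.0 - B + B * doc_len as f64 / avg_doc_len);
    idf(total_docs, docs_with_term) * tf * (K1 + 1.0) / (tf + length_norm)
}
```

A document's score for a multi-term query is the sum of `bm25_term_score` over the query terms. In this form, rarer terms and higher term frequencies raise the score, while documents longer than average are penalized.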
License
The project is released under the MIT License.