Tantivy: Open source Full-Text Search Engine

Hazem Abbas

Nov 22, 2022 — 1 min read

Tantivy is a full-text search engine library written in the Rust programming language.

It is closer to Apache Lucene than to Elasticsearch or Apache Solr: it is not an off-the-shelf search engine server, but a crate (a Rust library) that you can use to build one.
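
To make the library-versus-server distinction concrete, here is a toy sketch of a search structure that lives inside the application itself, with no separate server process. It is not Tantivy's API: `ToyIndex`, `add_document`, and `search_all` are hypothetical names invented for illustration, and the tokenizer is a deliberately naive assumption (lowercase, split on non-alphanumeric characters).

```rust
use std::collections::{BTreeSet, HashMap};

/// Toy in-memory inverted index: term -> sorted set of document ids.
/// Hypothetical type for illustration only; this is not Tantivy's API.
struct ToyIndex {
    postings: HashMap<String, BTreeSet<usize>>,
    docs: Vec<String>,
}

impl ToyIndex {
    fn new() -> Self {
        ToyIndex { postings: HashMap::new(), docs: Vec::new() }
    }

    /// Lowercase the text and split it on non-alphanumeric characters.
    fn tokenize(text: &str) -> Vec<String> {
        text.split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty())
            .map(|t| t.to_lowercase())
            .collect()
    }

    /// Store the document and record each of its terms; returns the new doc id.
    fn add_document(&mut self, text: &str) -> usize {
        let doc_id = self.docs.len();
        for term in Self::tokenize(text) {
            self.postings.entry(term).or_default().insert(doc_id);
        }
        self.docs.push(text.to_string());
        doc_id
    }

    /// Return the ids of documents that contain every query term (implicit AND).
    fn search_all(&self, query: &str) -> Vec<usize> {
        let mut result: Option<BTreeSet<usize>> = None;
        for term in Self::tokenize(query) {
            let ids = self.postings.get(&term).cloned().unwrap_or_default();
            result = Some(match result {
                None => ids,
                Some(acc) => acc.intersection(&ids).cloned().collect(),
            });
        }
        // An empty query matches nothing.
        result.unwrap_or_default().into_iter().collect()
    }
}

fn main() {
    let mut index = ToyIndex::new();
    index.add_document("Michael Jackson, the king of pop");
    index.add_document("Michael Jordan played basketball");
    index.add_document("Janet Jackson released a new album");

    for doc_id in index.search_all("michael jackson") {
        println!("{}: {}", doc_id, index.docs[doc_id]);
    }
}
```

A real engine library adds on-disk segments, scoring, and a query parser on top of this idea, but the calling pattern is the same: the application links the library and calls it directly.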

Features

Full-text search
Configurable tokenizer (stemming available for 17 Latin languages), with third-party support for Chinese (tantivy-jieba and cang-jie), Japanese (lindera, Vaporetto, and tantivy-tokenizer-tiny-segmenter), and Korean (lindera + lindera-ko-dic-builder)
Fast (check out the 🐎 ✨ benchmark ✨ 🐎)
Tiny startup time (<10ms), perfect for command-line tools
BM25 scoring (the same as Lucene)
Natural query language (e.g. (michael AND jackson) OR "king of pop")
Phrase queries (e.g. "michael jackson")
Incremental indexing
Multithreaded indexing (the project reports indexing English Wikipedia in under 3 minutes on a desktop machine)
Mmap directory
SIMD integer compression when the platform/CPU includes the SSE2 instruction set
Single valued and multivalued u64, i64, and f64 fast fields (equivalent of doc values in Lucene)
&[u8] fast fields
Text, i64, u64, f64, dates, and hierarchical facet fields
LZ4 compressed document store
Range queries
Faceted search
Configurable indexing (optional term frequency and position indexing)
JSON Field
Aggregation Collector: range buckets, average, and stats metrics
LogMergePolicy with deletes
Searcher Warmer API
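
The BM25 scoring listed among the features above can be sketched in a few lines. This is a minimal illustration of the textbook formula, not code taken from Tantivy: the function names are hypothetical, the parameters `K1 = 1.2` and `B = 0.75` are commonly used defaults assumed here, and the corpus statistics in `main` are made up.

```rust
/// Commonly used BM25 parameters; assumed here for illustration.
const K1: f64 = 1.2;
const B: f64 = 0.75;

/// Inverse document frequency in its non-negative form:
/// ln(1 + (N - df + 0.5) / (df + 0.5)).
fn idf(doc_count: usize, doc_freq: usize) -> f64 {
    let n = doc_count as f64;
    let df = doc_freq as f64;
    (1.0 + (n - df + 0.5) / (df + 0.5)).ln()
}

/// BM25 contribution of a single query term to one document's score.
fn bm25_term_score(
    term_freq: usize,   // occurrences of the term in this document
    doc_len: usize,     // number of tokens in this document
    avg_doc_len: f64,   // average document length in the corpus
    doc_count: usize,   // total number of documents
    doc_freq: usize,    // number of documents containing the term
) -> f64 {
    let tf = term_freq as f64;
    // Longer-than-average documents are penalized through this term.
    let length_norm = K1 * (1.0 - B + B * doc_len as f64 / avg_doc_len);
    idf(doc_count, doc_freq) * tf * (K1 + 1.0) / (tf + length_norm)
}

fn main() {
    // Hypothetical corpus statistics.
    let doc_count = 1000;
    let avg_doc_len = 120.0;

    let rare = bm25_term_score(2, 100, avg_doc_len, doc_count, 10);
    let common = bm25_term_score(2, 100, avg_doc_len, doc_count, 500);
    println!("rare term: {:.3}, common term: {:.3}", rare, common);
}
```

A document's score for a multi-term query is the sum of these per-term contributions. Rare terms contribute more than common ones, and repeated occurrences of a term yield diminishing returns.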

License

The project is released under the MIT License.

Resources

Source code
