txtai - Open-source embedded database for Semantic search, LLM orchestration

txtai is an all-in-one embeddings database for semantic search, LLM orchestration and language model workflows.

It offers versatile tools for processing text, audio, and image data.

txtai is built with Python 3.9+, Hugging Face Transformers, Sentence Transformers and FastAPI. It is open source under the Apache 2.0 license.

Features

  1. Embeddings & Semantic Search: Supports semantic search with embeddings for retrieving and ranking results based on similarity.
  2. Pipelines: Provides pipelines for processing text (e.g., summarization, translation, question answering), audio (e.g., transcription, text-to-speech), and images (e.g., captioning, object detection).
  3. Language Model Integration: Integrates large language models (LLMs) such as LLaMA, along with Transformers models, for tasks such as text generation, classification, and labeling.
  4. Retrieval-Augmented Generation (RAG): Implements RAG workflows, allowing models to retrieve information from external sources before generating responses.
  5. Workflows: Allows for complex workflows, connecting multiple processing tasks, e.g., transcribing, translating, and indexing documents.
  6. API and Distributed Setup: Offers a REST API and can be deployed across distributed environments, enabling scalable and adaptable solutions.
  7. 🔎 Vector search with SQL, object storage, topic modeling, graph analysis and multimodal indexing
  8. 📄 Create embeddings for text, documents, audio, images and video
  9. 💡 Pipelines powered by language models that run LLM prompts, question-answering, labeling, transcription, translation, summarization and more
  10. ↪️️ Workflows to join pipelines together and aggregate business logic. txtai processes can be simple microservices or multi-model workflows.
  11. ⚙️ Build with Python or YAML. API bindings available for JavaScript, Java, Rust and Go.
  12. ☁️ Run local or scale out with container orchestration
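
To make the "retrieving and ranking results based on similarity" idea from the feature list concrete, here is a minimal sketch in plain Python. It is not txtai's API: the `embed`, `cosine` and `search` helpers are hypothetical names made up for this illustration, and a bag-of-words count vector stands in for the learned embeddings that a real model would produce. Because of that stand-in, this toy only matches on shared words, whereas a real embeddings database can also match on meaning.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query, documents, limit=1):
    """Return (document index, score) pairs, highest similarity first."""
    query_vector = embed(query)
    scored = [(i, cosine(query_vector, embed(doc))) for i, doc in enumerate(documents)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


documents = [
    "transcribe audio files to text",
    "translate text between languages",
    "summarize long text documents",
]

# Rank the documents against a query and keep the two best matches
results = search("translate languages", documents, limit=2)
best_index, best_score = results[0]
print(documents[best_index], round(best_score, 3))
```

The same index-then-search shape is what an embeddings database provides at scale; swapping the toy `embed` function for a trained model is what turns keyword overlap into semantic search.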

License

Apache-2.0 License

Resources

txtai
txtai is an all-in-one open-source embeddings database for semantic search, LLM orchestration and language model workflows







