LocalAI is a drop-in replacement REST API compatible with the OpenAI API specifications for local inferencing. It allows you to run LLMs (and other models) locally or on-prem on consumer-grade hardware, supporting multiple model families compatible with the ggml format. It does not require a GPU.
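
To illustrate the drop-in compatibility, here is a minimal sketch using the official `openai` Python client pointed at a local instance. It assumes LocalAI is listening on `localhost:8080` and that a model named `ggml-gpt4all-j` has been configured; both names are placeholders for your own setup.

```python
# Minimal sketch: the standard OpenAI Python client pointed at a local
# endpoint. Assumes LocalAI is listening on localhost:8080 and that a
# model named "ggml-gpt4all-j" has been configured (both are
# assumptions that depend on your setup, not guaranteed defaults).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # LocalAI instead of api.openai.com
    api_key="not-needed",                 # LocalAI does not check the key by default
)

response = client.chat.completions.create(
    model="ggml-gpt4all-j",
    messages=[{"role": "user", "content": "How are you?"}],
)
print(response.choices[0].message.content)
```

Because only the base URL changes, existing OpenAI client code can usually be pointed at LocalAI without further modification.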

Features

Local, OpenAI drop-in alternative REST API. You own your data.

  • NO GPU required. NO Internet access is required either.
  • Optional GPU acceleration is available for llama.cpp-compatible LLMs. See also the build section.
  • Supports multiple model families:
    • 📖 Text generation with GPTs (llama.cpp, gpt4all.cpp, ... and more)
    • 🗣 Text to audio
    • 🔈 Audio to text (audio transcription with whisper.cpp; see the sketch after this list)
    • 🎨 Image generation with stable diffusion
    • 🔥 OpenAI functions 🆕
  • 🏃 Once a model is loaded the first time, it is kept in memory for faster inference.
  • ⚡ Doesn't shell out, but uses C++ bindings for faster inference and better performance.
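
As an example of the audio-to-text feature above, here is a hedged sketch of calling the OpenAI-style `/v1/audio/transcriptions` route with `requests`. The model name `whisper-1`, the server address, and the `audio.wav` file are assumptions that depend on your configuration.

```python
# Sketch of audio transcription against LocalAI's OpenAI-compatible
# /v1/audio/transcriptions endpoint. The request shape follows the
# OpenAI API (multipart form with "file" and "model"); the model name
# "whisper-1" and the local server address are assumptions.
import requests

with open("audio.wav", "rb") as f:
    resp = requests.post(
        "http://localhost:8080/v1/audio/transcriptions",
        files={"file": ("audio.wav", f, "audio/wav")},
        data={"model": "whisper-1"},
    )
resp.raise_for_status()
print(resp.json()["text"])
```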

License

Resources

GitHub - go-skynet/LocalAI: 🤖 Self-hosted, community-driven, local OpenAI-compatible API. Drop-in replacement for OpenAI running LLMs on consumer-grade hardware. Free, open-source OpenAI alternative. No GPU required. LocalAI is an API to run ggml-compatible models: llama, gpt4all, rwkv, whisper, vicuna, koala, gpt4all-j, cerebras, falcon, dolly, starcoder, and many others.