What is Text-to-Speech?
Text-to-speech or speech synthesis is an artificially generated human-sounding speech from text that recognize words and formulate human speech.
The first Text-To-Speech system was introduced to the world in 1968 by Noriko Umeda et al, at the Electrotechnical Laboratory in Japan.
In 1961, physicist John Larry Kelly, Jr and his colleague Louis Gerstman used an IBM 704 computer to synthesize speech, an event among the most prominent in the history of Bell Labs.
The benefits of TTS?
The primary advantageous of this technology are people with visual and reading impairments, as they were its first users.
People with dyslexia, or learning disabilities uses TTS to
Nowdays, many YouTube channels uses this technology in order to minimize their edit and increase their production.
In many modern operating system, Text-to-speech is a built-in accessibility feature to assist people who cannot read on-screen text easily.
About this list
In this article we offer you our collection of free, open-source Text-To-Speech (TTS) and speech synthesis apps.
1- MARY TTS
MARY TTS is an open-source, multilingual text-to-speech synthesis system written in pure java. It is available for Windows, Linux, and macOS.
MARY TTS is released under the LGPL-3.0 License.
Kaldi is a toolkit for speech recognition written in C++ and licensed under the Apache License v2.0.The source code is available at GitHub.
Kaldi can run on Windows, Linux, and macOS. It also can run on Android, PowerPC, and with Web Assembly.
OpenTTS is a free, open-source Open Text to Speech Server written in Python. It is released under the MIT License. It supports several languages, and comes with an easy-to-use interface. Furthermore, it comes with several alternatives libraries.
Supported languages: English (27), German (7), French (3), Spanish (2), Dutch (4), Russian (3), Swedish (1), Italian (2), Swahili (1), Finnish, Korean, Japanese, Chinese, Swedish, and more.
eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows. It supports several languages, and comes with dozens of useful features, which makes it the ideal choice for many users.
Afrikaans, Albanian, Aragonese, Armenian, Bulgarian, Cantonese, Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Farsi, Finnish, French, Georgian, German, Greek, Hindi, Hungarian, Icelandic, Indonesian, Irish, Italian, Kannada, Kurdish, Latvian, Lithuanian, Lojban, Macedonian, Malaysian, Malayalam, Mandarin, Nepalese, Norwegian, Polish, Portuguese, Punjabi, Romanian, Russian, Serbian, Slovak, Spanish, Swahili, Swedish, Tamil, Turkish, Vietnamese, Welsh.
5- Text To Speech Converter
This open-source project allows you to convert any text into speech easily by copying and paste the text into its simple interface. It is written in C# programming languages and runs on Windows for now.
6- ONLINE TTS
ONLINE TTS features simple shortcuts, and a clean user-interface.
Flite is a small, fast run-time synthesis library suitable for embedded systems and servers. The core Flite library was developed by Alan W Black firstname.lastname@example.org (mostly in his so-called spare time) while employed in the Language Technologies Institute at Carnegie Mellon University.
Flite supports Windows, Linux, macOS, Android, FreeBSD, and several other systems.
Julius is an open-source large vocabulary continuous speech recognition engine.
It is a high-performance, small-footprint large vocabulary continuous speech recognition (LVCSR) decoder software for speech-related researchers and developers. Based on word N-gram and context-dependent HMM.
Athena is an open-source implementation of sequence-to-sequence based speech processing engine
Hybrid Attention/CTC based end-to-end ASR
- Unsupervised pre-training
- Multi-GPU training on one machine or across multiple machines with Horovod
- End-to-end Tacotron2 based TTS with support for multi-speaker and GST
- Transformer based TTS and FastSpeech
- WFST creation and WFST-based decoding
- Deployment with Tensorflow C++
10- ESPnet: end-to-end speech processing toolkit
ESPnet is an end-to-end speech processing toolkit, mainly focuses on end-to-end speech recognition and end-to-end text-to-speech.
It is a developer-friendly application that can integrated into web projects. Developers also can install it using Docker.
11- Voice Builder
Voice Builder is an open source text-to-speech (TTS) voice building tool that focuses on simplicity, flexibility, and collaboration. Our tool allows anyone with basic computer skills to run voice training experiments and listen to the resulting synthesized voice.
12- Coqui TTS
Coqui TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality.
13- Mozilla TTS
Mozilla TTS is a library for advanced Text-to-Speech generation. It's built on the latest research, was designed to achieve the best trade-off among ease-of-training, speed and quality.
14- Mycoft Mimic
If you know any other open-source TTS application, toolkit, or library that we didn't mention here, let us know.