8 Open-source/Free Text Mining and Text Analysis Solutions
Ever wanted to analyze the text of documents or articles? There are several tools and web services that do this, but what about desktop programs?
In this article, we have collected several tools to help you do that, and they are all free, and most are open-source as well. We try to list the specific and unique features of each item to make it easy for our readers to pick what they need.
Orange is an open-source platform for machine learning, data analysis, text mining and data visualization. It has an interactive workflow and a rich toolset, and it comes with visual programming support.
Orange works seamlessly on Windows, Linux and macOS.
TexMiner is a free, open-source, generic text mining tool. It works on plain text files and PDFs. TexMiner supports multiple languages, including English, French, Spanish, Russian and German. It has thematic models for technical models, and it supports co-occurrence analysis, letter frequency analysis and central expressions.
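For readers new to these terms, here is a minimal Python sketch of what letter frequency analysis and co-occurrence analysis compute. It is not TexMiner's implementation: the function names, the sample text and the choice to count word pairs per sentence are assumptions made for illustration only, and the simple word pattern handles unaccented English text only.

```python
import re
from collections import Counter
from itertools import combinations


def letter_frequencies(text):
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def cooccurrences(text):
    """Count unordered word pairs that appear together in the same sentence.

    Assumption: sentences end with '.', '!' or '?', and a word is a run
    of ASCII letters or apostrophes.
    """
    pairs = Counter()
    for sentence in re.split(r"[.!?]+", text.lower()):
        # Use each distinct word once per sentence, sorted so that
        # ("patterns", "text") and ("text", "patterns") are the same key.
        words = sorted(set(re.findall(r"[a-z']+", sentence)))
        pairs.update(combinations(words, 2))
    return pairs


sample = "Text mining finds patterns. Text analysis explains patterns."
print(letter_frequencies(sample).most_common(3))
print(cooccurrences(sample).most_common(3))
```

In this sample, "text" and "patterns" appear together in both sentences, so that pair gets a count of 2, while a pair such as "mining" and "text" gets a count of 1.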
TextFlow is a cloud-based platform for machine learning, text mining and analysis. It supports visual programming and comes packed with open-source algorithms and NLP libraries.
TextFlow requires a login.
Textable is a text mining and analysis tool built on Orange. Because it uses Orange, it inherits features such as visual programming, visualization and scripting. It has a large feature set and free recipe libraries. Textable supports multiple input sources and formats (XML, CSV, TSV, HTML), and it handles both Unicode and non-Unicode text. It is very easy to use, especially for beginners.
mbFXWords is a lightweight open-source text analysis and mining solution. It uses OpenNLP and comes with several NLP extensions. It works on English, French and German documents. Even though it is a Java application, the current version only supports 64-bit Windows.
DataMelt is a free and open-source computation and visualization platform that supports many areas, including text mining and analysis. It also supports many scripting languages. We have reviewed it briefly in this article.
DataMelt has been used by many researchers for text mining, especially biomedical text mining.
Libro is a lightweight text analysis and mining program written in Python and Free Pascal/Lazarus. It can analyze text from different file formats, including text files, HTML files, ODF (OpenDocument Format) and ePub (eBook format).
We (Medevel.com) wrote an article about Libro a while back, listing all of its features with screenshots.
Libro is available for Windows, Linux and macOS.
Text Analysis Markup System (TAMS)
TAMS (Text Analysis Markup System) is a lightweight and completely free macOS application for text mining and analysis. It has a native, easy-to-use interface and contains many features for qualitative analysis, scripting, transcription and reporting.
This program supports PDF and RTF files, MySQL databases and XML files. I believe the killer feature of this small legacy app is that it can work with multiple files at once.
If you are using macOS, I recommend trying this software out.
We have collected the best tools out there to provide our readers with alternatives. However, we have excluded several old and abandoned desktop programs, as well as language-specific solutions. If you have an addition, or if you are a developer who would like to add your program to this list, please contact us and we will gladly add it.