8 Open-source/Free Text Mining and Text Analysis Solutions


Ever wanted to analyze the text of documents or articles? Several tools and web services offer this, but what about desktop programs?

In this article, we have collected several tools to help you do that, and all of them are free and open-source. We list the specific and unique features of each item to make it easy for our readers to pick what they need.


Orange

Orange in action (src: orange)

Orange is an open-source platform for machine learning, data analysis, text mining and data visualization. It has an interactive workflow and a rich toolset, and it comes with visual programming support.

Orange works seamlessly on Windows, Linux and macOS.


TexMiner


TexMiner is a free, open-source, generic text mining tool. It works on plain text files and PDFs. TexMiner supports multiple languages, including English, French, Spanish, Russian and German. It provides thematic models and supports co-occurrence analysis, letter frequency analysis and central expressions.

Platform: Windows.
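
To give a feel for what letter frequency and co-occurrence analysis mean in practice, here is a minimal Python sketch. It is not TexMiner's implementation; the function names, the simple regex tokenizer and the `window` parameter are our own assumptions for illustration only.

```python
from collections import Counter
import re


def letter_frequency(text):
    """Count how often each alphabetic character occurs, ignoring case."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def cooccurrence(text, window=2):
    """Count unordered word pairs that appear within `window` tokens of each other."""
    # Hypothetical tokenizer: lowercase the text and keep runs of word characters.
    tokens = re.findall(r"\w+", text.lower())
    pairs = Counter()
    for i, left in enumerate(tokens):
        # Look only at the next `window` tokens so each pair is counted once per position.
        for right in tokens[i + 1 : i + 1 + window]:
            if left != right:
                pairs[tuple(sorted((left, right)))] += 1
    return pairs


if __name__ == "__main__":
    sample = "Text mining finds patterns in text. Text analysis explains patterns."
    print(letter_frequency(sample).most_common(3))
    print(cooccurrence(sample).most_common(3))
```

A larger `window` links words that sit further apart, which surfaces looser associations at the cost of more noise. Real tools usually add stop-word filtering and language-specific tokenization on top of this.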


TextFlows

TextFlows is a cloud-based platform for machine learning, text mining and analysis. It supports visual programming and comes packed with open-source algorithms and NLP libraries.

TextFlows requires a login.


Textable

Textable is a text mining and analysis tool built on Orange. Because it uses Orange, it inherits features such as visual programming, visualization and scripting. It has a large set of features and free recipe libraries. Textable supports multiple input sources and formats (XML, CSV, TSV, HTML), as well as Unicode and non-Unicode text. It is very easy to use, especially for beginners.


mbFXWords


mbFXWords is a lightweight open-source text analysis and mining solution. It uses OpenNLP and comes with several NLP extensions. It works on English, French and German documents. Even though it is a Java application, the current version only supports 64-bit Windows.


DataMelt

DataMelt is a free and open-source computation and visualization platform that supports many areas, including text mining and analysis. It also supports many scripting languages. We have reviewed it briefly in a separate article.

DataMelt has been used by many researchers for text mining, especially biomedical text mining.


Libro

Libro is a lightweight text analysis and mining program written in Python and Free Pascal/Lazarus. It can analyze text from different file formats, including text files, HTML files, ODF (OpenDocument format) and ePub (eBook format).

We (Medevel.com) wrote an article about Libro a while back, listing all of its features with screenshots.

Libro: Open source Free Text Analysis Tool
Libro [http://librejo.sourceforge.net/] is an open source cross-platform software that provides a simple yet comprehensive text analysis for text files. It analyses the text content and generates graphs and an analytic summary for the text extracted from the files. Libro is created in Brazil and…

Libro is available for Windows, Linux and macOS.


Text Analysis Markup System (TAMS)

TAMS in action (src: TAMS)

TAMS (Text Analysis Markup System) is a lightweight and completely free macOS application for text mining and analysis. It has a native, easy-to-use interface and contains many features for qualitative analysis, scripting, transcription and reporting.

This program supports PDF and RTF files, MySQL databases and XML files. I believe the killer feature of this small legacy app is that it handles multiple files at once.


If you are using macOS, I recommend trying this software out.

Platform: macOS


Conclusion

We have collected the best tools out there to provide our readers with alternatives. However, we have excluded several old and abandoned desktop programs, as well as language-specific solutions. If you have an addition to suggest, or if you are a developer who would like to add your program to this list, please contact us and we will gladly add it.






Author: Hamza Mu

A physician with programming skills, Linux user since late 1990s, Open source supporter. Coding with Python, NodeJS (Meteor, VueJS, Express, D3, PhantomJS), SmallTalk & R language.




