8 Open-source/Free Text Mining and Text Analysis Solutions

Ever wanted to analyze text documents or articles? There are several tools and web services that do this, but what about desktop programs?

In this article, we have collected several tools to help you do that, and better yet, they are free and open-source. We will try to list the specific and unique features of each item to make it easy for our readers to pick what they need.


Orange

Orange in action (src: orange)

Orange is an open-source platform for machine learning, data analysis, text mining and data visualization. It has an interactive workflow and a rich toolset, and it comes with visual programming support.

Orange works seamlessly on Windows, Linux and macOS.


TexMiner


TexMiner is a free, open-source, generic text mining tool. It works on plain text files and PDFs. TexMiner supports multiple languages, including English, French, Spanish, Russian and German. It has thematic models for technical models, and it supports co-occurrence analysis, letter frequency analysis and central expressions.

Platform: Windows.
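
To give a rough idea of what letter frequency and co-occurrence analysis mean in practice, here is a minimal Python sketch. It is not TexMiner's code and says nothing about how TexMiner implements these features; the function names and the sample sentences are made up for illustration, and co-occurrence is assumed here to mean "two words appearing in the same sentence".

```python
from collections import Counter
from itertools import combinations


def letter_frequency(text):
    """Count how often each letter occurs, ignoring case and non-letters."""
    return Counter(ch for ch in text.lower() if ch.isalpha())


def cooccurrence(sentences):
    """Count unordered word pairs that appear together in the same sentence."""
    pairs = Counter()
    for sentence in sentences:
        # A sorted set counts each pair once per sentence and always
        # stores it in the same (alphabetical) order.
        words = sorted(set(sentence.lower().split()))
        pairs.update(combinations(words, 2))
    return pairs


# Hypothetical sample data, used only to exercise the two functions.
sample = ["text mining finds patterns", "text analysis finds themes"]
letters = letter_frequency(" ".join(sample))
pairs = cooccurrence(sample)

print(letters.most_common(3))
print(pairs[("finds", "text")])
```

A real tool would add proper tokenization, stop-word removal and language-specific handling on top of counts like these.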


TextFlows

TextFlows is a cloud-based platform for machine learning, text mining and analysis. It supports visual programming and comes packed with open-source algorithms and NLP libraries.

TextFlows requires a login.


Textable

Textable is an amazing text mining and analysis tool built on Orange. Because it uses Orange, it inherits features like visual programming, visualization and scripting. It has a large set of features and free recipe libraries. Textable supports multiple input sources and formats (XML, CSV, TSV, HTML). It also supports Unicode and non-Unicode text. It's very easy to use, especially for beginners.


mbFXWords

mbFXWords is a lightweight open-source text analysis and mining solution. It uses OpenNLP and comes with several NLP extensions. It works on English, French and German documents. Even though it's a Java application, the current version only supports 64-bit Windows.


DataMelt

DataMelt is a free and open-source computation and visualization platform that supports many areas, including text mining and analysis. It also supports many scripting languages. We have reviewed it briefly in a separate article.

DataMelt has been used by many researchers for text mining, especially biomedical text mining.


Libro

Libro is a lightweight text analysis and mining program written in Python and Free Pascal/Lazarus. It can analyze text from different file formats, including text files, HTML files, ODF (OpenDocument format) and ePub (eBook format).

We (Medevel.com) wrote an article about Libro a while back, listing all of its features with screenshots.

Libro: Open source Free Text Analysis Tool
Libro [http://librejo.sourceforge.net/] is an open source cross-platform software that provides a simple yet comprehensive text analysis for text files. It analyses the text content and generates graphs and an analytic summary for the extracted text from the text files. Libro is created in Brazil and…

Libro is available for Windows, Linux and macOS.


Text Analysis Markup System (TAMS)

TAMS in action (src: TAMS)

TAMS (Text Analysis Markup System) is a lightweight and completely free macOS application for text mining and analysis. It has a native, easy-to-use interface and contains many features for qualitative analysis, scripting, transcription and reporting.

This program supports PDF and RTF files, MySQL databases and XML files. I believe the killer feature of this small legacy app is its support for working with multiple files at once.

If you are using macOS, I recommend trying this amazing software out.

Platform: macOS


Conclusion

We have collected the best tools out there to give our readers alternatives. However, we have excluded several old and abandoned desktop programs, as well as language-specific solutions. If you have any additions, or if you are a developer who would like to add a program to this list, please contact us and we will gladly add it.


