WEKA (Waikato Environment for Knowledge Analysis) is a collection of machine learning algorithms for data mining tasks. It contains tools for data preparation, classification, regression, clustering, association rule mining, and visualization. It also supports Deep Learning.

It is written in Java and developed at the University of Waikato, New Zealand. Weka is open source software released under the GNU General Public License.

Weka provides access to SQL databases using Java Database Connectivity (JDBC) and allows the result of an SQL query to be used as a source of data. Weka itself does not support multi-relational data mining; however, there are separate tools for combining a collection of linked database tables into a single table, which can then be loaded directly into Weka.
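
As a minimal sketch of the database route (the JDBC URL, credentials, and query below are placeholders, and the matching JDBC driver must be on the classpath), the weka.experiment.InstanceQuery class can turn a query result into a Weka dataset from Java:

    import weka.core.Instances;
    import weka.experiment.InstanceQuery;

    public class JdbcSourceSketch {
        public static void main(String[] args) throws Exception {
            // Wrap a JDBC connection and turn a query result into Instances
            InstanceQuery query = new InstanceQuery();
            query.setDatabaseURL("jdbc:mysql://localhost:3306/mydb"); // placeholder URL
            query.setUsername("user");                                // placeholder credentials
            query.setPassword("secret");
            query.setQuery("SELECT * FROM measurements");             // placeholder query
            Instances data = query.retrieveInstances();
            System.out.println(data.numInstances() + " rows loaded as a Weka dataset");
        }
    }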

There are two versions of Weka: Weka 3.8 is the latest stable version and Weka 3.9 is the development version. For the bleeding edge, it is also possible to download nightly snapshots. Stable versions receive only bug fixes, while the development version receives new features.

Weka 3.8 and 3.9 feature a package management system that makes it easy for the Weka community to add new functionality to Weka.
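
For illustration, packages can be managed from the command line through weka.core.WekaPackageManager as well as from the GUI. The commands below are a sketch (flag syntax as given in the package manager's built-in help), with the Deeplearning4j-based deep learning package used as an example package name:

    java weka.core.WekaPackageManager -list-packages available
    java weka.core.WekaPackageManager -install-package wekaDeeplearning4j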

User Interfaces:

Weka has five user interfaces:

1- Simple CLI:

Provides full access to all Weka classes, i.e., classifiers, filters, clusterers, etc., but without the hassle of setting the CLASSPATH. It offers a simple Weka shell with separate command-line input and output areas.
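
For example, inside the Simple CLI a scheme can be run directly by class name. The command below (the dataset path is illustrative) trains a RandomForest on an ARFF training file and reports the default 10-fold cross-validation results:

    java weka.classifiers.trees.RandomForest -t data/iris.arff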

2- Explorer:

An environment for exploring data with WEKA. It has the following tabs (a rough API equivalent of the preprocess-and-classify workflow is sketched after this list):

  • Preprocess: which enables you to choose and modify the data being acted on.
  • Classify: to train and test learning schemes that classify or perform regression.
  • Cluster: to learn clusters for the data.
  • Associate: to learn association rules for the data.
  • Select attributes: to select the most relevant attributes in the data.
  • Visualize: which enables you to view an interactive 2D plot of the data.
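
As a rough programmatic counterpart of the Preprocess and Classify tabs (the dataset name is a placeholder, and Normalize stands in for any preprocessing filter), the following sketch loads a file, filters it, and cross-validates a RandomForest:

    import java.util.Random;
    import weka.classifiers.Evaluation;
    import weka.classifiers.trees.RandomForest;
    import weka.core.Instances;
    import weka.core.converters.ConverterUtils.DataSource;
    import weka.filters.Filter;
    import weka.filters.unsupervised.attribute.Normalize;

    public class ExplorerLikeWorkflow {
        public static void main(String[] args) throws Exception {
            // Preprocess: load a dataset (the file name is a placeholder)
            Instances data = DataSource.read("iris.arff");
            data.setClassIndex(data.numAttributes() - 1); // use the last attribute as the class

            Normalize normalize = new Normalize();        // a simple preprocessing filter
            normalize.setInputFormat(data);
            Instances filtered = Filter.useFilter(data, normalize);

            // Classify: 10-fold cross-validation of a RandomForest
            RandomForest forest = new RandomForest();
            Evaluation eval = new Evaluation(filtered);
            eval.crossValidateModel(forest, filtered, 10, new Random(1));
            System.out.println(eval.toSummaryString());
        }
    }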

3- Experimenter:

An environment that enables the user to create, run, modify, and analyse experiments in a more convenient manner than is possible when processing the schemes individually.

4- KnowledgeFlow:

Supports essentially the same functions as the Explorer but with a drag-and-drop interface. The KnowledgeFlow can handle data either incrementally or in batches (the Explorer handles batch data only).
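
The same incremental idea can be expressed through the Java API. As a sketch (the file name is a placeholder, and NaiveBayesUpdateable stands in for any classifier implementing Weka's UpdateableClassifier interface, such as HoeffdingTree), instances are streamed from an ARFF file and fed to the model one at a time:

    import java.io.File;
    import weka.classifiers.bayes.NaiveBayesUpdateable;
    import weka.core.Instance;
    import weka.core.Instances;
    import weka.core.converters.ArffLoader;

    public class IncrementalSketch {
        public static void main(String[] args) throws Exception {
            // Read only the ARFF header, then stream instances one at a time
            ArffLoader loader = new ArffLoader();
            loader.setFile(new File("bigdata.arff"));     // placeholder file
            Instances structure = loader.getStructure();
            structure.setClassIndex(structure.numAttributes() - 1);

            // NaiveBayesUpdateable implements UpdateableClassifier and can
            // therefore be trained instance by instance rather than in one batch
            NaiveBayesUpdateable nb = new NaiveBayesUpdateable();
            nb.buildClassifier(structure);                // initialise from the header only
            Instance current;
            while ((current = loader.getNextInstance(structure)) != null) {
                nb.updateClassifier(current);
            }
            System.out.println(nb);
        }
    }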

5- Workbench:

A user interface available from Weka 3.8.0 onwards. The Workbench provides an all-in-one application that subsumes all the major WEKA GUIs described above.

Highlights:

  • Cross-platform support (Windows, macOS, and Linux).
  • Free and open source.
  • Ease of use (includes a GUI).
  • A comprehensive collection of data preprocessing and modelling techniques.
  • Supports Deep Learning.

Packages:

Weka has a large number of regression and classification tools. Some examples are listed below (a short sketch of how such schemes are instantiated from Java code follows the list):

  • BayesianLogisticRegression: Implements Bayesian Logistic Regression for both Gaussian and Laplace Priors.
  • BayesNet: Bayes Network learning using various search algorithms and quality measures.
  • GaussianProcesses: Implements Gaussian Processes for regression without hyperparameter-tuning.
  • LinearRegression: Class for using linear regression for prediction.
  • MultilayerPerceptron: A Classifier that uses backpropagation to classify instances.
  • NonNegativeLogisticRegression: Class for learning a logistic regression model that has non-negative coefficients.
  • PaceRegression: Class for building pace regression linear models and using them for prediction.
  • SMO: Implements John Platt's sequential minimal optimization algorithm for training a support vector classifier.
  • ADTree: Class for generating an alternating decision tree.
  • BFTree: Class for building a best-first decision tree classifier.
  • HoeffdingTree: A Hoeffding tree (VFDT) is an incremental, anytime decision tree induction algorithm that is capable of learning from massive data streams, assuming that the distribution generating examples does not change over time.
  • M5P: Generates M5 model trees and rules (based on the M5Base routines).
  • RandomForest: Class for constructing a forest of random trees.
  • RandomTree: Class for constructing a tree that considers K randomly chosen attributes at each node. Performs no pruning.
  • SimpleCart: Class implementing minimal cost-complexity pruning.
  • IBk: K-nearest neighbours classifier. Can select appropriate value of K based on cross-validation. Can also do distance weighting.
  • IB1: Nearest-neighbour classifier.
  • LBR: Lazy Bayesian Rules Classifier.
  • DecisionTable: Class for building and using a simple decision table majority classifier.
  • M5Rules: Generates a decision list for regression problems using separate-and-conquer. In each iteration it builds a model tree using M5 and makes the "best" leaf into a rule.
  • ZeroR: Class for building and using a 0-R classifier. Predicts the mean (for a numeric class) or the mode (for a nominal class).
  • ClassificationViaRegression: Class for doing classification using regression methods.
  • Stacking: Combines several classifiers using the stacking method. Can do classification or regression.
  • DataNearBalancedND: A meta classifier for handling multi-class datasets with 2-class classifiers by building a random data-balanced tree structure.
  • MDD: Modified Diverse Density algorithm, with collective assumption.
  • MINND: Multiple-Instance Nearest Neighbour with Distribution learner.
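
All of the schemes above are ordinary Java classes behind a common Classifier interface. As a small sketch (the IBk options -K 5 -X are merely illustrative), any of them can be instantiated by class name together with its command-line options:

    import weka.classifiers.AbstractClassifier;
    import weka.classifiers.Classifier;
    import weka.core.Utils;

    public class SchemeByNameSketch {
        public static void main(String[] args) throws Exception {
            // Build a listed scheme from its class name plus an option string;
            // here IBk with K = 5 and cross-validation-based selection of K
            String[] options = Utils.splitOptions("-K 5 -X");
            Classifier ibk = AbstractClassifier.forName("weka.classifiers.lazy.IBk", options);
            System.out.println(ibk.getClass().getName() + " configured");
        }
    }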

License:

GNU General Public License.

Conclusion:

WEKA is a good choice for a first introduction to machine learning, even for non-programmers, thanks to the simplicity of its graphical user interface.