NEO Share

Sharing The Latest Tech News

A Tour of the Weka ML Workbench

January 10, 2021 by systems

Johar M. Ashfaque

Weka is an easy-to-use and powerful machine learning platform. It provides a large number of machine learning algorithms, feature selection methods and data preparation filters.

The entry point into the Weka interface is the Weka GUI Chooser. It is an interface that lets you choose and launch a specific Weka environment:

Screenshot of the Weka GUI Chooser

In addition to providing access to the core Weka tools, the GUI Chooser offers a number of additional utilities through its menus.

There are two important utilities to note in the “Tools” menu:

1. The Package Manager, which lets you browse and install third-party add-ons to Weka such as new algorithms:

Screenshot of the Weka Package Manager

2. The ARFF-Viewer, which allows you to load and transform datasets and save them in ARFF format:

Screenshot of the Weka ARFF-Viewer
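
To make the ARFF format more concrete, here is a minimal sketch of what such a file looks like and how its header and data sections can be read. The toy dataset and the `parse_arff` helper are hypothetical and written only for illustration; the reader handles a simple subset of the format (no quoted values, sparse rows, or date attributes) and is not how Weka itself loads files.

```python
# A tiny ARFF document: '%' starts a comment, '@relation' names the dataset,
# each '@attribute' declares a column, and rows follow '@data'.
ARFF_TEXT = """\
% hypothetical toy dataset, not shipped with Weka
@relation toy-weather
@attribute outlook {sunny,overcast,rainy}
@attribute temperature numeric
@attribute play {yes,no}
@data
sunny,85,no
overcast,83,yes
rainy,70,yes
"""


def parse_arff(text):
    """Parse a simple ARFF string into (relation, attributes, rows)."""
    relation, attributes, rows = None, [], []
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and comments
        lower = line.lower()
        if in_data:
            rows.append(line.split(","))
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            # "@attribute <name> <type or {nominal values}>"
            _, name, kind = line.split(None, 2)
            attributes.append((name, kind))
        elif lower.startswith("@data"):
            in_data = True
    return relation, attributes, rows


relation, attributes, rows = parse_arff(ARFF_TEXT)
print(relation)
for name, kind in attributes:
    print(name, kind)
print(len(rows), "rows")
```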

The Weka Explorer is designed to investigate your machine learning dataset. It is useful when you are thinking about different data transforms and modelling algorithms that you could investigate with a controlled experiment later. It is excellent for getting ideas and playing what-if scenarios.

The interface is divided into 6 tabs, each with a specific function.

The preprocess tab is for loading your dataset and applying filters to transform the data into a form that better exposes the structure of the problem to the modelling process. It also provides some summary statistics about the loaded data.

Load a standard dataset in the data/ directory of your Weka installation, specifically data/breast-cancer.arff. This is a binary classification problem:

Screenshot of the Weka Explorer Preprocess Tab

The classify tab is for training and evaluating the performance of different machine learning algorithms on your classification or regression problem. Algorithms are divided up into groups, results are kept in a result list and summarized in the main classifier output.

Click the “Start” button to run the ZeroR classifier on the dataset and summarize the results:

Screenshot of the Weka Explorer Classify Tab
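
ZeroR is a useful baseline because it ignores every input attribute and, for a classification problem, always predicts the most frequent class. The sketch below illustrates that idea in plain Python. The function names and the label counts are made up for illustration, and the accuracy is measured on the same data used for fitting rather than with the cross-validation the Explorer would normally apply.

```python
from collections import Counter


def zero_r_fit(labels):
    """Return the most frequent class label (the ZeroR 'model')."""
    return Counter(labels).most_common(1)[0][0]


def accuracy(predicted_label, labels):
    """Fraction of instances whose label equals the constant prediction."""
    return sum(1 for y in labels if y == predicted_label) / len(labels)


# Hypothetical class labels for a binary problem (not real dataset counts).
labels = ["no"] * 7 + ["yes"] * 3

majority = zero_r_fit(labels)
acc = accuracy(majority, labels)
print(majority, acc)
```

Any other classifier you try on the same data should beat this baseline accuracy to be considered useful.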

The cluster tab is for training and evaluating the performance of different unsupervised clustering algorithms on your unlabelled dataset. Like the classify tab, algorithms are divided into groups, results are kept in a result list and summarized in the main cluster output.

Click the “Start” button to run the EM clustering algorithm on the dataset and summarize the results:

Screenshot of the Weka Explorer Cluster Tab

The associate tab is for automatically finding associations in a dataset. The techniques are often used for market basket analysis type data mining problems and require data where all attributes are categorical.

Click the “Start” button to run the Apriori association algorithm on the dataset and summarize the results:

Screenshot of the Weka Explorer Associate Tab
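
Association rules are usually ranked by support and confidence. The sketch below computes those two measures for a single rule over a handful of invented transactions. It is not the Apriori algorithm itself, which additionally searches and prunes candidate itemsets; the helper names and the data are assumptions made for illustration.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent | antecedent) from the transactions."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical market-basket data: each transaction is a set of items.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

rule_support = support(transactions, {"bread", "milk"})
rule_confidence = confidence(transactions, {"bread"}, {"milk"})
print(rule_support, rule_confidence)
```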

The select attributes tab is for performing feature selection on the loaded dataset and identifying those features that are most likely to be relevant in developing a predictive model.

Click the “Start” button to run the “CfsSubsetEval” algorithm with a “BestFirst” search on the dataset and summarize the results:

Screenshot of the Weka Explorer Select Attributes Tab

The visualize tab is for reviewing a pairwise scatterplot matrix of each attribute plotted against every other attribute in the loaded dataset. It is useful for getting an idea of the shape and relationship of attributes, which may aid in data filtering, transformation and modelling.

Increase the point size and the jitter and click the “Update” button to get an improved plot of the categorical attributes of the loaded dataset:

Screenshot of the Weka Explorer Visualize Tab
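
Jitter helps with categorical attributes because many instances share exactly the same coordinates and would otherwise be drawn on top of each other. A rough sketch of the idea follows, assuming categories have been encoded as integers; the `jitter` function and its `amount` parameter are hypothetical, and Weka's own jitter implementation may differ in detail.

```python
import random


def jitter(values, amount, seed=0):
    """Add uniform noise in [-amount, amount] to each value for plotting."""
    rng = random.Random(seed)  # fixed seed so the plot is reproducible
    return [v + rng.uniform(-amount, amount) for v in values]


# Hypothetical categorical attribute encoded as integer codes.
codes = [0, 0, 0, 1, 1, 2]
jittered = jitter(codes, amount=0.2)

for original, moved in zip(codes, jittered):
    print(original, round(moved, 3))
```

The noise is only applied for display; the underlying data values are left unchanged.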

The Weka Experiment Environment is for designing controlled experiments, running them, and then analysing the results collected. It is the next step after using the Weka Explorer, where you can load up one or more views of your dataset and a suite of algorithms and design an experiment to find the combination that results in the best performance.

The interface is split into 3 tabs.

The setup tab is for designing an experiment. This includes the file where results are written, the test setup in terms of how algorithms are evaluated, the datasets to model and the algorithms to model them. The specifics of an experiment can be saved for later use and modification.

  • Click the “New” button to create a new Experiment.
  • Click the “Add New…” button in the “Datasets” pane and select the data/diabetes.arff dataset.
  • Click the “Add New…” button in the “Algorithms” pane and click “OK” to add the ZeroR algorithm.

Screenshot of the Weka Experiment Environment Setup Tab

The run tab is for running your designed experiments. Experiments can be started and stopped. There is not a lot to it.

Click the “Start” button to run the small experiment you designed:

Screenshot of the Weka Experiment Environment Run Tab

The analyse tab is for analysing the results collected from an experiment. Results can be loaded from a file, from a database or from an experiment just completed in the tool. A number of performance measures are collected from a given experiment, and these can be compared between algorithms using tools such as statistical significance tests.

  • Click the “Experiment” button in the “Source” pane to load the results from the experiment you just ran.
  • Click the “Perform Test” button to summarize the classification accuracy results for the single algorithm in the experiment.

Screenshot of the Weka Experiment Environment Analyse Tab
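
When an experiment contains more than one algorithm, a significance test compares their scores run by run. The sketch below computes a plain paired t statistic over per-run accuracies. The accuracy values are invented, the function name is hypothetical, and this is the textbook paired test rather than whichever tester the Experiment Environment is configured to use.

```python
import math
from statistics import mean, stdev


def paired_t_statistic(scores_a, scores_b):
    """Paired t statistic for two equally long lists of per-run scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    n = len(diffs)
    # Mean difference divided by its standard error.
    return mean(diffs) / (stdev(diffs) / math.sqrt(n))


# Hypothetical accuracies of two algorithms over the same five runs.
acc_a = [0.80, 0.82, 0.78, 0.81, 0.79]
acc_b = [0.75, 0.78, 0.74, 0.77, 0.76]

t = paired_t_statistic(acc_a, acc_b)
print(round(t, 3))
```

The statistic would then be compared against a t distribution with n - 1 degrees of freedom to decide whether the difference is significant at the chosen level.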

The Weka KnowledgeFlow Environment is a graphical workflow tool for designing a machine learning pipeline from data source to results summary, and much more. Once designed, the pipeline can be executed and evaluated within the tool:

Screenshot of the Weka KnowledgeFlow Environment

The Weka Workbench is an environment that combines all of the graphical interfaces into a single interface.

Screenshot of the Weka Workbench

Weka can be used from a simple Command Line Interface (CLI). This is powerful because you can write shell scripts to use the full API from command line calls with parameters, allowing you to build models, run experiments and make predictions without a graphical user interface.

The SimpleCLI provides an environment where you can quickly and easily experiment with the Weka command line interface commands:

Screenshot of the Weka SimpleCLI
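
As a rough illustration of scripting Weka without the GUI, the sketch below assembles a command line that evaluates the ZeroR classifier on a training file. The `weka.jar` location is an assumption about your installation, and the command is only printed here, not executed; running it requires Java and the Weka jar to be available.

```python
import shlex

# Assumed locations; adjust to match your own Weka installation.
WEKA_JAR = "weka.jar"
DATASET = "data/breast-cancer.arff"

command = [
    "java",
    "-cp", WEKA_JAR,                 # put Weka on the Java classpath
    "weka.classifiers.rules.ZeroR",  # classifier class to run
    "-t", DATASET,                   # training file
    "-x", "10",                      # number of cross-validation folds
]

# Print the command as it would be typed in a shell.
print(shlex.join(command))

# To actually run it (requires Java and weka.jar), one could use:
# import subprocess
# subprocess.run(command, check=True)
```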

Filed Under: Artificial Intelligence
