Wekalink: using an AMIDST classifier in Weka

One of the strengths of AMIDST is its integration with other data analysis tools. Weka is one such tool: the wekalink module provides the integration functionality, which lets us create a wrapper for evaluating an AMIDST classifier with Weka.

Prepare your project

The first step is to load the required AMIDST dependencies in a Maven project. In this case, we need the modules wekalink and latent-variable-models. To do so, add the following code to the pom.xml file of your project.

<dependencies>

  <dependency>
    <groupId>eu.amidst</groupId>
    <artifactId>wekalink</artifactId>
    <version>0.7.2</version>
    <scope>compile</scope>
  </dependency>

  <dependency>
    <groupId>eu.amidst</groupId>
    <artifactId>latent-variable-models</artifactId>
    <version>0.7.2</version>
    <scope>compile</scope>
  </dependency>

  <!-- ... -->
</dependencies>

Further details on creating a project that uses AMIDST functionality are given in the Getting Started section.

Create the wrapper class

A custom classifier that Weka can handle should inherit from the class weka.classifiers.AbstractClassifier. It may additionally implement the interface weka.core.Randomizable when it needs a random seed; the minimal example below extends only AbstractClassifier. We should override at least the following methods:

  • void buildClassifier(Instances data): builds the classifier from scratch with the given dataset.
  • double[] distributionForInstance(Instance instance): returns a vector containing the probability for each label or state of the class.

The following minimal example uses the Naive Bayes classifier provided by AMIDST.

import eu.amidst.core.datastream.Attributes;
import eu.amidst.core.datastream.DataInstance;
import eu.amidst.core.datastream.DataOnMemoryListContainer;
import eu.amidst.core.datastream.filereaders.DataInstanceFromDataRow;
import eu.amidst.latentvariablemodels.staticmodels.classifiers.NaiveBayesClassifier;
import eu.amidst.wekalink.converterFromWekaToAmidst.Converter;
import eu.amidst.wekalink.converterFromWekaToAmidst.DataRowWeka;
import weka.classifiers.AbstractClassifier;
import weka.core.Instance;
import weka.core.Instances;


public class AmidstNaiveBayes extends AbstractClassifier {

    private NaiveBayesClassifier model = null;
    private Attributes attributes;

    @Override
    public void buildClassifier(Instances data) throws Exception {

        // Convert the Weka attributes (including the class) to AMIDST attributes
        attributes = Converter.convertAttributes(data.enumerateAttributes(),
                                data.classAttribute());
        DataOnMemoryListContainer<DataInstance> dataAmidst =
                    new DataOnMemoryListContainer<>(attributes);

        // Wrap each Weka instance as an AMIDST data instance
        data.stream()
            .forEach(instance -> dataAmidst.add(
                new DataInstanceFromDataRow(
                    new DataRowWeka(instance, attributes)))
        );

        // Delegate the learning to the AMIDST classifier
        model = new NaiveBayesClassifier(attributes);
        model.updateModel(dataAmidst);
    }


    @Override
    public double[] distributionForInstance(Instance instance) throws Exception {
        DataInstance amidstInstance =
            new DataInstanceFromDataRow(new DataRowWeka(instance,
                                this.attributes));
        // Delegate the prediction to the AMIDST classifier
        return model.predict(amidstInstance).getParameters();
    }

}

Note that the previous code implements neither the learning nor the classification process itself; it simply calls the corresponding methods of eu.amidst.latentvariablemodels.staticmodels.classifiers.NaiveBayesClassifier, which perform these tasks.
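To make the contract of distributionForInstance more concrete, the sketch below shows how a probability vector like the one returned by the wrapper can be turned into a predicted label index by picking the most probable state. It is plain Java with no Weka or AMIDST dependency. The class name LabelFromDistribution, the method mostProbableIndex, the tie-breaking rule and the handling of an empty vector are assumptions made only for this illustration.

```java
public class LabelFromDistribution {

    // Returns the index of the largest entry, or -1 for an empty vector.
    // Ties are resolved in favour of the lowest index (an assumption of this sketch).
    public static int mostProbableIndex(double[] distribution) {
        int best = -1;
        double bestValue = Double.NEGATIVE_INFINITY;
        for (int i = 0; i < distribution.length; i++) {
            if (distribution[i] > bestValue) {
                bestValue = distribution[i];
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Hypothetical distribution over three class states
        double[] distribution = {0.2, 0.7, 0.1};
        System.out.println("Predicted label index: " + mostProbableIndex(distribution));
    }
}
```

In the wrapper itself no such code is needed, since only the probability vector has to be returned to Weka.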

Testing the AMIDST classifier in Weka

Now we can evaluate an AMIDST classifier using only calls to Weka functions. The following example loads a dataset in .arff format, learns a naive Bayes classifier, evaluates it with 10-fold cross-validation and prints the confusion matrix:

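import java.io.BufferedReader;
import java.io.FileReader;
import weka.classifiers.Classifier;
import weka.classifiers.Evaluation;
import weka.core.Debug;
import weka.core.Instances;

// The statements below belong inside a method that declares "throws Exception"

// Load the dataset
BufferedReader reader =
    new BufferedReader(new FileReader("exampleDS_d5_c0.arff"));
Instances data = new Instances(reader);
reader.close();
data.setClassIndex(6);

// Learn and evaluate the classifier with 10-fold cross-validation
Evaluation eval = new Evaluation(data);
Debug.Random rand = new Debug.Random(1);
int folds = 10;
Classifier cls = new AmidstNaiveBayes();
eval.crossValidateModel(cls, data, folds, rand);

// Print the confusion matrix
System.out.println(eval.toMatrixString());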
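In the confusion matrix printed by Weka, each row corresponds to an actual class and each column to a predicted class, so correctly classified instances lie on the diagonal. As a rough illustration of how to read it, the sketch below computes the accuracy from such a matrix. It is plain Java with no Weka dependency. The class name ConfusionMatrixAccuracy, the method accuracy and the counts used in main are hypothetical and are not results from the dataset above.

```java
public class ConfusionMatrixAccuracy {

    // Accuracy = instances on the diagonal / all instances.
    // Rows are actual classes, columns are predicted classes.
    public static double accuracy(long[][] matrix) {
        long correct = 0;
        long total = 0;
        for (int actual = 0; actual < matrix.length; actual++) {
            for (int predicted = 0; predicted < matrix[actual].length; predicted++) {
                total += matrix[actual][predicted];
                if (actual == predicted) {
                    correct += matrix[actual][predicted];
                }
            }
        }
        if (total == 0) {
            throw new IllegalArgumentException("The confusion matrix contains no instances");
        }
        return (double) correct / total;
    }

    public static void main(String[] args) {
        // Hypothetical counts for a two-class problem
        long[][] matrix = {
            {40, 10},
            { 5, 45}
        };
        System.out.println("Accuracy: " + accuracy(matrix));
    }
}
```

With Weka itself this computation should not be necessary, since the Evaluation object also reports summary statistics, for example through eval.toSummaryString().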

© Copyright 2017, Andrés R. Masegosa, Rafael Cabañas Revision cd3b227e.
