Data Mining with WEKA

January 30th, 2011

There are a number of good open source projects for statistics and data mining. One example is WEKA, developed at the University of Waikato.

The description on their website states that:

Weka is a collection of machine learning algorithms for data mining tasks.
The algorithms can either be applied directly to a dataset or called from
your own Java code. Weka contains tools for data pre-processing,
classification, regression, clustering, association rules, and visualization.
It is also well-suited for developing new machine learning schemes.

The software is written in Java and available under the GNU General Public License. The website also provides access to data sets from the UCI Machine Learning Repository for use with WEKA.

4 responses to “Data Mining with WEKA”

  1. Tom Ott says:

    I highly suggest the open source software RapidMiner by the Rapid-I team. It has plugin extensions that allow users to use the WEKA operators and even R. It’s a disruptive technology IMHO.

  2. Abhijit says:

    I would like to add that there is an RWeka package available, which allows calling Weka routines from R.

  3. Ralph says:

    Many thanks for the suggestions, guys.

    There are certainly many good software packages for statistics out there and any publicity for them is a good thing.

  4. mohammed says:

    I ran into some problems connecting Weka to an SQL database.
    I would be grateful if anyone could help me.