Intro Primer To WEKA Explorer For Machine Learning

Previously, in our Intro Primer For WEKA Machine Learning Software post, we introduced you to Weka and suggested that the Weka Explorer tool could be useful. In this post, we show why it is a useful tool for exploring your data, from the simplest analysis to the most complex. We guide you step by step through simple problems using the Weka Explorer tools for preprocessing, classification, clustering, association, attribute selection, and visualization of data. At the end of the tutorial, you should be able to analyze your own data with Weka Explorer and interpret the results.

Hoang Pham Truc Phuong, hptphuong@gmail.com, is the author of this article and he contributes to RobustTechHouse Blog for our Machine Learning column. RobustTechHouse focusses on Mobile App Development in Singapore.

1. Launch Weka Explorer

Click on the “Explorer” button in the “Weka GUI Chooser” and the Weka Explorer window will launch.

1.1 Status Box

You should see the status box at the bottom left of the window. It displays messages that keep you informed about what’s going on in Weka. For example, if the Explorer is busy loading a file, the status box will say so.

And if Weka Explorer is working on data transfers, the status box will show messages for this.

Tip: Right-click inside the status box and a menu will appear with two options:

  • Memory information. Shows the amount of memory available to Weka.
  • Run garbage collection. Forces the Java garbage collector to search for memory that is no longer needed and free it up, allowing more space for new tasks. Note that the garbage collector is constantly running as a background task anyway.

1.2 Log Button

Clicking on this button brings up a separate window containing a scrollable text field. Each line of text is stamped with the time it was entered into the log. As you perform actions in Weka, the log keeps a record of what has happened. For people using the command line or the SimpleCLI, the log now also contains the full setup strings for classification, clustering, attribute selection, etc., so it is possible to copy and paste them elsewhere. Options for the dataset(s) and, if applicable, the class attribute still need to be provided by the user (e.g., -t for classifiers or -i and -o for filters).

 

1.3 Weka Status Icon

To the right of the status box is the Weka status icon. When no processes are running, the bird icon is taking a nap. The number beside the X symbol gives the number of concurrent processes running. When the system is idle it is zero, but it increases as the number of processes increases. When any process is started, the bird icon gets up and starts moving around. If it’s standing but stops moving for a long time, it’s sick: something has gone wrong! In that case you should restart the WEKA Explorer.

2. Preprocessing Data

Data in the real world is frequently dirty, so preprocessing is an important step for successful data mining. Weka has good support for preprocessing data. Here we take you step by step through data preprocessing in Weka.

2.1 Opening file from a local file system

In the stable version, you can load common file types such as *.arff, *.arff.gz, *.names, *.data, *.csv, *.libsvm, *.dat, *.bsi, *.xrff and *.xrff.gz. The developer version supports more file types, such as *.json, *.json.gz,…

2.2 Opening a File From a Website

You can load data directly from a URL. For example, choose “Open URL” and enter this link: http://storm.cis.fordham.edu/~gweiss/data-mining/weka-data/contact-lenses.arff

2.3 Reading data from a database

In our previous post, Intro Primer For WEKA Machine Learning Software, we mentioned that Weka supports connecting to database management systems (DBMS). From Weka Explorer, you can connect to a database and load data from it. Here are the steps to load data in Weka Explorer after connecting to a DBMS:

  • Choose Open DB
  • The URL should read “jdbc:odbc:dbname” where dbname is the name you gave the user DSN.
  • Click Connect
  • Enter a Query, e.g., “select * from tablename” where tablename is the name of the database table you want to read. Or you could put a more complicated SQL query here instead.
  • Click Execute
  • When you’re satisfied with the returned data, click OK to load the data into the Preprocess panel.
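The same connect, query, inspect and load flow can be sketched outside Weka. The example below is only an analogy: it uses Python's built-in sqlite3 module with an in-memory database instead of Weka's JDBC connection, and the table contents are made up for illustration.

```python
import sqlite3

# "Connect": an in-memory SQLite database stands in for the DSN (assumption).
conn = sqlite3.connect(":memory:")

# Illustrative table and rows; in practice the table already exists in your DBMS.
conn.execute("CREATE TABLE tablename (outlook TEXT, play TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [("sunny", "no"), ("overcast", "yes"), ("rainy", "yes")],
)

# "Enter a query" and "Execute".
cursor = conn.execute("SELECT * FROM tablename")
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()
conn.close()

# Inspect the returned data before loading it anywhere else.
print(columns)
for row in rows:
    print(row)
```

Each column of the result becomes an attribute and each row becomes an instance once the data is loaded into the Preprocess panel.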

2.4 Generate artificial data

Weka also supports the generation of sample data. Here are some of the data generators that Weka provides:

  • Agrawal: Generates a people database and is based on the paper by Agrawal et al.
  • BayesNet: Generates random instances based on a Bayes network.
  • Led24: This generator produces data for a display with 7 LEDs.
  • RandomRBF: Data is generated by first creating a random set of centers for each class.
  • RDG1: A data generator that produces data randomly by producing a decision list.
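To give a feel for what such a generator does, here is a simplified Python sketch of the RandomRBF idea described above: create random centers for each class, then draw instances around them. It is not Weka's algorithm; the function name, parameters and the Gaussian spread are assumptions made for illustration.

```python
import random


def generate_rbf_like(n_instances, n_classes=2, centers_per_class=2,
                      n_attributes=2, spread=0.05, seed=1):
    """Generate (point, class label) pairs scattered around random class centers."""
    rng = random.Random(seed)

    # Step 1: create a random set of centers for each class.
    centers = [
        (label, [rng.random() for _ in range(n_attributes)])
        for label in range(n_classes)
        for _ in range(centers_per_class)
    ]

    # Step 2: draw every instance close to a randomly chosen center
    # and give it that center's class label.
    data = []
    for _ in range(n_instances):
        label, center = rng.choice(centers)
        point = [rng.gauss(coordinate, spread) for coordinate in center]
        data.append((point, label))
    return data


data = generate_rbf_like(100)
print(data[:3])
```

Because the seed is fixed, the same call always returns the same artificial data set, which is handy when comparing algorithms.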

2.5 The Current Relation

Once the data is loaded, the Preprocess panel shows a variety of information. The Current relation box describes the loaded data (a relation can be interpreted as a single relational table in database terminology) and has three entries:

  • Relation. The name of the relation, as given in the file it was loaded from. Filters, described below, modify the name of a relation.
  • Instances. The number of instances (data points/records) in the data.
  • Attributes. The number of attributes (features) in the data.

The current relation box is labelled as 1 in the screenshot below.
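To make the three entries concrete, here is a minimal Python sketch that derives them from ARFF text. The `summarize_arff` helper is hypothetical (it is not part of Weka) and only handles simple, unquoted files; the three-row excerpt is an illustrative sample modelled on the weather data.

```python
def summarize_arff(text):
    """Return (relation name, instance count, attribute count) for simple ARFF text."""
    relation = None
    attributes = 0
    instances = 0
    in_data = False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("%"):  # skip blank lines and comments
            continue
        lower = line.lower()
        if in_data:
            instances += 1  # every row after @data is one instance
        elif lower.startswith("@relation"):
            relation = line.split(None, 1)[1]
        elif lower.startswith("@attribute"):
            attributes += 1
        elif lower.startswith("@data"):
            in_data = True
    return relation, instances, attributes


# Illustrative excerpt, not a complete data file.
SAMPLE = """\
@relation weather.symbolic
@attribute outlook {sunny, overcast, rainy}
@attribute temperature {hot, mild, cool}
@attribute humidity {high, normal}
@attribute windy {TRUE, FALSE}
@attribute play {yes, no}
@data
sunny,hot,high,FALSE,no
sunny,hot,high,TRUE,no
overcast,hot,high,FALSE,yes
"""

relation, instances, attributes = summarize_arff(SAMPLE)
print("Relation:", relation)
print("Instances:", instances)
print("Attributes:", attributes)
```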

2.6 Working With Attributes

Below the current relation box is a box titled Attributes (labelled as number 2 in screenshot above). There are four buttons, and beneath them is a list of the attributes in the current relation.

The list has three columns:

  • No. A number that identifies the attribute, in the order the attributes are specified in the data file.
  • Selection tick boxes. These allow you to select which attributes are present in the relation.
  • Name. The name of the attribute, as it was declared in the data file.

When you click on different rows in the list of attributes, the fields change in the box to the right titled Selected attribute (labelled as number 3 in screenshot above). This box displays the characteristics of the currently highlighted attribute in the list:

  • The name of the attribute, the same as that given in the attribute list.
  • The type of attribute, most commonly Nominal or Numeric.
  • The number and percentage of instances in the data for which this attribute is missing.
  • The number of different values that the data contains for this attribute.
  • The number and percentage of instances in the data having a value for this attribute that no other instances have.

For example, load weather.arff and delete the value of the temperature attribute in row 4 (press the “Edit” button and edit the cell directly).

Choose the “temperature” attribute and you will see these statistics in the Selected attribute box:

  • Type: Nominal. This means the attribute takes one of a fixed set of named values rather than a number.
  • Missing: 1 (7%). This means that one value of this attribute is missing, which is 7% of all records.
  • Distinct: 3. This means that there are 3 distinct values among the records: hot, mild and cool.
  • Unique: 0. This means that no value occurs in only one instance; every value is shared by at least two instances.
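These statistics are easy to reproduce by hand. The sketch below is a plain Python illustration, not Weka code; it assumes the temperature column of the 14-instance nominal weather sample data, with the value in row 4 removed and represented as None.

```python
from collections import Counter


def attribute_stats(values):
    """Compute Missing, Distinct and Unique for one column; None marks a missing value."""
    present = [value for value in values if value is not None]
    counts = Counter(present)
    missing = len(values) - len(present)
    return {
        "missing": missing,
        "missing_pct": round(100 * missing / len(values)),
        "distinct": len(counts),
        # "Unique" counts the values that appear in exactly one instance.
        "unique": sum(1 for count in counts.values() if count == 1),
    }


# Assumed column contents (nominal weather sample data, row 4 blanked out).
temperature = ["hot", "hot", "hot", None, "cool", "cool", "cool",
               "mild", "cool", "mild", "mild", "mild", "hot", "mild"]

stats = attribute_stats(temperature)
print(stats)
```

If a value such as “hot” appeared in only one row, Unique would rise to 1 while Distinct stayed at 3.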

2.7 Working With Filters

WEKA contains filters for discretization, normalization, resampling, attribute selection, transformation and combination of attributes. Some techniques, such as association rule mining, need nominal rather than numeric values, so you sometimes have to transform your data. In Weka, we can use the Discretize filter to do this transformation.

To explain this feature, let’s go through a small example. Load the file “weather.numeric.arff” from Weka’s sample data.

In this data set, the “temperature” attribute is numeric and continuous. But for some techniques, we don’t need to know the exact value of the temperature. We just need its state, such as cold or hot. Weka can do this for us with a filter. Just follow the steps below:

  • In the ‘Filter’ panel, click on the ‘Choose’ button.
  • A pull-down menu with the list of available filters appears. Select unsupervised -> attribute -> Discretize.
  • Click on the text field next to the ‘Choose’ button that shows the filter name. The Discretize options will appear; set bins to 3 (here I want to divide the temperature into three levels).
  • Click the ‘Apply’ button.

The temperature is now divided into three ranges: (-inf,71], (71,78] and (78,inf). You can then use the RenameNominalValues filter to change these labels to whatever you want.
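The three ranges come from splitting the span between the smallest and largest temperature into three bins of equal width. The Python sketch below reproduces that arithmetic; it is an illustration rather than Weka's implementation, and it assumes the 14 temperature values of the weather.numeric.arff sample file.

```python
from collections import Counter


def equal_width_cut_points(values, bins):
    """Return the bins - 1 boundaries that split [min, max] into equal-width bins."""
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / bins
    return [lowest + width * i for i in range(1, bins)]


def bin_label(value, cuts):
    """Return the right-closed interval label that contains value."""
    lower = "-inf"
    for cut in cuts:
        if value <= cut:
            return f"({lower},{cut:g}]"
        lower = f"{cut:g}"
    return f"({lower},inf)"


# Assumed contents of the temperature attribute in weather.numeric.arff.
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]

cuts = equal_width_cut_points(temperature, 3)
labels = [bin_label(t, cuts) for t in temperature]
print(cuts)
print(Counter(labels))
```

With a minimum of 64 and a maximum of 85, each bin is 7 degrees wide, which gives the boundaries 71 and 78.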

3. Data Visualization

Weka can visualize single attributes (1-d) and pairs of attributes (2-d), and can rotate 3-d visualizations (Xgobi-style). It also has a “Jitter” option to deal with nominal attributes and to reveal “hidden” data points that are plotted on top of each other. To open the Visualization screen, click on the ‘Visualize’ tab.

Select a square that corresponds to the attributes you would like to visualize. For example, let’s choose ‘outlook’ for the X-axis and ‘play’ for the Y-axis. Click anywhere inside the square that corresponds to ‘play’ on the left and ‘outlook’ at the top.
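Why jitter helps with nominal attributes can be seen with a little arithmetic. In the Python sketch below, the 14 weather instances fall on only a handful of (outlook, play) positions, so most points hide behind others; adding a small random offset separates them. This is an illustration only: the attribute values are assumed from the weather sample data, and Weka's own jitter may be computed differently.

```python
import random

# Assumed outlook and play columns of the 14-instance weather sample data.
outlook = ["sunny", "sunny", "overcast", "rainy", "rainy", "rainy", "overcast",
           "sunny", "sunny", "rainy", "sunny", "overcast", "overcast", "rainy"]
play = ["no", "no", "yes", "yes", "yes", "no", "yes",
        "no", "yes", "yes", "yes", "yes", "yes", "no"]


def to_index(values):
    """Map each nominal value to an integer position, in order of first appearance."""
    order = {}
    for value in values:
        order.setdefault(value, len(order))
    return [order[value] for value in values]


def jitter(coords, amount, rng):
    """Shift every coordinate by a random offset in [-amount, amount]."""
    return [c + rng.uniform(-amount, amount) for c in coords]


xs, ys = to_index(outlook), to_index(play)
plain = set(zip(xs, ys))  # distinct plotted positions without jitter

rng = random.Random(0)
jittered = set(zip(jitter(xs, 0.2, rng), jitter(ys, 0.2, rng)))

print(len(plain), "visible positions without jitter")
print(len(jittered), "visible positions with jitter")
```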

In Weka 3.7.x, you can install the “Visualize 3D” package through the package manager.

Conclusion

Here we provided an Intro Primer To WEKA Explorer For Machine Learning. Hope you found it useful.
