Wednesday, January 29, 2014

Thursday, January 23, 2014

Execution of MBR score.sas generated in SAS Enterprise Miner in SAS Enterprise Guide

Score code for non-data-step models like MBR and SVM is executed differently from score code for data-step models like tree and regression. Hence it is not possible to call score.sas like this:

data output;
   set input;
   %include "C:\score.sas";
run;


Instead you have to use something like:


/* Note: For k-NN, score.sas is not enough by itself.
   k-NN also needs the whole training set. */


/* Define EMSCORE library where MBR will look for training
   and scoring data. It has to be named EMSCORE. */
libname EMSCORE base "C:\EMSCORE";

/* Define training data. It has to be named EM_TRAIN_MBR
   and reside in EMSCORE. */
data EMSCORE.EM_TRAIN_MBR;
    set learning;
run;

/* Define scoring data. Scoring data will be overwritten
   with the MBR prediction. The macro variable that points
   to it has to be named em_score_output. */
data EMSCORE.scoring;
    set scoring;
run;
%let em_score_output=EMSCORE.scoring;

/* Execute MBR */
data _null_;
   %include "C:\score.sas";
run;

Monday, January 20, 2014

Comparison of variable selection methods

Since I didn't know which variable selection method to use, I performed a trivial test on the Sonar dataset. The Sonar dataset has 60 attributes, and I arbitrarily decided to reduce that number to 10. Then I measured classification accuracy with 10-fold cross-validation. To get an idea of how much the feature selection methods depend on the classifier, I tried three different models: naive Bayes, k-NN and a classification tree.


Based on the comparison, the best method to use is SVM attribute selection. However, this method requires parameter tuning. The next best variable selection method is Chi2. The disadvantage of this method is that it favors attributes with many levels. Hence the performance of Chi2 could be severely hindered on datasets whose attributes differ widely in their number of levels, unlike Sonar. The last method in the top three is information gain ratio. The advantage of this method is that, unlike Chi2, it can handle attributes with differing numbers of levels.
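To make the point about attribute levels concrete, here is a minimal sketch in plain Python. It is not the tool or the data used in the test above: the 16-row toy dataset, the attribute names and the helper functions `chi2_statistic` and `gain_ratio` are all made up for illustration. It compares a fairly informative binary attribute with a useless ID-like attribute that has a different level on every row, using the raw Pearson chi-square statistic (with no adjustment for degrees of freedom) and the information gain ratio.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def chi2_statistic(attr, labels):
    """Raw Pearson chi-square statistic of the attribute-by-class table."""
    n = len(labels)
    attr_counts = Counter(attr)
    label_counts = Counter(labels)
    observed = Counter(zip(attr, labels))
    stat = 0.0
    for a, a_count in attr_counts.items():
        for y, y_count in label_counts.items():
            expected = a_count * y_count / n
            stat += (observed[(a, y)] - expected) ** 2 / expected
    return stat


def gain_ratio(attr, labels):
    """Information gain divided by the entropy of the attribute itself."""
    n = len(labels)
    conditional = 0.0
    for a, a_count in Counter(attr).items():
        subset = [y for x, y in zip(attr, labels) if x == a]
        conditional += (a_count / n) * entropy(subset)
    info_gain = entropy(labels) - conditional
    split_info = entropy(attr)
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 8 positive rows followed by 8 negative rows.
labels = [1] * 8 + [0] * 8
# Binary attribute: 'x' covers 7 positives and 1 negative, 'y' the reverse.
binary_attr = ["x"] * 7 + ["y"] + ["x"] + ["y"] * 7
# ID-like attribute: a unique level per row, useless on unseen data.
id_attr = [f"row{i}" for i in range(16)]

chi2_binary = chi2_statistic(binary_attr, labels)
chi2_id = chi2_statistic(id_attr, labels)
gr_binary = gain_ratio(binary_attr, labels)
gr_id = gain_ratio(id_attr, labels)

print("chi2:      ", chi2_binary, chi2_id)
print("gain ratio:", round(gr_binary, 3), round(gr_id, 3))
```

On this toy data the raw chi-square statistic is 9.0 for the binary attribute and 16.0 for the ID-like one, so ranking by it would pick the useless attribute. Gain ratio ranks them the other way round (about 0.46 versus 0.25), because it divides the gain by the entropy of the attribute's own levels.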

Sunday, January 19, 2014

Oh people of SAS, you are amazing!

  1. Sometimes you have to use a comma between parameters, for example in a macro definition:
         %macro append(columnName=, labelName=);
    But in other cases, like dropping, you can't use a comma between parameters:
          data table(drop=attr1 attr2 attr3);
  2. If you type DAAT instead of DATA, your program will run anyway with a polite note in your SAS log telling you that it has assumed you meant DATA and went ahead and executed based on that assumption, but if not, hey, feel free to let it know.

Thursday, January 16, 2014

SAS experience

How can you tell that I have been working in SAS? My hair is incredibly swept back. I am afraid that if they tell me tomorrow as well that I will be working with SAS, I will end up like Homer Simpson, who lost part of his hair every time he learned he was going to have another child.

Work

What don't I like about my current workplace? The carpet in the office. As I shuffle my feet across the carpet, I charge up with static electricity. Then all it takes is touching a faucet and sparks fly.

But at least I can say that I positively glow at work.

Tuesday, January 14, 2014

Nest Thermostat

Since we have self-learning thermostats, I predict that sooner or later we will have self-learning electric kettles. When you wake up, the water for the morning coffee will be preheated, so when you push the button you don't have to wait an eternity for your morning cup. And in the evening, when it detects you are returning home, it preheats the water. It doesn't matter whether you use the hot water for a cup of tea or for dinner. It will be there, ready at the push of a button. But in the meantime the kettle is not going to keep the water warm, since it knows no one is home. Comfort and efficiency blended together.

PS: Did you notice how hard it is to estimate how much water to pour into the kettle, so you always pour in a bit more than necessary just to be sure you don't end up with too little? If you have a favorite cup, the intelligent kettle can learn it and preheat the exact amount of water. And it could go so far that you push the button and, based on its sensors, it fills any cup just the way you like: to the brim or a centimeter below. Your choice.