Friday, December 28, 2012

Which classifiers can deal with XOR

Machine learning scientists generally dislike the XOR problem because not all algorithms can deal with it:
Classifier                     Precision [%]
k-nn                           100
Naive-Bayes                    50
Classification Tree            100
Random Forest                  100
Multi Layer Perceptron         100
Neural Net                     100
Logistic Regression            50
SVM                            100
Vote (with NB)                 50
Bagging (with NB)              100
AdaBoost (with NB)             50
Bayesian Boosting (with NB)    50
Stacking (with NB)             50
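
To see where the 50% for Naive Bayes comes from, here is a minimal sketch on the four noiseless XOR points. This toy dataset and the class name XorDemo are my assumptions, not the setup behind the table above. It counts P(feature = 1 | class) the way Naive Bayes would and finds 0.5 everywhere, so each feature on its own says nothing about the class. A 1-nearest-neighbour lookup, by contrast, recovers every label, although here it is only checked on the same four points it memorized.

```java
/** Toy illustration on the four noiseless XOR points (an assumed dataset). */
final class XorDemo {
    // Inputs (x1, x2) and labels y = x1 XOR x2.
    static final int[][] X = {{0, 0}, {0, 1}, {1, 0}, {1, 1}};
    static final int[] Y = {0, 1, 1, 0};

    /** Estimates P(feature = 1 | label) by counting, as Naive Bayes would. */
    static double conditional(int feature, int label) {
        int ones = 0;
        int total = 0;
        for (int i = 0; i < X.length; i++) {
            if (Y[i] == label) {
                total++;
                ones += X[i][feature];
            }
        }
        return (double) ones / total;
    }

    /** 1-nearest-neighbour prediction using squared Euclidean distance. */
    static int nearestNeighbour(int x1, int x2) {
        int best = 0;
        int bestDist = Integer.MAX_VALUE;
        for (int i = 0; i < X.length; i++) {
            int d1 = X[i][0] - x1;
            int d2 = X[i][1] - x2;
            int dist = d1 * d1 + d2 * d2;
            if (dist < bestDist) {
                bestDist = dist;
                best = i;
            }
        }
        return Y[best];
    }

    public static void main(String[] args) {
        // Every conditional is identical for both classes, so the
        // per-feature likelihoods cannot separate them.
        for (int feature = 0; feature < 2; feature++) {
            for (int label = 0; label < 2; label++) {
                System.out.printf("P(x%d=1 | y=%d) = %.2f%n",
                        feature + 1, label, conditional(feature, label));
            }
        }
        // The neighbour lookup reproduces the XOR labels on the stored points.
        for (int[] x : X) {
            System.out.printf("1-NN(%d, %d) = %d%n",
                    x[0], x[1], nearestNeighbour(x[0], x[1]));
        }
    }
}
```

Any method that looks at one feature at a time, or that draws a single straight line, runs into the same wall; the methods that score 100 in the table can combine both features.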

Tuesday, December 18, 2012

Java in Eclipse

To find unused functions, use Core downloads: right-click the class and select "Find Unreferenced Members".

To profile Java applications, use JVM Monitor. See http://www.jvmmonitor.org/doc/index.html#Getting_started for instructions.

How to make things fast:
  • Pre-compute rather than re-calculate: for any loops or repeated calls that contain calculations with a relatively limited range of inputs, consider making a lookup (array or dictionary) that contains the result of that calculation for all values in the valid range of inputs. Then use a simple lookup inside the algorithm instead.
    Down-sides: if few of the pre-computed values are actually used, this may make matters worse. The lookup may also take significant memory.
  • Don't use library methods: most libraries need to be written to operate correctly under a broad range of scenarios, and perform null checks on parameters, etc. By re-implementing a method you may be able to strip out a lot of logic that does not apply in the exact circumstance you are using it.
    Down-sides: writing additional code means more surface area for bugs.
  • Do use library methods: to contradict myself, language libraries get written by people who are a lot smarter than you or me; odds are they did it better and faster. Do not implement it yourself unless you can actually make it faster (i.e., always measure!)
  • Cheat: even when an exact calculation exists for your problem, you may not need 'exact'. Sometimes an approximation is 'good enough' and a lot faster into the bargain. Ask yourself, does it really matter if the answer is out by 1%? 5%? Even 10%?
    Down-sides: Well... the answer won't be exact.
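
The pre-compute and cheat tips can be combined in one small sketch: a sine lookup table with one entry per whole degree. The class name SineTable, the one-degree resolution and the choice of sine are all my assumptions for illustration. The table is filled once, and each later call is a rounding step plus an array read. The price is the "cheat": angles between whole degrees get the value of the nearest whole degree, so the result can be off by a little under 0.01.

```java
/** Sine via a pre-computed table, accurate to the nearest whole degree. */
final class SineTable {
    // Assumed resolution: one table entry per whole degree.
    private static final int STEPS = 360;
    private static final double[] TABLE = new double[STEPS];

    static {
        // Pay the cost of Math.sin once, up front.
        for (int d = 0; d < STEPS; d++) {
            TABLE[d] = Math.sin(Math.toRadians(d));
        }
    }

    /** Approximate sine of an angle in degrees (rounded to the nearest degree). */
    static double sinDegrees(double degrees) {
        // floorMod keeps the index in [0, 360) for negative and large angles.
        int index = (int) Math.floorMod(Math.round(degrees), (long) STEPS);
        return TABLE[index];
    }

    public static void main(String[] args) {
        double angle = 30.4;
        System.out.println("table: " + sinDegrees(angle));
        System.out.println("exact: " + Math.sin(Math.toRadians(angle)));
    }
}
```

Whether this actually beats calling Math.sin directly depends on the JVM and the hardware, so the "always measure" advice applies here too, for example with JVM Monitor.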