RF-OEX
Usage description
Outlier detection is available in the Explorer of standard Weka. Once the data set is loaded on the Preprocess screen, switch to the Outlier panel.
The Outlier panel is used to find the most outlying instances and their Outlier Score. Make sure that the Number of Trees parameter is high enough (≈1000)
to ensure an accurate result. You can counter overfitting by setting the minimum number of instances per node (the Min. per Node parameter) or the maximum depth of a tree (the Maximum Depth of Tree parameter).
The Class parameter must be set to match the class attribute of the data set.
We recommend keeping Count with mistaken class penalty and Count with ambiguous classification penalty checked, so that the similarity of
the given instance to the rest of the samples is considered when calculating the outlier factor. The Bootstraping parameter helps to make the trees more varied.
Once the setup is ready, click the Start button. When the computation is done, the resulting outlier scores appear in the left part of the panel.
If Output summary information is checked, you can see the Summary Outlier Score section with instances sorted according to their outlier factor.
In the Outlier Score section you can find more details about each instance and its outlier score.
The Interpretation button now becomes available.
Click on it to open the Interpretation panel.

- α is used for output filtering: only interpretations whose α-weight is at least the value of the α parameter are printed
- Number of Outliers determines how many outliers should be interpreted
- Support gives the support threshold used for minimal frequent pattern mining
- Number of Trees determines how many trees are used to build the classification model.
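
The α filtering described in the list above can be illustrated with a minimal sketch. It is not RF-OEX code: the function name filter_by_alpha and the (pattern, weight) tuple layout are hypothetical, and it assumes that the α-weight is the number printed after each pattern in the interpretation output. The sample values are those reported for instance 84 later in this description.

```python
# Hypothetical helper, not part of RF-OEX: keeps only interpretations
# whose weight is at least alpha (assumed meaning of the α parameter).
def filter_by_alpha(interpretations, alpha):
    return [(pattern, weight) for pattern, weight in interpretations
            if weight >= alpha]


# Interpretation lines reported for instance 84 in this description,
# stored as (pattern, weight) pairs.
INSTANCE_84 = [
    ("petallength=5.1", 0.74),
    ("sepallength=6 && petallength=5.1", 0.26),
]

if __name__ == "__main__":
    # With alpha = 0.5 only the single-attribute pattern passes the filter.
    for pattern, weight in filter_by_alpha(INSTANCE_84, 0.5):
        print(f"{pattern}, {weight}")
```

Under this assumption, a lower α keeps more (and weaker) interpretations in the output, while a higher α keeps only the dominant ones.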

The screenshot below shows the result of outlier interpretation for the Iris data set.

The first rows give an overview of the parameter settings. The second section describes the outlier interpretations. Let's look at the interpretation of the most outlying instance, number 71:
petalwidth=1.8, 0.88
This line means that 88% of the outlierness of instance number 71 is caused by the value 1.8 of attribute petalwidth.
Now let's take a look at the third most outlying instance, number 84:
petallength=5.1, 0.74
sepallength=6 && petallength=5.1, 0.26
Here 74% of the instance's outlierness is caused by the value of petallength. The outlierness also increases significantly when attribute petallength
is combined with attribute sepallength. This combination contributes 26% of the outlierness.
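
For post-processing, interpretation lines such as the ones quoted above can be parsed into attribute conditions and a weight. The following is a minimal sketch, not RF-OEX code: the function name parse_interpretation is hypothetical, and the line format (conditions joined by &&, then a comma and the weight) is assumed from the two examples shown here.

```python
# Hypothetical parser, not part of RF-OEX. Assumed line format:
#   attr=value [&& attr=value ...], weight
def parse_interpretation(line):
    # The weight is whatever follows the last comma.
    pattern, weight = line.rsplit(",", 1)
    conditions = {}
    for term in pattern.split("&&"):
        attribute, value = term.strip().split("=", 1)
        conditions[attribute.strip()] = float(value)
    return conditions, float(weight)


if __name__ == "__main__":
    # The two lines reported for instance 84.
    lines = [
        "petallength=5.1, 0.74",
        "sepallength=6 && petallength=5.1, 0.26",
    ]
    for line in lines:
        conditions, weight = parse_interpretation(line)
        attributes = " and ".join(conditions)
        print(f"{attributes}: {weight:.0%} of the outlierness")
```

Reading the weight as a share of the outlierness, the two lines for instance 84 together account for 74% + 26% = 100%.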
The picture below shows both instances together with the other instances from class Iris-versicolor. You can see that the petalwidth value
of instance 71 is clearly high compared to the other instances.
For instance 84, the value of petallength is relatively high, although there are five other instances with petallength >= 4.8. If we look at the combination of
attributes petallength and sepallength, we see that although there are several instances with sepallength≈6, the combination with a high value
of petallength is quite unique.
Note that the interpretations from RF-OEX correspond with the observations above.