Synopsis
A highly efficient implementation of the backward elimination scheme.
Description
This operator starts with the full set of attributes and, in each round, tentatively removes each remaining attribute of the given example set in turn. For each removed attribute, the performance is estimated using the inner operators, e.g. a cross-validation. Only the attribute whose removal gives the least decrease in performance is finally removed from the selection. Then a new round is started with the modified selection. This implementation avoids any additional memory consumption besides the memory originally used for storing the data and the memory that might be needed for applying the inner operators. A parameter specifies when the iteration is aborted. Three different behaviors are possible:
- with decrease
runs as long as the performance does not decrease, i.e. stops at the first round in which removing an attribute lowers the performance
- with decrease of more than
runs as long as the decrease is less than the specified threshold, either relative or absolute.
- with significant decrease
stops as soon as the decrease is significant at the specified level.
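The elimination loop with the "with decrease" behavior can be sketched as follows. This is a minimal illustration in Python, not the operator's actual implementation: the function backward_elimination, its evaluate callback (a stand-in for the inner operators that estimate performance) and the toy_score function are hypothetical names introduced for this example only.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def backward_elimination(
    attributes: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
    max_eliminations: int = 10,
) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination, stopping at the first performance decrease.

    `evaluate` is a hypothetical stand-in for the inner operators
    (e.g. a cross-validation) and returns a score where higher is better.
    """
    selected = frozenset(attributes)
    best_perf = evaluate(selected)
    for _ in range(max_eliminations):
        if len(selected) <= 1:
            break
        # Tentatively remove each remaining attribute and score the reduced set.
        candidates = [(evaluate(selected - {a}), a) for a in sorted(selected)]
        round_perf, to_remove = max(candidates)
        # "with decrease": stop as soon as even the best removal lowers performance.
        if round_perf < best_perf:
            break
        selected = selected - {to_remove}
        best_perf = round_perf
    return selected, best_perf


def toy_score(subset: FrozenSet[str]) -> float:
    """Toy performance: 'a' and 'b' are useful, every other attribute is noise."""
    useful = len(subset & {"a", "b"})
    noise = len(subset - {"a", "b"})
    return useful - 0.01 * noise


selected, perf = backward_elimination(["a", "b", "n1", "n2"], toy_score)
```

In this toy setup the two noise attributes are removed first, and the loop stops once removing either of the useful attributes would lower the score.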
The parameter speculative_rounds defines how many further rounds are performed in a row after the stopping criterion has been fulfilled for the first time. If the performance increases again during the speculative rounds, the elimination continues. Otherwise all attributes eliminated during the speculative rounds are restored, as if no speculative rounds had been executed. This might help to avoid getting stuck in local optima.
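The rollback behavior of speculative rounds can be sketched in Python as follows. This is an illustration under stated assumptions, not the operator's code: the function eliminate_with_speculation and the evaluate callback are hypothetical, the stopping rule is the simple "with decrease" behavior, and "the performance increases again" is interpreted here as reaching at least the performance of the last accepted selection.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def eliminate_with_speculation(
    attributes: Iterable[str],
    evaluate: Callable[[FrozenSet[str]], float],
    speculative_rounds: int = 0,
) -> Tuple[FrozenSet[str], float]:
    """Backward elimination that may continue past a performance decrease."""
    selected = frozenset(attributes)
    # Last accepted selection and its performance; returned if speculation fails.
    checkpoint = (selected, evaluate(selected))
    remaining = speculative_rounds
    while len(selected) > 1:
        round_perf, to_remove = max(
            (evaluate(selected - {a}), a) for a in sorted(selected)
        )
        selected = selected - {to_remove}
        if round_perf >= checkpoint[1]:
            # No decrease relative to the checkpoint: accept and reset the budget.
            checkpoint = (selected, round_perf)
            remaining = speculative_rounds
        elif remaining > 0:
            # Stopping criterion fulfilled, but keep going speculatively.
            remaining -= 1
        else:
            # Budget exhausted: discard everything after the checkpoint.
            break
    return checkpoint


# Toy scores with a local optimum: every single removal from {a, b, c}
# hurts, but removing two attributes ends up better than the start.
scores = {
    frozenset("abc"): 0.80,
    frozenset("ab"): 0.78,
    frozenset("ac"): 0.75,
    frozenset("bc"): 0.70,
    frozenset("a"): 0.85,
    frozenset("b"): 0.60,
}
no_spec = eliminate_with_speculation("abc", scores.__getitem__, speculative_rounds=0)
with_spec = eliminate_with_speculation("abc", scores.__getitem__, speculative_rounds=1)
```

Without speculation the full attribute set is kept, because the first round already lowers the performance. With one speculative round the search passes through the worse two-attribute selection and accepts the better single-attribute one.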
The operator provides a value that can be used to log the performance of each round with a Log operator.
Input
- example set: expects: ExampleSet
Output
- example set:
- attribute weights:
- performance:
Parameters
- maximal number of eliminations: The maximum number of backward eliminations. The resulting number of attributes is therefore reduced by at most this number.
- speculative rounds: Defines the number of consecutive times the stopping criterion may be ignored before the elimination is actually stopped. A number higher than one might help to avoid getting stuck in local optima.
- stopping behavior: Defines the criterion on which the elimination is stopped.
- use relative decrease: If checked, the relative performance decrease will be used as stopping criterion.
- maximal absolute decrease: If the absolute performance decrease compared to the previous step exceeds this threshold, the elimination will be stopped.
- maximal relative decrease: If the relative performance decrease compared to the previous step exceeds this threshold, the elimination will be stopped.
- alpha: The probability threshold which determines whether differences are considered significant.
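The difference between the absolute and the relative decrease threshold can be illustrated with a small Python sketch. The function should_stop is hypothetical, and the formula for the relative decrease (the absolute decrease divided by the previous performance) is an assumption made for this example; the operator's exact computation is not stated here.

```python
def should_stop(
    previous: float,
    current: float,
    max_decrease: float,
    use_relative: bool = False,
) -> bool:
    """Return True if the decrease from `previous` to `current` exceeds the threshold.

    Assumption: relative decrease = (previous - current) / |previous|.
    If `previous` is zero, the absolute decrease is used instead.
    """
    decrease = previous - current
    if use_relative and previous != 0:
        decrease /= abs(previous)
    return decrease > max_decrease


# A drop from 0.80 to 0.76 is an absolute decrease of about 0.04
# and, under the assumption above, a relative decrease of about 0.05.
stop_absolute = should_stop(0.80, 0.76, max_decrease=0.03)
stop_relative = should_stop(0.80, 0.76, max_decrease=0.10, use_relative=True)
```

An increase in performance yields a negative decrease and therefore never triggers the stop in this sketch.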