Synopsis
Performs an exhaustive subgroup discovery.
Description
This operator discovers subgroups (or, equivalently, induces a rule set) by generating hypotheses exhaustively. Generation starts from the empty hypothesis, which contains no literals, and refines it stepwise. The main loop therefore iterates over the depth of the search space, i.e. the number of literals in the generated hypotheses. The maximum search depth can be specified. The search space can also be pruned, either by requiring a minimum coverage for each hypothesis or by keeping only a given number of hypotheses with the highest coverage. Rules are derived from the hypotheses according to the user's preference: the operator can derive positive rules (Y+) and negative rules (Y-) separately, derive both, or derive only the rule that is more probable given the examples covered by the hypothesis (that is, the actual prediction for that subset). All generated rules are evaluated on the example set by a user-specified utility function and stored in the final rule set if they (1) exceed a minimum utility threshold or (2) are among the k best rules. Which of these two criteria applies can be specified as well.
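The level-wise search described above can be illustrated with a minimal Python sketch. It is not the operator's actual implementation. It assumes nominal string attributes, a boolean label stored under the key "label", and weighted relative accuracy (WRAcc) as the utility function. The function names, the mode values "k best" and "above minimum utility", and the rule generation values "positive", "negative", "both" and "prediction" are hypothetical and chosen only for illustration.

```python
from collections import namedtuple

# A rule is a hypothesis (set of literals) plus the label value it predicts.
Rule = namedtuple("Rule", ["literals", "target", "utility"])

def covers(hypothesis, example):
    """True if the example satisfies every (attribute, value) literal."""
    return all(example[attr] == value for attr, value in hypothesis)

def wracc(examples, hypothesis, target):
    """Weighted relative accuracy; an assumed utility, not the operator's fixed choice."""
    covered = [e for e in examples if covers(hypothesis, e)]
    if not covered:
        return 0.0
    p_all = sum(e["label"] == target for e in examples) / len(examples)
    p_cov = sum(e["label"] == target for e in covered) / len(covered)
    return len(covered) / len(examples) * (p_cov - p_all)

def targets_for(examples, hypothesis, rule_generation):
    """Label values for which a rule is derived from one hypothesis."""
    fixed = {"positive": [True], "negative": [False], "both": [True, False]}
    if rule_generation in fixed:
        return fixed[rule_generation]
    if rule_generation != "prediction":
        raise ValueError("unknown rule_generation: " + rule_generation)
    # "prediction": only the majority label of the covered examples
    # (ties resolved towards the positive label, an arbitrary choice).
    covered = [e for e in examples if covers(hypothesis, e)]
    positives = sum(e["label"] for e in covered)
    return [2 * positives >= len(covered)]

def discover(examples, attributes, max_depth=2, min_coverage=0.0, max_cache=100,
             rule_generation="both", mode="k best", k=3, min_utility=0.0):
    n = len(examples)
    literals = sorted({(a, e[a]) for e in examples for a in attributes})
    level = [frozenset()]  # start from the empty hypothesis
    rules = []
    for _depth in range(1, max_depth + 1):
        # Refine every cached hypothesis by one literal on an unused attribute.
        candidates = set()
        for hyp in level:
            used = {attr for attr, _ in hyp}
            candidates.update(hyp | {lit} for lit in literals if lit[0] not in used)
        # Prune by minimum coverage, then keep the max_cache most covering ones.
        scored = []
        for hyp in candidates:
            coverage = sum(covers(hyp, e) for e in examples) / n
            if coverage > min_coverage:
                scored.append((coverage, hyp))
        scored.sort(key=lambda pair: (-pair[0], sorted(pair[1])))
        level = [hyp for _, hyp in scored[:max_cache]]
        # Derive and evaluate rules from the surviving hypotheses.
        for hyp in level:
            for target in targets_for(examples, hyp, rule_generation):
                rules.append(Rule(hyp, target, wracc(examples, hyp, target)))
    if mode == "k best":
        return sorted(rules, key=lambda r: -r.utility)[:k]
    return [r for r in rules if r.utility > min_utility]

# Tiny made-up example set, for illustration only.
weather = [
    {"outlook": "sunny", "windy": "no", "label": True},
    {"outlook": "sunny", "windy": "no", "label": True},
    {"outlook": "sunny", "windy": "yes", "label": True},
    {"outlook": "rain", "windy": "yes", "label": False},
    {"outlook": "rain", "windy": "yes", "label": False},
    {"outlook": "rain", "windy": "no", "label": True},
    {"outlook": "rain", "windy": "no", "label": False},
    {"outlook": "sunny", "windy": "yes", "label": False},
]

for rule in discover(weather, ["outlook", "windy"], max_depth=2, k=3):
    print(sorted(rule.literals), "->", rule.target, rule.utility)
```

In this sketch only hypotheses that survive pruning are refined further. Pruning by minimum coverage loses nothing, because adding a literal can never increase coverage, whereas the cache limit is a heuristic cut that may discard hypotheses whose refinements would have scored well.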
Input
- training set: expects: ExampleSet
Output
- model:
- exampleSet:
Parameters
- mode: Discovery mode.
- utility function: Utility function.
- min utility: Minimum quality which has to be reached.
- k best rules: Report the k best rules.
- rule generation: Determines which rules are generated.
- max depth: Maximum depth of the breadth-first search, i.e. the maximum number of literals in a hypothesis.
- min coverage: Only consider rules which exceed the given coverage threshold.
- max cache: Bounds the number of rules which are evaluated (only the rules with the highest coverage are used).