FP-Growth


Synopsis

This learner efficiently calculates all frequent item sets from the given data.


Description

This operator calculates all frequent item sets from a data set by building an FPTree data structure on the transaction database. The FPTree is a highly compressed copy of the data that in many cases fits into main memory even for large databases. All frequent item sets are then derived from this FPTree. A major advantage of FP-Growth over Apriori is that it needs only two scans of the data, which often makes it applicable even to large data sets.
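The two data scans and the derivation of item sets from the tree can be illustrated with a minimal Python sketch. This is not the operator's actual implementation; the names Node, build_tree, mine and fp_growth are hypothetical, and min_count is assumed to be an absolute transaction count rather than a relative support.

```python
from collections import defaultdict


class Node:
    """One FPTree node: an item, a counter, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_count):
    """Build an FPTree from a list of (items, weight) pairs using two scans."""
    # Scan 1: count how often each item occurs and drop infrequent items.
    freq = defaultdict(int)
    for items, weight in weighted_transactions:
        for item in set(items):
            freq[item] += weight
    freq = {item: c for item, c in freq.items() if c >= min_count}

    # Scan 2: insert each transaction, items ordered by descending frequency,
    # so that transactions sharing frequent items share a path prefix.
    root = Node(None, None)
    header = defaultdict(list)  # item -> all tree nodes carrying that item
    for items, weight in weighted_transactions:
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += weight
            node = child
    return header, freq


def mine(weighted_transactions, min_count, suffix=frozenset()):
    """Recursively derive frequent item sets from conditional FPTrees."""
    header, freq = build_tree(weighted_transactions, min_count)
    result = {}
    for item, nodes in header.items():
        itemset = suffix | {item}
        result[itemset] = freq[item]
        # Conditional pattern base: the prefix path of every node of `item`,
        # weighted by that node's count.
        base = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        result.update(mine(base, min_count, itemset))
    return result


def fp_growth(transactions, min_count):
    """Return {frozenset(items): absolute support count}."""
    return mine([(t, 1) for t in transactions], min_count)
```

Calling fp_growth with a list of transactions (each a list of item names) and a minimum count returns a dictionary that maps every frequent item set to the number of transactions containing it. Only build_tree reads the original data, and it does so exactly twice; the recursive calls work on the much smaller conditional pattern bases.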

Please note that the given data set may contain only binominal attributes, i.e. nominal attributes with exactly two different values. Use the provided preprocessing operators to transform your data set accordingly: the discretization operators change the value types of numerical attributes to nominal, and the operator Nominal2Binominal transforms nominal attributes into binominal (binary) ones.
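The effect of these two preprocessing steps can be sketched in plain Python. The functions discretize and nominal_to_binominal below are hypothetical helpers written for illustration; they are not the preprocessing operators themselves, and the generated attribute names and the "true"/"false" values are assumptions.

```python
def discretize(value, boundaries, labels):
    """Map a number to a nominal label.

    `boundaries` are ascending upper bin limits; `labels` has one more
    entry than `boundaries` (the last label covers everything above).
    """
    for upper, label in zip(boundaries, labels):
        if value <= upper:
            return label
    return labels[-1]


def nominal_to_binominal(rows, attribute):
    """Replace one nominal attribute by one true/false attribute per value."""
    values = sorted({row[attribute] for row in rows})
    converted = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != attribute}
        for value in values:
            name = f"{attribute} = {value}"  # assumed naming scheme
            new_row[name] = "true" if row[attribute] == value else "false"
        converted.append(new_row)
    return converted
```

A numerical attribute would first be passed through discretize and the resulting nominal attribute through nominal_to_binominal, so that every remaining attribute has exactly two values.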

The frequent item sets are mined for the positive entries in your database, i.e. for those nominal values which are defined as positive in your database. If you use an attribute description file (.aml) for the ExampleSource operator, this corresponds to the second value defined via the classes attribute or inner value tags.

If your data does not specify the positive entries correctly, you may set them using the parameter positive_value. This only works if all your attributes contain this value!
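Why a single positive value has to be shared by all attributes becomes clear when the rows are turned into transactions. The following sketch uses a hypothetical helper, to_transactions, whose positive_value argument merely mirrors the parameter name; the operator's internal handling may differ.

```python
def to_transactions(rows, positive_value="true"):
    """Turn binominal rows into item sets.

    An attribute becomes an item of a transaction only if its value in
    that row equals `positive_value`. An attribute that encodes its
    positive entries with a different value never contributes an item.
    """
    return [
        {attribute for attribute, value in row.items()
         if value == positive_value}
        for row in rows
    ]
```

With rows that use "yes" and "no", for example, calling to_transactions(rows, positive_value="yes") yields one set of attribute names per row, and these sets are the transactions in which frequent item sets are searched.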

This operator has two basic working modes: finding at least the specified number of item sets with the highest support without taking min_support into account (default), or finding all item sets with a support larger than min_support.
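The difference between the two modes can be sketched as two selection rules over the same table of supports. The code below is an illustration under stated assumptions, not the operator's procedure: supports are computed by brute force instead of an FPTree, the function names are hypothetical, the threshold is treated as inclusive, and ties at the cutoff are kept, which is why the first mode can return more item sets than requested.

```python
from itertools import combinations


def all_supports(transactions):
    """Brute-force relative support of every item set that occurs at all."""
    items = sorted(set().union(*transactions))
    n = len(transactions)
    supports = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            count = sum(1 for t in transactions if set(combo) <= t)
            if count:
                supports[frozenset(combo)] = count / n
    return supports


def by_min_support(supports, min_support):
    """Mode 2: every item set reaching the threshold (assumed inclusive)."""
    return {s: v for s, v in supports.items() if v >= min_support}


def at_least_n(supports, min_number):
    """Mode 1 (default): at least `min_number` item sets of highest support.

    The cutoff is the support of the n-th best item set; item sets tied
    with it are kept as well (an assumption of this sketch).
    """
    ranked = sorted(supports.items(), key=lambda kv: -kv[1])
    if len(ranked) <= min_number:
        return dict(ranked)
    cutoff = ranked[min_number - 1][1]
    return {s: v for s, v in ranked if v >= cutoff}
```

In the first mode the effective support threshold is a result of the data, whereas in the second mode it is fixed in advance and the number of returned item sets varies.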


Input


Output


Parameters


ExampleProcess