Synopsis
Discretizes numerical attributes into a user-defined number of bins.
Description
This operator discretizes all selected numeric attributes in the dataset into nominal attributes. The discretization is performed by simple binning: the specified number of bins is created, each covering an equal share of the value range, and the numerical values are sorted into those bins. Special attributes, including the label, are skipped unless the include special attributes parameter is enabled.
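The binning described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the operator's actual implementation: the function name `equal_width_bins`, the short bin labels `range1`, `range2`, ..., and the rule that a value lying exactly on a boundary goes into the upper bin are all choices made for this example.

```python
from typing import List


def equal_width_bins(values: List[float], number_of_bins: int) -> List[str]:
    """Sort numeric values into `number_of_bins` bins of equal width.

    Hypothetical helper for illustration; labels and boundary handling
    are assumptions, not the operator's documented behavior.
    """
    if number_of_bins < 1:
        raise ValueError("number_of_bins must be at least 1")
    if not values:
        return []

    lowest, highest = min(values), max(values)
    # Every bin spans the same share of the observed value range.
    width = (highest - lowest) / number_of_bins

    labels = []
    for value in values:
        if width == 0:
            # All values are identical, so everything lands in the first bin.
            index = 0
        else:
            # Clamp so that the maximum value falls into the last bin.
            index = min(int((value - lowest) / width), number_of_bins - 1)
        labels.append(f"range{index + 1}")
    return labels
```

For example, `equal_width_bins([0.0, 2.5, 5.0, 7.5, 10.0], 4)` splits the range 0 to 10 into four bins of width 2.5 and turns each numeric value into a nominal bin label.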
Input
- example set input: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0, Example set matching at least one selected attribute.
Output
- example set output:
- original:
- preprocessing model:
Parameters
- return preprocessing model: Indicates if the preprocessing model should also be returned.
- create view: Creates a view that applies the preprocessing instead of changing the data.
- attribute filter type: The condition specifies which attributes are selected or affected by this operator.
- attribute: The attribute which should be chosen.
- attributes: The attributes which should be chosen.
- regular expression: A regular expression for the names of the attributes which should be kept.
- use except expression: If enabled, an exception to the specified regular expression can be specified. Attributes matching this exception will be filtered out, even though they match the first expression.
- except regular expression: A regular expression for the names of the attributes which should be filtered out although matching the above regular expression.
- value type: The value type of the attributes.
- use value type exception: If enabled, an exception to the specified value type might be specified. Attributes of this type will be filtered out, although matching the first specified type.
- except value type: Except this value type.
- block type: The block type of the attributes.
- use block type exception: If enabled, an exception to the specified block type might be specified.
- except block type: Except this block type.
- numeric condition: Parameter string for the condition, e.g. '>= 5'.
- invert selection: Indicates if only those attributes should be accepted which would normally be filtered out.
- include special attributes: Indicates if this operator should also be applied to the special attributes. Otherwise they are always kept.
- number of bins: Defines the number of bins which should be used for each attribute.
- define boundaries: Define the boundaries for the bin calculation.
- min value: The minimum value for the binning range.
- max value: The maximum value for the binning range.
- range name type: Indicates if long range names including the limits should be used.
- automatic number of digits: Indicates if the number of digits should be automatically determined for the range names.
- number of digits: The minimum number of digits used for the interval names.
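
How the regular expression, except regular expression, and invert selection parameters interact can be illustrated with a small Python sketch. It is a hedged approximation only: the function `select_attributes` is hypothetical, and the assumption that an expression must match the whole attribute name (rather than a part of it) is made for this example and is not stated above.

```python
import re
from typing import List, Optional


def select_attributes(
    names: List[str],
    regular_expression: str,
    except_regular_expression: Optional[str] = None,
    invert_selection: bool = False,
) -> List[str]:
    """Return the attribute names chosen by a regex-based filter.

    Hypothetical helper; whole-name matching is an assumption.
    """
    selected = []
    for name in names:
        # Keep names that match the main expression ...
        keep = re.fullmatch(regular_expression, name) is not None
        # ... unless they also match the except expression.
        if keep and except_regular_expression is not None:
            keep = re.fullmatch(except_regular_expression, name) is None
        # Inverting accepts exactly the names that would have been filtered out.
        if invert_selection:
            keep = not keep
        if keep:
            selected.append(name)
    return selected
```

With the names `a1`, `a2`, `a3`, and `label_score`, the expression `a\d` combined with the except expression `a2` selects `a1` and `a3` in this sketch; enabling inversion selects `a2` and `label_score` instead.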