Synopsis
Learns a neural net from the input data.
Description
This operator learns a model by means of a feed-forward neural network trained by a backpropagation algorithm (multi-layer perceptron). The user can define the structure of the neural network with the parameter list "hidden_layers". Each list entry describes a new hidden layer: the key of the entry is the layer's name, and the value is a number defining the size of that hidden layer. A size value of -1 indicates that the layer size should be calculated from the number of attributes of the input example set. In this case, the layer size is set to (number of attributes + number of classes) / 2 + 1.
If the user does not specify any hidden layers, a default hidden layer with sigmoid type and size (number of attributes + number of classes) / 2 + 1 is created and added to the net. If only a single layer without nodes is specified, the input nodes are directly connected to the output nodes and no hidden layer is used, as shown in the sketch below.
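For illustration, here is a minimal Python sketch of how the layer sizes described above could be resolved. The function name and the (name, size) pair representation are assumptions for this sketch, not the operator's actual code:

```python
# Minimal sketch of resolving hidden layer sizes (illustrative names only).

def resolve_hidden_layer_sizes(hidden_layers, num_attributes, num_classes):
    """hidden_layers: list of (name, size) pairs from the "hidden_layers" parameter."""
    default_size = (num_attributes + num_classes) // 2 + 1
    if not hidden_layers:
        # No layers specified: one default sigmoid layer is created.
        return [default_size]
    sizes = []
    for name, size in hidden_layers:
        # A size of -1 means: derive the size from the input example set.
        sizes.append(default_size if size == -1 else size)
    if sizes == [0]:
        # A single layer without nodes: inputs are wired directly to outputs.
        return []
    return sizes

# Example: 10 attributes, 2 classes, one layer with size -1
print(resolve_hidden_layer_sizes([("hidden", -1)], 10, 2))  # [7]
```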
The activation function used is the usual sigmoid function. Therefore, the value ranges of the attributes should be scaled to the range between -1 and +1. This is also done by this operator unless specified otherwise by the corresponding parameter setting. The type of the output node is sigmoid if the learning data describes a classification task, and linear for numerical regression tasks.
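The following Python sketch shows the two ingredients just mentioned: the sigmoid function and a min-max rescaling of a numeric attribute to the range between -1 and +1. The function names are illustrative assumptions; this is not the operator's implementation:

```python
import numpy as np

def sigmoid(x):
    # The usual logistic sigmoid activation.
    return 1.0 / (1.0 + np.exp(-x))

def scale_to_unit_range(column):
    """Min-max scale a numeric attribute column to [-1, +1]."""
    lo, hi = column.min(), column.max()
    return 2.0 * (column - lo) / (hi - lo) - 1.0

values = np.array([3.0, 5.0, 9.0, 11.0])
print(scale_to_unit_range(values))  # [-1.  -0.5  0.5  1. ]
```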
Input
- training set: expects an ExampleSet.
Output
- model: the learned neural network model.
- exampleSet: the input example set, passed through.
Parameters
- hidden layers: Describes the name and the size of all hidden layers.
- training cycles: The number of training cycles used for the neural network training.
- learning rate: The learning rate determines how much the weights are changed at each step. It must not be 0.
- momentum: The momentum adds a fraction of the previous weight update to the current one, which helps to avoid local minima and smoothes the optimization direction (see the sketch after this list).
- decay: Indicates if the learning rate should be decreased during learning.
- shuffle: Indicates if the input data should be shuffled before learning (increases memory usage, but is recommended if the data was sorted beforehand).
- normalize: Indicates if the input data should be normalized to the range between -1 and +1 before learning (increases runtime, but is necessary in most cases).
- error epsilon: The optimization is stopped as soon as the training error falls below this epsilon value.
- use local random seed: Indicates if a local random seed should be used.
- local random seed: Specifies the local random seed (only applies if use local random seed is enabled).
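To show how the training parameters above interact, here is a Python sketch of a generic gradient-descent update loop with momentum, optional learning rate decay, shuffling, and epsilon-based stopping. All names, defaults, and the gradient callback are assumptions for illustration; the operator's actual implementation may differ:

```python
import random

# Sketch of a training loop wiring together the parameters above
# (not the operator's code; the gradient function is assumed given).

def train(weights, gradient, data, training_cycles=500, learning_rate=0.3,
          momentum=0.2, decay=False, shuffle=True, error_epsilon=1e-5,
          local_random_seed=1992):
    rng = random.Random(local_random_seed)   # "use local random seed"
    previous_update = [0.0] * len(weights)
    for cycle in range(training_cycles):     # "training cycles"
        if shuffle:                          # "shuffle"
            rng.shuffle(data)
        # "decay": one possible scheme is to shrink the rate over the cycles.
        rate = learning_rate / (cycle + 1) if decay else learning_rate
        error, grads = gradient(weights, data)
        for i, g in enumerate(grads):
            # "momentum": add a fraction of the previous weight update.
            update = -rate * g + momentum * previous_update[i]
            weights[i] += update
            previous_update[i] = update
        # "error epsilon": stop once the training error is small enough.
        if error < error_epsilon:
            break
    return weights

# Toy usage: fit a single weight w to minimize (w - 2)^2 (data unused here).
def toy_gradient(weights, data):
    w = weights[0]
    return (w - 2.0) ** 2, [2.0 * (w - 2.0)]

print(train([0.0], toy_gradient, data=[]))  # converges toward [2.0]
```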