Synopsis
A genetic algorithm for feature selection and feature generation (GGA).
Description
In contrast to the class GeneticAlgorithm, the GeneratingGeneticAlgorithm generates new attributes and can therefore change the length of an individual. For this reason, specialized mutation and crossover operators are applied. Generators are chosen at random from a list of generators specified by boolean parameters. Since this operator does not contain algorithms to extract features from value series, it is restricted to example sets with only single attributes. For automatic feature extraction from value series, the value series plugin for RapidMiner written by Ingo Mierswa should be used. It is available at <a href="http://rapid-i.com">http://rapid-i.com</a>.
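The generation step described above can be illustrated with a small sketch. This is not the operator's actual implementation: the table `GENERATORS`, the function `generate_attributes`, and the representation of an example set as a dict of columns are all assumptions made for illustration. Only the generator names (taken from the boolean parameters) and the idea of picking an enabled generator at random come from the description.

```python
import random

# Hypothetical generator table, keyed by the operator's boolean parameter names.
# Each entry holds the number of input attributes and the function to apply.
GENERATORS = {
    "use plus": (2, lambda a, b: a + b),
    "use diff": (2, lambda a, b: a - b),
    "use mult": (2, lambda a, b: a * b),
    "use div": (2, lambda a, b: a / b if b != 0 else float("nan")),
    "reciprocal value": (1, lambda a: 1.0 / a if a != 0 else float("nan")),
}


def generate_attributes(columns, enabled, max_new, rng):
    """Return a copy of `columns` extended by up to `max_new` generated columns.

    columns: dict mapping attribute name -> list of floats (one per example).
    enabled: list of generator names that are switched on.
    max_new: plays the role of "max number of new attributes" (assumed semantics).
    rng:     a random.Random instance, e.g. seeded with a local random seed.
    """
    result = dict(columns)
    if not enabled or not columns:
        return result
    names = list(columns)
    for _ in range(rng.randint(0, max_new)):
        key = rng.choice(enabled)                    # pick an enabled generator at random
        arity, fn = GENERATORS[key]
        args = [rng.choice(names) for _ in range(arity)]
        new_name = f"{key}({', '.join(args)})"
        # Apply the generator row by row; a repeated name simply overwrites itself.
        result[new_name] = [fn(*vals) for vals in zip(*(columns[a] for a in args))]
    return result
```

Because each call may append columns, the individual's length can grow from one generation to the next, which is why fixed-length mutation and crossover operators are not sufficient here.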
Input
- example set in: expects: ExampleSet
Output
- example set out:
- attribute weights out:
- performance out:
Parameters
- max number of new attributes: Max number of attributes to generate for an individual in one generation.
- limit max total number of attributes: Indicates if the total number of attributes in all generations should be limited.
- max total number of attributes: Max total number of attributes in all generations.
- use local random seed: Indicates if a local random seed should be used.
- local random seed: Specifies the local random seed.
- show stop dialog: Determines whether a dialog with a stop button should be displayed. Pressing the button stops the run and returns the best individual.
- maximal fitness: The optimization will stop if the fitness reaches the defined maximum.
- population size: Number of individuals per generation.
- maximum number of generations: Number of generations after which to terminate the algorithm.
- use plus: Generate sums.
- use diff: Generate differences.
- use mult: Generate products.
- use div: Generate quotients.
- reciprocal value: Generate reciprocal values.
- use early stopping: Enables early stopping. If unchecked, the maximum number of generations is always performed.
- generations without improval: Stop criterion: stop after n generations without improvement of the performance.
- tournament size: The fraction of the current population which should be used as tournament members (only tournament selection).
- start temperature: The scaling temperature (only Boltzmann selection).
- dynamic selection pressure: If set to true, the selection pressure is increased to its maximum over the complete optimization run (only Boltzmann and tournament selection).
- keep best individual: If set to true, the best individual of each generation is guaranteed to be selected for the next generation (elitist selection).
- p initialize: Initial probability for an attribute to be switched on.
- p crossover: Probability for an individual to be selected for crossover.
- crossover type: Type of the crossover.
- p generate: Probability for an individual to be selected for generation.
- use heuristic mutation probability: If checked, the probability for mutations will be chosen as 1/number of attributes.
- p mutation: Probability for mutation.
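
How the stopping parameters and the heuristic mutation probability might combine can be sketched as follows. The functions `mutation_probability` and `should_stop` are hypothetical helpers, not part of RapidMiner. The sketch assumes that fitness is maximized and that any single criterion is enough to stop the run.

```python
def mutation_probability(n_attributes, use_heuristic, p_mutation):
    """Return the mutation probability for an individual.

    With the heuristic enabled, the probability is 1/number of attributes,
    as stated for "use heuristic mutation probability". The guard against
    zero attributes is an assumption of this sketch.
    """
    if use_heuristic:
        return 1.0 / max(1, n_attributes)
    return p_mutation


def should_stop(fitness_history, maximal_fitness, max_generations,
                use_early_stopping, generations_without_improval):
    """Decide whether to stop after the generations recorded so far.

    fitness_history: best fitness of each completed generation, oldest first.
    """
    if not fitness_history:
        return False
    # "maximum number of generations"
    if len(fitness_history) >= max_generations:
        return True
    # "maximal fitness"
    best = max(fitness_history)
    if best >= maximal_fitness:
        return True
    # "use early stopping" together with "generations without improval"
    if use_early_stopping:
        first_best = fitness_history.index(best)  # generation that first reached the best value
        stalled = len(fitness_history) - 1 - first_best
        if stalled >= generations_without_improval:
            return True
    return False
```

For example, with a history of `[0.5, 0.6, 0.6, 0.6]` and `generations_without_improval` set to 2, the sketch stops when early stopping is enabled, and continues until the generation limit when it is not.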