Synopsis
This operator fills gaps in the data based on the ID attribute of the data set.
Description
This operator fills gaps in the data based on the ID attribute of the data set. The ID attribute must have either the value type "integer" or one of the date value types.
The operator performs the following steps:
- The data is sorted according to the ID attribute
- The distances between all pairs of consecutive ID values are calculated
- The greatest common divisor (GCD) of all distances is calculated
- For every ID value that lies on the grid defined by the GCD but is missing from the data set, a new row is added
Please note that in the added rows all attributes besides the ID attribute will have missing values, which usually must be replaced in a subsequent step.
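The steps above can be sketched in a few lines of Python. This is only an illustration of the described procedure, not the operator's actual implementation: the function name `fill_gaps`, the representation of rows as dictionaries, and the `id_key` argument are assumptions made for this example. It also assumes integer IDs that are unique.

```python
from functools import reduce
from math import gcd


def fill_gaps(rows, id_key="id"):
    """Return rows sorted by ID, with placeholder rows for missing IDs.

    Hypothetical helper: rows are dicts, IDs are unique integers.
    """
    if not rows:
        return []

    # Step 1: sort the data by the ID attribute.
    ordered = sorted(rows, key=lambda r: r[id_key])
    ids = [r[id_key] for r in ordered]

    # Step 2: distances between consecutive ID values.
    distances = [b - a for a, b in zip(ids, ids[1:])]

    # Step 3: greatest common divisor of all distances.
    step = reduce(gcd, distances, 0)
    if step == 0:
        # Only one distinct ID value, so there is nothing to fill.
        return ordered

    # Step 4: add a row for every ID on the GCD grid that is missing.
    # All attributes other than the ID are set to None (missing).
    other_keys = [k for k in ordered[0] if k != id_key]
    by_id = {r[id_key]: r for r in ordered}
    filled = []
    for i in range(ids[0], ids[-1] + 1, step):
        placeholder = {id_key: i, **{k: None for k in other_keys}}
        filled.append(by_id.get(i, placeholder))
    return filled


example = [
    {"id": 7, "value": 0.7},
    {"id": 1, "value": 0.1},
    {"id": 13, "value": 1.3},
    {"id": 3, "value": 0.3},
]
result = fill_gaps(example)
```

In this example the sorted IDs are 1, 3, 7 and 13, the distances are 2, 4 and 6, and their GCD is 2, so rows for the IDs 5, 9 and 11 are added with a missing `value`.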
Input
- example set input: expects: ExampleSetMetaData: #examples: = 0; #attributes: 0, expects: ExampleSet
Output
- example set output:
- original:
Parameters
- use gcd for step size: Indicates if the greatest common divisor should be calculated and used as the underlying distance between all data points.
- step size: The step size used for filling the gaps (only used if "use gcd for step size" is not checked).
- start: If this parameter is defined, gaps at the beginning, before the first data point, will also be filled (if any occur).
- end: If this parameter is defined, gaps at the end, after the last data point, will also be filled (if any occur).
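As a rough illustration of how a fixed step size and the start and end parameters could interact, the following Python sketch computes the ID values that would be present after filling. The function name `fill_ids_with_step` is hypothetical. The source does not say how the operator treats a start or end value that does not lie on the step grid, or existing IDs that fall between grid points; this sketch assumes that the grid is anchored at the first existing ID, that it is extended only in whole steps within the start and end bounds, and that existing off-grid IDs are kept.

```python
def fill_ids_with_step(ids, step_size, start=None, end=None):
    """Return the sorted ID values after gap filling with a fixed step.

    Hypothetical helper; the grid is anchored at the smallest existing ID.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    if not ids:
        return []

    existing = sorted(set(ids))
    first, last = existing[0], existing[-1]

    # Extend the grid backwards in whole steps, but not below `start`.
    low = first
    if start is not None and start < first:
        low = first - ((first - start) // step_size) * step_size

    # Extend the grid forwards in whole steps, but not beyond `end`.
    high = last
    if end is not None and end > last:
        high = last + ((end - last) // step_size) * step_size

    grid = set(range(low, high + 1, step_size))
    # Existing IDs are always kept, even if they are off the grid.
    return sorted(grid | set(existing))


inner_only = fill_ids_with_step([10, 20, 40], step_size=10)
with_bounds = fill_ids_with_step([10, 20, 40], step_size=10, start=0, end=55)
```

Without start and end, only the gap between existing data points is filled (ID 30). With `start=0` and `end=55`, the IDs 0 and 50 are added as well; 55 is not added because it does not lie on the grid under the assumptions above.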