Synopsis
This operator reads an example set from an SQL database by incrementally caching it (recommended).
Description
This operator reads an ExampleSet from an SQL database. The data is loaded from a single table, which is specified with the table name parameter. Note that table and column names are often case sensitive, and databases may differ in how they handle case.
The most convenient way to define the necessary parameters is the configuration wizard. The wizard determines the most important parameters (database URL and user name) automatically, and it can also be used to define special attributes such as labels or ids.
In contrast to the DatabaseExampleSource operator, which loads the data into main memory, this operator keeps the data in the database and reads it in batches. This allows RapidMiner to access data sets of arbitrary size, because the data never has to fit into main memory at once.
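The batch-wise reading described above can be illustrated with a minimal sketch. This is not RapidMiner's implementation: it uses Python's built-in sqlite3 module with an in-memory database, and the table name `measurements`, the column name `row_index`, and the batch size are hypothetical. It assumes an integer index column that starts at 1 and has no gaps.

```python
import sqlite3

def read_in_batches(conn, table, index_col, batch_size):
    """Yield lists of rows, fetching one index range per query."""
    # Identifiers cannot be bound as SQL parameters, so they are quoted
    # inline; this sketch assumes they come from a trusted source.
    (max_index,) = conn.execute(
        f'SELECT COALESCE(MAX("{index_col}"), 0) FROM "{table}"'
    ).fetchone()
    start = 1
    while start <= max_index:
        end = start + batch_size - 1
        rows = conn.execute(
            f'SELECT * FROM "{table}" '
            f'WHERE "{index_col}" BETWEEN ? AND ? ORDER BY "{index_col}"',
            (start, end),
        ).fetchall()
        yield rows  # only this batch is held in memory
        start = end + 1

# Hypothetical demo data: ten rows indexed 1..10.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE measurements (row_index INTEGER PRIMARY KEY, value REAL)")
conn.executemany(
    "INSERT INTO measurements VALUES (?, ?)",
    [(i, i * 0.5) for i in range(1, 11)],
)

batches = list(read_in_batches(conn, "measurements", "row_index", 4))
batch_sizes = [len(batch) for batch in batches]
```

With ten rows and a batch size of 4, the generator issues three range queries. A consumer that processes each batch before requesting the next never holds more than one batch at a time.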
Please note the following restrictions:
- Only physical tables (no views) are allowed as the base for this data caching operator.
- If the table has neither a primary key nor an index, a new column named RM_INDEX is created and automatically used as the primary key.
- If a primary key is already present in the specified table, a new table named RM_MAPPED_INDEX is created that maps a new index column RM_INDEX to the original primary key.
- Users can provide the primary key column RM_INDEX themselves. It must then be an integer-valued index attribute that starts counting at 1 and has no gaps or missing values across all rows.
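The requirements on a user-provided RM_INDEX column (integer values, starting at 1, no gaps, no missing values) can be checked before running the operator. The following is a minimal sketch of such a check, again using Python's sqlite3 module. The function name `has_valid_rm_index` and the demo tables `good` and `gappy` are hypothetical, and the check assumes the column already holds integers.

```python
import sqlite3

def has_valid_rm_index(conn, table):
    """Return True if RM_INDEX runs 1..n with no gaps, duplicates, or NULLs."""
    total, non_null, distinct, lowest, highest = conn.execute(
        "SELECT COUNT(*), COUNT(RM_INDEX), COUNT(DISTINCT RM_INDEX), "
        f'MIN(RM_INDEX), MAX(RM_INDEX) FROM "{table}"'
    ).fetchone()
    if total == 0:
        return True  # an empty table trivially satisfies the rule
    # n distinct non-NULL integers with minimum 1 and maximum n
    # can only be the sequence 1, 2, ..., n.
    return (
        non_null == total
        and distinct == total
        and lowest == 1
        and highest == total
    )

# Hypothetical demo tables: one valid index, one with a gap at 3.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE good (RM_INDEX INTEGER, value TEXT)")
conn.execute("CREATE TABLE gappy (RM_INDEX INTEGER, value TEXT)")
conn.executemany("INSERT INTO good VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO gappy VALUES (?, ?)", [(1, "a"), (2, "b"), (4, "c")])

good_ok = has_valid_rm_index(conn, "good")
gappy_ok = has_valid_rm_index(conn, "gappy")
```

The check needs only a single aggregate query, so it stays cheap even on large tables.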
Input
Output
- output:
Parameters
- define connection: Indicates how the database connection should be specified.
- connection: A predefined database connection.
- database system: The database system in use.
- database url: The URL connection string for the database, e.g. 'jdbc:mysql://foo.bar:portnr/database', where portnr stands for the port number.
- username: The database username.
- password: The password for the database.
- jndi name: JNDI name for a data source.
- table name: A database table.
- recreate index: Indicates whether recreation of the index or index mapping table should be forced.
- label attribute: The (case sensitive) name of the label attribute.
- id attribute: The (case sensitive) name of the id attribute.
- weight attribute: The (case sensitive) name of the weight attribute.
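As an illustration of the database url parameter, the sketch below assembles a MySQL JDBC URL in the same shape as the example value given above. The helper name `mysql_jdbc_url` is hypothetical, and the host and database name are placeholders taken from that example. Port 3306 is MySQL's default and is used here only as an assumption.

```python
def mysql_jdbc_url(host, port, database):
    """Assemble a MySQL JDBC URL of the form jdbc:mysql://host:port/database."""
    return f"jdbc:mysql://{host}:{port}/{database}"

# Placeholder host and database name; 3306 is MySQL's default port.
url = mysql_jdbc_url("foo.bar", 3306, "database")
```

Other database systems use different URL prefixes and layouts, so the value should follow the documentation of the JDBC driver in use.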