In the model explorer on the left side of the gateway, select the component for which you want to
create a self-organizing map, and click the Graphs
or Graph Templates tab.
The contents of the tab appear with the selected component’s name
displayed at the top of the tab.
Verify that the proper page is selected at the bottom of the tab. By
default, Page 1 is used. You can create multiple pages using the <New>
tab.
Create a self-organizing map by doing any of the following:
- Click the Graph button below the component title bar.
- Click the button next to the Graph button, and select Self-Organizing Map.
- Double-click the self-organizing map icon displayed on the tab next to the Graph and Table buttons. You can also drag and drop the icon on the grid.
If you created the graph using the Graph button, do the following:
- Select Self-Organizing Map from the Graph Creation Wizard.
- Click Next.
Click a parameter to select parameters individually, or click Select to choose groups of parameters (Inputs,
Outputs, Local, or All).
For more information about the parameter modes, see About the Parameter Mode.
Click Next.
Enter the following information, as desired:
Option |
Description |
Maximum grid size |
A SOM is an n by n grid of hexagons. You can specify the size of the self-organizing map. If enough data points are available, Isight creates the self-organizing map at the specified size. If not enough data points are available, Isight creates a SOM with a smaller actual grid size. The grid size plays an important role in determining how useful the resulting SOM is for data mining purposes. If the grid size is small, the map shows global trends within the data set but can omit some interesting relationships in local regions. Additionally, the computational time needed to create the self-organizing map is small. If the grid size is large, local trends may be seen within the map, but global trends may not be easy to discern. If the grid size is too big for a particular data set, the SOM is susceptible to producing invalid maps. Invalid maps are easy to identify: no two adjacent cells have similar colors, which implies that the data has no trend and, in most cases, is not correct. |
Maximum iterations |
Isight constructs self-organizing maps in iterations. In each iteration, every point in the data set is used in turn to train the map. If the number of iterations is small, the map may not be a true representation of the underlying trend in the data. If the number of iterations is large, computational time may be wasted without discerning any additional trend. The default value is 2 (that is, Isight iterates over all the data points twice). |
Initial learning rate |
The learning rate controls the amount of deformation experienced by each node in the mesh. For the self-organizing map to converge, Isight decreases the learning rate exponentially with time. The default is 0.8, which means that a node is moved by a maximum of 80% toward the training data point during the first iteration. Higher learning rates reach the correct self-organizing map more quickly but can sometimes produce enormous deformations (invalid self-organizing maps). |
Initial neighborhood radius |
The neighborhood radius identifies the nodes in the neighborhood that are affected by a training data point. Similar to the learning rate, this radius is also reduced exponentially with time. The default is 0.8, which means that nodes that lie within 80% of the span of the self-organizing map are influenced by the training data point in the first iteration, but that range of influence decreases very quickly in subsequent iterations. Low values for neighborhood radius can produce faster convergence, which in some cases can be premature (invalid self-organizing maps). |
Initialization field |
To create self-organizing maps, Isight creates an initial map that is trained with the data set. The initial map can be generated with random data or with the linear correlations among the parameters in the training data set obtained through Principal Component Analysis (PCA). When the linear correlations are used to generate the map, a substantial aspect of the data trend is already captured before the training iterations begin. The non-linear trends are captured during the training iterations. Using PCA, a good self-organizing map can be obtained with fewer iterations. However, for certain data sets the linear trends can be misleading and dominate the self-organizing map. In such cases, random initialization is preferred, along with a larger maximum number of iterations to ensure convergence. |
Random Seed |
When SOM is initialized with a random map, the seed value is used to generate the random data for the initial random map. The SOM can appear different for the same data set with different random initializations (controlled by the seed value), but the trends between parameters in the data set are preserved. |
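The way these options interact can be illustrated with a minimal sketch of a generic self-organizing map trainer. It is not Isight's implementation. The function names, the square (rather than hexagonal) lattice, the Gaussian neighborhood, and the `final_fraction` decay constant are all assumptions made for illustration. Only the defaults (2 iterations, learning rate 0.8, neighborhood radius 0.8) come from the table above.

```python
import numpy as np


def decay(initial, step, total_steps, final_fraction=0.01):
    """Exponential decay from `initial` to `initial * final_fraction`.

    `final_fraction` is an assumed value, not a documented Isight constant.
    """
    return initial * final_fraction ** (step / max(total_steps - 1, 1))


def random_init(data, grid, seed):
    """Random initial map drawn uniformly within each parameter's range."""
    rng = np.random.default_rng(seed)
    lo, hi = data.min(axis=0), data.max(axis=0)
    return rng.uniform(lo, hi, size=(grid, grid, data.shape[1]))


def pca_init(data, grid):
    """Initial map laid out on the plane of the first two principal components.

    Requires at least two parameters (columns) and two data points.
    """
    mean = data.mean(axis=0)
    _, s, vt = np.linalg.svd(data - mean, full_matrices=False)
    scale = s[:2] / np.sqrt(len(data) - 1)  # std. dev. along each component
    t = np.linspace(-1.0, 1.0, grid)
    a, b = np.meshgrid(t, t, indexing="ij")
    return mean + a[..., None] * scale[0] * vt[0] + b[..., None] * scale[1] * vt[1]


def train_som(data, grid=4, max_iterations=2, learning_rate=0.8,
              radius=0.8, init="random", seed=0):
    """Train a grid x grid map; returns weights of shape (grid, grid, n_params)."""
    weights = pca_init(data, grid) if init == "pca" else random_init(data, grid, seed)
    rows, cols = np.indices((grid, grid))
    coords = np.stack([rows, cols], axis=-1).astype(float)
    span = max(grid - 1, 1)            # radius is a fraction of the map span
    total = max_iterations * len(data)
    step = 0
    for _ in range(max_iterations):    # each iteration visits every data point
        for x in data:
            lr = decay(learning_rate, step, total)
            r = max(decay(radius, step, total) * span, 1e-9)
            # Best matching unit: the node whose weights are closest to x.
            dist = np.linalg.norm(weights - x, axis=-1)
            bmu = np.unravel_index(np.argmin(dist), dist.shape)
            # Nodes near the best matching unit on the grid move the most.
            grid_dist = np.linalg.norm(coords - coords[bmu], axis=-1)
            influence = np.exp(-(grid_dist ** 2) / (2.0 * r ** 2))
            weights += lr * influence[..., None] * (x - weights)
            step += 1
    return weights
```

In this sketch the best matching unit moves by at most `learning_rate` (80% by default) toward the training point at the first step, and both the learning rate and the radius shrink exponentially afterward. Calling `train_som` twice with the same `seed` gives the same map, while a different seed gives a different random starting map.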
Selecting one or more cells in the map selects all corresponding rows in the history; selecting a row in the history marks the corresponding cell with a black outline.
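The link between cells and history rows can be pictured with a short sketch. It assumes that each history row belongs to the cell whose node is closest to that row's parameter values (its best matching unit). This is the usual convention for self-organizing maps, but it is not confirmed here as Isight's rule. The function names and the `weights` array layout are hypothetical.

```python
import numpy as np


def cell_of_each_row(data, weights):
    """Return the (row, column) cell of the closest map node for every data row.

    `weights` has shape (grid_rows, grid_cols, n_params); this layout is an
    assumption made for the sketch.
    """
    g1, g2, dim = weights.shape
    flat = weights.reshape(-1, dim)
    dist = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=-1)
    nearest = dist.argmin(axis=1)
    return [tuple(int(v) for v in np.unravel_index(i, (g1, g2))) for i in nearest]


def rows_for_cells(cells, selected_cells):
    """History rows selected when the given cells are selected in the map."""
    selected = set(selected_cells)
    return [row for row, cell in enumerate(cells) if cell in selected]


# A 2 x 2 map over a single parameter, with four history rows.
weights = np.array([[[0.0], [1.0]], [[2.0], [3.0]]])
history = np.array([[0.1], [2.9], [1.2], [0.0]])

cells = cell_of_each_row(history, weights)
print(cells)                            # [(0, 0), (1, 1), (0, 1), (0, 0)]
print(rows_for_cells(cells, [(0, 0)]))  # [0, 3]
```

Selecting a row in the history is the reverse lookup: `cells[row]` gives the cell that would be outlined.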
Click Finish. Isight
adds the self-organizing map to the gateway. You can also find the self-organizing map on the Data Analysis tab in the Runtime Gateway.