Platinum Data Mining, Classification, and Prediction Using SLAM™
Please note: these functions are presented as a conceptual 'workflow' for introductory purposes only. Within GeneLinker™, you are free to apply any appropriate function to your data at any time.
1. Import Gene Expression Data
A training dataset (expression values with known classes) is required to train an artificial neural network (ANN) classifier. A test dataset can be imported to test a trained classifier. The two datasets must be studies of the same phenomenon; that is, both must have the same variable type (e.g. SRBC Tumors).
2. Import Variable Data
Import the classes (e.g. EWS, NB, BL, RMS) for the training dataset.
3. Discretize the Expression Data
Expression data is continuous. To apply the SLAM™ data mining algorithm, the data must first be discretized.
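As an illustration of what discretization means, the sketch below bins continuous expression values into a small number of equal-width levels. It is not GeneLinker™'s own discretization routine; the function name `discretize_equal_width`, the choice of three bins, and the sample values are all assumptions made for the example.

```python
def discretize_equal_width(values, n_bins=3):
    """Map continuous values to integer bin labels 0 .. n_bins-1.

    Hypothetical helper: equal-width binning is only one possible scheme.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: everything falls in a single bin.
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    labels = []
    for v in values:
        # The maximum value would land in bin n_bins, so clamp it.
        labels.append(min(int((v - lo) / width), n_bins - 1))
    return labels


# Made-up expression values for one gene across six samples.
expression = [0.1, 0.4, 1.2, 2.5, 2.9, 3.0]
levels = discretize_equal_width(expression, n_bins=3)
print(levels)
```

Each continuous value is replaced by a level (here 0 = low, 1 = medium, 2 = high), which is the kind of input an association mining step can work with.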
4. Apply SLAM™ Association Mining and Visualize the Results
SLAM™ (Sub-Linear Association Mining) is a technology that finds hidden linear and non-linear correlations in discretized gene expression data. The SLAM™ association viewer displays the results of running SLAM™ and allows you to work with the results.
5. Create Gene List
As an aid to supervised learning, a gene list is created from the genes (features) identified as significant by SLAM™. If necessary, this gene list can be used to filter the test dataset to ensure it contains the same genes as the training dataset.
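The filtering idea can be sketched as follows. This is a minimal illustration, not the GeneLinker™ implementation: the dataset layout (a dictionary from gene name to expression values), the function name `filter_by_gene_list`, and the gene names are assumptions chosen for the example.

```python
def filter_by_gene_list(dataset, gene_list):
    """Keep only the genes named in gene_list, in gene_list order.

    dataset: dict mapping gene name -> list of expression values
             (a hypothetical layout used for this sketch).
    """
    missing = [gene for gene in gene_list if gene not in dataset]
    if missing:
        # A classifier trained on these genes cannot be applied without them.
        raise KeyError("genes missing from dataset: " + ", ".join(missing))
    return {gene: dataset[gene] for gene in gene_list}


# Made-up test dataset with one gene that is not in the gene list.
test_dataset = {
    "geneA": [0.2, 1.1],
    "geneB": [2.3, 0.4],
    "geneC": [1.7, 1.9],
}
gene_list = ["geneC", "geneA"]
filtered = filter_by_gene_list(test_dataset, gene_list)
print(list(filtered))
```

Raising an error on a missing gene, rather than silently skipping it, makes a mismatch between the training and test datasets visible before classification.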
6. Create an ANN Classifier and View Training Results
Creating an ANN classifier means training a committee of neural networks on data whose classes are known for a particular variable type. The training results can be displayed in a classification plot or an MSE (mean squared error) plot.
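To make the MSE figure concrete, the sketch below averages the per-class outputs of a small committee for one training sample and computes the mean squared error against the known class. How GeneLinker™ actually combines committee members is not described here, so the averaging rule, the function names, and the output values are assumptions for illustration only.

```python
def committee_output(member_outputs):
    """Average the per-class outputs across all committee members."""
    n_members = len(member_outputs)
    return [sum(column) / n_members for column in zip(*member_outputs)]


def mean_squared_error(predicted, target):
    """Mean of squared differences between predicted and target outputs."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


classes = ["EWS", "NB", "BL", "RMS"]

# Made-up outputs of three networks for one sample whose known class is EWS.
member_outputs = [
    [0.8, 0.1, 0.0, 0.1],
    [0.6, 0.2, 0.1, 0.1],
    [0.7, 0.0, 0.2, 0.1],
]
target = [1.0, 0.0, 0.0, 0.0]  # known class encoded as 1 for EWS, 0 elsewhere

averaged = committee_output(member_outputs)
mse = mean_squared_error(averaged, target)
print(averaged, mse)
```

A lower MSE means the committee's outputs are closer to the known classes; an MSE plot tracks this kind of quantity as training proceeds.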
7. Classify Data and Visualize the Classification Results
Classification is the process of using a trained classifier to predict the classes of the samples in the test dataset.
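One simple way a committee could turn its outputs into a predicted class is a majority vote, sketched below. This is an assumed voting scheme for illustration, not a description of GeneLinker™'s classifier; `predict_class` and the sample outputs are hypothetical.

```python
from collections import Counter


def predict_class(member_outputs, classes):
    """Return (winning class, fraction of members that voted for it).

    Each member votes for the class with its highest output value.
    """
    votes = []
    for outputs in member_outputs:
        best_index = max(range(len(classes)), key=lambda i: outputs[i])
        votes.append(classes[best_index])
    winner, count = Counter(votes).most_common(1)[0]
    return winner, count / len(votes)


classes = ["EWS", "NB", "BL", "RMS"]

# Made-up outputs of three trained networks for one test sample.
member_outputs = [
    [0.2, 0.7, 0.0, 0.1],
    [0.5, 0.3, 0.1, 0.1],
    [0.1, 0.8, 0.0, 0.1],
]
predicted, agreement = predict_class(member_outputs, classes)
print(predicted, agreement)
```

The agreement fraction gives a rough indication of how unanimous the committee was; a sample with low agreement may deserve a closer look.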