Platinum
For complete information on variables, see Variables Overview.
Variable (class) data for both Khan datasets needs to be imported. The first class data file is Khan_training_classes.csv and the second is Khan_test_classes.csv. Follow the procedure to import the first and then repeat it to import the second using the additional information in parentheses.
Import Variable Data
1. Click the Khan_training_data dataset (Khan_test_data for the second import) in the Experiments navigator. The item is highlighted.
2. Select Import from the File menu, and then Variable from the submenu. The Import Variable dialog is displayed.
The Dataset name is displayed at the top of the dialog and the number of samples in the dataset is listed under the name.
3. Click the Source File ... button. The Open dialog is displayed.
4. Click the file Khan_training_classes.csv (Khan_test_classes.csv for the second import). The item is highlighted.
5. Click Open.
The Source File name is displayed with the number of observations and classes in the file listed underneath.
The default Variable Name and Description are displayed.
6. The Preview allows you to view which sample belongs to which class and the total number of entries for each class. Click Preview. When you are finished examining the contents of the Preview, click Close to close it.
7. Type training classes (test classes for the second import) into the Variable Name field, overwriting the default name.
For the second import, skip to step 11 below. The variable type already exists, so there is no need to create it again.
8. For the first import, click New Variable Type. The Create Variable Type dialog is displayed.
This variable type is used to group together all the observations and predictions of SRBC tumor types. For further discussion of variables and variable types, see Variables Overview. Once we have created the variable type SRBC Tumors, we will import variables of that type describing (first) the tumor type of the training data, and (second) the tumor type of the test data.
9. Type SRBC Tumors into the Name field, overwriting the default name.
10. Click OK. The Import Variable dialog is updated with the new variable type.
Note: the number of samples (listed under the Dataset name at the top of the dialog) equals the number of observations listed below the Source File. It is essential that these numbers match - that is, there is a class value for each and every sample.
11. Click Import.
The variable (class) data is imported, and the Khan_training_data (Khan_test_data for the second import) dataset in the Experiments navigator is tagged with the variable information indicator icon.
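If you want to confirm outside the application that a class file satisfies the note in the procedure above (one class value for each and every sample), a check along the following lines may help. This is only a sketch: it assumes the class file is a plain CSV with one row per sample and the class label in the last column, which this page does not confirm. The function names, sample names and class labels are hypothetical.

```python
import csv
import io
from collections import Counter


def read_class_labels(handle):
    """Return the class label of each non-blank row.

    Assumed layout (not confirmed by the application): one row per
    sample, with the class label in the last column.
    """
    labels = []
    for row in csv.reader(handle):
        if not row or not any(cell.strip() for cell in row):
            continue  # skip blank lines
        labels.append(row[-1].strip())
    return labels


def check_class_file(labels, n_samples):
    """Return per-class counts, or raise if the file cannot match the dataset."""
    if any(label == "" for label in labels):
        raise ValueError("at least one sample has no class value")
    if len(labels) != n_samples:
        raise ValueError(
            f"{len(labels)} observations in file, but dataset has {n_samples} samples"
        )
    return Counter(labels)


# Toy stand-in for a class file (hypothetical sample names and labels).
# A real file would be opened with open(path, newline="") instead.
toy_file = io.StringIO("s1,class_a\ns2,class_b\ns3,class_a\n")
labels = read_class_labels(toy_file)
counts = check_class_file(labels, n_samples=3)
print(dict(counts))
```

The per-class counts correspond to the totals shown in the Preview, and the length check mirrors the comparison between the number of samples under the Dataset name and the number of observations under the Source File.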
For detailed information on variable import, see Importing Variables.