
 

Tutorial 2: Step 2 Estimate Missing Data Values

 

The NCI60 studies rejected some data because of low signal or for quality control reasons, so the dataset contains missing values. GeneLinker™ can remove genes that meet a specified threshold number of missing values, and can estimate the missing values that remain.

 

Estimate Missing Data Values

1. If the t_matrix dataset in the Experiments navigator is not already highlighted, click it.

2. Click the Estimate Missing Values toolbar icon, or select Estimate Missing Values from the Data menu, or right-click the item and select Estimate Missing Values from the shortcut menu. The Estimate Missing Values dialog is displayed.

3. Set dialog parameters.

Parameter                                Setting
Remove Genes That Have Missing Values    30
Replacement Technique                    Nearest Neighbors
Distance Metric                          Euclidean
Number of Nearest Neighbors              3

 

4. Click OK. The Experiment Progress dialog is displayed.

The dialog is updated dynamically while the Estimate Missing Values operation runs. Upon successful completion, a new Estimated: #mv < 30 | median complete dataset is added to the Experiments navigator under the original dataset. The new dataset is marked with the complete dataset icon before its name. (An incomplete dataset is marked with the incomplete dataset icon.)
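To make the settings from step 3 more concrete, the following Python sketch illustrates the general idea of threshold filtering followed by nearest-neighbor estimation with a Euclidean distance. It is not GeneLinker™'s implementation, whose internals are not described here. The function names (drop_sparse_genes, euclidean, knn_impute), the use of None for a missing value, and the rescaling of distances over shared positions are all assumptions made for illustration. The rule "keep genes with fewer than the threshold number of missing values" is inferred from the "#mv < 30" part of the resulting dataset name. The toy data uses a threshold of 3 and 2 neighbors so that it stays small.

```python
import math

def drop_sparse_genes(rows, max_missing):
    """Keep only genes (rows) with fewer than max_missing missing values (None)."""
    return [r for r in rows if sum(v is None for v in r) < max_missing]

def euclidean(a, b):
    """Euclidean distance over positions observed in both rows.

    The sum of squares is rescaled to the full row length so that rows
    sharing few positions are not favored (an assumption of this sketch).
    Returns None when the rows share no observed position.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return None
    sq = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(sq * len(a) / len(shared))

def knn_impute(rows, k):
    """Fill each missing value with the mean of that column taken from the
    k nearest genes that have a value observed in that column."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for m, other in enumerate(rows):
                if m == i or other[j] is None:
                    continue
                d = euclidean(row, other)  # distances use the original rows
                if d is not None:
                    candidates.append((d, other[j]))
            candidates.sort(key=lambda t: t[0])
            nearest = candidates[:k]
            if nearest:  # leave the gap if no neighbor has this column
                filled[i][j] = sum(v for _, v in nearest) / len(nearest)
    return filled

# Toy data: 5 genes (rows) by 4 samples (columns); None marks a missing value.
genes = [
    [1.0, 2.0, None, 4.0],
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 3.2, 4.1],
    [5.0, 6.0, 7.0, 8.0],
    [None, None, None, 1.0],
]
kept = drop_sparse_genes(genes, max_missing=3)  # last gene is removed
complete = knn_impute(kept, k=2)
print(complete[0])  # third value is estimated from the two closest genes (about 3.1)
```

With the tutorial's settings, the corresponding calls in this sketch would use max_missing=30 and k=3. Averaging neighbor values is only one possible way to combine them; other schemes, such as distance-weighted averages, are equally plausible.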

Note: In addition to estimating missing values, GeneLinker™ provides facilities for normalizing and filtering data. These functions are described in detail in the preprocessing section of the help. The dataset used in this tutorial was already suitably normalized by its original authors.