Welcome to the first tutorial. This tutorial introduces you to clustering by walking you through a simple analysis of a real dataset. You will be shown how to normalize the data, cluster it, and then visualize the clustering results in different types of plots.
Skills You Will Learn:
How to import gene expression data from a file into the GeneLinker™ database.
How to use the table viewer.
How to normalize a dataset.
How to perform clustering experiments.
How to display plots.
How to generate a report and export an image.
This tutorial uses a dataset described in a 1998 paper (see http://www.pnas.org/cgi/content/abstract/95/1/334) by Xiling Wen, Stefanie Fuhrman, George S. Michaels, Daniel B. Carr, Susan Smith, Jeffrey L. Barker and Roland Somogyi, 'Large-scale temporal gene expression mapping of central nervous system development.' Proc. Nat. Acad. Sci. USA, Vol. 95, pp. 334-339, January 1998. You may find it useful to have a copy of the paper on hand, either on your screen or printed out, while working through this tutorial. The tutorial refers to this paper as 'Wen et al.', or simply 'Wen'.
The raw data represent RT-PCR product ratios (sample/control densities from gel images), averaged over three measurements. This expression study was designed to discover relationships between members of important gene families during different phases of rat cervical spinal cord development, assayed over nine time points before (E=embryonic) and after birth (P=postnatal). The selection covers a range of developmental markers and intercellular signaling genes, involving neurotransmitters and growth factors.
Wen et al. first clustered the genes 'from the combined 17 dimensional vectors of nine expression values (ranging between 0 to 1) and eight slopes (ranging between -1 and +1; slopes were calculated based on a reduced time interval of 1, not taking into account the variable time intervals). [They] included slopes to take into account offset but parallel patterns.' Computing this difference information (which they call 'slope') cannot be done entirely within GeneLinker™. For the purpose of this tutorial, slopes are ignored, and the software is used only to investigate the expression levels.
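Although the tutorial does not use slopes, it may help to see what the 17-dimensional vector looks like. The following is a minimal Python sketch, run outside GeneLinker™, that assumes 'slope' means the difference between consecutive expression values with every time step treated as an interval of 1. The function names and the sample profile are hypothetical and are not taken from Wen et al. or from the software.

```python
def slopes_unit_interval(values):
    """Differences between consecutive values, treating each step as interval 1.

    Assumption: this is one reading of the 'slope' described by Wen et al.;
    the actual spacing between time points is deliberately ignored.
    """
    return [later - earlier for earlier, later in zip(values, values[1:])]


def combined_vector(values):
    """Concatenate the expression values with their unit-interval slopes."""
    return list(values) + slopes_unit_interval(values)


# Hypothetical profile for one gene at nine time points, scaled to 0..1.
# These numbers are made up for illustration; they are not from the dataset.
profile = [0.0, 0.25, 0.5, 1.0, 0.75, 0.5, 0.5, 0.25, 0.0]

vector = combined_vector(profile)
print(len(vector))   # nine values plus eight slopes
print(vector[9:])    # the eight slopes only
```

Because each expression value lies between 0 and 1, each slope computed this way necessarily lies between -1 and +1, which matches the ranges quoted above.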
This tutorial should take about an hour, depending on how long you spend investigating the data and how fast your machine is. If you must stop partway through the tutorial, simply exit the program by selecting Exit from the File menu. The data and experiments you have performed to that point are saved automatically by GeneLinker™. The next time you start GeneLinker™, you can continue with the next step in the tutorial.