
 

GeneLinker™ Tour - Clustering and PCA

 

Clustering / PCA and Visualization

 

Introduction to Clustering

Clustering groups biological samples or genes into separate clusters based on the similarity of their expression profiles. The objective is to measure similarity between samples (given their expression ratios across all genes) or between genes (given their expression ratios across all samples), and then to group similar samples or genes together to help reveal relationships that might exist among them.

 

Clustering

{image}
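
The grouping described above can be illustrated with a minimal sketch. It assumes average-linkage agglomerative clustering with a Pearson-correlation distance, which is one common choice and not necessarily the method GeneLinker uses. The gene names, expression values, and function names are hypothetical.

```python
import math

def pearson_distance(a, b):
    """Return 1 - Pearson correlation between two equal-length profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return 1.0 - cov / math.sqrt(var_a * var_b)

def average_linkage(profiles, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between members of the two clusters.
                dists = [pearson_distance(profiles[p], profiles[q])
                         for p in clusters[i] for q in clusters[j]]
                d = sum(dists) / len(dists)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]

# Hypothetical expression ratios for four genes across five samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [1.1, 2.1, 2.9, 4.2, 4.8],
    "geneC": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneD": [4.7, 4.1, 3.2, 1.9, 1.2],
}
clusters = average_linkage(profiles, n_clusters=2)
print(clusters)  # [['geneA', 'geneB'], ['geneC', 'geneD']]
```

Genes whose ratios rise and fall together end up in the same cluster. Transposing the data so that each profile is a sample rather than a gene would cluster samples in the same way.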

 

Introduction to Principal Component Analysis

Component Analysis is an unsupervised or class-free approach to finding the most informative or explanatory features in data. In particular, Principal Component Analysis (PCA) substantially reduces the complexity of data in which a large number of variables (e.g., thousands) are interrelated, such as large-scale gene expression data obtained across a variety of different samples or conditions. PCA accomplishes this by computing a new, much smaller set of uncorrelated variables (the principal components) that capture as much of the variance in the original data as possible. PCA is a powerful, well-established technique for data reduction and visualization. In 2D and 3D PCA plots, objects with similar patterns often lie near each other.

 

Principal Component Analysis (PCA)

{image}
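
The reduction described above can be sketched in a few lines. This is a dependency-free illustration that assumes power iteration with deflation on the sample covariance matrix. A linear algebra library would normally do this work, and nothing here is taken from GeneLinker itself. The data values and function names are hypothetical.

```python
import math

def center_columns(rows):
    """Subtract each column mean so variance is measured about the mean."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]

def covariance_matrix(centered):
    """Sample covariance between every pair of columns (variables)."""
    n = len(centered)
    cols = list(zip(*centered))
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in cols]
            for ci in cols]

def leading_eigenpair(matrix, iterations=1000):
    """Power iteration for the largest eigenvalue and its unit eigenvector."""
    size = len(matrix)
    # Arbitrary non-uniform start vector, normalised to unit length.
    vec = [1.0 / (k + 1) for k in range(size)]
    norm = math.sqrt(sum(x * x for x in vec))
    vec = [x / norm for x in vec]
    for _ in range(iterations):
        product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:  # zero matrix: no variance left to explain
            return 0.0, vec
        vec = [x / norm for x in product]
    # The Rayleigh quotient gives the eigenvalue for the final unit vector.
    product = [sum(m * v for m, v in zip(row, vec)) for row in matrix]
    value = sum(v * p for v, p in zip(vec, product))
    return value, vec

def pca(rows, n_components=2):
    """Return (scores, components, explained variance ratios).

    Assumes the data are not constant, so the total variance is non-zero.
    """
    centered = center_columns(rows)
    cov = covariance_matrix(centered)
    total_variance = sum(cov[i][i] for i in range(len(cov)))
    components, explained = [], []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cov)
        components.append(vec)
        explained.append(value / total_variance)
        # Deflation: remove the component just found before the next pass.
        cov = [[cov[i][j] - value * vec[i] * vec[j] for j in range(len(vec))]
               for i in range(len(vec))]
    # Project each centered sample onto the principal components.
    scores = [[sum(x * c for x, c in zip(row, comp)) for comp in components]
              for row in centered]
    return scores, components, explained

# Hypothetical data: six samples (rows) measured on four genes (columns).
samples = [
    [2.0, 1.9, 0.1, 0.2],
    [2.1, 2.2, 0.0, 0.1],
    [1.8, 2.0, 0.2, 0.0],
    [0.1, 0.2, 2.0, 2.1],
    [0.0, 0.1, 1.9, 2.2],
    [0.2, 0.0, 2.1, 1.8],
]
scores, components, explained = pca(samples)
for row in scores:
    print([round(v, 2) for v in row])
```

Plotting the two score columns against each other gives a 2D PCA plot. In this toy data the first three samples and the last three samples fall on opposite sides of the first component.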