Wilks’ dissimilarity for gene clustering: computational issues

Authors

  • F. Marta L. Di Lascio Free University of Bozen-Bolzano
  • Alberto Roverato Università degli Studi di Bologna

DOI:

https://doi.org/10.2427/8761

Abstract

Clustering methods are widely used in the analysis of gene expression data for their ability to uncover coordinated expression profiles. One important goal of clustering is to discover co-regulated genes, because it has been postulated that co-regulation implies a similar function. In the context of agglomerative hierarchical clustering, we introduced a dissimilarity measure based on the Wilks’ Λ statistic, which we called the Wilks’ dissimilarity, and showed its usefulness in the identification of transcription modules. In this paper, we discuss the ability of the Wilks’ dissimilarity to identify clusters of co-expressed genes by providing an example where the most commonly used dissimilarity measures fail. Furthermore, we carry out a set of simulations aimed at investigating the use of a sparse canonical correlation technique in the estimation of the Wilks’ dissimilarity, and we provide guidelines for its use.
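
As a rough illustration of the statistic the abstract refers to, the following minimal sketch computes the classical Wilks’ Λ for testing independence between two sets of variables, Λ = |S| / (|S11| |S22|), which equals the product of (1 − ρ²) over the sample canonical correlations ρ. The function name `wilks_lambda` is hypothetical, and using Λ directly as the dissimilarity between two gene clusters is an assumption made here for illustration; the abstract does not state the exact definition. Rows are taken to be samples and columns to be genes.

```python
import numpy as np


def wilks_lambda(X, Y):
    """Wilks' Lambda for independence between the columns of X and of Y.

    X has shape (n, p) and Y has shape (n, q): n samples in rows,
    genes in columns. Hypothetical helper, for illustration only.
    Values near 1 suggest the two sets are uncorrelated; values near 0
    suggest strong linear association.
    """
    p = X.shape[1]
    # Joint sample covariance of all p + q variables.
    S = np.cov(np.hstack([X, Y]), rowvar=False)
    S11 = S[:p, :p]  # covariance within the first set
    S22 = S[p:, p:]  # covariance within the second set
    # Log-determinants are used for numerical stability.
    _, logdet_S = np.linalg.slogdet(S)
    _, logdet_S11 = np.linalg.slogdet(S11)
    _, logdet_S22 = np.linalg.slogdet(S22)
    return float(np.exp(logdet_S - logdet_S11 - logdet_S22))


# Usage with synthetic data: two "clusters" of 3 and 2 genes, 40 samples.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 3))
B = rng.normal(size=(40, 2))
lam = wilks_lambda(A, B)
```

This direct estimate needs more samples than the total number of genes in the two sets; otherwise the sample covariance matrix is singular and the determinant ratio is not usable. That limitation is the kind of situation in which a regularized or sparse estimate of the canonical correlations becomes relevant.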

Published

2022-07-07

Section

Statistical Methods