J4 ›› 2009, Vol. 36 ›› Issue (3): 502-534.

• Research Papers •

New method for clustering gene expression data

WANG Wen-jun; ZHANG Jun-ying

  1. (School of Computer Science and Technology, Xidian Univ., Xi'an 710071, China)
  • Received: 2008-02-28 Revised: 2008-04-02 Online: 2009-06-20 Published: 2009-07-04
  • Contact: WANG Wen-jun
  • Supported by:

    National Natural Science Foundation of China (60371044)

Abstract:

A new clustering method based on the relationships between samples (patterns) is proposed. The relationships between samples are obtained from gene expression data through the Pearson correlation coefficient and represented as a network. Relation features of the samples are then extracted from the structural features of this network, and clustering is performed in the resulting relation feature space. The method reveals the differences between samples of different classes more effectively, and the dimensionality of the clustering space is low enough that no dimensionality reduction is needed. The proposed method and existing clustering methods were each applied to real gene expression data; the experimental results show that the proposed method achieves higher clustering accuracy and also performs well on data whose class distributions are intermixed.

Key words: clustering, pattern relation network, structure feature, relation feature
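
The pipeline outlined in the abstract (Pearson correlation between samples, a network built from those correlations, clustering in a feature space derived from the network) can be illustrated with a minimal sketch. The abstract does not say which structural features are extracted or which clustering algorithm is used, so everything specific here is an assumption made for illustration: the correlation threshold of 0.5, the use of each sample's row of the weighted adjacency matrix as its relation feature, the plain k-means step, and the function names `sample_relation_network`, `kmeans`, and `cluster_samples`. It should not be read as the authors' implementation.

```python
import numpy as np


def sample_relation_network(expr, threshold=0.5):
    """Weighted adjacency matrix of a sample network.

    expr: array of shape (n_samples, n_genes).
    Two samples are linked when their Pearson correlation is >= threshold.
    The threshold rule is an assumption of this sketch.
    """
    corr = np.corrcoef(expr)          # rows are samples -> (n_samples, n_samples)
    adj = np.where(corr >= threshold, corr, 0.0)
    np.fill_diagonal(adj, 0.0)        # no self-loops
    return adj


def kmeans(points, k, n_iter=100):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        # Distance of every point to its nearest already-chosen center.
        dist = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = np.argmin(dists, axis=1)
        # Keep the old center if a cluster happens to be empty.
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def cluster_samples(expr, k, threshold=0.5):
    """Cluster samples using their network connection profiles as features.

    Each sample is described by its row of the adjacency matrix, so the
    feature space has n_samples dimensions rather than n_genes (assumed
    stand-in for the paper's relation features).
    """
    features = sample_relation_network(expr, threshold)
    return kmeans(features, k)


if __name__ == "__main__":
    # Synthetic data: two groups of 5 samples over 200 genes.
    rng = np.random.default_rng(0)
    profile_a = rng.normal(size=200)
    profile_b = rng.normal(size=200)
    expr = np.vstack([
        profile_a + 0.3 * rng.normal(size=(5, 200)),
        profile_b + 0.3 * rng.normal(size=(5, 200)),
    ])
    print(cluster_samples(expr, k=2))
```

In this sketch the clustering space has as many dimensions as there are samples, which for typical gene expression data is far fewer than the number of genes; that is one plausible reading of the low-dimensionality claim, not a confirmed detail of the method.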

CLC Number: 

  • TP391