New nearest neighbor affinity similarity function based on separation and compactness between samples

doi:10.3969/j.issn.1001-2400.2014.03.018

Abstract

Traditional distance and similarity measures do not take into account the influence of an individual sample on the whole sample set. To address this issue, this paper proposes a new similarity improvement strategy for the k-nearest neighbor (KNN) algorithm. First, a new affinity distance function is introduced, which captures the separation and compactness between each individual sample and the whole sample set. Second, a new similarity function built on this affinity distance function is proposed and used as the similarity measure in KNN. Third, a theoretical analysis and experiments on eighteen numerical UCI (University of California, Irvine) datasets compare the proposed affinity similarity function with classical distance and similarity functions under 5-fold cross-validation. Finally, the classification results indicate that the proposed affinity similarity function is not only an effective similarity strategy for classification but can also reduce classification time on large-scale datasets when combined with efficient indexing algorithms.

Key words: machine learning, nearest neighbors, affinity similarity, separation, compactness

LI Juan, WANG Yuping. New nearest neighbor affinity similarity function based on separation and compactness between samples [J]. Journal of Xidian University, 2014, 41(3): 123-130.
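
The abstract describes plugging a set-aware similarity measure into KNN and evaluating it with 5-fold cross-validation. The following is a minimal sketch of that workflow only. The abstract does not give the affinity distance formula, so `make_set_aware_distance` is a hypothetical stand-in (Euclidean distance rescaled by each training sample's mean distance to the rest of the training set) and should not be read as the authors' function. All function names here are assumptions made for illustration.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Classical Euclidean distance between two numeric vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def make_set_aware_distance(train_X):
    """Build a distance that depends on the whole training set.

    Hypothetical stand-in, NOT the paper's affinity distance: each
    training sample's Euclidean distance to the query is divided by that
    sample's mean distance to all other training samples.
    """
    scale = {}
    for i, a in enumerate(train_X):
        others = [euclidean(a, b) for j, b in enumerate(train_X) if j != i]
        mean = sum(others) / len(others) if others else 1.0
        scale[tuple(a)] = mean or 1.0  # avoid dividing by zero

    def distance(train_sample, query):
        return euclidean(train_sample, query) / scale[tuple(train_sample)]

    return distance


def knn_predict(train_X, train_y, query, k=3, distance=euclidean):
    """Majority vote among the k training samples closest to the query."""
    ranked = sorted(zip(train_X, train_y),
                    key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def cross_val_accuracy(X, y, k=3, folds=5, make_distance=None):
    """Accuracy under a simple interleaved n-fold split.

    make_distance, if given, is called on each fold's training set so the
    distance never sees the held-out samples.
    """
    correct = 0
    for f in range(folds):
        test_idx = set(range(f, len(X), folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [c for i, c in enumerate(y) if i not in test_idx]
        dist = make_distance(train_X) if make_distance else euclidean
        for i in test_idx:
            if knn_predict(train_X, train_y, X[i], k, dist) == y[i]:
                correct += 1
    return correct / len(X)
```

Calling `cross_val_accuracy(X, y)` gives the baseline with the classical distance, and passing `make_distance=make_set_aware_distance` swaps in the set-aware variant, so the two can be compared on the same folds. The linear scan in `knn_predict` is the part that an indexing structure would replace on large datasets.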
