An Optimized k-means Algorithm Based on Information Entropy

Liu, Meiling; Zhang, Beixian; Li, Xi; Tang, Weidong; Zhang, GangQiang

doi:10.1093/comjnl/bxab078

Abstract

Clustering is a widely used technique in data mining and pattern recognition, in which data objects are divided into groups. The k-means algorithm is one of the most classical clustering algorithms. Because it selects the initial cluster centers at random, its clustering results are unstable. To solve this problem, an optimized algorithm for selecting the initial centers is proposed. The proposed algorithm defines a dispersion degree based on entropy. All objects are first grouped into one big cluster, and the object with the maximum dispersion degree and the object with the minimum dispersion degree are selected from it as initial cluster centers. The other objects in the biggest cluster are then assigned to whichever of these initial clusters is nearest. This partitioning is repeated until the number of clusters equals the specified value k. Finally, the k partitioned clusters and their centers are passed to the k-means algorithm as initial clusters and centers. Several experiments are conducted on real data sets to evaluate the proposed algorithm, comparing it with the traditional k-means algorithm and the max-min distance clustering algorithm. The experimental results show that the improved k-means algorithm is stable in selecting the initial clustering, because it selects unique initial cluster centers. The experiments also verify the optimized algorithm's effectiveness and feasibility: it reduces the number of iterations and yields more stable clustering results and higher accuracy.
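The initialization steps described in the abstract can be illustrated with a minimal sketch. The abstract does not give the formula for the dispersion degree, so the `dispersion` function below is an assumed stand-in (the Shannon entropy of an object's normalized distances to the other objects in its cluster), not the authors' definition. Taking the mean of each partitioned cluster as its center is likewise an assumption, and all function names are hypothetical. What the sketch does follow from the abstract is the overall flow: start with one big cluster, split the biggest cluster around its maximum- and minimum-dispersion objects, repeat until there are k clusters, then run standard k-means from that deterministic starting point.

```python
import math

def dist(a, b):
    """Euclidean distance between two points given as sequences of numbers."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dispersion(points, i):
    """ASSUMED definition, not taken from the paper: Shannon entropy of
    point i's normalized distances to the other points in its cluster."""
    d = [dist(points[i], p) for j, p in enumerate(points) if j != i]
    total = sum(d)
    if total == 0:
        return 0.0
    return -sum((x / total) * math.log(x / total) for x in d if x > 0)

def split_largest(clusters):
    """Replace the largest cluster (assumed to hold >= 2 points) with two
    clusters seeded by its max- and min-dispersion objects."""
    idx = max(range(len(clusters)), key=lambda c: len(clusters[c]))
    big = clusters.pop(idx)
    scores = [dispersion(big, i) for i in range(len(big))]
    hi = max(range(len(big)), key=scores.__getitem__)
    lo = min(range(len(big)), key=scores.__getitem__)
    if hi == lo:  # all scores equal: fall back to any other object
        lo = (hi + 1) % len(big)
    a, b = big[hi], big[lo]
    part_a, part_b = [a], [b]
    for j, p in enumerate(big):
        if j in (hi, lo):
            continue
        # Assign every remaining object to the nearer of the two seeds.
        (part_a if dist(p, a) <= dist(p, b) else part_b).append(p)
    clusters.extend([part_a, part_b])

def initial_clusters(points, k):
    """Start from one big cluster and split until there are k clusters
    (assumes 1 <= k <= len(points))."""
    clusters = [list(points)]
    while len(clusters) < k:
        split_largest(clusters)
    return clusters

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(c) / n for c in zip(*points))

def kmeans(points, k, max_iter=100):
    """Standard Lloyd iterations, started from the deterministic clusters.
    ASSUMPTION: each initial center is the mean of its initial cluster."""
    centers = [centroid(c) for c in initial_clusters(points, k)]
    groups = []
    for _ in range(max_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(k), key=lambda c: dist(p, centers[c]))
            groups[nearest].append(p)
        new_centers = [centroid(g) if g else centers[i]
                       for i, g in enumerate(groups)]
        if new_centers == centers:  # assignments have stopped changing
            break
        centers = new_centers
    return centers, groups

data = [(0, 0), (0, 1), (1, 0), (1, 1),
        (10, 10), (10, 11), (11, 10), (11, 11)]
centers, groups = kmeans(data, 2)
```

Because no step draws random numbers, repeated calls on the same data return the same centers, which is the stability property the abstract emphasizes. Whether this sketch reproduces the reported reductions in iteration count or gains in accuracy depends on the actual dispersion formula, which the abstract does not state.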

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)


