Please use this identifier to cite or link to this item:
http://ir.lib.seu.ac.lk/handle/123456789/6314
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Alibuhtto, M. C. | - |
dc.date.accessioned | 2022-12-01T06:22:12Z | - |
dc.date.available | 2022-12-01T06:22:12Z | - |
dc.date.issued | 2022-11-15 | - |
dc.identifier.citation | Proceedings of the 11th Annual Science Research Sessions, FAS, SEUSL, Sri Lanka, 15th November 2022: Scientific Engagement for Sustainable Futuristic Innovations, p. 60. | en_US |
dc.identifier.isbn | 978-624-5736-60-7 | - |
dc.identifier.isbn | 978-624-5736-59-1 | - |
dc.identifier.uri | http://ir.lib.seu.ac.lk/handle/123456789/6314 | - |
dc.description.abstract | In the current digital era, data are generated in enormous volumes and at a fast-growing rate from many different sources, and managing such huge data is a major challenge. Clustering algorithms can find hidden patterns and extract useful information from huge datasets. Among clustering techniques, the k-means algorithm is the most commonly used unsupervised classification technique. However, choosing the optimal number of clusters (k) is a prominent problem in applying the k-means algorithm. In most cases, when clustering huge data, k is predetermined by the researcher, and an incorrectly chosen k increases the computational cost. To obtain the optimal number of clusters, a distance-based k-means algorithm was proposed and evaluated on a simulated dataset. Two distance measures were considered in the k-means algorithm, namely the Euclidean and Manhattan distances. The results on simulated data reveal that the k-means algorithm with the Euclidean distance yields the optimal number of clusters when compared with the Manhattan distance. Tests on real datasets gave results consistent with those on the simulated data. | en_US |
dc.language.iso | en_US | en_US |
dc.publisher | Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sammanthurai | en_US |
dc.subject | Huge data | en_US |
dc.subject | Digital era | en_US |
dc.subject | Distance measure | en_US |
dc.subject | K-means algorithm | en_US |
dc.title | Determining the optimal number of clusters using distance based k-means algorithm | en_US |
dc.type | Article | en_US |
Appears in Collections: | 11th Annual Science Research Session - FAS |
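
The abstract compares k-means under the Euclidean and Manhattan distances but does not give the procedure itself. The following is only a minimal sketch of that kind of comparison, not the author's method. All function names (`kmeans`, `total_within_distance`, `simulate`), the simulated three-cluster data, and the use of the total within-cluster distance as the quantity to inspect across k are assumptions made for illustration. The centroid update uses the mean for both distances, which is also an assumption.

```python
import random


def euclidean(p, q):
    """Straight-line distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5


def manhattan(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))


def kmeans(points, k, distance, iterations=100, seed=0):
    """Lloyd-style k-means with a pluggable distance; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids: k distinct data points
    labels = None
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: distance(p, centroids[j])) for p in points
        ]
        if new_labels == labels:  # assignments stable, so stop
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


def total_within_distance(points, centroids, labels, distance):
    """Sum of distances from each point to its assigned centroid."""
    return sum(distance(p, centroids[lab]) for p, lab in zip(points, labels))


def simulate(seed=0):
    """Hypothetical simulated data: three Gaussian blobs of 50 points in 2-D."""
    rng = random.Random(seed)
    centres = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
    return [
        (rng.gauss(cx, 1.0), rng.gauss(cy, 1.0))
        for cx, cy in centres
        for _ in range(50)
    ]


if __name__ == "__main__":
    data = simulate()
    for name, dist in (("Euclidean", euclidean), ("Manhattan", manhattan)):
        for k in range(1, 7):
            centroids, labels = kmeans(data, k, dist)
            score = total_within_distance(data, centroids, labels, dist)
            print(name, k, round(score, 2))
```

Running the script prints one score per distance measure and k. Looking for the k beyond which the score stops dropping sharply is one common heuristic for choosing k, and it may differ from the criterion used in the paper.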
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Fas symposium paper-25.pdf | | 472.59 kB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.