AN EFFICIENT HIERARCHICAL CLUSTERING ALGORITHM FOR PROTEIN SEQUENCING
Date
2009-02-22
Publisher
Government College of Technology, Coimbatore
Abstract
Clustering is the division of data into groups of similar objects. The main objective of this unsupervised learning technique is to find a meaningful partition by using a distance or similarity function. This paper proposes Leaders-Subleaders, an incremental clustering algorithm that extends the leader algorithm and is suited to protein sequences in bioinformatics, for effective clustering and prototype selection in pattern classification. It is a simple and efficient technique that generates a hierarchical structure for finding the subclusters within each cluster. The experimental results of the proposed algorithm are compared with those of Nearest Neighbour Classifier (NNC) methods. The proposed algorithm is found to be computationally efficient when compared to NNC. The classification accuracy obtained using the representatives generated by the Leaders-Subleaders method is found to be better than that obtained using the Leaders method and the NNC method. Even when a larger number of prototypes is generated, classification time is lower than with NNC methods.
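The two-level idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the two threshold parameters, and the pluggable `dist` function are assumptions made for illustration. It also assumes the commonly described form of the leader algorithm, in which each item joins the first leader within a distance threshold or otherwise becomes a new leader, and it applies the same pass again inside each cluster with a smaller threshold to obtain subleaders. For protein sequences, `dist` would presumably be an alignment-based or similar sequence distance; plain numbers are used here to keep the example self-contained.

```python
from typing import Callable, Dict, List, Sequence, TypeVar

T = TypeVar("T")


def find_leaders(
    items: Sequence[T], threshold: float, dist: Callable[[T, T], float]
) -> Dict[int, List[int]]:
    """Single-pass leader clustering (illustrative sketch).

    Returns a mapping from each leader's index to the indices of its members.
    """
    clusters: Dict[int, List[int]] = {}
    for i, x in enumerate(items):
        for lead in clusters:
            # Join the first existing leader that is close enough.
            if dist(x, items[lead]) <= threshold:
                clusters[lead].append(i)
                break
        else:
            # No leader within the threshold: this item becomes a new leader.
            clusters[i] = [i]
    return clusters


def leaders_subleaders(
    items: Sequence[T],
    threshold: float,
    sub_threshold: float,
    dist: Callable[[T, T], float],
) -> Dict[int, Dict[int, List[int]]]:
    """Two-level hierarchy: leader -> subleader -> member indices.

    Assumes sub_threshold < threshold so subclusters refine each cluster.
    """
    hierarchy: Dict[int, Dict[int, List[int]]] = {}
    for lead, members in find_leaders(items, threshold, dist).items():
        subset = [items[m] for m in members]
        sub = find_leaders(subset, sub_threshold, dist)
        # Map positions within the subset back to indices in the original data.
        hierarchy[lead] = {
            members[s]: [members[j] for j in js] for s, js in sub.items()
        }
    return hierarchy


# Toy data standing in for sequences; absolute difference stands in for
# a sequence distance (both are assumptions for illustration only).
data = [1.0, 1.2, 1.9, 5.0, 5.1, 5.8]
result = leaders_subleaders(data, 1.0, 0.3, lambda a, b: abs(a - b))
print(result)
```

On this toy input the sketch yields two leaders (indices 0 and 3), each with two subleaders: `{0: {0: [0, 1], 2: [2]}, 3: {3: [3, 4], 5: [5]}}`. In a prototype-based classifier, a test pattern could first be compared against the leaders and then only against the subleaders of the nearest leader, which is one plausible reason classification can be faster than comparing against every training pattern as a nearest-neighbour classifier does.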
Keywords
Unsupervised Learning, Nearest Neighbour Classifier, Classification Accuracy, Protein Sequences, Leaders method