ENHANCED SENTENCE-LEVEL TEXT CLUSTERING USING SEMANTIC SENTENCE SIMILARITY FROM DIFFERENT ASPECTS

Date

2014

Publisher

International Journal of Computer Science and Information Technologies

Abstract

Sentence clustering plays a significant role in many text processing activities. For instance, several authors have argued that integrating sentence clustering into extractive multi-document summarization helps address content overlap, leading to better coverage. Existing work proposed a fuzzy clustering algorithm that operates on relational input data. This algorithm uses a graph representation of the data and operates within an Expectation-Maximization framework. The proposed system improves the clustering result by introducing a novel sentence similarity technique. We propose a new way to determine sentence similarity from different aspects, based on the information people can obtain from a sentence: the objects the sentence describes, the properties of these objects, and the behaviors of these objects. Four aspects, Objects-Specified Similarity, Objects-Property Similarity, Objects-Behavior Similarity and Overall Similarity, are calculated to estimate sentence similarity. First, for each sentence, all nouns in noun phrases are chosen as the objects specified in the sentence, all adjectives and adverbs in noun phrases as the objects' properties, and all verb phrases as the objects' behaviors. Then, the four similarities are calculated based on a semantic vector method. We also conducted an experimental study of how efficiently this approach clusters sentence-level text. Our study shows that this algorithm generates better-quality clusters than traditional algorithms; in other words, it increases the accuracy of the clustering result.
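
The four-aspect comparison described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the semantic vector method, so plain word-count vectors with cosine similarity are swapped in here. The sketch also assumes sentences arrive already part-of-speech tagged with Penn Treebank-style tags, selects words by tag alone rather than by noun-phrase or verb-phrase chunking, and takes Overall Similarity to be the cosine over all words. The names `aspect_vector`, `cosine` and `sentence_similarities` are hypothetical.

```python
from collections import Counter
from math import sqrt

# Assumed mapping from Penn Treebank-style tag prefixes to the three aspects.
OBJECT_TAGS = ("NN",)          # nouns -> objects specified
PROPERTY_TAGS = ("JJ", "RB")   # adjectives, adverbs -> object properties
BEHAVIOR_TAGS = ("VB",)        # verbs -> object behaviors


def aspect_vector(tagged, prefixes=None):
    """Count lowercased words whose tag starts with one of the prefixes.

    With prefixes=None every word is counted (used for the overall aspect).
    """
    return Counter(
        word.lower()
        for word, tag in tagged
        if prefixes is None or tag.startswith(prefixes)
    )


def cosine(a, b):
    """Cosine similarity of two count vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def sentence_similarities(s1, s2):
    """Return the four aspect similarities for two tagged sentences.

    Each sentence is a list of (word, tag) pairs.
    """
    aspects = {
        "objects_specified": OBJECT_TAGS,
        "objects_property": PROPERTY_TAGS,
        "objects_behavior": BEHAVIOR_TAGS,
        "overall": None,  # assumption: overall compares all words
    }
    return {
        name: cosine(aspect_vector(s1, tags), aspect_vector(s2, tags))
        for name, tags in aspects.items()
    }
```

Calling `sentence_similarities` on two lists of `(word, tag)` pairs returns a dictionary with one score per aspect. A pairwise matrix of such scores is the kind of relational input a fuzzy relational clustering algorithm would consume, though how the paper combines the four scores is not stated in the abstract.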

Keywords

Sentence-level clustering, Fuzzy relational clustering, Sentence similarity, Objects-based similarity
