SEUIR Repository

Automated text summarization of Sinhala online articles


dc.contributor.author Akmal Jahan, M. A. C.
dc.contributor.author Wijesekara, K. K. C.
dc.date.accessioned 2023-08-16T12:37:58Z
dc.date.available 2023-08-16T12:37:58Z
dc.date.issued 2023-06
dc.identifier.citation Journal of Science, Faculty of Applied Sciences, South Eastern University of Sri Lanka, Vol. 4 (No. 1), June 2023, pp. 1-14. en_US
dc.identifier.issn 2738-2184
dc.identifier.uri https://www.seu.ac.lk/jsc/
dc.identifier.uri http://ir.lib.seu.ac.lk/handle/123456789/6773
dc.description.abstract Information retrieval is one of the major tasks in natural language processing (NLP) applications. In the digitalized world, retrieval of information from online platforms has grown, and abundant information on any specific subject is available online. Amid the hustle and bustle of daily life, readers need to judge within a very short time whether a piece of information is relevant to their needs. Automated text summarization therefore plays a key role in NLP applications. Many studies have explored summarization for languages such as English, Bengali, Hausa, Chinese, and Hindi. However, work on a local language like Sinhala is still at an early stage. Moreover, Sri Lanka is a diverse country with many communities and languages, so some people are less fluent in Sinhala because their mother tongue is another local language such as Tamil. Social media platforms like Facebook provide a facility for translating content into a different language, but other online platforms do not. In such a scenario, a short summary of an article would help readers easily grasp the main idea of its content. Therefore, this work aims to develop an online platform that can provide a good summary of Sinhala-language online articles. This research investigates extractive text summarization for Sinhala online articles using state-of-the-art NLP algorithms in order to select the most suitable method. It comparatively analyses the performance of the TF-IDF (Term Frequency-Inverse Document Frequency) and Text-Rank algorithms for the Sinhala language. The performance of the algorithms is evaluated against human-generated summaries from online sources using ROUGE (Recall-Oriented Understudy for Gisting Evaluation), which measures the rate of n-gram overlap between the human-generated reference summary and the automated summary; higher ROUGE scores indicate a more accurate automated summary of the article. The results show that the TF-IDF algorithm performs comparatively better for summarizing Sinhala online articles of medium content size. en_US
dc.language.iso en_US en_US
dc.publisher Faculty of Applied Sciences, South Eastern University of Sri Lanka, Sammanthurai. en_US
dc.subject Text Summarization en_US
dc.subject Text-Rank en_US
dc.subject TF-IDF en_US
dc.subject Sinhala Article en_US
dc.title Automated text summarization of Sinhala online articles en_US
dc.type Article en_US
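
The abstract names TF-IDF sentence scoring and ROUGE evaluation but the record gives no implementation details, so the following is only a minimal sketch of how the two ideas might fit together, not the authors' method. It assumes each sentence is treated as a "document" when computing inverse document frequency, uses plain whitespace tokenization (reasonable for space-delimited Sinhala text, with no stop-word removal or stemming), and splits sentences naively on `.`, `!` and `?`. All function names (`tokenize`, `split_sentences`, `tfidf_summarize`, `rouge_n_recall`) are hypothetical.

```python
import math
import re
import string
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization; strip surrounding ASCII punctuation.
    # lower() only affects cased scripts and leaves Sinhala text unchanged.
    tokens = (tok.strip(string.punctuation).lower() for tok in sentence.split())
    return [tok for tok in tokens if tok]


def split_sentences(text):
    # Assumption: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tfidf_summarize(text, num_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Document frequency: number of sentences that contain each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # Sentence score = sum over its terms of (relative TF) * log(N / DF).
        scores.append(sum((count / len(toks)) * math.log(n / df[term])
                          for term, count in tf.items()))
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:num_sentences]
    return [sentences[i] for i in sorted(top)]


def rouge_n_recall(candidate, reference, n=1):
    """Fraction of the reference's n-grams that also appear in the candidate."""
    def ngrams(tokens):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand = ngrams(tokenize(candidate))
    ref = ngrams(tokenize(reference))
    total = sum(ref.values())
    if total == 0:
        return 0.0
    # Clipped overlap: each reference n-gram is matched at most as often
    # as it occurs in the candidate.
    overlap = sum(min(count, cand[gram]) for gram, count in ref.items())
    return overlap / total
```

In this sketch, a summary produced by `tfidf_summarize(article_text, 3)` could be compared with a human-written summary via `rouge_n_recall(" ".join(summary), human_summary, n=1)`. Established ROUGE toolkits also report precision and F-measure, which are omitted here for brevity.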

