SEUIR Repository

Authorship identification of instant messages

dc.contributor.author Jahan, Akmal, M.A.C
dc.contributor.author Jiffriya, M.A.C
dc.contributor.author Nawfan, M.N
dc.date.accessioned 2015-09-04T05:53:38Z
dc.date.available 2015-09-04T05:53:38Z
dc.date.issued 2014
dc.identifier.citation Annual Science Research Session 2014
dc.identifier.uri http://ir.lib.seu.ac.lk/123456789/389
dc.description.abstract Authorship attribution is the process of automatically identifying the author of a given text corpus. Early approaches to authorship detection were stylometric and were used to identify the author of printed materials and of online texts such as blogs, e-mails, tweets and posts. In past years, e-mail played a major role in communication. With the wide spread of social media, people spend a lot of time in online communication such as chatting, which has become one of the easiest and most effective means of communication. Social tools such as Facebook, Skype, Google Talk and other instant messaging tools now play a greater role in real-time communication than e-mail. In the current era, cybercrime and security threats have become a major issue for all internet-related activities. Although instant messaging is widely used as a fast and effective means of communication, it is more vulnerable to several attacks, and this issue needs to be addressed. So far, standard stylometric features have been used for authorship detection. However, attempts at this approach are still at an early stage. Therefore, this paper presents an alternative approach to authorship attribution of instant messages. Here, we use a vector space model with unigram features. A chat data set from individual users was processed, with 2/3 of the data treated as the training set and the remainder used for testing. Similarity scores between the training and testing sets were computed using the given algorithm. Overall, 75% of the training corpora showed their maximum similarity score with their own testing pair. Moreover, the length of the chat corpus has a significant effect on the similarity score, which determines the authorship attribution of the instant messages. en_US
dc.language.iso en_US en_US
dc.publisher Faculty of Applied Science, South Eastern University of Sri Lanka, Oluvil # 32360, Sri Lanka en_US
dc.subject Authorship attribution en_US
dc.subject Unigram en_US
dc.subject Vector space model en_US
dc.title Authorship identification of instant messages en_US
dc.type Conference paper en_US
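
The procedure outlined in the abstract (unigram vector space model, a 2/3 training and 1/3 testing split per user, and a similarity score between training and testing sets) could be sketched roughly as follows. The abstract does not name the similarity measure or the tokenisation, so cosine similarity over whitespace-split, lowercased tokens is an assumption here, and the function names (unigram_vector, cosine_similarity, split_messages, attribute) are hypothetical. This is an illustration under those assumptions, not the authors' implementation.

```python
import math
from collections import Counter


def unigram_vector(text):
    """Build a unigram term-frequency vector (assumed: lowercase, whitespace tokens)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (assumed measure)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector is treated as having no similarity
    return dot / (norm_a * norm_b)


def split_messages(messages):
    """Use the first 2/3 of a user's messages for training, the rest for testing."""
    cut = (2 * len(messages)) // 3
    return messages[:cut], messages[cut:]


def attribute(chats):
    """Map each user's testing set to the user whose training set scores highest.

    chats: dict of user name -> list of message strings.
    """
    train, test = {}, {}
    for user, messages in chats.items():
        train_msgs, test_msgs = split_messages(messages)
        train[user] = unigram_vector(" ".join(train_msgs))
        test[user] = unigram_vector(" ".join(test_msgs))
    return {
        user: max(train, key=lambda cand: cosine_similarity(train[cand], test_vec))
        for user, test_vec in test.items()
    }
```

An attribution counts as correct when a user's testing set is mapped back to the same user. The fraction of users for which this holds corresponds to the kind of overall figure the abstract reports.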

