Please use this identifier to cite or link to this item: http://ir.lib.seu.ac.lk/handle/123456789/6267
Title: Optimization of plagiarism detection using vector space model on CUDA architecture
Authors: Jiffriya Mohamed, Abdul-Cader
Akmal Jahan, Mohamed Abdul Cader
Hasindu, Gamaarachchi
Roshan, G. Ragel
Keywords: Graphics Processing Units (GPU)
Compute Unified Device Architecture (CUDA)
Plagiarism Detection
Vector Space Model
Issue Date: 26-Sep-2022
Publisher: Inderscience Publishers
Citation: International Journal of Innovative Computing and Applications, Vol. 13, No. 4, 2022, pp. 232-244 (pp. 1-20).
Abstract: Plagiarism is a rapidly growing problem among students in universities and other educational institutions, arising in submitted assignments, reports, and publications because abundant e-resources are easily accessible on the Internet. Many tools exist for natural-language plagiarism detection, but they become inefficient when processing a large number of documents with large content because of the time they consume. We therefore propose software-based acceleration of text-based plagiarism detection using a suitable model on CPU/GPU. As a baseline, a serial vector space model was first implemented on the CPU and tested with 1000 text-based documents, specifically students’ assignments, for which plagiarism detection took 1641 s. Because computation time is the performance bottleneck when handling a large number of text-based sources of different sizes, we focus on accelerating and optimizing the model as the number of documents grows. To that end, this research implements and optimizes the vector space model on Graphics Processing Units (GPU) using Compute Unified Device Architecture (CUDA). A parallel version of the model, developed on the GPU using CUDA and tested with the same dataset, took only 36 s, a 45x speedup over the CPU; after further optimization it took only 4 s for the same dataset, 389x faster than the serial implementation.
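Illustrative sketch (not taken from the paper): the abstract names a serial vector space model as the CPU baseline but does not give its details. The Python below is a minimal sketch of one common formulation, assuming raw term-frequency vectors, cosine similarity, and an all-pairs comparison. The function names, the tokenizer, and the 0.8 threshold are hypothetical choices made for illustration and are not the authors' implementation.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_frequencies(text):
    """Tokenize into lowercase alphanumeric runs and count each term."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty document is similar to nothing
    return dot / (norm_a * norm_b)


def flag_similar_pairs(documents, threshold=0.8):
    """Compare every pair of documents serially.

    Returns (i, j, score) for each pair whose score meets the
    threshold. The threshold is an assumed, illustrative value.
    """
    vectors = [term_frequencies(d) for d in documents]
    flagged = []
    for i, j in combinations(range(len(vectors)), 2):
        score = cosine_similarity(vectors[i], vectors[j])
        if score >= threshold:
            flagged.append((i, j, score))
    return flagged
```

If every document is compared with every other, 1000 documents give 1000 x 999 / 2 = 499,500 independent comparisons. Under that assumption, each comparison can be computed without reference to the others, which is the kind of workload that maps naturally onto GPU threads.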
URI: https://doi.org/10.1504/IJICA.2022.125675
http://ir.lib.seu.ac.lk/handle/123456789/6267
Appears in Collections: Research Articles

Files in This Item:
File: Optimisation of plagiarism.pdf
Size: 686.44 kB
Format: Adobe PDF

