Movie success and rating prediction using data mining algorithm

Pirunthavi, Sivakumar; Vithusia, Puvaneswaren Rajeswaren; Abishankar, Kamalanathan; Ekanayake, E. M. U. W. J. B.; Yanusha, Mehendran

Please use this identifier to cite or link to this item: http://ir.lib.seu.ac.lk/handle/123456789/5427

Full metadata record

DC Field	Value	Language
dc.contributor.author	Pirunthavi, Sivakumar	-
dc.contributor.author	Vithusia, Puvaneswaren Rajeswaren	-
dc.contributor.author	Abishankar, Kamalanathan	-
dc.contributor.author	Ekanayake, E. M. U. W. J. B.	-
dc.contributor.author	Yanusha, Mehendran	-
dc.date.accessioned	2021-04-01T09:29:05Z	-
dc.date.available	2021-04-01T09:29:05Z	-
dc.date.issued	2020-09-18	-
dc.identifier.citation	Journal of Information Systems & Information Technology Vol. 5 No. 2, 2020 pp. 72-80.	en_US
dc.identifier.issn	24780677	-
dc.identifier.uri	http://ir.lib.seu.ac.lk/handle/123456789/5427	-
dc.description.abstract	This project developed models to predict the success and the rating of a new movie before its release. Since the success of a movie is strongly influenced by the actor, actress, director, music director and production company, historical data on these factors were extracted from the Internet Movie Database (IMDb). Box Office Mojo stores information about the production cost and the total income of a movie, which helps to determine whether the movie is successful in terms of revenue. A heuristic threshold on revenue was defined to categorize each movie as a success or a failure. Comments on teasers and trailers were extracted from YouTube, as these are very helpful for rating a movie. Keywords were extracted from the user reviews using a Natural Language Processing (NLP) technique, and the reviews were categorized as positive or negative using sentiment analysis. A Random Forest algorithm was trained on the features extracted from IMDb to predict the success of a movie. Further, a Naive Bayes model was trained on the user reviews extracted from YouTube to predict the rating of a movie. The models were tested on real datasets and the accuracy of each was evaluated. Two conclusions were reached: the rating of a new movie cannot be predicted in advance from the comments on YouTube trailers and teasers, whereas the success of a new movie can be predicted in advance using data or features collected online. The performance of the models is reasonable compared to existing models in the literature. The Success Prediction model can be used as an early assessment tool for movies since it achieved 70% overall accuracy, and it is hence useful for people in the movie industry and for movie audiences. YouTube allows only a limited number of user comments to be extracted, and this limitation could have negatively affected the accuracy of the movie rating prediction. This abstract was presented at the International Research Conference of Uva Wellassa University of Sri Lanka (IRCUWU2020).	en_US
dc.language.iso	en_US	en_US
dc.publisher	Faculty of Management and Commerce, South Eastern University of Sri Lanka	en_US
dc.subject	Data Mining	en_US
dc.subject	Natural Language Processing	en_US
dc.subject	Sentiment Analysis	en_US
dc.subject	Naïve Bayes	en_US
dc.subject	Random Forest	en_US
dc.title	Movie success and rating prediction using data mining algorithm	en_US
dc.type	Article	en_US
Appears in Collections:	Vol.5 No.2 (2020)
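
The abstract describes two steps that a short example can make concrete: labelling movies as success or failure with a revenue threshold, and classifying comments as positive or negative with Naive Bayes. The following is a minimal sketch under stated assumptions, not the authors' implementation. The threshold value `SUCCESS_RATIO`, the function and class names, and the toy comments are all hypothetical (the abstract does not give the actual threshold or data), and the Random Forest step is not shown.

```python
import math
from collections import Counter

# Hypothetical heuristic: the abstract does not state the real threshold.
SUCCESS_RATIO = 1.0


def label_success(production_cost, total_income, ratio=SUCCESS_RATIO):
    """Label a movie 'success' if income reaches ratio * production cost."""
    return "success" if total_income >= ratio * production_cost else "failure"


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real NLP preprocessing)."""
    return text.lower().split()


class NaiveBayesSentiment:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = {label: Counter() for label in self.class_counts}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus summed log likelihoods of the known words.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, made-up comments for illustration only.
comments = [
    "great trailer love it",
    "looks amazing cannot wait",
    "boring trailer looks bad",
    "terrible acting waste of time",
]
sentiments = ["positive", "positive", "negative", "negative"]

model = NaiveBayesSentiment().fit(comments, sentiments)
print(label_success(production_cost=100, total_income=150))
print(model.predict("love this amazing trailer"))
```

In practice the labels from `label_success` would serve as the target for a classifier trained on cast, director and production company features, while the sentiment predictions would be aggregated per movie before being related to a rating.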

Files in This Item:

File	Description	Size	Format
JISIT-5209.pdf		631.78 kB	Adobe PDF
