Classification of Opinion on Movie Reviews by using Classifiers with 3-gram TF-IDF and SVD Features



Abstract

Feature extraction plays an important role in sentiment analysis (opinion mining) of issues, customer reviews, products and similar subjects, because the extracted features are fed to machine learning methods that classify the sentiments. Existing techniques widely use TF-IDF features extracted from the unigram lexicon of the sentiment documents; some use the term-frequency scores of unigram words as features. In this work, unigrams, two-word clusters (bigrams) and three-word clusters (trigrams) are generated after filtering the collected sentiment data. The data are user-written web movie reviews, collected manually from various websites (www.imdb.com, bookmyshow.com, Google user reviews), about the controversy surrounding the Bollywood movie Padmaavat, in which three different sentiments were found. Many people expressed positive views about the issue and the release of the movie; some were against the movie, and their reviews are taken as negative. A small number of reviews were neutral, expressing sentiments both in favor of and against the movie. Hence three categories of reviews are labeled and fed to the proposed opinion mining system. The unigram, bigram and trigram lexicons are then used to compute the TF-IDF scores of all the reviews, after which singular value decomposition (SVD) features are generated. Four machine learning classifiers, namely K-Nearest Neighbor, Support Vector Machine, Naive Bayes and Decision Tree, are used for the classification step, and their results are compared. Experimental results show higher classification accuracy with the proposed feature extraction technique than with the existing method. Among the classifiers, the decision tree gives the best accuracy: 0.9272 for positive sentiments, 0.8901 for negative sentiments and 0.9629 for neutral sentiments.
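
The feature extraction described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: it covers only the unigram/bigram/trigram lexicon, TF-IDF weighting and a K-Nearest Neighbor step, and it omits the SVD stage and the other three classifiers. Whitespace tokenization, the smoothed IDF formula (one of several common variants), cosine similarity as the neighbor metric, all function names and the three toy reviews are assumptions made for illustration.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all word n-grams of length 1..n_max, joined by spaces."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def fit_idf(docs, n_max=3):
    """Compute a smoothed IDF weight for every n-gram seen in docs."""
    term_lists = [ngrams(d.lower().split(), n_max) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(t for terms in term_lists for t in set(terms))
    n_docs = len(docs)
    return {t: math.log((1 + n_docs) / (1 + c)) + 1 for t, c in df.items()}


def tfidf_vector(doc, idf, n_max=3):
    """Sparse TF-IDF vector of one document; unseen n-grams are dropped."""
    terms = [t for t in ngrams(doc.lower().split(), n_max) if t in idf]
    counts = Counter(terms)
    total = len(terms) or 1
    return {t: (c / total) * idf[t] for t, c in counts.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def knn_predict(query, train_vecs, train_labels, k=1):
    """Majority label among the k most similar training vectors."""
    ranked = sorted(
        zip(train_vecs, train_labels),
        key=lambda pair: cosine(query, pair[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy reviews, invented for this sketch (not thesis data).
reviews = [
    "the movie was a grand visual treat",
    "the movie hurts public sentiments badly",
    "good visuals but the story hurts sentiments",
]
labels = ["positive", "negative", "neutral"]

idf = fit_idf(reviews)
train = [tfidf_vector(r, idf) for r in reviews]
prediction = knn_predict(tfidf_vector("a grand visual treat", idf), train, labels)
print(prediction)
```

On a real corpus the TF-IDF matrix would typically be reduced with SVD before classification, as the abstract describes; a numerical library would normally handle that step.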

Description

Master of Engineering (CSE)
