Classification of Opinion on Movie Reviews by using Classifiers with 3-gram TF-IDF and SVD Features

dc.contributor.author: Shveta
dc.contributor.supervisor: Loura, Ajay Kumar
dc.date.accessioned: 2018-08-07T11:37:41Z
dc.date.available: 2018-08-07T11:37:41Z
dc.date.issued: 2018-08-07
dc.description: Master of Engineering - CSE [en_US]
dc.description.abstract: Feature extraction plays an important role in sentiment analysis (opinion mining) of issues, customer reviews, products and similar subjects, where the extracted features are fed to machine learning methods that classify the sentiments. Existing techniques widely use TF-IDF features extracted from the unigram lexicon of the sentiment documents; some use the term-frequency scores of unigram words as features. In this work, unigrams, two-word clusters (bigrams) and three-word clusters (trigrams) are generated after filtering the collected sentiment data. The data are user reviews collected manually from various websites (www.imdb.com, bookmyshow.com, Google user reviews) about the controversy surrounding the Bollywood movie Padmaavat, in which three different sentiments were found. Many people were positive about the issue and the release of the movie, while some were against the movie; the latter are taken as negative reviews. A small number of reviews were neutral, showing sentiments both in favor of and against the movie. Hence three categories of reviews are labeled and fed to the proposed opinion mining system. The unigram, bigram and trigram lexicons are then used to compute the TF-IDF of all the reviews, after which singular value decomposition (SVD) features are generated. Four machine learning classifiers, K-Nearest Neighbor, Support Vector Machine, Naive-Bayes and Decision Tree, are used for the classification step, and their results are compared. Experimental results show higher classification accuracy when the proposed feature extraction techniques are used than with the existing method. Among the classifiers, the decision tree gives better accuracy in classifying sentiments than all the others: 0.9272 for positive sentiments, 0.8901 for negative sentiments and 0.9629 for neutral sentiments. [en_US]
dc.identifier.uri: http://hdl.handle.net/10266/5175
dc.language.iso: en [en_US]
dc.subject: Opinion [en_US]
dc.subject: Uni-gram [en_US]
dc.subject: bi-gram, tri-gram [en_US]
dc.subject: TF-IDF [en_US]
dc.subject: SVD, Classification [en_US]
dc.subject: Decision Tree [en_US]
dc.subject: KNN [en_US]
dc.subject: Naive-Bayes [en_US]
dc.subject: SVM [en_US]
dc.title: Classification of Opinion on Movie Reviews by using Classifiers with 3-gram TF-IDF and SVD Features [en_US]
dc.type: Thesis [en_US]
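
The feature-extraction step described in the abstract (unigram, bigram and trigram lexicons followed by TF-IDF) can be illustrated with a minimal sketch. This is not the thesis code: the function names `ngrams` and `tfidf`, the whitespace tokenizer, the weighting formula (tf = count / terms in document, idf = log(N / df)) and the toy reviews are all assumptions made for illustration. The SVD and classifier stages are not shown.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=3):
    """Return all 1..n_max-grams of a token list as space-joined strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def tfidf(docs, n_max=3):
    """Compute one {term: weight} dict per document over 1..n_max-gram terms.

    Assumed weighting: tf = count / number of terms in the document,
    idf = log(N / df). The thesis may use a different variant.
    """
    term_lists = [ngrams(doc.lower().split(), n_max) for doc in docs]
    # Document frequency: in how many documents each term occurs.
    df = Counter(term for terms in term_lists for term in set(terms))
    n_docs = len(docs)
    weights = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms)
        weights.append(
            {t: (c / total) * math.log(n_docs / df[t]) for t, c in counts.items()}
        )
    return weights


# Hypothetical toy reviews, not taken from the thesis data set.
reviews = [
    "great movie great visuals",
    "boring movie weak story",
    "great story but boring pacing",
]
vectors = tfidf(reviews)
```

In a pipeline like the one the abstract outlines, these sparse term weights would be arranged into a document-term matrix, reduced with SVD, and passed to the classifiers.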

Files

Original bundle

Name: thesis submission_PDF_Shveta_ME_CSE_801632046.pdf
Size: 3 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 2.03 KB
Format: Item-specific license agreed upon to submission