Please use this identifier to cite or link to this item:
http://hdl.handle.net/10266/6072
Title: | Fake Content Detection System for Multimodal Signals over Social Media |
Authors: | Kaur, Sawinder |
Supervisor: | Kumar, Parteek; Kumaraguru, Ponnurangam |
Keywords: | Fake;Social Media;deepfake;clickbait;content |
Issue Date: | 13-Jan-2021 |
Abstract: | Fake content can be easily created and spread through social media and web-based platforms, with widespread real-world impact. Novel digital technologies make it increasingly difficult to distinguish between real and fake media. One of the most recent developments contributing to the problem is the emergence of fake content in the form of images, videos, articles, posts and clickbait that is hyper-realistic and depicts things that never happened. Coupled with the reach and speed of social media, such fake content can quickly reach millions of people and have a negative impact on society. Developing efficient tools for early detection requires characterizing how fake data proliferates over social platforms and why it succeeds in deceiving readers. This research work describes the general process of fake content detection using various multimodal signals (articles, URLs, posts, images, videos) over social media platforms. It presents a literature review of fake content detection for textual and non-textual content. The current status of fake content over social media in the form of multimodal signals is classified into two categories (textual and non-textual). The evolution over time of fake content generation and of research studies, measured by publications, is analysed. Further, a review protocol is followed and presented, covering the selected sources of publications and the research papers retrieved on the basis of inclusion-exclusion criteria. This approach makes the findings available in a systematic way, helping researchers working in similar areas to select the most appropriate techniques to identify fake content for textual and non-textual datasets. 
This thesis presents a multi-level architecture for a fake news article detection system that uses three feature extraction techniques to identify the category of news articles with machine learning and deep learning techniques. To train the architecture, various benchmark datasets of fake and real articles were collected from the Reuters, News Trends and Kaggle websites. Different machine learning models and neural networks trained on the collected datasets are compared with the proposed model on the accuracy, recall, precision, F1-score and AUC-ROC performance metrics. The proposed multi-level voting model improves both the accuracy and the efficiency of the model. The experimental analysis shows that the proposed model outperforms traditional machine learning algorithms for fake news article classification. Further, to process both textual (clickbait headlines) and non-textual (headlines seen in posts in the form of images) datasets, a novel two-phase hybrid Convolutional Neural Network-Long Short-Term Memory Biterm approach is proposed for modelling short topic content. The hybrid model, when implemented with pre-trained GloVe embeddings, yields the best results on the accuracy, recall, precision and F1-score performance metrics. Eight types of clickbait (Reasoning, Number, Reaction, Revealing, Shocking/Unbelievable, Hypothesis/Guess, Questionable, Forward referencing) are classified in this work using the Biterm Topic Model. Also, a ground-truth dataset of non-textual (image-based) data from multiple social media platforms has been created using human annotations. The textual information is retrieved from the images with the help of a pre-processed OCR tool. A comparative study is performed to show the effectiveness of the proposed model. 
Deepfake enthusiasts have been using neural networks to produce convincing face swaps. The impact of deepfakes has been alarming, with politicians, senior corporate executives and world leaders being targeted by nefarious actors. To detect face-swapped deepfake video clips, a novel approach using temporal sequential frames is proposed. The approach extracts frames from the forged video at the first level and then applies a deep depth-based Convolutional Long Short-Term Memory model to identify the fake frames at the second level. The proposed model is also evaluated on a newly created ground-truth dataset of forged videos built from source and destination video frames of famous politicians. The new dataset was created with the DeepFaceLab tool using source and destination videos from YouTube. Experimental results demonstrate the effectiveness of the proposed model. This thesis also presents a Progressive Web Application developed for the proposed systems. The features of the proposed schemes that are embedded in the system are discussed. The web application promotes the exchange of information among its users. It helps users quickly detect different types of fake or unverified content seen in the form of quotes, videos, articles and clickbait over social media platforms. The goal of developing such an application is to reach social media users and give them the power to identify reliable, trustworthy and real content for social development. Inappropriate use of unreliable words can also be detected through the developed application. This system can also help research teams, end users and the general public in analysing fake content over social media platforms. |
URI: | http://hdl.handle.net/10266/6072 |
Appears in Collections: | Doctoral Theses@CSED |
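
The abstract describes combining classifiers by voting and comparing models on accuracy, precision, recall and F1-score, but it does not spell out the voting scheme. As a rough illustration only, the Python sketch below assumes plain majority (hard) voting over three hypothetical classifiers with made-up labels. The function names `majority_vote` and `binary_metrics` and all data are assumptions for this example and are not taken from the thesis.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by hard (majority) voting.

    predictions: list of equal-length label lists, one list per model.
    Assumption: an odd number of models and two labels, so no ties occur.
    """
    combined = []
    for labels in zip(*predictions):
        # most_common(1) returns [(label, count)] for the most frequent label
        combined.append(Counter(labels).most_common(1)[0][0])
    return combined


def binary_metrics(y_true, y_pred, positive="fake"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and per-model predictions (illustrative only).
y_true = ["fake", "real", "fake", "real", "fake"]
model_a = ["fake", "real", "real", "real", "fake"]
model_b = ["fake", "fake", "fake", "fake", "real"]
model_c = ["real", "real", "fake", "fake", "fake"]

ensemble = majority_vote([model_a, model_b, model_c])
scores = binary_metrics(y_true, ensemble)
print(ensemble)
print(scores)
```

In this toy data each individual model makes at least one mistake, while the vote is wrong only on the fourth item, which is the usual motivation for ensembling. A multi-level scheme such as the one the thesis proposes would presumably feed such combined outputs into further voting stages, but that detail is not stated in the abstract.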
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
updated thesis-11-1-21.pdf | Thesis Main File | 24.23 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.