Title: An Efficient Approach for Transformation and Analysis of Streaming Data
Authors: Arora, Shruti
Supervisor: Rani, Rinkle;Saxena, Nitin
Keywords: Event Detection;Concept drift detection;Machine Learning;Streaming Data;Apache Spark
Issue Date: 5-Dec-2023
Abstract: The era of digitalization has ushered in an unprecedented deluge of data generated from various sources, leading to the emergence of data streams as a critical paradigm in data analysis. This PhD thesis delves into the intricate domain of transforming and analyzing data streams, addressing the challenges and opportunities presented by their high-velocity, dynamic, and often unbounded nature. The primary objective of this research is to develop efficient methodologies for the transformation and analysis of data streams, catering to the unique characteristics and demands of real-time data processing. This study encompasses a comprehensive review of existing techniques, algorithms, and tools pertinent to data stream processing, while also identifying the gaps and limitations that need to be addressed. One of the major contributions of this thesis is the development of adaptive and scalable data stream transformation techniques. By integrating concepts from machine learning and statistical modeling, these techniques facilitate the automatic adaptation of the transformation process to the evolving nature of data streams. This adaptability not only enhances the accuracy of downstream analysis but also ensures the robustness of the transformation pipeline in handling dynamic data distributions. Furthermore, this research explores advanced methodologies for the analysis of transformed data streams. The development of techniques for real-time analysis, capable of handling high-velocity data streams while providing timely insights, is a key focus. These methods incorporate concepts from data mining, pattern recognition, and event detection to enable the extraction of valuable patterns, trends, and outliers from rapidly evolving data. In addition to the technical contributions, this thesis also explores the practical implementation and deployment aspects of the proposed techniques.
A framework for building scalable and resilient data stream processing pipelines is proposed, considering factors such as computational efficiency, fault tolerance, and resource optimization. Moreover, the integration of visualization techniques aids in the effective communication of insights derived from the analyzed data streams. To validate the effectiveness of the proposed methodologies, a series of experimental evaluations is conducted using real-world data streams from the IoT and social media domains. In addition to real-world data streams, synthetic data streams are generated using the open-source tool MOA to validate the proposed techniques. The results demonstrate the superiority of the developed techniques in terms of accuracy, efficiency, and adaptability in comparison to existing similar approaches. In conclusion, this PhD thesis advances the field of data stream processing by addressing the challenges associated with transforming and analyzing high-velocity data streams. The developed methodologies not only enhance the quality of insights extracted from dynamic data but also lay the foundation for more informed decision-making in various application domains. This research opens avenues for future work in the areas of adaptive analytics, real-time visualization, and integration of emerging technologies into data stream processing pipelines.
Appears in Collections: Doctoral Theses@CSED

Files in This Item:
File                 Description     Size       Format
Shruti Thesis.pdf    Ph.D. Thesis    4.22 MB    Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.