Improved Data Structures for Dynamic and Massive Data


Abstract

For many real-world problems, traditional algorithms are theoretically optimal, yet their efficiency can degrade drastically on real-world data. For massive data stored in external memory, algorithms designed for internal memory, however efficient otherwise, fail due to the I/O bottleneck. Moreover, in most real-world scenarios the data is dynamic, i.e. it changes over time. Novel and very different design techniques are needed to manage such dynamic and massive real-world data, and this thesis contributes to that line of research. In this era of information explosion, the flood of data emerging daily has made data handling and management a crucial task. In this study, we look at such data from two perspectives: massive data and dynamic data.

For dynamic data, we consider an application-specific implementation of retroactive data structures and use the proposed implementation to analyse its performance. Dynamic shortest-path algorithms update existing shortest paths as the underlying graph changes. The basic approach used in the literature is to dynamize Dijkstra's algorithm, the widely used algorithm for the static case. Several solutions to the problem already exist, but we use the concept of retroactive data structures to incorporate the dynamic changes into the solution. This work gives algorithms for the operations of a retroactive priority queue and proposes a dynamic Dijkstra algorithm that uses the retroactive priority queue for its dynamization. The proposed approach is a suitable solution to the dynamic shortest-path problem.
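To illustrate the semantics of a retroactive priority queue — updates inserted at arbitrary past times, with queries reflecting the revised history — here is a minimal, naive sketch. The class and method names are illustrative assumptions, not the thesis's actual interface, and queries here replay the whole timeline in O(m log m) rather than achieving the polylogarithmic bounds of efficient retroactive structures:

```python
import heapq
from bisect import insort


class RetroactivePriorityQueue:
    """Naive partially retroactive priority queue (illustrative sketch).

    Updates -- insert(x) and delete-min -- may be added at arbitrary past
    times; a query replays the whole timeline, costing O(m log m) for m
    recorded operations. This only demonstrates the semantics, not an
    efficient implementation.
    """

    def __init__(self):
        self._ops = []   # timeline entries: (time, seq, op, value)
        self._seq = 0    # tiebreaker for operations at the same time

    def add_insert(self, t, value):
        """Retroactively perform insert(value) at time t."""
        insort(self._ops, (t, self._seq, "insert", value))
        self._seq += 1

    def add_delete_min(self, t):
        """Retroactively perform delete-min at time t."""
        insort(self._ops, (t, self._seq, "delete-min", None))
        self._seq += 1

    def contents_now(self):
        """Replay the timeline in time order; return current contents, sorted."""
        heap = []
        for _, _, op, value in self._ops:
            if op == "insert":
                heapq.heappush(heap, value)
            elif heap:  # delete-min on an empty queue is treated as a no-op
                heapq.heappop(heap)
        return sorted(heap)
```

The retroactive behaviour shows up when an update is added in the past: inserting a key at an earlier time can change which element a later delete-min removes, and `contents_now()` reflects that revised history.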
For massive data, we extend the novel approach of incremental suffix array construction using Lyndon factorization to the construction of the extended suffix array, i.e. the suffix array together with the corresponding longest common prefix (LCP) array. The main motivation for constructing the suffix array and the LCP array incrementally and simultaneously is that both compute order information from the common prefixes of the suffixes. Using Lyndon factorization for suffix array construction reduces the merging overhead in both time and space, because extending the sorted order from the local suffixes of the Lyndon factors to the global suffixes of the merged factors requires no additional information: the two sorted orders coincide, so merging Lyndon factors reduces to a simple merge of two sorted lists of suffixes. Incremental LCP construction likewise saves substantial computation and hence time. The proposed approach has quadratic running time and O(n) disk working space. Experiments also show the performance gain of our approach in terms of time over the existing method of incremental construction.
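The Lyndon factorization underlying this construction splits a string into a non-increasing sequence of Lyndon words and can be computed in linear time with Duval's algorithm. A minimal sketch of the standard algorithm (not the thesis's code):

```python
def lyndon_factorization(s):
    """Duval's algorithm: factor s into a non-increasing sequence of
    Lyndon words in O(n) time and O(1) extra space."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        # Scan forward while the current window remains a (pre-)Lyndon word.
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # strictly larger: still a single Lyndon word
            else:
                k += 1     # equal: a periodic repetition of a Lyndon word
            j += 1
        # Emit each full period of length j - k as one factor.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors
```

For example, `lyndon_factorization("banana")` yields `["b", "an", "an", "a"]`, a non-increasing sequence of Lyndon words whose concatenation is the original string.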
