Improved Data Structures for Dynamic and Massive Data
Abstract
For many real-world problems, traditional algorithms are theoretically optimal, yet their efficiency can degrade drastically on real-world data, depending on the nature of that data. For massive data stored in external memory, algorithms designed for internal memory, however efficient otherwise, fail due to the I/O bottleneck. Moreover, in most real-world scenarios data is dynamic, i.e., it changes over time. Novel and very different design techniques are needed to manage such dynamic and massive real-world data, and this thesis contributes to that vast body of research. In this era of information explosion, the flood of data emerging on a daily basis has made data handling and management a crucial task. In this study, we look at data from two different perspectives: massive data and dynamic data.
For dynamic data, we consider an implementation of retroactive data structures for a specific application and then use the proposed implementation to analyse its performance. Dynamic shortest-path algorithms modify existing shortest paths in response to changes in the underlying graph. The basic approach in the literature is to dynamize Dijkstra's algorithm, the widely used algorithm for the static case. Different solutions already exist for this problem, but we use the concept of retroactive data structures to incorporate the dynamic changes into the solution. In this work, algorithms for the different operations of a retroactive priority queue are given. We also propose an algorithm, dynamic Dijkstra, for the dynamic shortest-path problem, which uses the retroactive priority queue for its dynamization. The proposed approach is a suitable solution for the dynamic shortest-path problem.
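To illustrate the retroactivity idea the paragraph above relies on, here is a minimal sketch of a partially retroactive priority queue. It is not the thesis's construction: it naively replays the whole timestamped operation timeline on every query (so queries cost O(m log m)), whereas efficient retroactive priority queues support polylogarithmic updates. The class name and method names are illustrative assumptions.

```python
import heapq

class RetroactivePriorityQueue:
    """Naive partially retroactive priority queue (illustrative sketch).

    Operations carry timestamps and may be added retroactively, i.e.
    at times earlier than operations already recorded. A query replays
    the whole timeline in time order, so later operations feel the
    effect of a retroactive change.
    """

    def __init__(self):
        self.ops = []  # list of (time, kind, value)

    def insert(self, t, value):
        # record Insert(t, value) anywhere in the timeline
        self.ops.append((t, "insert", value))

    def delete_min(self, t):
        # record Delete-min(t); which key it removes depends on
        # every operation with an earlier timestamp
        self.ops.append((t, "delete-min", None))

    def current_contents(self):
        # replay the timeline in time order on a fresh binary heap
        heap = []
        for _, kind, value in sorted(self.ops):
            if kind == "insert":
                heapq.heappush(heap, value)
            elif heap:
                heapq.heappop(heap)
        return sorted(heap)

pq = RetroactivePriorityQueue()
pq.insert(1, 5)
pq.insert(3, 7)
pq.delete_min(4)   # with the timeline so far, this removes 5
pq.insert(2, 2)    # retroactive insert *before* the delete-min
print(pq.current_contents())  # the delete-min now removes 2 -> [5, 7]
```

The retroactive insert at time 2 changes which key the later delete-min removes; this is precisely the behaviour a dynamized Dijkstra needs when an edge update alters tentative distances that were enqueued in the past.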
For massive data, we extend the novel approach of incremental suffix array construction using Lyndon factorization to the construction of the extended suffix array, i.e., the suffix array together with the corresponding longest common prefix (LCP) array. The main motivation for constructing the suffix array and the LCP array incrementally and simultaneously is that both involve computing order information from the common prefixes of the suffixes. Using Lyndon factorization for suffix array construction reduces the merging overhead in both time and space, because extending the sorting from the local suffixes of Lyndon factors to the global suffixes of merged Lyndon factors requires no additional information. Moreover, the two sorted orders coincide, making the merging of Lyndon factors a simple merge of two sorted lists of suffixes. Incremental LCP construction likewise saves a great deal of computation and hence time. The proposed approach has quadratic running time and an O(n) disk working space requirement. Experiments also show the performance gain of our approach over the existing incremental construction method in terms of time.
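For readers unfamiliar with the factorization the paragraph above builds on: every string splits uniquely into a non-increasing sequence of Lyndon words (strings strictly smaller than all their proper rotations), and Duval's algorithm computes this in linear time. The sketch below shows only this standard factorization step, not the thesis's suffix array or LCP construction.

```python
def lyndon_factorization(s):
    """Duval's algorithm: split s into its unique non-increasing
    sequence of Lyndon words in O(n) time and O(1) extra space."""
    factors = []
    i, n = 0, len(s)
    while i < n:
        # scan forward comparing s[k..] against s[j..] to find the
        # longest prefix of s[i..] that is a power of a Lyndon word
        j, k = i + 1, i
        while j < n and s[k] <= s[j]:
            if s[k] < s[j]:
                k = i      # prefix s[i..j] is still a single Lyndon word
            else:
                k += 1     # s[i..j] is periodic: repeats the current factor
            j += 1
        while i <= k:
            # emit one copy of the Lyndon word of length j - k
            factors.append(s[i:i + j - k])
            i += j - k
    return factors

print(lyndon_factorization("banana"))  # ['b', 'an', 'an', 'a']
```

Concatenating the factors recovers the input, and each factor is lexicographically no larger than its predecessor; it is this ordering property that lets suffixes sorted locally within factors be merged into the global suffix order without extra bookkeeping.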
