Agent Based Efficient Consistency Control and Replica Management in Data Grids

Abstract

Data Grids provide services and infrastructure for distributed data-intensive applications that access massive, geographically distributed datasets. An important technique for speeding up access to files in data grids is replication, which creates multiple copies of a file. The work presented in this dissertation focuses on data grids, mainly on the challenges posed by data replication. The distinct problem areas addressed are efficient replica management and consistency management. To achieve these objectives, an extensive literature review on replica management has been carried out, and the state-of-the-art techniques for replica management in data grids have been explored. A comprehensive study of asynchronous replica management in data grids has been carried out to identify its inherent limitations. The literature survey shows that the biggest challenges confronting data grids relate to replica creation, placement, selection and consistency maintenance. To address the replica management challenge, an agent-based model, the "Replica Management Model" (RMM), has been proposed and implemented for efficient management of distributed files across the data grid. The model considers various characteristics of data grids, such as scalability, availability and dynamic nature. The model is layered, dividing the entire functionality across different layers, and the layered architecture incurs negligible communication overhead. The proposed hierarchical model is evaluated on two factors: (i) topology and (ii) availability. The performance analysis has been carried out in a simulated environment. A comparative analysis of the hierarchical approach against a centralized approach shows that the RMM model provides better scalability.
Moreover, increased availability of files increases the efficiency of the system. An agent-based Replica Creation and Placement (RCP) strategy has been proposed and implemented. The RCP strategy aims at faster access to files, efficient utilization of network resources and an optimal number of replicas. This popularity-driven dynamic RCP strategy for the hierarchically structured RMM model balances file access latency against storage space utilization. RCP dynamically adapts the frequency and degree of replication on the basis of parameters such as average access frequency, available storage capacity, replication cost and placement cost. Simulation results show that the effectiveness of the RCP strategy depends on the file access pattern, the scheduling strategy and the number of jobs. Creating an optimal number of replicas enhances the usefulness of the system, and RCP utilizes network resources efficiently. Finally, the performance of the proposed strategy has been compared with a number of relevant strategies from the literature, and the comparison demonstrates its effectiveness. To enhance the utility of the RMM model, a replica selection strategy, Efficient Dynamic Replication using Agent (EDRA), has been proposed and implemented. When a user submits a job to the grid, selecting the best replica for executing the job is a crucial decision. The EDRA strategy finds the best replica from among the pool of replicas. EDRA is applied at two levels, region and sub-region. At the first level, the region-level optimizer allocates jobs to a particular region; at the second level, the sub-region optimizer allocates jobs to a particular node. Dynamic mapping based on the access frequencies and availability of files is employed to maximize the hit ratio.
The effectiveness of this approach is evaluated in OptorSim with varying numbers of jobs and different scenarios. Comparative analysis with existing replica selection strategies shows the efficiency of the EDRA strategy in terms of execution time, storage utilization, computing utilization, network resource usage, hit ratio and transfer time. Lastly, the Replica Consistency and Conflict Resolution (RCCR) strategy is implemented to handle writable replicas. It is a hybrid strategy that takes advantage of both pessimistic and optimistic approaches. In the RCCR strategy, replica consistency is handled at two levels, local and global. Local consistency is attained by using a pessimistic approach at the region level. At the global level, an optimistic approach is used to obtain consistency and resolve conflicts between two different networks. Simulation results show that the RCCR strategy handles write requests efficiently compared with existing approaches.

Description

Ph. D. Thesis
