A Proactive Fault Tolerant Approach for Grid Environment

Abstract

Grid technology has emerged as a new approach to large-scale, high-performance distributed computing. Grid Computing tries to bring under one definitional umbrella all the work being done in the high-performance, cluster, peer-to-peer, and Internet computing arenas. Grid Computing solutions are constructed using a variety of technologies and open standards. Grid Computing, in turn, provides highly scalable, highly secure, and extremely high-performance mechanisms for discovering and negotiating access to remote computing resources in a seamless manner. This makes it possible to share computing resources, on an unprecedented scale, among an effectively unlimited number of geographically distributed groups. The increasing complexity of Grid services and systems demands correspondingly greater human effort for system configuration and performance management. These management tasks, which are mostly performed manually today, become time-consuming, error-prone, and even unmanageable for human administrators. Fault tolerance is the major area of concern in the self-management of the Grid. The terms self-management and self-healing are borrowed from Autonomic Computing, which helps to address complexity by using technology to manage technology. This work discusses a system for the coordinated, autonomic management (self-healing) of multiple clusters in a Grid. The clusters are integrated into the Grid, making the Grid fault tolerant. Existing fault tolerance mechanisms in the Grid use reactive approaches. The focus of the thesis is to design a prototype for proactive fault tolerance in a Grid environment.

Description

ME(SE)
