Title: Reliable Execution and Deployment of Workflows in Cloud Computing
Authors: Bharti, Monika
Supervisor: Bala, Anju
Keywords: Workflows; Cloud computing; Workflow Management System; Workflow Engines
Issue Date: 17-Jul-2012
Abstract: Cloud computing is a paradigm that provides on-demand service resources such as software, hardware, platform, and infrastructure. Owing to its cost-effectiveness, on-demand provisioning, ease of sharing, scalability, and reliability, cloud computing has grown in popularity with the research community for deploying scientific applications such as workflows. The system underlying workflows is the Workflow Management System, which provides end users with the required data and the appropriate application programs for their tasks. Workflow management is a fast-evolving technology that is increasingly being exploited by businesses in a variety of industries. Its primary characteristic is the automation of processes involving combinations of human and machine-based activities, particularly those involving interaction with IT applications and tools. The main purpose of a workflow management system (WfMS) is to support the definition, execution, registration, and control of business processes. Workflow scheduling, one of the key issues in workflow management, maps and manages the execution of inter-dependent tasks on distributed resources. It allocates suitable resources to workflow tasks so that execution completes in a way that satisfies the objective functions imposed by users. Proper scheduling can have a significant impact on system performance. Fault tolerance is another major concern, needed to guarantee the availability and reliability of critical services as well as application execution. To minimize the impact of failures on the system and on application execution, failures should be anticipated and handled proactively. There is therefore a need for reliable execution of workflows in the cloud environment. Because a workflow involves a large amount of data and many compute nodes to process it, execution time also needs to be minimized.
Therefore, a tool is required to design and specify task parallelism and task dependency ordering, coordinate the execution of tasks over large computing resources, and facilitate workflow reuse over multiple datasets. The thesis discusses various tools for generating workflows and compares them on the basis of operating system, databases, architecture, and environment. Pegasus bridges the scientific domain and the execution environment by automatically mapping high-level workflow descriptions onto distributed resources. It automatically locates the input data and computational resources necessary for workflow execution. The Condor scheduler manages individual workflow tasks and supervises their execution on local and remote resources. Accordingly, a workflow application is designed and implemented with Pegasus. The designed workflow is analyzed, monitored, and scheduled, and can then be deployed on Nimbus. A workflow application is also designed using Oozie, and the generated workflow is implemented in a Hadoop environment. Hadoop is an open-source, Java-based software platform developed by the Apache Software Foundation. It makes it easy to write and run applications on clusters that process huge amounts of data. Finally, the workflows designed in Oozie and Pegasus are compared on the basis of mapper, environment, scheduling, and reliability, with compatible cloud environments such as Nimbus and Hadoop.
Appears in Collections: Masters Theses@CSED

Files in This Item:
File      Description  Size     Format
1761.pdf               4.03 MB  Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.