Please use this identifier to cite or link to this item:
Title: An Intelligent Energy Aware Approach for Big Data Storage in Cloud Data Centers
Authors: Arora, Sumedha
Supervisor: Bala, Anju
Keywords: Energy, Power, Cloud, Big Data, Prediction
Issue Date: 3-Dec-2020
Abstract: Advances in current technology have led to a rapid rise in big data applications such as e-commerce, scientific computing, and healthcare. These applications require enormous computing capabilities, including high-end infrastructure, platforms, and software. Cloud data centers provide these facilities on a pay-as-you-go model, yet raise several challenges, including energy efficiency, scalability, privacy, and storage. Among these issues, energy efficiency has become a growing challenge for executing big data applications in a cloud environment. Energy has become a critical resource in modern computing systems, which presents challenges to traditional storage systems. The energy consumed by the storage subsystem surpasses that of all other subcomponents in the server, and the disks in high-end servers are responsible for this high power consumption. Hence, a prediction-based energy-aware approach is required to place data efficiently among the disks so that disks can be powered down for long durations. Prediction also helps in identifying which data objects need to be replicated. Therefore, integrating data prediction with data placement and disk scheduling provides an optimal solution for reducing energy and time consumption. To achieve the set objectives, an extensive literature survey of existing data prediction models and energy-efficient storage techniques has been carried out. Previous research, however, does not cover all of these aspects together: data prediction, data placement including replica management, and disk scheduling for big data storage. Therefore, in this work, an intelligent energy-aware approach is proposed to reduce storage energy consumption in the cloud environment. Firstly, a storage prediction model has been proposed that generates and customizes SQL traces to find the frequency of each query fired on the real data streams obtained from the SCATS sensors of Dublin city.
Based on the calculated frequency, the future frequency of each query has been predicted using an ensemble approach. The predicted results have been tagged and classified as popular or unpopular data based on a threshold frequency. The experimental results are validated in terms of accuracy, recall, precision, error rate, and F-score. The model yields 87.5% accuracy and reduces the error rate to 11%. The highest precision achieved with the proposed model is 89%, with 87% recall. The ROC value of 0.93 indicates that the proposed storage prediction model performs well. Next, an intelligent energy-aware approach has been proposed that uses the prediction results to place the predicted popular data on hot disks using replication. Hot disks are the set of disks that remain active most of the time. Unpopular data, in contrast, is allocated to cold disks, which usually remain in the standby state. When the user submits a request, an intelligent disk scheduling technique is applied to search for and select the most available disk to execute the request. Replication allows the scheduler to select the disk that would execute the request with minimum energy and time. The disk selection is based on the maximum remaining time to move to the idle state and the minimum waiting time for a disk in the active state. The disk identified in this way saves the maximum number of disk spins. A standby disk is not given a request as long as the request can be satisfied by an idle or active disk. The reduction in energy and seek time has been measured in a real-world environment using a multimeter and a clamp meter, showing a 9.7% decrease in the total execution time. Finally, the entire intelligent energy-aware approach has also been validated in a cloud environment. The performance has been evaluated on OLTP (e.g., e-commerce) applications benchmarked with financial and web-search input-output traces.
Based on the experiments performed, the optimized replica and the best disk ratio are selected for each application, namely those that consume the least energy and time to execute the request. The experimental results are compared with the reference benchmarks and existing literature. The proposed approach outperforms the existing approach, with a 6.8% improvement in the accuracy yielded by the storage prediction model. In addition, the intelligent energy-aware approach achieves a 6% reduction in energy consumption along with an 18.26% improvement in the total execution time.
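Illustrative sketch (not taken from the thesis): the abstract describes two decision rules, tagging data as popular or unpopular against a threshold frequency, and choosing among replica disks so that standby disks are used only as a last resort. The Python below is one possible reading of those rules. The names `Disk`, `tag_popularity`, and `select_disk` are hypothetical, as are the field names and the sample values. The ordering used here (active disks ranked by minimum waiting time, then idle disks ranked by maximum remaining time before their next power-state transition, then standby disks) is an assumption; the abstract states the criteria but not how they are combined.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Power states named in the abstract.
ACTIVE, IDLE, STANDBY = "active", "idle", "standby"


@dataclass
class Disk:
    """Hypothetical disk record; the field names are assumptions."""
    name: str
    state: str
    waiting_time: float = 0.0    # queued work in seconds (used for active disks)
    remaining_time: float = 0.0  # seconds left before the next power-state change


def tag_popularity(predicted_freq: Dict[str, float], threshold: float) -> Dict[str, str]:
    """Tag each query's data as popular or unpopular by predicted frequency."""
    return {
        query: "popular" if freq >= threshold else "unpopular"
        for query, freq in predicted_freq.items()
    }


def select_disk(replicas: List[Disk]) -> Optional[Disk]:
    """Pick a disk holding a replica, avoiding standby disks when possible.

    Assumed priority: active (minimum waiting time), then idle (maximum
    remaining time), then standby only if nothing else can serve the request.
    """
    active = [d for d in replicas if d.state == ACTIVE]
    if active:
        return min(active, key=lambda d: d.waiting_time)
    idle = [d for d in replicas if d.state == IDLE]
    if idle:
        return max(idle, key=lambda d: d.remaining_time)
    standby = [d for d in replicas if d.state == STANDBY]
    return standby[0] if standby else None


# Made-up sample values, for illustration only.
tags = tag_popularity({"q1": 120.0, "q2": 3.0}, threshold=50.0)
chosen = select_disk([
    Disk("d1", ACTIVE, waiting_time=4.0),
    Disk("d2", ACTIVE, waiting_time=1.5),
    Disk("d3", STANDBY),
])
```

With these sample values, `q1` is tagged popular and `q2` unpopular, and the request goes to `d2`, the active disk with the shortest wait, leaving `d3` in standby.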
Appears in Collections: Doctoral Theses@CSED

Files in This Item:
File: Thesis.pdf
Size: 10.03 MB
Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.