An Intelligent Energy Aware Approach for Big Data Storage in Cloud Data Centers
Abstract
Advances in current technology have led to a rapid rise in big data applications
such as e-commerce, scientific computing, and healthcare. These applications require enormous
computing capabilities, including high-end infrastructure, platforms, and software.
Cloud data centers provide these facilities on a pay-as-you-go model, yet they raise several
challenges, including energy efficiency, scalability, privacy, and storage. Among
these issues, energy efficiency has become an emerging challenge for executing
big data applications in a cloud environment.
Energy has become a critical resource in modern computing systems, which presents
challenges to traditional storage systems. The energy consumed by the storage subsystem
surpasses that of all other sub-components in the server. The disks in high-end
servers are responsible for this high power consumption. Hence, a prediction-based
energy-aware approach is required for efficient data placement among the disks, so that
disks can be powered down for long durations. Prediction also helps in identifying which data
objects need to be replicated. Therefore, integrating data prediction with placement
and disk scheduling provides an optimal solution to reduce energy and time
consumption.
To achieve these objectives, an extensive literature survey of existing data prediction
models and energy-efficient storage techniques has been carried out. Previous research, however,
does not cover all aspects together: data prediction, data placement including replica
management, and disk scheduling for big data storage. Therefore, this work proposes an
intelligent energy-aware approach to reduce storage energy consumption in the
cloud environment.
First, a storage prediction model is proposed that generates and customizes
SQL traces to find the frequency of each query fired on real data streams obtained from
the SCATS sensors of Dublin city. Based on the calculated frequency, the future
frequency of each query is predicted using an ensemble approach. The predicted
results are tagged and classified as popular or unpopular data based on a threshold
frequency. The experimental results are validated in terms of accuracy, recall, precision,
error rate, and F-score. The model yields 87.5% accuracy and reduces the error rate to
11%. The highest precision achieved with the proposed model is 89%, with
87% recall. An ROC value of 0.93 indicates the strong capability of the proposed storage
prediction model.
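The tagging and validation step can be illustrated with a minimal sketch. It assumes that a query is tagged popular when its predicted frequency is at or above the threshold (the abstract does not say whether the comparison is inclusive). The function names, query identifiers, frequencies, and threshold are hypothetical toy values, not data from this work.

```python
def tag_by_threshold(predicted_freq, threshold):
    """Label each query 'popular' if its predicted frequency meets the threshold.

    Assumption: the comparison is inclusive (>=).
    """
    return {
        query: "popular" if freq >= threshold else "unpopular"
        for query, freq in predicted_freq.items()
    }


def binary_metrics(actual, predicted, positive="popular"):
    """Compute accuracy, error rate, precision, recall and F-score from two label dicts."""
    tp = fp = tn = fn = 0
    for key, truth in actual.items():
        guess = predicted[key]
        if guess == positive and truth == positive:
            tp += 1
        elif guess == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "error_rate": 1.0 - accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Hypothetical predicted query frequencies and observed labels (toy data).
predicted_freq = {"q1": 120, "q2": 15, "q3": 80, "q4": 5}
observed = {"q1": "popular", "q2": "popular", "q3": "popular", "q4": "unpopular"}

tags = tag_by_threshold(predicted_freq, threshold=50)
metrics = binary_metrics(observed, tags)
```

In this sketch the error rate is defined as one minus accuracy, which is a common convention and not necessarily the definition used in the reported results.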
Next, an intelligent energy-aware approach is proposed that uses the
prediction results to place the predicted popular data on hot disks using replication. Hot
disks are the set of disks that remain active most of the time, whereas unpopular data
is allocated to cold disks that usually remain in the standby state. When a user submits
a request, an intelligent disk scheduling technique searches for and
selects the most available disk to execute the request. Replication allows
the scheduler to select the disk that would execute the request with minimum energy
and time. Disk selection is based on the maximum remaining time before moving to the idle
state and the minimum waiting time for a disk in the active state. The disk identified in
this way saves the maximum number of disk spins. A standby disk is not given any request as
long as the request can be satisfied by an idle or active disk. The reduction in energy and
seek time has been measured in a real-world environment using a multimeter and a clamp
meter, showing a 9.7% decrease in the total execution time.
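One possible reading of the disk selection rule is sketched below. It is not the scheduler from this work. The sketch assumes that active disks are preferred over idle ones, that ties among active disks are broken by minimum waiting time and then by maximum remaining time before going idle, and that a standby disk is chosen only when no spinning disk holds a replica. The `Disk` fields, disk names, and numbers are hypothetical.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    ACTIVE = 0
    IDLE = 1
    STANDBY = 2


@dataclass
class Disk:
    name: str
    state: State
    waiting_time: float = 0.0   # queued work ahead of a new request (assumed: seconds)
    time_to_idle: float = 0.0   # remaining time before the disk drops to the idle state
    objects: set = field(default_factory=set)  # data objects (replicas) stored here


def select_disk(disks, obj):
    """Pick a disk holding a replica of obj, avoiding standby disks when possible."""
    candidates = [d for d in disks if obj in d.objects]
    if not candidates:
        return None
    # A standby disk is used only if no active or idle disk can serve the request.
    spinning = [d for d in candidates if d.state is not State.STANDBY]
    pool = spinning or candidates
    # Order: active before idle, then least waiting time, then longest time before idle.
    return min(pool, key=lambda d: (d.state.value, d.waiting_time, -d.time_to_idle))


# Hypothetical layout: popular object "a" is replicated on hot disks,
# unpopular object "c" lives only on a cold (standby) disk.
disks = [
    Disk("hot-1", State.ACTIVE, waiting_time=0.4, time_to_idle=5.0, objects={"a"}),
    Disk("hot-2", State.ACTIVE, waiting_time=0.1, time_to_idle=2.0, objects={"a"}),
    Disk("hot-3", State.IDLE, objects={"a", "b"}),
    Disk("cold-1", State.STANDBY, objects={"a", "b", "c"}),
]

chosen_for_a = select_disk(disks, "a")
chosen_for_b = select_disk(disks, "b")
chosen_for_c = select_disk(disks, "c")
```

Because the popular object is replicated, the scheduler has several spinning candidates and never needs to spin up the cold disk for it; only the request for the unreplicated unpopular object falls through to the standby disk.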
Finally, the complete intelligent energy-aware approach has also been validated
in a cloud environment. Performance has been evaluated on OLTP (e.g., e-commerce)
applications benchmarked with financial and web search input-output traces. Based on
the experiments, the optimized replica and best disk ratio for each application are
selected, namely those that consume the least energy and time to execute a request. The
experimental results are compared with reference benchmarks and existing literature. The
proposed approach outperforms the existing approach, with a 6.8% improvement in the accuracy
yielded by the storage prediction model. In addition, a 6% reduction in energy consumption
is observed, along with an 18.26% improvement in total execution time, using the intelligent
energy-aware approach.
