Ping Xie, Jianzhong Huang, Xiao Qin, Changsheng Xie, SmartRec: Fast Recovery from Single Failures in Heterogeneous RAID-Coded Storage Systems, The Computer Journal, Volume 61, Issue 6, June 2018, Pages 896–911, https://doi.org/10.1093/comjnl/bxx106
Abstract
Reconstruction I/Os commonly encounter workload fluctuation in heterogeneous RAID-coded storage systems. This paper proposes a heterogeneity-aware single-failure recovery scheme, SmartRec, for RAIDs that tolerate double and multiple disk failures. We start this study by formulating the data recovery problem of single-disk failures in the form of an optimization function in the context of online and heterogeneous disk arrays. To take into account both the static heterogeneity associated with disk configurations and the dynamic heterogeneity caused by I/O loads, SmartRec periodically selects an appropriate reconstruction solution according to up-to-date disk utilization. The appropriate reconstruction solution specifies the amount of data to be retrieved from each surviving disk and is expected to achieve minimal recovery time, which is determined by both the candidate reconstruction sequences and the reconstruction I/O capability of the surviving disks. We build a response-time model in SmartRec to measure the reconstruction I/O capability of surviving disks during a recovery process. To quantitatively compare the SmartRec scheme against three alternatives (i.e. ConRec, MinRec and BalRec), we build four analytical models and validate their correctness using empirical evaluations. We implement the four reconstruction schemes in a heterogeneous RAID and carry out comparative online reconstruction tests by replaying real-world workloads under various configurations. The experimental results illustrate that our SmartRec scheme outperforms the three existing reconstruction schemes in terms of reconstruction time by up to 35.3%, with an average improvement of 25.8%.
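The selection step described in the abstract can be illustrated with a minimal sketch. This is not the paper's response-time model: it assumes, purely for illustration, that each surviving disk's reconstruction capability is its nominal bandwidth scaled by the fraction of time it is not busy with foreground I/O, and that recovery time is set by the slowest disk. The function names, the dictionary layout, and the bandwidth and utilization inputs are all hypothetical.

```python
from typing import Dict, List

# A candidate solution maps each surviving disk to the amount of data (MB)
# it must supply. This representation is an assumption for illustration.
Solution = Dict[str, float]


def estimated_recovery_time(reads: Solution,
                            bandwidth: Dict[str, float],
                            utilization: Dict[str, float]) -> float:
    """Estimate recovery time as the time taken by the slowest surviving disk.

    Assumed capability model (not taken from the paper): a disk with nominal
    bandwidth B (MB/s) and utilization u offers B * (1 - u) to reconstruction.
    """
    worst = 0.0
    for disk, amount in reads.items():
        if amount == 0:
            continue  # a disk that supplies nothing cannot delay recovery
        spare = bandwidth[disk] * (1.0 - utilization[disk])
        if spare <= 0:
            return float("inf")  # a saturated disk can never finish its share
        worst = max(worst, amount / spare)
    return worst


def select_solution(candidates: List[Solution],
                    bandwidth: Dict[str, float],
                    utilization: Dict[str, float]) -> Solution:
    """Pick the candidate with the smallest estimated recovery time."""
    return min(
        candidates,
        key=lambda reads: estimated_recovery_time(reads, bandwidth, utilization),
    )
```

Under this simplified model, a solution that reads more data in total can still be preferred when it shifts reads away from a slow or heavily loaded disk. Calling `select_solution` again at each period with fresh utilization values would mimic the periodic reselection the abstract describes.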