TY - GEN
T1 - Value-based resource management in high-performance computing systems
AU - Machovec, Dylan
AU - Tunc, Cihan
AU - Kumbhare, Nirmal
AU - Khemka, Bhavesh
AU - Akoglu, Ali
AU - Hariri, Salim
AU - Siegel, Howard Jay
N1 - Publisher Copyright: © 2016 ACM.
PY - 2016/6/1
Y1 - 2016/6/1
N2 - We introduce a new metric, Value of Service (VoS), which enables resource management techniques for high-performance computing (HPC) systems to take into consideration the value of completion time of a task and the value of energy used to compute that task at a given instant of time. These value functions have a soft threshold, where the value function begins to decrease from its maximum value, and a hard threshold, where the value function goes to zero. Each task has an associated importance factor to express the relative significance among tasks. We define the value of a task as the weighted sum of its value of performance and value of energy, multiplied by its importance factor. We also consider the variation in value for completing a task at different times; the value of energy reduction can change significantly between peak and non-peak periods. We define VoS for a given workload to be the sum of the values for all tasks that are executed during a given period of time. Our system model is based on virtual machines (VMs), where each dynamically arriving task will be assigned to a VM with a resource configuration based on the number of homogeneous cores and the amount of memory. Based on VoS, we design, evaluate, and compare different resource management heuristics. This comparison is done over various simulation scenarios and example experiments on an IBM blade-server-based system.
AB - We introduce a new metric, Value of Service (VoS), which enables resource management techniques for high-performance computing (HPC) systems to take into consideration the value of completion time of a task and the value of energy used to compute that task at a given instant of time. These value functions have a soft threshold, where the value function begins to decrease from its maximum value, and a hard threshold, where the value function goes to zero. Each task has an associated importance factor to express the relative significance among tasks. We define the value of a task as the weighted sum of its value of performance and value of energy, multiplied by its importance factor. We also consider the variation in value for completing a task at different times; the value of energy reduction can change significantly between peak and non-peak periods. We define VoS for a given workload to be the sum of the values for all tasks that are executed during a given period of time. Our system model is based on virtual machines (VMs), where each dynamically arriving task will be assigned to a VM with a resource configuration based on the number of homogeneous cores and the amount of memory. Based on VoS, we design, evaluate, and compare different resource management heuristics. This comparison is done over various simulation scenarios and example experiments on an IBM blade-server-based system.
KW - Energy-aware resource allocation
KW - Modeling
KW - Performance metrics
KW - Resource management
KW - Value of service
UR - http://www.scopus.com/inward/record.url?scp=84978891223&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84978891223&partnerID=8YFLogxK
U2 - 10.1145/2913712.2913716
DO - 10.1145/2913712.2913716
M3 - Conference contribution
AN - SCOPUS:84978891223
T3 - ScienceCloud 2016 - Proceedings of the ACM Workshop on Scientific Cloud Computing
SP - 19
EP - 26
BT - ScienceCloud 2016 - Proceedings of the ACM Workshop on Scientific Cloud Computing
PB - Association for Computing Machinery, Inc
T2 - 7th ACM Workshop on Scientific Cloud Computing, ScienceCloud 2016
Y2 - 1 June 2016
ER -