Value-based resource management in high-performance computing systems

Dylan Machovec, Cihan Tunc, Nirmal Kumbhare, Bhavesh Khemka, Ali Akoglu, Salim Hariri, Howard Jay Siegel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

11 Scopus citations

Abstract

We introduce a new metric, Value of Service (VoS), which enables resource management techniques for high-performance computing (HPC) systems to take into consideration the value of the completion time of a task and the value of the energy used to compute that task at a given instant of time. These value functions have a soft threshold, where the value function begins to decrease from its maximum value, and a hard threshold, where the value function goes to zero. Each task has an associated importance factor to express the relative significance among tasks. We define the value of a task as the weighted sum of its value of performance and value of energy, multiplied by its importance factor. We also consider the variation in value for completing a task at different times; for example, the value of energy reduction can change significantly between peak and non-peak periods. We define VoS for a given workload to be the sum of the values for all tasks that are executed during a given period of time. Our system model is based on virtual machines (VMs), where each dynamically arriving task will be assigned to a VM with a resource configuration based on the number of homogeneous cores and the amount of memory. Based on VoS, we design, evaluate, and compare different resource management heuristics. This comparison is done over various simulation scenarios and example experiments on an IBM blade-server-based system.
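
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and several details are assumptions made only for illustration: the value is assumed to decay linearly between the soft and hard thresholds (the abstract states only where the decrease begins and where the value reaches zero), the weights `w_perf` and `w_energy` and all names (`ValueFunction`, `Task`, `task_value`, `value_of_service`) are hypothetical, and the time-of-day variation in value (peak versus non-peak periods) is not modeled.

```python
from dataclasses import dataclass


@dataclass
class ValueFunction:
    """Value curve with a soft and a hard threshold.

    Assumption: linear decay between the two thresholds. The abstract
    only says the value starts to decrease at the soft threshold and
    goes to zero at the hard threshold.
    """
    max_value: float
    soft: float  # value begins to decrease past this point
    hard: float  # value is zero at and beyond this point

    def __call__(self, x: float) -> float:
        if x <= self.soft:
            return self.max_value
        if x >= self.hard:
            return 0.0
        return self.max_value * (self.hard - x) / (self.hard - self.soft)


@dataclass
class Task:
    importance: float            # relative significance among tasks
    perf_value: ValueFunction    # value as a function of completion time
    energy_value: ValueFunction  # value as a function of energy used
    completion_time: float
    energy_used: float


def task_value(task: Task, w_perf: float = 0.5, w_energy: float = 0.5) -> float:
    """Weighted sum of performance and energy value, times importance.

    The default weights are placeholders, not values from the paper.
    """
    weighted = (w_perf * task.perf_value(task.completion_time)
                + w_energy * task.energy_value(task.energy_used))
    return task.importance * weighted


def value_of_service(tasks, w_perf: float = 0.5, w_energy: float = 0.5) -> float:
    """VoS: sum of task values over the tasks executed in the period.

    The caller is assumed to pass only the tasks from that period.
    """
    return sum(task_value(t, w_perf, w_energy) for t in tasks)
```

In this sketch, a resource management heuristic would compare candidate VM assignments by the `task_value` each one is expected to yield, since the assignment affects both completion time and energy use.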

Original language: English (US)
Title of host publication: ScienceCloud 2016 - Proceedings of the ACM Workshop on Scientific Cloud Computing
Publisher: Association for Computing Machinery, Inc
Pages: 19-26
Number of pages: 8
ISBN (Electronic): 9781450343534
State: Published - Jun 1 2016
Event: 7th ACM Workshop on Scientific Cloud Computing, ScienceCloud 2016 - Kyoto, Japan
Duration: Jun 1 2016 → …

Publication series

Name: ScienceCloud 2016 - Proceedings of the ACM Workshop on Scientific Cloud Computing

Other

Other: 7th ACM Workshop on Scientific Cloud Computing, ScienceCloud 2016
Country/Territory: Japan
City: Kyoto
Period: 6/1/16 → …

Keywords

  • Energy-aware resource allocation
  • Modeling
  • Performance metrics
  • Resource management
  • Value of service

ASJC Scopus subject areas

  • Computer Science Applications
  • Software
