A study of network quality of service in many-core MPI applications

Lee Savoie, David K. Lowenthal, Bronis R. De Supinski, Kathryn Mohror

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

Network contention in existing high performance computing (HPC) systems increases job execution time and reduces machine throughput. This problem is expected to become worse in future systems as core counts increase and networks become larger and more complicated. In this paper, we investigate the use of network Quality of Service (QoS) to mitigate the effects of network contention. QoS allocates bandwidth to individual jobs, thus limiting the impact that one job can have on another through network contention. We consider coarse-grained QoS, in which each job runs at a different priority level, by running a number of micro-benchmarks and applications in different QoS configurations on real hardware with QoS capabilities. Our results indicate that while network contention reduces job performance by as much as 70%, coarse-grained QoS is unlikely to improve throughput on HPC systems and may increase job execution times by more than 100%. Based on our analysis, finer-grained QoS is more likely to improve performance and throughput.

Original language: English (US)
Title of host publication: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1313-1322
Number of pages: 10
ISBN (Print): 9781538655559
DOIs
State: Published - Aug 3 2018
Event: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018 - Vancouver, Canada
Duration: May 21 2018 → May 25 2018

Publication series

Name: Proceedings - 2018 IEEE 32nd International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018

Other

Other: 32nd IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2018
Country/Territory: Canada
City: Vancouver
Period: 5/21/18 → 5/25/18

Keywords

  • Contention
  • High performance computing
  • Many core
  • Network
  • Network contention
  • Performance
  • Quality of service
  • Service level

ASJC Scopus subject areas

  • Artificial Intelligence
  • Computer Networks and Communications
  • Hardware and Architecture
  • Information Systems and Management
