TY - GEN
T1 - A Unified Portable and Programmable Framework for Task-Based Execution and Dynamic Resource Management on Heterogeneous Systems
AU - Gener, Serhan
AU - Hassan, Sahil
AU - Chang, Liangliang
AU - Chakrabarti, Chaitali
AU - Huang, Tsung Wei
AU - Ogras, Umit
AU - Akoglu, Ali
N1 - Publisher Copyright: © 2025 Copyright held by the owner/author(s).
PY - 2025/5/5
Y1 - 2025/5/5
N2 - Heterogeneous computing systems are essential for addressing the diverse computational needs of modern applications. However, they present a fundamental trade-off between easy programmability and performance. This paper addresses this trade-off by enabling performance and energy efficiency optimization while facilitating easy programming without delving into hardware details. It introduces CEDR-Taskflow, a comprehensive framework that automatically parallelizes user applications and dynamically schedules their tasks to heterogeneous platforms, enabling efficient resource utilization and ease of programming. Emulation-based studies on the Xilinx ZCU102 and NVIDIA Jetson AGX Xavier SoC platforms demonstrate that this integrated framework improves application execution time by up to 1.47x compared to the state of the art, while maintaining hardware-agnostic application development. Furthermore, this integration approach enables features such as streaming-enabled execution and schedule caching that reduce the time spent on task scheduling by up to 29.6x and result in up to 6.1x lower execution time.
AB - Heterogeneous computing systems are essential for addressing the diverse computational needs of modern applications. However, they present a fundamental trade-off between easy programmability and performance. This paper addresses this trade-off by enabling performance and energy efficiency optimization while facilitating easy programming without delving into hardware details. It introduces CEDR-Taskflow, a comprehensive framework that automatically parallelizes user applications and dynamically schedules their tasks to heterogeneous platforms, enabling efficient resource utilization and ease of programming. Emulation-based studies on the Xilinx ZCU102 and NVIDIA Jetson AGX Xavier SoC platforms demonstrate that this integrated framework improves application execution time by up to 1.47x compared to the state of the art, while maintaining hardware-agnostic application development. Furthermore, this integration approach enables features such as streaming-enabled execution and schedule caching that reduce the time spent on task scheduling by up to 29.6x and result in up to 6.1x lower execution time.
KW - Auto parallelization
KW - dynamic scheduling
KW - heterogeneous runtime
UR - https://www.scopus.com/pages/publications/105007285746
U2 - 10.1145/3720555.3721988
DO - 10.1145/3720555.3721988
M3 - Conference contribution
AN - SCOPUS:105007285746
T3 - Proceedings of 2025 4th International Workshop on Extreme Heterogeneity Solutions, ExHET 2025
SP - 1
EP - 9
BT - Proceedings of 2025 4th International Workshop on Extreme Heterogeneity Solutions, ExHET 2025
PB - Association for Computing Machinery, Inc
T2 - 4th International Workshop on Extreme Heterogeneity Solutions, ExHET 2025
Y2 - 2 March 2025
ER -