This research line addresses the design and management of power-efficient high-performance computing servers and data centers that run compute- and memory-intensive next-generation workloads. These workloads include training and inference for AI and deep learning analytics, as well as QoS-constrained applications such as video transcoding and next-generation genome sequencing.
We continuously monitor heterogeneous hardware resources to ensure optimal performance per watt and to minimize performance interference between applications collocated on the same server. To maximize the efficiency of a given workload, we explore machine-learning-based resource management and task-mapping techniques.
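As a minimal illustration of the idea, the sketch below shows a greedy, interference-aware task-mapping heuristic: tasks are placed on the currently least-utilized cores so that collocated applications contend less for shared resources. All names (`map_tasks`, the utilization values) are hypothetical; a real system would read utilization from hardware performance counters rather than from a static list, and the group's actual techniques are ML-based and more sophisticated than this greedy baseline.

```python
# Hypothetical sketch: greedy interference-aware task mapping.
# Utilization values are synthetic; a production system would sample
# hardware counters (e.g., per-core load) at runtime instead.

def map_tasks(task_loads, core_utils):
    """Assign each task to the currently least-utilized core.

    task_loads: estimated CPU demand (0..1) per task.
    core_utils: current utilization (0..1) per core.
    Returns a list of core indices, one entry per task.
    """
    utils = list(core_utils)  # working copy: projected utilization
    placement = []
    # Place the heaviest tasks first so they land on the emptiest cores.
    for tid, load in sorted(enumerate(task_loads), key=lambda t: -t[1]):
        target = min(range(len(utils)), key=lambda c: utils[c])
        utils[target] += load
        placement.append((tid, target))
    placement.sort()  # restore original task order
    return [core for _, core in placement]

if __name__ == "__main__":
    cores = [0.6, 0.1, 0.3]  # current utilization per core (synthetic)
    tasks = [0.5, 0.2, 0.4]  # estimated demand per task (synthetic)
    print(map_tasks(tasks, cores))
```

A learning-based mapper would replace the `min`-utilization choice with a model that predicts the performance-per-watt impact of each candidate placement.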