Microcontrollers (MCUs) are used in a wide range of applications, from wearable devices and sensor monitoring to robotics and automotive systems. The design of low-power microcontrollers for wearables in the biomedical domain has received considerable attention in recent decades. Recent proposals such as BiomedBench [1] provide a collection of biomedical applications and kernels with the aim of informing the design of new processing architectures for wearable devices.

In this context, the Embedded Systems Laboratory (ESL) is developing X-HEEP (eXtendable Heterogeneous Energy-Efficient Platform), an open-source, configurable, and extensible single-core 32-bit RISC-V MCU, sponsored by the EcoCloud sustainable computing center of EPFL. X-HEEP is based on third-party open-source IPs and in-house IPs developed at ESL jointly with other EPFL laboratories. X-HEEP provides a framework to run applications compiled for RISC-V on a simulator (Verilator, Questasim, or VCS) or on a Xilinx FPGA, and it can also be implemented in silicon. The first ASIC based on X-HEEP is called HEEPocrates.

BiomedBench has recently been ported to X-HEEP. The open-source nature of the platform, and the fact that it is being developed at ESL, creates an excellent opportunity to investigate which architectural features of low-power microcontrollers can increase the energy efficiency of wearables in the biomedical domain.

In this project we want to explore whether the use of an in-order superscalar core, in place of the in-order scalar (single-issue) core currently used in X-HEEP, the RISC-V OpenHW Group CV32E20 [2], can improve energy efficiency during the processing phase of the applications in BiomedBench. The working hypothesis is that, whereas out-of-order execution adds too much power overhead relative to the reduction it achieves in execution time, in-order superscalar execution, in particular with the RISC-V ISA, reduces execution time by a larger factor than it increases power. Since energy is the product of average power and execution time, in-order superscalar execution would then be suitable for reducing energy consumption in RISC-V microprocessors for the biomedical wearable domain. As a second step, we will evaluate whether other architectural extensions, e.g., specifically for fixed-point arithmetic, can improve the energy efficiency of the microcontroller.

During this project, the student will first develop a simulator for the RISC-V architecture that supports execution of the applications ported to X-HEEP (binary compatibility). Instead of developing a simulator from scratch, it will also be possible to use other existing open-source solutions, as long as they can be used for the second phase. The simulator will be used to generate dynamic execution traces from the applications in BiomedBench.
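As an illustration of what one entry of such a dynamic trace could contain, the following minimal Python sketch decodes a 32-bit RISC-V instruction word into a trace record. The field positions follow the standard RV32I/M encoding; the `TraceRecord` layout, the `decode` function, and the instruction classes are assumptions made for this example, not a format prescribed by X-HEEP or BiomedBench.

```python
from typing import NamedTuple, Optional

# Major opcodes (bits [6:0]) of the RV32I base encoding handled by this sketch.
OPCODE_CLASS = {
    0b0110011: "alu_reg",  # OP: add, sub, and, ... (M extension if funct7 == 1)
    0b0010011: "alu_imm",  # OP-IMM: addi, slli, ...
    0b0000011: "load",     # LOAD: lb, lh, lw, ...
    0b0100011: "store",    # STORE: sb, sh, sw
    0b1100011: "branch",   # BRANCH: beq, bne, ...
}


class TraceRecord(NamedTuple):
    """Hypothetical trace entry: one record per dynamically executed instruction."""
    pc: int
    kind: str
    rd: Optional[int]
    rs1: Optional[int]
    rs2: Optional[int]


def decode(pc: int, word: int) -> TraceRecord:
    """Extract the instruction class and register fields from an instruction word."""
    kind = OPCODE_CLASS.get(word & 0x7F, "other")
    rd = (word >> 7) & 0x1F
    rs1 = (word >> 15) & 0x1F
    rs2 = (word >> 20) & 0x1F
    if kind == "alu_reg" and (word >> 25) == 0b0000001:
        kind = "muldiv"  # funct7 = 0000001 selects the M extension
    if kind in ("store", "branch"):
        rd = None   # S- and B-type instructions do not write a register
    if kind in ("alu_imm", "load"):
        rs2 = None  # I-type instructions have a single source register
    if kind == "other":
        rd = rs1 = rs2 = None  # formats not decoded in this sketch (lui, jal, ...)
    return TraceRecord(pc, kind, rd, rs1, rs2)


add_rec = decode(0x0, 0x002081B3)  # add x3, x1, x2
mul_rec = decode(0x4, 0x022081B3)  # mul x3, x1, x2
lw_rec = decode(0x8, 0x00052283)   # lw  x5, 0(x10)
```

A full simulator would also record the remaining instruction formats and, where useful, the memory address of each load and store; the register fields above are the minimum needed to analyze data dependencies in later phases.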

In a second phase, the student will modify the simulator to evaluate the impact of in-order superscalar execution on performance. The simulator should be easy to modify, so that it is possible to test which combinations of additional functional units produce the largest performance improvement at the minimum hardware cost.

In a third phase, the student will evaluate, with help from the people participating in the X-HEEP project, the impact on power of the best candidate architectures found based on performance improvement. In this way, at the end of the project it will be possible to determine which architectural optimizations produce the largest reduction in total energy consumption, that is, the maximum performance improvement for the minimum additional power.
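The comparison described in this phase amounts to computing energy as average power multiplied by execution time for each candidate. The sketch below shows one way to rank candidates in Python. All names and all numeric values (cycle counts, power figures, clock frequency) are made-up placeholders chosen only to make the example runnable; they are not measurements or results of this project.

```python
def energy_joules(cycles: int, freq_hz: float, power_w: float) -> float:
    """Energy = average power x execution time, with time = cycles / frequency."""
    return power_w * cycles / freq_hz


FREQ_HZ = 20e6  # placeholder clock frequency

# Placeholder candidates: (cycle count from the simulator, estimated average power in W).
CANDIDATES = {
    "baseline_scalar": (1_000_000, 1.0e-3),
    "dual_issue": (700_000, 1.2e-3),
    "dual_issue_dual_adder": (650_000, 1.4e-3),
}

energies = {
    name: energy_joules(cycles, FREQ_HZ, power)
    for name, (cycles, power) in CANDIDATES.items()
}
best = min(energies, key=energies.get)

base_cycles, base_power = CANDIDATES["baseline_scalar"]
for name, (cycles, power) in CANDIDATES.items():
    speedup = base_cycles / cycles
    power_ratio = power / base_power
    # A candidate saves energy only if its speedup exceeds its power increase.
    print(f"{name}: speedup={speedup:.2f} power_ratio={power_ratio:.2f} "
          f"energy={energies[name] * 1e6:.2f} uJ")
print("lowest energy:", best)
```

With these placeholder values the candidate with the highest speedup is not the one with the lowest energy, which is why performance and power have to be evaluated together.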

The previous explorations do not require modifying the traces obtained from execution in the simulator. Therefore, the compiled binary used for X-HEEP will remain valid during these phases. However, other explorations, such as the introduction of specific instructions for fixed-point execution, may require modifying the execution trace. This will be done either by using the original assembly code produced by the compiler or by dynamically modifying the trace during simulation.
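As a minimal sketch of what dynamically modifying a trace could look like, the Python example below replaces a multiplication immediately followed by an arithmetic right shift of its result (a common pattern in fixed-point code) with a single fused operation. The `Op` record, the `fuse_fixed_point_mul` function, and the `fxmul` instruction are hypothetical names introduced for this illustration; `fxmul` is not an existing RISC-V instruction.

```python
from typing import List, NamedTuple, Tuple


class Op(NamedTuple):
    """Hypothetical simplified trace entry: mnemonic, destination, source registers."""
    name: str
    rd: int
    srcs: Tuple[int, ...]


def fuse_fixed_point_mul(trace: List[Op]) -> List[Op]:
    """Rewrite each 'mul rd, a, b; srai rd, rd, imm' pair as one hypothetical 'fxmul'."""
    out: List[Op] = []
    i = 0
    while i < len(trace):
        cur = trace[i]
        nxt = trace[i + 1] if i + 1 < len(trace) else None
        fusable = (
            cur.name == "mul"
            and nxt is not None
            and nxt.name == "srai"
            and nxt.srcs == (cur.rd,)  # the shift consumes the product...
            and nxt.rd == cur.rd       # ...and overwrites it, so nothing else needs it
        )
        if fusable:
            out.append(Op("fxmul", cur.rd, cur.srcs))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


example = [
    Op("mul", 5, (6, 7)),
    Op("srai", 5, (5,)),   # fused with the previous mul
    Op("add", 8, (5, 9)),
    Op("mul", 10, (5, 8)),
    Op("srai", 11, (10,)),  # different destination: left untouched
]
fused = fuse_fixed_point_mul(example)
```

A real exploration would also have to check that the shift amount matches the fixed-point format assumed by the new instruction; that check is omitted here because the simplified record does not carry immediates.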

The expected outcomes of this project are:

  • Development of a lightweight RISC-V simulator that can execute the binary (compiled) applications of BiomedBench for X-HEEP. No interrupts will be included in the simulations. The simulator will produce as output a complete memory dump and a summary of processor cycles required for execution of the benchmark.

    • The correctness of the simulation will be verified throughout by comparing the output of each application with the expected outputs from BiomedBench.

  • Generation of dynamic execution traces from the BiomedBench applications using the simulator. The student will be allowed to propose a different mechanism to obtain the traces, as long as it allows them to conduct the explorations in the following phases.

  • Modification of the simulator to account for superscalar execution of the traces. At this point, no extra functional units will be introduced; the simulation will only account for data dependencies between the instructions, the nature of the instructions (e.g., whether the first instruction is a branch or not), and the availability of resource classes. For example, at this stage: loads can proceed in parallel with additions; one addition and one multiplication can be executed simultaneously; two additions/subtractions cannot be executed in parallel.
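The pairing rules in the last outcome above can be sketched as a small dual-issue, in-order cycle counter over a trace. This is a minimal Python illustration under stated assumptions, not the simulator to be developed: every instruction is assumed to take one cycle, branches are assumed to use their own resource class, and `Instr`, `can_pair`, `count_cycles`, and the unit names are hypothetical.

```python
from collections import Counter
from typing import Dict, NamedTuple, Optional, Sequence, Tuple


class Instr(NamedTuple):
    unit: str                    # resource class: "alu", "mul", "mem" or "branch"
    rd: Optional[int]            # destination register, None if nothing is written
    srcs: Tuple[int, ...] = ()   # source registers


def can_pair(first: Instr, second: Instr, units: Dict[str, int]) -> bool:
    """Return True if `second` may issue in the same cycle as `first`."""
    if first.unit == "branch":
        return False  # nothing issues together with a preceding branch
    # Data dependencies on the first instruction's result (x0 never creates one).
    if first.rd not in (None, 0) and (first.rd in second.srcs or first.rd == second.rd):
        return False
    # Resource classes: both instructions need a free unit of their class.
    needed = Counter([first.unit, second.unit])
    return all(units.get(unit, 0) >= count for unit, count in needed.items())


def count_cycles(trace: Sequence[Instr], units: Dict[str, int]) -> int:
    """Count cycles for in-order issue of up to two instructions per cycle."""
    cycles = 0
    i = 0
    while i < len(trace):
        if i + 1 < len(trace) and can_pair(trace[i], trace[i + 1], units):
            i += 2
        else:
            i += 1
        cycles += 1
    return cycles


BASE_UNITS = {"alu": 1, "mul": 1, "mem": 1, "branch": 1}
DUAL_ADDER_UNITS = {**BASE_UNITS, "alu": 2}

trace = [
    Instr("mem", 5, (10,)),      # lw  x5, 0(x10)
    Instr("alu", 6, (6, 7)),     # add x6, x6, x7    (pairs with the load)
    Instr("alu", 8, (8, 9)),     # add x8, x8, x9
    Instr("alu", 11, (11, 12)),  # add x11, x11, x12 (needs a second adder to pair)
    Instr("alu", 13, (13, 14)),  # add x13, x13, x14
    Instr("mul", 15, (5, 6)),    # mul x15, x5, x6   (pairs with the add)
]

base_cycles = count_cycles(trace, BASE_UNITS)        # 4 cycles
dual_cycles = count_cycles(trace, DUAL_ADDER_UNITS)  # 3 cycles
```

Keeping the available units in a dictionary means that the optional explorations, such as dual adders, only require changing the configuration rather than the issue logic. A scalar core would need six cycles for the same trace under the same one-cycle assumption.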

Optional/additional outcomes:

  • Exploration of the performance benefits of introducing different types of arithmetic operators, e.g., dual adders.

  • Exploration of the overhead in area and power of the proposed modifications of the control stage and additional functional units.

  • Exploration of other optimizations specific to the biomedical domain (e.g., for fixed-point arithmetic), as driven by the BiomedBench applications.

Throughout the project, the student will learn:

  • Basic processor architecture concepts and the RISC-V ISA.

  • The main features of applications in the biomedical wearable domain.

  • Advanced processor architecture concepts such as superscalar, in-order and out-of-order execution.

  • How to work with git repositories in a team of contributors to the same project.

The project will be carried out at ESL at EPFL, one of the world’s top universities, with technical support from EcoCloud. ESL is an active group (24 Ph.D. students among 45 members) involved in many research lines. The student will be under the supervision of Prof. David Atienza (ESL) and Dr. Miguel Peón-Quirós (EcoCloud), with technical support from Stefano Albini (ESL).

Project objectives:

  1. Understanding the RISC-V architecture and development of a lightweight simulator that can execute the applications of BiomedBench compiled for X-HEEP and produce execution statistics.

  2. Modification of the simulator to evaluate in-order superscalar (parallel) execution of the applications in BiomedBench, using the original or additional numbers of functional units. The output of this evaluation will support or refute our initial hypothesis.

  3. (Optional) Evaluation of the overheads in terms of area/power of the proposed modifications.

  4. (Optional) Proposal of additional architectural improvements specific to the domain of biomedical wearables.

Required knowledge and skills:

    • C++ and Python. General Linux use and scripting.

    • Good background in computer architecture and algorithms.

    • Some familiarity with any assembly language (the RISC-V ISA will be used throughout the project).

    • Good analytical skills.

    • Teamwork and git.

Appreciated skills:

    • Scientific curiosity.

    • Good communication skills.

    • Advanced English (interaction during the project will be in English).

Type of work: 40% theory analysis, 60% design and simulation.

[1]. Dimitrios Samakovlis et al. “BiomedBench: A benchmark suite of TinyML biomedical applications for low-power wearables.” IEEE Design & Test, 2024. https://infoscience.epfl.ch/handle/20.500.14299/208450.7

[2]. Pasquale Davide Schiavone et al. “Slow and steady wins the race? A comparison of ultra-low-power RISC-V cores for Internet-of-Things applications”. In: Int. Symp. on Power and Timing Modeling, Optimization and Simulation (PATMOS). IEEE. 2017, pp. 1–8.