Research Line
Full system simulation and design
We address the challenge of deploying transformer models on resource-constrained platforms, a challenge that stems from their computational complexity and large number of parameters. Our solution introduces tightly-coupled, small-scale systolic arrays (TiC-SATs), governed by dedicated ISA extensions, to accelerate execution. We also employ software optimizations to maximize data reuse and lower miss rates across cache hierarchies. The TiC-SAT framework is available as open source.
Keywords
Systolic Array, Tightly-coupled Accelerators, Transformers
Team
Amirshahi Alireza
Ansaloni Giovanni
Atienza Alonso David
Klein Joshua Alexander Harrison
Our project aims to address the computational challenge posed by the massive size and large number of parameters of typical transformer implementations in artificial intelligence (AI) scenarios. Transformers, originally developed for natural language processing (NLP) tasks, are now widely used for various applications such as question answering, sentiment analysis, image classification, clinical note analysis, and speech-to-text generation.
To accelerate the inference of transformer models, we propose a novel strategy called TiC-SAT (Tightly-Coupled Systolic Array Accelerators for Transformers). TiC-SATs are integrated into CPUs as custom functional units governed by dedicated instructions, avoiding the need for dedicated scratchpad memories and reducing resource consumption. Moreover, TiC-SATs leverage software optimizations that increase data locality, taking advantage of available resources in cache hierarchies without disrupting locality when transitioning from accelerated to non-accelerated computation segments.
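The idea of driving a small systolic array from software while keeping data local can be illustrated with a minimal sketch. It is not the TiC-SAT implementation: `sa_tile_matmul` is a hypothetical stand-in for one pass through the array (which real hardware would trigger through custom instructions), and the loop order is just one plausible way to reuse a weight tile across several input blocks before loading the next one.

```python
def sa_tile_matmul(a_tile, w_tile):
    """Hypothetical stand-in for one systolic-array pass on a single tile.

    In hardware this would be a sequence of custom instructions; here it is
    a plain software multiply of two small blocks.
    """
    inner = len(w_tile)
    cols = len(w_tile[0])
    return [
        [sum(a_row[x] * w_tile[x][c] for x in range(inner)) for c in range(cols)]
        for a_row in a_tile
    ]


def tiled_matmul(a, w, sa_size):
    """Multiply a (m x k) by w (k x n) in blocks of at most sa_size x sa_size."""
    m, k, n = len(a), len(w), len(w[0])
    out = [[0] * n for _ in range(m)]
    for j in range(0, n, sa_size):          # column block of the output
        for p in range(0, k, sa_size):      # block along the shared dimension
            # The weight tile is fetched once and reused for every row block.
            w_tile = [row[j:j + sa_size] for row in w[p:p + sa_size]]
            for i in range(0, m, sa_size):  # row block of the input
                a_tile = [row[p:p + sa_size] for row in a[i:i + sa_size]]
                partial = sa_tile_matmul(a_tile, w_tile)
                # Accumulate the partial product into the output block.
                for r, partial_row in enumerate(partial):
                    for c, value in enumerate(partial_row):
                        out[i + r][j + c] += value
    return out


def reference_matmul(a, w):
    """Untiled multiply, used only to check the tiled version."""
    return sa_tile_matmul(a, w)
```

Calling `tiled_matmul(a, w, sa_size)` with any positive `sa_size` should give the same result as `reference_matmul(a, w)`; only the order in which data is touched changes, which is the property that cache-aware software optimizations rely on.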
To validate our strategy, we implement TiC-SAT as a parametric module in the gem5-X full system simulation environment and conduct comprehensive explorations across various SA sizes and benchmark applications. Our contributions include showcasing how SA accelerators can be integrated into computing systems, enabling full-system and application-wide explorations, and highlighting how tightly-coupled lightweight SAs, such as TiC-SATs, can aptly exploit software optimizations to improve data locality and performance. We also assess the benefits of small-scale, tightly-coupled SAs for accelerating inference in transformer models, considering different TiC-SAT sizes and benchmark applications.
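As a rough illustration of why SA size is a natural exploration parameter, the sketch below counts how many tile-level kernel invocations a single matrix multiplication would need for a few array sizes. The function name, the matrix dimensions, and the SA sizes are illustrative assumptions, not values taken from the project; the count says nothing about run-time by itself, which depends on cache behaviour and is what a full-system simulation measures.

```python
import math


def sa_kernel_calls(m, k, n, sa_size):
    """Number of sa_size x sa_size tile multiplications for an (m x k) by (k x n) product."""
    return (
        math.ceil(m / sa_size)
        * math.ceil(k / sa_size)
        * math.ceil(n / sa_size)
    )


# Hypothetical layer dimensions and SA sizes, chosen only for illustration.
for sa_size in (4, 8, 16):
    print(sa_size, sa_kernel_calls(512, 768, 768, sa_size))
```

Doubling the array side cuts the invocation count by roughly a factor of eight, while each invocation moves and computes on larger tiles.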
Find us on GitHub:
https://github.com/gem5-X/TiC-SAT
Related Publications
Accelerator-driven Data Arrangement to Minimize Transformers Run-time on Multi-core Architectures
Amirshahi, Alireza; Ansaloni, Giovanni; Atienza Alonso, David
2024-01-18 | Conference Paper
TiC-SAT: Tightly-coupled Systolic Accelerator for Transformers
Amirshahi, Alireza; Klein, Joshua Alexander Harrison; Ansaloni, Giovanni; Atienza Alonso, David
2023-01-16 | Conference Paper