Who’s Aman Arora
Reconfigurable computing, Domain-specific acceleration, Hardware for Machine Learning
Aman Arora is a Graduate Fellow and Ph.D. Candidate in the Department of Electrical and Computer Engineering at the University of Texas at Austin. His research vision is to minimize the gap between ASICs and FPGAs in terms of performance and efficiency, and to minimize the gap between CPUs/GPUs and FPGAs in terms of programmability. He imagines a future where FPGAs are first-class citizens in the world of computing and the first choice for accelerating new workloads. His Ph.D. dissertation research focuses on the search for efficient reconfigurable fabrics for Deep Learning (DL) by proposing new DL-optimized blocks for FPGAs. His research has resulted in 11 publications in top conferences and journals in the fields of reconfigurable computing and computer architecture and design. His work received a Best Paper Award at the IEEE FCCM conference in 2022, and he currently holds a fellowship from the UT Austin Graduate School. His research has been funded by the NSF. Aman has served as a secondary reviewer for top conferences like ACM FPGA (in 2021 and 2022). He is also the leader of the AI+FPGA committee at the Open-Source FPGA (OSFPGA) Foundation, where he leads research efforts and organizes training webinars. He has 12 years of experience in the semiconductor industry in design, verification, testing, and architecture roles. Most recently, he worked in the GPU Deep Learning architecture team at NVIDIA.
Aman’s past and current research focuses on architecting efficient reconfigurable acceleration substrates (or fabrics) for Deep Learning (DL). With Moore’s law slowing down, the requirements of resource-hungry applications like DL growing and changing rapidly, and climate change already knocking at our doors, this research theme has never been more relevant and important.
Aman has proposed changing the architecture of FPGAs to make them better DL accelerators. He proposed replacing a portion of the FPGA’s programmable logic area with new blocks called Tensor Slices, which are specialized for performing matrix operations like matrix-matrix multiplication and matrix-vector multiplication that are common in DL workloads. In parallel, the FPGA industry has developed similar blocks, such as the Intel AI Tensor Block and the Achronix Machine Learning Processor.
In addition, Aman proposed adding compute capabilities to the on-chip memory blocks on FPGAs, so they can operate on data without having to move it to compute units elsewhere on the FPGA. He was the first to exploit the dual-port nature of FPGA BRAMs to design these blocks, instead of using technologies that significantly impact the circuitry of the RAM array and degrade its performance. He calls these new blocks CoMeFa RAMs. This work won the Best Paper Award at IEEE FCCM 2022.
Aman also led a team effort spanning three universities – UT Austin, University of Toronto, and University of New Brunswick – to develop an open-source DL benchmark suite called Koios. These benchmarks can be used to perform FPGA architecture and CAD research, and are integrated into VTR, which is the most popular open-source FPGA CAD flow.
Other research projects Aman has worked on or is working on include: (1) developing a parallel reconfigurable spatial acceleration fabric consisting of PIM (Processing-In-Memory) blocks connected using an FPGA-like interconnect, (2) implementing efficient accelerators for Weightless Neural Networks (WNNs) on FPGAs, (3) adding support for open-source tools to FPGA research tools like COFFE, and (4) using Machine Learning (ML) to perform cross-prediction of power consumption on FPGAs and developing an open-source dataset that can be widely used for such prediction problems.
Aman hopes to start and direct a research lab at a university soon. His future research will straddle the entire stack of computer engineering: programmability at the top, architecture exploration in the middle, and hardware design at the bottom. The research thrusts he plans to focus on are next-gen reconfigurable fabrics, ML- and FPGA-related tooling, enabling the creation of an FPGA app store, and sustainable acceleration.