Chair’s New Year’s Greetings

Dear Members of ACM SIGDA,

After two years of the COVID-19 pandemic, the world is slowly returning to a new normal. At the Design Automation Conference (DAC) held in San Francisco last month, more than a thousand engineers, scholars, and students gathered in person for the first time in two and a half years. They presented research ideas, exchanged industrial and societal information, and discussed collaboration opportunities. The only notable difference was probably that everyone was wearing a mask.

As the world reopened from the pandemic, SIGDA elected its new executive committee (EC) in the summer of 2021. Like its predecessors, the new EC is responsible for all regular operations of SIGDA, including conferences, publications and media, educational and technical activities, awards, and members’ benefits. Understandably, the COVID-19 pandemic has brought numerous unprecedented challenges that the current EC, and SIGDA as a whole, are facing: disrupted international travel, unpredictable outbreaks of local epidemics, and a lack of efficient and effective communication among our members, to name a few. Fortunately, the volunteers of SIGDA and the community at large have accumulated extensive experience in overcoming these challenges: the successful in-person DAC last month was a perfect example.

Building on these experiences, the new EC has been working tirelessly with our volunteers and the whole society to meet these challenges and prepare for the post-pandemic era. A new “Who’s Who” column on the SIGDA website (https://www.sigda.org/whos-who/) has been launched so that we can continue to learn about active young researchers and scholars all over the world. A new version of the ACM/SIGDA E-Newsletter is in the works, among many other initiatives being planned. I am very proud of how our members, volunteers, and the SIGDA leadership team have persevered through these challenging times, and I have been delighted to witness the remarkable progress and achievements we have made in the past year. With this message we not only celebrate a successful 2021 with you, but also look forward to sharing some big goals and ideas soon! Our volunteers will get in touch with you in the new year about our new plans and initiatives.

My warmest wishes to all the SIGDA members and their families for a healthy, restorative and productive 2022!

Yiran Chen

Chair of ACM SIGDA

Who’s Christophe Bobda

January 1st, 2022

Christophe Bobda

Professor

University of Florida

Email:

cbobda@ece.ufl.edu

Personal webpage

https://bobda.ece.ufl.edu/

Research interests

Reconfigurable Computing, FPGA, System on Chip Design, Embedded Imaging, Cybersecurity and Robotics

Short bio

Professor Bobda received the License degree in mathematics from the University of Yaounde, Cameroon, in 1992, and the diploma in computer science and the Ph.D. degree in computer science from the University of Paderborn, Germany, in 1999 and 2003, respectively. In June 2003 he joined the Department of Computer Science at the University of Erlangen-Nuremberg, Germany, as a postdoctoral researcher. Dr. Bobda received the 2003 best dissertation award from the University of Paderborn for his work on the synthesis of reconfigurable systems using temporal partitioning and temporal placement. In 2005 Dr. Bobda was appointed assistant professor at the University of Kaiserslautern, where he established the chair for Self-Organizing Embedded Systems, which he led until October 2007. From 2007 to 2010 Dr. Bobda was a Professor at the University of Potsdam and leader of the Computer Engineering working group. Upon moving to the US, Dr. Bobda was appointed Professor of computer engineering at the University of Arkansas, where he founded the Smart Embedded Systems Lab (2010 – 2018). Since 2019, Dr. Bobda has been with the University of Florida as a Professor of Computer Engineering, leader of the Smart Systems Lab, and outreach director of the Nelms Institute for the Connected World.

Research highlights

Professor Christophe Bobda’s research interests lie primarily in the design of smart embedded systems, with an emphasis on run-time optimization. He investigates the design and run-time operation of high-performance and adaptive architectures with applications in image processing, embedded optimization, security, and control. He recently introduced an event-based split-CNN architecture (ESCA) for running time-critical vision applications with a smaller memory footprint and low power consumption. ESCA combines a dedicated hardware architecture with on-chip memory buffer scheduling: a split-CNN reduces memory requirements by splitting the feature maps into small patches and executing them independently. This work received the best short paper award at FCCM 2021. His earlier work, “DyNoC: A Dynamic Infrastructure for Communication in Dynamically Reconfigurable Devices”, was nominated among the 23 most significant FPL papers of the last 25 years. His research enables a new paradigm for designing adaptive architectures for various applications.
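
To make the memory argument concrete, here is a minimal software sketch (not ESCA’s hardware architecture or buffer scheduler) of the split-CNN idea: a feature map is processed in independent strips with a small halo, so only one strip has to be buffered at a time. The array sizes, patch count, and helper names are illustrative.

    import numpy as np

    def conv2d_valid(x, k):
        """Naive 'valid' 2D convolution (cross-correlation), for illustration only."""
        kh, kw = k.shape
        oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
        out = np.empty((oh, ow))
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
        return out

    def conv2d_split(x, k, n_patches=4):
        """Process the feature map in horizontal strips with a (kh-1)-row halo,
        so only one small strip needs to be resident in the buffer at a time."""
        kh = k.shape[0]
        rows_out = x.shape[0] - kh + 1
        bounds = np.linspace(0, rows_out, n_patches + 1, dtype=int)
        outputs = []
        for lo, hi in zip(bounds[:-1], bounds[1:]):
            strip = x[lo:hi + kh - 1, :]   # only the input rows these output rows need
            outputs.append(conv2d_valid(strip, k))
        return np.vstack(outputs)

    rng = np.random.default_rng(0)
    fmap, kernel = rng.standard_normal((64, 64)), rng.standard_normal((3, 3))
    assert np.allclose(conv2d_valid(fmap, kernel), conv2d_split(fmap, kernel))

Because each strip needs only its own output rows plus a small halo of input rows, the peak buffer size shrinks roughly in proportion to the number of patches, which is the property a split-CNN exploits.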

Who’s Dayane Alfenas Reis

January 1st, 2022

Dayane Alfenas Reis

Postdoctoral Researcher

University of Notre Dame

Email:

Dayane.A.Reis.11@nd.edu

Personal webpage

https://sites.google.com/nd.edu/dreis

Research interests

VLSI design, Beyond-CMOS devices, In-memory computing architectures for data-centric applications, Hardware-software codesign, Secure computing

Short bio

Dr. Dayane Reis received her Ph.D. in Computer Science and Engineering from the University of Notre Dame in 2021, where she works as a Postdoctoral Researcher in the Hardware-Software Co-design Lab under the direction of Dr. Xiaobo Sharon Hu and Dr. Michael Niemier. She also received the M.Sc. in Electrical Engineering from the Federal University of Minas Gerais, Brazil, in 2016, and the B.Sc. in Electronic Engineering from the Pontifical Catholic University of Minas Gerais, Brazil, in 2012. Dr. Reis’s research exploits the unique characteristics of beyond-CMOS technologies for the design of fast, energy-efficient, and reliable in-memory computing kernels that can be used in a wide range of data-intensive application scenarios. She is the author of more than 20 articles in journals such as IEEE TVLSI, IEEE TCAD, IEEE Design & Test, and Nature Electronics, as well as renowned conferences including ISLPED, ASP-DAC, ICCAD, and DATE. Dr. Reis was one of the two winners of the Best Paper Award at the ACM/IEEE International Symposium on Low Power Electronics and Design in 2018 (ISLPED’18) for her paper “Computing in memory with FeFETs”, and a recipient of the Cadence Women in Technology (WIT) Scholarship 2018/2019, in recognition of her personal history and efforts toward the inclusion of women in STEM fields.

Research highlights

Dr. Dayane Reis’s research investigates the impact of emerging technologies on the design of circuits and architectures for data-centric computing. Her research also exploits non-von Neumann architectures – such as those based on the concept of in-memory computing (IMC) – to alleviate the impact of data transfers on a system’s overall performance and energy consumption. She designed the first IMC architecture based on Ferroelectric Field Effect Transistors (FeFETs) for general-purpose computing-in-memory, work that won the Best Paper Award at ISLPED’18. Furthermore, she designed a variety of hardware accelerators based on different IMC kernels (e.g., general-purpose computing-in-memory arrays and ternary content addressable memory arrays) for hardware-software codesign of meta-learning models and cryptography algorithms such as the Advanced Encryption Standard (AES) and the Brakerski/Fan-Vercauteren scheme for homomorphic encryption. Dr. Reis also participated in the development of a uniform framework for benchmarking IMC architectures based on CMOS and emerging technologies. The framework allows researchers to assess the benefits of analog and digital IMC based on different devices for data-intensive tasks in the domain of machine learning, and it has wide applicability. Finally, together with collaborators at Purdue University, Dr. Reis proposed and evaluated polymorphic gates based on Black Phosphorus Field-Effect Transistors (BP-FETs) that operate with a low supply voltage (up to 0.2V) and are resistant to power supply variations. Such hardware security primitives can be employed in logic obfuscation and have great utility for intellectual property protection.
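
As a rough illustration of one of the IMC kernels mentioned above, the snippet below models a ternary content-addressable memory (TCAM) lookup in software: every stored word is compared against the query in parallel, with a care mask marking don’t-care bits. The stored words, mask, and query are made-up values, not taken from Dr. Reis’s designs.

    import numpy as np

    stored = np.array([[0, 1, 1, 0],
                       [1, 0, 1, 1],
                       [1, 1, 0, 0]], dtype=np.uint8)
    care   = np.array([[1, 1, 1, 1],    # 1 = bit must match, 0 = don't care
                       [1, 0, 1, 1],
                       [1, 1, 1, 0]], dtype=np.uint8)

    def tcam_match(query):
        """Return indices of stored words whose cared-about bits equal the query."""
        query = np.asarray(query, dtype=np.uint8)
        mismatches = (stored != query) & (care == 1)
        return np.flatnonzero(~mismatches.any(axis=1))

    print(tcam_match([1, 1, 1, 1]))   # row 1 matches (its second bit is don't-care)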

Statement on the Tragedy of Davide Giri

The entire electronic design automation (EDA) community is in profound grief for the loss of Davide Giri, a graduate student at Columbia University, who fell victim to a horrific violence last Friday (December 3, 2021).  On behalf of the Association of Computing Machinery (ACM) Special Interest Group on Design Automation (SIGDA) and the Institute of Electrical and Electronics Engineers (IEEE) Council on Electronic Design Automation (CEDA), we would like to extend our most heartfelt condolences to the family and friends of Davide Giri.

Davide Giri, a Ph.D. Candidate of Computer Science, had been an active contributor to a number of important research projects on architectures and system-level design methodologies for heterogeneous System-on-Chip (SoC). He was also the author of multiple papers published in the top conferences and journals in our fields. The EDA community mourns the loss of such a bright young researcher who should have a very promising career.

Unfortunately, this tragedy is just one of many atrocious attacks that have recently happened to graduate students. We condemn the senseless violence and would like to urge the government, universities, and communities to take effective actions to protect the safety of our students and faculty members.

The ACM SIGDA and IEEE CEDA would like to offer help and support to anyone in our community who is impacted by such tragic events, physically or emotionally. We also encourage our members to reach out to the family, friends, and colleagues of Davide Giri, express our condolences, and help each other heal from such a big emotional loss. We hope that through our collective voice and power, we will lift up fellow members of our community during this trying time.

Regards,

Yiran Chen, Chair of ACM SIGDA

Yao-Wen Chang, President of IEEE CEDA

Who’s Pi-Cheng Hsiu

January 1st, 2022

Pi-Cheng Hsiu

Research Fellow

Academia Sinica

Email:

pchsiu@citi.sinica.edu.tw

Personal webpage

https://www.citi.sinica.edu.tw/~pchsiu

Research interests

Embedded Software and Intermittent Computing

Short bio

Dr. Pi-Cheng Hsiu received the Ph.D. degree in computer science and information engineering from National Taiwan University in 2009. He is currently a Research Fellow (Professor) and the Deputy Director of the Research Center for Information Technology Innovation (CITI), Academia Sinica, Taiwan, where he leads the Embedded and Mobile Computing Laboratory. He is also a Joint Research Fellow with the Institute of Information Science, Academia Sinica, a Jointly Appointed Professor with the Department of Computer Science and Engineering, National Chi Nan University, and a Jointly Appointed Professor with the College of Electrical Engineering and Computer Science, National Taiwan University. He was a Visiting Scholar with the Department of Computer Science, University of Illinois at Urbana-Champaign, in 2007 and with the Department of Electrical and Computer Engineering, University of Pittsburgh, in 2019.

Dr. Hsiu regularly publishes papers at the premier venues in embedded systems, real-time systems, and design automation. His work was nominated for the Best Paper Award at IEEE/ACM CODES+ISSS in 2019, 2020, and 2021, winning the award in the latter two years. He is a recipient of the 2019 Young Scholars’ Creativity Award of the Foundation for the Advancement of Outstanding Scholarship, the 2019 Exploration Research Award of the Pan Wen Yuan Foundation, and the 2015 Scientific Paper Award of the Y. Z. Hsu Science and Technology Memorial Foundation. He serves as an Associate Editor of ACM Transactions on Cyber-Physical Systems, as a Track Co-Chair of IEEE/ACM ISLPED and ACM SAC, and on the Technical Program Committees of major conferences in his field, including RTSS, RTAS, CODES+ISSS, and DAC.

Research highlights

Dr. Hsiu’s research goal is to realize the Intermittent Artificial Intelligence of Things (iAIoT), enabling battery-less IoT devices to intermittently execute deep neural networks (DNNs) on ambient power. iAIoT is a novel research direction at the intersection of intermittent computing and deep learning, and once realized, it would create innovative applications.

He has led a research team to release a suite of system runtimes and libraries that help AI and IoT application developers easily build low-cost, intermittent-aware inference systems. In particular, an intermittent operating system (TCAD’20), which was the first attempt to allow multitasking and task concurrency on intermittent systems, makes complicated intermittent applications increasingly possible. The HAWAII middleware (TCAD’20), which comprises an inference engine and an API library, enables hardware-accelerated intermittent DNN inference. In addition, the iNAS framework (TECS’21) was the first to introduce intermittent execution behavior into neural architecture search, automatically finding intermittently-executable DNN models. HAWAII and iNAS received the Best Paper Awards at IEEE/ACM CODES+ISSS 2020 and 2021, respectively, two years in a row. Such recognition indicates the innovativeness of his research and his contributions to the community.
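
For readers new to intermittent computing, the toy sketch below shows the checkpoint-and-resume discipline such systems rely on: progress is committed to non-volatile storage so that a power failure only costs the work done since the last checkpoint. It is a deliberately simplified software analogy, not the intermittent OS, HAWAII, or iNAS themselves; the file name, task, and failure probability are invented.

    import json
    import os
    import random

    CKPT = "nvm_checkpoint.json"   # stands in for non-volatile memory (illustrative)

    def load_state():
        if os.path.exists(CKPT):
            with open(CKPT) as f:
                return json.load(f)
        return {"i": 0, "acc": 0}

    def save_state(state):
        # Write-then-rename so a failure never leaves a half-written checkpoint.
        tmp = CKPT + ".tmp"
        with open(tmp, "w") as f:
            json.dump(state, f)
        os.replace(tmp, CKPT)

    def run_until_power_fails(n=100, fail_prob=0.05):
        """Accumulate 0 + 1 + ... + (n-1), checkpointing after every step."""
        state = load_state()
        while state["i"] < n:
            if random.random() < fail_prob:
                return False               # simulated power failure
            state["acc"] += state["i"]
            state["i"] += 1
            save_state(state)              # progress survives the outage
        return True

    while not run_until_power_fails():
        pass                               # "recharge" and resume from the checkpoint
    print(load_state()["acc"])             # 4950
    os.remove(CKPT)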

Who’s Yi-Chung Chen

December 1st, 2021

Yi-Chung Chen

Associate Professor

Tennessee State University

Email:

ychen@tnstate.edu

Personal webpage

https://yichungchen84.github.io/

Research interests

Application-specific system, 3-D integration, Heat simulation, NVM, Data pipeline, Deep learning

Short bio

Yi-Chung Chen is currently an Assistant Professor in the Department of Electrical and Computer Engineering, Tennessee State University. He received the Ph.D. degree in Electrical and Computer Engineering from the University of Pittsburgh, USA, in 2014, and the M.S. degree in Electrical and Computer Engineering from New York University, New York, in 2011. He has served on the organizing and technical program committees of many conferences, and as a member of regional and national STEM education committees for silicon and digital system education of underrepresented minorities.

Research highlights

Prof. Chen has published interdisciplinary papers in major research fields across computing systems and applications. His research contribution lies in the vertical integration of EDA tools for the design of application-specific computer systems. He investigates data-driven intelligent systems for the adaptive, resilient, and expandable operation demanded by critical missions. His research projects are supported by the US Air Force, the US Army, ASEE, and NSF. In addition, Prof. Chen is an acting committee member of a semiconductor research and education working group for minority-serving institutions. He is on a mission to prepare a future semiconductor workforce from underrepresented groups in engineering by building education and research capacity with open-source EDA tools.

Who’s Hussam Amrouch

December 1st, 2021

Hussam Amrouch

Jun.-Professor

University of Stuttgart, Germany

Email:

amrouch@iti.uni-stuttgart.de

Personal webpage

https://www.iti.uni-stuttgart.de/en/institute/team/Amrouch/

Research interests

Beyond-CMOS, Beyond von-Neumann Architectures, Neuromorphic Computing, Semiconductor Physics, Machine Learning for CAD

Short bio

Hussam Amrouch is a Junior Professor heading the Chair of Semiconductor Test and Reliability (STAR) within the Computer Science, Electrical Engineering Faculty at the University of Stuttgart, Germany, as well as a Research Group Leader at the Karlsruhe Institute of Technology (KIT), Germany. He earned his Ph.D. degree in Computer Science (Dr.-Ing.) from KIT in June 2015 with the highest distinction (summa cum laude), awarded to fewer than 10% of dissertations at KIT. He then founded the “Dependable Hardware” research group at KIT, which he still leads. In July 2020, he was appointed Junior Professor in the computer science department at the University of Stuttgart, leading research efforts in machine learning for CAD with a special focus on design for test and reliability for advanced and emerging nanotechnologies.

Research highlights

Prof. Amrouch has published more than 140 multidisciplinary publications (including 55 journal papers) in the major research areas across the computing stack (semiconductor physics, circuit design, computer-aided design, and computer architecture). His key research interests are beyond-CMOS technologies, emerging memories, and beyond-von Neumann architectures, with a special focus on in-memory computing, neuromorphic computing, and AI applications. He has received the HiPEAC Paper Award eight times. He has also received three Best Paper Award nominations for his work on reliability: two from the Design Automation Conference (DAC’17, DAC’16) and one from the Design, Automation and Test in Europe Conference (DATE’17). He has given 10 tutorials and 24 invited talks (including 2 keynotes) at several top international conferences (e.g., DAC, DATE), universities, and companies, and has organized 9 special sessions at top CAD conferences. He currently serves as a Review Editor at Frontiers in Electronics and an Associate Editor of Integration, the VLSI Journal. He also serves as a Technical Program Committee (TPC) member for many top international conferences in computer science, such as the Design Automation Conference (DAC), and as a reviewer for many top journals across research fields, from the system level (e.g., IEEE Transactions on Computers, TC) to the circuit level (e.g., IEEE Transactions on Circuits and Systems, TCAS-I) all the way down to semiconductor physics (e.g., IEEE Transactions on Electron Devices, TED).

Who’s Bei Yu

December 1st, 2021

Bei Yu

Associate Professor

Chinese University of Hong Kong

Email:

byu@cse.cuhk.edu.hk

Personal webpage

http://www.cse.cuhk.edu.hk/~byu/index.html

Research interests

Physical Design, Mask Optimization, Design Space Exploration, Deep Learning

Short bio

Prof. Bei Yu is currently an Associate Professor in the Department of Computer Science and Engineering, The Chinese University of Hong Kong. He received the Ph.D. degree in Electrical and Computer Engineering from the University of Texas at Austin, USA, in 2014, and the M.S. degree in Computer Science from Tsinghua University, China, in 2010. He has served as TPC Chair of the ACM/IEEE Workshop on Machine Learning for CAD, and on many journal editorial boards and conference committees. He is Editor of the IEEE TCCPS Newsletter.

Prof. Yu has published more than 200 research papers, mainly in top journals (including 39 in IEEE TCAD) and top conferences (including 22 at DAC and 29 at ICCAD) in the VLSI CAD area. He published the most papers at DAC 2019 (7 papers) and ICCAD 2021 (9 papers) among all scholars worldwide. He received seven Best Paper Awards, from ASPDAC 2021, ICTAI 2019, Integration, the VLSI Journal in 2018, ISPD 2017, the SPIE Advanced Lithography Conference 2016, ICCAD 2013, and ASPDAC 2012, and five other Best Paper Award nominations (DATE 2021, ASPDAC 2019, DAC 2014, ASPDAC 2013, and ICCAD 2011). He has also received six ICCAD/ISPD contest awards.

Research highlights

Prof. Yu is one of the pioneers of machine learning for EDA, which aims to remarkably improve circuit design efficiency with the aid of machine learning techniques. He investigated the generative adversarial network models GAN-OPC (DAC’18, TCAD’20) and DAMO (ICCAD’20, TCAD’21) to improve mask optimization quality and efficiency, even outperforming a state-of-the-art commercial tool. He pioneered a new class of research on graph learning and point cloud embedding for EDA. For instance, he proposed netlist-level learning based on graph neural networks and investigated how graph learning models can help with testability, reliability, and manufacturability analysis (published at DAC’19, DAC’20, and in TCAD’21). He was the first researcher to exploit deep point cloud embedding in the VLSI physical design field to construct routing trees (Best Paper Award at ASPDAC’21). In addition, he proposed active learning (equipped with Gaussian processes and neural processes) as an advanced learning paradigm for design space exploration in EDA (published in TCAD’19 and TCAD’21).
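
To give a flavor of the active-learning loop used for design space exploration, the generic Gaussian-process sketch below (not Prof. Yu’s published method) repeatedly fits a surrogate model to the configurations evaluated so far and queries the design point where the model is most uncertain. The single knob and synthetic quality function are placeholders for real, expensive EDA runs.

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor

    rng = np.random.default_rng(0)

    # One design knob swept over [0, 1]; "quality" stands in for a costly EDA evaluation.
    knobs = np.linspace(0.0, 1.0, 101).reshape(-1, 1)

    def evaluate(i):
        x = knobs[i, 0]
        return np.sin(6.0 * x) + 0.05 * rng.standard_normal()

    evaluated = {i: evaluate(i) for i in (0, 50, 100)}   # a few initial runs
    for _ in range(10):
        idx = sorted(evaluated)
        gp = GaussianProcessRegressor(normalize_y=True)
        gp.fit(knobs[idx], [evaluated[i] for i in idx])
        _, std = gp.predict(knobs, return_std=True)      # model uncertainty per point
        nxt = int(np.argmax(std))                        # most uncertain design point
        if nxt not in evaluated:
            evaluated[nxt] = evaluate(nxt)
    print("design points evaluated:", sorted(evaluated))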

Another distinctive contribution of Prof. Yu’s work is EDA for deep learning systems. The resource consumption of deep learning models is a major concern for their broad deployment on resource-constrained hardware. Prof. Yu investigated a unified approximation framework to compress and accelerate deep learning models, where low-rankness and structured sparsity are incorporated for model pruning. This work received the Best Student Paper Award at ICTAI 2019. He also proposed optimizing the HLS and TVM deployment strategies of DNN models on FPGAs and GPUs, in combination with a set of advanced learning and optimization methodologies (published in two DATE 2021 papers, two ICCAD 2021 papers, and an ICCV 2021 paper). These methodologies facilitate DNN deployment on resource-constrained hardware with high efficiency and performance.

SRC-2020

News: SIGDA Student Research Competition (SRC) Gold Medalists won ACM SRC Grand Finals

  • Graduate: First Place

Jiaqi Gu, University of Texas at Austin

Research Advisors: David Z. Pan and Ray T. Chen

“Light in Artificial Intelligence: Efficient Neuromorphic Computing with Optical Neural Networks” (ICCAD 2020)

Deep neural networks have received an explosion of interest for their superior performance in various intelligent tasks and their high impact on our lives. The computing capacity is in an arms race with the rapidly escalating model size and data amount for intelligent information processing. Practical application scenarios, e.g., autonomous vehicles, data centers, and edge devices, have strict energy efficiency, latency, and bandwidth constraints, raising a surging need to develop more efficient computing solutions. However, as Moore’s law is winding down, it becomes increasingly challenging for conventional electrical processors to support such massively parallel and energy-hungry artificial intelligence (AI) workloads… [Read more]

  • Undergraduate: Second Place

Chuangtao Chen, Zhejiang University

Research Advisor: Cheng Zhuo

“Optimally Approximated Floating-Point Multiplier” (ICCAD 2020)

At the edge, IoT devices are designed to consume the minimum resources needed to achieve the desired accuracy. However, conventional processors, such as CPUs or GPUs, can only conduct all computations with predetermined but sometimes unnecessary precision, inevitably degrading their energy efficiency. When running data-intensive applications, due to the large range of input operands, most conventional processors heavily rely on floating-point units (FPUs). Recently, approximate computing has become a promising alternative for improving the energy efficiency of IoT devices on the edge, especially when running inaccuracy-tolerant applications. Among the various floating-point operations in data-intensive tasks on edge devices, multiplication is both common and the most energy-consuming. As a common arithmetic component that has been studied for decades [1]–[3], the past focus of FP multiplier design has been accuracy and performance… [Read more]
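
The paper’s optimally approximated multiplier is not reproduced here, but the classic Mitchell logarithmic approximation sketched below illustrates the general idea of trading mantissa accuracy for much cheaper arithmetic: the multiplication collapses into exponent and mantissa additions at the cost of a few percent error. Zeros and denormals are not handled in this sketch.

    import numpy as np

    def mitchell_mult(a, b):
        """Approximate a*b using Mitchell's log2 approximation: log2(1+f) ~= f."""
        sign = np.sign(a) * np.sign(b)
        ma, ea = np.frexp(abs(a))                 # abs(a) = ma * 2**ea, ma in [0.5, 1)
        mb, eb = np.frexp(abs(b))
        fa, fb = 2.0 * ma - 1.0, 2.0 * mb - 1.0   # fractional mantissas in [0, 1)
        fsum, e = fa + fb, ea + eb - 2
        if fsum >= 1.0:                           # carry into the exponent
            fsum, e = fsum - 1.0, e + 1
        return sign * np.ldexp(1.0 + fsum, e)

    print(mitchell_mult(3.0, 5.0))                # ~14.0 versus the exact 15.0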


ACM Student Research Competition at ICCAD 2020 (SRC@ICCAD’20)

DEADLINE: September 28, 2020 (extended)
Online Submission: https://www.easychair.org/conferences/?conf=srciccad2020
 
Sponsored by Microsoft Research, the ACM Student Research Competition is an internationally recognized venue enabling undergraduate and graduate students who are ACM members to:

  • Experience the research world — for many undergraduates, this is a first!
  • Share research results and exchange ideas with other students, judges, and conference attendees
  • Rub shoulders with academic and industry luminaries
  • Understand the practical applications of their research
  • Perfect their communication skills
  • Receive prizes and gain recognition from ACM and the greater computing community.

The ACM Special Interest Group on Design Automation (ACM SIGDA) is organizing such an event in conjunction with the International Conference on Computer Aided Design (ICCAD). Authors of accepted submissions will get ICCAD registration fee support from SIGDA. The event consists of several rounds, as described at http://src.acm.org/ and http://www.acm.org/student-research-competition, where you can also find more details on student eligibility and timeline.



Details on abstract submission:
Research projects from all areas of design automation are encouraged. The author submitting the abstract must still be a student at the time the abstract is due. Each submission should be made on the EasyChair submission site. Please include the author’s name, affiliation, and email address; research advisor’s name; ACM student member number; category (undergraduate or graduate); research title; and an extended abstract (maximum 2 pages or 800 words) containing the following sections:

  • Problem and Motivation: This section should clearly state the problem being addressed and explain the reasons for seeking a solution to this problem.
  • Background and Related Work: This section should describe the specialized (but pertinent) background necessary to appreciate the work. Include references to the literature where appropriate, and briefly explain where your work departs from that done by others. Reference lists do not count towards the limit on the length of the abstract.
  • Approach and Uniqueness: This section should describe your approach in attacking the problem and should clearly state how your approach is novel.
  • Results and Contributions: This section should clearly show how the results of your work contribute to computer science and should explain the significance of those results. Include a separate paragraph (maximum of 100 words) for possible publication in the conference proceedings that serves as a succinct description of the project.
  • Single paper summaries (or just cut & paste versions of published papers) are inappropriate for the ACM SRC. Submissions should include at least one year’s worth of research contributions, but should not subsume an entire doctoral thesis.

All accepted submissions will be invited to present their work to the community (and a jury) as part of the program for ICCAD 2020 (details on the presentations will follow after acceptance). Note that ICCAD will take place virtually (i.e., as an online event) from November 2 to November 4, 2020.

The ACM Student Research Competition allows both graduate and undergraduate students to discuss their research with student peers, as well as academic and industry researchers, in an informal setting, while enabling them to attend ICCAD and compete with other ACM SRC winners from other computing areas in the ACM Grand Finals.


Online Submission – EasyChair:
https://www.easychair.org/conferences/?conf=srciccad2020 
Important dates:

  • Abstract submission deadline: September 28, 2020 (extended)
  • Acceptance notification: October 12, 2020
  • Poster session: November 02, 2020
  • Award winners announced at ICCAD
  • Grand Finals winners honored at ACM Awards Banquet: June 2021 (Estimated)


Requirement:
Students submitting and presenting their work at SRC@ICCAD’20 are required to be members of both ACM and ACM SIGDA.

Organizers:

Robert Wille (Johannes Kepler University Linz, Austria), robert.wille@jku.at

Meng Li (Facebook, USA), meng.li@fb.com

ISPD 2021 TOC

ISPD ’21: Proceedings of the 2021 International Symposium on Physical Design


Full Citation in the ACM Digital Library

SESSION: Session 1: Opening Session and First Keynote

Session details: Session 1: Opening Session and First Keynote

  • Jens Lienig

Physical Design for 3D Chiplets and System Integration

  • Frank J.C. Lee

Heterogeneous three-dimensional (3-D) package-level integration plays an increasingly
important role in the design of higher functional density and lower power processors
for general computing, machine learning and mobile applications. In TSMC’s 3DFabric™
platform, the back end packaging technology Chip-on-Wafer-on-Substrate (CoWoS®) with
the integration of High-Bandwidth Memory (HBM) has been successfully deployed in high
performance compute and machine learning applications to achieve high compute throughput,
while Integrated Fan-Out (InFO) packaging technology is widely used in mobile applications
thanks to its small footprint. System on Integrated Chips (SoIC™), leveraging advanced
front end Silicon process technology, offers an unprecedented bonding density for
vertical stacking.

Combining SoIC with CoWoS and InFO, the 3DFabric family of technologies provides a
versatile and flexible platform for system design innovations. A 3DFabric design starts
with system partitioning to decompose it into different functional components. In
contrast to a monolithic design approach, these functional components can potentially
be implemented in different technologies to optimize system performance, power, area,
and cost. Then these component chips are re-integrated with 3DFabric advanced packaging
technologies to form the system. There are new design challenges and opportunities
arising from 3DFabric. To unleash its full potential and accelerate the product development,
physical design solutions are developed. In this presentation, we will first review
these advanced packaging technologies trends and design challenges. Then, we will
present design solutions for 3-D chiplets and system integration.

SESSION: Session 2: Machine Learning for Physical Design (1/2)

Session details: Session 2: Machine Learning for Physical Design (1/2)

  • Jiang Hu

Reinforcement Learning for Electronic Design Automation: Successes and Opportunities

  • Matthew E. Taylor

Reinforcement learning is a machine learning technique that has been applied in many
domains, including robotics, game playing, and finance. This talk will briefly introduce
reinforcement learning with two use cases related to compiler optimization and chip
design. Interested participants will also have materials suggested to learn more,
at a technical or non-technical level, about this exciting tool.

Reinforcement Learning for Placement Optimization

  • Anna Goldie
  • Azalia Mirhoseini

In the past decade, computer systems and chips have played a key role in the success
of artificial intelligence (AI). Our vision in Google Brain’s Machine Learning for
Systems team is to use AI to transform the way in which computer systems and chips
are designed. Many core problems in systems and hardware design are combinatorial
optimization or decision making tasks with state and action spaces that are orders
of magnitude larger than that of standard AI benchmarks in robotics and games. In
this talk, we will describe some of our latest learning based approaches to tackling
such large-scale optimization problems. We will discuss our work on a new domain-transferable
reinforcement learning (RL) method for optimizing chip placement [1], a long pole
in hardware design. Our approach is capable of learning from past experience and improving
over time, resulting in more optimized placements on unseen chip blocks as the RL
agent is exposed to a larger volume of data. Our objective is to minimize power, performance,
and area. We show that, in under six hours, our method can generate placements that
are superhuman or comparable on modern accelerator chips, whereas existing baselines
require human experts in the loop and can take several weeks.

The Law of Attraction: Affinity-Aware Placement Optimization using Graph Neural Networks

  • Yi-Chen Lu
  • Sai Pentapati
  • Sung Kyu Lim

Placement is one of the most crucial problems in modern Electronic Design Automation
(EDA) flows, where the solution quality is mainly dominated by on-chip interconnects.
To achieve target closures, designers often perform multiple placement iterations
to optimize key metrics such as wirelength and timing, which is highly time-consuming
and computationally inefficient. To overcome this issue, in this paper, we present
a graph learning-based framework named PL-GNN that provides placement guidance for
commercial placers by generating cell clusters based on logical affinity and manually
defined attributes of design instances. With the clustering information as a soft
placement constraint, commercial tools will strive to place design instances in a
common group together during global and detailed placements. Experimental results
on commercial multi-core CPU designs demonstrate that our framework improves the default
placement flow of Synopsys IC Compiler II (ICC2) by 3.9% in wirelength, 2.8% in power,
and 85.7% in performance.

SESSION: Session 3: Advances in Placement

Session details: Session 3: Advances in Placement

  • Joseph Shinnerl

Advancing Placement

  • Andrew B. Kahng

Placement is central to IC physical design: it determines spatial embedding, and hence
parasitics and performance. From coarse- to fine-grain, placement is conjointly optimized
with logic, performance, clock and power distribution, routability and manufacturability.
This paper gives some personal thoughts on futures for placement research in IC physical
design. Revisiting placement as optimization prompts a new look at placement requirements,
optimization quality, and scalability with resources. Placement must also evolve to
meet a growing need for co-optimizations and for co-operation with other design steps.
“New” challenges will naturally arise from scaling, both at the end of the 2D scaling
roadmap and in the context of future 2.5D/3D/4D integrations. And, the nexus of machine
learning and placement optimization will continue to be an area of intense focus for
research and practice. In general, placement research is likely to see more flow-scale
optimization contexts, open source, benchmarking of progress toward optimality, and
attention to translations into real-world practice.

A Fast Optimal Double Row Legalization Algorithm

  • Stefan Hougardy
  • Meike Neuwohner
  • Ulrike Schorr

In Placement Legalization, it is often assumed that (almost) all standard cells possess
the same height and can therefore be aligned in cell rows, which can then be treated
independently. However, this is no longer true for recent technologies, where a substantial
number of cells of double- or even arbitrary multiple-row height is to be expected.
Due to interdependencies between the cell placements within several rows, the legalization
task becomes considerably harder. In this paper, we show how to optimize quadratic
cell movement for pairs of adjacent rows comprising cells of single- as well as double-row
height with a fixed left-to-right ordering in time O(n·log(n)), where
n denotes the number of cells involved. As opposed to prior works, we do not
artificially bound the maximum cell movement and can guarantee to find an optimum
solution. Experimental results show an average decrease of over 26% in
the total quadratic movement when compared to a legalization approach that fixes cells
of more than single-row height after Global Placement.
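
For readers unfamiliar with this class of legalizers, the sketch below implements the well-known single-row "clumping" recurrence that minimizes total quadratic movement under a fixed left-to-right order; the paper’s contribution is the harder double-row, mixed-height generalization, which is not reproduced here. The cell coordinates in the example are arbitrary.

    def clamp(v, lo, hi):
        return max(lo, min(v, hi))

    def legalize_row(targets, widths, row_left, row_right):
        """Minimize total squared displacement of cells in one row, given a fixed
        left-to-right order and no-overlap constraints (unit cell weights)."""
        # Each cluster: first cell index, cell count n, q = sum(x_i - offset_i), total width.
        clusters = []
        for i, (x, w) in enumerate(zip(targets, widths)):
            clusters.append({"first": i, "n": 1, "q": x, "width": w})
            while len(clusters) > 1:
                cur, prev = clusters[-1], clusters[-2]
                cur_pos = clamp(cur["q"] / cur["n"], row_left, row_right - cur["width"])
                prev_pos = clamp(prev["q"] / prev["n"], row_left, row_right - prev["width"])
                if prev_pos + prev["width"] <= cur_pos:
                    break                          # no overlap: keep the clusters separate
                # Overlap: absorb `cur` into `prev`; its cells shift right by prev's width.
                prev["q"] += cur["q"] - cur["n"] * prev["width"]
                prev["n"] += cur["n"]
                prev["width"] += cur["width"]
                clusters.pop()
        # Recover the legalized left edges from the final cluster positions.
        legal = [0.0] * len(targets)
        for c in clusters:
            pos = clamp(c["q"] / c["n"], row_left, row_right - c["width"])
            for j in range(c["first"], c["first"] + c["n"]):
                legal[j] = pos
                pos += widths[j]
        return legal

    # Three width-2 cells crowded near x = 5 in a row spanning [0, 20].
    print(legalize_row([5.0, 5.5, 6.0], [2, 2, 2], 0, 20))   # -> [3.5, 5.5, 7.5]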

Multiple-Layer Multiple-Patterning Aware Placement Refinement for Mixed-Cell-Height
Designs

  • Bo-Yang Chen
  • Chi-Chun Fang
  • Wai-Kei Mak
  • Ting-Chi Wang

Conventional lithography techniques are unable to achieve the resolution required
by advanced technology nodes. Multiple patterning lithography (MPL) has been introduced
as a viable solution. Besides, new standard cell structure with multiple middle-of-line
(MOL) layers is adopted to improve intra-cell routability. A mixed-cell-height standard
cell library, consisting of cells of single-row and multiple-row heights, is also
used in designs for power, performance and area concerns. As a result, it becomes
increasingly difficult to get a feasible placement for a mixed-cell-height design
where multiple cell layers require MPL. In this paper, we present a methodology to
refine a given mixed-cell-height standard cell placement for satisfying MPL requirements
on multiple cell layers as much as possible, while minimizing the total cell displacement.
We introduce the concept of uncolored cell group (UCG) to facilitate the effective
removal of coloring conflicts. By eliminating UCGs without generating any new coloring
conflict around them, the number of UCGs is effectively reduced in the local and global
refinement stages of our methodology. We report promising experimental results to
demonstrate the efficacy of our methodology.

Snap-3D: A Constrained Placement-Driven Physical Design Methodology for Face-to-Face-Bonded
3D ICs

  • Pruek Vanna-Iampikul
  • Chengjia Shao
  • Yi-Chen Lu
  • Sai Pentapati
  • Sung Kyu Lim

3D integration technology is one of the leading options that can advance Moore’s Law
beyond conventional scaling. Due to the absence of commercial 3D placers and routers,
existing 3D physical design flows rely heavily on 2D commercial tools to handle 3D
IC physical synthesis. Specifically, these flows build 2D designs first and then convert
them into 3D designs. However, several works demonstrate that design qualities degrade
during this 2D-3D transformation. In this paper, we overcome this issue with our Snap-3D,
a constraint-driven placement approach to build commercial-quality 3D ICs. Our key
idea is based on the observation that if the standard cell height is contracted by
one half and the cells are partitioned into multiple tiers, any commercial 2D placer can
place them onto the row structure and naturally achieve a high-quality 3D placement. This methodology
is shown to optimize power, performance, and area (PPA) metrics across different tiers
simultaneously and minimize the aforementioned design quality loss. Experimental results
on 7 industrial designs demonstrate that Snap-3D achieves up to 5.4% wirelength, 10.1%
power, and 92.3% total negative slack improvements compared with state-of-the-art
3D design flows.

SESSION: Session 4: Driving Research in Placement: a Retrospective

Session details: Session 4: Driving Research in Placement: a Retrospective

  • Igor Markov

Still Benchmarking After All These Years

  • Ismail S. Bustany
  • Jinwook Jung
  • Patrick H. Madden
  • Natarajan Viswanathan
  • Stephen Yang

Circuit benchmarks for VLSI physical design have been growing in size and complexity,
helping the industry tackle new problems and find new approaches. In this paper, we
take a look back at how benchmarking efforts have shaped the research community, consider
trade-offs that have been made, and speculate on what may come next.

SESSION: Session 6: Second Keynote

Session details: Session 6: Second Keynote

  • Ismail Bustany

Scalable System and Silicon Architectures to Handle the Workloads of the Post-Moore
Era

  • Ivo Bolsens

The end of Moore’s law has been proclaimed on many occasions and it’s probably safe
to say that we are now working in the post-Moore era. But no one is ready to slow
down just yet. We can view Gordon Moore’s observation on transistor densification
as just one aspect of a longer-term underlying technological trend – the Law of Accelerating
Returns articulated by Kurzweil. Arguably, companies became somewhat complacent in
the Moore era, happy to settle for the gains brought by each new process node. Although
we can expect scaling to continue, albeit at a slower pace, the end of Moore’s Law
delivers a stronger incentive to push other trends harder. Some exciting new technologies
are now emerging such as multi-chip 3D integration and the introduction of new technologies
such as storage-class memory and silicon photonics. Moreover, we are also entering
a golden age of computer architecture innovation. One of the key drivers is the pursuit
of domain-specific architectures as proclaimed by Turing award winners John Hennessy
and David Patterson. A good example is Xilinx’s AI Engine, one of the important
features of the Versal ACAP (adaptive compute acceleration platform). Today, the
explosion of AI workloads is one of the most powerful drivers shifting our attention
to find faster ways of moving data into, across, and out of accelerators. Features
such as massive parallel processing elements, the use of domain specific accelerators,
the dense interconnect between distributed on-chip memories and processing elements,
are examples of the ways chip makers are looking beyond scaling to achieve next-generation
performance gains.

SESSION: Session 7: Machine Learning for Physical Design (2/2)

Session details: Session 7: Machine Learning for Physical Design (2/2)

  • Siddhartha Nath

Learning Point Clouds in EDA

  • Wei Li
  • Guojin Chen
  • Haoyu Yang
  • Ran Chen
  • Bei Yu

The explosion of deep learning techniques has motivated developments in various
fields, including intelligent EDA algorithms from physical implementation to design
for manufacturability. A point cloud, defined as a set of data points in space, is
one of the most important data representations in deep learning since it directly
preserves the original geometric information without any discretization. However,
there are still some challenges that stifle the applications of point clouds in the
EDA field. In this paper, we first review previous works about deep learning in EDA
and point clouds in other fields. Then, we discuss some challenges of point clouds
in EDA raised by some intrinsic characteristics of point clouds. Finally, to stimulate
future research, we present several possible applications of point clouds in EDA and
demonstrate the feasibility by two case studies.

Building up End-to-end Mask Optimization Framework with Self-training

  • Bentian Jiang
  • Xiaopeng Zhang
  • Lixin Liu
  • Evangeline F.Y. Young

With the continuous shrinkage of device technology nodes, the tremendously increasing
demands for resolution enhancement technologies (RETs) have created severe concerns
over the balance between computational affordability and model accuracy. Having realized
the analogies between computational lithography tasks and deep learning-based computer
vision applications (e.g., medical image analysis), both industry and academia start
gradually migrating various RETs to deep learning-enabled platforms. In this paper,
we propose a unified self-training paradigm for building up an end-to-end mask optimization
framework from undisclosable layout patterns. Our proposed flow comprises (1) a learning-based
pattern generation stage to massively synthesize diverse and realistic layout patterns
following the distribution of the undisclosable target layouts, while keeping these
confidential layouts blind for any successive training stage, and (2) a complete self-training
stage for building up an end-to-end on-neural-network mask optimization framework
from scratch, which only requires the aforementioned generated patterns and a compact
lithography simulation model as the inputs. Quantitative results demonstrate that
our proposed flow achieves comparable state-of-the-art (SOTA) performance in terms
of both mask printability and mask correction time, while reducing the turnaround
time for flow construction by 66%.

Machine Learning Techniques in Analog Layout Automation

  • Tonmoy Dhar
  • Kishor Kunal
  • Yaguang Li
  • Yishuang Lin
  • Meghna Madhusudan
  • Jitesh Poojary
  • Arvind K. Sharma
  • Steven M. Burns
  • Ramesh Harjani
  • Jiang Hu
  • Parijat Mukherjee
  • Soner Yaldiz
  • Sachin S. Sapatnekar

The quality of layouts generated by automated analog design has traditionally not
been able to match those from human designers over a wide range of analog designs.
The ALIGN (Analog Layout, Intelligently Generated from Netlists) project [2, 3, 6]
aims to build an open-source analog layout engine [1] that overcomes these challenges,
using a variety of approaches. An important part of the toolbox is the use of machine
learning (ML) methods, combined with traditional methods, and this talk overviews
our efforts. The input to ALIGN is a SPICE-like netlist and a set of performance
specifications, and the output is a GDSII layout. ALIGN automatically recognizes hierarchies
in the input netlist. To detect variations of known blocks in the netlist, approximate
subgraph isomorphism methods based on graph convolutional networks can be used [5].
Repeated structures in a netlist are typically constrained by layout requirements
related to symmetry or matching. In [7], we use a mix of graph methods and ML to detect
symmetric and array structures, including the use of neural network based approximate
matching through the use of the notion of graph edit distances. Once the circuit is
annotated, ALIGN generates the layout, going from the lowest level cells to higher
levels of the netlist hierarchy. Based on an abstraction of the process design rules,
ALIGN builds parameterized cell layouts for each structure, accounting for the need
for common centroid layouts where necessary [11]. These cells then undergo placement
and routing that honors the geometric constraints (symmetry, common-centroid). The
chief parameter that changes during layout is the set of interconnect RC parasitics:
excessively large RCs could result in an inability to meet performance. These values
can be controlled by reducing the distance between blocks, or, in the case of R, by
using larger effective wire widths (using multiple parallel connections in FinFET
technologies where wire widths are quantized) to reduce the effective resistance.
ALIGN has developed several approaches based on ML for this purpose [4, 8, 9] that
rapidly predict whether a layout will meet the performance constraints that are imposed
at the circuit level, and these can be deployed together with conventional algorithmic
methods [10] to rapidly prune out infeasible layouts. This presentation overviews
our experience in the use of ML-based methods in conjunction with conventional algorithmic
approaches for analog design. We will show (a) results from our efforts so far,
(b) appropriate methods for mixing ML methods with traditional algorithmic techniques
for solving the larger problem of analog layout, (c) limitations of ML methods, and
(d) techniques for overcoming these limitations to deliver workable solutions for
analog layout automation.

SESSION: Session 8: Monolithic 3D and Packaging Session

Session details: Session 8: Monolithic 3D and Packaging Session

  • Bill Swartz

Advances in Carbon Nanotube Technologies: From Transistors to a RISC-V Microprocessor

  • Gage Hills

Carbon nanotube (CNT) field-effect transistors (CNFETs) promise to improve the energy
efficiency of very-large-scale integrated (VLSI) systems. However, multiple challenges
have prevented VLSI CNFET circuits from being realized, including inherent nano-scale
material defects, robust processing for yielding complementary CNFETs (i.e., CNT CMOS:
including both PMOS and NMOS CNFETs), and major CNT variations. In this talk, we summarize
techniques that we have recently developed to overcome these outstanding challenges,
enabling VLSI CNFET circuits to be experimentally realized today using standard VLSI
processing and design flows. Leveraging these techniques, we demonstrate the most
complex CNFET circuits and systems to-date, including a three-dimensional (3D) imaging
system comprising CNFETs fabricated directly on top of a silicon imager, CNT CMOS
analog and mixed-signal circuits, 1 kilobit CNFET static random-access memory (SRAM)
memory arrays, and a 16-bit RISC-V microprocessor built entirely out of CNFETs.

ML-Based Wire RC Prediction in Monolithic 3D ICs with an Application to Full-Chip
Optimization

  • Sai Surya Kiran Pentapati
  • Bon Woong Ku
  • Sung Kyu Lim

The state-of-the-art Monolithic 3D (M3D) IC design methodologies, such as Compact-2D and
Shrunk-2D, use commercial electronic design automation tools built for
2D ICs to implement a pseudo-3D design and split it into two dies that are routed
independently to create an M3D design. Therefore, an accurate estimation of 3D wire
parasitics at the pseudo-3D stage is important to achieve a well optimized M3D design.
In this paper, we present a regression model based on boosted decision tree learning
to better predict the 3D wire parasitics (RCs) at the pseudo-3D stage. Our model is
trained using individual net features as well as the full-chip design metrics using
multiple instantiations of 8 different netlists and is tested on 3 unseen netlists.
Compared to the Compact-2D flow on its own as the reference pseudo-3D, the addition of our
predictive model achieves up to 2.9× and 1.7× smaller root mean square error in the resistance
and capacitance predictions, respectively. On an unseen netlist design, we observe that our
model provides 98.6% and 94.6% RC prediction accuracy in 3D and up to 6.4× smaller total
negative slack compared to the result of the Compact-2D flow, resulting in a more
timing-robust M3D IC. This model is not limited to Compact-2D, and can be extended
to other pseudo-3D flows.
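
The paper’s features and industrial netlists are not available here, so the snippet below only sketches the modeling step with synthetic per-net features and a gradient-boosted regression tree from scikit-learn; the feature names, value ranges, and the toy resistance formula are invented for illustration.

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)

    # Synthetic stand-ins for per-net features available at the pseudo-3D stage
    # (e.g., a wirelength estimate, pin count, layer index).
    n = 5000
    features = np.column_stack([
        rng.uniform(1, 500, n),     # half-perimeter wirelength estimate (um)
        rng.integers(2, 30, n),     # pin count
        rng.integers(1, 8, n),      # routing layer index
    ])
    # Toy "ground truth" 3D wire resistance with noise, for illustration only.
    resistance = 0.3 * features[:, 0] + 2.0 * features[:, 1] + rng.normal(0, 5, n)

    X_tr, X_te, y_tr, y_te = train_test_split(features, resistance, random_state=0)
    model = GradientBoostingRegressor(n_estimators=200, max_depth=4, learning_rate=0.05)
    model.fit(X_tr, y_tr)

    rmse = np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2))
    print(f"RMSE on held-out nets: {rmse:.2f}")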

Machine Learning-Enabled High-Frequency Low-Power Digital Design Implementation At
Advanced Process Nodes

  • Siddhartha Nath
  • Vishal Khandelwal

Relentless pursuit of high-frequency low-power designs at advanced nodes necessitates
achieving signoff-quality timing and power during digital implementation to minimize
any over-design. With growing design sizes (1–10M instances), full flow runtime is
an equally important metric and commercial implementation tools use graph-based timing
analysis (GBA) to gain runtime over path-based timing analysis (PBA), at the cost
of pessimism in timing. Last mile timing and power closure is then achieved through
expensive PBA-driven engineering change order (ECO) loops in the signoff stage. In this
work, we explore “on-the-fly” machine learning (ML) models to predict PBA timing
based on GBA features, to drive digital implementation flow. Our ML model reduces
the GBA vs. PBA pessimism with minimal runtime overhead, resulting in improved area/power
without compromising on signoff timing closure. Experimental results obtained by integrating
our technique in a commercial digital implementation tool show improvement of up to
0.92% in area, 11.7% and 1.16% in power in leakage- and total power-centric designs,
respectively. Our method has a runtime overhead of ~3% across a suite of 5–16nm
industrial designs.

A Fast Power Network Optimization Algorithm for Improving Dynamic IR-drop

  • Jai-Ming Lin
  • Yang-Tai Kung
  • Zheng-Yu Huang
  • I-Ru Chen

As the power consumption of electronic equipment varies more severely, the device
voltages in a modern design may fluctuate violently as well. Consideration of dynamic
IR-drop becomes indispensable to current power network design. Since solving voltage
violations according to all power consumption files in all time slots is impractical
in reality, this paper applies a clustering based approach to find representative
power consumption files and shows that most IR-drop violations can be repaired if
we repair the power network according to these files. In order to further reduce runtime,
we also propose an efficient and effective power network optimization approach. Compared
to the intuitive approach which repairs a power network file by file, our approach
alternates between different power consumption files and always repairs the file which
has the worst IR-drop violation region that involves more power consumption files
in each iteration. Since many violations can be resolved at the same time, this method
is much faster than the iterative approach. The experimental results show that the
proposed algorithm can not only eliminate voltage violations efficiently but also
construct a power network with fewer routing resources.
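
A minimal sketch of the representative-selection idea described above (not the authors’ clustering method or data): per-time-slot power maps are clustered with k-means, and the member closest to each centroid is kept as the representative power consumption file to repair against. All values below are synthetic.

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(1)

    # Stand-in data: 200 time slots, each a flattened 16x16 grid of block power values.
    power_maps = rng.gamma(shape=2.0, scale=0.5, size=(200, 16 * 16))

    k = 5   # number of representative power consumption files to keep
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(power_maps)

    # For each cluster, keep the real time slot closest to the centroid.
    representatives = []
    for c in range(k):
        members = np.flatnonzero(km.labels_ == c)
        dists = np.linalg.norm(power_maps[members] - km.cluster_centers_[c], axis=1)
        representatives.append(int(members[np.argmin(dists)]))
    print("repair the power grid against time slots:", sorted(representatives))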

SESSION: Session 9: Brains, Computers and EDA

Session details: Session 9: Brains, Computers and EDA

  • Patrick Groeneveld

A Lifetime of ICs, and Cross-field Exploration: ISPD 2021 Lifetime Achievement Award Bio

  • Louis K. Scheffer

The 2021 International Symposium on Physical Design lifetime achievement award goes
to Dr. Louis K. Scheffer for his outstanding contributions to the field. This autobiography
in Lou’s own words provides a glimpse of what has happened through his career.

The Physical Design of Biological Systems – Insights from the Fly Brain

  • Louis K. Scheffer

Many different physical substrates can support complex computation. This is particularly
apparent when considering human made and biological systems that perform similar functions,
such as visually guided navigation. In common, however, is the need for good physical
design, as such designs are smaller, faster, lighter, and lower power, factors in
both the jungle and the marketplace. Although the physical design of man-made systems
is relatively well understood, the physical design of biological computation has remained
murky due to a lack of detailed information on their construction. The recent EM (electron
microscope) reconstruction of the central brain of the fruit fly now allows us to
start to examine these issues. Here we look at the physical design of the fly brain,
including such factors as fan-in and fanout, logic depth, division into physical compartments
and how this affects electrical response, pin to computation ratios (Rent’s rule),
and other physical characteristics of at least one biological computation substrate.
From this we speculate on how physical design algorithms might change if the target
implementation was a biological neural network.

Of Brains and Computers

  • Jan M. Rabaey

The human brain – which we consider to be the prototypal biological computer – in
its current incarnation is the result of more than a billion years of evolution. Its
main functions have always been to regulate the internal milieu and to help the organism/being
to survive and reproduce. With growing complexity, the brain has adapted a number
of design principles that serve to maximize its efficiency in performing a broad range
of tasks. The physical computer, on the other hand, had only 200 years or so to evolve,
and its perceived function was considerably different and far more constrained – that
is, to solve a set of mathematical functions. This, however, is rapidly changing. One
may argue that the functions of brains and computers are converging. If so, the question
arises whether the underlying design principles will converge or cross-breed as well,
or whether the different underlying mechanisms (physics versus biology) will lead to radically
different solutions.

EDA and Quantum Computing: The key role of Quantum Circuits

  • Leon Stok

Quantum computing (QC) is fast emerging as a potential disruptive technology that
can upend some businesses in the short-term and many enterprises in the long run.
Electronic Design Automation (EDA) is uniquely positioned to not only benefit from
quantum computing technologies but can also impact the pace of development of that
technology. Quantum circuits will play a key role in driving the synergy between quantum
and EDA. Much like standard cell libraries became the most important abstraction between
CMOS technology and most EDA tooling and spawned four decades of EDA innovation and
designer productivity, quantum circuits can unleash a similar streak of innovation
in quantum computing.

SESSION: Session 11: Third Keynote

Session details: Session 11: Third Keynote

  • Iris Hui-Ru Jiang

Physical Verification at Advanced Technology Nodes and the Road Ahead

  • Juan C. Rey

In spite of “doomsday” expectations, Moore’s Law is alive and well. Semiconductor
manufacturing and design companies, as well as the Electronic Design Automation (EDA)
industry have been pushing ahead to bring more functionality to satisfy more aggressive
space/power/performance requirements. Physical verification occupies a unique space
in the ecosystem as one of the key bridges between design and manufacturing. As such,
the traditional space of design rule checking (DRC) and layout versus schematic (LVS)
has expanded into electrical verification and yield-enabling technologies such as
optical proximity correction, critical area analysis, multi-patterning decomposition
and automated filling. To achieve the expected accuracy and performance demanded by
the design and manufacturing community, it is necessary to consider the physical effects
of the manufacturing processes and electronic devices and to use the most advanced
software engineering technology and computational capabilities.

SESSION: Session 12: Physical Design at Advanced Technology Nodes

Session details: Session 12: Physical Design at Advanced Technology Nodes

  • Magna Mankalale

Hardware Security for and beyond CMOS Technology

  • Johann Knechtel

As with most aspects of electronic systems and integrated circuits, hardware security
has traditionally evolved around the dominant CMOS technology. However, with the rise
of various emerging technologies, whose main purpose is to overcome the fundamental
limitations for scaling and power consumption of CMOS technology, unique opportunities
arise to advance the notion of hardware security. In this paper, I first provide an
overview on hardware security in general. Next, I review selected emerging technologies,
namely (i) spintronics, (ii) memristors, (iii) carbon nanotubes and related transistors,
(iv) nanowires and related transistors, and (v) 3D and 2.5D integration. I then discuss
their application to advance hardware security and also outline related challenges.

Physical Design Challenges and Solutions for Emerging Heterogeneous 3D Integration
Technologies

  • Lingjun Zhu
  • Sung Kyu Lim

The emerging heterogeneous 3D integration technologies provide a promising solution
to improve the performance of electronic systems in the post-Moore era, but the lack
of design automation solutions and the challenges in physical design are hindering
the applications of these technologies. In this paper, we discuss multiple types and
levels of heterogeneous integration enabled by the high-density 3D technologies. We
investigate each physical implementation stage from technology setup to placement
and routing, and identify the design challenges posed by heterogeneous 3D integration.
This paper provides a comprehensive survey on the state-of-the-art physical design
methodologies to address these challenges.

A Scalable and Robust Hierarchical Floorplanning to Enable 24-hour Prototyping for
100k-LUT FPGAs

  • Ganesh Gore
  • Xifan Tang
  • Pierre-Emmanuel Gaillardon

Physical design for Field Programmable Gate Arrays (FPGAs) is challenging and time-consuming,
primarily due to the use of a full-custom approach to aggressively optimize the Performance,
Power, and Area (PPA) of the FPGA design. The growing number of FPGA applications
demands novel architectures and shorter development cycles. The use of an automated
toolchain is essential to reduce end-to-end development time. This paper presents
scalable and adaptive hierarchical floorplanning strategies to significantly reduce
the physical design runtime and enable millions-of-LUT FPGA layout implementations
using standard ASIC toolchains. This approach mainly exploits the regularity of the
design and performs necessary feedthrough creations for global and clock nets to eliminate
any requirement of global optimizations. To validate this approach, we implemented
full-chip layouts for modern FPGA fabric with logic capacity ranging from 40 to 100k
LUTs using a commercial 12nm technology. Our results show that the physical implementation
of a 128k-LUT FPGA fabric can be achieved within 24-hours, which has not been demonstrated
by any previous work. Compared to previous work, a runtime reduction of 8x is obtained
for implementing a 2.5k-LUT FPGA device.

SESSION: Session 13: Contest and Results

Session details: Session 13: Contest and Results

  • Gracieli Posser

ISPD 2021 Wafer-Scale Physics Modeling Contest: A New Frontier for Partitioning, Placement and Routing

  • Patrick Groeneveld
  • Michael James
  • Vladimir Kibardin
  • Ilya Sharapov
  • Marvin Tom
  • Leo Wang

Solving 3-D partial differential equations in a Finite Element model is computationally
intensive and requires extremely high memory and communication bandwidth. This paper
describes a novel way where the Finite Element mesh points of varying resolution are
mapped on a large 2-D homogenous array of processors. Cerebras developed a novel supercomputer
that is powered by a 21.5cm by 21.5cm Wafer-Scale Engine (WSE) with 850,000 programmable
compute cores. With 2.6 trillion transistors in a 7nm process this is by far the largest
chip in the world. It is structured as a regular array of 800 by 1060 identical processing
elements, each with its own local fast SRAM memory and direct high bandwidth connection
to its neighboring cores. For the 2021 ISPD competition we propose a challenge to
optimize placement of computational physics problems to achieve the highest possible
performance on the Cerebras supercomputer. The objectives are to maximize performance
and accuracy by optimizing the mapping of the problem to cores in the system. This
involves partitioning and placement algorithms.