ISPD ’25 TOC
Proceedings of the 2025 International Symposium on Physical Design
Full Citation in the ACM Digital Library
SESSION: Session 1: Opening Session and First Keynote
Towards Designing and Deploying Ising Machines
- Sachin S. Sapatnekar
Today, NP-complete or NP-hard combinatorial problems are often solved on classical computers, using heuristics with no optimality guarantees or approximation algorithms with loose optimality bounds. Ising computation provides a new paradigm for solving these problems using networks of coupled oscillators. In contrast with traditional Ising machines that use supercooled chips, recent approaches have proposed the use of coupled CMOS ring oscillators, reducing the power dissipation of these systems by several orders of magnitude. This talk will overview the Ising model, discuss the challenges of building CMOS Ising machines, including issues related to layout and timing, and point to directions that are helping deploy these methods to solve ever-larger combinatorial problems.
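For reference, the standard Ising formulation that such machines minimize over binary spins (a textbook form, not specific to this talk) is

$$ \min_{s \in \{-1,+1\}^n} \; H(s) \;=\; -\sum_{i<j} J_{ij}\, s_i s_j \;-\; \sum_i h_i s_i $$

A combinatorial instance is encoded in the couplings $J_{ij}$ and fields $h_i$; for example, MAX-CUT on a weighted graph maps directly by setting $J_{ij} = -w_{ij}$ and $h_i = 0$, so that minimizing $H$ maximizes the cut weight.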
SESSION: Session 2: Placement and DTCO
GOALPlace: Begin with the End in Mind
- Anthony Agnesina
- Rongjian Liang
- Geraldo Pradipta
- Anand Rajaram
- Haoxing Ren
Co-optimizing placement with congestion is integral to achieving high-quality designs. This paper presents GOALPlace, a learning-based approach to improving placement congestion by controlling cell density. It efficiently learns from an EDA tool’s post-route optimized results and uses an empirical Bayes technique to adapt the target to a specific placer’s solutions, effectively beginning with the end in mind. Our method enhances correlation with the tool’s router and timing-opt engine, while solving placement globally without expensive incremental congestion estimation and mitigation methods. A statistical analysis with hierarchical netlist clustering establishes the importance of density and the potential for an adequate cell density target across placements. Our experiments show that our method, when integrated into an academic GPU-accelerated global placer, consistently produces macro and standard cell placements that match or exceed the quality of commercial tools. Our empirical Bayes methodology also shows a substantial quality improvement over leading academic mixed-size placers, achieving up to 10× fewer design rule check (DRC) violations, a 5% decrease in wirelength, and a 30% and 60% reduction in worst and total negative slack (WNS/TNS).
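As a rough illustration of the empirical Bayes idea only (the abstract does not spell out GOALPlace's actual formulation), a per-cluster density target can be shrunk toward a pooled prior in proportion to how noisy the local estimate is; the function and weighting below are purely illustrative assumptions:

```python
import numpy as np

def eb_density_target(cluster_density, cluster_var, prior_mean=None):
    """Illustrative empirical-Bayes shrinkage of per-cluster density targets.

    cluster_density: observed post-route cell densities per netlist cluster
    cluster_var:     per-cluster variance of those observations
    This sketches generic James-Stein-style shrinkage, not GOALPlace itself.
    """
    cluster_density = np.asarray(cluster_density, dtype=float)
    cluster_var = np.asarray(cluster_var, dtype=float)
    if prior_mean is None:
        prior_mean = cluster_density.mean()          # pooled prior over all clusters
    prior_var = cluster_density.var() + 1e-12        # between-cluster spread
    shrink = prior_var / (prior_var + cluster_var)   # weight on the local estimate
    return shrink * cluster_density + (1.0 - shrink) * prior_mean
```

Noisy clusters (large cluster_var) are pulled toward the pooled target, while well-observed clusters keep a target close to what the post-route data suggests.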
Invited: Scaling Standard Cell Layout Using Track Height Compression and Design Technology Co-optimization
- Chung-Kuan Cheng
- Byeonggon Kang
- Bill Lin
- Yucheng Wang
Moore’s law scaling is approaching physical limits, as indicated by the technology roadmap. Recent standard cell layout reductions rely on track height compression, which increases pin density and routing congestion. To address these challenges, design technology co-optimization (DTCO) was introduced. This paper explores how much track height can be compressed and how DTCO features can sustain layout scaling. To support this exploration, we developed an SMT-based cell synthesis tool that integrates gear ratio, M1 metal grid offset, local-interconnect source-drain (LISD) merging, adjustable gate cut lengths, double-height architecture with pass-throughs, and various power delivery options.
In our exploration, we compress the number of horizontal tracks from four to two. Our synthesis tool enables flexible gear ratio options through a graph-based data structure that allocates vertical routing resources at a smaller pitch than the contacted poly pitch. For the double-height architecture, we allow pass-through options to alleviate routing congestion. The empirical results identify critical design strategies to meet scaling demands and overcome pin density challenges. Overall, the study demonstrates design options that enable continued layout scaling in the near future.
Invited: Physical Design Challenges for Design Technology Co-optimization
- Taewhan Kim
Design technology co-optimization (DTCO) is the process of optimizing design and process technology together to enhance performance, power efficiency, chip utilization, and manufacturing cost/yield. Through DTCO, we are able to evaluate technologies, design rules, and cell architectures using block-level PPA (performance, power, area) analysis, which greatly helps semiconductor fabs reduce cost and shorten time-to-market in advanced process development with substantial architectural innovation.
The parameters that DTCO targets to evaluate include design rules (e.g., gate poly pitch, M1 pitch, side-to-top spacing rule, via-enclosure rule), cell architectures (e.g., single-row or multi-row-height, 2D or 1D M1, preference of metal direction), and technologies (e.g., Fin-FET, Nanosheet-FET, Complementary-FET), which are collectively called DTCO parameters.
LEGALM: Efficient Legalization for Mixed-Cell-Height Circuits with Linearized Augmented Lagrangian Method
- Jing Mai
- Chunyuan Zhao
- Zuodong Zhang
- Zhixiong Di
- Yibo Lin
- Runsheng Wang
- Ru Huang
Advanced technologies increasingly adopt mixed-cell-height circuits due to their superior power efficiency, compact area usage, enhanced routability, and improved performance. However, the complex constraints of modern circuit design, including routing challenges and fence region constraints, increase the difficulty of mixed-cell-height legalization. In this paper, we introduce LEGALM, a state-of-the-art mixed-cell-height legalizer that can address routability and fence region constraints more efficiently. We propose an augmented Lagrangian formulation coupled with a block gradient descent method that offers a novel analytical perspective on the mixed-cell-height legalization problem. To further enhance efficiency, we develop a series of GPU-accelerated kernels and a triplefold partitioning technique with minor quality overhead. Experimental results on ICCAD-2017 and modified ISPD-2015 benchmarks show that our approach significantly outperforms current state-of-the-art legalization algorithms in both quality and efficiency.
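For context, a generic augmented Lagrangian treatment of a constrained legalization-style problem takes the form below; the exact objective, constraints, and linearization used by LEGALM are not reproduced here:

$$ \mathcal{L}_\rho(x, \lambda) \;=\; f(x) \;+\; \sum_k \lambda_k\, g_k(x) \;+\; \frac{\rho}{2} \sum_k \max\!\big(g_k(x),\, 0\big)^2 $$

where $f(x)$ measures displacement from the global-placement solution, $g_k(x) \le 0$ encode overlap, row-alignment, and fence-region feasibility, and the multipliers $\lambda_k$ and penalty $\rho$ are updated between block gradient descent sweeps over the cell coordinates $x$.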
SESSION: Session 3: Acceleration
Cypress: VLSI-Inspired PCB Placement with GPU Acceleration
- Niansong Zhang
- Anthony Agnesina
- Noor Shbat
- Yuval Leader
- Zhiru Zhang
- Haoxing Ren
The scale of printed circuit board (PCB) designs has increased significantly, with modern commercial designs featuring more than 10,000 components. However, the placement process heavily relies on manual efforts that take weeks to complete, highlighting the need for automated PCB placement methods. The challenges of PCB placement arise from its flexible design space and limited routing resources. Existing automated PCB placement tools have achieved limited success in quality and scalability. In contrast, very large-scale integration (VLSI) placement methods have proven to scale to designs with millions of cells while delivering high-quality results. Therefore, we propose Cypress, a scalable, GPU-accelerated PCB placement method inspired by VLSI. It incorporates tailored cost functions, constraint handling, and optimization techniques adapted to PCB layouts. In addition, there is an increasing demand for realistic and open-source benchmarks to (1) enable meaningful comparisons between tools and (2) establish performance baselines to track progress in PCB placement technology. To address this gap, we present a PCB benchmark suite synthesized from real commercial designs. We evaluate our method against state-of-the-art commercial and academic PCB placement tools on this benchmark suite. Our approach demonstrates 1-5.9X higher routability on the proposed benchmarks. For fully routed designs, Cypress achieves 1-19.7X shorter routed track lengths. With GPU acceleration, Cypress delivers up to a 492.3X speedup in runtime. Finally, we demonstrate scalability to real commercial designs, a capability unmatched by existing tools.
GPU-Accelerated Inverse Lithography Towards High Quality Curvy Mask Generation
- Haoyu Yang
- Haoxing Ren
Inverse Lithography Technology (ILT) has emerged as a promising solution for photo mask design and optimization. Relying on multi-beam mask writers, ILT enables the creation of free-form curvilinear mask shapes that enhance printed wafer image quality and process window. However, a major challenge in implementing curvilinear ILT for large-scale production is mask rule checking, an area currently under development by foundries and EDA vendors. Although recent research has incorporated mask complexity into the optimization process, much of it focuses on reducing e-beam shots, which does not align with the goals of curvilinear ILT. In this paper, we introduce a GPU-accelerated ILT algorithm that improves not only contour quality and process window but also the precision of curvilinear mask shapes. Our experiments on open benchmarks demonstrate a significant advantage of our algorithm over leading academic ILT engines. Source code will be available at https://github.com/phdyang007/curvyILT.
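A toy gradient-descent ILT step, with a Gaussian blur standing in for the real optical model, illustrates how a relaxed curvilinear mask can be optimized end to end; this is only a sketch of the general technique under simplifying assumptions, not the paper's algorithm:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ilt_step(P, Zt, sigma=4.0, a=25.0, b=8.0, thr=0.5, lr=1.0):
    """One gradient step of a toy inverse-lithography loop.

    P  : continuous mask parameters (2-D array); mask = sigmoid(b*P)
    Zt : target wafer pattern in {0, 1}
    The Gaussian blur is a stand-in for a real Hopkins/SOCS optical model;
    the point is only to show the gradient flow through the relaxed mask.
    """
    M = sigmoid(b * P)                     # relaxed (curvy) mask
    I = gaussian_filter(M, sigma)          # aerial-image proxy
    Z = sigmoid(a * (I - thr))             # resist threshold model
    dI = 2.0 * (Z - Zt) * a * Z * (1 - Z)  # dLoss/dI through the resist sigmoid
    dM = gaussian_filter(dI, sigma)        # adjoint of the symmetric blur
    dP = dM * b * M * (1 - M)              # back through the mask sigmoid
    return P - lr * dP, np.sum((Z - Zt) ** 2)
```

Iterating this step drives the printed contour toward the target while the sigmoid relaxation keeps the mask shapes smooth and free-form.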
Invited: Trailblazing the Future: Innovative Chip Design in the Era of Pervasive AI
- Sudipto Kundu
Few engineering challenges are as complex and arduous as chip design, which typically requires multiple teams of experts and months of dedicated and diligent PPA (Performance, Power and Area) exploration work to achieve the desired goal. In the era of pervasive AI, chip design automation tools are undergoing a seismic shift, emerging as powerful, intelligent AI agents that orchestrate decision-making at every step of the chip design flow by leveraging multiple compute resources to explore different PPA strategies. This radical shift demands an AI ecosystem that connects data storage, compute resources, real-time data analytics, and efficient search-space navigation technologies. In this talk, we will explore how physical design implementation tools are embracing AI agents to drive PPA exploration using novel reinforcement learning techniques powered by deep insights from design and flow-execution experience of prior runs. The core of such a system is built on a continuous-learning paradigm so that, with each successive iteration of the same design, the system learns, adapts, and delivers consistent PPA improvement.
SESSION: Session 4: Panel on Heterogenous Integration
Invited: Automatic Die-to-Die Routing with Shielding
- Sheng-Yu Hsiao
- Yu-Yueh Chang
- Jeong-Tying Li
Die-to-die routing is a specialized layout problem in 3DIC advanced packaging, where routing is the primary productivity bottleneck. The requirements of high wiring density, high shielding rate, complex via stacking, and strict design rules make it challenging to produce good routing results. In this paper, we describe our approaches to assign signal nets to the various routing layers, insert VSS wires to shield signal nets as much as possible, and achieve a high completion rate without design rule violations.
Invited: Streamlining and Automating Routing of Multi-Chiplet Technologies
- Ksenia Roze
As designers transition to heterogeneously integrated multi-chiplet architectures, one of the largest bottlenecks they face is die-to-die (D2D) and die-to-substrate (D2S) signal routing. In some cases, routing can consume up to three-quarters of the total design cycle. A next-generation routing solution is required to reduce this bottleneck. To that end, Cadence has been working on new strategies and solutions for automated routing of multi-chiplet technologies. This work led to the development of the industry’s first automated constraint-driven router that can meet the complex D2D and D2S routing requirements, including shielding and teardrop insertion.
This presentation will outline the routing strategies deployed to minimize signal delay, improve yield, and ensure that the design meets all electrical and mechanical requirements. Advanced substrate routing automation involves breaking down the routing process into smaller, more controllable tasks such as pin escapes, breakout definitions, layer estimation, through-hole placement, detail routing, and multi-layer congestion analysis. Electrical constraints like delay and minimum length must also be considered while routing the signals. D2D routing involves developing a compact model with shielding to ensure maximum signal integrity and reliability. This often requires multiple channels for a given D2D region, and global planning of each channel space is necessary to optimize the routing design. Good routing ordering and escape definition achieve uniform routing across different channels.
This talk will present how, by strategically approaching each routing stage and defining topologies based on the complexity of the design, a significant amount of the routing workload can be automated. Our new advanced auto router enables a fully automated routing solution for multi-chiplet technologies, resulting in significant productivity gains and bringing a new level of innovation to solving the most complex D2D and D2S routing challenges.
SESSION: Session 5: Emerging Technologies
ML-QLS: Multilevel Quantum Layout Synthesis
- Wan-Hsuan Lin
- Jason Cong
Quantum Layout Synthesis (QLS) plays a crucial role in optimizing quantum circuit execution on physical quantum devices. As we enter the era where quantum computers have hundreds of qubits, optimal QLS tools face scalability issues, while heuristic methods suffer a significant optimality gap due to the lack of global optimization. To address these challenges, we introduce a multilevel framework, an effective methodology for solving large-scale problems in VLSI design. In this paper, we present ML-QLS, the first multilevel quantum layout tool with a scalable refinement operation integrated with novel cost functions and clustering strategies. Our clustering provides valuable insights into generating a proper problem approximation for quantum circuits and devices. The experimental results demonstrate that ML-QLS can scale up to problems involving hundreds of qubits and achieve a remarkable 69% performance improvement over leading heuristic QLS tools for large circuits, which underscores the effectiveness of multilevel frameworks in quantum applications.
LiDAR: Automated Curvy Waveguide Detailed Routing for Large-Scale Photonic Integrated Circuits
- Hongjian Zhou
- Keren Zhu
- Jiaqi Gu
As photonic integrated circuit (PIC) designs advance and grow in complexity, largely driven by innovations in photonic computing and interconnects, traditional manual physical design processes have become increasingly cumbersome. Available PIC layout automation tools are mostly schematic-driven, which has not alleviated the burden of manual waveguide planning and layout drawing for engineers. Previous research in automated PIC routing largely relies on off-the-shelf algorithms designed for electrical circuits, which only support high-level route planning to minimize waveguide crossings. These algorithms are not customized to handle photonics-specific routing constraints and metrics, such as curvy waveguides, bending, port alignment, and insertion loss. They struggle with large-scale PICs and cannot produce real layout geometries without design-rule violations (DRVs). This highlights the pressing need for electronic-photonic design automation (EPDA) tools that can streamline the physical design of modern PICs. In this paper, for the first time, we propose an open-source automated PIC detailed routing tool, dubbed LiDAR, to generate DRV-free PIC layouts for large-scale real-world PICs. LiDAR features a grid-based curvy-aware A* engine with adaptive crossing insertion, congestion-aware net ordering and objectives, and a crossing-waveguide optimization scheme, all tailored to the unique properties of PICs. On large-scale real-world photonic computing cores and interconnects, LiDAR generates DRV-free layouts with 14% lower insertion loss and a 6.25x speedup over prior methods, paving the way for future advancements in the EPDA toolchain. Our codes are open-sourced at https://github.com/ScopeX-ASU/LiDAR.
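The sketch below is a minimal grid A* with a bend penalty, a simplified stand-in for the curvy-aware engine described above (no real curvature, crossing insertion, or insertion-loss modeling); all names and costs are illustrative assumptions:

```python
import heapq
from itertools import count

def route_waveguide(width, height, blocked, src, dst, bend_cost=4):
    """Toy grid A* with a bend penalty as a crude curvature proxy."""
    def h(p):                                   # admissible Manhattan bound
        return abs(p[0] - dst[0]) + abs(p[1] - dst[1])

    tie = count()                               # tie-breaker for the heap
    best = {(src, None): 0}                     # state = (cell, incoming direction)
    pq = [(h(src), 0, next(tie), (src, None), [src])]
    while pq:
        _, g, _, (cell, indir), path = heapq.heappop(pq)
        if cell == dst:
            return path
        if g > best[(cell, indir)]:
            continue                            # stale queue entry
        for d in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cell[0] + d[0], cell[1] + d[1])
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height) or nxt in blocked:
                continue
            # straight moves cost 1; turning adds a bend penalty
            ng = g + 1 + (bend_cost if indir is not None and d != indir else 0)
            if ng < best.get((nxt, d), float("inf")):
                best[(nxt, d)] = ng
                heapq.heappush(pq, (ng + h(nxt), ng, next(tie), (nxt, d), path + [nxt]))
    return None                                 # unroutable on this grid
```

Keying the search state on (cell, incoming direction) is what lets bend penalties accumulate correctly; a production router would replace the penalty with real bend geometries, crossing costs, and loss models.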
Invited: Physical Design for Systolic Array-Based Integrated Circuits
- Jiang Hu
Systolic arrays have become a popular hardware architecture for machine learning computing, which is a key driver for the growth of the semiconductor industry. Unlike many other circuits, systolic arrays exhibit distinct 2D regularity, which holds significant potential for improving physical design quality. However, this regularity is largely overlooked in existing physical design methodologies.
Recent studies have demonstrated that leveraging this regularity can significantly enhance placement quality for FPGAs [1,3], cell placement [2], and mixed-size placement [4]. For instance, utilizing the regularity has resulted in over 20% wirelength reduction in FPGA placement compared to an industrial tool and a remarkable 53% wirelength reduction in mixed-size placement compared to a commercial tool.
Despite these advantages, exploiting the regularity is not as simple as duplicating the schematic. FPGA DSP architectures, typically column-based, often fail to align with the 2D regularity of systolic arrays. Additionally, in cell and mixed-size placement, the placement of IO and control logic can disrupt this regularity. This invited talk will highlight techniques to address these challenges. Beyond placement, the discussion will extend to other opportunities in physical design for systolic arrays, including routing, routability prediction, clock network synthesis, and lithographic hot-spot prediction.
SESSION: Session 6: Reliability
Photonic Side-Channel Analyzer: Enabling Security-Aware Physical Design Methodology
- Meizhi Wang
- Yi-Ru Chen
- S. S. Teja Nibhanupudi
- Elham Amini
- Antonio Saavedra
- Yinan Wang
- Daniel Wasserman
- Jean-Pierre Seifert
- Jaydeep P. Kulkarni
Photon Emission (PE) from Integrated Circuits (ICs) is an emerging non-invasive side channel that poses a serious security risk to modern System-on-Chips (SoCs). These emissions, generated during transistor switching, are determined by circuit operations and can be exploited in Side-Channel Analysis (SCA). Furthermore, physical design choices, such as standard cell placement and routing, affect how these emissions propagate and are detected. This makes it crucial to assess and mitigate such risks during the design phase. This paper presents a novel photonic side-channel analysis framework that integrates directly into the physical design flow. The framework enables designers to assess security vulnerabilities in digital ASIC designs by generating both time-resolved and accumulated PE maps at the standard-cell gate level. These PE maps can be applied to various side-channel analysis methods to identify vulnerable regions in the circuit.
We demonstrate the framework by applying it to a 40nm 128-bit Advanced Encryption Standard (AES) core, where we employ localized Correlation PE Attacks (CPEA) on simulated time-resolved PE maps. This approach pinpoints regions with high side-channel leakage. The results showcase the framework’s effectiveness in providing early detection and allow designers to enhance the overall security of the design against PE-related vulnerabilities. To validate our simulation framework, we compared the simulated accumulated PE maps with real-world measurements from a 40nm AES test chip. The close alignment between simulated and measured data confirms the accuracy of our simulator in predicting photon emission behavior across the chip.
Multi-Stage CSM Timing Waveform Propagation Accelerated by NLDM Assistance
- Shih-Kai Lee
- Pei-Yu Lee
- Iris Hui-Ru Jiang
Static timing analysis (STA) is essential for timing closure. To address the complicated effects emerging at advanced technology nodes, the Current Source Model (CSM) has been developed to compute timing waveforms for timing propagation. Compared with the Non-Linear Delay Model (NLDM), CSM provides superior accuracy but suffers from efficiency and scalability issues. In this paper, we propose a multi-stage CSM timing propagation framework with three NLDM-assisted acceleration techniques. Our acceleration techniques are general and compatible with any CSM-based STA engine. Experimental results demonstrate their effectiveness: compared with CSM-based analysis, we achieve a 4× speedup with only 0.4% accuracy loss.
SESSION: Session 7: Retrospective and Prospective of Physical Design
Invited: The Future of Functional ECO Automation and Logical Equivalence Checking for Advanced Digital Design Flows
- Zhuo Li
- David Stratman
Logical equivalence checking (LEC, or EC) is critical to design implementation and for decades has allowed cost-efficient RTL-level functional testing to be the dominant type of verification done on a project: test once, then formally prove that the subsequent stages of the implementation flow are 100% logically equivalent. But over the last 10-15 years, SoCs have grown 100X in complexity, creating new challenges.
Concurrently, the use of functional ECOs to shortcut long design implementation cycles has skyrocketed. While automated approaches have greatly improved the ECO process quality and accelerated this overall trend, the setup can be challenging for inexperienced designers.
In order to significantly speed up and simplify EC and improve the entire functional ECO process, we require a new approach to both flows. This talk will highlight some of Cadence’s recent breakthrough research in this space, including the use of AI and ML to improve single-run results and multiply designer productivity while gathering insights and leveraging learnings across the duration of a project.
Invited: Toward an ML EDA Commons: Establishing Standards, Accessibility, and Reproducibility in ML-driven EDA Research
- Vidya A. Chhabria
- Jiang Hu
- Andrew B. Kahng
- Sachin S. Sapatnekar
Machine learning (ML) is transforming electronic design automation (EDA), offering innovative solutions for designing and optimizing integrated circuits (ICs). However, the field faces significant challenges in standardization, accessibility, and reproducibility, limiting the impact of ML-driven EDA (ML EDA) research. To address these barriers, this paper presents a vision for an ML EDA Commons, a collaborative open ecosystem designed to unify the community and drive progress through establishing standards, shared resources, and stakeholder-based governance. The ML EDA Commons focuses on three objectives: (1) Maturing existing EDA infrastructure to support ML EDA research; (2) Establishing standards for benchmarks, metrics, and data quality and formats for consistent evaluation via governance that includes key stakeholders; and (3) Improving accessibility and reproducibility by providing open datasets, tools, models, and workflows with cloud computing resources, to lower barriers to ML EDA research and promote robust research practices via artifact evaluations, canonical evaluators, and integration pipelines. Inspired by successes of ML and MLCommons, the ML EDA Commons aims to catalyze transparency and sustainability in ML EDA research.
Invited: Mapping Two Decades of Innovation: Lessons from 25 Years of ISPD Research
- Gona Rahmaniani
- Matthew Guthaus
- Laleh Behjat
The design automation research community has driven the evolution of integrated circuits from a handful of transistors in the 1960s to billions today. The International Symposium on Physical Design (ISPD) has been instrumental in tackling challenges like scaling complexities, hardware security, and the exponential growth in transistor counts. This study conducts a comprehensive bibliometric analysis of ISPD publications using Natural Language Processing, machine learning, and network analysis. It explores research themes, collaboration dynamics, and global contributions through citation networks, co-authorship graphs, geographical and spatial mapping, and topic modeling. Key areas of focus include Physical Design Optimization, Power Efficiency, and Emerging Technologies, with prominent topics such as placement, routing, clock skew, lithography, machine learning, and hardware security. The analysis highlights the evolution of foundational techniques like placement and routing while identifying emerging trends such as AI-driven design automation. These insights provide a roadmap for sustaining innovation in physical design over the next 25 years.
SESSION: Session 8: Second Keynote
Invited: How Automotive Functional Safety is Disrupting Digital Implementation
- Chuck J. Alpert
- Vitali Karasenko
- Connie O’Shea
The automotive industry is experiencing transformative disruption as the demand for vehicle electrification, connectivity, and autonomy drives manufacturers toward creating a “datacenter on wheels.” As a result, the cost of silicon in vehicles is projected to rise significantly in the coming years, attracting many semiconductor companies to the market. However, unlike smartphones or data centers, safety is paramount in the automotive sector, prompting widespread adoption of the ISO 26262 functional safety standard. Meeting this standard introduces additional design time, rigorous processes, and increased silicon costs.
In design implementation, safety can be achieved by inserting safety mechanisms such as parity, triple-voting flops, and dual-core lockstep. However, the silicon cost of implementing safety can significantly increase chip area (e.g., by 30-80%), so design teams need advanced methodologies to achieve safety with minimum pain, but also minimum area and power. In particular, Dual Core Lock Step (DCLS) is a popular safety mechanism since it provides excellent safety coverage for the logic to which it is applied. However, having numerous DCLS modules in a single design can become a floorplanning nightmare, leading to massive congestion, area bloat, and overall performance degradation. We propose a novel methodology for DCLS insertion during logic synthesis and digital implementation to address these issues.
SESSION: Session 9: AI for Chip Design
HeLO: A Heterogeneous Logic Optimization Framework by Hierarchical Clustering and Graph Learning
- Yuan Pu
- Fangzhou Liu
- Zhuolun He
- Keren Zhu
- Rongliang Fu
- Ziyi Wang
- Tsung-Yi Ho
- Bei Yu
Modern very large-scale integration (VLSI) designs usually consist of modules with various topological structures and functionalities. To better optimize such large and heterogeneous logic networks, it is essential to identify the structural and functional characteristics of their modules and to represent them with appropriate DAG types (such as AIG, MIG, XAG, etc.) for logic optimization. This paper proposes HeLO, a hetero-DAG logic optimization framework empowered by hierarchical clustering and graph learning. HeLO leverages a hierarchical clustering algorithm, which splits the original Boolean network into sub-circuits by considering both topological and functional characteristics. A novel graph neural network model is customized to generate the topological-functional embedding (used for distance calculation in hierarchical clustering) and predict the best-fit DAG type of each sub-circuit. Experimental results demonstrate that HeLO outperforms LSOracle, the SOTA heterogeneous logic optimization framework, in terms of node-depth product (for technology-independent logic optimization) and delay-area product (for technology mapping) by 8.7% and 6.9%, respectively.
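A minimal sketch of the clustering stage under the stated idea (blended topological/functional embeddings feeding hierarchical clustering); the blending weight and API choices below are illustrative assumptions, not HeLO's implementation:

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering

def cluster_subcircuits(topo_emb, func_emb, n_clusters, alpha=0.5):
    """Hierarchically cluster nodes using a crude blend of topological and
    functional embeddings (rows = nodes); alpha weighs the two spaces."""
    feats = np.hstack([alpha * topo_emb, (1.0 - alpha) * func_emb])
    return AgglomerativeClustering(n_clusters=n_clusters).fit_predict(feats)
```

Each resulting cluster would then be handed to a predictor that selects its best-fit DAG type (AIG, MIG, XAG, etc.) before running the corresponding optimization engine.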
GraphCAD: Leveraging Graph Neural Networks for Accuracy Prediction Handling Crosstalk-affected Delays
- Fangzhou Liu
- Guannan Guo
- Yuyang Ye
- Ziyi Wang
- Wenjie Fu
- Weihua Sheng
- Bei Yu
As chip fabrication technology advances, the capacitive effects between wires have become increasingly pronounced, making crosstalk-induced incremental delay a serious issue. Traditional static timing analysis involves complex and iterative calculations through timing windows, requiring precise alignment of aggressor and victim nets, along with delay and slew estimations, which significantly increase runtime and licensing costs. In our work, we develop a Graph Neural Network framework to predict crosstalk-affected delays, focusing on the impacts of the coupling effect and overlapping nets. Moreover, we employ a curriculum learning strategy that gradually integrates aggressors with victims, improving model convergence through progressively complex scenarios. Experimental results show that our framework precisely predicts crosstalk-affected delays, matching commercial tools’ performance with a fivefold speedup.
Invited: AI-assisted Routing
- Qijing Wang
- Liang Xiao
- Evangeline F.Y. Young
Routing is an important but complicated step in physical synthesis. Considering the potential of leveraging AI to seek higher efficiency and better quality in solving routing problems, we study in this work the methodology of AI-assisted routing in a systematic way. Decoupling the functionalities of different routing components will give a high flexibility in determining where and how AI can be used in an effective manner, while maintaining a high degree of interpretability. Two applications along this direction are presented, aiming at tackling the difficulties in routing with AI assistance. These provide examples of how to implement the methodology in practice, while revealing its effectiveness and potential.
SESSION: Session 10: LLM for Chip Design
DRC-Coder: Automated DRC Checker Code Generation Using LLM Autonomous Agent
- Chen-Chia Chang
- Chia-Tung Ho
- Yaguang Li
- Yiran Chen
- Haoxing Ren
In advanced technology nodes, the integrated design rule checker (DRC) is often utilized in place-and-route tools for fast power-performance-area optimization loops. Implementing integrated DRC checkers to meet the standard of commercial DRC tools demands extensive human expertise to interpret foundry specifications, analyze layouts, and debug code iteratively. However, this labor-intensive process, which must be repeated with every update of technology nodes, prolongs the turnaround time of designing circuits. In this paper, we present DRC-Coder, a multi-agent framework with vision capabilities for automated DRC code generation. By incorporating vision language models and large language models (LLMs), DRC-Coder can effectively process textual, visual, and layout information to perform rule interpretation and coding by two specialized LLMs. We also design an auto-evaluation function for LLMs to enable DRC code debugging. Experimental results show that, targeting a sub-3nm technology node for a state-of-the-art standard cell layout tool, DRC-Coder achieves a perfect F1 score of 1.000 in generating DRC codes that meet the standard of a commercial DRC tool, substantially outperforming standard prompting techniques (F1=0.631). DRC-Coder can generate code for each design rule within four minutes on average, which significantly accelerates technology advancement and reduces engineering costs.
LEGO-Size: LLM-Enhanced GPU-Optimized Signoff-Accurate Differentiable VLSI Gate Sizing in Advanced Nodes
- Yi-Chen Lu
- Kishor Kunal
- Geraldo Pradipta
- Rongjian Liang
- Ravikishore Gandikota
- Haoxing Ren
On-Chip Variation (OCV)-aware and Path-Based Analysis (PBA)-accurate timing optimization achieved by gate sizing (including Vth-assignment) remains a pivotal step in modern signoff. However, in advanced nodes (e.g., 3nm), commercial tools often yield suboptimal results due to the intricate design demands and the vast choice of library cells, which requires substantial runtime and computational resources for exploration. To address these challenges, we introduce LEGO-Size, a generative framework that harnesses the power of Large Language Models (LLMs) and GPU-accelerated differentiable techniques for efficient gate sizing. LEGO-Size introduces three key innovations. First, it considers timing paths as sequences of tokenized library cells, casting gate sizing prediction as a language modeling task and solving it with self-supervised learning and supervised fine-tuning. Second, it employs a Graph Transformer (GT) with a linear-complexity attention mechanism for netlist encoding, enabling LLMs to make sizing decisions from a global perspective. Third, it integrates a differentiable Static Timing Analysis (STA) engine to refine LLM-predicted gate size probabilities by directly optimizing Total Negative Slack (TNS) through gradient descent. Experimental results on 5 unseen million-gate industrial designs in a commercial 3nm node show that LEGO-Size achieves up to a 125× speedup with 37% TNS improvement over an industry-leading commercial signoff tool, with minimal power and area overhead.
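For intuition, differentiable STA engines commonly replace the hard max in arrival-time propagation with a smooth surrogate so that TNS gradients can flow back to the sizing decisions; a typical log-sum-exp relaxation (not necessarily LEGO-Size's exact formulation) is

$$ a_j \;\approx\; \gamma \,\log \sum_{i \in \mathrm{fanin}(j)} \exp\!\Big(\tfrac{a_i + d_{ij}(\theta)}{\gamma}\Big), \qquad \mathrm{TNS}(\theta) \;=\; \sum_{k \in \mathrm{endpoints}} \min\!\big(0,\; T_k - a_k(\theta)\big) $$

where $d_{ij}(\theta)$ are cell delays as a function of the relaxed size/Vth choices $\theta$, $T_k$ are endpoint required times, and $\gamma$ controls the tightness of the relaxation.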
Invited: Artificial Netlist Generation for Enhanced Circuit Data Augmentation
- Seokhyeong Kang
Optimizing power, performance, and area (PPA) at advanced nodes has become an increasingly challenging and complex task. To address these challenges, approaches such as machine learning (ML) and design-technology co-optimization (DTCO) have emerged as promising solutions. However, their effectiveness is limited by the lack of diverse training data and prolonged turnaround times (TAT). Artificial data has been widely used in various fields to address the limitations of real-world data. By augmenting datasets, artificial data improves the robustness of ML models against input perturbations, leading to improved performance. Similarly, in the physical design flow, artificial data has great potential for overcoming the scarcity of real-world circuit data [1], [2], [3]. Artificial circuits proposed in previous studies are typically designed for specific applications. By developing a method to generate artificial circuits that resemble real circuits, we can address data scarcity and TAT challenges in physical design. In this talk, we will discuss how leveraging artificial circuits to explore a wide range of circuit characteristics can enhance ML model performance for unseen real-world circuits and accelerate the PPA exploration flow.
SESSION: Session 11: Cell Design
Cell-Flex Metrics for Designing Optimal Standard Cell Layout with Enhanced Cell Layout Flexibility
- Byeonggon Kang
- Ying Yuan
- Yucheng Wang
- Bill Lin
- Chung-Kuan Cheng
As physical pitch scaling slows, efforts to match its pace by reducing standard cell height and sacrificing horizontal routing tracks have introduced placement and routing challenges, making the design of high-quality standard cell layouts increasingly crucial. However, existing cell metrics only focus on pin accessibility and are insufficient to address issues in advanced nodes (e.g., Power Delivery Networks (PDN), increased routing blockages, etc.). We propose Cell Layout Flexibility (Cell-Flex) metrics, novel metrics that evaluate the flexibility of standard cell layouts. Flexibility reflects the versatility of cell layouts with respect to placement and routing demands, which influences block-level design optimization. By using Cell-Flex metrics as objectives in cell layout design, we achieve a 13.2% reduction in block area without increasing total Design Rule Violations (DRVs). We develop a Machine Learning (ML) model using Kolmogorov-Arnold Networks (KAN) that utilizes the Cell-Flex metrics as features for DRV prediction. By adding Cell-Flex features, we improve accuracy from 0.65 to 0.79 and F1 score from 0.52 to 0.78, demonstrating that our metrics are important for DRV prediction and serve as robust indicators of cell layout quality.
Scalable CFET Cell Library Synthesis with A DRC-Aware Lookup Table to Optimize Valid Pin Access
- Ting-Wei Lee
- Ting-Xin Lin
- Yih-Lang Li
With the advent of CFET technology, which stacks P and N transistors together, the number of available tracks in a cell decreases. This poses a substantial challenge of hard-to-access pins during upper-level routing, which previous works have addressed by lengthening IO pins and increasing the spacing between adjacent IO pins. However, upper-level routing may generate DRC violations around IO pins in a cell, which compromises these efforts to improve pin accessibility. To overcome this challenge, we propose a scalable satisfiability-modulo-theories-based cell routing that establishes a DRC-aware scheme to enumerate potential DRC violations, enabling pin accessibility to be improved without producing DRC violations in upper-level routing. Our experimental results demonstrate that the proposed CFET cell generator is 100 times faster than previous work on average while delivering the same or better cell quality in terms of cell area. The scalability of the proposed method allows for the synthesis of large cells, including high-driving-strength cells and multi-bit flip-flops (MBFFs). Moreover, compared to previous work, the proposed method reduces DRC violations by an average of 99% in upper-level routing, and effectively reduces both wirelength and via usage as well.
LVFGen: Efficient Liberty Variation Format (LVF) Generation Using Variational Analysis and Active Learning
- Junzhuo Zhou
- Haoxuan Xia
- Wei Xing
- Ting-Jung Lin
- Li Huang
- Lei He
As transistor dimensions shrink, process variations significantly impact circuit performance, underscoring the need for accurate statistical circuit analysis. In digital circuit timing analysis, the Liberty Variation Format (LVF) has emerged as the industry-leading representation of timing distributions in cell libraries at 22 nm and below. However, LVF characterization relies on the Monte Carlo (MC) method, which requires excessive SPICE simulations of cells with process variations. Similar challenges also exist for uncertainty propagation and quantification in chip manufacturing and the broader scientific communities. To resolve this foundational challenge, this paper presents LVFGen, a novel method that reduces the simulation cost of MC while generating a high-accuracy LVF library. LVFGen utilizes an active learning strategy based on variational analysis to identify the process variation samples that impact timing distributions most significantly. Compared to the state-of-the-art Quasi-MC method, LVFGen demonstrates an overall 2.27× speedup in LVF library generation within the accuracy level of 5k-sample MC and a 4.06× speedup within 100k-sample MC accuracy.
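As a generic illustration of the sample-selection idea only (LVFGen's actual variational-analysis criterion is not reproduced here), an active-learning loop can fit a cheap surrogate to the SPICE results gathered so far and queue the candidate process-variation points with the largest predictive uncertainty:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def pick_next_samples(X_labeled, y_delay, X_pool, k=8):
    """Uncertainty sampling for timing characterization (illustrative only).

    X_labeled: process-variation samples already SPICE-simulated
    y_delay:   their measured cell delays
    X_pool:    candidate samples not yet simulated
    Returns the indices of the k pool samples to simulate next.
    """
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_labeled, y_delay)
    _, std = gp.predict(X_pool, return_std=True)   # predictive uncertainty
    return np.argsort(std)[-k:]
```

Each round adds the newly simulated samples to the labeled set and refits, so SPICE effort concentrates on the regions of process space that matter most for the delay distribution.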
Abuttable Analog Cell Library and Automatic AMS Layout
- Tianjia Zhou
- Cheng Chang
- Li Huang
- Jingyun Gu
- Zexin Ji
- Xiangyang Liu
- Hailang Liang
- Zhanfei Chen
- Ting-Jung Lin
- Song Wang
- Na Bai
- Zhengping Li
- Lei He
State-of-the-art analog circuit design mainly applies a full-custom layout methodology. This demands high expertise and a heavy manual workload. Additionally, the resulting layout cannot easily be re-used across different designs or different PDKs. Learning from digital standard cells, existing work has proposed stem cells that are abuttable. But stem cells have a fixed area ratio of 2 over same-sized Pcells, limiting their wide application. In this paper we develop a new type of abuttable analog cells (called Acells) for transistors and passive elements. Acells are compatible with digital standard cells and can be abutted in all directions, enabling the use of automatic digital place and route (PnR) engines. We automate Acell generation and show that the average area ratio over same-sized Pcells is 1.49 for 65nm technology and 1.3 for 28nm technology, and is expected to decrease for more advanced technologies. We then use digital PnR to automatically lay out a number of analog and mixed-signal (AMS) circuits, mainly in 28nm, and show that compared to Pcell-based manual layout, Acell-based layout obtains similar performance and its circuit-level layout area is about 2% higher for large-scale AMS circuits in our experiments.
SESSION: Session 12: 3D IC Part I
Placement-Aware 3D Net-to-Pad Assignment for Array-Style Hybrid Bonding 3D ICs
- Pruek Vanna-iampikul
- Junsik Yoon
- Chaeryung Park
- Gary Yeap
- Sung Kyu Lim
Hybrid bonding is emerging as a key technology for 3D integration, offering finer bonding pitches that address the high interconnect density requirements of modern VLSI applications. In advanced node technologies, where metal pitches are significantly smaller than bonding pitches, 3D net assignment becomes critical for achieving optimal design performance. Existing approaches primarily focus on either ensuring the legality of the assignment or optimizing the flexibility of 3D net locations for timing purposes in isolation. This limitation restricts the performance improvements of 3D designs over traditional 2D counterparts. To overcome these challenges, we introduce AnchorGrid, a novel 3D net assignment framework designed to concurrently assign 3D nets to legal locations while supporting their movement to enhance timing optimization. By modeling 3D nets as pairs of specialized “anchor” cells, accompanied by relative placement constraints, precise movement and alignment are achieved during the pre-route optimization phase, before final placement onto grid-based locations. Experimental results on advanced node commercial designs demonstrate that AnchorGrid achieves up to a 24.35% improvement in power, performance, and area (PPA) metrics, while reducing design rule check (DRC) violations by 90%, outperforming state-of-the-art methods.
Invited: Physical Design for Advanced 3D ICs: Challenges and Solutions
- Yuxuan Zhao
- Lancheng Zou
- Bei Yu
As technology scaling predicted by Moore’s law slows down, 3D integrated circuits (3D ICs) have emerged as a promising alternative to enhance performance while maintaining cost-effectiveness. With the advancement of fabrication and bonding technologies, wafer-level 3D integration enables fine-grain 3D interconnects that maximize the benefits in power, performance, and area (PPA). However, a multitude of challenges have obstructed traditional electronic design automation (EDA) methodologies for 3D IC implementations. This paper surveys the major challenges in the physical design of advanced 3D ICs. We provide a comprehensive review of existing solutions, analyzing their advantages and disadvantages in depth. Finally, we discuss open problems and research opportunities in the development of native 3D EDA tools.
Invited: Chiplet-Based Integration – Scale-Down and Scale-Out
- Boris Vaisband
Motivation: The demand for increased computation and memory in applications such as large language models has grown well beyond what can be delivered within the reticle boundaries of a system-on-chip (SoC). Chiplet-based integration is a paradigm shift that shapes the way we design our future high-performance systems. The concept is to move away from large SoCs that are limited by communication, thermal design power, and reticle size, toward a robust plug-and-play approach, where small, hardened-IP, heterogeneous, off-the-shelf chiplets are seamlessly integrated on a single platform.
Problem statement: Recent technological breakthroughs in advanced packaging platforms have enabled the integration of hundreds to thousands of chiplets within a single platform. Nonetheless, building a functional and efficient ultra-large-scale high-performance computation system requires overcoming important system-level design challenges, specifically short- and long-range communication, power delivery and thermal management, testing, synchronization, hardware security, and others.
Approach: In this talk, we will discuss the current state-of-the-art and challenges in chiplet integration as well as the scale-down and scale-out concepts. We will introduce the silicon interconnect fabric (Si-IF), an ultra-large wafer-scale heterogeneous integration platform, for applications such as high-performance computing. We will discuss paths to address the system-level challenges for designing and integrating a high-performance computation system on the Si-IF.
SESSION: Session 13: Lifetime Achievement Session
Invited: Innovation in Times of Technology Disruption
- Bryan Preas
I hosted Jason Cong when he was an intern at the Xerox Palo Alto Research Center in 1987. Since then, we have been friends and collaborators. I have watched his extraordinary accomplishments with pride and pleasure over the years.
Disruptive technologies are a well-studied business school topic. New, or significantly improved, technologies present new problems and allow new approaches for older challenges. These disruptions are often accompanied by substantial technical innovation and creation of new business values. In their time, electric power, automobiles and television disrupted society. More recent examples include the internet and e-commerce.
Invited: Shaping the Future of Interconnected Physical Design
- David Z. Pan
Physical design has been a cornerstone of electronic design automation (EDA) since the early days of chip and board development, with placement and routing at its core. By the late 1980s, the field was prematurely declared “dead,” as many believed its challenges had been resolved. However, the advent of deep submicron scaling in the 1990s revitalized physical design research, establishing it as an indispensable part of the design process. Today, with technology advancing to the 1x nm regime and the rise of 3D heterogeneous integration (3DHI), physical design remains pivotal in achieving design closure across power, performance, area, cost, and turnaround time (PPACT). Over time, physical design has transformed from its classical “place and route” framework into a more holistic and interconnected discipline, crosscutting into physical synthesis, design for manufacturing, 3DHI, analog and RF, emerging technologies, and AI/ML. The seminal contributions of Prof. Jason Cong have been instrumental in shaping the field of interconnected physical design. The seeds he planted have grown into thriving forests, with his academic descendants emerging as key leaders driving advancements in the field. This talk will explore the synergistic aspects of interconnected physical design and highlight Prof. Cong’s profound influence and legacy in shaping the future of the field.
Invited: Coping with Interconnects
- Jason Cong
In this paper, I review the multi-decade research on overcoming the performance bottleneck of VLSI interconnects in deep sub-micrometer and nanometer technologies that started at UCLA in the early 1990s. Our research spans from interconnect topology and geometry optimization, to wirelength reduction via scalable placement, to the use of novel interconnect technologies such as 3D ICs and RF-interconnects, to recent work on interconnect pipelining in chiplet designs, and the shift from interconnect to entanglement in quantum computing. The latter two efforts go beyond the typical physical design space and involve space-time co-optimization. This paper is dedicated to the multiple generations of Ph.D. students, postdocs, and visiting researchers who contributed to building a strong physical design research program at UCLA.
SESSION: Session 14: Third Keynote
Invited: Automation and Optimization of Heterogeneous Multi-Die Systems
- Henry Sheng
Advances in heterogeneous integration have enabled the creation of systems built from chips and interconnects using multiple silicon node technologies, package technologies, optical technologies, thermal mitigation, and more. The scale of integration has increased non-linearly with the pitch of die-to-die interconnects such as bumps and hybrid bonds. Traditionally, many systems have been constructed as an ‘assembly’ of parts using manual layout techniques. Advanced package technologies have evolved to a point where they have achieved densities that challenge the viability of assembly-based manual methods as multi-die systems scale to 5X, 8X, or even 40-60X reticle size at wafer scale. The scale complexity grows both with the densification of die-to-die connections and with the increase of allowable system footprints. Furthermore, the integration of heterogeneous components in a single design mandates a workflow that fuses previously disconnected heterogeneous workflows and competencies together. These are secular shifts that are opening new classes of problems, including migration from manual assembly to automated assembly, and then from automated assembly to optimization of system QoR (quality of results). This requires design automation tools to have a unified representation and treatment of heterogeneous systems and to perform system-scale optimization across heterogeneous components for key system QoR metrics such as SIPI, EMIR, and thermal.
SESSION: Session 15: 3D IC Part II
ML-Based Fine-Grained Modeling of DC Current Crowding in Power Delivery TSVs for Face-to-Face 3D ICs
- Zheng Yang
- Zhen Zhuang
- Bei Yu
- Tsung-Yi Ho
- Martin D. F. Wong
- Sung Kyu Lim
In contrast to the uniform distribution assumed in power wires, actual currents tend to exhibit complicated crowding phenomena at the connections between through-silicon vias (TSVs) and power wires. The current crowding effect degrades power integrity and increases the difficulty of 3D IC power delivery network (PDN) analysis. Therefore, a detailed analysis of current distribution and IR drops in power TSVs within a 3D IC PDN is important. This paper explores the complicated current behavior within TSVs and PDNs of the promising face-to-face 3D IC architecture. Since existing simulation methods are computationally intensive and time-consuming, we propose a graph attention network-based (GAT-based) framework, with novel aggregation methods in the GAT models and informative fine-grained graph generation methods, to achieve efficient analysis of current crowding and IR drops in face-to-face 3D IC TSVs. For current density and voltage predictions, the proposed framework attains R² scores of 0.9776 and 0.9952, respectively, compared to ground-truth results. Our framework also demonstrates an over 837× speedup over ANSYS Q3D Extractor. Furthermore, the proposed framework outperforms other machine learning-based (ML-based) methods, including the state-of-the-art method.
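For reference, the standard graph attention layer (Veličković et al.) that such frameworks build on aggregates neighbor features as

$$ h_i' = \sigma\!\Big(\sum_{j \in \mathcal{N}(i)} \alpha_{ij}\, W h_j\Big), \qquad \alpha_{ij} = \frac{\exp\!\big(\mathrm{LeakyReLU}(a^{\top}[W h_i \,\|\, W h_j])\big)}{\sum_{k \in \mathcal{N}(i)} \exp\!\big(\mathrm{LeakyReLU}(a^{\top}[W h_i \,\|\, W h_k])\big)} $$

The paper's contribution lies in its customized aggregation on top of this baseline and in the informative fine-grained graph construction, which the standard equation above does not capture.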
Invited: Modeling and Design Methodology for Backside Integration of Voltage Converters
- Amaan Rahman
- Hang Yang
- Cong Hao
- Sung Kyu Lim
As technology scales down, backside power delivery networks (BS-PDNs) are increasingly adopted to address frontside parasitics, power density, and IR-drop challenges. However, conventional BS-PDN architectures struggle with off-chip IR-drop issues caused by package-level parasitics. We introduce the first fully integrated backside voltage regulator (BS-IVR) to reduce off-chip circuitry overhead and improve load regulation and on-chip power integrity. The BS-IVR utilizes back-end-of-line-compatible amorphous tungsten-doped indium oxide transistors for backside integration and our comprehensive backside PDK for design and verification. We developed an end-to-end EDA flow for BS-IVR on-chip integration and power integrity analysis. Optimizing the BS-IVR for on-chip integration achieves a compact IVR footprint of 0.0142 mm² with a power density of 3.02 W/mm² and an efficiency of 60%. On-chip dynamic VDD IR-drop is further reduced by 46.06%.
Invited: Next-Generation Power Integrity Concepts and Applications for Physical Design
- Emrah Acar
The rapid pace of innovation in electronics, driven by advancements in computing, machine learning, and artificial intelligence, has created an unprecedented demand for more efficient and powerful computing platforms. As integrated circuits (ICs) continue to scale and integrate into increasingly complex systems, they consume more power, leading to significant challenges in power integrity. These challenges are further exacerbated by the growing complexity of modern IC designs, necessitating more intelligent and actionable approaches to ensure robust power delivery networks.
This presentation introduces a novel methodology for addressing power integrity issues in next-generation IC designs. We propose a victim/aggressor interaction model as a foundational concept for IR drop analysis. This model enables the decomposition of IR drop in a victim instance into contributions from multiple aggressors and other components. By understanding these interactions, designers can implement corrective actions during the placement and routing stages, as well as enhance power connectivity at higher levels of the design hierarchy.
We will discuss the foundational advantages of RedHawk-SC SigmaDVD™ Technology, a cutting-edge solution for power integrity analysis and signoff. SigmaDVD™ is designed to address dynamic voltage drop (DVD) issues at advanced process nodes, providing comprehensive coverage and enabling early detection and prevention of voltage-drop-related problems. This technology is instrumental in achieving robust power integrity signoff, fixing IR violations, and ensuring timing closure with high confidence.
Key applications of SigmaDVD™ in physical design, IR/STA (Static Timing Analysis), and IR/ECO (Engineering Change Order) tools will be highlighted. The presentation will demonstrate how SigmaDVD™ is becoming the industry-leading method for avoiding DVD-induced voltage and timing problems, enabling shift-left prevention of voltage-drop issues, and delivering high-coverage power integrity signoff for advanced-node designs.
By leveraging these next-generation concepts and tools, designers can achieve more efficient, reliable, and high-performance ICs, paving the way for continued innovation in the electronics industry.
SESSION: Session 16: Contest Results and Closing Remarks
Invited: ISPD 2025 Performance-Driven Large Scale Global Routing Contest
- Rongjian Liang
- Anthony Agnesina
- Wen-Hao Liu
- Matt Liberty
- Hsin-Tzu Chang
- Haoxing Ren
Global routing is a critical aspect of VLSI design, significantly impacting timing, power consumption, and routability. The ISPD 2024 contest focused on addressing the scalability challenges of global routing by leveraging GPU and machine learning techniques. Building on this foundation, the ISPD 2025 contest introduces several important updates to better reflect real-world routing challenges. These updates include the provision of industry-standard input files for more precise modeling and integration with OpenROAD for accurate performance assessment. Collectively, these updates aim to bring the contest closer to practical routing scenarios, fostering the development of scalable and efficient solutions for large-scale chip designs.