MLCAD’20 TOC
MLCAD ’20: Proceedings of the 2020 ACM/IEEE Workshop on Machine Learning for CAD
SESSION: Keynote Talk I
Session details: Keynote Talk I
MLCAD Today and Tomorrow: Learning, Optimization and Scaling
The scaling imperative challenges us to always do better and faster, with less resources.
The semiconductor industry has looked to machine learning (ML) as a design-based lever
for scaling that will reduce design costs and design schedules while improving quality
of results. As a result, in recent years Machine Learning for CAD (MLCAD) has dominated
the conversation at leading conferences. Numerous ML-based enhancements and their
benefits have been highlighted by EDA vendors and their customers. With this as backdrop,
this talk will offer some thoughts on future directions for MLCAD.
First, MLCAD lies on a road to “self-driving IC design tools and flows” that make
design ideation and design space exploration both accurate and accessible. Eventually,
MLCAD (ML for CAD) will lead us to MLDA (ML-enabled Design Automation). But for this
to happen, researchers and practitioners will need to deliver (i) human-quality prediction,
evaluation and decision-making with no humans; (ii) design tools and flows that never
require iteration and never fail; (iii) modeling of design processes that continually
improves; and more.
Second, the trajectory of MLCAD will need to keep three concepts in foreground: Learning,
Optimization and Scaling. “Learning” seems obvious from “ML”, but it brings open questions
about data and models, ranging from statistics to standards, sharing and openness.
“Optimization” is the essence of CAD, and brings open questions about both synergies
and boundaries with learning. “Scaling” is how practical realization of Learning and
Optimization will satisfy the needs of design within an ever-tighter box of compute,
schedule and other resources. Finally, there is a meta-question of how the MLCAD community
will itself learn, optimize, and scale.
SESSION: Session 1: DNN for CAD
Session details: Session 1: DNN for CAD
An Adaptive Analytic FPGA Placement Framework based on Deep-Learning
In this work, a Convolutional Encoder-Decoder (CED) is utilized to significantly reduce
placement runtimes for large, high-utilization designs. The proposed CED uses features
available during the early stages of placement to predict the congestion present in
subsequent placement iterations including the final placement. This congestion information
is then used by the placer to improve decision making leading to reduced runtimes.
Experimental results show that reductions in placer runtime between 27% and 40% are
achievable with no significant deterioration in quality-of-result.
Design Rule Checking with a CNN Based Feature Extractor
Design rule checking (DRC) is getting increasingly complex in advanced nodes technologies.
It would be highly desirable to have a fast interactive DRC engine that could be used
during layout. In this work, we establish the proof of feasibility for such an engine.
The proposed model consists of a convolutional neural network (CNN) trained to detect
DRC violations. The model was trained with artificial data that was derived from a
set of 50 SRAM designs. The focus in this demonstration was metal 1 rules. Using this
solution, we can detect multiple DRC violations 32x faster than Boolean checkers with
an accuracy of up to 92%. The proposed solution can be easily expanded to a complete
rule set.
Using DNNs and Smart Sampling for Coverage Closure Acceleration
Coverage Directed Generation represents algorithms that are used to create tests or
test-templates for hitting coverage events. Standard approaches for solving the problem
use either user’s intuition or random sampling. Recent work has been using optimization
algorithms in order to hit, a single hard-to-hit event. In this work we extend the
optimization technique for many events and show that by using a deep neural network
one can accelerate the optimization significantly. The algorithms are presented on
the NorthStar simulator where we show substantial improvement over random based techniques
and a factor larger than 2 on other optimization-based techniques.
R2AD: Randomization and Reconstructor-based Adversarial Defense on Deep Neural Network
Machine learning (ML) has been widely adopted in a plethora of applications ranging
from simple time-series forecasting to computer security and autonomous systems. Despite
the robustness by the ML algorithms against random noise, it has been shown that inclusion
of specially crafted perturbations to the input data termed as adversarial samples
can lead to a significant degradation in the ML performance. Existing defenses to
mitigate or minimize the impact of adversarial samples including adversarial training
or randomization are confined to specific categories of adversaries, compute-intensive
and/or often lead to reduce performance even without adversaries. To overcome the
shortcomings of the existing works on adversarial defense, we propose a two-stage
adversarial defense technique (R2AD). To thwart the exploitation of the deep neural
network by the attacker, we first include a random nullification (RNF) layer. The
RNF nullifies/removes some of the features from the input randomly to reduce the impact
of adversarial noise and minimizes attacker’s feasibility to extract the model parameters.
However, the removal of input features through RNF leads to a reduction in the performance
of the ML. As an antidote, we equip the network with a Reconstructor. The Reconstructor
primarily contributes to reconstructing the input data by utilizing an autoencoder
network, but based on the distribution of the normal samples, thereby improving the
performance, and also being robust to the adversarial noise. We evaluated the performance
of proposed multi-stage R^2AD on the MNIST digits and Fashion-MNIST datasets against
multiple adversarial attacks including FGSM, JSMA, BIM, Deepfool, and CW attacks.
Our findings report improvements as high as 80% in the performance compared to the
existing defenses such as adversarial training and randomization-based defense.
DAVE: Deriving Automatically Verilog from English
Specifications for digital systems are provided in natural language, and engineers
undertake significant efforts to translate these into the programming languages understood
by compilers for digital systems. Automating this process allows designers to work
with the language in which they are most comfortable – the original natural language
– and focus instead on other downstream design challenges. We explore the use of state-of-the-art
machine learning (ML) to automatically derive Verilog snippets from English via fine-tuning
GPT-2, a natural language ML system. We describe our approach for producing a suitable
dataset of novice-level digital design tasks and provide a detailed exploration of
GPT-2, finding encouraging translation performance across our task sets (94.8% correct),
with the ability to handle both simple and abstract design tasks.
SESSION: Plenary I
Session details: Plenary I
Accelerating Chip Design with Machine Learning
As Moore’s law has provided an exponential increase in chip transistor density, the
unique features we can now include in large chips are no longer predominantly limited
by area constraints. Instead, new capabilities are increasingly limited by the engineering
effort associated with digital design, verification, and implementation. As applications
demand more performance and energy efficiency from specialization in the post-Moore’s-law
era, we expect required complexity and design effort to increase.
Historically, these challenges have been met through levels of abstraction and automation.
Over the last few decades, Electronic Design Automation (EDA) algorithms and methodologies
were developed for all aspects of chip design – design verification and simulation,
logic synthesis, place-and-route, and timing and physical signoff analysis. With each
increase in automation, total work per chip has increased, but more work has also
been offloaded from manual effort to software. Just as machine learning (ML) has transformed
software in many domains, we expect advancements in ML will also transform EDA software
and as a result, chip design workflows.
In this talk, we highlight work from our research group and the community applying
ML to various chip design prediction tasks [1]. We show how deep convolutional neural
networks [2] and graph-based neural networks [3] can be used in the areas of automatic
design space exploration, power analysis, VLSI physical design, and analog design.
We also present a future vision of an AI-assisted chip design workflow to automate
optimization tasks. In this future vision, GPU acceleration, neural-network predictors,
and deep reinforcement learning techniques combine to automate VLSI design and optimization.
SESSION: Keynote Talk II
Session details: Keynote Talk II
SoC Design Automation with ML – It’s Time for Research
The AI-hype started a few years ago, with advances in object recognition. Soon the
EDA research community made proposals on applying AI in EDA and all major players
announced new AI-based tools at DAC 2018. Unfortunately, few new AI-based EDA-tools
made it to productive use today. This talk analyses general challenges of AI in EDA,
outlines promising use cases, and motivates more AI research in EDA: More HI (=Human
Intelligence) is needed to make AI successful in EDA.
Motivation: For a long time, hardware design resides in an area between hell of complexity
and hell of physics. Continuously decreasing feature size enables to put more and
more transistors on a square millimeter silicon. This offers to make continuously
new applications at a reasonable form factor. However, the functionality of the application
must be designed first and the deep submicron effects must be considered properly.
EDA tools help to automate design, but face challenges keeping up with the continuously
increasing productivity demand. Therefore, the design teams have increased in size
to step up the design of the chips. So, any further innovation is welcome. The re-discovery
of AI in general and ML in particular created visions of learning from designers and
automatically creating automation from these learnings. To give an example, Google
describes in [1] how to accelerate Chip Placement from weeks to hours.
SESSION: Session 2: Design Methodology and Optimization
Session details: Session 2: Design Methodology and Optimization
Cost Optimization at Early Stages of Design Using Deep Reinforcement Learning
With the increase in the complexity of the modern system on Chips(SoCs) and the demand
for a lower time-to-market, automation becomes essential in hardware design. This
is particularly relevant in complex/time-consuming tasks, as the optimization of design
cost for a hardware component. Design cost, in fact, may depend on several objectives,
as for the hardware-software trade-off. Given the complexity of this task, the designer
often has no means to perform a fast and effective optimization in particular for
larger and complex designs. In this paper, we introduce Deep Reinforcement Learning(DRL)
for design cost optimization at the early stages of the design process. We first show
that DRL is a perfectly suitable solution for the problem at hand. Afterward, by means
of a Pointer Network, a neural network specifically applied for combinatorial problems,
we benchmark three DRL algorithms towards the selected problem. Results obtained in
different settings show the improvements achieved by DRL algorithms compared to conventional
optimization methods. Additionally, by using reward redistribution proposed in the
recently introduced RUDDER method, we obtain significant improvements in complex designs.
Here, the obtained optimization is on average 15.18% on the area as well as 8.25%
and 8.12% on the application size and execution time on a dataset of industrial hardware/software
interface design
F-LEMMA: Fast Learning-based Energy Management for Multi-/Many-core Processors
Over the last two decades, as microprocessors have evolved to achieve higher computational
performance, their power density also has increased at an accelerated rate. Improving
energy efficiency and reducing power consumption is therefore of critical importance
to modern computing systems. One effective technique to improve energy efficiency
is dynamic voltage and frequency scaling (DVFS). In this paper, we propose F-LEMMA:
a fast learning-based power management framework consisting of a global power allocator
in userspace, a reinforcement learning-based power management scheme at the architecture
level, and a swift controller at the digital circuit level. This hierarchical approach
leverages computation at the system and architecture levels, and the short response
times of the swift controllers, to achieve effective and rapid μs-level power management.
Our experimental results demonstrate that F-LEMMA can achieve significant energy savings
(35.2% on average) across a broad range of workload benchmarks. Compared With existing
state-of-the-art DVFS-based power management strategies that can only operate at millisecond
timescales, F-LEMMA is able to provide notable (up to 11%) Energy Delay Product improvements
when evaluated across benchmarks.
CALT: Classification with Adaptive Labeling Thresholds for Analog Circuit Sizing
A novel simulation-based framework that applies classification with adaptive labeling
thresholds (CALT) is developed that auto-generates the component sizes of an analog
integrated circuit. Classifiers are applied to predict whether the target specifications
are satisfied. To address the lack of data points with positive labels due to the
large dimensionality of the parameter space, the labeling threshold is adaptively
set to a certain percentile of the distribution of a given circuit performance metric
in the dataset. Random forest classifiers are executed for surrogate prediction modeling
that provide a ranking of the design parameters. For each iteration of the simulation
loop, optimization is utilized to determine new query points. CALT is applied to the
design of a low noise amplifier (LNA) in a 65 nm technology. Qualified design solutions
are generated for two sets of specifications with an average execution of 4 and 17
iterations of the optimization loop, which require an average of 1287 and 2190 simulation
samples, and an average execution time of 5.4 hours and 23.2 hours, respectively.
CALT is a specification-driven design framework to automate the sizing of the components
(transistors, capacitors, inductors, etc.) of an analog circuit. CALT generates interpretable
models and achieves high sample efficiency without requiring the use of prior circuit
models.
Decision Making in Synthesis cross Technologies using LSTMs and Transfer Learning
We propose a general approach that precisely estimates the Quality-of-Result (QoR),
such as delay and area, of unseen synthesis flows for specific designs. The main idea
is leveraging LSTM-based network to forecast the QoR, where the inputs are synthesis
flows represented in novel timed-flow modeling, and QoRs are ground truth. This approach
is demonstrated with 1.2 million data points collected using 14nm, 7nm regular-voltage
(RVT), and 7nm low-voltage (LVT) technologies with twelve IC designs. The accuracy
of predicting the QoRs (delay and area) evaluated using mean absolute prediction error
(MAPE). While collecting training data points in EDA can be extremely challenging,
we propose to elaborate transfer learning in our approach, which enables accurate
predictions cross different technologies and different IC designs. Our transfer learning
approach obtains estimation MAPE 3.7% over 960,000 test points collected on 7nm technologies,
with only 100 data points used for training the pre-trained LSTM network using 14nm
dataset.
Application of Quantum Machine Learning to VLSI Placement
Considerable advances in quantum computing with functioning noisy, near-term devices
have allowed for the application space to grow as a emerging field for problems with
large solution spaces. However, current quantum hardware is limited in scale and noisy
in generated data, necessitating hybrid quantum-classical solutions for viability
of results and convergence. A quantum backend generates data for classical algorithms
to optimize control parameters with, creating a hybrid quantum-classical computing
loop. VLSI placement problems have shown potential for utilization, where traditionally
heuristic solutions such as Kernighan-Lin (KL) are used. The Variational Quantum Eigensolver
(VQE) is used to formulate a recursive Balanced Min-Cut (BMC) algorithm, and we suggest
that quantum machine learning techniques can lower error rates and allow for faster
convergence to an optimal solution.
SESSION: Plenary II
Session details: Plenary II
From Tuning to Learning: Why the FPGA Physical Design Flow Offers a Compelling Case for ML?
MLCAD is particularly suited for the FPGA Physical design (PD) flow since each device
family generation innately provides a rich platform for device/design feature data
harvesting: (1) A vast amount of device architecture-specific interconnect/layout
fabric data and (2) significant amount of large design suite data from and from broad
set of application domains. These bode well for developing robust predictive ML models.
Furthermore, the long lifespan of these device families affords a favorable ROI. In
this talk, we will highlight some data harvesting and ML solutions we have developed
in Xilinx? Vivado PD flow and share some initial results. These include a strategy
recommendation framework for design closure, design classification for computational
resource allocation, device characteristics modeling, and routing congestion estimation.
Furthermore, we will outline potential MLCAD opportunities in trend identification,
algorithm parameter optimization, and reinforcement learning paradigms where we foresee
potential collaborations with the academic community.
Biography: Ismail Bustany is a Distinguished Engineer at Xilinx, where he works on
physical design, MLCAD, and sparse computation hardware acceleration. He has served
on the technical programming committees for the ISPD, the ISQED, and DAC. He was the
2019 ISPD general chair. He currently serves on the organizing committees for ICCAD
and SLIP. He organized the 2014 and 2015 ISPD detailed routing-driven placement contests
and co-organized the 2017 ICCAD detailed placement contest. His research interests
include physical design, computationally efficient optimization algorithms, MLCAD,
sparse matrix computations, and hardware acceleration. He earned his M.S. and Ph.D.
in electrical engineering from UC Berkeley.
SESSION: Keynote Talk III
Session details: Keynote Talk III
Data-driven CAD or Algorithm-Driven CAD: Competitors or Collaborators?
Motivation: Despite decades of R&D in algorithm-driven CAD, the design and implementation
of SoCs requires an ever-increasing number of resources in terms of designers, compute
servers and tool licenses. Design automation has not scaled with the complexity of
deep sub-micron fabrication process or the complexity of optimizing power, performance
and area (PPA) of modern SoCs. There seems to be a fundamental limit to algorithm-driven
CAD that prevents tools from scaling to meet the increasing complexity. As technology
scaling reaches its limits and the PPA gains from technology scaling get limited,
the need for design tools to close PPA gap through design will increase significantly,
making this problem worse.
Problem statement:
Approach:SoC design consists of taking a chip hardware spec and generating the fabrication
mask spec, involving two main tasks: (1) logic synthesis and (2) physical design.
While algorithm-driven CAD tools exist to automate both these tasks, they cannot meet
the PPA without a large number of manually guided design iterations that consume manpower,
compute and tool resources.
Approach: Data-driven CAD can capture the learning from manual PPA optimization, and
data-driven tools inherently scale with design complexity. We explore the open problems
in using Data-driven CAD, to complement the automation capabilities of algorithm-driven
CAD and meet the increasing PPA demands of modern SOCs in deep-submicron technologies.
SESSION: Session 3: ML for Reliability Improvement
Session details: Session 3: ML for Reliability Improvement
Data-Driven Fast Electrostatics and TDDB Aging Analysis
Computing the electric potential and electric field is a critical step for modeling
and analysis of VLSI chips such as TDDB (Time dependent dielectric breakdown) aging
analysis. Data-driven deep learning approach provides new perspectives for learning
the physics-law and representations of the physics dynamics from the data. In this
work, we propose a new data-driven learning based approach for fast 2D analysis of
electric potential and electric fields based on DNNs (deep neural networks). Our work
is based on the observation that the synthesized VLSI layout with multi interconnect
layers can be viewed as layered images. Image transformation techniques via CNN (convolutional
neural network) are adopted for the analysis. Once trained, the model is applicable
to any synthesized layout of the same technology. Training and testing are done on
a dataset built from a synthesized CPU chip. Results show that the proposed method
is around 138x faster than the conventional numerical methods based software COMSOL,
while keeping 99% of the accuracy on potential analysis, and 97% for TDDB aging analysis.
HAT-DRL: Hotspot-Aware Task Mapping for Lifetime Improvement of Multicore System using
Deep Reinforcement Learning
In this work, we propose a novel learning-based task to core mapping technique to
improve lifetime and reliability based on advanced deep reinforcement learning. The
new method, called HAT-DRL, is based on the observation that on-chip temperature sensors
may not capture the true hotspots of the chip, which can lead to sub-optimal control
decisions. In the new method, we first perform data-driven learning to model the hotspot
activation indicator with respect to the resource utilization of different workloads.
On top of this, we propose to employ a recently proposed, highly robust, sample-efficient
soft-actor-critic deep reinforcement learning algorithm, which can learn optimal maximum
entropy policies to improve the long-term reliability and minimize the performance
degradation from NBTI/HCI effects. Lifetime and reliability improvement is achieved
by assigning a reward function, which penalizes continuously stressing the same hotspots
and encourages even stressing of cores. The proposed algorithm is validated with an
Intel i7-8650U four-core CPU platform executing CPU benchmark workloads for various
hotspot activation profiles. Our experimental results show that HAT-DRL balances the
stress between all cores and hotspots, and achieves 50% and 160% longer lifetime compared
to non-hotspot-aware and Linux default scheduling respectively. The proposed method
can also reduce the average temperature by exploiting the true-hotspot information.
Can Wear-Aware Memory Allocation be Intelligent?
Many non-volatile memories (NVM) suffer from a severe reducedcell endurance and therefore
require wear-leveling. Heap memory,as one segment, which potentially is mapped to
a NVM, faces astrong application dependent characteristic regarding the amountof memory
accesses and allocations. A simple deterministic strategyfor wear leveling of the
heap may suffer when the available actionspace becomes too large. Therefore, we investigate
the employmentof a reinforcement learning agent as a substitute for such a strategyin
this paper. The agent’s objective is to learn a strategy, which isoptimal with respect
to the total memory wear out. We concludethis work with an evaluation, where we compare
the deterministicstrategy with the proposed agent. We report that our proposedagent
outperforms the simple deterministic strategy in several cases.However, we also report
further optimization potential in the agentdesign and deployment.
An Enhanced Machine Learning Model for Adaptive Monte Carlo Yield Analysis
This paper presents a novel methodology for generating machine learning models used
by an adaptive Monte Carlo analysis. The advantages of this methodology are that model
generation occurs at the beginning of the analysis with no retraining required, it
applies to both classification and regression models, and accuracy of the Monte Carlo
analysis is not impacted by the accuracy of the model. This paper discusses the details
of constructing and enhancing the machine learning model with emphasis on model training.
It will then show how the model enables a Monte Carlo analysis that monitors and adapts
to model mispredictions.
Towards NN-based Online Estimation of the Full-Chip Temperature and the Rate of Temperature
Change
We propose a novel technique to estimate at run-time both the dynamic thermal map
of the whole chip and the rate of temperature change. Knowledge of the current temperature
is crucial for thermal management. Additional knowledge of the rate of temperature
change allows for predictions of temperatures in the near future, and, therefore,
enables proactive management. However, neither is achievable with existing thermal
sensors due to their limited number. Our technique is based on a neural network (NN)
to predict the rate of temperature change based on performance counter readings and
the current estimate of the thermal map. The thermal map is then updated based on
the prediction. At design-time, we create training data for the NN by recording performance
counters and the dynamic thermal map during the execution of mixed workloads. The
thermal map is recorded with the help of an infrared (IR) camera. At run-time, our
technique requires only performance counter readings. Our technique predicts temperature
changes accurately. However, absolute temperature estimation suffers from instability.
SESSION: Plenary III
Session details: Plenary III
Design Challenges on Post Moore’s Law Era
IC companies nowadays are busy struggling between increasing challenges in deep submicron
process and, in the same time, more stringent time-to-market cycle to entertain the
more demanding consumers. As a result, engineers have to turn to more holistic optimizations
across software, architecture, micro-architecture, circuit design and physical implementations.
The increase in complexity also demands for high level of automation and help from
design tools. We shall look into some of the solutions that we are exploring to cope
with the situation.
Biography: Mr. Matthew Leung serves as the director of Huawei Hong Kong Research Center,
with a current focus in the development of hardware, software and algorithm for artificial
intelligence. Prior to that, he served as the director and a founding member of HiSilicon
(A subsidiary of Huawei) Hong Kong R&D Center. His expertise and experience lies in
the fields of VLSI design for advanced communication chipsets, microprocessors and
artificial intelligence. Mr. Leung received his BSc and MSc degrees of Electrical
Engineering in University of Michigan and Stanford University respectively.
SESSION: Keynote Talk IV
Session details: Keynote Talk IV
Machine Learning in EDA: Opportunities and Challenges
Electronic Design Automation software has delivered semiconductor design productivity
improvements for decades. The next leap in productivity will come from the addition
of machine learning techniques to the toolbox of computational software capabilities
employed by EDA developers. Recent research and development into machine learning
for EDA point to clear patterns for how it impacts EDA tools, flows, and design challenges.
This research has also illustrated some of the challenges that will come with production
deployment of machine learning techniques into EDA tools and flows. This talk will
detail patterns observed in ML for EDA development, as well as discussing challenges
with productization of ML for EDA developments and the opportunities that it presents
for researchers.
Biography: Elias Fallon is currently Engineering Group Director at Cadence Design
Systems, a leading Electronic Design Automation company. He has been involved in EDA
for more than 20 years from the founding of Neolinear, Inc, which was acquired by
Cadence in 2004. Elias was co-Primary Investigator on the MAGESTIC project, funded
by DARPA to investigate the application of Machine Learning to EDA for Package/PCB
and Analog IC. Elias also leads an innovation incubation team within the Custom IC
R&D group as well as other traditional EDA product teams. Beyond his work developing
electronic design automation tools, he has led software quality improvement initiatives
within Cadence, partnering with the Carnegie Mellon Software Engineering Institute.
Elias graduated from Carnegie Mellon University with an M.S. and B.S. in Electrical
and Computer Engineering. Elias, his wife and two children live north of Pittsburgh,
PA. https://www.linkedin.com/in/elias-fallon/
SESSION: Session 4: Intelligent Modeling
Session details: Session 4: Intelligent Modeling
Track-Assignment Detailed Routing Using Attention-based Policy Model With Supervision
Detailed routing is one of the most critical steps in analog circuit design. Complete
routing has become increasingly more challenging in advanced node analog circuits,
making advances in efficient automatic routers ever more necessary. In this work,
we propose a machine learning driven method for solving the track-assignment detailed
routing problem for advanced node analog circuits. Our approach adopts an attention-based
reinforcement learning (RL) policy model. Our main insight and advancement over this
RL model is the use of supervision as a way to leverage solutions generated by a conventional
genetic algorithm (GA). For this, our approach minimizes the Kullback-Leibler divergence
loss between the output from the RL policy model and a solution distribution obtained
from the genetic solver. The key advantage of this approach is that the router can
learn a policy in an offline setting with supervision, while improving the run-time
performance nearly 100× over the genetic solver. Moreover, the quality of the solutions
our approach produces matches well with those generated by GA. We show that especially
for complex problems, our supervised RL method provides good quality solution similar
to conventional attention-based RL without comprising run time performance. The ability
to learn from example designs and train the router to get similar solutions with orders
of magnitude run-time improvement can impact the design flow dramatically, potentially
enabling increased design exploration and routability-driven placement.
Compact Models for Initial MOSFET Sizing Based on Higher-order Artificial Neural Networks
Simple MOSFET models intended for hand analysis are inaccurate in deep sub-micrometer
process technologies and in the moderate inversion region of device operation. Accurate
models, such as the Berkeley BSIM6 model, are too complex for use in hand analysis
and are intended for circuit simulators. Artificial neural networks (ANNs) are efficient
at capturing both linear and non-linear multivariate relationships. In this work,
a straightforward modeling technique is presented using ANNs to replace the BSIM model
equations. Existing open-source libraries are used to quickly build models with error
rates generally below 3%. When combined with a novel approach, such as the gm/Id systematic
design method, the presented models are sufficiently accurate for use in the initial
sizing of analog circuit components without simulation.
An Efficient and Flexible Learning Framework for Dynamic Power and Thermal Co-Management
At the era of Artificial Intelligence and Internet of Things (AIoT), battery-powered
mobile devices are required to perform more sophisticated tasks featured with fast
varying workloads and constrained power supply, demanding more efficient run-time
power management. In this paper, we propose a deep reinforcement learning framework
for dynamic power and thermal co-management. We build several machine learning models
that incorporate the physical details for an ARM Cortex-A72, with on average 3% and
1% error for power and temperature predictions, respectively. We then build an efficient
deep reinforcement learning control incorporating the machine learning models and
facilitating the run-time dynamic voltage and frequency scaling (DVFS) strategy selection
based on the predicted power, workloads and temperature. We evaluate our proposed
framework, and compare the performance with existing management methods. The results
suggest that our proposed framework can achieve 6.8% performance improvement compared
with other alternatives.
Partial Sharing Neural Networks for Multi-Target Regression on Power and Performance
of Embedded Memories
Memories contribute significantly to the overall power, performance and area (PPA)
of modern integrated electronic systems. Owing to their regular structure, memories
are generated by memory compilers in modern industrial designs. Although such compilers
provide PPA-efficient and silicon-verified layouts, the large and growing number of
input parameters to the compilers themselves results in a new challenge of compiler
parameter selection given design requirements. The dimensionality of the search space
as well as the count of memories prohibit manual tuning in fast-paced design cycles.
To efficiently select optimal compiler parameters, we devise regression neural networks
as PPA models of memory compilers, based on which an optimal parameterization can
be selected. Highly accurate PPA estimates are a prerequisite to a reliable optimization.
While regression with multiple targets can easily be achieved by neural networks with
multiple output units, model accuracy depends highly on architecture and hyperparameters.
We study how neural network prediction error on multi-target regression problems can
be reduced, validating recent findings that partial parameter sharing is beneficial
to this class of problems. Our real-world application confirms the benefits of partial
sharing for multi-target regression, and asserts the applicability to the sigmoid
activation function. The accuracy of memory compiler PPA prediction is improved by
approximately ten percent on average, decreasing worst-case prediction errors by over
50 percent.
Explaining and Interpreting Machine Learning CAD Decisions: An IC Testing Case Study
We provide a methodology to explain and interpret machine learning decisions in Computer-Aided
Design (CAD) flows. We demonstrate the efficacy of the methodology to the VLSI testing
case. Such a tool will provide designers with insight into the “black box” machine
learning models/classifiers through human readable sentences based on normally understood
design rules or new design rules. The methodology builds on an intrinsically explainable,
rule-based ML framework, called Sentences in Feature Subsets (SiFS), to mine human
readable decision rules from empirical data sets. SiFS derives decision rules as compact
Boolean logic sentences involving subsets of features in the input data. The approach
is applied to test point insertion problem in circuits and compared to the ground
truth and traditional design rules.
SESSION: Plenary IV
Session details: Plenary IV
Machine-Learning Enabled Next-Generation Physical Design – An EDA Perspective
Physical design is an ensemble of NP-complete problems that P&R tools attempt to solve
in (pseudo) linear time. Advanced process nodes and complex signoff requirements bring
in new physical and timing constraints into the implementation flow, making it harder
for physical design algorithms to deliver industry-leading power, performance, area
(PPA), without giving up design turn-around-time. The relentless pursuit for low-power
high-performance designs is putting constant pressure to limit any over-design, creating
an acute need to have better models/predictions and advanced analytics to drive implementation
flows. Given the advancements in supervised and reinforcement learning, combined with
the availability of large-scale compute, Machine Learning (ML) has the potential to
become a disruptive paradigm change for EDA tools. In this talk, I would like to share
some of the challenges and opportunities for innovation in next-generation physical
design using ML.
Biography: Vishal leads the physical optimization team for the Digital Implementation
products at Synopsys. He has 15 years of R&D experience in building state-of-the-art
optimization engines and P&R flows targeting advanced-node low-power high-performance
designs. More recently, he has been looking at bringing machine-learning paradigms
into digital implementation tools to improve power, performance, area and productivity.
Vishal has a B.Tech. from Indian Institute of Technology, Kanpur and a Ph.D. from
University of Maryland, College Park. He has won a best paper award at ISPD, co-authored
several patents and over 20 IEEE/ACM publications.
SESSION: Panel
Session details: Panel
ML for CAD – Where is the Treasure Hiding?
Advances in ML have revolutionized its effectiveness for a variety of applications.
Indeed, in areas like image classification and NLP, ML (AI) has changed the rules
of the game and opened the door to incredible advances. Design processes seem to match
the ML paradigm perfectly. This mature area is highly automated, combines advanced
analytic techniques, and generates large volumes of data that are used during the
processes. With the promise of saving resources and improving quality, ML for CAD
has attracted a lot of attention in the industry and academia. This is well reflected
in conferences and journals; and the most advanced success stories and works-in-progress
are being presented at MLCAD-2020.
SESSION: Session 5: ML for Systems
Session details: Session 5: ML for Systems
Using Machine Learning Clustering To Find Large Coverage Holes
Identifying large and important coverage holes is a time-consuming process that requires
expertise in the design and its verification environment. This paper describes a novel
machine learning-based technique for finding large coverage holes when the coverage
events are individually defined. The technique is based on clustering the events according
to their names and mapping the clusters into cross-products. Our proposed technique
is being used in the verification of high-end servers. It has already improved the
quality of coverage analysis and helped identify several environment problems.
Exploring Logic Optimizations with Reinforcement Learning and Graph Convolutional
Network
Logic synthesis for combinational circuits is to find the minimum equivalent representation
for Boolean logic functions. A well-adopted logic synthesis paradigm represents the
Boolean logic with standardized logic networks, such as and-inverter graphs (AIG),
and performs logic minimization operations over the graph iteratively. Although the
research for different logic representation and operations is fruitful, the sequence
of using the operations are often determined by heuristics. We propose a Markov decision
process (MDP) formulation of the logic synthesis problem and a reinforcement learning
(RL) algorithm incorporating with graph convolutional network to explore the solution
search space. The experimental results show that the proposed method outperforms the
well-known logic synthesis heuristics with the same sequence length and action space.
AdaPool: Multi-Armed Bandits for Adaptive Virology Screening on Cyber-Physical Digital-Microfluidic
Biochips
Cyber-physical digital microfluidics is a versatile lab-on-chip technology that offers
key advantages in reconfigurability, manufacturability, and sensor integration. Critical
applications such as point-of-care testing (POCT) are expected to benefit the most
from this technology, thus motivating a great body of literature that addresses performance,
cost, and reliability using design-automation methodologies. Despite this effort,
today’s solutions are unable to support the most critical application in the modern
era; that is, cost-effective POCT for rapid virology screening. This application poses
new design challenges related to the testing capacity and adaptability to the infection
distribution within target populations. To support this application, we present a
reinforcement-learning method that enables a cyber-physical digital-microfluidic platform
to learn from its testing results. The proposed method, named AdaPool, uses multi-armed
bandits to infer the dynamics of viral infection and hence adapt the microfluidic
system to an effective testing strategy. Simulation results illustrate the effectiveness
of the proposed method at different infection conditions.
Automatic compiler optimization on embedded software through k-means clustering
Generating instead of implementing variable design platforms is becoming increasingly
popular in the development of System on Chips. This shift also poses the challenge
of rapid compiler optimization that adapts to each newly generated platform. In this
paper, we evaluate the impact of 104 compiler flags on memory usage and core execution
time against standard optimization levels. Each flag has a different influence on
these costs, which is difficult to predict. In this work, we apply cost estimation
methods to predict the impact of each flag on the generated core using unsupervised
Machine Learning, in the form of k-means clustering. The key strengths of the approach
are the low need for data, the adaptability to new cores, and the ease of use. This
helps the designer to understand the impact of flags on related applications, showing
which combination is optimizing the most. As a result, we can obtain 20,93% optimization
on the software size, 3,10% on the performance, and 1,75% on their trade-off beyond
the -O3 optimization.
Transfer Learning for Design-Space Exploration with High-Level Synthesis
High-level synthesis (HLS) raises the level of design abstraction, expedites the process
of hardware design, and enriches the set of final designs by automatically translating
a behavioral specification into a hardware implementation. To obtain different implementations,
HLS users can apply a variety of knobs, such as loop unrolling or function inlining,
to particular code regions of the specification. The applied knob configuration significantly
affects the synthesized design’s performance and cost, e.g., application latency and
area utilization. Hence, HLS users face the design-space exploration (DSE) problem,
i.e. determine which knob configurations result in Pareto-optimal implementations
in this multi-objective space. Whereas it can be costly in time and resources to run
HLS flows with an enormous number of knob configurations, machine learning approaches
can be employed to predict the performance and cost. Still, they require a sufficient
number of sample HLS runs. To enhance the training performance and reduce the sample
complexity, we propose a transfer learning approach that reuses the knowledge obtained
from previously explored design spaces in exploring a new target design space. We
develop a novel neural network model for mixed-sharing multi-domain transfer learning.
Experimental results demonstrate that the proposed model outperforms both single-domain
and hard-sharing models in predicting the performance and cost at early stages of
HLS-driven DSE.
Footprint Classification of Electric Components on Printed Circuit Boards
The market of Printed Circuit Boards (PCBs) is growing fast with the population of
the Internet of Things. Therefore, PCB manufacturers require an effective design methodology
to accelerate the PCB manufacturing processes. To design PCBs for new components,
footprints which contain component information are needed to mount components on a
PCB. However, current footprint design relies on experienced engineers and they may
not maintain rule guidelines, which makes it a time-consuming work in the design flow.
To achieve footprint design automation, analysis of footprint design rule is necessary
and footprint classification can help sorting out design rules for different type
of components. In this paper, we adopt both footprint and file name information to
classify footprints. Through the proposed methodology, we can classify the footprint
libraries with higher accuracy so as to achieve footprint design automation.