News

Remembering Arvind

From Friends, Colleagues, and Students of Arvind

It is with a heavy heart that we write to share the news that on June 17th 2024, we lost our beloved colleague, mentor, and friend Arvind. A...

Read More...

Call for SIGDA Newsletter Editor-in-Chief

ACM SIGDA announces the call for Editor-in-Chief for the SIGDA Newsletter, a monthly publication for news and event information in the design automation area. The Editor-in-Chief, along with the e...

Read More...

IEEE/ACM A. Richard Newton Technical Impact Award in Electronic Design Automation

The IEEE/ACM A. Richard Newton Technical Impact Award in Electronic Design Automation was conferred at DAC 2023 upon Moshe Vardi and Pierre Wolper for their research work “An Automata-Theoretic Ap...

Read More...

Highlights of CADAthlon Brazil 2023

The CADAthlon Brazil 2023 – 3rd Brazilian Programming Contest for Design Automation of Integrated Circuits (https://csbc.sbc.org.br/2023/cadathlon-brasil-en/) took place on August 8th in João Pess...

Read More...

Prof. Ron Rohrer receives ACM SIGDA Pioneering Achievement Award @ DAC 2023

https://www.dac.com/About/Conference-Archive/60th-DAC-2023/Awards-2023

For the introduction and evolution of simulation and analysis techniques that have supported the design and test of inte...
Read More...

Events

UDemo@DAC

UDDAC 2025: 35th ACM SIGDA / IEEE CEDA University Demonstration at Design Automation Conference

ACM SIGDA/IEEE CEDA University Demonstration (UD, previously University Booth) is an excellent op...

Read More...

LIVE

SIGDA Live is a series of webinars, launched monthly or bi-monthly, on topics (either technical or non-technical) of general interest to the SIGDA community. The talks in general fall on the last ...

Read More...

CADathlon@ICCAD

Sunday, Oct. 29, 2023  08:00 AM – 05:00 PM, In-Person

Gallery Ballroom, The Hyatt Regency San Francisco Downtown SoMa, San Francisco, CA, USA

Welcome to CADathlon@ICCAD

CADathlon...

Read More...

HACK@DAC

HACK@DAC is a hardware security challenge contest, co-located with the Design and Automation Conference (DAC), for finding and exploiting security-critical vulnerabilities in hardware and firmware...

Read More...

SDC@DAC

System Design Contest at DAC 2023

The DAC System Design Contest focuses on object detection and classification on an embedded GPU or FPGA system. Contestants will receive a training dataset pro...

Read More...

Awards

Pioneer

2025 ACM SIGDA Pioneering Achievement Award

Call for Nominations

Description: To honor a person for lifetime, outstanding contributions within the scope of electronic design automation, as evidenced by ideas pioneered in publications, industrial products, or other relevant contributions. The award is based on the impact of the contributions throughout the nominee’s lifetime.

Eligibility: Open to researchers in the field of electronic design automation who have had outstanding contributions in the field during their lifetime. Current members of the Board of the ACM SIGDA, or members of the Award Selection Committee are ineligible for the award. The awardee is usually invited to give a lecture at ICCAD.

Award Items: A plaque for the awardee, a citation, and $1000 honorarium. The honorarium will be funded by the SIGDA annual budget.

Nominee Solicitation: The call for nominees will be published by email to members of SIGDA, on the website of ACM SIGDA, and in the SIGDA newsletter. The nomination should be proposed by someone other than the nominee. The nomination materials should be emailed to sigda.acm@gmail.com (Subject: ACM SIGDA Pioneering Achievement Award). Nominations for the award should include:

  • A nomination letter that gives: a 100-word description of the nominee’s contribution and its impact; a 750-word detailed description of up to 10 of the nominee’s major products (papers, patents, software, etc.), the contributions embodied in those products, and their impact; a list of at most 10 citations to the major products discussed in the description.
  • Up to three letters of recommendation (not including the nominator or nominee).
  • Contact information of the nominator.

In addition to the evidence of impact, the nomination package will include biographical information (including education and employment), professional activities, publications, and recognition. Up to three endorsements attesting to the impact of the work may be included.

Award Committee:

Wanli Chang (Chair)

Alberto Sangiovanni-Vincentelli (UC Berkeley)

Giovanni De Micheli (EPFL)

John Hayes (University of Michigan at Ann Arbor)

Jiang Hu (TAMU)

All standard conflict of interest regulations as stated in ACM policy will be applied. Any awards committee members will recuse themselves from consideration of any candidates where a conflict of interest may exist.

Schedule: The submission deadline for the 2025 Award is 31 July 2025.

Selection/Basis for Judging: This award honors an individual who has made an outstanding technical contribution in the scope of electronic design automation throughout his or her lifetime. The award is based on the impact of the contributions as indicated above. Nominees from universities, industry, and government worldwide will be considered and encouraged. The award is not a best paper or initial original contribution award. Instead, it is intended for lifetime, outstanding contributions within the scope of electronic design automation, throughout the nominee’s lifetime.

Presentation: The award is planned to be presented annually at DAC as well as the SIGDA Annual Member Meeting and Dinner at ICCAD.

2024: John Darringer, IBM
2022: Ron Rohrer, SMU, CMU
For the introduction and evolution of simulation and analysis techniques that have supported the design and test of integrated circuits and systems for more than half a century.
2021: Prof. Rob Rutenbar, PITT
For his pioneering work and extraordinary leadership in analog design automation and general EDA education.
2020: Prof. Jacob A. Abraham, UT Austin
For pioneering and fundamental contributions to manufacturing testing and fault-tolerant operation of computing systems.
2019: Prof. Giovanni De Micheli, EPFL
For pioneering and fundamental contributions to synthesis and optimization of integrated circuits and networks-on-chip.
2018: Prof. Alberto Sangiovanni Vincentelli, UC Berkeley
For pioneering and fundamental contributions to design automation research and industry, in system-level design, embedded systems, logic synthesis, physical design and circuit simulation.
2017: Prof. Mary Jane Irwin, Pennsylvania State University
For contributions to VLSI architectures, electronic design automation and community membership.
2016: Prof. Chung Laung (Dave) Liu, National Tsing Hua University, Taiwan (emeritus)
For the fundamental and seminal contributions to physical design and embedded systems.

2014: Prof. John P. Hayes, University of Michigan

 
2013: Prof. Donald E. Thomas, Carnegie Mellon University
For his pioneering work in making the Verilog Hardware Description Language more accessible for the design automation community and allowing for faster and easier pathways to simulation, high-level synthesis, and co-design of hardware-software systems.
2012: Dr. Louise Trevillyan, IBM
Recognizing her almost-40-year career in EDA and her groundbreaking research contributions in logic and physical synthesis, design verification, high-level synthesis, processor performance analysis, and compiler technology.
2011: Prof. Robert K. Brayton, UC Berkeley
For outstanding contributions to the field of Computer Aided Design of integrated systems over the last several decades.
2010: Prof. Scott Kirkpatrick, The Hebrew University of Jerusalem
On Solving Hard Problems by Analogy
Automated electronic design is not the only field in which surprising analogies from other fields of science have been used to deal with the challenges of very large problem sizes, requiring optimization across multiple scales, with constraints which eliminate any elegant solutions. Similar opportunities arise, for example, in logistics, in scheduling, in portfolio optimization and other classic problems. The common ingredient in all of these is that the problems are fundamentally frustrated, in that conflicting objectives must be traded off at all scales. This, plus the irregular structure in such real world problems eliminates any easy routes to the best solutions. Of course, in engineering, the real objective is not a global optimum, but a solution that is “good enough” and can be obtained “soon enough” to be useful. The model in materials science that gave rise by analogy to simulated annealing is the spin glass, which recently surfaced again in computer science as a vehicle whose inherent complexity might answer the long-vexing question of whether P can be proved not equal to NP.
2009: Prof. Martin Davis, NYU
 For his fundamental contributions to algorithms for solving the Boolean Satisfiability problem, which heavily influenced modern tools for hardware and software verifciation, as well as logic circuit synthesis.
2008: Prof. Edward J. McCluskey, Stanford

 For his outstanding contributions to the areas of CAD, test and reliable computing during the past half of century.
2007: Dr. Gene M. Amdahl, Amdahl Corporation
Award citation: For his outstanding contributions to the computing industry on the occasion of the 40th anniversary of Amdahl’s Law.
Video of Dr. Amdahl’s dinner talk and a panel debate are available on the ACM digital library.
Read More...

ONFA

SIGDA Outstanding New Faculty Award

Call for Applications

The ACM SIGDA Outstanding New Faculty Award (ONFA) recognizes a junior faculty member early in her or his academic career who demonstrates outstanding potential as an educator and/or researcher in the field of electronic design automation. While prior research and/or teaching accomplishments are important, the selection committee will especially consider the impact that the candidate has had on her or his department and on the EDA field during the initial years of their academic appointment. The 2025 award will be presented at ICCAD 2025, consisting of a USD $1,000 cash prize to the faculty member, along with a plaque and a citation.

Eligibility: Outstanding new faculty who are developing academic careers in areas related to electronic design automation are encouraged to apply for this award. Note that this award is not intended for senior or highly experienced investigators who have already established independent research careers, even if they are new to academia. Candidates must have recently completed at least one full academic year and no more than four and a half full academic years in a tenure-track position. Applications will also be considered from people whose appointments are continuing (non-visiting) positions with substantial educational responsibilities regardless whether or not they are tenure track. Persons holding research-only positions are not eligible. Exceptions to the timing requirements will be made for persons who have interrupted their academic careers for substantive reasons, such as family or medical leave. The presence of such reasons must be attested by the sponsoring institution, but no explanation is needed.

Deadline for the 2025 Award: 31 May 2025

Application: Candidates applying for the award must submit the following to the selection committee:

  1. a 2-page statement summarizing the candidate’s teaching and research accomplishments since beginning their current academic position, as well as an indication of plans for further development over the next five years;
  2. a copy of a current curriculum vitae;
  3. a letter from either the candidate’s department chair or dean endorsing the application.

The nomination materials should be emailed by the deadline to sigda.acm@gmail.com (Subject: ACM SIGDA Outstanding New Faculty Award). Endorsement letters may be sent separately.

Award Committee:

Ron Duncan (Synopsys)
Tsung-Yi Ho (CUHK)
Ambar Sarkar (Nvidia)
Chengmo Yang (Delaware)
Dirk Ziegenbein (Bosch)

All standard conflict of interest regulations as stated in ACM policy will be applied. Any awards committee members will recuse themselves from consideration of any candidates where a conflict of interest may exist.
 

Past Awardees

2024Bonan YanPeking University
2023Tsung-Wei HuangUniversity of Utah
2022 Yingyan (Celine) LinRice University
2021Zheng ZhangUC Santa Barbara
2020Pierre-Emmanuel GaillardonUniversity of Utah
2019Jeyavijayan (JV) Rajendran Texas A&M University
2018Shimeng YuArizona State University
2017 Yier JinUniversity of Florida
2016 Swaroop GhoshUniversity of South Florida
2015 Muhammad ShafiqueKarlsruhe Institute of Technology
2014 Yiran ChenUniversity of Pittsburgh
2013 Shobha VasudevanUIUC
2012David AtienzaEPFL, Switzerland
2011 Farinaz KoushanfarRice University
2010Puneet GuptaUCLA
Deming ChenUIUC
2009Yu CaoArizona State University
2008Subhasish MitraStanford University
2007 Michael OrshanskyUniversity of Texas, Austin
2006David PanUniversity of Texas, Austin
2004Kaustav BanerjeeUniversity of California, Santa Barbara
Igor MarkovUniversity of Michigan, Ann Arbor
2003Dennis SylvesterUniversity of Michigan, Ann Arbor
2002Charlie Chung-Ping ChenUniv. of Wisconsin, Madison
2000Vijay NarayananPenn State University
Read More...

OPDA

2025 ACM Outstanding Ph.D. Dissertation Award in Electronic Design Automation

Call for Nominations

Design automation has gained widespread acceptance by the VLSI circuits and systems design community. Advancement in computer-aided design (CAD) methodologies, algorithms, and tools has become increasingly important to cope with the rapidly growing design complexity, higher performance and low-power requirements, and shorter time-to-market demands. To encourage innovative, ground-breaking research in the area of electronic design automation, the ACM’s Special Interest Group on Design Automation (SIGDA) has established an ACM award to be given each year to an outstanding Ph.D. dissertation that makes the most substantial contribution to the theory and/or application in the field of electronic design automation.

The award consists of a plaque and an honorarium of USD $1,000. The 2025 Award will be presented at ICCAD 2025 in November 2025. The award is selected by a committee of experts from academia and industry in the field and appointed by ACM in consultation with the SIGDA Chair.

Deadline for the 2025 Award: 30 April 2025

Eligibility and nomination requirements: For the 2025 Award, the nominated dissertation should date between 1 July 2023 and 31 December 2024. Each nomination package should consist of:

  • The PDF file of the Ph.D. dissertation in the English language;
  • A statement (up to two pages) from the nominee explaining the significance and major contributions of the work;
  • A nomination letter from the nominee’s advisor or department chair or dean of the school endorsing the application;
  • Optionally, up to three letters of recommendation from experts in the field.

The nomination materials should be emailed to sigda.acm@gmail.com (Subject: ACM Outstanding Ph.D. Dissertation Award in EDA). Recommendation letters may be sent separately.

Award Committee:

Ismail Bustany (AMD)

Mustafa Badaroglu (QualComm)

Jintong Hu (Pittsburg)

Sharad Malik (Princeton)

Mark Ren (Nvidia)

Aviral Shrivastava (ASU)

Linghao Song (Yale)

Peh Li Shiuan (NUS)

Natarajan Viswanathan (Cadence)

Robert Wille (TUM)

All standard conflict of interest regulations as stated in ACM policy will be applied. Any award committee members will recuse themselves from consideration of any candidates where a conflict of interest may exist.
 

Past Awardees

2024Lukas Burgholzer, for the dissertation “Design Automation Tools and Software for Quantum Computing”, Johannes Kepler University Linz. Advisors: Robert Wille and Jens Eisert.
2023Zhiyao Xie, for the dissertation “Intelligent Circuit Design and Implementation with Machine Learning”, Duke University, Advisors: Yiran Chen and Hai Li
2022Ganapati Bhat, for the dissertation “Design, Optimization, and Applications of Wearable IoT Devices”, Arizona State University, Advisor: Umit Y. Ogras
2021Ahmedullah Aziz, for the dissertation “Device-Circuit Co-Design Employing Phase Transition Materials for Low power Electronics”, Purdue University, Advisor: Sumeet Gupta.
2020Gengjie Chen, for the dissertation “VLSI Routing: Seeing Nano Tree in Giga Forest,” The Chinese University of Hong Kong. Advisor: Evangeline Young.
2019Tsung-Wei Huang, for the dissertation “Distributed Timing Analysis“, University of Illinois, Urbana-Champaign. Advisor: Martin D. F. Wong.
2018Xiaoqing Xu, for the dissertation “Standard Cell Optimization and Physical Design in Advanced Technology Nodes,” University of Texas at Austin. Advisor: David Z. Pan.
Pramod Subramanyan, for the dissertation “Deriving Abstractions to Address Hardware Platform Security Challenges,” Princeton University. Advisor: Sharad Malik.
2017Jeyavijayan Rajendran, for the dissertation “Trustworthy Integrated Circuit Design,” New York University. Advisor: Ramesh Karri.
2016Zheng Zhang, for the dissertation “Uncertainty Quantification for Integrated Circuits and Microelectromechanical Systems,” Massachusetts Institute of Technology. Advisor: Luca Daniel.
2015Wenchao Li, for the dissertation Specification Mining: New Formalisms, Algorithms and Applications,” University of California at Berkeley. Advisor: Sanjit Seshia.
2014Wangyang Zhang, for the dissertation IC Spatial Variation Modeling: Algorithms and Applicaitons,” Carnegie Mellon University. Advisors: Xin Li and Rob Rutenbar
2013Duo Ding, for the dissertation CAD for Nanolithography and Nanophotonics,” University of Texas at Austin. Advisor: David Z. Pan
Guojie Luo, for the dissertation “Placement and Design Planning for 3D integrated Circuits,” UCLA. Advisor: Jason Cong
2012Tan Yan, for the dissertation “Algorithmic Studies on PCB Routing,” defended with the University of Illinois at Urbana-Champaign.
2011Nishant Patil, for the dissertation “Design and Fabrication of Imperfection-Immune Carbon Nanotube Digital VLSI Circuits,” Stanford University.
2010Himanshu Jain, for the dissertation “Verification using Satisfiability Checking, Predicate Abstraction, and Craig Interpolation,” Carnegie Mellon University.
2009Kai-Hui Chang, for the dissertation “Functional Design Error Diagnosis, Correction and Layout Repair of Digital Circuits”, University of Michigan at Ann Arbor.
2008(No award is given this year)
2007(No award is given this year)
2006Haifeng Qian of University of Minnesota, Minneapolis, Department of Electrical and Computer Engineering, for the thesis entitled Stochastic and Hybrid Linear Equation Solvers and their Applications in VLSI Design Automation.
2005Shuvendu Lahiri of Carnegie Mellon University, Department of Electrical and Computer Engineering, for a thesis entitled “Unbounded System Verification using Decision Procedure and Predicate Abstraction
2004Chao Wang of University of Colorado at Boulder, Department of Electrical Engineering, for a thesis entitled “Abstraction Refinement for Large Scale Model Checking
2003Luca Daniel of University of California, Berkeley Department of Electrical Engineering and Computer Science for a thesis entitled “Simulation and modeling techniques for signal integrity and electromagnetic interference on high frequency electronic systems”
Lintao Zhang of Princeton University Department of Electrical Engineering for a thesis entitled “Searching for truth: techniques for satisfiability of Boolean formulas.
2002(No award is given this year)
2001Darko Kirovski from University of California, Los Angeles Department of Computer Science for a thesis entitled “Constraint Manipulation Techniques for Synthesis and Verification of Embedded Systems.” The runner-up who received an honorable mention in that years ceremony was Michael Beattie of Carnegie Mellon University Department of Electrical and Computer Engineering for a thesis entitled “Efficient Electromagnetic Modeling for Giga-scale IC Interconnect.” 
2000Robert Brent Jones of Stanford University Department of Electrical Engineering for a thesis entitled Applications of Symbolic Simulation To the Formal Verification of Microprocessors.”
Read More...

Newton

ACM/IEEE A. Richard Newton Technical Impact Award in Electronic Design Automation 2025

Call for Nominations

Description

To honor a person or persons for an outstanding technical contribution within the scope of electronic design automation, as evidenced by a paper published at least ten years before the presentation of the award (before July 2015).

Prize

USD 1500 to be shared amongst the authors and a plaque for each author.

Funding

Funded by the IEEE Council on Electronic Design Automation and ACM Special Interest Group on Design Automation.

Presentation

Presented annually at the Design Automation Conference.

Historical Background

A. Richard Newton, one of the foremost pioneers and leaders of the EDA field, passed away on 2 January 2007, of pancreatic cancer at the age of 55.

A. Richard Newton was professor and dean of the College of Engineering at the University of California, Berkeley. Newton was educated at the University of Melbourne and received his bachelor’s degree in 1973 and his master’s degree in 1975. In the early 1970s he began to work on SPICE, a simulation program initially developed by Larry Nagel and Donald Pederson to analyze and design complex electronic circuitry with speed and accuracy. In 1978, Newton earned his Ph.D. in electrical engineering and computer sciences from UC Berkeley.

For his research and entrepreneurial contributions to the electronic design automation industry, he was awarded the 2003 Phil Kaufman Award. In 2004, he was named a member of the National Academy of Engineering, and in 2006, of the American Academy of Arts and Sciences. He was a member of the Association for Computing Machinery and a fellow of the Institute of Electrical and Electronics Engineers.

Basis for Judging

The prime consideration will be the impact on technology, industry, and education, and on working designers and engineers in the field of EDA. Such impact might include a research result that inspires much innovative thinking, or that has been put into wide use in practice.

Eligibility

The paper must have passed through a peer-review process before publication, be an archived conference or journal publication available from or published by either ACM or IEEE, and be a seminal paper where an original idea was first described. Follow-up papers and extended descriptions of the work may be cited in the nomination, but the award is given for the initial original contribution.

Selection Committee

Chair: Wanli Chang

Vice-Chair: Deming Chen

Members to be announced

Nomination Deadline

21 March 2025

Nomination Package

Please send a one-page nomination letter explaining the impact of the nominated paper, evidence of the impact, biography of the nominator, at most three endorsements, and the nominated paper itself, all in one PDF file, to sigda.acm@gmail.com (Subject: 2025 A. Richard Newton Technical Impact Award in Electronic Design Automation).
 

Past Awardees

  • 2024: Mircea Stan and Wayne Burleson, “Bus-Invert Coding for Low-Power I/O”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, Vol. 3, No. 1, pp. 49-58, March 1995.
  • 2023: Moshe Vardi and Pierre Wolper for their research work “An Automata-Theoretic Approach to Automatic Program Verification”, published in the proceedings of the 1st Symposium on Logic in Computer Science, 1986.
  • 2022: Ricardo Telichevesky, Kenneth S. Kundert, and Jacob K. White, “Efficient Steady-State Analysis based on Matrix-Free Krylov-Subspace Methods”, In Proc. of the 32nd Design Automation Conference, 1995.
  • 2021: John A. Waicukauski, Eric Lindbloom, Barry K. Rosen, and Vijay S. Iyengar, “Transition Fault Simulation,” IEEE Design & Test of Computers, Vol. 4, no. 2, April 1987
  • 2020: Luca Benini and Giovanni De Micheli, “Networks on Chips: A New SoC Paradigm,” IEEE Computer, pp. 70-78, January 2002.
  • 2019: E. B. Eichelberger and T. W. Williams, “A Logic Design Structure for LSI Testability,” In Proc. of the 14th Design Automation Conference, 1977.
  • 2018: Hans Eisenmann and Frank M. Johannes, “Generic Global Placement and Floorplanning,” In Proc. of the 35th Design Automation Conference, 1998.
  • 2017: Matthew W. Moskewicz, Conor F. Madigan, Ying Zhao, Lintao Zhang, and Sharad Malik, “Chaff: Engineering an Efficient SAT Solver,” In Proc. of the 38st Design Automation Conference, 2001.
  • 2016: Chandu Visweswariah, Kaushik Ravindran, Kerim Kalafala, Steven G. Walker, Sambasivan Narayan, “First-Order Incremental Block-Based Statistical Timing Analysis,” In Proc. of the 41st Design Automation Conference, 2004.
  • 2015: Blaise Gassend, Dwaine Clarke, Marten van Dijk, and Srinivas Devadas, “Silicon Physical Random Functions,” In Proceedings of the 9th ACM Conference on Computer and Communications Security (CCS), 2002.
  • 2014: Subhasish Mitra and Kee Sup Kim, “X-compact: an efficient response compaction technique for test cost reduction,” IEEE International Test Conference, 2002.
  • 2013: Keith Nabors and Jacob White, “FastCap: A multipole accelerated 3-D capacitance extraction program,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Vol. 10, Issue 11 (1991): 1447-1459.
  • 2012: Altan Odabasioglu, Mustafa Celik, Larry Pileggi, “PRIMA: Passive Reduced-Order Interconnect Macromodeling Algorithm,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Aug., 1998.
  • 2011: Jason Cong, Eugene Ding, “FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Jan., 1994.
  • 2010: Randal Bryant, “Graph-based algorithms for Boolean function manipulation” IEEE Transactions on Computers, Aug., 1986.
  • 2009: Robert K. Brayton, Richard Rudell, Alberto Sangiovanni-Vincentelli, Albert R. Wang, “MIS: A Multiple-Level Logic Optimizations System,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, Nov., 1997.
Read More...

Service

Service Awards

SIGDA has restructured its service awards, and will be giving two annual service awards.

  • Distinguished Service Award: The SIGDA Distinguished Service Award is given to individuals who have dedicated many years of their career in extraordinary services to promoting, leading, or creating ACM/SIGDA programs or events.
  • Meritorious Service Award: The SIGDA Meritorious Service Award is given to individuals who have performed professional services above and beyond traditional service to promoting, leading, or creating ACM/SIGDA programs or events.

At any given year, the number of Distinguished Service Award will be up to 2, and the number of Meritorious Service Award will be up to 4.

Nominations should consist of:

  • Award type being nominated.
  • Name, address, phone number and email of person making the nomination.
  • Name, affiliation, address, email, and telephone number of the nominee for whom the award is recommended.
  • A statement (between 200 and 500 words long) explaining why the nominee deserves the award. Note that the award is given for service that goes above and beyond traditional services.
  • Up to 2 additional letters of support. Include the name, affiliation, email address, and telephone number of the letter writer(s). Supporters of multiple candidates are strongly encouraged to compare the candidates in their letters.

Note that the nominator and reference shall come from active SIGDA volunteers. Deadline of the nomination every year: March 15 (Except 2019, May 5).

Please send all your nomination materials as one pdf file to SIGDA-Award@acm.org before the deadline.

Distinguished Service Awards

2023Tulika Mitra, National University of Singapore
For her leadership in major SIGDA conferences such asgeneral chair for ICCAD and ESWEEK“.
Patrick Groeneveld, Stanford University
For his multi-year significant contribution to the EDA community, such as DAC finance chair among many other“.
2022Vijay Narayanan, The Pennsylvania State University
For Extraordinary Dedication and Leadership to SIGDA“.
Harry Foster, Siemens EDA
For Extraordinary Dedication and Persistence in Leading DAC during Pandemic“.
2021Deming Chen, University of Illinois Urbana-Champaign
For distinguished contributions to the design automation and reconfigurable computing communities“.
Evangeline F. Y. Young, Chinese University of Hong Kong
For outstanding leadership in promoting diversity in the ACM/SIGDA community“.
2020Sri Parameswaran, University of New South Wales
“For leadership and distinguished service to the EDA community“.
2019Naehyuck Chang, Korea Advanced Institute of Science and Technology
For many years of impactful service to ACM/SIGDA in various leadership positions“.
Sudeep Pasricha, Colorado State University
For a decade of outstanding service to ACM/SIGDA in various volunteer positions
2018Chuck Alpert, Cadence Design Systems
For significant contributions to DAC”.
Jörg Henkel, Karlsruhe Institute of Technology
For leading SIGDA efforts in Europe and DATE”.
Michael ‘Mac’ McNamara, Adapt-IP
For sustained contributions to the design automation community and DAC”.
Michelle Clancy, Cayenne Communication
For sustained contributions to the community, especially DAC”.
2016Steven Levitan
In recognition of a lifetime of devoted service to ACM SIGDA and the Electronic Design Automation community.
2015Tatsuo Ohtsuki, Waseda University
Hiroto Yasuura, Kyushu University
Hidetoshi Onodera, Kyoto University
For their distinguished contributions to the Asia and South Pacific Design automation Conference (ASPDAC) as well as their many years of dedicated service on the conference’s steering committee
Massoud Pedram, University of Southern California
For his many years of service as Editor in Chief of the ACM Transactions on Design Automation of Electronic Systems (TODAES)
2014Peter Marwedel, Technical University of Dortmund
For his muiltiple years of service starting and maintaining the DATE PhD Forum
2012Joe Zamreno, Iowa State University
Baris Taskin, Drexel University
2011Peter Feldman, IBM
Radu Marculescu, CMU
Qinru Qiu, Syracuse University
Martin Wong, UIUC
Qing Wu, Air Force Rome Labs
2010Alex K. Jones
For dedicated service to ACM/SIGDA and the Design Automation Conference as director of the University Booth
Matt Guthaus
For dedicated service as director of SIGDA CADathlon at ICCAD program and Editor-in-Chief of the SIGDA E-Newsletter
Diana Marculescu
For dedicated service as SIGDA Chair, and contributions to SIGDA, DAC and the EDA Profession
2009Nikil Dutt
For contributions to ACM’s Special Interest Group on Design Automation during the past fifteen years as a SIGDA officer, coordinator of the University Booth in its early years, and most recently, as Editor-in-Chief of the ACM Transactions on Design Automation of Electronic Systems
2008SungKyu Lim
For his contributions to the DAC University Booth.”
2007Richard Goering
For his contributions as EE Times Editorial Director for Design Automation for more than two decades
Gary Smith
For his contributions as Chief EDA Analyst at Gartner Dataquest for almost two decades.”
Daniel Gajski
Mary Jane Irwin
Donald E. Thomas
Chuck Shaw

For outstanding contributions to the creation of the SIGDA/DAC University Booth, on the occasion of its 20th edition.”
Soha Hassoun
Steven P. Levitan

For outstanding contributions to the creation of the SIGDA Ph.D. Forum at DAC on the occasion of its 10th edition.”
Richard Auletta
For over a decade of service to SIGDA as University Booth Coordinator, Secretary/Treasurer, and Executive Committee Member-at-Large.
2006Robert Walker
For dedicated service as SIGDA Chair (2001 – 2005), and over a decade of service to SIGDA, DAC and the EDA profession.”
2005Mary Jane Irwin
For dedicated service as Editor in Chief of ACM Journal, TODAES (1998 – 2004), and many years of service to SIGDA, DAC, and the EDA profession.”
2004James P. Cohoon
For exemplary service to SIGDA, to ACM, to DAC, and to the EDA profession as a whole
2003James Plusquellic
For exemplary service to ACM/SIGDA and the Design Automation Conference as director of the University Booth program
2002Steven P. Levitan
For over a decade of service to ACM/SIGDA and the EDA industry — as DAC University Booth Coordinator, Student Design Contest organizer, founder and promoter of SIGDA’s web server, and most recently, Chair of ACM/SIGDA from 1997 to 2001.
Cheng-Kok Koh
For exemplary service to ACM/SIGDA and the EDA industry — as Co-director of SIGDA’s CDROM Project, as SIGDA’s Travel Grants Coordinator, and as Editor of the SIGDA Newsletter.”
2001Robert Grafton
For contributions to the EDA profession through his many years as the Program Director of NSF’s Design, Tools, and Test Program of the computer, Information Sciences & Engineering Directorate. In this position, he provided supervision, mentorship, and guidance to several generation of EDA tool designers and builders funded by grants from the National Science Foundation.”
2000Massoud Pedram
For his contributions in developing the SIGDA Multimedia Series and organizing the Young Student Support Program
Soha Hassoun
For developing the SIGDA Ph.D. Forum
1999C.L. (Dave) Liu
For his work in founding our flagship journal ACM/TODAES

Meritorious Service Awards

2023Robert Wille, Technical University of Munich
For his leading positions in major ACM SIGDA conferences, including Executive Committee of DATE, ICCAD and Chair of the PhD Forum at DAC and DATE“.
Lei Jiang, Indiana University Bloomington
For his leadership and contribution to SIGDA student research forums (SRFs) at ASP-DAC“.
Hui-Ru Jiang, National Taiwan University
For her continuous contribution to SIGDA PhD Forum at DAC and many other events“.
Jeyavijayan (JV) Rajendran, Texas A&M University
For his leadership in co-founding and organizing Hack@DAC, the largest hardware security competition in the world“.
2022Jeff Goeders, Brigham Young University
For Chairing System Design Contest @ DAC for the Past 3 Years“.
Cheng Zhuo, Zhejiang University
For the Leading Efforts to the Success of SRC@ICCAD and SDC@DAC as Chairs for the past5 years, and the Sustained Contributions to the EDA Community in China“.
Tsung-Wei Huang, University of Utah
For Chairing CADathlon and CAD Contests at ICCAD for Three Years. These Activities Have Engaged Hundreds of Students into CAD Research“.
Yiyu Shi, University of Notre Dame
For Outstanding Services in Leading SIGDA Educational Efforts“.
2021Bei Yu, Chinese University of Hong Kong
For service as SIGDA Web Chair from 2016 to 2021, SIGDA Student Research Competition Chair in 2018 and 2019, and other SIGDA activities“.
2020Aida Todri-Sanial, LIRMM/University of Montpellier
“For service as Co-Editor-in-Chief of SIGDA e-Newsletter from 2016 to 2019 and other SIGDA activities“.
Yu Wang, Tsinghua University
For service as Co-Editor-in-Chief of SIGDA e-Newsletter from 2017 to 2019 and other SIGDA activities”.
2019Yinhe Han, Chinese Academy of Sciences
For outstanding effort in promoting EDA and SIGDA events in China
Jingtong Hu, University of Pittsburgh
For contribution to multiple SIGDA education and outreach activities
Xiaowei Xu, University of Notre Dame
For contribution to the 2018 System Design Contest at ACM/IEEE Design Automation Conference
2015Laleh Behjat, University of Calgary
For service as chair of the SIGDA PhD forum at DAC
Soonhoi Ha, Seoul National University
Jeonghee Shin, Apple
For their service as co-chairs of the University Booth at DAC
1998Jason Cong
Bryan Preas
Kathy Preas
Chong-Chian Koh
Cheng-Kok Koh

For contributions in producing SIGDA CD ROM’s – Archiving the knowledge of the Design Automation Community
1997Robert Walker
For his hard work as Secretary/Treasurer and University Booth Coordinator
1996Debbie Hall
For serving as ACM Program Director for SIGDA for the past 6 years

Following are some awards no longer being given:

Technical Leadership Awards

2013Jarrod Roy
Sudeep Pasricha
Sudarshan Banerjee
Srinivas Katkoori

for running CADathlon
2012Cheng Zhuo
Steve Burns
Amin Chirayu
Andrey Ayupov
Gustavo Wilke
Mustafa Ozdal
2011Raju Balasuramanian
Zhuo Li
Frank Liu
Natarajan Viswanathan
2010Cliff Sze
For contributions to the ISPD Physical Design contest, and promoting research in physical design.”
2008Hai Zhou
For contributions to the SIGDA E-Newsletter (2005-2008)
Jing Yang
For contributions to the SIGDA Ph.D. Forum at DAC (2004-2008)
2007Geert Janssen
For contributions to the SIGDA CADathlon
Tony Givargis
For contributions to the SIGDA Ph.D. Forum at DAC (2005-2007)
Gi-Joon Nam
For contributions to the Physical Design Contest at ISPD (2005-2007)
2006Kartik Mohanram
For contributions to the SIGDA CADathlon at ICCAD (2004-2005)
Ray Hoare
For contributions to the SIGDA University Booth at DAC (2004-2006)
Radu Marculescu
For contributions to the SIGDA Ph.D. Forum at DAC (2004-2006)
Frank Liu
For contributions to the SIGDA Ph.D. Forum at DAC (2005-2006)
2005Florian Krohm
For contributions to the SIGDA CADathlon at ICCAD
R. Iris Bahar
Igor Markov

For contributions to the SIGDA E-Newsletter
2004Robert Jones
For contributions to the SIGDA Ph.D. Forum at DAC
2003Diana Marculescu
For contributions to the SIGDA Ph.D. Forum at DAC
2002Geert Janssen
For contributions to the SIGDA CADathlon at ICCAD
Pai Chou
Abe Elfadel
Olivier Coudert
Soha Hassoun

For contributions to the SIGDA Ph.D. Forum at DAC

1996 SIGDA Lifetime Achievement Award

Paul Weil
“For contributions to SIGDA for 13 years, most recently being in charge of Workshops.”

1996 SIGDA Leadership Award

Steve Levitan
“For Newsletter and Electronic Publishing”

1995 SIGDA Outstanding Leadership Award

Charlotte Acken
“In recognition of her efforts in managing the high school scholarship program”

1994 SIGDA Outstanding Leadership Award

Pat Hefferan
“As Editor of the SIGDA Newsletter”

Read More...

Programs

ACM SIGDA Speaker Travel Grant Program

The SIGDA Speaker Series Travel Grant actively supports the travels of the speakers who are invited to give lectures or talks in local events, universities, and companies, so as to disseminate the values and impact of SIGDA. These speakers can be from either academia or company and are considered as good lectures that can help reach out to the audiences in the broad field of design automation. Once the application is approved, SIGDA will issue partial grants to cover the speaker’s travel expenses, including travel and subsistence costs.

This grant is to help on promoting the EDA community and activities all over the world. It will provide travel support averaging $1,000 (USD) for approximately 6 eligible speakers per year to defray their costs of giving lectures or talks in local events, universities, and companies. Priority will be given to the applicants from the local sections of SIGDA with the speakers presenting in the events supported by the local sections of SIGDA. In addition, local EDA communities or individuals, rather than local sections of SIGDA, are also encouraged to apply for this grant. For the application or additional information, please contact SIGDA by sending an email exclusively to the Technical Activity Chair (https://www.sigda.org/about-us/officers/).

Review Process

The review committee will be formed by the current Technical Activity Chair and Education Chair of SIGDA. The reviews will be reported and discussed in SIGDA’s executive committee meeting. After the discussion, the executing committee members will vote to grant or not grant the submitted applications.

Selection Criteria

The review takes the applicants/events and speakers in considerations.

  • Preference is given to the local sections of SIGDA for the speakers invited to the events, universities, and companies supported by the local sections of SIGDA. In addition, the applicants from local EDA communities or individuals are also considered.
  • The invited speaker should be a good lecture or researcher from either academia or industry, and has a good track record in the broad field of design automation.

Post Applications – Report and Reimbursement

  • For the speaker giving a talk in an ACM event, SIGDA can support the travel grant and process reimbursements to the speaker directly. At the end of the event, the speaker needs to complete the ACM reimbursement form and send it to SIGDA or ACM Representative along with copies of the receipts. The speakers will also need to abide by the reimbursement policies/standards found here: https://www.acm.org/special-interest-groups/volunteer-resources/conference-planning/conference-finances#speaker
  • For the speaker giving a talk in a non-ACM event, SIGDA will provide the lump sum payment to the legal and financial sponsoring organization, which would offer the fund as the travel grants and process reimbursements. Meanwhile, the sponsoring organization needs to indicate on the event’s promotional materials that travel grants are being supported by SIGDA. At the end of the event, the sponsoring organization needs to provide (1) a one-page final report to SIGDA reflecting the success of their goals against the funds provided and indicating how the funds were spent, (2) an invoice for the approved amount, and (3) tax form. Note that there is no specific format for the final report.

Application Form

Sponsor

Synopsys

Read More...

GLSVLSI 2016 TOC

SESSION: Keynote 1

Session details: Keynote 1

  • Coskun
    Ayse

Why Is It So Hard to Make Secure Chips?

  • Witteman
    Marc

Chip security has long been the domain of smart cards. These microcontrollers are
specifically designed to thwart many different attacks in order to deliver typical
security functions as payment cards, electronic passports, and access cards. With
the …

SESSION: Keynote 2

Session details: Keynote 2

  • Han
    Jie

Design and Implementation of Real-Time Multi-sensor Vision Systems

  • Leblebici
    Yusuf

Implementation of high performance multi-camera / multi-sensor imaging systems that
are required to produce real-time video output pose a large number of unique challenges
to conventional digital design based on general-purpose processors or GPUs. In …

SESSION: Keynote 3

Session details: Keynote 3

  • Margala
    Martin

Medical Device Security: The First 165 Years

  • Fu
    Kevin

Today, it would be difficult to find medical device technology that does not critically
depend on computer software. Network connectivity and wireless communication has transformed
the delivery of patient care. The technology often enables patients to …

SESSION: Keynote 4

Session details: Keynote 4

  • Behjat
    Laleh

VLSI Design Methods for Low Power Embedded Encryption

  • Verbauwhede
    Ingrid

Intelligent things, medical devices, vehicles and factories, all part of cyberphysical
systems, will only be secure if we can build devices that can perform the mathematically
demanding cryptographic operations in an efficient way. Unfortunately, many …

SESSION: Session 1: VLSI Circuits 1

Session details: Session 1: VLSI Circuits 1

  • Navabi
    Zain

High-Speed Polynomial Multiplier Architecture for Ring-LWE Based Public Key Cryptosystems

  • Du
    Chaohui

Many lattice-based cryptosystems are based on the security of the Ring learning with
errors (Ring-LWE) problem. The most critical and computationally intensive operation
of these Ring-LWE based cryptosystems is polynomial multiplication. In this paper,

Reduced Overhead Gate Level Logic Encryption

  • Juretus
    Kyle

Untrusted third-parties are found throughout the integrated circuit (IC) design flow
resulting in potential threats in IC reliability and security. Threats include IC
counterfeiting, intellectual property (IP) theft, IC overproduction, and the insertion

A Design of a Non-Volatile PMC-Based (Programmable Metallization Cell) Register File

  • Junsangsri
    Salin

This paper presents the design of a non-volatile register file using cells made of
a SRAM and a Programmable Metallization Cell (PMC). The proposed cell is a symmetric
8T2P (8-transistors, 2PMC) design; it utilizes three control lines to ensure the …

A Clockless Sequential PUF with Autonomous Majority Voting

  • Xu
    Xiaolin

Physical unclonable functions (PUFs) leverage minute silicon process variations to
produce device-tied secret keys. The energy and area costs of creating keys from PUFs
can far exceed the costs of the basic PUF circuits alone. Minimizing the end-to-end

SESSION: Session 2: VLSI and Test

Session details: Session 2: VLSI and Test

  • Qian
    Weikang

Area-Efficient Error-Resilient Discrete Fourier Transformation Design using Stochastic
Computing

  • Yuan
    Bo

Discrete Fourier Transformation (DFT)/Fast Fourier Transformation (FFT) are the widely
used techniques in numerous modern signal processing applications. In general, because
of their inherent multiplication-intensive characteristics, the hardware …

Concurrent Error Detection for Reliable SHA-3 Design

  • Luo
    Pei

Cryptographic systems are vulnerable to random errors and injected faults. Soft errors
can inadvertently happen in critical cryptographic modules and attackers can inject
faults into systems to retrieve the embedded secret. Different schemes have been …

Secure Model Checkers for Network-on-Chip (NoC) Architectures

  • Boraten
    Travis

As chip multiprocessors (CMPs) are becoming more susceptible to process variation,
crosstalk, and hard and soft errors, emerging threats from rogue employees in a compromised
foundry are creating new vulnerabilities that could undermine the integrity of …

Parameter-importance based Monte-Carlo Technique for Variation-aware Analog Yield
Optimization

  • kondamadugula
    Sita

The Monte-Carlo method is the method of choice for accurate yield estimation. Standard
Monte-Carlo methods suffer from a huge computational burden even though they are very
accurate. Recently a Monte-Carlo method was proposed for the parametric yield …

SESSION: Session 3: VLSI Design 1

Session details: Session 3: VLSI Design 1

  • Thapliyal
    Himanshu

Low Energy Sketching Engines on Many-Core Platform for Big Data Acceleration

  • Kulkarni
    Amey

Almost 90% of the data available today was created within the last couple of years,
thus Big Data set processing is of utmost importance. Many solutions have been investigated
to increase processing speed and memory capacity, however I/O bottleneck is …

Low-Power Manycore Accelerator for Personalized Biomedical Applications

  • Page
    Adam

Wearable personal health monitoring systems can offer a cost effective solution for
human healthcare. These systems must provide both highly accurate, secured and quick
processing and delivery of vast amount of data. In addition, wearable biomedical …

Hardware Security Threats and Potential Countermeasures in Emerging 3D ICs

  • Dofe
    Jaya

New hardware security threats are identified in emerging three-dimensional (3D) integrated
circuits (ICs) and potential countermeasures are introduced. Trigger and payload mechanisms
for future 3D hardware Trojans are predicted. Furthermore, a novel, …

Real-Time Analysis for Wormhole NoC: Revisited and Revised

  • Xiong
    Qin

The network delay upper-bound analysis problem is of fundamental importance to real-time
applications in Network-on-Chip (NoC). In the paper, we revisit a state-of-the-art
analysis model for real-time communication in wormhole NoC with priority-based …

SESSION: Session 4: CAD 1

Session details: Session 4: CAD 1

  • Adegbija
    Tosiron

A New Methodology for Noise Sensor Placement Based on Association Rule Mining

  • Hung
    Yu-Hsiang

Due to near-threshold computing nowadays, voltage emergency is threatening our design
margins very seriously. Noise sensors are inserted in order to prevent various integrity
issues from happening during runtime. In this work, we use a new technique …

MCFRoute 2.0: A Redundant Via Insertion Enhanced Concurrent Detailed Router

  • Jia
    Xiaotao

In modern VLSI design, manufacturing yield and chip performance are seriously affected
by via failure. Redundant via insertion is an effective technique recommended by foundries
to deal with the via failure. However, due to the extreme scaling of …

Modular Placement for Interposer based Multi-FPGA Systems

  • Mao
    Fubing

Novel device with multiple FPGAs on-chip based on interposer interconnection has emerged
to resolve the IOs limit and improve the inter-FPGA communication delay. However,
new challenges arise for the placement on such architecture. Firstly, existing …

A Parallel Random Walk Solver for the Capacitance Calculation Problem in Touchscreen
Design

  • Xu
    Zhezhao

In this paper, a random walk based solver is presented which calculates the capacitances
for verifying the touchscreen design. To suit the complicated conductor geometries
in touchscreen structures, we extend the floating random walk (FRW) method for …

POSTER SESSION: Poster Session 1

Session details: Poster Session 1

  • Moreshet
    Tali

Real-Time Hardware Stereo Matching Using Guided Image Filter

  • Yang
    Chen

Stereo matching is a key step in stereo vision systems that require high accurate
depth information and real-time processing of high definition image streams. This
work presents a high-accuracy hardware implementation for the stereo matching based
on …

Computing Complex Functions using Factorization in Unipolar Stochastic Logic

  • Liu
    Yin

This paper addresses computing complex functions using unipolar stochastic logic.
Stochastic computing requires simple logic gates and is inherently fault-tolerant.
Thus, these structures are well suited for nanoscale CMOS technologies. Implementations

DCC: Double Capacity Cache Architecture for Narrow-Width Values

  • Imani
    Mohsen

Modern caches are designed to hold 64-bits wide data, however a proportion of data
in the caches continues to be narrow width. In this paper, we propose a new cache
architecture which increases the effective cache capacity up to 2X for the systems
with …

Static Noise Margin based Yield Modelling of 6T SRAM for Area and Minimum Operating
Voltage Improvement using Recovery Techniques

  • Batra
    Nidhi

In advanced technology nodes, the process variations deteriorate SRAM performance
and greatly affect yield. It is necessary to formulate yield estimation models to
optimize SRAMs and effectively trade-off area, performance and robustness. We propose

Asynchronous High Speed Serial Links Analysis using Integrated Charge for Event Detection

  • Dalakoti
    Aditya

We present a metric for event detection, targeted for the analysis of CMOS asynchronous
serial data links. Our metric is used to analyze signaling strategies that allow for
coincident or nearly coincident detection of both data and event timing. The …

Design and Comparative Evaluation of a Hybrid Cache Memory at Architectural Level

  • Wei
    Wei

A hybrid memory cell usually consists of a Static Random Access Memory (SRAM) and
an embedded Dynamic Random Access Memory (eDRAM) cell; hybrid cells are particularly
suitable for cache design. A novel hybrid cache memory scheme (that has also non-…

A Sampling Clock Skew Correction Technique for Time-Interleaved SAR ADCs

  • Prashanth
    Daniel

A technique for sampling clock skew correction by adjusting the delay in the input
signal to each channel in a time-interleaved (TI) ADC is proposed. A proof-of-concept
TI ADC employing this technique was implemented in a 65 nm CMOS process. The four-…

Secure and Low-Overhead Circuit Obfuscation Technique with Multiplexers

  • Wang
    Xueyan

Circuit obfuscation techniques have been proposed to conceal circuit’s functionality
in order to thwart reverse engineering (RE) attacks to integrated circuits (IC). We
believe that a good obfuscation method should have low design complexity and low …

Task-Resource Co-Allocation for Hotspot Minimization in Heterogeneous Many-Core NoCs

  • Reza
    Md Farhadur

To fully exploit the massive parallelism of many cores, this work tackles the problem
of mapping large-scale applications onto heterogeneous on-chip networks (NoCs) to
minimize the peak workload for energy hotspot avoidance. A task-resource co-…

Guiding Power/Quality Exploration for Communication-Intense Stream Processing

  • Tabkhi
    Hamed

In this paper, we explore the power/quality trade-off for streaming applications with
a shift from the computation to the communication aspects of the design. The paper
proposes a systematic exploration methodology to formulate and traverse power/…

SESSION: Session 5: Low Power 1

Session details: Session 5: Low Power 1

  • Savidis
    Ioannis

Graphene-PLA (GPLA): a Compact and Ultra-Low Power Logic Array Architecture

  • Tenace
    Valerio

The key characteristics of the next generation of ICs for wearable applications include
high integration density, small area, low power consumption, high energy-efficiency,
reliability and enhanced mechanical properties like stretchability and …

A Metastability Immune Timing Error Masking Flip-Flop for Dynamic Variation Tolerance

  • Sannena
    Govinda

In this paper, two timing error masking flip-flops have been proposed, which are immune
to metastability. The proposed flip-flops exploit the concept of either delayed data
or pulse based approach to detect timing errors. The timing violations are …

Exploring Configurable Non-Volatile Memory-based Caches for Energy-Efficient Embedded
Systems

  • Adegbija
    Tosiron

Non-volatile memory (NVM) technologies have recently emerged as alternatives to traditional
SRAM-based cache memories, since NVMs offer advantages such as non-volatility, low
leakage power, fast read speed, and high density. However, NVMs also have …

Multiple Attempt Write Strategy for Low Energy STT-RAM

  • Park
    Jaeyoung

In this paper, we demonstrate an energy-reduction strategy that exploits the stochastic
switching characteristics of STT-RAM write operation and propose a multiple-attempt
write technique needed for it. In contrast to the traditional approach which uses

SESSION: Special Session 1: IoT Security: Issues, Innovations and Interplays

Session details: Special Session 1: IoT Security: Issues, Innovations and Interplays

  • Bhunia
    Swarup

Secret Sharing and Multi-user Authentication: From Visual Cryptography to RRAM Circuits

  • Arafin
    Md Tanvir

In this era of Internet of Things (IoT), connectivity exists everywhere, among everything
(including people) at all times. Therefore, security, trust, and privacy become crucial
to the design and implementation of IoT devices [12]. However, it is …

Defense Systems and IoT: Security Issues in an Era of Distributed Command and Control

  • Palmer
    Doug

Security Meets Nanoelectronics for Internet of Things Applications

  • Rose
    Garrett S.

The internet of things (IoT) is quickly emerging as the next major domain for embedded
computer systems. Although the term IoT could be defined in a variety of different
ways, IoT always encompasses typically ordinary devices (e.g., thermostats and …

Tracking Data Flow at Gate-Level through Structural Checking

  • Le
    Thao

The rapid growth of Internet-of-things and other electronic devices make a huge impact
on how and where data travel. The confidential data (e.g., personal data, financial
information) that travel through unreliable channels can be exposed to attackers.

SESSION: Session 6: Test 2

Session details: Session 6: Test 2

  • Yu
    Qiaoyan

Design of Error-Resilient Logic Gates with Reinforcement Using Implications

  • Han
    Xijing

Operating circuits in the sub-threshold region can save power, but at the cost of
higher susceptibility to noise. This paper analyzes various gate-level error-mitigation
designs appropriate for sub-threshold circuits. Previous works have proposed a …

Reducing Soft-error Vulnerability of Caches using Data Compression

  • Mittal
    Sparsh

With ongoing chip miniaturization and voltage scaling, particle strike-induced soft
errors present increasingly severe threat to the reliability of on-chip caches. In
this paper, we present a technique to reduce the vulnerability of caches to soft-…

Workload-Aware Worst Path Analysis of Processor-Scale NBTI Degradation

  • Bian
    Song

As technology further scales semiconductor devices, aging-induced device degradation
has become one of the major threats to device reliability. In addition, aging mechanisms
like the negative bias temperature instability (NBTI) is known to be sensitive …

Enhancing Fault Emulation of Transient Faults by Separating Combinational and Sequential
Fault Propagation

  • Nyberg
    Ralph

We present a fault emulation environment capable of injecting single and multiple
transient faults in sequential as well as combinational logic. It is used to perform
fault injection campaigns during design verification of security circuits such as

SESSION: Session 7: VLSI Circuits 2

Session details: Session 7: VLSI Circuits 2

  • Li
    Hai

A Novel On-Chip Impedance Calibration Method for LPDDR4 Interface between DRAM and
AP/SoC

  • Choi
    Yongsuk

In this paper, a novel on-chip impedance calibration methodology for a LPDDR4 (low
power double data rate) application is proposed. The background calibration operates
to compensate mismatches and variations of the output NMOS drivers from process and

A General Sign Bit Error Correction Scheme for Approximate Adders

  • Zhou
    Rui

Approximate computing is an emerging design technique for error-tolerant applications.
As adders are the key building blocks in many applications, approximate adders have
been widely studied recently. However, existing approximate adders may introduce …

RRAM Refresh Circuit: A Proposed Solution To Resolve The Soft-Error Failures For HfO2/Hf 1T1R RRAM Memory
Cell

  • Tosson
    Amr M.S.

RRAM-based memory is a promising emerging technology for both on-chip and stand-alone
non-volatile data storage in advanced technologies. In addition to its small dimensions,
the RRAM device has many technological advantages including its low-…

Exploratory Power Noise Models of Standard Cell 14, 10, and 7 nm FinFET ICs

  • Patel
    Ravi

The physical dimensions of standard cells constrain the dimensions of power networks,
affecting the on-chip power noise. An exploratory modeling methodology is presented
for estimating power noise in advanced technology nodes. The models are evaluated

8T1R: A Novel Low-power High-speed RRAM-based Non-volatile SRAM Design

  • Abdelwahed
    Amr M.S. Tosson

With continuous and aggressive technology scaling, suppressing the stand-by power
is among the top priorities for SRAM design. Switching off the less-frequently accessed
blocks is an efficient way to reduce the stand-by power, provided that the …

SESSION: Session 8: Emerging 1

Session details: Session 8: Emerging 1

  • Yuan
    Bo

Polynomial Arithmetic Using Sequential Stochastic Logic

  • Saraf
    Naman

We present the design of stochastic computing systems based on sequential logic to
implement arbitrary polynomial functions. Stochastic computing is an emerging alternative
computing paradigm that performs arithmetic operations on real-valued data …

Ultra-Robust Null Convention Logic Circuit with Emerging Domain Wall Devices

  • Bai
    Yu

Despite many attractive advantages, Null Convention Logic (NCL) remains to be a niche
largely due to its high imple- mentation costs. Using emerging spintronic devices,
this paper proposes a Domain-Wall-Motion-based NCL circuit design methodology that

Inter-Tier Crosstalk Noise On Power Delivery Networks For 3-D ICs With Inductively-Coupled
Interconnects

  • Papistas
    Ioannis A.

Inductive links have been proposed as an inter-tier interconnect solution for three-dimensional
(3-D) integrated systems. Combined with signal multiplexing, inductive links achieve
high communication bandwidth comparable to that of through silicon vias. …

Delay Estimates for Graphene Nanoribbons: A Novel Measure of Fidelity and Experiments with Global Routing Trees

  • Das
    Subrata

With extreme miniaturization of traditional CMOS devices in deep sub-micron design
levels, the delay of a circuit, as well as power dissipation and area are dominated
by interconnections between logic blocks. In an attempt to search for alternative

SESSION: Session 9: CAD 2

Session details: Session 9: CAD 2

  • Velev
    Miroslav

VarDroid: Online Variability Emulation in Android/Linux Platforms

  • Mercati
    Pietro

Variability is the real big challenge for integrated circuits. Today, simulators help
to estimate the effect of variability, but fail to capture real workload dynamics
and user interactions, which are fundamental to mobile devices. This paper presents

Neural Network-based Prediction Algorithms for In-Door Multi-Source Energy Harvesting
System for Non-Volatile Processors

  • Liu
    Ning

Due to size, longevity, safety, and recharging concerns, energy harvesting is becoming
a better choice for many wearable embedded systems than batteries. However, harvested
energy is intrinsically unstable. In order to overcome this drawback, non-…

A Unified Model of Power Sources for the Simulation of Electrical Energy Systems

  • Vinco
    Sara

Models of power sources are essential elements in the simulation of systems that generate,
store and manage energy. In spite of the huge difference in power scale, they perform
a common function: converting a primary environmental quantity into power. …

Hardware-Accelerated Software Library Drivers Generation for IP-Centric SoC Designs

  • Jassi
    Munish

In recent years, the semiconductor industry has been witnessing an increasing reuse
of hardware IPs for System-on-Chip (SoC) designs and embedded computing systems on
FPGA platforms with hard-core processors. The IP-reuse comes with an increasing …

Extracting Designs of Secure IPs Using FPGA CAD Tools

  • Mirian
    Vincent

In today’s competitive market, a company’s success is strongly dependent on delivering
sophisticated and state-of-the-art IPs prior to their competitors. To take a short
cut, a company may resort to reverse engineering or pirating their competitor’s IP.

SESSION: Special Session 3: Emerging Technology Devices and Security

Session details: Special Session 3: Emerging Technology Devices and Security

  • Rajendran
    JV

Security Primitive Design with Nanoscale Devices: A Case Study with Resistive RAM

  • Karam
    Robert

Inherent stochastic physical mechanisms in emerging nonvolatile memories (NVMs), such
as resistive random-access-memory (RRAM), have recently been explored for hardware
security applications. Unlike the conventional silicon Physical Unclonable Functions

Enhancing Hardware Security with Emerging Transistor Technologies

  • Bi
    Yu

We consider how the I-V characteristics of emerging transistors (particularly those
sponsored by STARnet) might be employed to enhance hardware security. An emphasis
of this work is to move beyond hardware implementations of physically unclonable …

The Applications of NVM Technology in Hardware Security

  • Yang
    Chaofei

The emerging nonvolatile memory (NVM) technologies have demonstrated great potentials
in revolutionizing modern memory hierarchy because of their many promising properties:
nanosecond read/write time, small cell area, non-volatility, and easy CMOS …

Survey of Emerging Technology Based Physical Unclonable Funtions

  • Bautista Adames
    Ilia A.

Authentication of electronic devices has become critical. Hardware authentication
is one way to enhance security of a chip. Along with software, it makes it harder
for an intruder to access any computer, smart-phone, or other devices without …

SESSION: Session 10: VLSI Design 2

Session details: Session 10: VLSI Design 2

  • Meyer
    Brett

Trellis-search based Dynamic Multi-Path Connection Allocation for TDM-NoCs

  • Chen
    Yong

This paper proposes a centralized approach for connection allocation for TDM-based
NoCs by making use of dedicated hardware unit called NoCManager that employs trellis-based
search algorithm enabling dynamic parallel multi-path, multi-slot allocation. …

Prolonging Lifetime of Non-volatile Last Level Caches with Cluster Mapping

  • Soltani
    Morteza

Recently, work has been done on using nonvolatile cells, such as Spin Transfer Torque
RAM (STT-RAM) or Magnetic RAM (M-RAM), to construct last level caches (LLC). These
structures mitigate the leakage power and density problem found in traditional SRAM

A Low-Power Network-on-Chip Architecture for Tile-based Chip Multi-Processors

  • Psarras
    Anastasios

Technology scaling of tiled-based CMPs reduces the physical size of each tile and
increases the number of tiles per die. This trend directly impacts the on-chip interconnect;
even though the tile population increases, the inter-tile link distances scale …

Dynamic Real-Time Scheduler for Large-Scale MPSoCs

  • Ruaro
    Marcelo

Large-scale MPSoCs requires a scalable and dynamic real-time (RT) task scheduler,
able to handle non-deterministic computational behaviors. Current proposals for MPSoCs
have limitations, as lack of scalability, complex static steps, validation with …

SESSION: Special Session 4: Emerging Frontiers in Hardware Security

Session details: Special Session 4: Emerging Frontiers in Hardware Security

  • Joshi
    Ajay

Leveraging 3D Technologies for Hardware Security: Opportunities and Challenges

  • Gu
    Peng

3D die stacking and 2.5D interposer design are promising technologies to improve integration
density, performance and cost. Current approaches face serious issues in dealing with
emerging security challenges such as side channel attacks, hardware …

POSTER SESSION: Poster Session 2

Session details: Poster Session 2

  • Tabkhi
    Hamed

FCM: Towards Fine-Grained GPU Power Management for Closed Source Mobile Games

  • Song
    Jiachen

Contemporary mobile platforms employ embedded graphic processing units (GPUs) for
graphics-intensive games, and dynamic voltage and frequency scaling (DVFS) policies
are used to save energy without sacrificing quality. However, current GPU DVFS policies

Quality of Service-Aware, Scalable Cache Tuning Algorithm in Consumer-based Embedded
Devices

  • Alsafrjalani
    Mohamad Hammam

To meet energy and quality of service (QoS) constraints in consumer-based embedded
devices (CEDs), configurable caches can be tuned to a best configuration that consumes
the least amount of energy while adhering to QoS expectations. However, due to …

Temperature-aware Dynamic Voltage Scaling for Near-Threshold Computing

  • Kiamehr
    Saman

Power/energy reduction is of uttermost importance for applications with stringent
power/energy budget such as ultra-low power and energy-harvested systems. Aggressive
voltage scaling and in particular Near-Threshold Computing (NTC) is a promising …

Leakage Power Minimization in Deep Sub-Micron Technology by Exploiting Positive Slacks
of Dependent Paths

  • Chakraborty
    Tuhin Subhra

Leakage power minimization is one of the key aspects of modern multi-million low power
system-on-chip (SoC) design. In post timing-closure phase, leakage-in-place-optimization
(LIPO) is generally adopted to reduce leakage power by swapping high-leaky …

An Enhanced Analytical Electrical Masking Model for Multiple Event Transients

  • Watkins
    Adam

Due to the reducing transistor feature size, the susceptibility of modern circuits
to radiation induced errors has increased. This, as a result, has increased the likelihood
of multiple transients affecting a circuit. An important aspect when modeling …

Capturing True Workload Dependency of BTI-induced Degradation in CPU Components

  • Stamoulis
    Dimitrios

Atomistic-based approaches accurately model Bias Temperature Instability phenomena,
but they suffer from prolonged execution times, preventing their seamless integration
in system-level analysis flows. In this paper we present a comprehensive flow that

Performance Constraint-Aware Task Mapping to Optimize Lifetime Reliability of Manycore
Systems

  • Rathore
    Vijeta

Negative bias temperature instability (NBTI) has emerged as a critical challenge to
lifetime reliability of computing systems. Traditionally, temperature-aware methodologies
are used to mitigate the impact of NBTI on aging and degradation of computing …

ASIC Implementation of An All-digital Self-adaptive PVTA Variation-aware Clock Generation
System

  • Pérez-Puigdemont
    Jordi

An all-digital self-adaptive clock generation system capable of autonomously adapt
the clock frequency to compensate the effects of static spatially heterogeneous (SSHet)
PVTA variations is presented. The design uses time-to-digital converters (TDCs) as

Ultra-Low Energy Reconfigurable Spintronic Threshold Logic Gate

  • Fan
    Deliang

This paper introduces a novel design of reconfigurable Spintronic Threshold Logic
Gate (STLG), which employs spintronic weight devices to perform current mode weighted
summation of binary inputs, whereas, the low voltage spintronic threshold device …

Red-Shield: Shielding Read Disturbance for STT-RAM Based Register Files on GPUs

  • Zhang
    Hang

To address the high energy consumption issue of SRAM on GPUs, emerging Spin-Transfer
Torque (STT-RAM) memory technology has been intensively studied to build GPU register
files for better energy-efficiency, thanks to its benefits of low leakage power, …

Modeling and Study of Two-BDT-Nanostructure based Sequential Logic Circuits

  • Marthi
    Poorna

In this paper, study of different digital logic circuits developed using two-BDT ballistic
nanostructure is presented. New D flip-flop (DFF) based on the same nanostructure
is also proposed. The logic structure comprises two ballistic deflection …

SESSION: Session 11: Emerging 2

Session details: Session 11: Emerging 2

  • Dai
    Jianwen

Exploring Main Memory Design Based on Racetrack Memory Technology

  • Hu
    Qingda

Emerging non-volatile memories (NVMs), which include PC-RAM and STT-RAM, have been
proposed to replace DRAM, mainly because they have better scalability and lower standby
power. However, previous research has demonstrated that these NVMs cannot …

An Offline Frequent Value Encoding for Energy-Efficient MLC/TLC Non-volatile Memories

  • Alsuwaiyan
    Ali

This paper describes a low overhead, offline frequent value encoding (FVE) solution
to reduce the write energy in multi-level/triple-level cell (MLC/TLC) non-volatile
memories (NVMs). The proposed solution, which does not require any runtime software

Low-Power Multi-Port Memory Architecture based on Spin Orbit Torque Magnetic Devices

  • Bishnoi
    Rajendra

Multi-port memories are widely used as shared memory, such as register files, in a
microprocessor system, and its number of ports and capacities are significantly increasing
with every product generation. However, with technology advancements, multi-…

Optimizing the Operating Voltage of Tunnel FET-Based SRAM Arrays Equipped with Read/Write
Assist Circuitry

  • Afzali-Kusha
    Hassan

This paper deals with obtaining the minimum operating voltage of memory arrays based
on TFET SRAM cells. First, we compare the I-V characteristics of two TFETs and one
FDSOI using SPICE simulations based on 20nm technology models. The results reveal

SESSION: Session 12: Low Power 2

Session details: Session 12: Low Power 2

  • Kim
    Kyung Ki

Approximate Differential Encoding for Energy-Efficient Serial Communication

  • Jahier Pagliari
    Daniele

Embedded computing systems include several off-chip serial links, that are typically
used to interface processing elements with peripherals, such as sensors, actuators
and I/O controllers. Because of the long physical lines of these connections, they

Fast Thermal Simulation using SystemC-AMS

  • Chen
    Yukai

Out of the many options available for thermal simulation of digital electronic systems,
those based on solving an RC equivalent circuit of the thermal network are the most
popular choice in the EDA community, as they provide a reasonable tradeoff …

Learning-Based Near-Optimal Area-Power Trade-offs in Hardware Design for Neural Signal
Acquisition

  • Aprile
    Cosimo

Wireless implantable devices capable of monitoring the electrical activity of the
brain are becoming an important tool for understanding and potentially treating mental
diseases such as epilepsy and depression. While such devices exist, it is still …

Load Balanced On-Chip Power Delivery for Average Current Demand

  • Pathak
    Divya

A dynamic power management system for homogeneous chip multi-processors (CMP) is proposed.
Each core of the CMP includes on chip DC-DC switching buck converters that are interconnected
through a switch network. The peak current rating of the buck …

DAC 2018 TOC

Ensemble learning for effective run-time hardware-based malware detection: a comprehensive analysis and classification

  • Sayadi
    Hossein

Malware detection at the hardware level has emerged recently as a promising solution
to improve the security of computing systems. Hardware-based malware detectors take
advantage of Machine Learning (ML) classifiers to detect pattern of malicious …

Deepsecure: scalable provably-secure deep learning

  • Rouhani
    Bita Darvish

This paper presents DeepSecure, the an scalable and provably secure Deep Learning
(DL) framework that is built upon automated design, efficient logic synthesis, and
optimization methodologies. DeepSecure targets scenarios in which neither of the …

DWE: decrypting learning with errors with errors

  • Bian
    Song

The Learning with Errors (LWE) problem is a novel foundation of a variety of cryptographic
applications, including quantumly-secure public-key encryption, digital signature,
and fully homomorphic encryption. In this work, we propose an approximate …

Reverse engineering convolutional neural networks through side-channel information
leaks

  • Hua
    Weizhe

A convolutional neural network (CNN) model represents a crucial piece of intellectual
property in many applications. Revealing its structure or weights would leak confidential
information. In this paper we present novel reverse-engineering attacks on …

OFTL: ordering-aware FTL for maximizing performance of the journaling file system

  • Park
    Daekyu

Journaling of ext4 file system employs two FLUSH commands to make their data durable,
even though the FLUSH is more expensive than the ordinary write operations. In this
paper, to halve the number of FLUSH commands, we propose an efficient FTL, called

LAWN: boosting the performance of NVMM file system through reducing write amplification

  • Wang
    Chundong

Byte-addressable non-volatile memories can be used with DRAM to build a hybrid memory
system of volatile/non-volatile main memory (NVMM). NVMM file systems demand consistency
techniques such as logging and copy-on-write to guarantee data consistency in …

FastGC: accelerate garbage collection via an efficient copyback-based data migration in SSDs

  • Wu
    Fei

Copyback is an advanced command contributing to accelerating data migration in garbage
collection (GC). Unfortunately, detecting copyback feasibility (whether copyback can
be carried out with assurable reliability) against data corruption in the …

Dynamic management of key states for reinforcement learning-assisted garbage collection
to reduce long tail latency in SSD

  • Kang
    Wonkyung

Garbage collection (GC) is one of main causes of the long-tail latency problem in
storage systems. Long-tail latency due to GC is more than 100 times greater than the
average latency at the 99th percentile. Therefore, due to such a long tail latency, …

WB-trees: a meshed tree representation for FinFET analog layout designs

  • Lu
    Yu-Sheng

The emerging design requirements with the FinFET technology, along with traditional
geometrical constraints, make the FinFET-based analog placement even more challenging.
Previous works can handle only partial FinFET-induced design constraints because …

Analog placement with current flow and symmetry constraints using PCP-SP

  • Patyal
    Abhishek

Modern analog placement techniques require consideration of current path and symmetry
constraints. The symmetry pairs can be efficiently packed using the symmetry island
configurations, but not all these configurations result in minimum gate …

Multi-objective bayesian optimization for analog/RF circuit synthesis

  • Lyu
    Wenlong

In this paper, a novel multi-objective Bayesian optimization method is proposed for
the sizing of analog/RF circuits. The proposed approach follows the framework of Bayesian
optimization to balance the exploitation and exploration. Gaussian processes (…

Calibrating process variation at system level with in-situ low-precision transfer
learning for analog neural network processors

  • Jia
    Kaige

Process Variation (PV) may cause accuracy loss of the analog neural network (ANN)
processors, and make it hard to be scaled down, as well as feasibility degrading.
This paper first analyses the impact of PV on the performance of ANN chips. Then proposes

DPS: dynamic precision scaling for stochastic computing-based deep neural networks

  • Sim
    Hyeonuk

Stochastic computing (SC) is a promising technique with advantages such as low-cost,
low-power, and error-resilience. However so far SC-based CNN (convolutional neural
network) accelerators have been kept to relatively small CNNs only, primarily due
to …

Dyhard-DNN: even more DNN acceleration with dynamic hardware reconfiguration

  • Putic
    Mateja

Deep Neural Networks (DNNs) have demonstrated their utility across a wide range of
input data types, usable across diverse computing substrates, from edge devices to
datacenters. This broad utility has resulted in myriad hardware accelerator …

Exploring the programmability for deep learning processors: from architecture to tensorization

  • Chen
    Chixiao

This paper presents an instruction and Fabric Programmable Neuron Array (iFPNA) architecture, its 28nm CMOS chip prototype, and a compiler for
the acceleration of a variety of deep learning neural networks (DNNs) including convolutional
neural networks (…

LCP: a layer clusters paralleling mapping method for accelerating inception and residual
networks on FPGA

  • Lin
    Xinhan

Deep convolutional neural networks (DCNNs) have been widely used in various AI applications.
Inception and Residual are two promising structures adopted in many important modern
DCNN models, including AlphaGo Zero’s model. These structures allow …

Ares: a framework for quantifying the resilience of deep neural networks

  • Reagen
    Brandon

As the use of deep neural networks continues to grow, so does the fraction of compute
cycles devoted to their execution. This has led the CAD and architecture communities
to devote considerable attention to building DNN hardware. Despite these efforts,

DeepN-JPEG: a deep neural network favorable JPEG-based image compression framework

  • Liu
    Zihao

As one of most fascinating machine learning techniques, deep neural network (DNN)
has demonstrated excellent performance in various intelligent tasks such as image
classification. DNN achieves such performance, to a large extent, by performing expensive

Thundervolt: enabling aggressive voltage underscaling and timing error resilience for energy efficient
deep learning accelerators

  • Zhang
    Jeff

Hardware accelerators are being increasingly deployed to boost the performance and
energy efficiency of deep neural network (DNN) inference. In this paper we propose
Thundervolt, a new framework that enables aggressive voltage underscaling of high-…

Loom: exploiting weight and activation precisions to accelerate convolutional neural networks

  • Sharify
    Sayeh

Loom (LM), a hardware inference accelerator for Convolutional Neural Networks (CNNs) is presented.
In LM every bit of data precision that can be saved translates to proportional performance
gains. For both weights and activations LM exploits profile-…

Parallelizing SRAM arrays with customized bit-cell for binary neural networks

  • Liu
    Rui

Recent advances in deep neural networks (DNNs) have shown Binary Neural Networks (BNNs)
are able to provide a reasonable accuracy on various image datasets with a significant
reduction in computation and memory cost. In this paper, we explore two BNNs: …

An ultra-low energy internally analog, externally digital vector-matrix multiplier
based on NOR flash memory technology

  • Mahmoodi
    M. Reza

Vector-matrix multiplication (VMM) is a core operation in many signal and data processing
algorithms. Previous work showed that analog multipliers based on nonvolatile memories
have superior energy efficiency as compared to digital counterparts at low-…

Coding approach for low-power 3D interconnects

  • Bamberg
    Lennart

Through-silicon vias (TSVs) in 3D ICs show a significant power consumption, which
can be reduced using coding techniques. This work presents an approach which reduces
the TSV power consumption by a signal-aware bit assignment which includes inversions

A novel 3D DRAM memory cube architecture for space applications

  • Agnesina
    Anthony

The first mainstream products in 3D IC design are memory devices where multiple memory
tiers are horizontally integrated to offer manifold improvements compared with their
2D counterparts. Unfortunately, none of these existing 3D memory cubes are ready …

A general graph based pessimism reduction framework for design optimization of timing
closure

  • Peng
    Fulin

In this paper, we develop a general pessimism reduction framework for design optimization
of timing closure. Although the modified graph based timing analysis (mGBA) slack
model can be readily formulated into a quadratic programming problem with …

Virtualsync: timing optimization by synchronizing logic waves with sequential and combinational
components as delay units

  • Zhang
    Grace Li

In digital circuit designs, sequential components such as flip-flops are used to synchronize
signal propagations. Logic computations are aligned at and thus isolated by flip-flop
stages. Although this fully synchronous style can reduce design efforts …

Noise-aware DVFS transition sequence optimization for battery-powered IoT devices

  • Luo
    Shaoheng

Low power system-on-chips (SoCs) are now at the heart of Internet-of-Things (IoT)
devices, which are well known for their bursty workloads and limited energy storage
— usually in the form of tiny batteries. To ensure battery lifetime, DVFS has become

Accurate processor-level wirelength distribution model for technology pathfinding
using a modernized interpretation of rent’s rule

  • Prasad
    Divya

Faithful system-level modeling is vital to design and technology pathfinding, and
requires accurate representation of interconnects. In this study, Rent’s rule is modernized
to cater to advanced technology and design, and applied to derive a priori …

Semi-automatic safety analysis and optimization

  • Munk
    Peter

The complexity of safety-critical E/E-systems within the automotive domain are continuously
increasing. At the same time, functional safety standards such as the ISO 26262 prescribe
analysis methods like the Fault Tree Analysis (FTA) and Failure Mode …

Reasoning about safety of learning-enabled components in autonomous cyber-physical
systems

  • Tuncali
    Cumhur Erkan

We present a simulation-based approach for generating barrier certificate functions
for safety verification of cyber-physical systems (CPS) that contain neural network-based
controllers. A linear programming solver is utilized to find a candidate …

Runtime monitoring for safety of intelligent vehicles

  • Watanabe
    Kosuke

Advanced driver-assistance systems (ADAS), autonomous driving, and connectivity have
enabled a range of new features, but also made automotive design more complex than
ever. Formal verification can be applied to establish functional correctness, but
its …

Revisiting context-based authentication in IoT

  • Miettinen
    Markus

The emergence of IoT poses new challenges towards solutions for authenticating numerous
very heterogeneous IoT devices to their respective trust domains. Using passwords
or pre-defined keys have drawbacks that limit their use in IoT scenarios. Recent …

MAXelerator: FPGA accelerator for privacy preserving multiply-accumulate (MAC) on cloud servers

  • Hussain
    Siam U.

This paper presents MAXelerator, the first hardware accelerator for privacy-preserving
machine learning (ML) on cloud servers. Cloud-based ML is being increasingly employed
in various data sensitive scenarios. While it enhances both efficiency and …

Hypernel: a hardware-assisted framework for kernel protection without nested paging

  • Kwon
    Donghyun

Large OS kernels always suffer from attacks due to their numerous inherent vulnerabilities.
To protect the kernel, hypervisors have been employed by many security solutions.
However, relying on a hypervisor has a detrimental impact on the system …

Reducing the overhead of authenticated memory encryption using delta encoding and
ECC memory

  • Yitbarek
    Salessawi Ferede

Data stored in an off-chip memory, such as DRAM or non-volatile main memory, can potentially
be extracted or tampered by an attacker with physical access to a device. Protecting
such attacks requires storing message authentication codes and counters – …

Reducing time and effort in IC implementation: a roadmap of challenges and solutions

  • Kahng
    Andrew B.

To reduce time and effort in IC implementation, fundamental challenges must be solved.
First, the need for (expensive) humans must be removed wherever possible. Humans are
skilled at predicting downstream flow failures, evaluating key early decisions …

Efficient reinforcement learning for automating human decision-making in SoC design

  • Sadasivam
    Shankar

The exponential growth in PVT corners due to Moore’s law scaling, and the increasing
demand for consumer applications and longer battery life in mobile devices, has ushered
in significant cost and power-related challenges for designing and productizing …

Compensated-DNN: energy efficient low-precision deep neural networks by compensating quantization errors

  • Jain
    Shubham

Deep Neural Networks (DNNs) represent the state-of-the-art in many Artificial Intelligence
(AI) tasks involving images, videos, text, and natural language. Their ubiquitous
adoption is limited by the high computation and storage requirements of DNNs, …

Thermal-aware optimizations of reRAM-based neuromorphic computing systems

  • Beigi
    Majed Valad

ReRAM-based systems are attractive implementation alternatives for neuromorphic computing
because of their high speed and low design cost. In this work, we investigate the
impact of temperature on the ReRAM-based neuromorphic architectures and show how …

Compiler-guided instruction-level clock scheduling for timing speculative processors

  • Fan
    Yuanbo

Despite the significant promise that circuit-level timing speculation has for enabling
operation in marginal conditions, overheads associated with recovery prove to be a
serious drawback. We show that fine-grained clock adjustment guided by the compiler

SRAM based opportunistic energy efficiency improvement in dual-supply near-threshold
processors

  • Gu
    Yunfei

Energy-efficient microprocessors are essential for a wide range of applications. While
near-threshold computing is a promising technique to improve energy efficiency, optimal
supply demands from logic core and on-chip memory are conflicting. In this …

Enhancing workload-dependent voltage scaling for energy-efficient ultra-low-power
embedded systems

  • Mohan
    Veni

Ultra-low-power (ULP) chipsets are in higher demand than ever due to the proliferation
of ULP embedded systems to support growing applications like the Internet of Things
(IoT), wearables and sensor networks. Since ULP systems are also cost constrained,

Efficient and reliable power delivery in voltage-stacked manycore system with hybrid
charge-recycling regulators

  • Zou
    An

Voltage stacking (VS) fundamentally improves power delivery efficiency (PDE) by series-stacking
multiple voltage domains to eliminate explicit step-down voltage conversion and reduce
energy loss along the power delivery path. However, it suffers from …

Exact algorithms for delay-bounded steiner arborescences

  • Held
    Stephan

Rectilinear Steiner arborescences under linear delay constraints play an important
role for buffering. We present exact algorithms for either minimizing the total length
subject to delay constraints, or minimizing the total length plus the (weighted) …

Efficient multi-layer obstacle-avoiding region-to-region rectilinear steiner tree
construction

  • Wang
    Run-Yi

As Engineering Change Order (ECO) has attracted substantial attention in modern VLSI
design, the open net problem, which aims at constructing a shortest obstacle-avoiding
path to reconnect the net shapes in an open net, becomes more critical in the ECO

Obstacle-avoiding open-net connector with precise shortest distance estimation

  • Fang
    Guan-Qi

At the end of digital integrated circuit (IC) design flow, some nets may still be
left open due to engineering change order (ECO). Resolving these opens could be quite
challenging for some huge nets such as power ground nets because of a large number
of …

COSAT: congestion, obstacle, and slew aware tree construction for multiple power domain design

  • Lu
    Chien-Pang

Slew fixing, which ensures correct signal propagation, is essential during timing
closure of IC design flow. Conventionally, gate sizing, Vt swapping, or buffer insertion
is adopted to locally fix the slew violation on a single gate. Nevertheless, when

A machine learning framework to identify detailed routing short violations from a
placed netlist

  • Tabrizi
    Aysa Fakheri

Detecting and preventing routing violations has become a critical issue in physical
design, especially in the early stages. Lack of correlation between global and detailed
routing congestion estimations and the long runtime required to frequently …

DSA-friendly detailed routing considering double patterning and DSA template assignments

  • Yu
    Hai-Juan

As integrated circuit technology nodes continue to shrink, dense via distribution
becomes a severe challenge, requiring multiple masks to avoid spacing violations in
via layers. Meanwhile, the directed self-assembly (DSA) technique shows a great promise

Developing synthesis flows without human knowledge

  • Yu
    Cunxi

Design flows are the explicit combinations of design transformations, primarily involved
in synthesis, placement and routing processes, to accomplish the design of Integrated
Circuits (ICs) and System-on-Chip (SoC). Mostly, the flows are developed based …

Efficient computation of ECO patch functions

  • Dao
    Ai Quoc

Engineering Change Orders (ECO) modify a synthesized netlist after its specification
has changed. ECO is divided into two major tasks: finding target signals whose functions
should be updated and synthesizing the patch that produces the desired change. …

Canonical computation without canonical representation

  • Mishchenko
    Alan

A representation of a Boolean function is canonical if, given a variable order, only one instance of the representation is possible for
the function. A
computation is canonical if the result depends only on the Boolean function and a variable order, and …

SAT based exact synthesis using DAG topology families

  • Haaswijk
    Winston

SAT based exact synthesis is a powerful technique, with applications in logic optimization,
technology mapping, and synthesis for emerging technologies. However, its runtime
behavior can be unpredictable and slow. In this paper, we propose to add a new …

Efficient batch statistical error estimation for iterative multi-level approximate
logic synthesis

  • Su
    Sanbao

Approximate computing is an emerging energy-efficient paradigm for error-resilient
applications. Approximate logic synthesis (ALS) is an important field of it. To improve
the existing ALS flows, one key issue is to derive a more accurate and efficient …

BLASYS: approximate logic synthesis using boolean matrix factorization

  • Hashemi
    Soheil

Approximate computing is an emerging paradigm where design accuracy can be traded
off for benefits in design metrics such as design area, power consumption or circuit
complexity. In this work, we present a novel paradigm to synthesize approximate …

Optimized I/O determinism for emerging NVM-based NVMe SSD in an enterprise system

  • Kim
    Seonbong

Non-volatile memory express (NVMe) over peripheral component interconnect express
(PCIe) has been adopted in the storage system to provide low latency and high throughput.
NVMe allows a host system to reduce latency because it offers a high parallel …

Improving runtime performance of deduplication system with host-managed SMR storage
drives

  • Wu
    Chun-Feng

Due to the cost consideration for data storage, high-areal-density shingled-magnetic-recording
(SMR) drives and data deduplication techniques are getting popular in many data storage
services for the improvement of profit per storage unit. However, …

Wear leveling for crossbar resistive memory

  • Wen
    Wen

Resistive Memory (ReRAM) is an emerging non-volatile memory technology that has many
advantages over conventional DRAM. ReRAM crossbar has the smallest 4F2 planar cell size and thus is widely adopted for constructing dense memory with large
capacity. …

RADAR: a 3D-reRAM based DNA alignment accelerator architecture

  • Huangfu
    Wenqin

Next Generation Sequencing (NGS) technology has become an indispensable tool for studying
genomics, resulting in an exponentially growth of biological data. Booming data volume
demands significant computational resources and creates challenges for ‘…

Mamba: closing the performance gap in productive hardware development frameworks

  • Jiang
    Shunning

Modern high-level languages bring compelling productivity benefits to hardware design
and verification. For example, hardware generation and simulation frameworks (HGSFs)
use a single “host” language for parameterization, static elaboration, test bench

ACED: a hardware library for generating DSP systems

  • Wang
    Angie

Designers translate DSP algorithms into application-specific hardware via primitives
composed in various ways for different architectural realizations. Despite sharing
underlying algorithms and hardware constructs, designs are often difficult to reuse,

PARM: <u>p</u>ower supply noise <u>a</u>ware <u>r</u>esource <u>m</u>anagement for NoC based
multicore systems in the dark silicon era

  • Raparti
    Venkata Yaswanth

Reliability is a major concern in chip multi-processors (CMPs) due to shrinking technology
and low operating voltages. Today’s processors designed at sub-10nm technology nodes
have high device densities and fast switching frequencies that cause …

Aging-constrained performance optimization for multi cores

  • Khdr
    Heba

Circuit aging has become a dire design concern and hence it is considered a primary
design constraint. Current practice to cope with this problem is to apply (too) conservative
means.

In contrast, we introduce a far less restrictive approach by …

A measurement system for capacitive PUF-based security enclosures

  • Obermaier
    Johannes

Battery-backed security enclosures that are permanently monitored for penetration
and tampering are common solutions for providing physical integrity to multi-chip
embedded systems. This paper presents a well-tailored measurement system for a

It’s hammer time: how to attack (rowhammer-based) DRAM-PUFs

  • Zeitouni
    Shaza

Physically Unclonable Functions (PUFs) are still considered promising technology as
building blocks in cryptographic protocols. While most PUFs require dedicated circuitry,
recent research leverages DRAM hardware for PUFs due to its intrinsic properties …

CamPUF: physically unclonable function based on CMOS image sensor fixed pattern noise

  • Kim
    Younghyun

Physically unclonable functions (PUFs) have proved to be an effective measure for
secure device authentication and key generation. We propose a novel PUF design, named
CamPUF, based on commercial off-the-shelf CMOS image sensors, which are ubiquitously

Tamper-resistant pin-constrained digital microfluidic biochips

  • Tang
    Jack

Digital microfluidic biochips (DMFBs)—an emerging technology that implements bioassays
through manipulation of discrete fluid droplets—are vulnerable to actuation tampering
attacks, where a malicious adversary modifies control signals for the …

Approximation-aware coordinated power/performance management for heterogeneous multi-cores

  • Kanduri
    Anil

Run-time resource management of heterogeneous multi-core systems is challenging due
to i) dynamic workloads, that often result in ii) conflicting knob actuation decisions,
which potentially iii) compromise on performance for thermal safety. We present a

QoS-aware stochastic power management for many-cores

  • Pathania
    Anuj

A many-core processor can execute hundreds of multi-threaded tasks in parallel on
its 100s – 1000s of processing cores. When deployed in a Quality of Service (QoS)-based
system, the many-core must execute a task at a target QoS. The amount of processing

Employing classification-based algorithms for general-purpose approximate computing

  • Oliveira
    Geraldo F.

Approximate computing has recently reemerged as a design solution for additional performance
and energy improvements at the cost of output quality. In this paper, we propose using
a tree-based classification algorithm as an approximation tool for …

Using imprecise computing for improved non-preemptive real-time scheduling

  • Huang
    Lin

Conventional hard real-time scheduling is often overly pessimistic due to the worst
case execution time estimation. The pessimism can be mitigated by exploiting imprecise
computing in applications where occasional small errors are acceptable. This …

A modular digital VLSI flow for high-productivity SoC design

  • Khailany
    Brucek

A high-productivity digital VLSI flow for designing complex SoCs is presented. The
flow includes high-level synthesis tools, an object-oriented library of synthesizable
SystemC and C++ components, and a modular VLSI physical design approach based on …

Basejump STL: systemverilog needs a standard template library for hardware design

  • Taylor
    Michael Bedford

We propose a Standard Template Library (STL) for synthesizeable SystemVerilog that
sharply reduces the time required to design digital circuits. We overview the principles
that underly the design of the open-source BaseJump STL, including light-weight …

TRIG: hardware accelerator for inference-based applications and experimental demonstration
using carbon nanotube FETs

  • Hills
    Gage

The energy efficiency demands of future abundant-data applications, e.g., those which
use inference-based techniques to classify large amounts of data, exceed the capabilities
of digital systems today. Field-effect transistors (FETs) built using …

OPERON: optical-electrical power-efficient route synthesis for on-chip signals

  • Liu
    Derong

As VLSI technology scales to deep sub-micron, optical interconnect becomes an attractive
alternative for on-chip communication. The traditional optical routing works mainly
optimize the path loss, and few works explicitly exploit the optical-electrical …

Soft-FET: phase transition material assisted soft switching field effect transistor for supply
voltage droop mitigation

  • Teja
    Subrahmanya

Phase Transition Material (PTM) assisted novel soft switching transistor architecture
named “Soft-FET” is proposed for supply voltage droop mitigation. By utilizing the
abrupt phase transition mechanism in PTMs, the proposed Soft-FET achieves soft …

Ultralow power acoustic feature-scoring using gaussian I-V transistors

  • Trivedi
    Amit Ranjan

This paper discusses energy-efficient acoustic feature-scoring using transistors with
Gaussian-shaped Ids-Vgs. Acoustic feature-scoring is a critical step in speech recognition tasks such as
speaker recognition. Suited to the transistor, we discuss a …

Test cost reduction for X-value elimination by scan slice correlation analysis

  • Chae
    Hyunsu

X-values in test output responses corrupt an output response compaction and can cause
a fault coverage loss. X-Masking and X-Canceling MISR methods have been suggested to eliminate X-values, however, there are control data volume and test time overhead …

Cross-layer fault-space pruning for hardware-assisted fault injection

  • Dietrich
    Christian

With shrinking structure sizes, soft-error mitigation has become a major challenge
in the design and certification of safety-critical embedded systems. Their robustness
is quantified by extensive fault-injection campaigns, which on hardware level can

A machine learning based hard fault recuperation model for approximate hardware accelerators

  • Taher
    Farah Naz

Continuous pursuit of higher performance and energy efficiency has led to heterogeneous
SoC that contains multiple dedicated hardware accelerators. These accelerators exploit
the inherent parallelism of tasks and are often tolerant to inaccuracies in …

SOTERIA: exploiting process variations to enhance hardware security with photonic NoC architectures

  • Chittamuru
    Sai Vineel Reddy

Photonic networks-on-chip (PNoCs) enable high bandwidth on-chip data transfers by
using photonic waveguides capable of dense-wave-length-division-multiplexing (DWDM)
for signal traversal and microring resonators (MRs) for signal modulation. A Hardware

LEAD: learning-enabled energy-aware dynamic voltage/frequency scaling in NoCs

  • Clark
    Mark

Network on Chips (NoCs) are the interconnect fabric of choice for multicore processors
due to their superiority over traditional buses and crossbars in terms of scalability.
While NoC’s offer several advantages, they still suffer from high static and …

Subutai: distributed synchronization primitives in NoC interfaces for legacy parallel-applications

  • Cataldo
    Rodrigo

Parallel applications are essential for efficiently using the computational power
of a Multiprocessor System-on-Chip (MPSoC). Unfortunately, these applications do not
scale effortlessly with the number of cores because of synchronization operations
that …

Packet pump: overcoming network bottleneck in on-chip interconnects for GPGPUs

  • Cheng
    Xianwei

In order to fully exploit GPGPU’s parallel processing power, on-chip interconnects
need to provide bandwidth efficient data communication. GPGPUs exhibit a many-to-few-to-many
traffic pattern which makes the memory controller connected routers the …

STASH: security architecture for smart hybrid memories

  • Swami
    Shivam

Whereas emerging non-volatile memories (NVMs) are low power, dense, scalable alternatives
to DRAM, the high latency and low endurance of these NVMs limit the feasibility of
NVM-only memory systems. Smart hybrid memories (SHMs) that integrate NVM, DRAM, …

ACME: advanced counter mode encryption for secure non-volatile memories

  • Swami
    Shivam

Modern computing systems that integrate emerging non-volatile memories (NVMs) are
vulnerable to classical security threats to data confidentiality (e.g., stolen DIMM
and bus snooping attacks) as well as new security threats to system availability (e.g.,

CASTLE: compression architecture for secure low latency, low energy, high endurance NVMs

  • Palangappa
    Poovaiah M.

CASTLE is a Compression-based main memory Architecture realizing a read-decrypt-free
(i.e., write-only) Secure solution for low laTency, Low Energy, high endurance non-volatile
memories (NVMs). CASTLE integrates pattern-based data compression and …

A collaborative defense against wear out attacks in non-volatile processors

  • Cronin
    Patrick

While the Internet of Things (IoT) keeps advancing, its full adoption is continually
blocked by power delivery problems. One promising solution is Non-Volatile (NV) processors,
which harvest energy for themselves and employ a NV memory hierarchy. This …

Protecting the supply chain for automotives and IoTs

  • Ray
    Sandip

Modern automotive systems and IoT devices are designed through a highly complex, globalized,
and potentially untrustworthy supply chain. Each player in this supply chain may (1)
introduce sensitive information and data (collectively termed “assets”) …

Reconciling remote attestation and safety-critical operation on simple IoT devices

  • Carpent
    Xavier

Remote attestation (RA) is a means of malware detection, typically realized as an
interaction between a trusted verifier and a potentially compromised remote device
(prover). RA is especially relevant for low-end embedded devices that are incapable
of …

Formal security verification of concurrent firmware in SoCs using instruction-level
abstraction for hardware

  • Huang
    Bo-Yuan

Formal security verification of firmware interacting with hardware in modern Systems-on-Chip
(SoCs) is a critical research problem. This faces the following challenges: (1) design
complexity and heterogeneity, (2) semantics gaps between software and …

Application level hardware tracing for scaling post-silicon debug

  • Pal
    Debjit

We present a method for selecting trace messages for post-silicon validation of Systems-on-a-Chips
(SoCs) with diverse usage scenarios. We model specifications of interacting flows
in typical applications. Our method optimizes trace buffer utilization …

Specification-driven automated conformance checking for virtual prototype and post-silicon
designs

  • Gu
    Haifeng

Due to the increasing complexity of System-on-Chip (SoC) design, how to ensure that
silicon implementations conform to their high-level specifications is becoming a major
challenge. To address this problem, we propose a novel specification-driven …

Formal micro-architectural analysis of on-chip ring networks

  • van Wesel
    Perry

In the realm of Multi-Processors System-on-Chip (MPSoC’s), the Network-on-Chip (NoC)
connecting all system components plays a crucial role in the overall correctness and
performance of the system. Recent papers have proposed several ring based NoC …

HFMV: hybridizing formal methods and machine learning for verification of analog and mixed-signal
circuits

  • Hu
    Hanbin

With increasing design complexity and robustness requirement, analog and mixed-signal
(AMS) verification manifests itself as a key bottleneck. While formal methods and
machine learning have been proposed for AMS verification, these two techniques suffer

Cost-aware patch generation for multi-target function rectification of engineering
change orders

  • Zhang
    He-Teng

The increasing system complexity makes engineering change order (ECO) mostly inevitable
and a common practice in integrated circuit design. Despite extensive research being
made, prior methods are not effectively applicable to instances where …

Modelling multicore contention on the AURIXTM TC27x

  • Díaz
    Enrique

Multicores are becoming ubiquitous in automotive. Yet, the expected benefits on integration
are challenged by multicore contention concerns on timing V&V. Worst-case execution
time (WCET) estimates are required as early as possible in the software …

Cache side-channel attacks and time-predictability in high-performance critical real-time
systems

  • Trilla
    David

Embedded computers control an increasing number of systems directly interacting with
humans, while also manage more and more personal or sensitive information. As a result,
both safety and security are becoming ubiquitous requirements in embedded …

Cross-layer dependency analysis with timing dependence graphs

  • Möstl
    Mischa

We present Non-interference Analysis as a model-based method to automatically reveal,
track and analyze end-to-end timing dependencies as part of a cross-layer dependency
analysis in complex systems. Based on revealed timing dependencies of functional …

Brook auto: high-level certification-friendly programming for GPU-powered automotive systems

  • Trompouki
    Matina Maria

Modern automotive systems require increased performance to implement Advanced Driving
Assistance Systems (ADAS). GPU-powered platforms are promising candidates for such
computational tasks, however current low-level programming models challenge the …

Dynamic vehicle software with AUTOCONT

  • Jakobs
    Christine

Future automotive software needs to deal with an increasing level of dynamicity, reasoned
by the wish for connected driving, software updates, and dynamic feature activation.
Such functionalities cannot be properly realized with today’s classic AUTOSAR …

Automated interpretation and reduction of in-vehicle network traces at a large scale

  • Mrowca
    Artur

In modern vehicles, high communication complexity requires cost-effective integration
tests such as data-driven system verification with in-vehicle network traces. With
the growing amount of traces, distributable Big Data solutions for analyses become

Atomlayer: a universal reRAM-based CNN accelerator with atomic layer computation

  • Qiao
    Ximing

Although ReRAM-based convolutional neural network (CNN) accelerators have been widely
studied, state-of-the-art solutions suffer from either incapability of training (e.g.,
ISSAC [1]) or inefficiency of inference (e.g., PipeLayer [2]) due to the …

Towards accurate and high-speed spiking neuromorphic systems with data quantization-aware
deep networks

  • Liu
    Fuqiang

Deep Neural Networks (DNNs) have gained immense success in cognitive applications
and greatly pushed today’s artificial intelligence forward. The biggest challenge
in executing DNNs is their extremely data-extensive computations. The computing …

CMP-PIM: an energy-efficient comparator-based processing-in-memory neural network accelerator

  • Angizi
    Shaahin

In this paper, an energy-efficient and high-speed comparator-based processing-in-memory
accelerator (CMP-PIM) is proposed to efficiently execute a novel hardware-oriented
comparator-based deep neural network called CMPNET. Inspired by local binary …

SNrram: an efficient sparse neural network computation architecture based on resistive random-access
memory

  • Wang
    Peiqi

The sparsity in the deep neural networks can be leveraged by methods such as pruning
and compression to help the efficient deployment of large-scale deep neural networks
onto hardware platforms, such as GPU or FPGA, for better performance and power …

Long live TIME: improving lifetime for training-in-memory engines by structured gradient sparsification

  • Cai
    Yi

Deeper and larger Neural Networks (NNs) have made breakthroughs in many fields. While
conventional CMOS-based computing platforms are hard to achieve higher energy efficiency.
RRAM-based systems provide a promising solution to build efficient Training-…

Hierarchical hyperdimensional computing for energy efficient classification

  • Imani
    Mohsen

Brain-inspired Hyperdimensional (HD) computing emulates cognition tasks by computing
with hypervectors rather than traditional numerical values. In HD, an encoder maps
inputs to high dimensional vectors (hypervectors) and combines them to generate a

Dadu-P: a scalable accelerator for robot motion planning in a dynamic environment

  • Lian
    Shiqi

As a critical operation in robotics, motion planning consumes lots of time and energy,
especially in a dynamic environment. Through approaches based on general-purpose processors,
it is hard to get a valid planning in real time. We present an …

Data prediction for response flows in packet processing cache

  • Yamaki
    Hayato

We propose a technique to reduce compulsory misses of packet processing cache (PPC),
which largely affects both throughput and energy of core routers. Rather than prefetching
data, our technique called response prediction cache (RPC) speculatively …

PULP-HD: accelerating brain-inspired high-dimensional computing on a parallel ultra-low power
platform

  • Montagna
    Fabio

Computing with high-dimensional (HD) vectors, also referred to as hypervectors, is a brain-inspired alternative to computing with scalars. Key properties of HD
computing include a well-defined set of arithmetic operations on hypervectors, generality,

Active forwarding: eliminate IOMMU address translation for accelerator-rich architectures

  • Fu
    Hsueh-Chun

Accelerator-rich architectures employ IOMMUs to support unified virtual address, but
researches show that they fail to meet the performance and energy requirements of
accelerators. Instead of optimizing the speed/energy of IOMMU address translation,

SARA: self-aware resource allocation for heterogeneous MPSoCs

  • Song
    Yang

In modern heterogeneous MPSoCs, the management of shared memory resources is crucial
in delivering end-to-end QoS. Previous frameworks have either focused on singular
QoS targets or the allocation of partitionable resources among CPU applications at

PEP: <u>p</u>roactive checkpointing for <u>e</u>fficient <u>p</u>reemption on GPUs

  • Li
    Chen

The demand for multitasking GPUs increases whenever the GPU may be shared by multiple
applications, either spatially or temporally. This requires that GPUs can be preempted
and switch context to a new application while already executing one. Unlike CPUs,…

FMMU: a hardware-accelerated flash map management unit for scalable performance of flash-based
SSDs

  • Woo
    Yeong-Jae

Address translation is increasingly a performance bottleneck in flash-based SSDs (solid
state drives). We propose a hardware-accelerated flash map management unit called
FMMU to speed up the address translation. The FMMU operates in a non-blocking …

Minimizing write amplification to enhance lifetime of large-page flash-memory storage
devices

  • Wang
    Wei-Lin

Due to the decreasing endurance of flash chips, the lifetime of flash drives has become
a critical issue. To resolve this issue, various techniques such as wear-leveling
and error correction code have been proposed to reduce the bit error rates of flash

Proactive channel adjustment to improve polar code capability for flash storage devices

  • Hsu
    Kun-Cheng

With the low encoding/decoding complexity and the high error correction capability,
polar code with the support of list-decoding and cyclic redundancy check can outperform
LDPC code in the area of data communication. Thus, it also draws a lot of …

Achieving defect-free multilevel 3D flash memories with one-shot program design

  • Ho
    Chien-Chung

To store the desired data on MLC and TLC flash memories, the conventional programming
strategies need to divide a fixed range of threshold voltage (Vt) window into several parts. The narrowly partitioned Vt window in turn limits the design of …

Power-based side-channel instruction-level disassembler

  • Park
    Jungmin

Modern embedded computing devices are vulnerable against malware and software piracy
due to insufficient security scrutiny and the complications of continuous patching.
To detect malicious activity as well as protecting the integrity of executable …

Side-channel security of superscalar CPUs: evaluating the impact of micro-architectural features

  • Barenghi
    Alessandro

Side-channel attacks are performed on increasingly complex targets, starting to threaten
superscalar CPUs supporting a complete operating system. The difficulty of both assessing
the vulnerability of a device to them, and validating the effectiveness of …

Electro-magnetic analysis of GPU-based AES implementation

  • Gao
    Yiwen

In this work, for the first time, we investigate Electro-Magnetic (EM) attacks on
GPU-based AES implementation. In detail, we first sample EM traces using a delicate
trigger; then, we build a heuristic leakage model and a novel leakage model to exploit

GPU obfuscation: attack and defense strategies

  • Chakraborty
    Abhishek

Conventional attacks against existing logic obfuscation techniques rely on the presence
of an activated hardware for analysis. In reality, obtaining such activated chips may not always be practical,
especially if the on-chip test structures are …

Measurement-based cache representativeness on multipath programs

  • Milutinovic
    Suzana

Autonomous vehicles in embedded real-time systems increase critical-software size
and complexity whose performance needs are covered with high-performance hardware
features like caches, which however hampers obtaining WCET estimates that hold valid
for …

Resource-aware partitioned scheduling for heterogeneous multicore real-time systems

  • Han
    Jian-Jun

Heterogeneous multicore processors have become popular computing engines for modern
embedded real-time systems recently. However, there is rather limited research on
the scheduling of real-time tasks running on heterogeneous multicore systems with

Response-time analysis of DAG tasks supporting heterogeneous computing

  • Serrano
    Maria A.

Hardware platforms are evolving towards parallel and heterogeneous architectures to
overcome the increasing necessity of more performance in the real-time domain. Parallel
programming models are fundamental to exploit the performance capabilities of …

Duet: an OLED & GPU co-management scheme for dynamic resolution adaptation

  • Lin
    Han-Yi

The increasingly high display resolution of mobile devices imposes a further burden
on energy consumption. Existing schemes manage either OLED or GPU power to save energy.
This paper presents the design, algorithm, and implementation of a co-managing …

RAMP: resource-aware mapping for CGRAs

  • Dave
    Shail

Coarse-grained reconfigurable array (CGRA) is a promising solution that can accelerate
even non-parallel loops. Acceleration achieved through CGRAs critically depends on
the goodness of mapping (of loop operations onto the PEs of CGRA), and in …

An architecture-agnostic integer linear programming approach to CGRA mapping

  • Chin
    S. Alexander

Coarse-grained reconfigurable architectures (CGRAs) have gained traction as a potential
solution to implement accelerators for compute-intensive kernels, particularly in
domains requiring hardware programmability. Architecture and CAD for CGRAs are …

Dnestmap: mapping deeply-nested loops on ultra-low power CGRAs

  • Karunaratne
    Manupa

Coarse-Grained Reconfigurable Arrays (CGRAs) provide high performance, energy-efficient
execution of the innermost loops of an application. Most real-world applications,
however, comprise of deeply-nested loops with complex and often irregular control

Locality aware memory assignment and tiling

  • Rogers
    Samuel

With the trend toward specialization, an efficient memory-path design is vital to
capitalize customization in data-path. A monolithic memory hierarchy is often highly
inefficient for irregular applications, traditionally targeted for CPUs. New …

GAN-OPC: mask optimization with lithography-guided generative adversarial nets

  • Yang
    Haoyu

Mask optimization has been a critical problem in the VLSI design flow due to the mismatch
between the lithography system and the continuously shrinking feature sizes. Optical
proximity correction (OPC) is one of the prevailing resolution enhancement …

An efficient Bayesian yield estimation method for high dimensional and high sigma
SRAM circuits

  • Zhai
    Jinyuan

With increasing dimension of variation space and computational intensive circuit simulation,
accurate and fast yield estimation of realistic SRAM chip remains a significant and
complicated challenge. In this paper, du Experiment results show that the …

RAIN: a tool for reliability assessment of interconnect networks—physics to software

  • Abbasinasab
    Ali

In this paper, we study the main interconnect aging processes: electromigration, thermomigration
and stress migration and propose comprehensive yet compact models for transient and
steady states based on hydrostatic stress evolution. Our model can be …

A fast and robust failure analysis of memory circuits using adaptive importance sampling
method

  • Shi
    Xiao

Performance failure has become a growing concern for the robustness and reliability
of memory circuits. It is challenging to accurately estimate the extremely small failure
probability when failed samples are distributed in multiple disjoint failure …

SpWA: an efficient sparse winograd convolutional neural networks accelerator on FPGAs

  • Lu
    Liqiang

FPGAs have been an efficient accelerator for CNN inference due to its high performance,
flexibility, and energy-efficiency. To improve the performance of CNNs on FPGAs, fast
algorithms and sparse methods emerge as the most attractive alternatives, which …

Efficient winograd-based convolution kernel implementation on edge devices

  • Xygkis
    Athanasios

The implementation of Convolutional Neural Networks on edge Internet of Things (IoT)
devices is a significant programming challenge, due to the limited computational resources
and the real-time requirements of modern applications. This work focuses on …

An efficient kernel transformation architecture for binary- and ternary-weight neural
network inference

  • Zheng
    Shixuan

While deep convolutional neural networks (CNNs) have emerged as the driving force
of a wide range of domains, their computationally and memory intensive natures hinder
the further deployment in mobile and embedded applications. Recently, CNNs with low-…

Content addressable memory based binarized neural network accelerator using time-domain
signal processing

  • Choi
    Woong

Binarized neural network (BNN) is one of the most promising solution for low-cost
convolutional neural network acceleration. Since BNN is based on binarized bit-level
operations, there exist great opportunities to reduce power-hungry data transfers
and …

A security vulnerability analysis of SoCFPGA architectures

  • Chaudhuri
    Sumanta

SoCFPGAs or FPGAs integrated on the same die with chip multi processors have made
it to the market in the past years. In this article we analyse various security loopholes,
existing precautions and countermeasures in these architectures. We consider …

Raise your game for split manufacturing: restoring the true functionality through BEOL

  • Patnaik
    Satwik

Split manufacturing (SM) seeks to protect against piracy of intellectual property
(IP) in chip designs. Here we propose a scheme to manipulate both placement and routing
in an intertwined manner, thereby increasing the resilience of SM layouts. Key …

Analysis of security of split manufacturing using machine learning

  • Zhang
    Boyu

This work is the first to analyze the security of split manufacturing using machine
learning, based on data collected from layouts provided by industry, with 8 routing
metal layers, and significant variation in wire size and routing congestion across

Inducing local timing fault through EM injection

  • Ghodrati
    Marjan

Electromagnetic fault injection (EMFI) is an efficient class of physical attacks that
can compromise the immunity of secure cryptographic algorithms. Despite successful
EMFI attacks, the effects of electromagnetic injection (EM) on a processor are not

IAfinder: identifying potential implicit assumptions to facilitate validation in medical cyber-physical
system

  • Fu
    Zhicheng

According to the U.S. Food and Drug Administration (FDA) medical device recall database,
medical device recalls are at an all-time high. One of the major causes of the recalls
is due to implicit assumptions of which either the medical device operating …

An efficient timestamp-based monitoring approach to test timing constraints of cyber-physical
systems

  • Mehrabian
    Mohammadreza

Formal specifications on temporal behavior of Cyber-Physical Systems (CPS) is essential
for verification of performance and safety. Existing solutions for verifying the satisfaction
of temporal constraints on a CPS are compute and resource intensive …

Runtime adjustment of IoT system-on-chips for minimum energy operation

  • Golanbari
    Mohammad Saber

Energy-constrained Systems-on-Chips (SoC) are becoming major components of many emerging
applications, especially in the Internet of Things (IoT) domain. Although the best
energy efficiency is achieved when the SoC operates in the near-threshold region,

Edge-cloud collaborative processing for intelligent internet of things: a case study on smart surveillance

  • Mudassar
    Burhan A.

Limited processing power and memory prevent realization of state of the art algorithms
on the edge level. Offloading computations to the cloud comes with tradeoffs as compression
techniques employed to conserve transmission bandwidth and energy …

Bandwidth-efficient deep learning

  • Han
    Song

Deep learning algorithms are achieving increasingly higher prediction accuracy on
many machine learning tasks. However, applying brute-force programming to data demands
a huge amount of machine power to perform training and inference, and a huge amount

Co-design of deep neural nets and neural net accelerators for embedded vision applications

  • Kwon
    Kiseok

Deep Learning is arguably the most rapidly evolving research area in recent years.
As a result it is not surprising that the design of state-of-the-art deep neural net
models proceeds without much consideration of the latest hardware targets, and the

Generalized augmented lagrangian and its applications to VLSI global placement

  • Zhu
    Ziran

Global placement dominates the circuit placement process in its solution quality and
efficiency. With increasing design complexity and various design constraints, it is
desirable to develop an efficient, high-quality global placement algorithm for …

Routability-driven and fence-aware legalization for mixed-cell-height circuits

  • Li
    Haocheng

Placement is one of the most critical stages in the physical synthesis flow. Circuits
with increasing numbers of cells of multi-row height have brought challenges to traditional
placers on efficiency and effectiveness. Furthermore, constraints on fence …

PlanarONoC: concurrent placement and routing considering crossing minimization for optical networks-on-chip

  • Chuang
    Yu-Kai

Optical networks-on-chips (ONoCs) have become a promising solution for the on-chip
communication of multi-and many-core systems to provide superior communication bandwidths,
efficiency in power consumption, and latency performance compared to electronic …

Similarity-aware spectral sparsification by edge filtering

  • Feng
    Zhuo

In recent years, spectral graph sparsification techniques that can compute ultra-sparse
graph proxies have been extensively studied for accelerating various numerical and
graph-related applications. Prior nearly-linear-time spectral sparsification …

S2FA: an accelerator automation framework for heterogeneous computing in datacenters

  • Yu
    Cody Hao

Big data analytics using the JVM-based MapReduce framework has become a popular approach
to address the explosive growth of data sizes. Adopting FPGAs in datacenters as accelerators
to improve performance and energy efficiency also attracts increasing …

Automated accelerator generation and optimization with composable, parallel and pipeline
architecture

  • Cong
    Jason

CPU-FPGA heterogeneous architectures feature flexible acceleration of many workloads
to advance computational capabilities and energy efficiency in today’s datacenters.
This advantage, however, is often overshadowed by the poor programmability of FPGAs.

TAO: techniques for algorithm-level obfuscation during high-level synthesis

  • Pilato
    Christian

Intellectual Property (IP) theft costs semiconductor design companies billions of
dollars every year. Unauthorized IP copies start from reverse engineering the given
chip. Existing techniques to protect against IP theft aim to hide the IC’s …

Extracting data parallelism in non-stencil kernel computing by optimally coloring
folded memory conflict graph

  • Escobedo
    Juan

Irregular memory access pattern in non-stencil kernel computing renders the well-known
hyperplane- [1], lattice- [2], or tessellation-based [3] HLS techniques ineffective.
We develop an elegant yet effective technique that synthesizes memory-optimal …

SMApproxlib: library of FPGA-based approximate multipliers

  • Ullah
    Salim

The main focus of the existing approximate arithmetic circuits has been on ASIC-based
designs. However, due to the architectural differences between ASICs and FPGAs, comparable
performance gains cannot be achieved for FPGA-based systems by using the …

Sign-magnitude SC: getting 10X accuracy for free in stochastic computing for deep neural networks

  • Zhakatayev
    Aidyn

Stochastic computing (SC) is a promising computing paradigm for applications with
low precision requirement, stringent cost and power restriction. One known problem
with SC, however, is the low accuracy especially with multiplication. In this paper
we …

Area-optimized low-latency approximate multipliers for FPGA-based hardware accelerators

  • Ullah
    Salim

The architectural differences between ASICs and FPGAs limit the effective performance
gains achievable by the application of ASIC-based approximation principles for FPGA-based
reconfigurable computing systems. This paper presents a novel approximate …

Approximate on-the-fly coarse-grained reconfigurable acceleration for general-purpose
applications

  • Brandalero
    Marcelo

Approximate functional unit designs have the potential to reduce power consumption
significantly compared to their precise counterparts; however, few works have investigated
composing them to build generic accelerators. In this work, we do a design-…

LEMAX: learning-based energy consumption minimization in approximate computing with quality
guarantee

  • Akhlaghi
    Vahideh

Approximate computing aims to trade accuracy for energy efficiency. Various approximate
methods have been proposed in the literature that demonstrate the effectiveness of
relaxing accuracy requirements in a specific unit. This provides a basis for …

PIMA-logic: a novel processing-in-memory architecture for highly flexible and energy-efficient
logic computation

  • Angizi
    Shaahin

In this paper, we propose PIMA-Logic, as a novel <u>P</u>rocessing-<u>i</u>n-<u>M</u>emory
<u>A</u>rchitecture for highly flexible and efficient <u>Logic</u> computation. Insteadof
integrating complex logic units in cost-sensitive memory, PIMA-Logic …

Columba S: a scalable co-layout design automation tool for microfluidic large-scale integration

  • Tseng
    Tsun-Ming

Microfluidic large-scale integration (mLSI) is a promising platform for high-throughput
biological applications. Design automation for mLSI has made much progress in recent
years. Columba and its succeeding work Columba 2.0 proposed a mathematical …

Design-for-testability for continuous-flow microfluidic biochips

  • Liu
    Chunfeng

Flow-based microfluidic biochips are gaining traction in the microfluidics community
since they enable efficient and low-cost biochemical experiments. These highly integrated
lab-on-a-chip systems, however, suffer from manufacturing defects, which cause …

Design and architectural co-optimization of monolithic 3D liquid state machine-based
neuromorphic processor

  • Ku
    Bon Woong

A liquid state machine (LSM) is a powerful recurrent spiking neural network shown
to be effective in various learning tasks including speech recognition. In this work,
we investigate design and architectural co-optimization to further improve the area-…

Enabling a new era of brain-inspired computing: energy-efficient spiking neural network with ring topology

  • Bai
    Kangjun

The reservoir computing, an emerging computing paradigm, has proven its benefit to
multifarious applications. In this work, we successfully designed and fabricated an
analog delayed feedback reservoir (DFR) chip. Measurement results demonstrate its
rich …

A neuromorphic design using chaotic mott memristor with relaxation oscillation

  • Yan
    Bonan

The recent proposed nanoscale Mott memristor features negative differential resistance
and chaotic dynamics. This work proposes a novel neuromorphic computing system that
utilizes Mott memristors to simplify peripheral circuitry. According to the …

DrAcc: a DRAM based accelerator for accurate CNN inference

  • Deng
    Quan

Modern Convolutional Neural Networks (CNNs) are computation and memory intensive.
Thus it is crucial to develop hardware accelerators to achieve high performance as
well as power/energy-efficiency on resource limited embedded systems. DRAM-based CNN

On-chip deep neural network storage with multi-level eNVM

  • Donato
    Marco

One of the biggest performance bottlenecks of today’s neural network (NN) accelerators
is off-chip memory accesses [11]. In this paper, we propose a method to use multi-level,
embedded nonvolatile memory (eNVM) to eliminate all off-chip weight accesses. …

Closed yet open DRAM: achieving low latency and high performance in DRAM memory systems

  • Subramanian
    Lavanya

DRAM memory access is a critical performance bottleneck. To access one cache block,
an entire row needs to be sensed and amplified, data restored into the bitcells and
the bitlines precharged, incurring high latency. Isolating the bitlines and sense

VRL-DRAM: improving DRAM performance via variable refresh latency

  • Das
    Anup

A DRAM chip requires periodic refresh operations to prevent data loss due to charge
leakage in DRAM cells. Refresh operations incur significant performance overhead as
a DRAM bank/rank becomes unavailable to service access requests while being …

Enabling union page cache to boost file access performance of NVRAM-based storage
device

  • Chen
    Shuo-Han

Due to the fast access performance, byte-addressability, and non-volatility of non-volatile
random access memory (NVRAM), NVRAM has emerged as a popular candidate for the design
of memory/storage systems on mobile computing systems. For example, the …

FLOSS: FLOw sensitive scheduling on mobile platforms

  • Zhang
    Haibo

Today’s mobile platforms have grown in sophistication to run a wide variety of frame-based
applications. To deliver better QoS and energy efficiency, these applications utilize
multi-flow execution, which exploits hardware-level parallelism across …

Context-aware dataflow adaptation technique for low-power multi-core embedded systems

  • Jung
    Hyeonseok

Today’s embedded systems operate under increasingly dynamic conditions. First, computational
workloads can be either fluctuating or adjustable. Moreover, as many devices are battery-powered,
it is common to have runtime power management technique, which …

Architecture decomposition in system synthesis of heterogeneous many-core systems

  • Richthammer
    Valentina

Determining feasible application mappings for Design Space Exploration (DSE) and run-time
embedding is a challenge for modern many-core systems. The underlying NP-complete
system-synthesis problem faces tremendously complex problem instances due to the …

NNsim: fast performance estimation based on sampled simulation of GPGPU kernels for neural
networks

  • Kang
    Jintaek

Existent GPU simulators are too slow to use for neural networks implemented in GPUs.
For fast performance estimation, we propose a novel hybrid method of analytical performance
modeling and sampled simulation of GPUs. By taking full advantage of …

STAFF: online learning with stabilized adaptive forgetting factor and feature selection algorithm

  • Gupta
    Ujjwal

Dynamic resource management techniques rely on power consumption and performance models
to optimize the operating frequency and utilization of processing elements, such as
CPU and GPU. Despite the importance of these decisions, many existing approaches …

Extensive evaluation of programming models and ISAs impact on multicore soft error
reliability

  • Rosa
    Felipe da

To take advantage of the performance enhancements provided by multicore processors,
new instruction set architectures (ISAs) and parallel programming libraries have been
investigated across multiple industrial segments. This paper investigates the …

Optimized selection of wireless network topologies and components via efficient pruning
of feasible paths

  • Kirov
    Dmitrii

We address the design space exploration of wireless networks to jointly select topology
and component sizing. We formulate the exploration problem as an optimized mapping
problem, where network elements are associated with components from pre-defined …

SLIP 2018 TOC

Resource and data optimization for hardware implementation of deep neural networks
targeting FPGA-based edge devices

  • Liu
    Xinheng

Recently, as machine learning algorithms have become more practical, there has been
much effort to implement them on edge devices that can be used in our daily lives.
However, unlike server-scale devices, edge devices are relatively small and thus have

A study of optimal cost-skew tradeoff and remaining suboptimality in interconnect
tree constructions

  • Han
    Kwangsoo

Cost and skew are among the most fundamental objectives for interconnect tree synthesis.
The cost-skew tradeoff is particularly important in buffered clock tree construction,
where clock subnets are an important “sweet spot” for balancing on-chip …

A design framework for processing-in-memory accelerator

  • Gao
    Di

With increasing performance mismatch between processor and memory, “memory wall” has
become the bottleneck of the entire computing system. In order to bridge the gap,
processing-in-memory (PIM) has been revisited as a viable option to overcome the …

Fast and precise routability analysis with conditional design rules

  • Kang
    Ilgweon

As pin accessibility encounters more challenges due to the less number of tracks,
higher pin density, and more complex design rules, routability has become one bottleneck
of sub-10nm designs. Thus, we need a new design methodology for fast turnaround in …

Adaptive sensitivity analysis with nonlinear power load modeling

  • Hsu
    Po-Ya

Voltage fluctuation in power networks is a critical issue for VLSI designs. The analysis
and optimization of the voltage drops rely on accurate sensitivity calculation. Due
to the high complexity of large-scale circuits, in practice active devices are …

Exploiting PDN noise to thwart correlation power analysis attacks in 3D ICs

  • Dofe
    Jaya

Three-dimensional (3D) integration is envisioned as a natural defense to thwart side-channel
analysis (SCA) attacks. However, there lack extensive studies on the unique feature
of 3D power distribution network (PDN) noise and its impact on the …

ISPD 2017 TOC

SESSION: Welcome and Keynote Address

Technology Options for Beyond-CMOS

  • Young
    Ian

CMOS integrated circuit technology for computation is at an inflexion point. Although
this is the technology which has enabled the semiconductor industry to make vast progress
over the past 30-plus years, it is expected to see challenges going beyond …

SESSION: Machine Learning in EDA

The Quest for The Ultimate Learning Machine

  • Dubey
    Pradeep

Traditionally, there has been a division of labor between computers and humans where
all forms of number crunching and bit manipulations are left to computers; whereas,
intelligent decision-making is left to us humans. We are now at the cusp of a major

Deep Learning in the Enhanced Cloud

  • Chung
    Eric

Deep Learning has emerged as a singularly critical technology for enabling human-like
intelligence in online services such as Azure, Office 365, Bing, Cortana, Skype, and
other high-valued scenarios at Microsoft. While Deep Neural Networks (DNNs) have …

Bilinear Lithography Hotspot Detection

  • Zhang
    Hang

Advanced semiconductor process technologies are producing various circuit layout patterns,
and it is essential to detect and eliminate problematic ones, which are called lithography
hotspots. These hotspots are formed due to light diffraction and …

Routability Optimization for Industrial Designs at Sub-14nm Process Nodes Using Machine
Learning

  • Chan
    Wei-Ting J.

Design rule check (DRC) violations after detailed routing prevent a design from being
taped out. To solve this problem, state-of-the-art commercial EDA tools global-route
the design to produce a global-route congestion map; this map is used by the …

SESSION: Monday Afternoon Keynote

Pushing the boundaries of Moore’s Law to transition from FPGA to All Programmable
Platform

  • Bolsens
    Ivo

Since their inception, FPGAs have changed significantly in their capacity and architecture.
The devices we use today are called upon to solve problems in mixed-signal, high-speed
communications, signal processing and compute acceleration that early …

POSTER SESSION: Invited Poster Presentation

How Game Engines Can Inspire EDA Tools Development: A use case for an open-source physical design library

  • Fontana
    Tiago

Similarly to game engines, physical design tools must handle huge amounts of data.
Although the game industry has been employing modern software development concepts
such as data-oriented design, most physical design tools still relies on object-…

Rsyn: An Extensible Physical Synthesis Framework

  • Flach
    Guilherme

Due to the advanced stage of development on EDA science, it has been increasingly
difficult to implement realistic software infrastructures in academia so that new
problems and solutions are tested in a meaningful and consistent way. In this paper
we …

SESSION: Nontraditional Physical Design Challenges

Research Challenges in Security-Aware Physical Design

  • Karri
    Ramesh

The presentation will discuss security techniques such as IC camouflaging and logic
encryption.

Challenges and Opportunities: From Near-memory Computing to In-memory Computing

  • Khoram
    Soroosh

The confluence of the recent advances in technology and the ever-growing demand for
large-scale data analytics created a renewed interest in a decades-old concept, processing-in-memory
(PIM). PIM, in general, may cover a very wide spectrum of compute …

Physical Design Considerations of One-level RRAM-based Routing Multiplexers

  • Tang
    Xifan

Resistive Random Access Memory(RRAM) technology opens the opportunity for granting both high-performance and low-power
features to routing multiplexers. In this paper, we study the physical design considerations
related to RRAM-based routing …

Hierarchical and Analytical Placement Techniques for High-Performance Analog Circuits

  • Xu
    Biying

High-performance analog integrated circuits usually require minimizing critical parasitic
loading, which can be modeled by the critical net wire length in the layout stage.
In order to reduce post-layout circuit performance degradation, critical net …

SESSION: Tuesday Keynote Address

Physical Design Challenges and Innovations to Meet Power, Speed, and Area Scaling
Trend

  • Lu
    Lee-Chung

In the advanced process technologies of 7nm and beyond, the semiconductor industry
faces several new challenges: (1) aggressive chip area scaling with economically feasible
process technology development, (2) sufficient performance enhancement of …

SESSION: Clock and Timing

Modern Challenges in Constructing Clocks

  • Alpert
    Charles J.

Clock Tree Construction based on Arrival Time Constraints

  • Ewetz
    Rickard

There are striking differences between constructing clock trees based on dynamic implied
skew constraints and based on static arrival time constraints. Dynamic implied skew
constraints allow the full timing margins to be utilized, but the constraints …

A Fast Incremental Cycle Ratio Algorithm

  • Wu
    Gang

In this paper, we propose an algorithm to quickly find the maximum cycle ratio (MCR)
on an incrementally changing directed cyclic graph. Compared with traditional MCR
algorithms which have to recalculate everything from scratch at each incremental …

iTimerM: Compact and Accurate Timing Macro Modeling for Efficient Hierarchical Timing Analysis

  • Lee
    Pei-Yu

As designs continue to grow in size and complexity, EDA paradigm shifts from flat
to hierarchical timing analysis. In this paper, we propose compact and accurate timing
macro modeling, which is the key to achieve efficient and accurate hierarchical …

SESSION: Routability Considerations

DSAR: DSA aware Routing with Simultaneous DSA Guiding Pattern and Double Patterning Assignment

  • Ou
    Jiaojiao

Directed self-assembly (DSA) is a promising solution for fabrication of contacts and
vias for advanced technology nodes. In this paper, we study a DSA aware detailed routing
problem, where DSA guiding pattern assignment and guiding pattern double …

Automatic Cell Layout in the 7nm Era

  • Cremer
    Pascal

Multi patterning technology used in 7nm technology and beyond imposes more and more
complex design rules on the layout of cells. The often non local nature of these new
design rules is a great challenge not only for human designers but also for existing

Improving Detailed Routability and Pin Access with 3D Monolithic Standard Cells

  • Shi
    Daohang

We study the impact of using 3D monolithic (3DM) standard cells on improving detailed
routability and pin access. We propose a design flow which transforms standard rows
of single-tier “2D” cells into rows of standard 3DM cells folded into two tiers. …

SESSION: Commemoration for Professor Satoshi Goto

The Spirit of in-house CAD Achieved by the Legend of Master “Prof. Goto” and his Apprentices

  • Nakamura
    Yuichi

In this paper, a legend story to develop CAD algorithms and CAD/EDA tools for NEC’s
in-house use is described. About 30 years ago, since there are few commercial CAD
tools, ICT vendors had to develop their own CAD tools to enhance the performance of

Generalized Force Directed Relaxation with Optimal Regions and Its Applications to
Circuit Placement

  • Chang
    Yao Wen

This paper introduces popular algorithmic paradigms for circuit placement, presents
Goto’s classical placement framework based on the generalized force directed relaxation
(GFDR) method with an optimal region (OR) formulation and its impacts on modern …

100x Evolution of Video Codec Chips

  • Zhou
    Jinjia

In the past two decades, there has been tremendous progress in video compression technologies.
Meanwhile, the use of these technologies, along with the ever-increasing demand for
emerging ultra-high-definition applications greatly challenges the design …

Physical Layout after Half a Century: From Back-Board Ordering to Multi-Dimensional Placement and Beyond

  • Kang
    Ilgweon

Innovations and advancements on physical design (PD) in the past half century significantly
contribute to the progresses of modern VLSI designs. While “Moore’s Law” and “Dennard
Scaling” have become slowing down recently, physical design society …

Past, Present and Future of the Research

  • Goto
    Satoshi

SESSION: Optimization and Placement

Interesting Problems in Physical Synthesis

  • Ho
    Pei-Hsin

It is a misperception that the Chinese have the same word for crisis as opportunity.
Despite that, a technical crisis does present opportunities for researchers and practitioners
to solve interesting problems. In this talk we point out two crises: …

Pin Accessibility-Driven Detailed Placement Refinement

  • Ding
    Yixiao

The significantly increased number of routing design rules at sub-20nm nodes has made
pin access one of the most critical challenges in detailed routing. Resolving pin
access issues in detailed routing stage may be too late due to the fixed pin …

A Fast, Robust Network Flow-based Standard-Cell Legalization Method for Minimizing
Maximum Movement

  • Karimpour Darav
    Nima

The standard-cell placement legalization problem has become critical due to increasing
design rule complexity and design utilization at 16nm and lower technology nodes.
An ideal legalization approach should preserve the quality of the input placement
in …

SESSION: FPGA CAD and Contest

CAD Opportunities with Hyper-Pipelining

  • Iyer
    Mahesh A.

Hyper-pipelining is a design technique that results in significant performance and
throughput improvements in latency-insensitive designs. Modern FPGA architectures
like Intel’s Stratix®10 feature a revolutionary register-rich HyperFlex? core fabric

An Effective Timing-Driven Detailed Placement Algorithm for FPGAs

  • Dhar
    Shounak

In this paper, we propose a new timing-driven detailed placement technique for FPGAs
based on optimizing critical paths. Our approach extends well beyond the previously
known critical path optimization approaches and explores a significantly larger …

Clock-Aware FPGA Placement Contest

  • Yang
    Stephen

Modern FPGA device contains complex clocking architecture on top of FPGA logic fabric.
To best utilize FPGA clocking architecture, both FPGA designers and EDA tool developers
need to understand the clocking architecture and design best methodology/…

SLIP 2016 TOC

A Comparative Analysis of Front-End and Back-End Compatible Silicon Photonic On-Chip
Interconnects

  • Thakkar
    Ishan G.

Photonic devices fabricated with back-end compatible silicon photonic (BCSP) materials
can provide independence from the complex CMOS front-end compatible silicon photonic
(FCSP) process, to significantly enhance photonic network-on-chip (PNoC) …

Latch Clustering for Minimizing Detection-to-Boosting Latency Toward Low-Power Resilient
Circuits

  • Hsu
    Chih-Cheng

Dynamic voltage scaling (DVS) has become one of the most effective approaches to achieve
ultra-low-power SoC. To eliminate timing errors arising from DVS, several error-resilient
circuit design techniques were proposed to detect and/or correct timing …

Connectivity Effects on Energy and Area for Neuromorphic System with High Speed Asynchronous
Pulse Mode Links

  • Segal
    Carrie

Hardware neuromorphic systems are challenged to achieve biologically realistic levels
of interconnectivity. When building a physical implementation of a neural net, the
properties of the media immediately impose limits on the number of interconnects and

Buffered Interconnects in 3D IC Layout Design

  • Ahmed
    Mohammad A.

A very important challenge in designing through-silicon via (TSV)-based 3D ICs is
to accurately estimate, through all stages of the physical design, the interconnect
delay which is strongly dependent on the layout of 3D IC. The earlier in the design

Topologically-Geometric Routing

  • Bazylevych
    Roman

The paper introduces foundations of the “Flexible Routing Method” that belongs to
the topologically-geometric type. It develops the idea to divide the routing problem
on two separate successive stages: topological and geometrical. At the first stage
it …

Revisiting 3DIC Benefit with Multiple Tiers

  • Chan
    Wei-Ting Jonas

3DICs with multiple tiers are expected to achieve large benefits (e.g., in terms of
power, area) as compared to conventional planar designs. However, few if any previous
works study upper bounds on power and area benefits from 3DIC integration with …

Spin-Hall Assisted STT-RAM Design and Discussion

  • Eken
    Enes

In recent years, Spin-Transfer Torque Random Access Memory (STT-RAM) has attracted
significant attentions from both industry and academia due to its attractive attributes
such as small cell area and non-volatility. However, long switching time and large

A Demand-Aware Predictive Dynamic Bandwidth Allocation Mechanism for Wireless Network-on-Chip

  • Mansoor
    Naseef

Long distance data communication over multi-hop wireline paths in conventional Networks-on-Chips
(NoCs) cause high energy consumption and degradation in bandwidth. Wireless interconnects
in the millimeter-wave band have emerged as an energy-efficient …

ISPD 2018 TOC

SESSION: Keynote Address

Session details: Keynote Address

  • Chu
    Chris

Challenges and Opportunities in Automotive, Industrial, and IoT Physical Design

  • Hill
    Anthony M.

Taping out modern, complex SOCs presents a myriad of challenges in physical design.
Doing so for demanding markets such as automotive, industrial, and IoT multiplies
that complexity. In this talk we will take a broad look across the physical design

SESSION: Finding the Golden Tree in the Forest!

Session details: Finding the Golden Tree in the Forest!

  • Yeap
    Gary

Wot the L: Analysis of Real versus Random Placed Nets, and Implications for Steiner Tree Heuristics

  • Kahng
    Andrew B.

The NP-hard Rectilinear Steiner Minimum Tree (RSMT) problem has been studied in the
VLSI physical design literature for well over three decades. Fast estimators of RSMT
cost (which reflects routed wirelength) are a required ingredient of modern physical

Prim-Dijkstra Revisited: Achieving Superior Timing-driven Routing Trees

  • Alpert
    Charles J.

The Prim-Dijkstra (PD ) construction [1] was first presented over 20 years ago as a way to efficiently
trade off between shortest-path and minimum-wirelength routing trees. This approach
has stood the test of time, having been integrated into leading …

Construction of All Rectilinear Steiner Minimum Trees on the Hanan Grid

  • Lin
    Sheng-En David

Given a set of pins, a Rectilinear Steiner Minimum Tree (RSMT) connects the pins using
only rectilinear edges with the minimum wirelength. RSMT construction is heavily used
at various design steps such as floorplanning, placement, routing, and …

SESSION: FPGA Special Session

Session details: FPGA Special Session

  • Das
    Sabya

Challenges in Large FPGA-based Logic Emulation Systems

  • Hung
    William N.N.

Functional verification is an important aspect of electronic design automation. Traditionally,
simulation at the register transfer-level has been the mainstream functional verification
approach. Formal verification and various static analysis checkers …

Flexibility: FPGAs and CAD in Deep Learning Acceleration

  • Chiu
    Gordon R.

Deep learning inference has become the key workload to accelerate in our AI-powered
world. FPGAs are an ideal platform for the acceleration of deep learning inference
by combining low-latency performance, power-efficiency, and flexibility. This paper

Exploration and Tradeoffs of different Kernels in FPGA Deep Learning Applications

  • Delaye
    Elliott

In the field of deep learning, efficient computational hardware has come to the forefront
of the large scale implementation and deployment of many applications. In the process
of designing hardware, various characteristics of hardware platforms have …

Architecture Exploration of Standard-Cell and FPGA-Overlay CGRAs Using the Open-Source
CGRA-ME Framework

  • Chin
    S. Alexander

We describe an open-source software framework,CGRA-ME, for the modeling and exploration
of coarse-grained reconfigurable architectures (CGRAs). CGRAs are programmable hardware
devices having large ALU-like logic blocks, and datapath bus-style inter-…

SESSION: Design Flow and Power Grid Optimization

Session details: Design Flow and Power Grid Optimization

  • Iyer
    Mahesh

Concurrent High Performance Processor Design: From Logic to PD in Parallel

  • Stok
    Leon

The design of a high-performance processor in an advanced technology node is a highly
concurrent process. While most SoCs are designed with (fairly) stable IP, several
trends are driving the design of the micro-architecture, the logic and the physical

Towards a VLSI Design Flow Based on Logic Computation and Signal Distribution

  • Reis
    André

This paper discusses directions for a VLSI design flow based on a novel paradigm of
local logic computation and global signal distribution. In the last years there has
been an increasing effort to perform a better integration between logic synthesis
and …

Power Grid Reduction by Sparse Convex Optimization

  • Ye
    Wei

With the dramatic increase in the complexity of modern integrated circuits (ICs),
direct analysis and verification of IC power distribution networks (PDNs) have become
extremely computationally expensive. Various power grid reduction methods are …

SESSION: Statistical and Machine Learning-Based CAD

Session details: Statistical and Machine Learning-Based CAD

  • Kissiov
    Ivan

Machine Learning Applications in Physical Design: Recent Results and Directions

  • Kahng
    Andrew B.

In the late-CMOS era, semiconductor and electronics companies face severe product
schedule and other competitive pressures. In this context, electronic design automation
(EDA) must deliver “design-based equivalent scaling” to help continue essential …

Machine Learning for Feature-Based Analytics

  • Wang
    Li-C.

Applying machine learning in Electronic Design Automation (EDA) has received growing
interests in recent years. One approach to analyze data in EDA applications can be
called feature-based analytics. In this context, the paper explains the inadequacy
of …

Data Efficient Lithography Modeling with Residual Neural Networks and Transfer Learning

  • Lin
    Yibo

Lithography simulation is one of the key steps in physical verification, enabled by
the substantial optical and resist models. A resist model bridges the aerial image
simulation to printed patterns. While the effectiveness of learning-based solutions

SESSION: Three Shades of Placement!

Session details: Three Shades of Placement!

  • Shinnerl
    Joseph

Compact-2D: A Physical Design Methodology to Build Commercial-Quality Face-to-Face-Bonded 3D ICs

  • Ku
    Bon Woong

The recent advancement of wafer bonding technology offers fine-grained and silicon-space
overhead-free 3D interconnections in face-to-face (F2F) bonded 3D ICs. In this paper,
we propose a full-chip RTL-to-GDSII physical design solution to build high-…

Analog Placement Constraint Extraction and Exploration with the Application to Layout
Retargeting

  • Xu
    Biying

In analog/mixed-signal (AMS) integrated circuits (ICs), most of the layout design
efforts are still handled manually, which is time-consuming and error-prone. Given
the previous high-quality manual layouts containing valuable design expertise of …

Pin Assignment Optimization for Multi-2.5D FPGA-based Systems

  • Kuo
    Wan-Sin

Advanced 2.5D FPGAs with larger logic capacity and higher pin counts compared to conventional
FPGAs are commercially available. Some multi-FPGA systems have already utilized 2.5D
FPGAs. Commercial 2.5D FPGA consists of multiple dies connected through an …

SESSION: Commemoration for Professor Te Chiang Hu

Session details: Commemoration for Professor Te Chiang Hu

  • Kahng
    Andrew B.

Influence of Professor T. C. Hu’s Works on Fundamental Approaches in Layout

  • Kahng
    Andrew B.

Professor T. C. Hu has made numerous pioneering and fundamental contributions in combinatorial
algorithms, mathematical programming and operations research. His seminal 1985 IEEE
book VLSI Circuit Layout: Theory and Design, coedited with Prof. E. S. Kuh,…

Tree Structures and Algorithms for Physical Design

  • Cheng
    Chung-Kuan

Tree structures and algorithms provide a fundamental and powerful data abstraction
and methods for computer science and operations research. In particular, they enable
significant advancement of IC physical design techniques and design optimization.
For …

Pioneer Research on Mathematical Models and Methods for Physical Design

  • Chu
    Chris

In the inaugural International Symposium on Physical Design (ISPD) at 1997, Prof.
Te Chiang Hu has delivered the keynote address “Physical Design: Mathematical Models
and Methods” [1]. Without any question, Prof. Hu has made a lot of foundational and

Theory and Algorithms of Physical Design

  • Cheng
    Chung-Kuan

SESSION: Interconnect Optimization and Detailed Routing Contest Results

Session details: Interconnect Optimization and Detailed Routing Contest Results

  • Yan
    Jackey

Interconnect Optimization Considering Multiple Critical Paths

  • Hu
    Jiang

Interconnect optimization, including buffer insertion and Steiner tree construction,
continues to be a pillar technology that largely determines overall chip performance.
Buffer insertion algorithms in published literature are mostly focused on …

Interconnect Physical Optimization

  • Janac
    K. Charles

The SoC Interconnect is one of the most important IPs in modern chips as it is the
logical and physical instantiation of an SoC architecture and carries virtually all
the SoC data. Interconnect IPs have to carry non-coherent, cache coherent, subsystem

ISPD 2018 Initial Detailed Routing Contest and Benchmarks

  • Mantik
    Stefanus

In advanced technology nodes, detailed routing becomes the most complicated and runtime
consuming stage. To spur detailed routing research, ISPD 2018 initial detailed routing
contest is hosted and it is the first ISPD contest on detailed routing …

SESSION: How to Make Your Foundry Happier?

Session details: How to Make Your Foundry Happier?

  • Hu
    Jiang

The Pressing Need for Electromigration-Aware Physical Design

  • Lienig
    Jens

Electromigration (EM) is becoming a progressively intractable design challenge due
to increased interconnect current densities. It has changed from something designers
“should” think about to something they “must” think about, i.e., it is now a definite

On Coloring and Colorability Analysis of Integrated Circuits with Triple and Quadruple
Patterning Techniques

  • Lvov
    Alexey

The continued delay of higher resolution alternatives for lithography, such as EUV,
is forcing the continued adoption of multi-patterning solutions in new technology
nodes, which include triple and quadruple patterning using several lithography-etch

Standard CAD Tool-Based Method for Simulation of Laser-Induced Faults in Large-Scale
Circuits

  • Viera
    Raphael A.C.

Designing secure integrated systems requires methods and tools dedicated to simulating
that early design stages’ the effects of laser-induced transient faults maliciously
injected by attackers. Existing methods for simulation of laser-induced transient

GLSVLSI 2019 TOC

SESSION: Keynote & Invited Talks

Thoughts on Edge Intelligence

  • Wolf
    Marilyn

Machine learning methods have exploded in the past half-dozen years. Machine learning
is being applied to a huge range of problems across the spectrum of applications.
Initial results relied on server-oriented computations. But many applications will

Automatic Implementation of Secure Silicon

  • Leef
    Serge

Throughout the past decade, cybersecurity threats have evolved from attacks focused
high in the software stack to progressively lower levels of computational hierarchy.
With the explosion of popularity and growing deployment of internet connected …

Processing Data Where It Makes Sense in Modern Computing Systems: Enabling In-Memory Computation

  • Mutlu
    Onur

Today’s systems are overwhelmingly designed to move data to computation. This design
choice goes directly against at least three key trends in systems that cause performance,
scalability and energy bottlenecks: 1) data access from memory is already a …

Innovations in IoT for a Safe, Secure, and Sustainable Future

  • Bhunia
    Swarup

Internet of things (IoT) promises to usher in the fourth industrial revolution through
an exponential growth of smart connected devices deployed in myriad application domains.
It gives rise to new relationships between man and smart connected machines …

SESSION: Tech Session 1: Design and Integration of Hardware Security Primitives

LPN-based Device Authentication Using Resistive Memory

  • Arafin
    Md Tanvir

Recent progress in the design and implementation of resistive memory components such
as RRAMs and PCMs has introduced opportunities for developing novel hardware security
solutions using unique physical properties of these devices. In this work, we …

Leveraging On-Chip Voltage Regulators Against Fault Injection Attacks

  • Vosoughi
    Ali

The security implications of utilizing an on-chip voltage regulator as a countermeasure
against fault injection attacks are investigated in this paper. The effect of the
size of the capacitors and number of phases of the voltage regulator on the …

On the Theoretical Analysis of Memristor based True Random Number Generator

  • Uddin
    Mesbah

Emerging nano-devices like memristors display stochastic switching behavior which
poses a big uncertainty in their implementation as the next-generation CMOS alternative.
However, this stochasticity provides an opportunity to design circuits for …

Control-Lock: Securing Processor Cores Against Software-Controlled Hardware Trojans

  • Šišejković
    Dominik

Malicious circuit modifications known as hardware Trojans represent a rising threat
to the integrated circuit supply chain. As many Trojans are activated based on a specific
sequence of circuit states, we have recognized the ease of utilizing an …

Lightweight Authenticated Encryption for Network-on-Chip Communications

  • Harttung
    Julian

In recent years, Network-on-Chip (NoC) has gained increasing popularity as a promising
solution for the challenging interconnection problem in multi-processor systems-on-chip
(MPSoCs). However, the interest of adversaries to compromise such systems grew …

SESSION: Tech Session 2: VLSI Circuits and Power Aware Design

Design of a Low-power and Small-area Approximate Multiplier using First the Approximate
and then the Accurate Compression Method

  • Yang
    Tongxin

Recently emerging applications, such as convolution neural networks (CNNs), which
process thousands of convolutional computations, require a large amount of power.
Multiplication is the key arithmetic in these applications and an approximate multiplier

GraphiDe: A Graph Processing Accelerator leveraging In-DRAM-Computing

  • Angizi
    Shaahin

In this paper, we propose GraphiDe, a novel DRAM-based processing-in-memory (PIM)
accelerator for graph processing. It transforms current DRAM architecture to massively
parallel computational units exploiting the high internal bandwidth of the modern

An Efficient Time-based Stochastic Computing Circuitry Employing Neuron-MOS

  • Erlina
    Tati

A compact and low energy circuitry of time-based stochastic computing (TBSC) have
been designed. In the TBSC theory, stochastic numbers (SNs) are represented by duty-cycle
of periodic signals. Additionally, multiplication and addition operations of the …

Monolithic 8×8 SiPM with 4-bit Current-Mode Flash ADC with Tunable Dynamic Range

  • Vinayaka
    Vikas

A monolithic photon-counting receiver consisting of an integrated silicon-photomultiplier
and a current-mode analog-to-digital converter (ADC) was designed, simulated and fabricated
in the AMS 0.35 μm SiGe BiCMOS process. The silicon photomultiplier (…

SESSION: Tech Session 3: : VLSI for Machine Learning and Artificial Intelligence

A Systolic SNN Inference Accelerator and its Co-optimized Software Framework

  • Guo
    Shasha

Although Deep Neural Network (DNN) architectures have made some breakthroughs in computer
vision tasks, they are not close to biological brain neurons. Spiking Neural Network
(SNN) is highly expected to bridge the gap between artificial computing …

Dynamic Beam Width Tuning for Energy-Efficient Recurrent Neural Networks

  • Jahier Pagliari
    Daniele

Recurrent Neural Networks (RNNs) are state-of-the-art models for many machine learning
tasks, such as language modeling and machine translation. Executing the inference
phase of a RNN directly in edge nodes, rather than in the cloud, would provide …

Efficient Softmax Hardware Architecture for Deep Neural Networks

  • Du
    Gaoming

Deep neural network (DNN) has become a pivotal machine learning and object recognition
technology in the big data era. The softmax layer is one of the key component layers
for completing multi-classification tasks. However, the softmax layer contains …

HSIM-DNN: Hardware Simulator for Computation-, Storage- and Power-Efficient Deep Neural Networks

  • Sun
    Mengshu

Deep learning that utilizes large-scale deep neural networks (DNNs) is effective in
automatic high-level feature extraction but also computation and memory intensive.
Constructing DNNs using block-circulant matrices can simultaneously achieve hardware

SESSION: Tech Session 4: Next Generation Interconnect: Architecture to Physical Design

An Area-Efficient Iterative Single-Precision Floating-Point Multiplier Architecture
for FPGA

  • Kim
    Sunwoong

Approximate multipliers have been widely used in critical applications, such as machine
learning and multimedia, which are tolerant to approximation errors. This paper proposes
a novel single-precision floating-point (SPFP) multiplication algorithm and …

An Automatic Transistor-Level Tool for GRM FPGA Interconnect Circuits Optimization

  • Li
    Zhengjie

Due to its dominance in FPGA area and delay, the interconnect circuit is traditionally
designed and optimized in full customized fashion, which can be extremely time consuming.
In this paper, we propose an automated transistor-level sizing optimization …

Low Voltage Clock Tree Synthesis with Local Gate Clusters

  • Sitik
    Can

In this paper, a novel local clock gate cluster-aware low voltage clock tree synthesis
methodology is introduced. In low voltage/swing clocking, timing closure is a challenging
problem due to tight skew and slew constraints. The clock gating makes this …

SESSION: Tech Session 5: Designing robust VLSI circuits. From approximate computing to hardware
security

TOIC: Timing Obfuscated Integrated Circuits

  • Alam
    Mahabubul

To counter the threats of reverse engineering (RE) and Trojan in-sertion, researchers
have considered gate-level obfuscation in inte-grated circuits (IC) as a viable solution.
However, several techniques are present in the literature to crack the …

Design for Eliminating Operation Specific Power Signatures from Digital Logic

  • Majumder
    Md Badruddoja

Conventional digital logic operations have distinguishable power signatures. Side
channel power analysis combined with classification algorithm can reveal unknown logic
operations. Revealing the underlying operations is the main task in reverse …

Non-Uniform Temperature Distribution in Interconnects and Its Impact on Electromigration

  • Abbasinasab
    Ali

We investigate the effect of electrically induced thermal load on interconnect reliability
and aging. We propose new models for uniform and non-uniform temperature evolution
and its steady state distribution in interconnects considering Joule heating …

Fault Classification and Coverage of Analog Circuits using DC Operating Point and
Frequency Response Analysis

  • Sanyal
    Sayandeep

Detection of faults in a mixed-signal SOC at the pre-silicon stage is a challenge,
especially when it has substantial analog components. Given the time taken for simulating
analog circuits, designing tests to detect faults in them is not a …

Crash Skipping: A Minimal-Cost Framework for Efficient Error Recovery in Approximate Computing Environments

  • Verdeja Herms
    Yan

We present a lightweight technique to minimize error recovery costs in approximate
computing environments. We take advantage of the key observation that if an application
crashes in a “non-critical” region of its execution, then skipping the crash and …

SESSION: Tech Session 6: Emerging Computing & Post-CMOS Technologies

Voltage-Controlled Magnetoelectric Memory Bit-cell Design With Assisted Body-bias
in FD-SOI

  • Cai
    Hao

Voltage-controlled magnetic anisotropy (VCMA)-magnetic tunnel junction (MTJ) is incorporated
into FD-SOI CMOS technology. The design space of 1 transistor-1 MTJ (1T-1M) bit-cell
is explored through varied VCMA pulse duration/amplitude and scaling down …

Low Cost Hybrid Spin-CMOS Compressor for Stochastic Neural Networks

  • Li
    Bingzhe

With expansion of neural network (NN) applications lowering their hardware implementation
cost becomes an urgent task especially in back-end applications where the power-supply
is limited. Stochastic computing (SC) is a promising solution to realize low-…

Functionally Complete Boolean Logic and Adder Design Based on 2T2R RRAMs for Post-CMOS
In-Memory Computing

  • Yang
    Zongxian

In-memory computing (IMC) paradigm has attracted extensive attention for future electronics
to overcome the bottleneck and memory wall problem in the von Neumann systems. Nonvolatile
logic based on resistive random-access memory (RRAM) is a promising …

Jump Search: A Fast Technique for the Synthesis of Approximate Circuits

  • Witschen
    Linus

State-of-the-art frameworks for generating approximate circuits automatically explore
the search space in an iterative process – often greedily. Synthesis and verification
processes are invoked in each iteration to evaluate the found solutions and to …

SESSION: Tech Session 7: Physical Design and Obfuscation

SAT-Based Placement Adjustment of FinFETs inside Unroutable Standard Cells Targeting
Feasible DRC-Clean Routing

  • Sorokin
    Anton

In this paper, we present an algorithm of transistor placement that takes unroutable
standard cells and makes them routable by moving transistors in local windows. It
converts the task of placement of gridded FinFETs into a Boolean problem and employs

A Scalable and Process Variation Aware NVM-FPGA Placement Algorithm

  • Yang
    Chengmo

As non-volatile memory (NVM) based FPGAs gain increasing popularity, FPGA synthesis
tools start to tune the synthesis flow to match NVM characteristics. State-of-the-art
NVM FPGA placement algorithms tried to reduce the high reconfiguration cost induced

Functional Obfuscation of Hardware Accelerators through Selective Partial Design Extraction
onto an Embedded FPGA

  • Hu
    Bo

The protection of Intellectual Property (IP) has emerged as one of the most serious
areas of concern in the semiconductor industry. To address this issue, we present
a method and architecture to map selective portions of a design, given as a behavioral

HydraRoute: A Novel Approach to Circuit Routing

  • Khasawneh
    Mohammad

Routing for dense circuits is a major challenge for VLSI physical design. Most routing
approaches rely at least partially on a “rip-up and reroute” scheme, where solution
quality and run times can be impacted profoundly by the order in which nets are …

SESSION: Tech Session 8: Quantum Circuits and Emerging Technologies

Balanced Factorization and Rewriting Algorithms for Synthesizing Single Flux Quantum
Logic Circuits

  • Pasandi
    Ghasem

Single Flux Quantum (SFQ) logic with switching energy of 100zJ1 and switching delay
of 1ps is a promising post-CMOS candidate. Logic synthesis of these magnetic-pulse-based
circuits is a very important step in their design flow with a big impact on the …

A Majority Logic Synthesis Framework for Adiabatic Quantum-Flux-Parametron Superconducting
Circuits

  • Cai
    Ruizhe

Adiabatic Quantum-Flux-Parametron (AQFP) logic is an adiabatic superconductor logic
that has been proposed as alternative to CMOS logic with extremely high energy efficiency.
In AQFP technology, majority-based gates have the same area as two-input AND/…

A Processing-In-Memory Implementation of SHA-3 Using a Voltage-Gated Spin Hall-Effect
Driven MTJ-based Crossbar

  • Yang
    Chengmo

Processing-In-Memory (PIM), which implements logic operations within memory cells,
opens up a new direction on organizing data and computation. Leveraging resistive
or magnetic characteristics of nonvolatile memory (NVM) devices, platforms such as
PLiM …

Exploring Processing In-Memory for Different Technologies

  • Gupta
    Saransh

The recent emergence of IoT has led to a substantial increase in the amount of data
processed. Today, a large number of applications are data intensive, involving massive
data transfers between processing core and memory. These transfers act as a …

SESSION: Tech Session 9: Towards Fast, Efficient, and Robust Memory

BLADE: A BitLine Accelerator for Devices on the Edge

  • Simon
    William Andrew

The increasing ubiquity of edge devices in the consumer market, along with their ever
more computationally expensive workloads, necessitate corresponding increases in computing
power to support such workloads. In-memory computing is attractive in edge …

Enhancing the Lifetime of Non-Volatile Caches by Exploiting Module-Wise Write Restriction

  • Agarwal
    Sukarn

The emerging Non-Volatile Memory (NVM) technologies offer a good combination of high
density and near-zero leakage power, becoming the strongest candidate in the memory
hierarchy including caches. However, the weak write endurance of these memories …

Mitigating the Performance and Quality of Parallelized Compressive Sensing Reconstruction
Using Image Stitching

  • Namazi
    Mahmoud

Orthogonal Matching Pursuit is an iterative greedy algorithm used to find a sparse
approximation for high-dimensional signals. The algorithm is most popularly used in
Compressive Sensing, which allows for the reconstruction of sparse signals at rates

Towards Optimizing Refresh Energy in embedded-DRAM Caches using Private Blocks

  • Manohar
    Sheel Sindhu

In recent years, the increased working set size of applications craves for more memory
demand in terms of large size Last Level Caches (LLC). To fulfill this, embedded DRAM
(eDRAM) caches have been considered as one of the best alternatives over …

SESSION: Tech Session 10: MSE

Extending Student Labs with SMT Circuit Implementation

  • Brunvand
    Erik

Computer Science and Computer Engineering classes related to digital circuits, embedded
systems, Human Computer Interaction (HCI), and a wide variety of “maker” subjects,
would often like to include physical computing projects. Extending these physical

Teaching the Next Generation of Cryptographic Hardware Design to the Next Generation
of Engineers

  • Aysu
    Aydin

Evolving threats against cryptographic systems and the increasing diversity of computing
platforms enforce teaching cryptographic engineering to a wider audience. This paper
describes the development of a new graduate course on hardware security taught …

A Web-based Remote FPGA Laboratory for Computer Organization Course

  • Wan
    Han

Learning in digital systems could be enhanced by applying a learn-by-doing mechanism.
In this paper the implementation of a web-based remote FPGA laboratory for Computer
Organization course is proposed. The projects created for this course are designed

System-on-a-Chip Design as a Platform for Teaching Design and Design Flow Integration

  • Covey
    Jacob

The design of microelectronic systems requires integration and cooperation across
multiple disciplines, but most curriculum is taught in unconnected pieces. This makes
the creation of manageable projects that reflect the design experience very …

SESSION: Poster Sessions I, II

UPIM: Unipolar Switching Logic for High Density Processing-in-Memory Applications

  • Sim
    Joonseop

Internet of Things (IoT) has built a network with billions of connected devices which
generate massive volumes of data. Processing large data on existing systems requires
significant costs for data movements between processors and memory due to limited

Fence-Region-Aware Mixed-Height Standard Cell Legalization

  • Do
    SangGi

We propose a fence-region-aware mixed-height standard cell legalization that can optimize
the placement of standard cells that have more than a two row height in various shapes
of the fence region. The algorithm consists of pre-legalization and mixed-…

A Case for Heterogeneous Network-on-Chip Based H.264 Video Decoders

  • Ghorbani Moghaddam
    Milad

The design of a heterogeneous network-on-chip (NoC) based H.264 video decoder is proposed.
A thorough investigation using a system simulator developed as the combination of
a cycle accurate NoC simulator together with complete implementations of all the …

A 16b Clockless Digital-to-Analog Converter with Ultra-Low-Cost Poly Resistors Supporting
Wide-Temperature Range from -40°C to 85°C

  • Wang
    Xuedi

High-precision digital-to-analog converter (DAC) is a critical component in process
control, data acquisition, and testing instruments. In order to achieve high resolution
and a wide-temperature range, conventional designs have been adopting high-cost …

A Skyrmion Racetrack Memory based Computing In-memory Architecture for Binary Neural
Convolutional Network

  • Pan
    Yu

A Skyrmion Racetrack Memory (SRM) based Computing In-Memory Architecture (SRM-CIM)
was proposed in this paper. Both data and computing operation can be achieved in SRM-CIM.
SRM-CIM is used to support convolutional computing in Binary Convolutional …

TASecure: Temperature-Aware Secure Deletion Scheme for Solid State Drives

  • Li
    Bingzhe

With the increasing concerns of security, the secure deletion for SSDs becomes very
costly due to its out-of-place update (i.e., an update is performed in a new location
leaving the old data un-touched). Some previous studies used a combined erase-based

An Asymmetric Dual Output On-Chip DC-DC Converter for Dynamic Workloads

  • Liu
    Xingye

We propose a novel two-stage hybrid on-chip DC-DC converter targeting low power applications
with multiple supply voltage domains and dynamic workloads. The converter has a nominal
input voltage of 1.2V and generates two asymmetrically regulated output …

CNNWire: Boosting Convolutional Neural Network with Winograd on ReRAM based Accelerators

  • Lin
    Jilan

Resistive random access memory (ReRAM) demonstrates the great potential of in-memory
processing for neural network (NN) acceleration. However, since the convolutional
neural network (CNN) is widely known as compute-bound, current ReRAM-based …

Feed-Forward XOR PUFs: Reliability and Attack-Resistance Analysis

  • Avvaru
    S. V. Sandeep

Physical unclonable functions (PUFs) can be used to generate unique signatures of
integrated circuit (IC) chips. XOR arbiter PUFs (XOR PUFs), that typically contain
multiple standard arbiter PUFs as their components, are more secure than standard

Exploring Design Trade-offs in Fault-Tolerant Behavioral Hardware Accelerators

  • Zhu
    Zhiqi

High-Level Synthesis (HLS) allows the automatic generation of hardware accelerators
with unique design metrics. This work leverages this unique feature and presents a
method to increase the search space of fault-tolerant hardware accelerators. The …

Automatic Extraction of Requirements from State-based Hardware Designs for Runtime
Verification

  • Seo
    Minjun

Runtime monitoring and verification enables a system to monitor itself and ensure
system requirements are met even in the presence of dynamic environments. For hardware,
state-based models are widely used, but verifying the correctness between the state-…

MirrorCache: An Energy-Efficient Relaxed Retention L1 STTRAM Cache

  • Kuan
    Kyle

Spin-Transfer Torque RAM (STTRAM) is a promising alternative to SRAMs in on-chip caches,
due to several advantages, including non-volatility, low leakage, high integration
density, and CMOS compatibility. However, STTRAMs’ wide adoption in resource-…

Design and Evaluation of DNU-Tolerant Registers for Resilient Architectural State
Storage

  • Alghareb
    Faris S.

In this work, we aim to maintain the correct execution of instructions in the pipeline
stages. To achieve that, the integrity for the data computed in registers during execution
should be maintained via protecting the susceptible registers. Thus, we …

Automated Analysis of Virtual Prototypes at Electronic System Level

  • Goli
    Mehran

The exponential increase in functionality of System-on-Chips (SoCs) and reduced Time-to-Market
(TTM) requirements have significantly altered the typical design and verification
flow. Virtual Prototyping (VP) at the Electronic System Level (ESL) using …

Dynamic Physically Unclonable Functions

  • Xiong
    Wenjie

Physical variations in the manufacturing processes of electronic devices have been
widely leveraged to design Physically Unclonable Functions (PUFs), which can be used
for authentication and key storage. Existing PUFs are static, as their PUF responses

RDTA: An Efficient Routability-Driven Track Assignment Algorithm

  • Liu
    Genggeng

This paper presents a routability-driven track assignment algorithm (RDTA) to efficiently
estimate routability. Routability has become a very challenging issue in modern IC
design and it can be effectively estimated by routing congestion. Track …

EraseMe: A Defense Mechanism against Information Leakage exploiting GPU Memory

  • Fang
    Hongyu

Graphics Processing Units (GPU) play a major role in speeding up computational tasks
of the users, especially in applications such as high volume text and image processing.
Recent works have demonstrated the security problems associated with GPU that do …

A Statistical Current and Delay Model Based on Log-Skew-Normal Distribution for Low
Voltage Region

  • Cao
    Peng

The increasing performance variation and non-Gaussian distribution pose remarkable
challenges to timing analysis for circuits operating in low voltage region. Accurate
modeling of the statistical characteristics is urgently required with process …

Enabling Approximate Storage through Lossy Media Data Compression

  • Worek
    Brian

As compute capabilities continue to scale, memory capacity and bandwidth continue
to lag behind. Data compression is an effective approach to improving memory capacity
and bandwidth; but prior works have focused primarily on lossless compression and

Thermal Fingerprinting of FPGA Designs through High-Level Synthesis

  • Chen
    Jianqi

This work investigates if temperature can be used to fingerprint FPGA designs and
presents a method to generate a large number of functionally equivalent FPGA designs
such that each design has a unique distinguishable thermal signature. The main …

Deep RNN-Oriented Paradigm Shift through BOCANet: Broken Obfuscated Circuit Attack

  • Tehranipoor
    Fatemeh

Logic encryption obfuscation has been used for thwarting counterfeiting, overproduction,
and reverse engineering but vulnerable to attacks. However, it was recently shown
that satisfiability – checking (SAT) can potentially compromise hardware …

STAT: Mean and Variance Characterization for Robust Inference of DNNs on Memristor-based
Platforms

  • Zhang
    Baogang

An emerging solution to accelerate the inference phase of deep neural networks (DNNs)
is to utilize memristor crossbar arrays (MCAs) to perform highly efficient matrix-vector
multiplication in the analog domain. An adverse challenge is that memristor …

LSM: Novel Low-Complexity Unified Systolic Multiplier over Binary Extension Field

  • Xie
    Jiafeng

Unified (hybrid field-size) systolic multiplier over GF(2m) (binary extension field)
has attracted significant attentions from research communities recently as it can
be used in reconfigurable cryptographic processors. In this paper, we present a novel

Binarized Depthwise Separable Neural Network for Object Tracking in FPGA

  • Yang
    Li

Object tracking has achieved great advances in the past few years and has been widely
applied in vision-based application. Nowadays, deep convolutional neural network has
taken an important role in object tracking tasks. However, its enormous model size

An Analytical-based Hybrid Algorithm for FPGA Placement

  • Hu
    Chengyu

As the capacity of FPGA increases, FPGA placers that adopt Simulated Annealing (SA)
algorithm take more and more runtime. To solve this problem, this paper presents HCAS,
a Hybrid algorithm Combining Analytical method and SA. There are three …

Approximate Memory with Approximate DCT

  • Ma
    Shenghou

Approximate Computing is an emerging computing paradigm where one exploits inherent
error resilience of certain applications (e.g., digital signal processing, multimedia
and artificial intelligence) and trades off absolute computation precisions for …

AQuRate: MRAM-based Stochastic Oscillator for Adaptive Quantization Rate Sampling of Sparse
Signals

  • Salehi
    Soheil

Recently, the promising aspects of compressive sensing have inspired new circuit-level
approaches for their efficient realization within the literature. However, most of
these recent advances involving novel sampling techniques have been proposed …

Clockless Spin-based Look-Up Tables with Wide Read Margin

  • Salehi
    Soheil

In this paper, we develop a 6-input fracturable non-volatile Clockless LUT (C-LUT)
using spin Hall effect (SHE)-based Magnetic Tunnel Junctions (MTJs) and provide a
detailed comparison between the SHE-MTJ-based C-LUT and Spin Transfer Torque (STT)-MTJ-…

A Hybrid Framework for Functional Verification using Reinforcement Learning and Deep
Learning

  • Singh
    Karunveer

In this paper, we propose a novel hybrid verification framework (HVF) which uses Reinforcement
Learning (RL) and Deep Neural Networks (DNNs) to accelerate the verification of complex
systems. More precisely, our HVF incorporates RL to generate all …

SESSION: Special Session 1: In-Memory Processing for Future Electronics

Digital and Analog-Mixed-Signal In-Memory Processing in CMOS SRAM

  • Jaiswal
    Akhilesh

Ferroelectric FET Based In-Memory Computing for Few-Shot Learning

  • Laguna
    Ann Franchesca

As CMOS technology advances, the performance gap between the CPU and main memory has
not improved. Furthermore, the hardware deployed for Internet of Things (IoT) applications
need to process ever growing volumes of data, which can further exacerbate …

True In-memory Computing with the CRAM: From Technology to Applications

  • Zabihi
    Masoud

An Overview of In-memory Processing with Emerging Non-volatile Memory for Data-intensive
Applications

  • Li
    Bing

The conventional von Neumann architecture has been revealed as a major performance
and energy bottleneck for rising data-intensive applications. The decade-old idea
of leveraging in-memory processing to eliminate substantial data movements has returned

SESSION: Special Session 2: Approximate Computing Systems Design: Energy Efficiency and Security
Implications

Security Threats in Approximate Computing Systems

  • Yellu
    Pruthvy

Approximate computing systems improve energy efficiency and computation speed at the
cost of reduced accuracy on system outputs. Existing efforts mainly explore the feasible
approximation mechanisms and their implementation methods. There is limited …

Characterizing Approximate Adders and Multipliers Optimized under Different Design
Constraints

  • Jiang
    Honglan

Taking advantage of the error resilience in many applications as well as the perceptual
limitations of humans, numerous approximate arithmetic circuits have been proposed
that trade off accuracy for higher speed or lower power in emerging applications …

Approximate Communication Strategies for Energy-Efficient and High Performance NoC: Opportunities and Challenges

  • Reza
    Md Farhadur

With the advancement and miniaturization of transistor technology, hundreds of cores
can be integrated on a single chip. Network-on-Chips (NoCs) are the de facto on-chip
communication fabrics for multi/many core systems because of their benefits over …

Information Hiding behind Approximate Computation

  • Wang
    Ye

There are many interesting advances in approximate computing recently targeting the
energy efficiency in system design and execution. The basic idea is to trade computation
accuracy for power and energy during all phases of the computation, from data to …

MLPrivacyGuard: Defeating Confidence Information based Model Inversion Attacks on Machine Learning
Systems

  • Alves
    Tiago A. O.

As services based on Machine Learning (ML) applications find increasing use, there
is a growing risk of attack against such systems. Recently, adversarial machine learning
has received a lot of attention, where an adversary is able to craft an input or …

SESSION: Special Session 3: Recent Advances in Near and In-Memory Computing Circuit ?

XNOR-SRAM: In-Bitcell Computing SRAM Macro based on Resistive Computing Mechanism

  • Jiang
    Zhewei

We present an in-memory computing SRAM macro for binary neural networks. The memory
macro computes XNOR-and-accumulate for binary/ternary deep convolutional neural networks
on the bitline without row-by-row data access. It achieves 33X better energy and …

Efficient Process-in-Memory Architecture Design for Unsupervised GAN-based Deep Learning
using ReRAM

  • Chen
    Fan

The ending of Moore’s Law makes domain-specific architecture as the future of computing.
The most representative is the emergence of various deep learning accelerators. Among
the proposed solutions, resistive random access memory (ReRAM) based process-…

DigitalPIM: Digital-based Processing In-Memory for Big Data Acceleration

  • Imani
    Mohsen

In this work, we design, DigitalPIM, a Digital-based Processing In-Memory platform
capable of accelerating fundamental big data algorithms in real time with orders of
magnitude more energy efficient operation. Unlike the existing near-data processing

In-memory Processing based on Time-domain Circuit

  • Kong
    Yuyao

Deep Neural Networks (DNN) have emerged as a dominant algorithm for machine learning
(ML). High performance and extreme energy efficiency are critical for deployments
of DNN, especially in mobile platforms such as autonomous vehicles, cameras, and other

SESSION: Special Session 4: Opportunities and Challenges for Emerging Monolithic 3D Integrated
Circuits

An Overview of Thermal Challenges and Opportunities for Monolithic 3D ICs

  • Shukla
    Prachi

Monolithic 3D (Mono3D) is a three-dimensional integration technology that can overcome
some of the fundamental limitations faced by traditional, two-dimensional scaling.
This paper analyzes the unique thermal characteristics of Mono3D ICs by simulating

Logic Monolithic 3D ICs: PPA Benefits and EDA Tools Necessary

  • Pentapati
    Sai Surya Kiran

Monolithic 3D (M3D) ICs provide a way to achieve high performance and low power designs
within the same technology node, thereby bypassing the need for transistor scaling.
M3D ICs have multiple 2D tiers sequentially fabricated on top of each other and …

Investigation and Trade-offs in 3DIC Partitioning Methodologies: N/A

  • Sketopoulos
    Nikolaos

In this work, we compare alternative 3DIC partitioning methodologies, in terms of
slack, number of inter-tier vias, Tier Area Ratio (TAR) and HPWL design parameters.
The popular 3DIC postplacement, bin-based Fidducia-Mattheyses (FM) partitioning flow
is …

Test and Design-for-Testability Solutions for Monolithic 3D Integrated Circuits

  • Koneru
    Abhishek

M3D integration can result in reduced area and higher performance when compared to
3D die stacking. Due to the benefits of M3D integration, there is growing interest
in industry towards the adoption of this technology. However, test challenges for
M3D …

N3XT Monolithic 3D Energy-Efficient Computing Systems

  • Aly
    Mohamed M. Sabry

The world’s appetite for analyzing massive amounts of structured and unstructured
data has grown dramatically. The computational demands of these abundant-data applications
far exceed the capabilities of today’s computing systems. The N3XT (Nano-…

SESSION: Special Session 5: Robust IC Authentication and Protected Intellectual Property: A
Special Session on Hardware Security

How to Generate Robust Keys from Noisy DRAMs?

  • Karimian
    Nima

Security primitives based on Dynamic Random Access Memory (DRAM) can provide cost-efficient
and practical security solutions, especially for resource-constrained devices, such
as hardware used in the Internet of Things (IoT), as DRAMs are an intrinsic …

Threats on Logic Locking: A Decade Later

  • Zamiri Azar
    Kimia

To reduce the cost of ICs and to meet the market’s demand, a considerable portion
of manufacturing supply chain, including silicon fabrication, packaging and testing
may be pushed offshore. Utilizing a global IC manufacturing supply chain, and inclusion

On Custom LUT-based Obfuscation

  • Kolhe
    Gaurav

Logic obfuscation yields hardware security against various threats, such as Intellectual
Property (IP) piracy and reverse engineering. Evolving Boolean satisfiability (SAT)
attacks have challenged the hardware security assurance rendered by various …

Securing Analog Mixed-Signal Integrated Circuits Through Shared Dependencies

  • Juretus
    Kyle

The transition to a horizontal integrated circuit (IC) design flow has raised concerns
regarding the security and protection of IC intellectual property (IP). Obfuscation
of an IC has been explored as a potential methodology to protect IP in both the …

SESSION: Special Session 6: Neuromorphic Computing and Deep Neural Network

Design Methodology for Embedded Approximate Artificial Neural Networks

  • Balaji
    Adarsha

Artificial neural networks (ANNs) have demonstrated significant promise while implementing
recognition and classification applications. The implementation of pre-trained ANNs
on embedded systems requires representation of data and design parameters in …

Exploration of Segmented Bus As Scalable Global Interconnect for Neuromorphic Computing

  • Balaji
    Adarsha

Spiking Neural Networks (SNNs) are efficient computation models for spatio-temporal
pattern recognition on resource and power constrained platforms. Dedicated SNN hardware,
also called neuromorphic hardware, can further reduce the energy consumption of …

ADMM-based Weight Pruning for Real-Time Deep Learning Acceleration on Mobile Devices

  • Li
    Hongjia

Deep learning solutions are being increasingly deployed in mobile applications, at
least for the inference phase. Due to the large model size and computational requirements,
model compression for deep neural networks (DNNs) becomes necessary, especially …

On the use of Deep Autoencoders for Efficient Embedded Reinforcement Learning

  • Prakash
    Bharat

In autonomous embedded systems, it is often vital to reduce the amount of actions
taken in the real world and energy required to learn a policy. Training reinforcement
learning agents from high dimensional image representations can be very expensive
and …

SESSION: Panelist Position Papers

Tuning Track-based NVM Caches for Low-Power IoT Devices

  • Aghaei Khouzani
    Hoda

Track-based non-volatile memories, such as Domain Wall Memory (DWM) and Skyrmion,
are promising candidates to be used as CPU caches due to their ultra-high density
and low-static power. However, the access latency and energy of these devices are
highly …

Dynamic Computation Migration at the Edge: Is There an Optimal Choice?

  • Shahhosseini
    Sina

In the era of Fog computing where one can decide to compute certain time-critical
tasks at the edge of the network, designers often encounter a question whether the
sensor layer provides the optimal response time for a service, or the Fog layer, or

Solving Energy and Cybersecurity Constraints in IoT Devices Using Energy Recovery
Computing

  • Thapliyal
    Himanshu

With the growth of Internet-of-Things (IoT), the potential threat vectors for malicious
cyber and hardware attacks are rapidly expanding. As the IoT paradigm emerges, there
are challenging requirements to design energy-efficient and secure systems. To …

Right-Provisioned IoT Edge Computing: An Overview

  • Adegbija
    Tosiron

Edge computing on the Internet of Things (IoT) is an increasingly popular paradigm
in which computation is moved closer to the data source (i.e., edge devices). Edge
computing mitigates the overheads of cloud-based computing arising from increased

Secure Computing Systems Design Through Formal Micro-Contracts

  • Kinsy
    Michel A.

Two enduring concepts in computer system design are abstraction levels and layered
composition. The design generally takes a layered approach where each layer implements
a different abstraction of the system. The layers communicate through interfaces …

SBCCI 2019 TOC

PHICC: an error correction code for memory devices

  • Philippe Magalhães
  • Otávio Alcântara
  • Jarbas Silveira

With the evolution of technology in the microelectronics field, integrated circuits (ICs) have been developed with decreasing dimensions. Despite the advances provided by the scale reduction, the occurrence of Multiple Cell Upsets (MCUs) caused by interferences such as ionizing radiation, has become increasingly common. Error Correction Codes (ECCs) are capable of augmenting fault tolerance of computer systems, however, there must be balance between error correction effectiveness and silicon implementation costs. The purpose of this article is to present the Parity Hamming Interleaved Correction Code (PHICC), which consists of a code capable of correcting multiple transient errors in memory cells, with low implementation cost. The validation of the PHICC was performed through a comparative analysis of correction effectiveness, implementation costs, reliability and Mean Time to Failure (MTTF) with others ECCs. The results show that PHICC can maintain the reliability system for longer time, which makes it a strong candidate for use in critical applications.

Lightweight security mechanisms for MPSoCs

  • Anderson Camargo Sant’Ana
  • Henrique Martins Medina
  • Kevin Boucinha Fiorentin
  • Fernando Gehm Moraes

Computational systems tend to adopt parallel architectures, by using multiprocessor systems-on-chip (MPSoCs). MPSoCs are vulnerable to software and hardware attacks, as infected applications and Hardware Trojans respectively. These attacks may have the purpose to gain access to sensitive data, interrupt a given application or even damage the system physically. The literature presents countermeasures using dedicated routing algorithms, cryptography, firewalls and secure zones. These approaches present a significant hardware cost (firewalls, cryptography) or are too restrictive regarding the use of MPSoC resources (secure zones). The goal of this paper is to present lightweight security mechanisms for MPSoCs, using four techniques: spatial isolation of applications; dedicated network to send sensitive data; traffic blocking filter; lightweight cryptography. These mechanisms protect the MPSoC against the most common software attacks, as Denial of Service (DoS) and spoofing (man-in-the-middle), and ensures confidentiality and integrity to applications. Results present low area and latency overhead, as well as the effectiveness of using the mechanisms to block malicious traffic.

Exploiting approximate computing for low-cost fault tolerant architectures

  • Gennaro S. Rodrigues
  • Juan Fonseca
  • Fabio Benevenuti
  • Fernanda Kastensmidt
  • Alberto Bosio

This work investigates how the approximate computing paradigm can be exploited to provide low-cost fault tolerant architectures. In particular, we focus on the implementation of Approximate Triple Modular Redundancy (ATMR) designs using the precision reduction technique. The proposed method is applied to two benchmarks and a multitude of ATMR designs with different degrees of approximation. The benchmarks are implemented on a Xilinx Zynq-7000 APSoC FPGA through high-level synthesis and evaluated concerning area usage and the inaccuracy caused by approximation. Fault injection experiments are performed by flipping bits of the FPGA configuration bitstream. Results show that the proposed approximation method can decrease the DSP usage of the hardware implementation up to 80% and the number of sensitive configuration bits up to 75% while maintaining an accuracy of more than 99.96%.

Fine-grain temperature monitoring for many-core systems

  • Alzemiro Lucas da Silva
  • André Luís del Mestre Martins
  • Fernando Gehm Moraes

The power density may limit the amount of energy a many-core system can consume. A many-core at its maximum performance may lead to safe temperature violations and, consequently, result in reliability issues. Dynamic Thermal Management (DTM) techniques have been proposed to guarantee that many-core systems run at good performance without compromising reliability. DTM techniques rely on accurate temperature information and estimation, which is a computationally complex problem. However, related works usually abstract the temperature monitoring complexity, assuming available temperature sensors. An issue related to temperature sensors is their granularity, frequently measuring the temperature of a large system area instead of a processing element (PE) area. Therefore, the first goal of this work is to propose a fine-grain (PE level) temperature monitoring for many-core systems. The second one is to present a dedicated hardware accelerator to estimate the system temperature. Results show that software performance can be a limiting factor when applying an accurate model to provide temperature estimation for system management. On the other side, the hardware accelerator connected to the many-core enables the fine-grain temperature estimation at runtime without sacrificing system performance.

An adaptive discrete particle swarm optimization for mapping real-time applications onto network-on-a-chip based MPSoCs

  • Jessé Barreto de Barros
  • Renato Coral Sampaio
  • Carlos Humberto Llanos

This paper presents a modified version of the well-known Particle Swarm Optimization (PSO) algorithm as an alternative for the single-objective Genetic Algorithm (GA) that is currently the state-of-the-art method to map real-time applications tasks onto Multiple Processors System-on-a-Chip (MPSoC) using preemptive capable wormhole-based Network-on-a-Chip (NoC) as their communication architecture. A statistical study based on an experimental setup has been performed to compare the GA-based task mapper and the proposed method by using a real-time application as a benchmark, as well as a group of randomly generated ones. Preliminary results have shown that our method is capable of achieving quicker convergence than the GA-based method, and it even produces better results when the application utilization is smaller than the available processing capacity, i.e., a fully schedulable mapping solution exists.

Exploring Tabu search based algorithms for mapping and placement in NoC-based reconfigurable systems

  • Guilherme A. Silva Novaes
  • Luiz Carlos Moreira
  • Wang Jiang Chau

Nowadays, the development of systems based on Networks-on-Chip (NoCs) brings big challenges to the designers due to problems of scalability, such as efficient Mapping and Placement, which are NP-hard problems. Several solutions have been proposed to solve this type of problem that is a variation of Quadratic Assignment Problems (QAP), being Tabu Search (TS) algorithms the ones showing most promising results. In NoC-based dynamically reconfigurable systems (NoC-DRSs), both mapping and placement problems present several layers of complexity due the reconfigurable scenarios. A previous work has adopted TS algorithm variations, but the best solution is not achieved with the wished high frequency. This paper introduces the original Forced Inversion (FI) Heuristic over Tabu Search algorithms for 2D-Mesh FPGA NoC-DRSs, in order to avoid local minima. Results with a series of benchmarks are presented and the performances of different approaches are quantitatively and qualitatively compared.

Performance evaluation of HEVC RCL applications mapped onto NoC-based embedded platforms

  • Wagner Penny
  • Daniel Palomino
  • Marcelo Porto
  • Bruno Zatt
  • Leandro Indrusiak

Today, several applications running into embedded systems have to fulfill soft or hard timing constraints. Video applications, like the modern High Efficiency Video Coding (HEVC), e.g., most often have soft real-time constraints. However, in specific scenarios, such as in robotic surgeries, the coupling of satellites and so on, harder timing constraints arise, becoming a huge challenge. Although the implementation of such applications in Networks-on-Chip (NoCs) being an alternative to reduce their algorithmic complexity and meet real-time constraints, a performance evaluation of the mapped NoC and the schedulability analysis for a given application are mandatory. In this work we make a performance evaluation of HEVC Residual Coding Loop (RCL) mapped onto a NoC-based embedded platform, considering the encoding of a single 1920×1080 pixels frame. A set of analysis exploring the combination of different NoC sizes and task mapping strategies were performed, showing for the typical and upper-bound workload cases scenarios when the application is schedulable and meets the real-time constraints.

An FPGA-based evaluation platform for energy harvesting embedded systems

  • Roberto Paulo Dias Alcantara Filho
  • Otavio Alcantara de Lima Junior
  • Corneli Gomes Furtado Junior

Extreme low-power embedded systems are essential in Smart Cities and the Internet of Things, once these systems are responsible for acquiring, processing, and transmitting valuable environmental data. Some of these systems should run for a very long time without any human intervention, even for batteries replacement. Energy harvesting technologies allow embedded systems to be powered up from the environment by converting surrounding energy sources into electrical energy. However, energy-harvesting embedded systems (EHES) heavily depends on the nature of the energy sources, which are mostly uncontrollable and unpredictable. To improve the evaluation of energy management techniques in EHES, we propose the emulation of I-V curves of low-power energy harvesting transducers. An FPGA-based platform controls the energy source emulation combined with an integrated logic analyzer, which allows real-time data gathering from the EHES in multiple evaluation scenarios. The experiments show that the platform replicates solar energy scenarios with only 0.56% mean error.

A comparison of two embedded systems to detect electrical disturbances using decision tree algorithm

  • Reneilson Santos
  • Edward David Moreno
  • Carlos Estombelo-Montesco

The Electrical Power Quality (EPQ) is a relevant subject in the academic area because of its importance on real-world problems. The anomalies on an electrical network can cause strong losses in equipment and data. In this context, much effort has been made by many types of research approaches to get solutions for this kind of problem, seeking for better accuracy on the classification of the anomalies, or building a system to detect them. This paper, therefore, aims to compare two systems built to classify electrical disturbances even in noised environments. For this purpose, it was used a microprocessor system (Raspberry Pi3) and a micro-controller system (NodeMCU Amica), analyzing their time to classify the input signal. The microprocessor achieves better results (45.50ms against 267.10ms), with an accuracy of 97.96% in an ideal environment and 76.79% in a noisy environment (20dB of SNR) for both systems.

FPGA hardware linear regression implementation using fixed-point arithmetic

  • Willian de Assis Pedrobon Ferreira
  • Ian Grout
  • Alexandre César Rodrigues da Silva

In this paper, a hardware design based on the field programmable gate array (FPGA) to implement a linear regression algorithm is presented. The arithmetic operations were optimized by applying a fixed-point number representation for all hardware based computations. A floating-point number training data point was initially created and stored in a personal computer (PC) which was then converted to fixed-point representation and transmitted to the FPGA via a serial communication link. With the proposed VHDL design description synthesized and implemented within the FPGA, the custom hardware architecture performs the linear regression algorithm based on matrix algebra considering a fixed size training data point set. To validate the hardware fixed-point arithmetic operations, the same algorithm was implemented in the Python language and the results of the two computation approaches were compared. The power consumption of the proposed embedded FPGA system was estimated to be 136.82 mW.

New insight for next generation SRAM: tunnel FET versus FinFET for different topologies

  • Adriana Arevalo
  • Romain Liautard
  • Daniel Romero
  • Lionel Trojman
  • Luis-Miguel Procel

The purpose of this work is to point out the main differences between a Static Random-Access Memory (SRAM) cells implemented by using Tunnel FET (TFET) and FinFET technologies. We have compared the behavior of SRAM cells implemented in both technologies cells for a supply voltage range from 0.4V to 1.2V. Furthermore, for our study, we have chosen different SRAM cell topologies, such as 6T, 8T, 9T and 10T. Therefore, we have simulated all of these topologies for both technologies and extracted the Static Noise Margins (SNM) for the reading and writing processes. In addition, we have determined the power consumption in order to find the best trade-off between stability and power. By analyzing these results, we have determined the best topology for each technology. Finally, we have compared these best topologies for each technology in order to perform a study of advantages and shortcomings. Our results show more advantages using TFET technology instead of FinFET one.

DNAr-logic: a constructive DNA logic circuit design library in R language for molecular computing

  • Renan A. Marks
  • Daniel K. S. Vieira
  • Marcos V. Guterres
  • Poliana A. C. Oliveira
  • Omar P. Vilela Neto

This paper describes the DNAr-Logic: an implementation of a software package in R language that provides ease of use and scalability of the design process of digital logic circuits in molecular computing, more specifically, DNA computing. These devices may be used in-vitro, in-vivo, or even replace the CMOS technology in some applications. Using a technique known as DNA strand displacement reaction (DSD) in conjunction with chemical reaction networks (CRN’s), DNA strands can be used as “wet” hardware to construct molecular logic circuits analogous to electronic digital projects. The circuits designed using the DNAr-Logic can be created in a constructive manner and simulated without requiring knowledge of chemistry or DSD mechanism. The package implements all the main logic gates. We describe the design of a majority gate (also available in the package) and a full-adder circuit that only uses this port. We describe the results and simulation of our design.

Finding optimal qubit permutations for IBM’s quantum computer architectures

  • Alexandre A. A. de Almeida
  • Gerhard W. Dueck
  • Alexandre C. R. da Silva

IBM offers quantum processors for Clifford+T circuits. The only restriction is that not all CNOT gates are implemented and must be substituted with alternate sequences of gates. Each CNOT has its own mapping with a respective cost. However, by permuting the qubits, the number of CNOT that need mappings can be reduced. The problem is to find a good permutation without an exhaustive search. In this paper we propose a solution for this problem. The permutation problem is formulated as an Integer Linear Programming (ILP) problem. Solving the ILP problem, the lowest cost permutation for the CNOT mappings is guaranteed. To test and validated the proposed formulation, quantum architectures with 5 and 16 qubits were used. The ILP formulation along with mapping techniques found circuits with up to 64% fewer gates than other approaches.

Hardware implementation of a shape recognition algorithm based on invariant moments

  • Clement Raffaitin
  • Juan-Sebastian Romero
  • Juan-Sebastian Romero
  • Luis-Miguel Procel

The present work shows the description of a simple fast shape detection algorithm and its implementation in hardware in a FPGA system. The detection algorithm is based on the concepts of Hu’s moments which are invariant to similarity transformations. The recognition algorithm is implemented by using a non-local means filter. The algorithm is implemented on a FPGA system by using a hardware description language. We present the different design stages of the algorithm implementation which is based on the finite state machine technique. This algorithm is able to recognize a target shape over a test image. Furthermore, this work, describes the advantages of the implementation in hardware, such as speed and parallelism in signal processing. Finally, we show some results of the implementation of this algorithm.

A custom processor for an FPGA-based platform for automatic license plate recognition

  • Guilherme A. M. Sborz
  • Guilherme A. Pohl
  • Felipe Viel
  • Cesar A. Zeferino

Automatic License Plate Recognition (ALPR) systems are used to identify a vehicle from an image that contains its plate. These systems have applications in a wide range of areas, such as toll payment, border control, and traffic surveillance. ALPR systems demand high computational power, especially for real-time applications. In this context, this paper describes the development of a custom processor designed to accelerate part of the processing of an FPGA-based ALPR system. This processor reduces the latency for computing the most expensive function of the ALPR system in almost 23 times, thus reducing the time necessary for detection of a vehicle plate.

Hardware design of DC/CFL intra-prediction decoder for the AV1 codec

  • Jones Goebel
  • Bruno Zatt
  • Luciano Agostini
  • Marcelo Porto

This paper presents a dedicated hardware design for the DC and Chroma from Luma (CFL) intra-prediction modes of AV1 decoder. The hardware was designed to reach real-time when processing UHD 4K videos. The AV1 codec is an open-source and royalties-free video coding, which was developed by the AOMedia group, this group is composed of multiple companies like Google, Netflix, AMD, ARM, Intel, Nvidia, Microsoft, Mozilla and others. The proposed solution can support all 19 block sizes allowed in AV1 encoder, being able to process UHD 4K videos at 60 frames per second. The DC/CFL modules were synthesized to the TSMC 40 nm cells library targeting the frequency of 132.1 MHz. Synthesis results show the proposed hardware used 89.39 Kgates and a power dissipation of 7.96mW.

Approximate interpolation filters for the fractional motion estimation in HEVC encoders and their VLSI design

  • Rafael da Silva
  • Ícaro Siqueira
  • Mateus Grellert

Motion Estimation (ME) is one of the most complex HEVC steps, consuming more than 60% of the average encoding time, most of which is spent on its fractional part (Fractional Motion Estimation – FME), in which sub-pixel samples are interpolated and searched over to find motion vectors with higher precision. This paper presents hardware designs for the sub-pixel interpolation unit of the FME step. The designs employ approximate computing techniques by reducing the number of taps in each filter to reduce memory access and hardware cost. The approximate filters were implemented in the HEVC reference software to assess their impact on coding performance. A complete interpolation architecture was implemented in VHDL and synthesized with different filter precision and input sizes in order to assess the effect of these parameters on hardware area and performance. The approximate designs reduce the number of adders/subtractors by up to 67.65% and memory bandwidth by up to 75% with a tolerable loss in coding performance (less than 1% using the Bjontegaard Delta bitrate metric). When synthesized to an FPGA device, 52.9% less logic elements are required with a modest increase in frequency.

An SVM-based hardware accelerator for onboard classification of hyperspectral images

  • Lucas A. Martins
  • Guilherme A. M. Sborz
  • Felipe Viel
  • Cesar A. Zeferino

Hyperspectral images (HSIs) have been used in civil and military scenarios for ground recognition, urban development management, rare minerals identification, and diverse other purposes. However, HSIs have a significant volume of information and require high computational power, especially for real-time processing in embedded applications, as in onboard computers in satellites. These issues have driven the development of hardware-based solutions able to provide the processing power necessary to meet such requirements. In this paper, we present a hardware accelerator to enhance the performance of one of the most computational expensive stages of HSI processing: the classification. We have employed the Entropy Multiple Correlation Ratio procedure to select the spectral bands to be used in the training process. For the classification step, we have applied a Support Vector Machine classifier with a Hamming Distance decision approach. The proposed custom processor was implemented in FPGA and compared with high-level implementations. The results obtained demonstrate that the processor has a silicon cost lower than similar solutions and can perform a real-time pixel classification in 0.1 ms and achieves a state-of-the-art accuracy of 99.7%.

A sub-1mA highly linear inductorless wideband LNA with low IP3 sensitivity to variability for IoT applications

  • Arthur Liraneto Torres Costa
  • Hamilton Klimach
  • Sergio Bampi

This paper proposes a wideband 0.4-2 GHz cascode common-gate LNA that can be used as a building block for a noise canceling topology (which entails its noise to be canceled at the output node). The design strategy is to set the operating point by analyzing the third order coefficient (α3) of the output current and the output voltage, which is designed using a load composed by a diode-connected PMOS transistor and a resistor in parallel. This operating point allows a reasonable VGS spread, maintaining a high IIP3 which implies a low IIP3 sensitivity to process variability. The design strategy also achieves a current consumption under 1 mA and, depending on the technology node VDD (CMOS 130 nm in this case), it can consume under 1 mW of power. This makes the wideband LNA suitable for IoT applications. Monte Carlo simulations have been carried out to demonstrate the operating region sensitivity to variability and achieves a result of worst case IIP3μ = +0.2 dBm with σ = 0.8 dBm (@2GHz) up to a nominal 2.75 dBm @900 MHz, S11 < -23 dB, NF < 5.5 dB (canceled by virtue of its topology), a voltage gain of 11.6-14.6 dB (S21 = 6.4-9.4 dB with a buffer to 50 Ω), and consuming just 1.19 mW from a 1.2 V supply.

Comparison between direct and indirect learnings for the digital pre-distortion of concurrent dual-band power amplifiers

  • Luis Schuartz
  • Artur T. Hara
  • André A. Mariano
  • Bernardo Leite
  • Eduardo G. Lima

Current radio-communication systems demand high linearity and high efficiency. The digital baseband pre-distorter (DPD) is a cost-effective solution to guarantee the required linearity without compromising the efficiency. In the design of a DPD for a single band power amplifier (PA), the position of the inverse system is exchanged during the identification procedure to avoid the necessity of a PA model within a cumbersome closed-loop process. However, in a practical environment where only an approximation to the inverse is achieved, the linearization capability is affected by shifting the post-inverse placed after the PA to a pre-inverse located before the PA. In DPD intended for concurrent dual-band PAs, an additional advantage of such approach is that the post-inverse identifications for each band are completely independent of each other. This work performs a comparative analysis between two learning architectures applied to the linearization of two concurrent dual-band PAs stimulated by 2.4 GHz Wi-Fi and 3.5 GHz LTE signals. For the first PA, an exact PA model is known and the replacement of a post-inverse to a pre-inverse produces only negligible degradation in linearity. For the second PA, only an approximate PA model is available and the accuracy of such PA model produces a major impact on the linearization capability than the shifting of the inverse.

Interactive evolutionary approach to reduce the optimization cycle time of a low noise amplifier

  • Rodrigo A. L. Moreto
  • Douglas Rocha
  • Carlos E. Thomaz
  • André Mariano
  • Salvador P. Gimenez

Nowadays, wireless communications at frequencies of gigahertz have an increasing demand due to the ever-increasing number of electronic devices that uses this type of communication. They are implemented by Radio Frequency (RF) circuits. However, the design of RF circuits is difficult, time-consuming and based on designer knowledge and experience. This work proposes an interactive evolutionary approach using the genetic algorithm, which is implemented in the in-house iMTGSPICE optimization tool, to perform the optimization process of a robust (corner and Monte Carlo analyses) Ultra Low-Power Low Noise Amplifier (LNA) dedicated to Wireless Sensor Networks (WSN), which is implemented in a 130 nm Bulk CMOS technology. We performed two experimental studies to optimize the LNA. The first one used the interactive approach of iMTGSPICE, which was monitored and assisted by a beginner designer during the optimization process. The second one used the conventional approach of iMTGSPICE (non-interactive), which was not assisted by a designer during the optimization process. The obtained results demonstrated that the interactive approach of iMTGSPICE performed the optimization process of the robust LNA around 94% faster (in approximately 20 minutes only) than the non-interactive evolutionary approach (in approximately 6 hours).

An innovative strategy to reduce die area of robust OTA by using iMTGSPICE and diamond layout style for MOSFETs

  • José Roberto Banin Júnior
  • Rodrigo Alves de Lima Moreto
  • Gabriel Augusto da Silva
  • Carlos Eduardo Thomaz
  • Salvador Pinillos Gimenez

This paper describes a pioneering design and optimization methodology that provides a remarkable die area reduction of robust analog Complementary Metal-Oxide-Semiconductor (CMOS) Integrated Circuits (ICs) by using a computational tool based on artificial intelligence (iMTGSPICE) and the Diamond layout style for MOSFETs. The validation of this innovative optimization strategy for analog CMOS ICs was made for an Operational Transconductance Amplifiers (OTA) by using 180 nm CMOS ICs technology. The main finding of this work reports a remarkable reduction of the total die area of a robust OTA around 30%, regarding the use of Diamond MOSFETs with alfa angles of 45° when compared to the one implemented with standard rectangular MOSFETs.

NMLSim 2.0: a robust CAD and simulation tool for in-plane nanomagnetic logic based on the LLG equation

  • Lucas A. Lascasas Freitas
  • Omar P. Vilela Neto
  • João G. Nizer Rahmeier
  • Luiz G. C. Melo

Nanomagnetic Logic (NML) is a new technology based on the magnetization of nanometric magnets. Logic operations are performed via dipolar coupling through ferromagnetic and antiferromagnetic interactions. The low energy dissipation and the possibility of higher integration density in circuits are significant advantages over CMOS technology. Even so, there is a great need for simulation and CAD tools for the proper study of large NML circuits. This paper presents a high-efficiency tool that uses the Landau-Lifshitz-Gilbert equation to evolve the magnetization of the particles over time in amonodomain approach. The new version of NMLSim comes with flexibility in its code, allowing expansion of the tool with ease and consistency. The results of simulated structures show the reliability of the simulator when compared with the current state of the art Object-Oriented Micromagnetic Framework (OOMMF). It also presents an improvement of up to 716 times in execution time and up to 41 times in memory usage.

Ropper: a placement and routing framework for field-coupled nanotechnologies

  • Ruan Evangelista Formigoni
  • Ricardo S. Ferreira
  • José Augusto M. Nacif

Field-Coupled Nanocomputing technologies are the subject of extensive research to overcome current CMOS limitations. These technologies include nanomagnetic and quantum structures, each with its design and synchronization challenges. In this scenario clocking schemes are used to ensure circuit synchronization and avoid signal disruptions at the cost of some area overhead. Unfortunatelly, a nanocomputing technology is limited to a small subset of clocking schemes due to its number of clocking phases and signal propagation system, thus, leading to complex design challenges when tackling the placement and routing problem resulting in technology dependant solutions. Our work consists on presenting a novel framework developed by our team that solves these design challenges when using distinct schemes, therefore, avoiding the need to design pre-defined routing algorithms for each one. The framework offers a technology independent solution and provides interfaces for the implementation of efficient and scalable placement strategies, moreover, it has full integration with reference state-of-the-art optimization and synthesis tools.

Toward nanometric scale integration: an automatic routing approach for NML circuits

  • Pedro Arthur R. L. Silva
  • Omar Paranaíba V. Neto
  • José Augusto M. Nacif

In recent years, many technologies have been studied to replace or complement CMOS. Some of these emerging technologies are known as Field Coupled Nanocomputing. However, these new technologies introduce the need for developing tools to perform circuit mapping, placement, and routing. Nanomagnetic Logic Circuit (NML) is one of these emergent technologies. It relies on the magnetization of nanomagnets to perform operations through majority logic. In this work, we propose an approach to map a gate-level circuit to an NML layout automatically. We use the Breadth First Search to perform the placement and the A* algorithm to transverse the circuit and build the routes for each node. To evaluate the effectiveness of our approach, we use a series of ISCAS’85 benchmarks. Our results show an area reduction varying from 20% to 60%.

Energy efficient fJ/spike LTS e-Neuron using 55-nm node

  • Pietro M. Ferreira
  • Nathan De Carvalho
  • Geoffroy Klisnick
  • Aziz Benlarbi-Delai

While CMOS technology is currently reaching its limits in power consumption and circuit density, a challenger is emerging from the analogy between biology and silicon. Hardware-based neural networks may drive a new generation of bio-inspired computers by the urge of a hardware solution for real-time applications. This paper redesigns a previous proposed electronic neuron (e-Neuron) in a higher firing rate to reduce the silicon area and highlight a better energy efficiency trade-off. Besides, an innovative schematic is proposed to state an e-Neuron library based on Izhikevichs model of neural firing patterns. Both e-Neuron circuits are designed using 55 nm technology node. Physical design of transistors in weak inversion are discussed to a minimal leakage. Neural firing pattern behaviors are validated by post-layout simulations, demonstrating the spike frequency adaptation and the rebound spikes due to post-inhibitory effect in LTS e-Neuron. Presented results suggest that the time to rebound spikes is dependent of the excitation current amplitude. Both e-Neurons have presented a fF/spike energy efficiency and a smaller silicon area in comparison to Izhikevichs library propositions in the literature.

CMOS analog four-quadrant multiplier free of voltage reference generators

  • Antonio José Sobrinho de Sousa
  • Fabian de Andrade
  • Hildeloi dos Santos
  • Gabriele Gonçalves
  • Maicon Deivid Pereira
  • Edson Santana
  • Ana Isabela Cunha

This work presents a CMOS four quadrant analog multiplier architecture for application as the synapse element in analog cellular neural networks. The circuit has voltage-mode inputs and a current-mode output and includes a signal application method that avoids voltage or current reference generators. Simulations have been accomplished for a CMOS 130 nm technology, featuring ±50 mV input voltage range, 60 μW static power and -25 dB maximum THD. The active area is 346 μm2.

Amplifier-based MOS analog neural network implementation and weights optimization

  • Tiago Oliveira Weber
  • Diogo da Silva Labres
  • Fabián Leonardo Cabrera

Neural networks are achieving state-of-the-art performance in many applications, from speech recognition to computer vision. A neuron in a multi-layer network needs to multiply each input by its weight, sum the results and perform an activation function. This paper presents a variation of the implementation of an amplifier-based MOS analog neuron capable of performing these tasks and also the optimization of the synaptic weights using in-loop circuit simulations. MOS transistors operating in the triode region are used as variable resistors to convert the input and weight voltage to a proportional input current. To test the analog neuron in full networks, an automatic generator is developed to produce a netlist based on the number of neurons on each layer, inputs and weights. Simulation results using a CMOS 180 nm technology demonstrate the neuron proper transfer function and its functionality while trained in test datasets.

Reduction of neural network circuits by constant and nearly constant signal propagation

  • Augusto Andre Souza Berndt
  • Alan Mishchenko
  • Paulo Francisco Butzen
  • Andre Inacio Reis

This work focuses on optimizing circuits representing neural networks (NNs) in the form of and-inverter graphs (AIGs). The optimization is done by analyzing the training set of the neural network to find constant bit values at the primary inputs. The constant values are then propagated through the AIG, which results in removing unnecessary nodes. Furthermore, a trade-off between neural network accuracy and its reduction due to constant propagation is investigated by replacing with constants those inputs that are likely to be zero or one. The experimental results show a significant reduction in circuit size with negligible loss in accuracy.

A FPGA parameterizable multi-layer architecture for CNNs

  • Guilherme Korol
  • Fernando Gehm Moraes

Advances in hardware platforms boosted the use of Convolutional Neural Networks (CNNs) to solve problems in several fields such as Computer Vision and Natural Language Processing. With the improvements of algorithms involved in learning and inferencing for CNNs, dedicated hardware architectures have been proposed with the goal to speed up the CNNs’ performance. However, the CNNs’ requirements in bandwidth and processing power challenge designers to create architectures fitted for ASICs and FPGAs. Embedded applications targeting IoT (including sensors and actuators), health devices, smartphones, and any other battery-powered device may benefit from CNNs. For that, the CNN design must follow a different path, where the cost function is a small area footprint and reduced power consumption. This paper is a step towards this goal, by proposing an architecture for the main modules of modern CNNs. The proposal uses as case-study the Alexnet CNN, targeting Xilinx FPGA devices. Compared to the literature, results show a reduction up to 9 times in the amount of required DSP modules.

Design of a low power 10-bit 12MS/s asynchronous SAR ADC in 65nm CMOS

  • Arthur Lombardi Campos
  • João Navarro
  • Maximiliam Luppe

During the last decades we have witnessed the performance improvement and the aggressive growth of the complexity of integrated circuits (ICs). The progressive size reduction of transistors in recent technological nodes has allowed IC designers to perform analog tasks in the digital domain, increasing the demand for analog-to-digital converters (ADCs). This work presents the design and implementation of a low power successive approximation register analog-to-digital converter (SAR ADC) in a 65nm CMOS technology, suitable for low power frontend of wireless receivers with a flexible sampling rate up to 12 MS/s. At maximum sampling rate, the post-layout simulated circuit achieved an equivalent number of bits (ENOB) of 9.65 and a power consumption of 151.4μW, leading to a Figure of Merit of 15.8fJ/Conversion-step, inside an area of 0.074mm2.

A new algorithm for an incremental sigma-delta converter reconstruction filter

  • Li Huang
  • Caroline Lelandais-Perrault
  • Anthony Kolar
  • Philippe Benabes

Image sensors dedicated for the applications of the Earth observation require medium-speed and high-resolution analog-to-digital converters (ADCs). For that purpose, an incremental sigma-delta analog-to-digital converter (IΣΔ ADC) has been designed. Post-layout simulations highlighted a degradation in resolution caused by the circuit imperfections. Therefore, a digital correction has been investigated. This paper proposes a new reconstruction filter which takes into account not only the bit values of the modulator output sequence but also the occurrence of certain patterns. This technique has been applied to an incremental sigma-delta analog-to-digital converter in order to correct the conversion errors. Performing with 400 clock periods for each conversion cycle, the theoretical resolution is 15.4 bits. Post-layout simulation shows that a 13.5-bit resolution is obtained by using the classical optimal filter whereas a 14.8-bit resolution is obtained with our reconstruction filter.

Behavioral modeling of a control module for an energy-investing piezoelectric harvester

  • Tales Luiz Bortolin
  • André Luiz Aita
  • João Baptista dos Santos Martins

This work analyzes a piezoelectric energy harvesting system that uses a single inductor and the concept of energy investment. The harvester behavior, with special focus on its control logic module and state machine, is fully described and modeled in Verilog-A. The needed sensors and control variables were also identified and modeled. Simulation results have shown the correct behavioral modeling of the piezoelectric energy harvester system and proposed control, highlighting the harvesting mechanism based on the concept of energy-investment and the effect of the energy invested on the characteristics of the battery charging profile. The speed of the behavioral simulations when compared to electrical ones and the obtained model accuracy, have shown a reliable and prospective higher-level design approach.

An IR-UWB pulse generator using PAM modulation with adaptive PSD in 130nm CMOS process

  • Luiz Carlos Moreira
  • José Fontebasso Neto
  • Walter Silva Oliveira
  • Thiago Ferauche
  • Guilherme Heck
  • Ney Laert Vilar Calazans
  • Fernando Gehm Moraes

This paper proposes an adaptive pulse generator using Pulse Amplitude Modulation (PAM). The circuit was implemented with eight Pulse Generator Units (PGUs) to produce up to eight monocycles per pulse. The number of monocycles per pulse is inversely proportional to the Power Spectrum Density (PSD) bandwidth in the Impulse Radio Ultra-Wide Band (IR-UWB). The complete circuit contains two pulse generator blocks, each one composed by eight PGUs to build a rectangular waveform at the output. The PGU has been implemented with Edge Combiners High (ECH) and Edge Combiners Low (ECL) to encode the information. Each Edge Combiner has a high impedance circuit that is selected by digital control signals. The circuit has been simulated, showing an output pulse amplitude of ≈70mV for the high logic level and an amplitude of ≈35mV for the low logic level, both at 100 MHz Pulse Repetition Frequency (PRF). This produces a mean pulse duration of ≈270ps, a mean central frequency of ≈3.7GHz and a power consumption less than 0,22μW. The pulse generator block occupies an area of 0.54mm2.

CADathlon@ICCAD

Sunday, Oct. 29, 2023  08:00 AM – 05:00 PM, In-Person

Gallery Ballroom, The Hyatt Regency San Francisco Downtown SoMa, San Francisco, CA, USA

Welcome to CADathlon@ICCAD

CADathlon 2023 will be held as an in-person event.

The CADathlon is a challenging, all-day, programming competition focusing on practical problems at the forefront of Computer-Aided Design, and Electronic Design Automation in particular. The contest emphasizes the knowledge of algorithmic techniques for CADapplications, problem-solving and programming skills, as well as teamwork.

As the “Olympic games of EDA,” the contest brings together the best and the brightest of the next generation of CAD professionals. It gives academia and the industry a unique perspective on challenging problems and rising stars, and it also helps attract top graduate students to the EDA field.

The contest is open to two-person teams of undergraduate/graduate students specializing in CAD and currently full-time enrolled in a Ph.D. granting institution in any country. Students are selected based on their academic backgrounds and their relevant EDA programming experiences. Partial or full travel grants are provided to qualifying students. CADathlon competition may consist of six problems in the following areas:

  • Circuit Design & Analysis
  • Physical Design & Design for Manufacturability
  • Logic & High-Level Synthesis
  • System Design & Analysis
  • Functional Verification & Testing
  • Future technologies (Bio-EDA, Security, AI, etc.)

More specific information about the problems and relevant research papers will be released on the Internet one week prior to the competition. The writers and judges that construct and review the problems are experts in EDA from both academia and industry. At the contest, students will be given the problem statements and example test data, but they will not have the judges’ test data. Solutions will be judged on correctness and efficiency. Where appropriate, partial credit might be given.

The team that earns the highest score is declared the winner. In addition to handsome trophies, the first place and the second place teams receive cash award, and the contest winners will be announced at the ICCAD conference.

  • Cash Prize
    • First place award: 1500 per person
    • Second place award: 750 per person
  • Participation Request (please submit via Google Form)
  • Important dates
    • October 10th, 2023: Participation request form due
    • October 14th, 2023: Participation acceptance announcement
    • October 22th, 2023: Release of topic/hardware details
    • October 29th, 2023: Contest date
  • Problems and References:
    2022’s / 2019’s / 2018’s /  2017’s / 2016’s / 2015’s / 2014’s / 2013’s / 2012’s / archive
  • Organization Committee
    • Chair: Andy, Yu-Guang Chen, National Central University, Taiwan
    • Co-chair: Jeff (Jun) Zhang, Arizona State University, USA
    • Co-chair: Zahra Ghodsi, Purdue University, USA
    • Co-chair: Pei-Yu (Billy) Lee, Synopsys, Taiwan
  • Contact: andyygchen.nuc at gmail dot com

Global Education Partner:

cadence-logo

SRC-2018

ACM Student Research Competition at ICCAD 2018 (SRC@ICCAD’18)

http://www.cse.cuhk.edu.hk/~byu/img/img-sigda/logo-src.jpg


DEADLINE: September 02, 2018 
Online Submission: https://www.easychair.org/conferences/?conf=srciccad18
 
Sponsored by Microsoft Research, the ACM Student Research Competition is an internationally recognized venue enabling undergraduate and graduate students who are ACM members to:

  • Experience the research world — for many undergraduates this is a first!
  • Share research results and exchange ideas with other students, judges, and conference attendees
  • Rub shoulders with academic and industry luminaries
  • Understand the practical applications of their research
  • Perfect their communication skills
  • Receive prizes and gain recognition from ACM and the greater computing community.

The ACM Special Interest Group on Design Automation (ACM SIGDA) is organizing such an event in conjunction with the International Conference on Computer Aided Design (ICCAD). Authors of accepted submissions will get travel grants up to $500 from ACM/Microsoft and ICCAD registration fee support from SIGDA. The event consists of several rounds, as described at http://src.acm.org/ and http://www.acm.org/student-research-competition, where you can also find more details on student eligibility and timeline.
 
The first-place winner in the graduate category at SRC@ICCAD’17, Meng Li (University of Texas at Austin), also won the First Place in the 2018 ACM SRC Grand Finals!
 
The first-place winner in the undergraduate category at SRC@ICCAD’16, Jennifer Vaccaro (Olin College of Engineering), also won the Second Place in the 2017 ACM SRC Grand Finals: http://www.acm.org/media-center/2017/june/src-2017-grand-finals.
 
Details on abstract submission:
Research projects from all areas of design automation are encouraged. The author submitting the abstract must still be a student at the time the abstract is due. Each submission should be made on the EasyChair submission site. Please include the author’s name, affiliation, postal address, and email address; research advisor’s name; ACM student member number; category (undergraduate or graduate); research title; and an extended abstract (maximum 2 pages or 800 words) containing the following sections:

  • Problem and Motivation: This section should clearly state the problem being addressed and explain the reasons for seeking a solution to this problem.
  • Background and Related Work: This section should describe the specialized (but pertinent) background necessary to appreciate the work. Include references to the literature where appropriate, and briefly explain where your work departs from that done by others. Reference lists do not count towards the limit on the length of the abstract.
  • Approach and Uniqueness: This section should describe your approach in attacking the problem and should clearly state how your approach is novel.
  • Results and Contributions: This section should clearly show how the results of your work contribute to computer science and should explain the significance of those results. Include a separate paragraph (maximum of 100 words) for possible publication in the conference proceedings that serves as a succinct description of the project.
  • Single paper summaries (or just cut & paste versions of published papers) are inappropriate for the ACM SRC. Submissions should include at least one year worth of research contributions, but not subsuming an entire doctoral thesis load.

Note that this event is different than other ACM/SIGDA sponsored or supported events at DAC or ICCAD: YSSP brings together seniors and 1st year graduate students at DAC, UBooth features demos from research groups, DASS allows graduate students to get up to speed on lectures on design automation, while the PhD Forum showcases post-proposal PhD research at DAC and the CADathlon allows graduate students to compete in a programming contest at ICCAD.
The ACM Student Research Competition allows both graduate and undergraduate students to discuss their research with student peers, as well as academic and industry researchers, in an informal setting, while enabling them to attend DAC and compete with other ACM SRC winners from other computing areas in the ACM Grand Finals. Travel grant recipients cannot receive travel support from any other ICCAD or ACM/SIGDA sponsored program.
This year we plan to reserve as many as 5 poster session spots for undergraduate attendees to encourage their continuous investigation in design automation field. The exact number is subject to the total undergraduates submissions as well as the quality of the works.
 
Online Submission – EasyChair:
https://www.easychair.org/conferences/?conf=srciccad18
 
Important dates:

  • Abstract submission deadline: 11:59pm, PST, September 02, 2018
  • Acceptance notification: September 17, 2018
  • Poster session: November 5, 2018 from 11:30am–1:30pm, Private Dinning Room
  • Presentation session: November 5, 2018 from 6:45pm–8:30pm, Saint Tropez Room
  • Award winners announced at ACM SIGDA Dinner: November 6, 2018, from 6:30pm
  • Grand Finals winners honored at ACM Awards Banquet: June 2019 (Estimated)

 
Requirement:
Students submitting and presenting their work at SRC@ICCAD’18 are required to be members of both ACM and ACM SIGDA.
 
Organizers:
Cheng Zhuo (Zhejiang University, China)
Bei Yu (The Chinese University of Hong Kong, Hong Kong)