Federal University of S. Catarina
Advanced Compact MOSFET model
This work presents the ACM (Advanced Compact MOSFET) model. Developed at the Integrated Circuits Laboratory of the Federal University of Santa Catarina (http://www.eel.ufsc.br/lci), it is a physical, charge-based model suitable for the analysis and design of integrated circuits. We will also present examples of the advantages of the ACM model and a simplified methodology for extracting its parameters.
The ACM model is implemented in a mixed-mode simulator (SMASH from Dolphin Integration - http://www.dolphin.fr) and, unlike most models in common use, has very simple equations and a small number of physics-based parameters. These characteristics, combined with its accuracy in representing the static and dynamic behavior of MOS transistors, make it suitable for both educational and industrial use.
Another topic of our presentation will be the extraction of the parameters of the ACM model. To help those planning to use our model, we developed three simple, low-cost methods:
1. Mapping of BSIM parameters to ACM parameters: we convert some BSIM 3 parameters directly to ACM parameters or, where that is not possible, use analytical expressions to extract them;
2. Experimental determination of the ACM parameters: we have developed a set of very simple circuits, which require only conventional laboratory equipment, to determine the ACM parameters;
3. Determination of ACM parameters from simulation: using the same circuits as in the second method, we can extract the ACM parameters by simulation.
Finally, we will present simulations with ACM and with other models; a comparison of the results will show the advantages of our model.
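To give a feel for how compact a charge-based formulation can be, the sketch below evaluates a drain current from the unified current-control relation commonly published for ACM, VP - V = phi_t * (sqrt(1+i) - 2 + ln(sqrt(1+i) - 1)), with I_D = I_S * (i_f - i_r). The first-order pinch-off approximation VP = (VG - VT0)/n, the parameter values, and all function names are assumptions made for this illustration; none of them is taken from the SMASH implementation.

```python
import math

PHI_T = 0.0259  # thermal voltage near 300 K, in volts (assumed value)

def uicm(i):
    """Normalized (VP - V)/phi_t as a function of the inversion level i."""
    root = math.sqrt(1.0 + i)
    return root - 2.0 + math.log(root - 1.0)

def inversion_level(vp_minus_v):
    """Invert uicm() numerically by bisection on a logarithmic scale."""
    target = vp_minus_v / PHI_T
    lo, hi = 1e-12, 1e8
    for _ in range(200):
        mid = math.sqrt(lo * hi)  # geometric midpoint of the bracket
        if uicm(mid) < target:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

def drain_current(vg, vs, vd, vt0=0.5, n=1.3, i_spec=1e-6):
    """Drain current I_D = I_S * (i_f - i_r), voltages referred to the bulk.

    vt0, n and i_spec are illustrative values, not extracted parameters.
    """
    vp = (vg - vt0) / n  # first-order pinch-off voltage approximation
    i_f = inversion_level(vp - vs)  # forward (source-side) inversion level
    i_r = inversion_level(vp - vd)  # reverse (drain-side) inversion level
    return i_spec * (i_f - i_r)
```

A call such as drain_current(1.0, 0.0, 1.0) returns the saturation current for the assumed parameters, and the same expression covers weak and strong inversion without switching equations.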
UFRGS-Universidade Federal do Rio Grande do Sul
Path Search
This work addresses the problems of control and running time in VLSI routing by demonstrating two new algorithms and an accurate cost model. Net-by-net routing is still a very important technique for making connections in VLSI circuits. Maze routing algorithms used for this purpose correspond to shortest-path searches derived from basic BFS or from A*, with many dedicated improvements. BFS searches without costs are faster because, although they visit larger areas, the time per visit is constant, whereas searches using A* and variable costs per arc require logarithmic time per visit. The latter are more efficient for global routing, which has to consider a smaller graph, accounting for routing distribution. We will demonstrate LCS*, a new algorithm [Johann et al. 2000, LNAI-1952] that is the first bidirectional heuristic search method to beat A* in average number of visited nodes and running time. It is shown that the cost model strongly affects interconnect quality and the running time of heuristic search algorithms, and that traditional approaches cannot distinguish different search goals. Traditionally, a single cost value is used to model obstructions (variable) and path length (a constant positive factor). We propose a novel cost model that distinguishes the congestion, performance, and resource costs, and allows greater control of the resulting net quality through search parameters, while preventing the algorithm from visiting too many nodes. By adjusting these parameters, nets that have not achieved the desired performance can be re-routed, manually or automatically, under a new costing scheme that favors performance, while the other nets keep saving resources for them.
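The effect of separating cost components can be illustrated with a plain A* maze router. The sketch below is a generic textbook A* on a grid, not LCS*, and its two weights (w_len for path length, w_cong for congestion) are a simplified stand-in, assumed for this example, for the congestion/performance/resource split described above. Cells hold a non-negative congestion value, or None for an obstacle.

```python
import heapq

def astar_route(congestion, start, goal, w_len=1.0, w_cong=0.0):
    """Return (path, cost, visited_count) for one two-pin net on a grid."""
    rows, cols = len(congestion), len(congestion[0])

    def heuristic(cell):
        # Manhattan distance times the minimum step cost: admissible
        # because congestion values are assumed to be non-negative.
        return w_len * (abs(cell[0] - goal[0]) + abs(cell[1] - goal[1]))

    best = {start: 0.0}
    parent = {start: None}
    heap = [(heuristic(start), 0.0, start)]
    visited = 0
    while heap:
        _, g, node = heapq.heappop(heap)
        if g > best[node]:
            continue  # stale queue entry
        visited += 1
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g, visited
        r, c = node
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if congestion[nr][nc] is None:
                continue  # obstacle
            new_g = g + w_len + w_cong * congestion[nr][nc]
            if new_g < best.get(nxt, float("inf")):
                best[nxt] = new_g
                parent[nxt] = node
                heapq.heappush(heap, (new_g + heuristic(nxt), new_g, nxt))
    return None, float("inf"), visited

# A congested column in the middle of a 3x3 region.
GRID = [[0, 5, 0],
        [0, 5, 0],
        [0, 0, 0]]
```

With w_cong=0 the net from (0, 0) to (0, 2) goes straight through the congested cell; with w_cong=1 the same search detours around it, which is the kind of per-net control the search parameters are meant to give.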
Besides control of individual nets, detailed routing of large areas must also be handled successfully. The crosspoint assignment problem that arises at the interface of global cells in area routing limits the efficiency of the detailed routing algorithm in terms of area and runtime. We will demonstrate LEGAL, a new greedy algorithm that generalizes channel routing algorithms. LEGAL can process the detailed routing of a whole area simultaneously in almost linear time, given that global routing decisions have already been made. The greedy technique exploits the fact that only local decisions are left to the detailed routing step, so there is little chance of converging into a conflict situation. A simple implementation restricted to 2-pin nets achieves almost the same routing results as a highly optimized maze router system, but in orders of magnitude less time.
University of Michigan
Satisfiability Solver
Boolean satisfiability techniques have been used successfully in a variety of EDA applications such as formal verification, testing, and timing analysis, just to name a few. A number of significant enhancements in both algorithm and implementation such as conflict analysis, recursive learning and different decision heuristics have contributed to the recent success of SAT approaches, making successful SAT solvers valuable building blocks for CAD tool designers.
Despite this, there are two important factors we need to consider when applying SAT to EDA problems. First, in many cases we need to solve a set of related SAT problems to complete a task. Second, while there exist methods to map application-specific problems to the CNF form used by SAT solvers, this conversion process is sometimes inefficient and increases the complexity of the problem at hand by introducing auxiliary variables and constraints.
We introduce SATIRE (SATisfiability Incremental Reasoning Engine), a new satisfiability solver that is particularly suited to verification and optimization problems in EDA. It has the ability to incrementally solve related problems and reuse information learned in previous search runs to solve new similar problems. Such problems typically arise in circuit testing, FPGA routing, microprocessor verification, etc. In each case the related problems are largely identical; only a small portion of the constraints change. The solver also has the ability to remove constraints from the problem.
Another feature of SATIRE is its ability to handle non-CNF constraint types. Many EDA problems can be converted to CNF in a straightforward manner, as simple logical relations usually correspond to a small number of CNF clauses. However, there are useful properties that cannot be conveniently represented by CNF. The presence of such non-scalable properties has made some EDA problems unamenable to solution by SAT. However, sometimes it is not a fundamental mismatch between the problem domain and SAT, but rather an unfavorable choice of representation. There are perfectly reasonable logical relationships, which do not have manageable CNF representations. We introduce a redefinition of SAT that does not specify the form of the constraints to be satisfied. By generalizing our definition of the problem we are able to address a broad range of problem domains more naturally, some of which were not feasible when represented only by CNF constraints.
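A small, well-known example of a relation without a manageable direct CNF form is the cardinality constraint "at most k of these n variables are true". The sketch below compares the naive CNF encoding (one clause per subset of k+1 variables, with no auxiliary variables) against a native check of the same constraint. It only illustrates the representation issue; the function names are invented for this example and say nothing about how SATIRE represents constraints internally.

```python
from itertools import combinations, product
from math import comb

def at_most_k_cnf(variables, k):
    """Naive CNF: in every subset of k+1 variables, at least one is false.

    Literals follow the DIMACS convention: -v means "variable v is false".
    """
    return [tuple(-v for v in subset) for subset in combinations(variables, k + 1)]

def satisfies(cnf, assignment):
    """True if every clause has at least one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in cnf
    )

def at_most_k_native(variables, k, assignment):
    """The same constraint checked directly, without CNF clauses."""
    return sum(1 for v in variables if assignment[v]) <= k

def encodings_agree(n, k):
    """Exhaustively compare both forms on all 2**n assignments."""
    variables = list(range(1, n + 1))
    cnf = at_most_k_cnf(variables, k)
    for values in product([False, True], repeat=n):
        assignment = dict(zip(variables, values))
        if satisfies(cnf, assignment) != at_most_k_native(variables, k, assignment):
            return False
    return True
```

For 20 variables and k = 10 the naive encoding already needs comb(20, 11) clauses, while the native form stays a single constraint.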
In this demo, we will provide experimental evidence showing the effectiveness of these additions to classical satisfiability solvers.
Darmstadt University of Technology
Java-based Design
To keep up with the increasing complexity of IC design, the use of higher abstraction levels is necessary. To deal with those levels of abstraction, tools and methodologies must be built to support the designer in adapting to the new paradigm. The proposed presentation will show a tool and methodology pair that supports distributed collaborative design at a higher level of abstraction on top of the Cave Framework infrastructure [1,2]. Object-oriented concepts are used to deal with complexity [3], taking into account the advantages they have brought to complex software development [4]. A particular tool, named Homero, will also be presented; it uses a pair programming model [5] for collaborative development by groups of designers in different locations. The design flow starts at the conceptual level, where the system is modeled using OO techniques. Collaborative UML diagrams, such as collaboration diagrams, class diagrams, or even use cases, can be used through the Homero infrastructure. From the conceptual level, the model should be refined to include the functional description of the system in the Java language. This refinement may be done by hand, using the Homero pair-programming infrastructure, or using a commercially available CASE tool such as Rational Rose [7]. At this point, an executable model should be created for functional simulation purposes. The test bench is written in Java (or in UML, following the same process as the design) and run together with the executable model. After some refinement iterations, the executable model is used as input for HDL model generation, using a tool such as Forge [6]. The generation is based on a set of architectural choices made by the designer, so it separates the functional and architectural specifications, making it easy to compare several architectures based on a single functional description. After architectural experimentation, the generated HDL code may be synthesized using commercially available tools. A codesign-like partition may also be done, selecting design blocks to run as embedded Java code over a hardware, software, or mixed implementation of a Java Virtual Machine.
[1] INDRUSIAK, L.S.; REIS, R.A.L. A WWW approach for EDA tool integration. In: X BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 10., 1997, Gramado. Proceedings... Porto Alegre: CPGCC UFRGS, 1997.
[2] INDRUSIAK, L.S.; REIS, R.A.L. A Case Study for the Cave Project. In: BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 11., 1998, Armacao de Buzios. Proceedings... Los Alamitos: IEEE Computer Society Press, 1998.
[3] INDRUSIAK, L.S.; REIS, R.A.L. From a Hyperdocument-Centric to an Object-Oriented Approach for the Cave Project. In: BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 13., 2000, Manaus. Proceedings... Los Alamitos: IEEE Computer Society Press, 2000.
[4] PRESSMAN, R. S. Software Engineering: A Practitioner’s Approach. McGraw-Hill, 1996.
[5] WILLIAMS, L.; KESSLER, R.R. All I Really Need to Know about Pair Programming I Learned in Kindergarten. Communications of the ACM, Vol. 43, No. 5, 2000, p. 108-114.
[6] DAVIS, D. et al. Forge-J: High Performance Hardware from Java. http://www.xilinx.com/forge/forge_J_wp.htm
[7] RATIONAL ROSE. http://www.rational.com
University of Pittsburgh
Chatoyant modeling system
We have developed Chatoyant to support the modeling and simulation of micro-opto-electro-mechanical systems. This work will be demonstrated at this year's University Booth. Chatoyant is built upon the object-oriented simulation engine Ptolemy. Chatoyant's component models are written in C++, with sets of user-defined parameters for the characteristics of each module instance. Chatoyant performs static simulations to analyze such effects as mechanical tolerancing, power loss, insertion loss, and crosstalk, while dynamic simulations analyze data streams with techniques such as noise analysis and BER calculation.
Our work has been motivated by high-speed optical interconnect and switching systems based on OMEM (optical micro-electrical-mechanical) devices, which are a critical backbone technology for next-generation computer networks and systems. However, as with many new technologies, design methods and tools for these systems are currently ad hoc. Designers typically use combinations of tools that were built for the individual domains of optics, mechanics, and electronics, with little integration and with system-level analysis based only on the experience of the designer or simply on assumptions about the ensemble behavior of the components. To support the design of these multi-domain systems in a practical manner, computer-aided design tools must be capable of modeling electronics, electrostatics, mechanics, guided-wave optics, and free-space optics. The design tools must directly support the interfaces between models in all these domains and characterize the behavior of the resulting system in a single integrated environment. This is what we achieve in Chatoyant.
Our system-level models consist of libraries of components and methodologies for signal propagation between components. A critical notion is the distinction between device models, which are domain specific and characterize the underlying physics of a device, and component models, which we use to characterize system-level behavior. Component models can be supported at many levels of abstraction and can be derived from multiple sources, including analytical data, device model simulations, or empirical measurements. For example, in the current implementation of Chatoyant, we have implemented models for vertical cavity surface emitting lasers (VCSELs) based on both empirical and analytical data, and developed electrical models of CMOS devices and mechanical models of MEMS cantilevers using piece-wise linear evaluation of device characteristics. These modeling techniques allow us to trade off the accuracy of our models against the performance of our simulation and analysis tools, which is essential to provide an interactive design environment.
In this demonstration of Chatoyant, we will focus on our newest research:
- new accurate and efficient diffraction models
- piece-wise linear modeling of electrical and mechanical components
- new Java3D GUI
University of California, San Diego
Mixed-Signal Test
The proposed poster will describe a new high level approach
to the mixed-signal test synthesis problem based on
propagation of the test related information across the
chip boundaries.
First, fundamentals of signal propagation from a test
perspective will be discussed. Complications due to the
analog domain, such as parameter tolerances, noise and
non-linearity, and how one can approach these challenges
through appropriate modeling of signal components and
modules will be presented. Following the discussion of
the basics of signal propagation, how one can utilize
such a propagation scheme for test generation, fault
simulation, and testability analysis will be presented.
Finally, a set of experimental results on a high frequency
signal down-conversion system that includes a filter, a
mixer, an amplifier and a data converter will be presented.
A list of published work in line with this proposal can be
found at:
http://www-cse.ucsd.edu/users/sozev/suleresume.html
As the industry is moving towards higher levels of
integration, traditional test methods of isolated basic
block testing prove to be insufficient to satisfy
stringent market requirements. The gap between
traditional test methods and application of these methods
at the system level can only be bridged through new
hierarchical approaches. This poster will present such
a high level approach to the mixed-signal test problem.
University of Southern California
Apollo: Adaptive Power Optimization
The Apollo project aims at significantly reducing power dissipation of next-generation mobile DoD computing and communication systems by means of operating system-directed power management, power-aware software compilation, and system-level synthesis and optimization of the integrated hardware/software platform subject to performance and quality-of-service constraints.
UCLA
PLAmap and RASP_SYN, Technology Mapping Packages
RASP_SYN, an LUT-based FPGA technology mapping package, is the synthesis core of the UCLA RASP System developed at the UCLA VLSI CAD Lab. This release includes the following mapping algorithms:
DAG_Map (depth minimization) version 1.0
FlowMap (depth optimal) version 2.1
TurboMap (optimal mapping with retiming) version 1.0
FlowMap-r (area-delay tradeoff) version 2.0
FlowSYN (FPGA resynthesis) version 2.0
CutMap (simultaneous area delay minimization) version 1.2
ZMap (simultaneous area delay minimization) version 1.0
EMB_Pack (mapping for FPGAs with embedded memory blocks for area minimization while maintaining the delay) version 1.0
EMB_PreMap (the pre-mapping processing version for EMB_Pack) version 1.0
HeteroMap (delay optimal mapping for heterogeneous FPGAs) version 1.0
BinaryHM and CN-HM (delay-oriented mapping for heterogeneous FPGAs with bounded resources) version 1.0
This year we have three new additions to the RASP package:
1. Performance-Driven Mapping for CPLDs
2. Simultaneous Logic Decomposition and Technology Mapping for FPGAs
3. Performance-Driven Multilevel Clustering with Application
to Hierarchical FPGA Mapping
Please refer to http://ballade.cs.ucla.edu/~cong/publications.html
This demonstration will be a Windows PowerPoint
presentation of our research on the technology mapping
problem for CPLD architectures. The research is summarized
as follows.
CPLDs are based on PLA-style logic cells, which are also
referred to as p-term blocks or simply PLAs. Logic synthesis
and technology mapping for CPLDs are considered more
difficult than for FPGAs, due to wide fan-ins,
input/p-term sharing, and two-level logic minimization.
We implemented a performance-driven mapping algorithm called
PLAmap. The notation (k,m,p)-PLA specifies a PLA with at most k
inputs, at most m product terms, and at most p outputs.
For CPLD structures consisting of small PLAs such as
(10,12,4)-PLAs, we compared our results with those of TEMPLA,
a CPLD mapper that targets area minimization.
Next, we modified our program to take into account the
structural constraints of the p-term block in one commercial
CPLD device, Altera's MAX 7000B, whose PLA-style LAB can be
considered a special (36,80,16)-PLA with structural
constraints. We also explored two ways of trading off
area against delay using threshold control strategies.
Our algorithmic flow consists of three stages.
Labeling: determines each node's logic depth (stored as the
label of the node) and provides clustering information for
the subsequent mapping step. To minimize depth in the final
PLA network, we label as if the target structure only
consists of (k,m,1)-PLAs, i.e., single-output PLAs, so that
we can form a PLA ``cluster'' (the chunk of logic covered by
this (k,m,1)-PLA) as deep as possible.
Mapping: generates (k,m,p)-PLAs based on the label
information of each node in the network. Since the logic
depth of the final network has already been decided,
the goal of the mapping stage is to minimize area without
affecting the logic depth of the network. The unique
feature of our mapping stage is that it generates (k,m,p)-PLAs
(p > 1) directly by using cluster merging, slack time
relaxation, or node duplication. It efficiently takes
advantage of the attributes of the clusters formed in the
labeling stage and tries to share the inputs and pterms as
much as possible between different outputs in the same
PLA.
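The (k,m,p) bound and the benefit of sharing inputs and p-terms when clusters are merged can be made concrete with a small data structure. This is only a sketch of the feasibility test implied by the (k,m,p)-PLA definition; the class, its fields, and the string labels for product terms are assumptions for this illustration and do not reproduce PLAmap's cluster merging, slack relaxation, or node duplication.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PLA:
    inputs: frozenset   # names of input signals
    pterms: frozenset   # product terms, identified by hashable labels
    outputs: frozenset  # names of output signals

    def fits(self, k, m, p):
        """True if this block is a legal (k,m,p)-PLA."""
        return (len(self.inputs) <= k
                and len(self.pterms) <= m
                and len(self.outputs) <= p)

def merge(a, b):
    """Merge two clusters; shared inputs and p-terms are counted once."""
    return PLA(a.inputs | b.inputs, a.pterms | b.pterms, a.outputs | b.outputs)

# Two single-output clusters that share inputs b, c and the p-term "c".
F = PLA(frozenset("abc"), frozenset({"ab", "c"}), frozenset({"f"}))
G = PLA(frozenset("bcd"), frozenset({"c", "bd"}), frozenset({"g"}))
```

Here merge(F, G) needs only 4 inputs and 3 product terms for 2 outputs, so it fits a (4,3,2)-PLA even though the two clusters together list 6 inputs and 4 product terms.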
Packing: reduces mapping area further without sacrificing
depth. Two heuristic operations are developed. One is
called PLA collapsing. Any PLA that can be collapsed into
all of its fanout PLAs can be eliminated, provided that all
PLAs remain as (k,m,p)-PLAs after the collapsing. The
second operation is maximum shared-input bin packing.
Experimental results on various MCNC benchmarks show that
overall TEMPLA uses 8 to 11% less area at the cost of 96
to 106% more mapping depth, and MAX+PLUS II uses 12% less
area but 58% more delay compared with our mapper.
This work was presented at ACM/SIGDA International Symposium
on Field Programmable Gate Arrays, 2001. The paper (pdf)
can be retrieved at the publication list of Prof. Cong at
http://cadlab.cs.ucla.edu (item 109).
Politecnico di Torino
RTL power estimation
RTLEst is a power estimation tool for structural RTL VHDL
descriptions developed by the EDA Group of Politecnico di
Torino within ESPRIT project n. 26796 "PEOPLE".
The demo will illustrate the major capabilities and
features of the tool.
Compared to existing RTL power estimators, the tool has
two distinctive features:
- Power estimation of the RTL components (i.e., macros)
is obtained through accurate power macromodels, which are
built with a highly robust characterization procedure.
Macromodels for the components are built once and for
all, and are stored in a cache of models for later reuse.
- Pre-characterization of the RTL components is not
required. The tool includes automatic characterization
capabilities, which allow easy migration to different
RTL libraries and technologies without the intervention
of the tool developer.
RTLEst guarantees faster estimation than existing tools,
because it relies on RT-level annotation of the switching
activity. Gate-level simulation is only required for power
characterization of the RTL components. Therefore, the tool
is particularly suited for design exploration of behavioral
synthesis alternatives, where changes in the design are
incremental.
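The division of work described above (characterize a macro once, then estimate from RT-level switching activity) can be sketched as follows. The linear model P = c0 + c1 * activity, the least-squares fit, and every name in the code are assumptions chosen for illustration; the actual form of RTLEst's macromodels and its characterization procedure are not described here.

```python
def fit_linear(activities, powers):
    """Least-squares fit of power = c0 + c1 * activity."""
    n = len(activities)
    mean_a = sum(activities) / n
    mean_p = sum(powers) / n
    sxx = sum((a - mean_a) ** 2 for a in activities)
    sxy = sum((a - mean_a) * (p - mean_p) for a, p in zip(activities, powers))
    c1 = sxy / sxx
    return mean_p - c1 * mean_a, c1

class MacromodelCache:
    """Builds a macromodel on first use and reuses it afterwards."""

    def __init__(self, characterize):
        # characterize(component) -> (activities, powers); in a real flow
        # this is where the costly gate-level simulation would happen.
        self._characterize = characterize
        self._models = {}
        self.characterizations = 0

    def model_for(self, component):
        if component not in self._models:
            activities, powers = self._characterize(component)
            self._models[component] = fit_linear(activities, powers)
            self.characterizations += 1
        return self._models[component]

    def estimate(self, component, activity):
        """Power estimate from RT-level switching activity only."""
        c0, c1 = self.model_for(component)
        return c0 + c1 * activity

def fake_characterization(component):
    """Stand-in for gate-level simulation: synthetic data on a known line."""
    activities = [0.1, 0.2, 0.4, 0.5]
    return activities, [0.5 + 2.0 * a for a in activities]
```

Repeated calls to estimate() for the same component hit the cache, so incremental design changes do not trigger new characterization runs.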
The tool comes with a user-friendly TCL/TK-based GUI, as
well as a batch-mode interface. It supports two estimation
modes: automatic and detailed. In automatic mode, the user
is allowed to control a few estimation parameters, mainly
regarding the specification of the model cache, the RTL
and technology libraries, and the stimuli file (testbench
or textual). Besides power estimation of the entire design,
the tool also provides the user with a macro-by-macro power
breakdown. In detailed mode, the user is presented with a
window that allows him/her to control all the estimation
details. These include model selection and estimation
effort selection. Obviously, estimation carried out in
detailed mode yields more accurate results than in
automatic mode. Tool capabilities are currently being
enhanced to allow the processing of cycle-accurate RTL
descriptions, i.e., descriptions containing synthetic
operators and behavioral constructs. Improvements that
will enable the tool to account for dynamic effects (e.g.,
glitching) during estimation are also being considered.
University of Michigan
Two Dimensional Position Detection
A hybrid two-dimensional position sensing system is designed for mouse applications. The system measures the acceleration of hand movements, which is converted into two-dimensional location coordinates. The system consists of four major components: (1) MEMS accelerometers, (2) CMOS analog read-out circuitry, (3) an acceleration magnitude extraction module, and (4) a 16-bit RISC microprocessor. Mechanical and analog circuit simulations show that the designed padless mouse system can detect accelerations as small as 5.3 mg and operate at up to 18 MHz. We will demonstrate the functionality of each module using either mechanical or electrical simulation tools.
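Converting acceleration into location coordinates amounts to integrating twice over time. The sketch below does this with the trapezoidal rule on sampled data; the sampling interval, the function names, and the assumption of zero initial velocity and position are choices made for this example, not details of the system above.

```python
def integrate_trapezoid(samples, dt, initial=0.0):
    """Cumulative trapezoidal integral of uniformly sampled data."""
    result = [initial]
    for previous, current in zip(samples, samples[1:]):
        result.append(result[-1] + 0.5 * (previous + current) * dt)
    return result

def track_position(accel_x, accel_y, dt):
    """Integrate 2-D acceleration twice to get (x, y) coordinates.

    Starts from rest at the origin (an assumption of this sketch).
    """
    velocity_x = integrate_trapezoid(accel_x, dt)
    velocity_y = integrate_trapezoid(accel_y, dt)
    x = integrate_trapezoid(velocity_x, dt)
    y = integrate_trapezoid(velocity_y, dt)
    return list(zip(x, y))
```

Because any constant offset in the measured acceleration grows quadratically in the position after two integrations, the resolution of the accelerometer and read-out chain matters directly for tracking quality.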
University of Wisconsin - Madison
Lsim/p: power simulation
As semiconductor technology scales down, the leakage power
will soon become comparable to the dynamic power. To reduce both
dynamic and leakage power, power gating in addition to
clock gating should be used, because clock gating saves only dynamic power.
Knowledge of the maximum current is needed to design reliable circuits
using power gating. We will explain a method for estimating this current, and
demonstrate a cycle-accurate architecture-level simulation tool
that integrates this current (power) estimation.
University of Wisconsin-Madison
SINO/SPR: RLC net synthesis
The following algorithms will first be presented:
(i) min-area simultaneous shield insertion and net ordering (SINO)
under the given RLC noise bound;
(ii) formula-based interconnect area estimation for optimal SINO
solutions;
and (iii) min-area simultaneous signal and power routing under
the given RLC noise bound.
Then an integrated toolset containing the above algorithms and
SPICE netlist generation will be demonstrated.
Both the algorithms and the demonstration will also be available at
http://eda.ece.wisc.edu/tools.html.
University of Wisconsin-Madison
WebHenry
It is evident that on-chip inductance becomes increasingly
important for interconnect design and verification. We will
first present an efficient inductance extraction model and
a closed-form solution for RLC noise screening. We will then
demonstrate an on-line tool incorporating the above methods.
The online tools can be found at http://eda.ece.wisc.edu/tools.html
NCSU
Distributed Networked Design
We introduce a `universal client (OmniFlow)' whose GUI can be readily
configured by the user to invoke any number of applications, concurrently
or sequentially, anywhere on the network. The design and the implementation
of the client are based on the principles of taskflow-oriented programming,
whereby we merge concepts from structured programming, hardware description,
and mark-up languages. A mark-up language such as XML supports a well-defined
schema that captures the decomposition of a program into a hierarchy of
tasks, each representing an instance of a blackbox or a whitebox software
component. The HDL-like input/output port definitions capture data-task-data
dependencies. A highly interactive hierarchical GUI, rendered from the
hierarchical taskflow descriptions in extended XML, supports structured
programming language constructs to control sequences of task
synchronization, execution, repetition, and abort.
Experimental evaluations of the prototype, with up to 9150 tasks and a
longest path of 1600 tasks, demonstrate the scalability of the environment
and the overall effectiveness of the proposed architecture for a number of
networked design and computing projects.
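How port definitions in an XML task description give rise to data-task-data dependencies, an execution order, and a longest path can be sketched in a few lines. The XML element and attribute names below are invented for this illustration and are not the OmniFlow schema; the sketch also ignores hierarchy and the control constructs mentioned above.

```python
import xml.etree.ElementTree as ET
from graphlib import TopologicalSorter

# Hypothetical flat taskflow; <in>/<out> play the role of HDL-like ports.
TASKFLOW_XML = """
<taskflow name="demo">
  <task name="synthesize"><in>design.v</in><out>netlist.v</out></task>
  <task name="place"><in>netlist.v</in><out>placed.def</out></task>
  <task name="report"><in>netlist.v</in><out>area.rpt</out></task>
  <task name="route"><in>placed.def</in><out>routed.def</out></task>
</taskflow>
"""

def task_dependencies(xml_text):
    """Map each task to the set of tasks producing the data it reads."""
    root = ET.fromstring(xml_text)
    producer = {}  # data item -> task that writes it
    reads = {}     # task -> data items it reads
    for task in root.iter("task"):
        name = task.get("name")
        reads[name] = [port.text for port in task.findall("in")]
        for port in task.findall("out"):
            producer[port.text] = name
    return {
        name: {producer[item] for item in items if item in producer}
        for name, items in reads.items()
    }

def execution_order(dependencies):
    """One valid sequential order; independent tasks could run concurrently."""
    return list(TopologicalSorter(dependencies).static_order())

def longest_path(dependencies):
    """Number of tasks on the longest dependency chain."""
    depth = {}
    for name in execution_order(dependencies):
        depth[name] = 1 + max((depth[p] for p in dependencies[name]), default=0)
    return max(depth.values(), default=0)
```

In this small example "place" and "report" depend only on "synthesize", so a client could dispatch them to different hosts at the same time.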
Polytechnic Institute of Turin - Italy
HW/SW Design
We will start from a Simulink description of a PID controller.
This description does not use the standard Simulink block set but a customized one, which allows
additional data to be supplied, such as the implementation (which can be mainly software, hardware, or none), several signal attributes,
and so on. We will perform a full Simulink simulation of this system and, after tuning the parameters and partitioning the
system implementation, show how we can predict some performance indexes, such as the area
occupation and the maximum sampling frequency of the hardware subsystem, and the code length and speed of the
software subsystem.
Once the functionality and the performance obtained by simulation and estimation are satisfactory, we will generate the VHDL/C code
for our target system. The code will be compiled and the resulting binary code uploaded to the board (which carries a DSP and an FPGA IC), which is linked
to a brushless motor by a power driver.
At this point we will be ready to see the system working.
We will show how changing parameter values affects both the simulation results and the behaviour of the real system.
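The effect of parameter changes on a simulated loop can be previewed with a textbook discrete PID controller. The sketch below drives a first-order plant (tau * dy/dt = u - y) integrated with forward Euler; the plant, the gains, and all names are assumptions made for this example and are unrelated to the Simulink block set or the motor used in the demo.

```python
class PID:
    """Textbook discrete PID controller with a fixed sampling period."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.previous_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        if self.previous_error is None:
            derivative = 0.0  # no derivative kick on the first sample
        else:
            derivative = (error - self.previous_error) / self.dt
        self.previous_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(kp, ki, kd, setpoint=1.0, tau=1.0, dt=0.01, steps=2000):
    """Closed-loop step response of the assumed first-order plant."""
    controller = PID(kp, ki, kd, dt)
    y = 0.0
    trace = []
    for _ in range(steps):
        u = controller.update(setpoint, y)
        y += dt * (u - y) / tau  # forward-Euler step of the plant
        trace.append(y)
    return trace
```

With an integral term, simulate(2.0, 1.0, 0.05) settles at the setpoint, whereas the proportional-only loop simulate(0.5, 0.0, 0.0) is left with a steady-state error, settling near kp / (1 + kp).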
UCLA
TRIO/IPEM
Our demo consists of two parts -- TRIO and IPEM. We
outline our plan for both demos below.
TRIO
----
o demonstrate the usage of the tool to synthesize
topologies
o usage of the tool to perform interconnect optimization
through buffer insertion, wire sizing, device sizing
etc.
o show the usage of a graphical interface for easy
usage of the tool and viewing results
IPEM
---
o usage of IPEM to plan and estimate interconnect
performance based on repeater insertion, optimal wire
sizing etc.
o interface functions to IPEM
o GUI for IPEM
We will demonstrate the MCM layout system, including the
performance driven multi-layer global router, the crosstalk
noise constrained pseudo pin assignment, the multi-layer
gridless detailed router and the post-layout layer
assignment. We will both present results and
run some examples on the spot.
Norwegian University of Science and Technology
STOREQ CAD tool
The planned demonstration is described in the PDF file at
http://www.fysel.ntnu.no/~pgk/STOREQ-Users-Manual.pdf.
It will show the DAC attendees how our tool guides the
designer towards achieving an optimized end product
through a focus on data transfer and storage at the highest
system levels.
Currently STOREQ is a stand-alone tool, but work is under
way to interface it with the ATOMIUM tool being developed
at IMEC in Belgium.
For more details on the methodology, see paper 23.4 and
its references.
University of Maryland at Baltimore County
Delay Faults Detection
A delay-fault testing strategy based on the analysis of power supply transient signals is presented. The method is an extension to a Go/No-Go device testing method called Transient Signal Analysis (TSA). TSA detects defects through the analysis of a set of power supply transient waveforms in the time or frequency domain, e.g., Fourier Phase components. A recent extension to TSA demonstrates a correlation between the Fourier Phase components and path delays in defect-free devices. The method proposed here is able to track increases in delay due to resistive shorting and open defects using a similar technique. In particular, we demonstrate that a delay defective device can be distinguished from a defect-free device through an anomaly in the Fourier Phase correlation profile of the device. Simulation experiment results show that the method is additionally capable of predicting the magnitude of the additional delay in some cases.
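The link between Fourier phase and delay rests on the shift property of the discrete Fourier transform: delaying a periodic waveform by d samples changes the phase of component k by -2*pi*k*d/N. The sketch below checks this property on a synthetic pulse. It illustrates the underlying signal-processing fact only; the waveform, the choice of component, and the function names are assumptions, and this is not the TSA correlation procedure itself.

```python
import cmath
import math

def dft_component(signal, k):
    """Component k of the discrete Fourier transform of a real signal."""
    n = len(signal)
    return sum(x * cmath.exp(-2j * math.pi * k * t / n)
               for t, x in enumerate(signal))

def phase_shift(reference, measured, k):
    """Phase of component k of `measured` relative to `reference` (radians)."""
    product = dft_component(measured, k) * dft_component(reference, k).conjugate()
    return cmath.phase(product)

def delay_from_phase(reference, measured, k=1):
    """Delay in samples implied by the phase shift of component k.

    Valid only while the shift stays within (-pi, pi], i.e. without
    phase wrapping.
    """
    n = len(reference)
    return -phase_shift(reference, measured, k) * n / (2.0 * math.pi * k)

# Synthetic transient: a smooth pulse, and the same pulse 3 samples later.
REFERENCE = [math.exp(-((t - 20) / 4.0) ** 2) for t in range(64)]
DELAYED = REFERENCE[-3:] + REFERENCE[:-3]
```

For these two waveforms, delay_from_phase(REFERENCE, DELAYED) recovers the 3-sample shift from the phase of a single Fourier component.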
University of Tokyo
Circuit Transformation
In this demonstration, we show a new methodology for integrating multiple circuit transformations and routing processes.
More specifically, this demonstration shows ways to utilize multiple choices of circuit transformations in routing processes.
First, we introduce a new logic representation that implements all possible wire reconnections implicitly by enhancing global flow optimization techniques.
Then we present two approaches for performing routing and wire reconnection simultaneously: an exact approach and a practical approach using commercial P/R tools.
Since our methods take multiple circuit transformations into account during the routing phase, where accurate physical information is available, we can obtain better results than conventional routing tools.
In addition, we can succeed in routing even when other routers, such as those based on rip-up and reroute, fail. We built a prototype system that implements the methods, and preliminary results will be demonstrated.
University of Maryland at Baltimore County
3G SOC Design and Test
With recent advancements in wireless technologies such as
Bluetooth and 802.11, the design and testing issues in
implementations of these protocols on a single piece of
silicon are challenging traditional methods. Important
issues related to a single-chip implementation include the
capability of the chip to reconfigure itself to use alternate
wireless communication protocols, interference and crosstalk
between modules, interoperability issues, and the control and
observability of nodes in the embedded cores.
University of California, Irvine
EXPRESSION
The EXPRESSION project, which aims to demonstrate a working prototype of a framework that generates a software toolkit consisting of a retargetable compiler and a cycle-accurate simulator, will be demonstrated at the booth. This will include both a demonstration of the software and the use of posters to explain the techniques.
University of California, Irvine
SPARK: High-Level Synthesis
The presentation will focus on showing how parallelizing compiler techniques can be applied to improve the results of high-level synthesis of control intensive designs. Some of the aspects covered will be:
* aggressive code motion techniques and their
- effects on performance
- effects on area and controller complexity
* loop transformations for high-level synthesis
* scheduling under system-level timing constraints
* control synthesis and optimization
* the synthesis system flow
- the modular and configurable aspects of Spark
We will show the results of such transformations on significant segments of industrial-strength benchmarks (e.g., MPEG, ADPCM).
Technical University of Munich
WiCkeD 3
We intend to perform a software demonstration of the tool "WiCkeD 3".
We will provide one or two circuits to exemplify the procedures of
design, modeling and yield optimization. Conference attendees will have
the chance to discuss tool and algorithms with the developers. Some of
the algorithms implemented in "WiCkeD 3" will be explained on a poster
at the booth.
Analog and mixed-signal (AMS) circuits play an important role in modern
applications such as communication systems, multimedia, memory, and
automotive electronics. Since design of analog devices is less automated than
design of digital circuits, their development takes a large part of
overall design time and cost of mixed-signal products. Performance
capabilities and parametric yield of mixed-signal circuits are often
limited by their analog components. Besides simulation and layout,
sizing is the most time-consuming and tedious task. By choosing values
of designable parameters like transistor geometries, the designer tries
to maximize the parametric yield of the circuit considering operating
conditions like temperature range, and random process fluctuations like
oxide thickness variation or Vth mismatch. Due to the complexity of
this problem and the growing influence of process fluctuations with
shrinking feature sizes, manual sizing has become a bottleneck for
design time and design quality. Most tools that attempt automatic
sizing lack the designer's intuitive, structural knowledge of the
circuit's behaviour and often generate sizings that are technically
unreasonable. "WiCkeD 3", however, solves this problem by rigorously
taking into account structural constraints, which vastly improves the
quality of models and optimization results.
The tool "WiCkeD 3" is a framework for scientific research. It serves
as a common platform for a set of different research projects at our
institute including sizing, modeling, tolerance analysis and test
design. To the researcher, the tool offers a stable implementation of
simulator interfaces, database access to previously performed
simulations, simultaneous simulation on a network of hosts, and
persistent storage of algorithm-specific data. Its modular structure
and powerful script interfaces make it possible to quickly implement
and evaluate new algorithms that would take months to develop from
scratch. To the
designer, "WiCkeD 3" offers a graphical user interface to a large set of
state-of-the-art algorithms for analysis and sizing of analog circuits.
Analyses provided are sensitivities, performance and parameter
dependencies, Monte-Carlo based yield estimation including operating
range influence, and deterministic worst-case and yield analysis.
Advanced optimization algorithms for both nominal design and design
centering are included, the most recent one being presented at the
conference (Session 50, Paper 3).
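As an illustration of the Monte-Carlo based yield estimation mentioned above, the following is a minimal sketch and not the WiCkeD implementation. The Gaussian Vth spread, the linear gain model and the 58 dB specification are all invented for the example; a real flow would call a circuit simulator instead of `gain_db`.

```python
import random

def estimate_yield(performance, meets_spec, draw_parameters,
                   n_samples=10000, seed=0):
    """Fraction of random process samples whose performance meets the spec."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_samples):
        params = draw_parameters(rng)
        if meets_spec(performance(params)):
            passed += 1
    return passed / n_samples

# Hypothetical toy circuit: gain falls linearly as Vth rises above nominal.
def draw_vth(rng):
    return rng.gauss(0.5, 0.02)        # Vth in volts, assumed Gaussian

def gain_db(vth):
    return 60.0 - 100.0 * (vth - 0.5)  # made-up performance model

def gain_spec(gain):
    return gain >= 58.0                # passes when Vth <= 0.52 V

yield_estimate = estimate_yield(gain_db, gain_spec, draw_vth)
print(f"estimated parametric yield: {yield_estimate:.3f}")
```

In this toy setup the specification limit lies one standard deviation above the nominal Vth, so the estimate should settle near 84%.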
"WiCkeD 3" is in regular use by industrial partners for design of
amplifiers, filters, digital IO cells and other circuits.
Universitaet Hannover, Germany
Programmable Parallel Multimedia-DSP
Poster presentation of the VLSI design and applications of a second-generation programmable parallel multimedia DSP (HiPAR-DSP 16). In addition, chip samples of the first and second generations will be displayed.
The results of the design project are twofold. First, a number of students worked on C++ modeling, VHDL design, synthesis, physical layout and verification, as well as on the development of DSP software tools, which promoted the education of experts in the field of processor architecture and VLSI design. Second, an ambitious DSP project with various challenges was mastered under university conditions.
Universidad Nacional de Ingenieria
Fast Prototyping
We will feature five of our works, which demonstrate tools, techniques and hardware-based networks.
You can read more about them below:
- MOLEWARE
We start by showing a recent study of Molecular Scale Electronics, then we demo our
Q-Model, a general nanoscale modeling technique which allows us to develop MOLEWARE, a
tool for automating the simulation and specification of the synthesis-based process
steps used to build such systems. Since no spec files are defined yet for this
early technology, we extend some of the known standard specs and implement others.
- HDL/Scan Based Fault Injection and Evaluation Technique for FPGA-Based systems
This section shows an HDL/Scan-Based Fault Injection technique which is implemented
in an automated tool flow driven through a GUI. This flow validates (or rejects), in early
stages of the design process, a System implemented on a specified technology. This technique,
in combination with others, is attracting growing interest from the community because of the issues
involved in Critical Systems in Space and on the Ground in DSM Technologies. This Evaluation
Service can be integrated through a Unilinx Service explained below.
- ProtoFast / JBench Virtual Tester / FPGA2ASIC Migrator
* module I is used to generate independent HDL testbenches from Java in order to allow
validation at the HDL level as well. It first executes and tests the System's Java Model;
then, after the bytecodes have been partitioned and synthesized into hardware and software, the
generated hardware blocks can be tested at the lower HDL level. Distributed Java testbench
verification can be run through a Unilinx Service as explained below.
The same module allows integration with a GUI Architecture Factory which, in combination with
the Java-HDL testbench automation technique, allows us to rapidly explore other design spaces.
* module II generates an SVF file for each design specified by a constraints file (*.ucf), test vectors (*.tv)
generated previously in HDL simulation using the generated testbench, and *.bsd (BSDL) and *.hsd (HSDL) files for
the device/board implementation target. In other words, you can change to any prototyping platform
in seconds without redesign, which is a MUST for Unilinx Multi-Prototyping-Platform Services.
* module III streams Test Vector Databases in Standard Bitstream Formats such as SVF, rather
than running an ATPG at the HDL and EDIF Level. Comparison, Evaluation, and Diagnostic tasks can also be requested.
* module IV (Design Migration from ASIC to FPGA) helps in porting VHDL code from FPGA Designers to other
subsets of VHDL used in specific tool sets (e.g. Alliance Academic tools).
- SuperJDrive 1532
This tool is a modification of the IEEE 1532 Std. JDrive Engine, which is a
JTAG-Based Driver, in order to support Concurrent Programming and Scheduling
(this is the first tool in a family that supports these features)
in a Production Line or any other High-End Remote Test Environment. Here we also
demo a study of a 1532 master driver which is in turn developed as a UNILINX
Service and can work in parallel with a Java-Based SVF Engine.
- Unilinx Network
Unilinx is a Network of Devices/Services (Clusters) together with the automation that connects these Clusters
to a Jini-Hybrid Network (Salutation & Bluetooth). The work gives a proof of concept and implements
a practical remote display system (Services and Networks). This Network, based on the API for Boundary
Scan, the RMI (Remote Method Invocation) API, and the Java-Spaces API, allows Integration of Design
and Fast Prototyping Tools, Hardware and Services with the Internet. Since this plays an important
role in the unification of the hardware and software development processes, we are calling for an initiative
named PROTOWARE, which will bring to Industry concepts such as: Scanlets for Distributed Processing,
Monitoring, Test and Upgrade Design, Remote Virtual Prototyping, Fault Detection and Isolation, and
Field diagnostics, upgrade, service and repair of on-board and embedded products in a distributed and autonomous fashion.
Carnegie Mellon University
SirSim
I will demonstrate the basic capabilities of SirSim by
walking through several simple examples -- and then show
performance on a larger benchmark.
UCLA
Dragon
During the demonstration, a high-quality standard-cell
placement tool called Dragon will be shown. Dragon
combines multilevel partitioning and simulated annealing
techniques on a global bin framework. It mainly focuses on
wirelength minimization for large-scale standard-cell
designs. Dragon's placement quality on large benchmarks,
in terms of total bounding box wirelength, is the best
among all published academic work. It is also a fast placer.
With a different speed parameter setting, it can achieve a 5 times
speedup with only a 10% loss of solution quality, e.g.,
finishing a 100K cell design within 1 hour.
A new technique in Dragon is congestion estimation and
reduction. During the placement process, Dragon can estimate
the congestion distribution for the current placement state.
This prediction function is useful for logic synthesis
since it provides early routability information. Dragon
also implements a post-processing step to reduce
congestion on a placed design. This is done by local
congestion improvement with global congestion knowledge.
The congestion reduction function greatly improves the
routability of the design. Congestion maps during placement,
after placement and after global routing will be shown.
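The quality metric quoted above, total bounding box wirelength, can be sketched as follows. This is only an illustration of the metric (the half-perimeter of each net's bounding box, summed over all nets), not Dragon's code; the cell names, coordinates and nets are made up.

```python
def bounding_box_wirelength(nets, positions):
    """Sum over all nets of the half-perimeter of the net's bounding box."""
    total = 0
    for pins in nets.values():
        xs = [positions[cell][0] for cell in pins]
        ys = [positions[cell][1] for cell in pins]
        total += (max(xs) - min(xs)) + (max(ys) - min(ys))
    return total

# Hypothetical placement: cell name -> (x, y) location.
positions = {"a": (0, 0), "b": (3, 4), "c": (1, 2)}
# Hypothetical netlist: net name -> cells connected by that net.
nets = {"n1": ["a", "b"], "n2": ["a", "b", "c"], "n3": ["b", "c"]}

print(bounding_box_wirelength(nets, positions))  # 7 + 7 + 4 = 18
```

A placer that minimizes wirelength evaluates this cost, or an incremental update of it, every time it considers moving a cell.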
Carnegie Mellon University
TrailBlazer
TrailBlazer: Direct Transistor-Level Layout for Digital Blocks
Reference: http://www.ece.cmu.edu/~prakashg/trailblazer/
Contacts:
Faculty: Rob A. Rutenbar
Student: Prakash Gopalakrishnan
We present a complete transistor-level layout flow,
from logic netlist to final shapes, for blocks of combinational
logic up to a few thousand transistors in size. The direct
transistor-level attack easily accommodates the demands for careful
custom sizing necessary in high-speed design, and is also
significantly denser than a comparable cell-based layout.
The key algorithmic innovations are (a) early identification of
essential diffusion-merged MOS device groups called clusters,
but (b) deferred binding of clusters to a specific shape-level layout
until the very end of a multi-phase placement strategy. A global
placer arranges uncommitted clusters; a detailed placer
optimizes clusters at shape level for density and for overall
routability. A commercial router completes the flow. Experiments
comparing to a commercial standard cell-level layout flow show
that, when flattened to transistors, our tool consistently
achieves 100% routed layouts that average 23% less area.
The demo will include the following:
1. TrailBlazer: A running version of the tool will be demonstrated.
2. Poster: Including details of the new algorithms, results and layouts
compared with a commercial standard cell-level flow.
3. Student Presence: To answer any questions related to the flow and algorithms.
Univ. of Wisconsin-Madison
Fast Circuit Simulator
As the number of transistors increases rapidly, power/ground (P/G) networks and interconnect become more and more complicated. Transistor-level simulation of such structures is not practical because of limits on CPU run time and memory usage, so they are usually modeled with linear elements (RLC). Even so, general-purpose circuit simulators such as SPICE are still not efficient enough. Our software provides a fast solution to these linear systems and is much faster than SPICE3.
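To show what "linear system" means here, the sketch below solves the DC nodal equations G v = i of a tiny resistive power rail. It is a textbook illustration under invented values, not the solver from this demo: it covers resistors only (no L or C), and plain Gaussian elimination is exactly the kind of method a fast solver would replace for large networks.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [b] for row, b in zip(matrix, rhs)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(a[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (a[r][n] - tail) / a[r][r]
    return x

# Hypothetical rail: a 1 V supply feeds node A through 1 ohm, node A feeds
# node B through 1 ohm, and a load draws 0.1 A from node B.
conductance = [[2.0, -1.0],   # node A: 1 S to the supply + 1 S to B
               [-1.0, 1.0]]   # node B: 1 S to A
injected = [1.0, -0.1]        # Norton current of the supply, load current
v_a, v_b = solve_linear_system(conductance, injected)
print(v_a, v_b)               # about 0.9 V and 0.8 V
```

The IR drop at node B is the kind of quantity a P/G network analysis reports.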
University of Michigan
Wavelet-based video compression
The proposed demonstration is for our design entitled: "A
Configurable, Algorithm-Specific Processor for Real-Time
Wavelet-Based Video Compression/Decompression", by Li Ding,
Yi Li and Richard B Brown. This project has been awarded 2nd
Place in the Student Design Contest (conceptual category).
An abstract of the project is as follows.
A real-time video codec based on wavelet transformation is
designed and implemented using the TSMC 0.25um process.
It incorporates a configurable pipelined vector processor,
which is up to two orders of magnitude faster than
general-purpose microprocessors for motion estimation,
and a dedicated module capable of both forward and inverse
wavelet transformation. This chip consumes 1.3 W at 100 MHz
and is capable of handling real-time duplex coding and
decoding of CIF format videos.
The demonstration will be mainly poster-based, with some
computer-aided demo.
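The abstract does not say which wavelet the codec uses. Purely as an illustration of what a forward and an inverse wavelet step compute, here is one level of the Haar transform, the simplest wavelet, applied to a row of samples; the function names and the sample values are assumptions made for this sketch.

```python
def haar_forward(signal):
    """One Haar level: pairwise averages (coarse part) and differences (detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    averages = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    details = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return averages, details

def haar_inverse(averages, details):
    """Rebuild the original samples from averages and details."""
    signal = []
    for avg, det in zip(averages, details):
        signal.extend([avg + det, avg - det])
    return signal

row = [4, 6, 10, 12]                 # hypothetical pixel row
coarse, detail = haar_forward(row)
print(coarse, detail)                # [5.0, 11.0] [-1.0, -1.0]
print(haar_inverse(coarse, detail))  # [4.0, 6.0, 10.0, 12.0]
```

Compression comes from the detail coefficients, which are small in smooth image regions and can be quantized coarsely.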
SUNY Binghamton
The Feng Shui Physical Design Tools
Demonstration of placer, global and detail router
National Tsing Hua University
Internet-based simulation
We will give a demonstration of our tool over the Internet. An Internet-based concurrent-simulation scheme helps to ease the IP evaluation process between IP vendors and users. Complex system-on-a-chip designs require more and more IP modules from third-party vendors. What the vendor can disclose without impairing its trade secrets and what the user needs to examine to gain a satisfactory level of confidence are at odds with each other. Via PLI interface functions and Internet protocols, our proposed software enables HDL simulators (Verilog) residing at both the vendor's and the user's sites to concurrently simulate the IP and the SOC together. Only the stimulus and response defined in the IP module's I/O are exchanged between the sites. Therefore, the vendor does not need to create a functional model (or encrypted code) for the IP, while the user is assured that what he/she simulates is what he/she will purchase. Apart from the simulation speed degradation due to communication overhead, the SOC design/debug process is exactly the same as if the IP were in the user's hands. Our contribution will help all IP providers expose their IPs to all potential users without human intervention and without concern about IP rights infringement. More details at http://nthucad.cs.nthu.edu.tw/~op/cgi-bin/TIE/main.htm
Seoul Nat'l Univ.
Emulation-based Verification
Emulation-Based Coverification System for Hardware-Software Codesign: Demonstration of H.263 Encoder Coverification
Cycle-accurate co-verification is often the most time-consuming process in the co-design of hardware-software systems. To enable fast co-verification of hardware-software systems, we developed the EVEREST (Emulation-based VERification environment for Embedded SysTems) verification system, which is capable of verifying complex hardware-software systems by both simulation and emulation. The major features of EVEREST include optimistic simulation and emulation of the software part for fast co-verification, a flexible co-emulation architecture based on CES (Co-Emulation Server), and a consistent user interface for debugging in simulation, emulation, and prototyping at various levels of abstraction.
Demos will be prepared for three kinds of verification scenarios for an H.263 encoder example. An H.263 encoder partitioned into software and hardware will be verified using EVEREST. First, the software part is simulated using the cycle-accurate instruction set simulator embedded in our in-circuit debugger, and the hardware part is emulated using our hardware emulator. For synchronous co-verification, the two parts are connected and co-emulated by CES. Second, the software part is executed on a real processor and debugged via the JTAG ports while the hardware part is still executed in our hardware emulator. Third, the whole system is prototyped on our board. In the first two scenarios, the optimistic co-emulation results will be compared with those of a conventional approach.
The major components of the EVEREST system are an in-circuit debugger for software debugging, a hardware emulator for hardware verification, and CES for connecting the two. The gdb-based in-circuit debugger supports non-intrusive (requiring no target system resources) in-circuit debugging using the processor's JTAG (IEEE standard 1149.1) interface. It also supports optimistic emulation of real processors as well as optimistic simulation, which is crucial for hybrid co-emulation and co-simulation, thereby reducing total co-verification time dramatically. The hardware emulator is FPGA-based, which can be a very cost-effective solution for the medium-to-small sized logic verification typically found in many embedded systems designs. CES basically connects simulators and emulators for cycle-accurate co-simulation and co-emulation. It works with memory-mapped interfaces by monitoring memory access addresses and passing appropriate messages to the hardware and software sides. By employing CES, the organization of the co-emulator is greatly simplified, and any other simulators or emulators can easily be connected by a small modification of the memory access parts. CES can both observe and control the messages going through the channel. As for prototyping, we have made a prototyping board containing a processor with an ARM7TDMI core and Virtex FPGAs for hardware emulation and board control. The above-mentioned verification tools for software debugging and hardware emulation can also be used for verifying/debugging the design being prototyped. Currently, the in-circuit debugger supports only ARM7-based processors, but it may easily be extended to other processors or DSPs supporting a JTAG-based debugging interface. All the above-mentioned tools run on a Windows platform.
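The memory-mapped coupling described above can be pictured with a small sketch. This is a hypothetical model, not the EVEREST code: the class names, the address window at 0x8000 and the register block standing in for the emulated hardware are all invented. It only shows the idea of watching access addresses and forwarding those that fall into the hardware window.

```python
class RegisterBlock:
    """Stand-in for the emulated hardware: a few memory-mapped registers."""
    def __init__(self, size):
        self.regs = [0] * size

    def read(self, offset):
        return self.regs[offset]

    def write(self, offset, value):
        self.regs[offset] = value

class CoEmulationServer:
    """Routes software memory accesses either to plain memory or to hardware."""
    def __init__(self, hw_base, hardware):
        self.hw_base = hw_base
        self.hardware = hardware
        self.memory = {}   # ordinary software-side memory
        self.trace = []    # messages observed on the hardware channel

    def _is_hardware(self, address):
        return self.hw_base <= address < self.hw_base + len(self.hardware.regs)

    def store(self, address, value):
        if self._is_hardware(address):
            self.trace.append(("write", address, value))
            self.hardware.write(address - self.hw_base, value)
        else:
            self.memory[address] = value

    def load(self, address):
        if self._is_hardware(address):
            value = self.hardware.read(address - self.hw_base)
            self.trace.append(("read", address, value))
            return value
        return self.memory.get(address, 0)

ces = CoEmulationServer(0x8000, RegisterBlock(4))
ces.store(0x0010, 7)    # ordinary memory, not forwarded
ces.store(0x8001, 42)   # inside the hardware window, forwarded
print(ces.trace)        # only the hardware access is traced
```

Because every hardware access passes through one place, the same object can observe the traffic (the trace) or alter it, which matches the observe-and-control role the text gives to CES.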
In summary, the EVEREST verification system provides efficient verification tools for designing complex embedded systems. It supports hybrid co-simulation and co-emulation by executing the software parts optimistically, which becomes more effective as the software parts grow bigger.
Seoul National University
Low-power Software Design
Power consumption has emerged as one of the most important performance metrics in digital systems. Among the range of power reduction techniques, high-level power optimization is useful for complex digital systems including microprocessor-equipped systems. Appropriate power consumption models are mandatory for high-level power reduction practices because high-level power optimization techniques do not deal with physical designs.
At the last DAC University booth, we demonstrated in-house hardware and software tools that can measure and analyze cycle-accurate energy consumption. Our technique guarantees accuracy when we use a sampling rate of twice the clock frequency under the spiky current draw common in digital systems. It acquires the energy consumption profile in real time without repeated operation of the target systems.
We have continued to enhance our tool in both hardware and software aspects. We added more target devices, including flash memory, and developed a PC card-type ARM7TDMI board that is equipped with 16MB of vector memory in order to measure energy consumption with real application programs such as the Linux operating system. The PC card has a high-speed PCI local bus bridge and enables real-time data acquisition without stopping the target program.
The PC card runs with a source-level debugger that has a GNU debugger-like look and feel. Software designers are able to debug C source code or assembly code, with each piece of source code associated with energy consumption information. The enhanced version can collect all external bus signals. This feature enables us to provide an energy debugging environment that supports whole systems including various peripheral devices.
Our demonstration includes real-time energy measurement and characterization of a microprocessor and memory devices (shown at the last DAC University booth), additional devices, a new PC card-type energy debugger hardware, GNU debugger-like source-level energy debugging software, and some applications.
For more information, please visit
http://lowpower.snu.ac.kr
Chalmers University of Technology (Gothenburg, Sweden)
Lava HDL
I plan to demonstrate the interactive Lava system and show how to describe hardware in it. I will then show how to prove properties about circuits, and discuss future features, such as proving properties of parameterized circuits rather than of instances of them.
Pontificia Bolivariana (UPB)
Video acquisition chip
Because of the volume of data to be manipulated, image processing typically implies either very slow operation or high hardware requirements if satisfactory results are sought. Programmable logic arrays raised the possibility of building a high-speed image capture and processing system that combines the flexibility of a general-purpose DSP with the speed of dedicated hardware.
The characteristics of the chip make it possible to solve problems that at present cannot be solved simultaneously at the commercial level, such as: minimum hardware requirements; optimized use of space, with a direct effect on the aesthetic presentation; low power consumption; and the highest operating frequencies, guaranteeing excellent performance and introducing the concept of real-time execution.
The captured signal is easily processed by the user, who selects options through a keyboard that activates algorithms for the presentation of pixels on screen. In this way the following effects are obtained:
·Elimination of colors, freezing, negative, image averaging, binarization.
In short, a small sample of simple real-time image manipulation.
Throughout the presentation, a practical demonstration shows what has been obtained (the items described above) after arduous research work on the subject presented above.
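Two of the listed effects, the negative and binarization, are simple per-pixel operations. The sketch below shows them in software on a grayscale frame as an illustration only; the chip implements such effects in programmable logic, and the 8-bit pixel range and the threshold of 128 are assumptions of this example.

```python
def negative(image, max_level=255):
    """Invert every pixel of a grayscale image (list of rows)."""
    return [[max_level - pixel for pixel in row] for row in image]

def binarize(image, threshold=128, max_level=255):
    """Map every pixel to black or white depending on a threshold."""
    return [[max_level if pixel >= threshold else 0 for pixel in row]
            for row in image]

frame = [[0, 100],
         [200, 255]]       # hypothetical 2x2 grayscale frame

print(negative(frame))     # [[255, 155], [55, 0]]
print(binarize(frame))     # [[0, 0], [255, 255]]
```

Since each output pixel depends on one input pixel only, such effects map naturally onto hardware that processes the video stream pixel by pixel.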
University of California, Irvine
IMPACCT
Title: A Power-aware Scheduling Tool for System-Level Power Management
We present a new scheduling tool for system-level power management in embedded systems. This tool addresses the following key issues that are not adequately addressed by previous work. First, the new generation of embedded systems must be designed to be power-aware, rather than just low-power. Second, the power management decisions must be made at the system level, rather than only at the component level. Power-aware systems are those that must not only minimize power when the power budget is low, but also deliver high performance when required. They subsume traditional low-power systems as special cases. The power managers must also track the availability of all power sources and the power usage of all consumers in application-specific operating conditions. These conditions include the different energy models from expensive (e.g., non-rechargeable battery) to free (e.g., solar source), and the variance characteristics of the power consumers in the changing environment.
We believe that power-aware designs must be done at the system-level, not just at the component level. Amdahl's law applies to power as well as performance. That is, the power saving of a given component must be scaled by its percentage contribution in an entire system. Thus, it is critical to identify where power is being consumed in the context of a system; and a successful power manager must consider both computation domains (e.g., embedded processors, memory) and non-computation domains (e.g., mechanical and thermal subsystems) and coordinate their power usage as a whole system.
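The Amdahl-style argument above is plain arithmetic: the saving seen by the whole system is the component's saving scaled by the component's share of total power. The shares and savings below are invented numbers chosen only to make the point.

```python
def system_power_saving(component_share, component_saving):
    """Fraction of total system power saved.

    component_share:  fraction of system power drawn by the component (0..1)
    component_saving: fraction by which that component's power is cut (0..1)
    """
    return component_share * component_saving

# Hypothetical system: the processor draws 10% of the power, the heaters 60%.
cpu_case = system_power_saving(0.10, 0.50)     # halve the processor power
heater_case = system_power_saving(0.60, 0.20)  # trim heater power by 20%

print(f"processor optimization saves {cpu_case:.0%} of system power")
print(f"heater optimization saves {heater_case:.0%} of system power")
```

With these numbers, a modest improvement in the dominant consumer saves more than twice as much as halving the processor's power.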
We present a prototype of a novel power-aware scheduling tool that supports system-level power management. Our underlying application model captures timing constraints on communicating tasks, as well as min/max power constraints on the entire system. The core scheduler produces a schedule that satisfies stringent timing constraints and the max power budget; it will also make the best effort to meet the min power goal corresponding to the free energy sources. The result is visually presented to the designers in two views. The time view shows the timing sequence of parallel task execution on multiple resources. The power view shows the system-level power curve of the schedule with corresponding attributes, including different energy sources, major power consumers, expensive energy vs. free energy, etc. Our tool can effectively automate the design space exploration with different energy/performance trade-offs. Moreover, the designers can manually intervene in the automated scheduling process by imposing additional timing constraints in the time view, while observing the results in the power view interactively.
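To give a feel for scheduling under a max power budget, here is a minimal greedy sketch. It is not the IMPACCT scheduler: it handles only precedence constraints and a power cap, ignores deadlines, resources and the min power goal, and the task names and numbers are hypothetical.

```python
def schedule_under_power_budget(tasks, predecessors, max_power):
    """Greedy schedule: tasks maps name -> (duration, power).

    Returns (start, usage): start time step per task and the total
    power committed in each time step.
    """
    for name, (_, power) in tasks.items():
        if power > max_power:
            raise ValueError(f"task {name!r} alone exceeds the power budget")
    start, usage = {}, {}
    pending = set(tasks)
    while pending:
        ready = sorted(n for n in pending
                       if all(p in start for p in predecessors.get(n, ())))
        if not ready:
            raise ValueError("cyclic dependencies")
        for name in ready:
            duration, power = tasks[name]
            # Earliest start allowed by the predecessors.
            t = max((start[p] + tasks[p][0]
                     for p in predecessors.get(name, ())), default=0)
            # Delay the task until it fits under the budget for its whole run.
            while any(usage.get(t + k, 0) + power > max_power
                      for k in range(duration)):
                t += 1
            for k in range(duration):
                usage[t + k] = usage.get(t + k, 0) + power
            start[name] = t
            pending.discard(name)
    return start, usage

# Hypothetical rover-like workload: name -> (duration in steps, power in W).
tasks = {"drive": (2, 6), "heat": (2, 5), "compute": (1, 2)}
predecessors = {"compute": ["drive"]}
start, usage = schedule_under_power_budget(tasks, predecessors, max_power=8)
print(start)   # heating is pushed back until driving has finished
```

Plotting `start` against time and `usage` against time would give simplified versions of the time view and the power view described above.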
Our work is motivated by the NASA/JPL Mars rover. It features several interesting problems that cannot be effectively handled by traditional low-power techniques. First, the rover has different energy sources: a non-rechargeable battery and a solar panel. Second, the major power consumers do not even include the embedded processors, but they consist of mechanical motors and thermal heaters. By applying the power-aware scheduling techniques to this application, our tool can produce alternative designs that achieve both improved performance and reduced energy cost simultaneously. This tool forms the basis of the IMPACCT system-level framework that will enable designers to explore many power-performance trade-offs with confidence.
University of California, Irvine
Component Based Design Framework
The design and reuse of hardware system level components written in C++ are ad hoc and tedious because of the strong emphasis on inheritance as the basic composition mechanism. In the demonstration, we show the BALBOA CAD system that relies on reuse by dynamic composition rather than reuse by inheritance. A system architect uses this system to build hardware and system level models by assembling IP library components defined in C++ and SystemC. The key feature of the tool is that the system architects do not have to write or modify C++ code directly, but rather use a Component Integration Language (CIL, an extension to OTcl) where connectivity and typing are abstracted.
In this demonstration, we show how to use the BALBOA CIL for the integration of an AMRM Adaptive Memory System (http://www.ics.uci.edu/~amrm/) from IP library components defined in C++ with SystemC. We show the efficiency of the CIL in performing communication refinement and how typing and connectivity abstractions help the system architect to focus on the essential tasks of architectural exploration.
This demonstration should be of interest to DAC attendees because it describes a strategy for efficient high level modeling of hardware and system with C++. It demonstrates that C++ should be used to define IP components, but these components should be manipulated at a higher level of abstraction to hide details and guide the composition. This abstraction layer is implemented through the CIL. There is also design automation in this layer to solve the typing and connectivity abstraction. A moderately complex design example will also show how the usage of the CIL reduces the size of the model.
Princeton Univ.
Chaff SAT Solver
We plan to demonstrate our newly developed Boolean Satisfiability Solver called Chaff. The Chaff solver is developed by Prof. Sharad Malik's group at Princeton University.
In our demonstration, we will show the SAT solver working on various SAT problems generated from real EDA applications as well as some artificial instances. The details of the SAT solver itself are described in session 33.1, "Engineering a (super) fast SAT solver".
Interest in SAT solvers has resurged in the last couple of years because of developments in SAT-based Model Checking and Bounded Model Checking. We developed the SAT solver Chaff to be fast on the structured problems that arise in real-world applications. The demo will show the strength of our solver in comparison with other state-of-the-art solvers.
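For readers unfamiliar with the problem, the sketch below shows what a SAT instance looks like and how a basic backtracking search with unit-clause preference decides it. This is a plain DPLL-style illustration and explicitly not Chaff's algorithm, whose details are given in the session cited above. Clauses are lists of non-zero integers, where a negative number denotes a negated variable.

```python
def dpll(clauses, assignment=None):
    """Return a satisfying assignment {variable: bool} or None if unsatisfiable."""
    assignment = dict(assignment or {})
    simplified = []
    for clause in clauses:
        remaining, satisfied = [], False
        for lit in clause:
            var, wanted = abs(lit), lit > 0
            if var in assignment:
                if assignment[var] == wanted:
                    satisfied = True
                    break
            else:
                remaining.append(lit)
        if satisfied:
            continue
        if not remaining:
            return None            # every literal is false: conflict
        simplified.append(remaining)
    if not simplified:
        return assignment          # all clauses satisfied
    # Prefer a unit clause (forced value); otherwise branch on a literal.
    unit = next((c[0] for c in simplified if len(c) == 1), None)
    lit = unit if unit is not None else simplified[0][0]
    values = [lit > 0] if unit is not None else [lit > 0, not (lit > 0)]
    for value in values:
        result = dpll(simplified, {**assignment, abs(lit): value})
        if result is not None:
            return result
    return None

# (x1 or x2) and (not x1 or x2) and (not x2 or x3)
formula = [[1, 2], [-1, 2], [-2, 3]]
model = dpll(formula)
print(model)
print(dpll([[1], [-1]]))   # None: x1 and not x1 cannot both hold
```

Industrial instances have many thousands of variables, which is why the engineering of the search matters so much.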
UC Berkeley
Ptolemy II
Ptolemy II is a high-level component-based design environment for modeling embedded systems. We demonstrate a variety of component-based system models created using graphical block diagrams. These models are specified using dataflow diagrams, finite-state machines, differential equations, three-dimensional scene graphs, and other useful styles of component interaction. Ptolemy II is unique in its ability to allow these different types of diagrams to be combined hierarchically. This results in systems that are specified more completely, concisely and understandably than would be possible otherwise. We demonstrate models of signal processing, communications, and control systems. We will also show how Vergil, a user interface for Ptolemy II, can now be used to create these models using graphical diagrams.
http://ptolemy.eecs.berkeley.edu
University of California, Los Angeles
Strategically Programmable System
During the demonstration, an automated architecture
generation tool and a high-level synthesis tool targeting
the generated systems will be presented. The architecture
generation tool produces a novel programmable architecture
called the strategically programmable system (SPS). An SPS
is a programmable system customized for a given family of
applications. The customization is achieved by embedding
optimized fixed blocks within a fully reconfigurable fabric.
The embedded fixed blocks are chosen according to the
needs of the given application set.
For a given application set, a pattern extraction task
is performed first. In this phase, critical/common operations or
clusters of operations are identified. The extracted patterns are
displayed on data flow graphs created for each application.
Statistical data on the frequencies and distribution of
operations and operation patterns are also collected. The
collected data is input to the core of the architecture
generation. Based on the data delivered, decisions
regarding which types of fixed blocks to include on the chip
are made.
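The statistics-gathering step can be illustrated with a small sketch that counts operations and producer-to-consumer operation pairs over a set of data flow graphs. It is a simplified stand-in for the pattern extraction described above: the graph encoding, the two example applications and the restriction to two-operation patterns are assumptions of this example.

```python
from collections import Counter

def count_operation_patterns(dataflow_graphs):
    """Count operation types and producer->consumer operation pairs.

    Each graph is (ops, edges): ops maps node name -> operation type,
    edges is a list of (producer node, consumer node).
    """
    op_counts, pair_counts = Counter(), Counter()
    for ops, edges in dataflow_graphs:
        op_counts.update(ops.values())
        for producer, consumer in edges:
            pair_counts[(ops[producer], ops[consumer])] += 1
    return op_counts, pair_counts

# Two hypothetical applications.
fir = ({"m1": "mul", "m2": "mul", "a1": "add"},
       [("m1", "a1"), ("m2", "a1")])
dct = ({"m1": "mul", "a1": "add", "s1": "sub"},
       [("m1", "a1"), ("a1", "s1")])

op_counts, pair_counts = count_operation_patterns([fir, dct])
print(op_counts.most_common())     # mul appears most often
print(pair_counts.most_common(1))  # mul feeding add is the top pattern
```

In this toy data set a multiply feeding an add dominates, which would argue for embedding a fixed multiply-accumulate block.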
The second part of the demonstration shows how a given
application is mapped onto one instance of the SPS architecture.
The tool that we will demonstrate performs the scheduling
and binding of operations on a given architecture
configuration.
KULeuven
RF Mixer Circuits
The demo will show a graphical representation of all the
steps in the optimization process of an RF mixer circuit,
where hand calculations, layout generation, and parasitic
extraction are combined with the power of the Differential
Evolution Algorithm to give a fast and flexible optimization
routine.
All the steps of the optimization algorithm will be shown
graphically. The optimization demo is centered around a
graphical interface showing the current status of the
optimization parameters and the evolution of the cost function.
The process of layout generation and parasitic extraction,
which is an integral part of the optimization, will also be
demonstrated.
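For reference, a compact version of the Differential Evolution algorithm (the common rand/1/bin variant) is sketched below. It is not the demo's optimizer: the cost function is a made-up quadratic standing in for the real flow of hand calculations, layout generation and parasitic extraction, and the parameter names, bounds and settings are assumptions.

```python
import random

def differential_evolution(cost, bounds, pop_size=20, generations=200,
                           f=0.8, cr=0.9, seed=0):
    """Minimize cost(x) over box bounds with DE/rand/1/bin."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    costs = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members, all different from i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)   # at least one mutated component
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)   # keep inside the bounds
                else:
                    value = pop[i][d]
                trial.append(value)
            trial_cost = cost(trial)
            if trial_cost <= costs[i]:    # greedy selection
                pop[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return pop[best], costs[best]

# Hypothetical cost: distance of two sizing parameters from a target point.
def mixer_cost(params):
    width, length = params
    return (width - 3.0) ** 2 + (length - 1.0) ** 2

best, best_cost = differential_evolution(mixer_cost, [(0.0, 10.0), (0.0, 10.0)])
print(best, best_cost)
```

Because selection is greedy, the best cost in the population never gets worse from one generation to the next, which is what a plot of the cost function's evolution shows.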
University of Technology RWTH Aachen
Flexible Datapath Generator
Given current market constraints, combining the significant advantages of a
physically oriented design regarding power dissipation, throughput rate, and silicon
area with a design effort comparable to semi-custom designs is highly desirable.
Consequently, a flexible datapath generator (DPG) has been developed that enables the
physically oriented design of datapath macros based on a given signal flow graph (SFG)
and a small set of optimized leaf cells at a very low design effort. At the booth it will
be shown how the DPG exploits the inherent regularity and locality typical for SFGs in
digital signal processing. In this way the optimization of silicon area, throughput rate,
and especially power dissipation is possible in short iteration cycles down to the physical
layout level. The DPG is fully embedded into an industry-standard framework
and allows the integration of the generated blocks as hard macros. A detailed
description of the DPG-based physically oriented design methodology and industry-related
benchmarks are presented at "http://www.eecs.rwth-aachen.de/dpg/info.html".
During the presentation we would like to demonstrate the DPG and the resulting
design methodology by generating the physically oriented design of a small datapath
containing typical basic DSP functions, based on a parameterizable signal flow graph which
is designed and modified in interaction with the audience. The DPG run will be interrupted at
predefined breakpoints to clarify the idea of deriving abutment basic cells from a small
set of leaf cells by powerful DPG routines for layout modification. The presentation will
be accompanied by a poster summarizing the DPG design flow and showing industry-related
benchmarks.
TIMA -- Grenoble France
SOC Communication Hardware
The demo will show a design flow for the automatic design of
hardware/software communication in multi-processor SoCs.
The system is initially specified at the system level as an abstract
architecture that we call the macro-architecture. This model is composed of
a set of modules interconnected through logical wires. Each module
represents a processor in the final architecture. This may be a software
processor (e.g. a DSP or a microcontroller executing software), a hardware
processor (specific hardware) or an existing IP (global memory,
peripheral, bus controller, ...). The logical wires are abstract channels
that transfer fixed data types (e.g. integer, real) and may use protocols
(e.g. handshake or memory mapped I/O). The architecture may also include
RTL modules. Co-simulation is required to validate the partitioning and
the abstract communication.
The design flow produces the detailed architecture that we call the
micro-architecture. Software modules are mapped onto specific
processors. This requires the synthesis of a hardware interface to link
the processor to the rest of the architecture. When the software
includes parallel tasks, a specific operating system (OS) is synthesized.
Hardware blocks are refined to the clock-cycle level. Finally, existing
blocks (IPs) are encapsulated within an interface in order to
accommodate the final protocols. The communication between the different
blocks is made through physical wires that implement the final
protocols. The interface block may include controllers and buffering
when needed.
The demo will show a set of models and tools based on SystemC including:
1- A target architecture and SystemC-based design representation models
that allow for flexible, modular and scalable design of multi-processor
systems on chip.
2- A design flow that allows bridging the gap between system
specification and multi-processor system-on-chip implementations.
3- A set of methods and tools that implement the design flow. These
include co-simulation and automatic generation of hardware/software
interfaces for multi-processor systems on chip.