Federal University of S. Catarina
UFRGS-Universidade Federal do Rio Grande do Sul
University of Michigan
Darmstadt University of Technology
University of Pittsburgh
University of California, San Diego
University of Southern California
UCLA
Politecnico di Torino
University of Michigan
University of Wisconsin - Madison
University of Wisconsin-Madison
University of Wisconsin-Madison
NCSU
Polytechnic Institute of Turin - Italy
UCLA
Norwegian University of Science and Technology
University of Maryland at Baltimore County
University of Tokyo
University of Maryland at Baltimore County
University of California, Irvine
University of California, Irvine
Technical University of Munich
Universitaet Hannover, Germany
Universidad Nacional de Ingenieria
Carnegie Mellon University
UCLA
Carnegie Mellon University
Univ. of Wisconsin-Madison
University of Michigan
SUNY Binghamton
National Tsing Hua University
Seoul Nat'l Univ.
Seoul National University
Chalmers University of Technology (Gothenburg, Sweden)
Pontificia Bolivariana (UPB)
University of California, Irvine
University of California, Irvine
Princeton Univ.
UC Berkeley
University of California, Los Angeles
KULeuven
University of Technology RWTH Aachen
TIMA -- Grenoble France


Federal University of S. Catarina
Advanced Compact MOSFET model
This work presents the ACM (Advanced Compact MOSFET) model. Developed at the Integrated Circuits Laboratory of the Federal University of Santa Catarina (http://www.eel.ufsc.br/lci), it is a physical, charge-based model suitable for the analysis and design of integrated circuits. We will also show some examples of the advantages of the ACM model and a simplified methodology for extracting its parameters. The ACM model is implemented in a mixed-mode simulator (SMASH from Dolphin Integration - http://www.dolphin.fr) and, unlike most commonly used models, has very simple equations and a small number of physics-dependent parameters. These characteristics, combined with its accuracy in representing the static and dynamic behavior of MOS transistors, make it suitable for both educational and industrial use. Another topic of our presentation will be the extraction of the ACM model parameters. To help those planning to use our model, we developed three simple, low-cost methods: 1. Mapping of BSIM parameters to ACM parameters: we convert some BSIM 3 parameters directly to ACM parameters or, where that is not possible, use analytical expressions to extract them; 2. Experimental determination of the ACM parameters: we have developed a set of very simple circuits, which require only conventional laboratory equipment, to determine the ACM parameters; 3. Determination of ACM parameters from simulation: with the same circuits as in the second method, we can extract the ACM parameters by simulation. Finally, we will present some simulations with ACM and some other models, where a comparison of the results will show the advantages of our model.


UFRGS-Universidade Federal do Rio Grande do Sul
Path Search
This work addresses the problems of control and running time in VLSI routing by demonstrating two new algorithms and an accurate cost model. Net-by-net routing is still a very important technique for making connections in VLSI circuits. Maze routing algorithms used for this purpose correspond to shortest-path searches derived from basic BFS or from A*, with many dedicated improvements. BFS searches without costs are faster because, although they visit larger areas, the time per visit is constant, whereas searches using A* and variable costs per arc require logarithmic time per visit. The latter are more efficient for global routing, which considers a smaller graph and accounts for routing distribution. We will demonstrate LCS*, a new algorithm [Johann et al. 2000, LNAI-1952] that is the first bidirectional heuristic search method to beat A* in average number of visited nodes and running time. It is shown that the cost model strongly affects interconnect quality and the running time of heuristic search algorithms, and that traditional approaches cannot distinguish different search goals. Traditionally, a single cost value is used to model obstructions (variable) and path length (a constant positive factor). We propose a novel cost model that distinguishes congestion, performance, and resource costs, and allows greater control of the resulting net quality through search parameters, while preventing the algorithm from visiting too many nodes. By adjusting these parameters, one can manually or automatically have the nets that have not achieved the desired performance re-routed using a new costing scheme that favors performance, while the others keep saving resources for them. Besides control of individual nets, detailed routing of large areas also needs to be processed successfully.
The crosspoint assignment problem that arises at the interface of global cells in area routing limits the efficiency of the detailed routing algorithm in terms of area and runtime. We will demonstrate LEGAL, a new greedy algorithm that generalizes channel routing algorithms. LEGAL can process the detailed area routing simultaneously in almost linear time, given that global routing decisions have already been made. The greedy technique exploits the fact that only local decisions are left to the detailed routing step, so there is little chance of converging to a conflict situation. A simple implementation restricted to 2-pin nets produces almost the same routing results as a highly optimized maze router, but in orders of magnitude less time.
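
To make the role of the cost model concrete, here is a minimal sketch of a heuristic maze search with a composite step cost. It is plain A* on a 4-connected grid, not LCS* or the cost model proposed above; the weights `w_len` and `w_cong`, the grid encoding, and the function name are assumptions made for illustration only.

```python
import heapq

def astar(grid, start, goal, w_len=1.0, w_cong=1.0):
    """A* on a 4-connected grid.

    grid[r][c] is a congestion value >= 0, or None for an obstacle.
    Entering a cell costs w_len + w_cong * congestion (hypothetical model).
    Returns (path, cost, visited_count).
    """
    rows, cols = len(grid), len(grid[0])

    def h(p):
        # Manhattan distance scaled by w_len: admissible because every
        # step costs at least w_len when congestion and w_cong are >= 0.
        return w_len * (abs(p[0] - goal[0]) + abs(p[1] - goal[1]))

    best = {start: 0.0}
    parent = {start: None}
    heap = [(h(start), 0.0, start)]
    visited = 0
    while heap:
        _, g, node = heapq.heappop(heap)
        if g > best[node]:
            continue  # stale queue entry
        visited += 1
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g, visited
        r, c = node
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] is not None:
                ng = g + w_len + w_cong * grid[nr][nc]
                if ng < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = ng
                    parent[(nr, nc)] = node
                    heapq.heappush(heap, (ng + h((nr, nc)), ng, (nr, nc)))
    return None, float("inf"), visited
```

With `w_cong=0.0` the search returns the shortest path regardless of congestion; raising `w_cong` makes the same search detour around congested cells, which is the kind of per-net control through search parameters that the abstract describes.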


University of Michigan
Satisfiability Solver
Boolean satisfiability techniques have been used successfully in a variety of EDA applications such as formal verification, testing, and timing analysis, just to name a few. A number of significant enhancements in both algorithm and implementation such as conflict analysis, recursive learning and different decision heuristics have contributed to the recent success of SAT approaches, making successful SAT solvers valuable building blocks for CAD tool designers. Despite this, there are two important factors we need to consider when applying SAT to EDA problems. First, in many cases we need to solve a set of related SAT problems to complete a task. Second, while there exist methods to map application specific problems to a SAT’s CNF form, this conversion process is sometimes inefficient and increases the complexity of the problem at hand by introducing auxiliary variables and constraints. We introduce SATIRE (SATisfiability Incremental Reasoning Engine), a new satisfiability solver that is particularly suited to verification and optimization problems in EDA. It has the ability to incrementally solve related problems and reuse information learned in previous search runs to solve new similar problems. Such problems typically arise in circuit testing, FPGA routing, microprocessor verification, etc. In each case the related problems are largely identical; only a small portion of the constraints change. The solver also has the ability to remove constraints from the problem. Another feature of SATIRE is its ability to handle non-CNF constraint types. Many EDA problems can be converted to CNF in a straightforward manner, as simple logical relations usually correspond to a small number of CNF clauses. However, there are useful properties that cannot be conveniently represented by CNF. The presence of such non-scalable properties has made some EDA problems unamenable to solution by SAT. 
However, sometimes it is not a fundamental mismatch between the problem domain and SAT, but rather an unfavorable choice of representation. There are perfectly reasonable logical relationships, which do not have manageable CNF representations. We introduce a redefinition of SAT that does not specify the form of the constraints to be satisfied. By generalizing our definition of the problem we are able to address a broad range of problem domains more naturally, some of which were not feasible when represented only by CNF constraints. In this demo, we will provide experimental evidence showing the effectiveness of these additions to classical satisfiability solvers.
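
As an illustration of a relation without a compact CNF form, consider a cardinality constraint "at most k of n variables are true". The sketch below is an assumption-laden example, not SATIRE code, and the abstract does not say which constraint types SATIRE supports. It compares the naive CNF encoding without auxiliary variables, which needs one clause per (k+1)-subset of the variables, with a direct check that simply counts true variables. All function names are hypothetical.

```python
from itertools import combinations, product
from math import comb

def at_most_k_cnf(variables, k):
    """Naive CNF: for every (k+1)-subset, at least one variable is false.
    Literals are signed integers (negative means negated)."""
    return [[-v for v in subset] for subset in combinations(variables, k + 1)]

def satisfies_cnf(assignment, clauses):
    """assignment maps variable -> bool."""
    return all(any(assignment[abs(lit)] == (lit > 0) for lit in clause)
               for clause in clauses)

def at_most_k_native(assignment, variables, k):
    """The same constraint checked directly, with a single counter."""
    return sum(1 for v in variables if assignment[v]) <= k

def encodings_agree(n, k):
    """Exhaustively confirm that both forms accept the same assignments."""
    variables = list(range(1, n + 1))
    clauses = at_most_k_cnf(variables, k)
    assert len(clauses) == comb(n, k + 1)
    return all(
        satisfies_cnf(dict(zip(variables, bits)), clauses)
        == at_most_k_native(dict(zip(variables, bits)), variables, k)
        for bits in product([False, True], repeat=n)
    )
```

The clause count `comb(n, k + 1)` grows quickly with n, while the native check stays linear in n, which is one reason a solver that accepts constraints in their natural form can be attractive.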


Darmstadt University of Technology
Java-based Design
To keep up with the increasing complexity of IC design, the use of higher abstraction levels is necessary. To deal with those levels of abstraction, tools and methodologies must be built to support the designer in adapting to the new paradigm. The proposed presentation will show a tool and methodology pair that supports distributed collaborative design at a higher level of abstraction on top of the Cave Framework infrastructure [1,2]. Object-oriented concepts are used to deal with complexity [3], taking into account the advantages they have brought to complex software development [4]. A particular tool, named Homero, will also be presented, which uses a pair programming model [5] for collaborative development by groups of designers in different locations. The design flow starts at the conceptual level, where the system is modeled using OO techniques. Collaborative UML diagrams, such as collaboration diagrams, class diagrams or even use cases, can be used through the Homero infrastructure. From the conceptual level, the model should be refined to include the functional description of the system in the Java language. This refinement may be done by hand, using the Homero pair-programming infrastructure, or with a commercially available CASE tool such as Rational Rose [7]. At this point, an executable model should be created for functional simulation. The test bench is written in Java (or in UML, following the same process as the design) and run together with the executable model. After some refinement iterations, the executable model is used as input for HDL model generation, using a tool such as Forge [6]. The generation is based on a set of architectural choices made by the designer, so it separates the functional and architectural specifications, making it easy to compare several architectures based on a single functional description. After architectural experimentation, the generated HDL code may be synthesized using commercially available tools.
A codesign-like partition may also be done, selecting design blocks to run as embedded Java code over a hardware, software or mixed implementation of a Java Virtual Machine.
[1] INDRUSIAK, L.S.; REIS, R.A.L. A WWW approach for EDA tool integration. In: X BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 10., 1997, Gramado. Proceedings... Porto Alegre: CPGCC UFRGS, 1997.
[2] INDRUSIAK, L.S.; REIS, R.A.L. A Case Study for the Cave Project. In: BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 11., 1998, Armacao de Buzios. Proceedings... Los Alamitos: IEEE Computer Society Press, 1998.
[3] INDRUSIAK, L.S.; REIS, R.A.L. From a Hyperdocument-Centric to an Object-Oriented Approach for the Cave Project. In: BRAZILIAN SYMPOSIUM OF INTEGRATED CIRCUITS DESIGN, SBCCI, 13., 2000, Manaus. Proceedings... Los Alamitos: IEEE Computer Society Press, 2000.
[4] PRESSMAN, R. S. Software Engineering: A Practitioner’s Approach. McGraw-Hill, 1996.
[5] WILLIAMS, L.; KESSLER, R.R. All I Really Need to Know about Pair Programming I Learned In Kindergarten. Communications of the ACM Vol. 43 No. 5; 2000; p. 108-114.
[6] DAVIS, D. et al. Forge-J: High Performance Hardware from Java. http://www.xilinx.com/forge/forge_J_wp.htm
[7] RATIONAL ROSE. http://www.rational.com


University of Pittsburgh
Chatoyant modeling system
We have developed Chatoyant to support modeling and simulation of micro-opto-electro-mechanical systems. This work will be demonstrated in this year's University Booth. Chatoyant is built upon the object-oriented simulation engine Ptolemy. Chatoyant's component models are written in C++ with sets of user-defined parameters for the characteristics of each module instance. Chatoyant performs static simulations to analyze effects such as mechanical tolerancing, power loss, insertion loss, and crosstalk, while dynamic simulations analyze data streams with techniques such as noise analysis and BER calculation. Our work has been motivated by high-speed optical interconnect and switching systems based on OMEM (optical micro-electrical-mechanical) devices, which are a critical backbone technology for next-generation computer networks and systems. However, as with many new technologies, design methods and tools for these systems are currently ad hoc. Designers typically use combinations of tools that were built for the individual domains of optics, mechanics and electronics, with little integration and with system-level analysis based only on the experience of the designer or simply on assumptions about the ensemble behavior of the components. To support the design of these multi-domain systems in a practical manner, computer-aided design tools must be capable of modeling electronics, electrostatics, mechanics, guided-wave optics, and free-space optics. The design tools must directly support the interfaces between models in all these domains, and characterize the behavior of the resulting system in a single integrated environment. This is what we achieve in Chatoyant. Our system-level models consist of libraries of components and methodologies for signal propagation between components.
A critical notion is the distinction between device models, which are domain-specific and characterize the underlying physics of a device, and component models, which we use to characterize system-level behavior. Component models can be supported at many levels of abstraction and can be derived from multiple sources, including analytical data, device model simulations, or empirical measurements. For example, in the current implementation of Chatoyant, we have implemented models for vertical cavity surface emitting lasers (VCSELs) based on both empirical and analytical data, and developed electrical models of CMOS devices and mechanical models of MEMS cantilevers using piece-wise linear evaluation of device characteristics. These modeling techniques allow us to trade off the accuracy of our models against the performance of our simulation and analysis tools, which is essential for providing an interactive design environment. In this demonstration of Chatoyant, we will focus on our newest research:
- new accurate and efficient diffraction models
- piece-wise linear modeling of electrical and mechanical components
- new Java3D GUI


University of California, San Diego
Mixed-Signal Test
The proposed poster will describe a new high-level approach to the mixed-signal test synthesis problem based on propagation of test-related information across the chip boundaries. First, fundamentals of signal propagation from a test perspective will be discussed. We will present complications due to the analog domain, such as parameter tolerances, noise and non-linearity, and how these challenges can be approached through appropriate modeling of signal components and modules. Following the discussion of the basics of signal propagation, we will show how such a propagation scheme can be used for test generation, fault simulation, and testability analysis. Finally, a set of experimental results on a high-frequency signal down-conversion system that includes a filter, a mixer, an amplifier and a data converter will be presented. A list of published work in line with this proposal can be found online (http://www-cse.ucsd.edu/users/sozev/suleresume.html). As the industry is moving towards higher levels of integration, traditional methods of testing isolated basic blocks prove insufficient to satisfy stringent market requirements. The gap between traditional test methods and the application of these methods at the system level can only be bridged through new hierarchical approaches. This poster will present such a high-level approach to the mixed-signal test problem.


University of Southern California
Apollo: Adaptive Power Optimization
The Apollo project aims at significantly reducing power dissipation of next-generation mobile DoD computing and communication systems by means of operating system-directed power management, power-aware software compilation, and system-level synthesis and optimization of the integrated hardware/software platform subject to performance and quality-of-service constraints.


UCLA
PLAmap and RASP_SYN, Technology Mapping Packages
RASP_SYN, an LUT-based FPGA technology mapping package, is the synthesis core of the UCLA RASP System developed at the UCLA VLSI CAD LAB. This release includes the following mapping algorithms:
- DAG_Map (depth minimization) version 1.0
- FlowMap (depth optimal) version 2.1
- TurboMap (optimal mapping with retiming) version 1.0
- FlowMap-r (area-delay tradeoff) version 2.0
- FlowSYN (FPGA resynthesis) version 2.0
- CutMap (simultaneous area delay minimization) version 1.2
- ZMap (simultaneous area delay minimization) version 1.0
- EMB_Pack (mapping for FPGAs with embedded memory blocks for area minimization while maintaining the delay) version 1.0
- EMB_PreMap (the pre-mapping processing version for EMB_Pack) version 1.0
- HeteroMap (delay optimal mapping for heterogeneous FPGAs) version 1.0
- BinaryHM and CN-HM (delay-oriented mapping for heterogeneous FPGAs with bounded resources) version 1.0
This year we have three new additions to the RASP package:
1. Performance-Driven Mapping for CPLDs
2. Simultaneous Logic Decomposition and Technology Mapping for FPGAs
3. Performance-Driven Multilevel Clustering with Application to Hierarchical FPGA Mapping
Please refer to http://ballade.cs.ucla.edu/~cong/publications.html
This demonstration will be a PowerPoint presentation of the research project on the technology mapping problem for CPLD architectures. This research is summarized as follows. CPLDs are based on PLA-style logic cells, which are also referred to as p-term blocks or simply PLAs. Logic synthesis and technology mapping for CPLDs are considered more difficult than for FPGAs, due to the wide fanins, input/pterm sharing and two-level logic minimization. We implemented a performance-driven mapping algorithm called PLAmap. The notation (k,m,p)-PLA specifies a PLA with at most k inputs, at most m product terms, and at most p outputs.
For CPLD structures consisting of small PLAs such as (10,12,4)-PLAs, we compared our results with those of TEMPLA, a CPLD mapper that targets area minimization. Next, we modified our program to take into account structural constraints of the p-term block in one commercial CPLD device, Altera's MAX 7000B, whose PLA-style LAB can be considered a special (36,80,16)-PLA with structural constraints. We also explored two ways of trading off area and delay using threshold control strategies. Our algorithmic flow consists of three stages. Labeling: determines each node's logic depth (stored as the label of the node) and provides clustering information for the subsequent mapping step. To minimize depth in the final PLA network, we label as if the target structure consisted only of (k,m,1)-PLAs, i.e., single-output PLAs, so that we can form a PLA "cluster" (the chunk of logic covered by this (k,m,1)-PLA) as deep as possible. Mapping: generates (k,m,p)-PLAs based on the label information of each node in the network. Since the logic depth of the final network has already been decided, the goal of the mapping stage is to minimize area without affecting the logic depth of the network. The unique feature of our mapping stage is that it generates (k,m,p)-PLAs (p > 1) directly by using cluster merging, slack time relaxation, or node duplication. It efficiently takes advantage of the attributes of the clusters formed in the labeling stage and tries to share the inputs and pterms as much as possible between different outputs in the same PLA. Packing: reduces mapping area further without sacrificing depth. Two heuristic operations are developed. One is called PLA collapsing. Any PLA that can be collapsed into all of its fanout PLAs can be eliminated, provided that all PLAs remain (k,m,p)-PLAs after the collapsing. The second operation is maximum shared-input bin packing.
Experimental results on various MCNC benchmarks show that overall TEMPLA uses 8 to 11% less area at the cost of 96 to 106% more mapping depth, and MAX+PLUS II uses 12% less area but 58% more delay compared with our mapper. This work was presented at ACM/SIGDA International Symposium on Field Programmable Gate Arrays, 2001. The paper (pdf) can be retrieved at the publication list of Prof. Cong at http://cadlab.cs.ucla.edu (item 109).
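
The (k,m,p)-PLA notation and the benefit of sharing inputs and pterms between outputs can be illustrated with a small feasibility check. This is a sketch under assumed data structures, not PLAmap code: each output is a set of product terms, each product term is a frozenset of literal strings such as `'a'` or `'!c'`, and the function name `fits_pla` is hypothetical.

```python
def fits_pla(outputs, k, m, p):
    """Return True if the given outputs fit into one (k,m,p)-PLA.

    outputs: dict mapping an output name to a set of product terms.
    A product term shared by several outputs is counted once, which is
    why merging clusters that share pterms can save area.
    """
    pterms = set().union(*outputs.values())
    inputs = {lit.lstrip('!') for term in pterms for lit in term}
    return len(inputs) <= k and len(pterms) <= m and len(outputs) <= p

# f = a*b + !c and g = a*b + d share the product term a*b.
example = {
    'f': {frozenset({'a', 'b'}), frozenset({'!c'})},
    'g': {frozenset({'a', 'b'}), frozenset({'d'})},
}
```

Here the two outputs together use four inputs and three distinct product terms, so `fits_pla(example, 4, 3, 2)` holds even though the outputs list four product terms in total.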


Politecnico di Torino
RTL power estimation
RTLEst is a power estimation tool for structural RTL VHDL descriptions developed by the EDA Group of Politecnico di Torino within ESPRIT project n. 26796 "PEOPLE". The demo will illustrate the major capabilities and features of the tool. Compared to existing RTL power estimators, the tool has two distinctive features:
- Power estimation of the RTL components (i.e., macros) is obtained through accurate power macromodels, which are built with a highly robust characterization procedure. Macromodels for the components are built once and for all, and are stored in a cache of models for later reuse.
- Pre-characterization of the RTL components is not required. The tool includes automatic characterization capabilities, which allow easy migration to different RTL libraries and technologies without the intervention of the tool developer.
RTLEst guarantees faster estimation than existing tools, because it relies on RT-level annotation of the switching activity. Gate-level simulation is only required for power characterization of the RTL components. Therefore, the tool is particularly suited for design exploration of behavioral synthesis alternatives, where changes in the design are incremental. The tool comes with a user-friendly TCL/TK-based GUI, as well as a batch-mode interface. It supports two estimation modes: automatic and detailed. In automatic mode, the user is allowed to control a few estimation parameters, mainly regarding the specification of the model cache, the RTL and technology libraries, and the stimuli file (testbench or textual). Besides power estimation of the entire design, the tool also provides the user with a macro-by-macro power breakdown. In detailed mode, the user is presented with a window that allows him/her to control all the estimation details. These include model selection and estimation effort selection. Obviously, estimation carried out in detailed mode yields more accurate results than in automatic mode.
Tool capabilities are currently being enhanced to allow the processing of cycle-accurate RTL descriptions, i.e., descriptions containing synthetic operators and behavioral constructs. Improvements that will enable the tool to account for dynamic effects (e.g., glitching) during estimation are also being considered.


University of Michigan
Two Dimensional Position Detection
A hybrid two-dimensional position sensing system is designed for mouse applications. The system measures the acceleration of hand movements, which is converted into two-dimensional location coordinates. The system consists of four major components: (1) MEMS accelerometers, (2) CMOS analog read-out circuitry, (3) an acceleration magnitude extraction module, and (4) a 16-bit RISC microprocessor. Mechanical and analog circuit simulations show that the designed padless mouse system can detect accelerations as small as 5.3mg and operate up to 18MHz. We will demonstrate the functionality of each module using either mechanical or electrical simulation tools.
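
Converting acceleration into position amounts to integrating twice over time. The sketch below shows one simple way to do this with the trapezoidal rule; it is an illustration under stated assumptions, not the algorithm run on the system's microprocessor. The sampling period `dt`, the units, and the function name are hypothetical, and a practical design would also need drift and offset compensation, which is omitted here.

```python
def integrate_position(samples, dt):
    """Double-integrate 2D acceleration samples into positions.

    samples: list of (ax, ay) pairs in m/s^2, taken every dt seconds.
    Assumes the device starts at rest at the origin.
    Returns a list of (x, y) positions, one per sample.
    """
    vx = vy = x = y = 0.0
    prev_ax, prev_ay = samples[0]
    positions = [(x, y)]
    for ax, ay in samples[1:]:
        # Trapezoidal rule: acceleration -> velocity.
        new_vx = vx + 0.5 * (prev_ax + ax) * dt
        new_vy = vy + 0.5 * (prev_ay + ay) * dt
        # Trapezoidal rule again: velocity -> position.
        x += 0.5 * (vx + new_vx) * dt
        y += 0.5 * (vy + new_vy) * dt
        vx, vy = new_vx, new_vy
        prev_ax, prev_ay = ax, ay
        positions.append((x, y))
    return positions
```

For a constant acceleration the result matches the closed form x = a t^2 / 2, which gives a quick sanity check for any implementation of this step.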


University of Wisconsin - Madison
Lsim/p: power simulation
As semiconductor technology scales down, the leakage power will soon become comparable to the dynamic power. To reduce both dynamic and leakage power, power gating in addition to clock gating should be used, because clock gating saves only dynamic power. The knowledge of maximum current is needed to design reliable circuits using power gating. We will explain the method to estimate such current, and demonstrate a cycle-accurate architecture level simulation tool integrating such current (power) estimation.


University of Wisconsin-Madison
SINO/SPR: RLC net synthesis
The following algorithms will first be presented: (i) min-area simultaneous shield insertion and net ordering (SINO) under a given RLC noise bound; (ii) formula-based interconnect area estimation for optimal SINO solutions; and (iii) min-area simultaneous signal and power routing under a given RLC noise bound. Then an integrated toolset containing the above algorithms and SPICE netlist generation will be demonstrated. Both the algorithms and the demonstration will also be available at http://eda.ece.wisc.edu/tools.html.


University of Wisconsin-Madison
WebHenry
It is evident that on-chip inductance becomes increasingly important for interconnect design and verification. We will first present an efficient inductance extraction model and a closed-form solution for RLC noise screening. We will then demonstrate an on-line tool incorporating the above methods. The online tools can be found at http://eda.ece.wisc.edu/tools.html


NCSU
Distributed Networked Design
We introduce a `universal client (OmniFlow)' whose GUI can be readily configured by the user to invoke any number of applications, concurrently or sequentially, anywhere on the network. The design and the implementation of the client is based on the principles of taskflow-oriented programming, whereby we merge concepts from structured programming, hardware description, and mark-up languages. A mark-up language such as XML supports a well-defined schema that captures the decomposition of a program into a hierarchy of tasks, each representing an instance of a blackbox or a whitebox software component. The HDL-like input/output port definitions capture data-task-data dependencies. A highly interactive hierarchical GUI, rendered from the hierarchical taskflow descriptions in extended XML, supports structured programming language constructs to control sequences of task synchronization, execution, repetition, and abort. Experimental evaluations of the prototype, up to 9150 tasks and the longest path of 1600 tasks, demonstrate the scalability of the environment and the overall effectiveness of the proposed architecture for a number of networked design and computing projects.


Polytechnic Institute of Turin - Italy
HW/SW Design
We will start from a Simulink description of the PID controller. This description will not use the standard Simulink block set but a customized one, which allows additional data to be supplied, such as the implementation, which can be mainly software, hardware or none, several signal attributes, and so on. We will be able to perform a full Simulink simulation of this system and, after parameter tuning and partitioning of the system implementation, we will show how we can predict some performance indexes, such as the area occupation and the maximum sampling frequency of the hardware subsystem, and the code length and speed of the software subsystem. Once the functionality and the performance obtained by simulation and estimation are satisfactory, we will perform VHDL/C code generation for the target system we use. The generated code will be compiled and the resulting binary code will be uploaded to the board (with a DSP and an FPGA IC on board), linked to a brushless motor by a power driver. At this point we will be ready to see the system working. It will be possible to show how changing parameter values affects the simulation results and the behaviour of the real system.
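
For readers unfamiliar with the controller being partitioned, the sketch below shows a textbook discrete PID loop. It is not the generated VHDL/C code or the Simulink model from the demo; the gains, the sampling period `dt`, and the first-order plant in `simulate` are hypothetical values chosen only to show how changing parameters changes the closed-loop response.

```python
class PID:
    """Textbook discrete PID controller with a fixed sampling period dt (seconds)."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        """Compute the control output for one sample."""
        error = setpoint - measurement
        self.integral += error * self.dt                  # rectangular integration
        derivative = (error - self.prev_error) / self.dt  # backward difference
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid, setpoint, steps):
    """Closed loop with an assumed first-order plant dy/dt = u - y (forward Euler)."""
    y = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, y)
        y += pid.dt * (u - y)
    return y
```

Calling `simulate(PID(2.0, 1.0, 0.0, 0.01), 1.0, 3000)` runs 30 simulated seconds of the loop; rerunning it with different gains is a small-scale version of the parameter experiments described above.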


UCLA
TRIO/IPEM
Our demo consists of two parts -- TRIO and IPEM. We have outlined our plan for both demos before.
TRIO
----
o demonstrate the usage of the tool to synthesize topologies
o usage of the tool to perform interconnect optimization through buffer insertion, wire sizing, device sizing, etc.
o show the usage of a graphical interface for easy usage of the tool and viewing results
IPEM
----
o usage of IPEM to plan and estimate interconnect performance based on repeater insertion, optimal wire sizing, etc.
o interface functions to IPEM
o GUI for IPEM
We will demonstrate the MCM layout system, including the performance-driven multi-layer global router, the crosstalk-noise-constrained pseudo pin assignment, the multi-layer gridless detailed router and the post-layout layer assignment. We will both demonstrate the results and run some examples on the spot.


Norwegian University of Science and Technology
STOREQ CAD tool
The planned demonstration is described in the STOREQ user's manual (http://www.fysel.ntnu.no/~pgk/STOREQ-Users-Manual.pdf). It will show the DAC attendees how our tool guides the designer towards achieving an optimized end product through a focus on data transfer and storage at the highest system levels. Currently STOREQ is a stand-alone tool, but work is under way to interface it with the ATOMIUM tool being developed at IMEC in Belgium. For more details on the methodology, see paper 23.4 and its references.


University of Maryland at Baltimore County
Delay Faults Detection
A delay-fault testing strategy based on the analysis of power supply transient signals is presented. The method is an extension to a Go/No-Go device testing method called Transient Signal Analysis (TSA). TSA detects defects through the analysis of a set of power supply transient waveforms in the time or frequency domain, e.g., Fourier Phase components. A recent extension to TSA demonstrates a correlation between the Fourier Phase components and path delays in defect-free devices. The method proposed here is able to track increases in delay due to resistive shorting and open defects using a similar technique. In particular, we demonstrate that a delay defective device can be distinguished from a defect-free device through an anomaly in the Fourier Phase correlation profile of the device. Simulation experiment results show that the method is additionally capable of predicting the magnitude of the additional delay in some cases.
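
The link between Fourier phase and delay can be seen in a small numerical experiment: delaying a sampled waveform by d samples (circularly) shifts the phase of DFT bin k by -2*pi*k*d/N. The sketch below demonstrates only this general property; it is not the TSA procedure or its correlation profile, and the function names and test waveform are assumptions.

```python
import cmath
import math

def dft_phase(signal, k):
    """Phase (radians) of DFT bin k of a real-valued sample list."""
    n = len(signal)
    coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                for i, x in enumerate(signal))
    return cmath.phase(coeff)

def phase_shift(reference, test, k):
    """Phase of `test` relative to `reference` at bin k, wrapped to [-pi, pi)."""
    d = dft_phase(test, k) - dft_phase(reference, k)
    return (d + math.pi) % (2 * math.pi) - math.pi

def delay(signal, d):
    """Circularly delay a waveform by d samples."""
    n = len(signal)
    return [signal[(i - d) % n] for i in range(n)]

# Hypothetical transient: a smooth pulse sampled at 64 points.
pulse = [math.exp(-((i - 20) / 4.0) ** 2) for i in range(64)]
```

Comparing `phase_shift(pulse, delay(pulse, 3), 1)` with `-2 * math.pi * 3 / 64` shows the two agree to numerical precision, so a phase anomaly at low-order bins is a plausible indicator of added delay.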


University of Tokyo
Circuit Transformation
In this demonstration, we show a new methodology to integrate multiple circuit transformations and routing processes. More specifically, this demonstration shows ways to utilize multiple choices of circuit transformations in routing processes. First, we introduce a new logic representation that implements all possible wire reconnections implicitly by enhancing global flow optimization techniques. Then we present two approaches for performing routing and wire reconnection simultaneously: an exact approach and a practical approach with commercial P/R tools. Since our methods take into account multiple circuit transformations during the routing phase, where accurate physical information is available, we can obtain better results than conventional routing tools. In addition, we can succeed in routing even if other routers, such as rip-up and reroute methods, fail. We built a prototype system that implements the methods, and preliminary results will be demonstrated.


University of Maryland at Baltimore County
3G SOC Design and Test
With recent advancements in wireless technologies such as Bluetooth and 802.11, the design and testing issues in implementations of these protocols on a single piece of silicon are challenging traditional methods. Important issues related to a single-chip implementation include the capability of the chip to reconfigure itself to use alternate wireless communication protocols, interference and crosstalk between modules, interoperability, and controllability and observability of nodes in the embedded cores.


University of California, Irvine
EXPRESSION
The EXPRESSION project, which aims to demonstrate a working prototype of a framework that generates a software toolkit consisting of a retargetable compiler and a cycle-accurate simulator, will be demonstrated at the booth. This will include both a demonstration of the software and posters explaining the techniques.


University of California, Irvine
SPARK: High-Level Synthesis
The presentation will focus on showing how parallelizing compiler techniques can be applied to improve the results of high-level synthesis of control-intensive designs. Some of the aspects covered will be:
* aggressive code motion techniques and their
  - effects on performance
  - effects on area and controller complexity
* loop transformations for high-level synthesis
* scheduling under system-level timing constraints
* control synthesis and optimization
* the synthesis system flow
  - the modular and configurable aspects of Spark
We will show the results of such transformations on significant segments of industrial-strength benchmarks (e.g. MPEG, ADPCM).


Technical University of Munich
WiCkeD 3
We intend to perform a software demonstration of the tool "WiCkeD 3". We will provide one or two circuits to exemplify the procedures of design, modeling and yield optimization. Conference attendees will have the chance to discuss tool and algorithms with the developers. Some of the algorithms implemented in "WiCkeD 3" will be explained on a poster at the booth. Analog and mixed-signal (AMS) circuits play an important role in modern applications of communication systems, multimedia, memory, or automotive. Since design of analog devices is less automated than design of digital circuits, their development takes a large part of overall design time and cost of mixed-signal products. Performance capabilities and parametric yield of mixed-signal circuits are often limited by their analog components. Besides simulation and layout, sizing is the most time-consuming and tedious task. By choosing values of designable parameters like transistor geometries, the designer tries to maximize the parametric yield of the circuit considering operating conditions like temperature range, and random process fluctuations like oxide thickness variation or Vth mismatch. Due to the complexity of this problem and the growing influence of process fluctuations with shrinking feature sizes, manual sizing has become a bottleneck for design time and design quality. Most tools that try to automatically size lack the designer's intuitive, structural knowledge of the circuit's behaviour and often generate sizings that are technically unreasonable. "WiCkeD 3" however solves this problem by rigorously taking into account structural constraints, which vastly improves the quality of models and optimization results. The tool "WiCkeD 3" is a framework for scientific research. It serves as a common platform for a set of different research projects at our institute including sizing, modeling, tolerance analysis and test design. 
To the researcher, the tool offers a stable implementation of simulator interfaces, database access to previously performed simulations, simultaneous simulation on a network of hosts, and persistent storage of algorithm-specific data. Its modular structure and powerful script interfaces make it possible to quickly implement and evaluate new algorithms that would take months to develop from scratch. To the designer, "WiCkeD 3" offers a graphical user interface to a large set of state-of-the-art algorithms for analysis and sizing of analog circuits. The analyses provided are sensitivities, performance and parameter dependencies, Monte-Carlo based yield estimation including operating range influence, and deterministic worst-case and yield analysis. Advanced optimization algorithms for both nominal design and design centering are included; the most recent one is presented at the conference (Session 50, Paper 3). "WiCkeD 3" is in regular use by industrial partners for the design of amplifiers, filters, digital IO cells and other circuits.


Universitaet Hannover, Germany
Programmable Parallel Multimedia-DSP
Poster presentation of the VLSI design of, and applications for, a second-generation programmable parallel multimedia DSP (HiPAR-DSP 16). In addition, chip samples of the first and second generation will be displayed. The results of the design project are twofold. First, a number of students worked on C++ modeling, VHDL design, synthesis, physical layout, and verification, as well as on the development of DSP software tools, promoting the education of experts in the field of processor architecture and VLSI design. Second, an ambitious DSP project with various challenges was mastered under university conditions.


Universidad Nacional de Ingenieria
Fast Prototyping
We will feature five works that demonstrate tools, techniques, and hardware-based networks. They are described below.
- MOLEWARE
We begin with a recent study of molecular-scale electronics, then demonstrate our Q-Model, a general nanoscale modeling technique that allowed us to develop MOLEWARE, a tool that automates the simulation and specification of the synthesis-based process steps used to build such systems. Since no specification files have yet been defined for this early technology, we extend some of the known standard specifications and implement others.
- HDL/Scan-Based Fault Injection and Evaluation Technique for FPGA-Based Systems
This section shows an HDL/scan-based fault injection technique that is implemented in an automated tool flow driven through a GUI. The flow validates (or fails to validate), in the early stages of the design process, a system implemented in a specified technology. This technique, in combination with others, is attracting growing interest from the community because of the issues involved in critical space and ground systems built in DSM technologies. This evaluation service can be integrated through a Unilinx Service, explained below.
- ProtoFast / JBench Virtual Tester / FPGA2ASIC Migrator
* Module I generates independent HDL testbenches from Java so that validation can also be performed at the HDL level. It first executes and tests the system's Java model; then, after the bytecodes have been partitioned and synthesized into hardware and software, the generated hardware blocks can be tested at the lower HDL level. Distributed Java testbench verification can be run through a Unilinx Service, as explained below. The same module integrates with a GUI Architecture Factory which, combined with the Java-HDL testbench automation technique, allows us to explore other design spaces rapidly.
* Module II generates an SVF file for each design, specified by a constraints file (*.ucf), test vectors (*.tv) generated previously in HDL simulation using the generated testbench, and *.bsd (BSDL) and *.hsd (HSDL) files for the target device/board implementation. In other words, you can change prototyping platforms in seconds without redesign, which is a must for Unilinx Multi-Prototyping-Platform Services.
* Module III streams test vector databases in standard bitstream formats such as SVF, rather than performing ATPG at the HDL and EDIF level. Comparison, evaluation, and diagnostic tasks can also be requested.
* Module IV, Design Migration from ASIC to FPGA, helps port VHDL code from FPGA designers to other subsets of VHDL used in specific tool sets (e.g. Alliance Academic tools).
- SuperJDrive 1532
This tool is a modification of the IEEE Std. 1532 JDrive Engine, a JTAG-based driver, that adds support for concurrent programming and scheduling in a production line or any other high-end remote test environment. It is the first tool in a family that supports these features. Here we also demonstrate a study of a 1532 master driver, which is in turn developed as a Unilinx Service and can work in parallel with a Java-based SVF engine.
- Unilinx Network
Unilinx is a network of devices/services (clusters), together with the automation that connects these clusters to a Jini-hybrid network (Salutation & Bluetooth). The work gives a proof of concept and implements a practical remote display system (services and networks). This network, based on the API for Boundary Scan, the RMI (Remote Method Invocation) API, and the Java-Spaces API, allows the integration of design and fast prototyping tools, hardware, and services with the Internet.
Since this plays an important role in the unification of the hardware and software development processes, we are calling for an initiative called PROTOWARE, which will bring to industry concepts such as: Scanlets for distributed processing, monitoring, test and upgrade design; remote virtual prototyping; fault detection and isolation; and field diagnostics, upgrade, service, and repair of on-board and embedded products in a distributed and autonomous fashion.


Carnegie Mellon University
SirSim
I will demonstrate the basic capabilities of SirSim by walking through several simple examples -- and then show performance on a larger benchmark.


UCLA
Dragon
During the demonstration, a high-quality standard-cell placement tool called Dragon will be shown. Dragon combines multilevel partitioning and simulated annealing techniques on a global bin framework. It mainly focuses on wirelength minimization for large-scale standard-cell designs. Dragon's placement quality on large benchmarks, in terms of total bounding box wirelength, is the best among all published academic work. It is also a fast placer. With a different speed parameter setting, it can achieve a 5 times speedup with only a 10% loss of solution quality, e.g., finishing a 100K cell design within 1 hour. A new technique in Dragon is congestion estimation and reduction. During the placement process, Dragon can estimate the congestion distribution for the current placement state. This prediction function is useful for logic synthesis since it provides early routability information. Dragon also implements a post-processing step to reduce congestion on a placed design. This is done by local congestion improvement with global congestion knowledge. The congestion reduction function greatly improves the routability of the design. Congestion maps during placement, after placement, and after global routing will be shown.
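For readers unfamiliar with the metric, total bounding box wirelength can be sketched as follows: for each net, take the half-perimeter of the smallest rectangle enclosing its pins, then sum over all nets. This is a minimal sketch, not Dragon's code; the cell names, coordinates, and the simplification that each pin sits at its cell's position are assumptions made for the example.

```python
def bounding_box_wirelength(pins):
    """Half-perimeter of the bounding box around a net's (x, y) pin positions."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_wirelength(placement, nets):
    """Sum the bounding box wirelength over all nets of a placement."""
    return sum(bounding_box_wirelength([placement[cell] for cell in net])
               for net in nets)


# Hypothetical placement: cell name -> (x, y); pins are assumed to sit at the cell.
placement = {"a": (0, 0), "b": (3, 4), "c": (1, 2), "d": (6, 1)}
# Hypothetical netlist: each net lists the cells it connects.
nets = [["a", "b"], ["b", "c", "d"], ["a", "d"]]

total = total_wirelength(placement, nets)
print(total)  # 7 + 8 + 7 = 22
```

A placer that minimizes wirelength would evaluate a cost of this kind for candidate moves and keep those that reduce it.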


Carnegie Mellon University
TrailBlazer
TrailBlazer: Direct Transistor-Level Layout for Digital Blocks
Reference: http://www.ece.cmu.edu/~prakashg/trailblazer/
Contacts: Faculty: Rob A. Rutenbar; Student: Prakash Gopalakrishnan
We present a complete transistor-level layout flow, from logic netlist to final shapes, for blocks of combinational logic up to a few thousand transistors in size. The direct transistor-level attack easily accommodates the demands for careful custom sizing necessary in high-speed design, and is also significantly denser than a comparable cell-based layout. The key algorithmic innovations are (a) early identification of essential diffusion-merged MOS device groups called clusters, but (b) deferred binding of clusters to a specific shape-level layout until the very end of a multi-phase placement strategy. A global placer arranges uncommitted clusters; a detailed placer optimizes clusters at shape level for density and for overall routability. A commercial router completes the flow. Experiments comparing to a commercial standard cell-level layout flow show that, when flattened to transistors, our tool consistently achieves 100% routed layouts that average 23% less area.
Demo will include the following:
1. TrailBlazer: A running version of the tool will be demonstrated.
2. Poster: Including details of new algorithms, results and layouts comparing to commercial standard cell-level flow.
3. Student Presence: To answer any questions related to flow/algorithms.


Univ. of Wisconsin-Madison
Fast Circuit Simulator
As the number of transistors on a chip increases rapidly, P/G networks and interconnect are becoming more and more complicated. Transistor-level simulation of these structures is not practical because of limits on CPU run time and memory usage. Problems of this kind are usually modeled with linear elements (RLC). However, general-purpose circuit simulators such as SPICE are still not efficient enough. Our software provides a fast solution to these linear systems and is much faster than SPICE3.


University of Michigan
Wavelet-based video compression
The proposed demonstration is for our design entitled "A Configurable, Algorithm-Specific Processor for Real-Time Wavelet-Based Video Compression/Decompression", by Li Ding, Yi Li and Richard B Brown. This project has been awarded 2nd Place in the Student Design Contest (conceptual category). An abstract of the project is as follows. A real-time video codec based on wavelet transformation is designed and implemented using the TSMC 0.25um process. It incorporates a configurable pipelined vector processor, which is up to two orders of magnitude faster than general-purpose microprocessors for motion estimation, and a dedicated module capable of both forward and inverse wavelet transformation. This chip consumes 1.3 W at 100 MHz and is capable of handling real-time duplex coding and decoding of CIF format videos. The demonstration will be mainly poster-based, with some computer-aided demo.


SUNY Binghamton
The Feng Shui Physical Design Tools
Demonstration of placer, global and detail router


National Tsing Hua University
Internet-based simulation
We will show a demonstration of our tool over the Internet. An Internet-based concurrent-simulation scheme helps to ease the IP evaluation process between IP vendors and users. Complex system-on-a-chip design requires more and more IP modules from 3rd party vendors. What the vendor can disclose without impairing its trade secrets and what the user needs to examine to gain a satisfactory level of confidence are in conflict with each other. Via PLI interface functions and Internet protocols, our proposed software enables HDL simulators (Verilog) residing at both the vendor's and the user's sites to concurrently simulate the IP and SOC together. Only the stimulus and response defined in the IP's module I/O are exchanged between the sites. Therefore, the vendor need not create a functional model (or encrypted code) for the IP, while the user is assured that what he/she simulates is what he/she will purchase. Apart from the simulation speed degradation due to communication overhead, the SOC design/debug process is exactly the same as if the IP were in the user's hands. Our contribution will help all IP providers expose their IPs to all potential users without human intervention and without concern about infringement of IP rights. More details at http://nthucad.cs.nthu.edu.tw/~op/cgi-bin/TIE/main.htm


Seoul Nat'l Univ.
Emulation-based Verification
Emulation-Based Coverification System for Hardware-Software Codesign: Demonstration of H.263 Encoder Coverification
Cycle-accurate co-verification is often the most time-consuming process in the co-design of hardware-software systems. To enable fast co-verification of hardware-software systems, we developed the EVEREST (Emulation-based VERification environment for Embedded SysTems) verification system, which is capable of verifying complex hardware-software systems by both simulation and emulation. The major features of EVEREST include optimistic simulation and emulation of the software part for fast co-verification, a flexible co-emulation architecture based on CES (Co-Emulation Server), and a consistent user interface for debugging simulation, emulation, and prototyping at various levels of abstraction. Demos will be prepared for three kinds of verification scenarios for an H.263 encoder example. An H.263 encoder partitioned into software and hardware will be verified using EVEREST. First, the software part is simulated using the cycle-accurate instruction set simulator embedded in our in-circuit debugger, and the hardware part is emulated using our hardware emulator. For synchronous co-verification, the two parts are connected and co-emulated by CES. Second, the software part is executed on a real processor and debugged via the JTAG ports while the hardware part is still executed in our hardware emulator. Third, the whole system is prototyped on our board. In the first two scenarios, the optimistic co-emulation results will be compared with a conventional approach. The major components of the EVEREST system are an in-circuit debugger for software debugging, a hardware emulator for hardware verification, and CES for connecting the two. The gdb-based in-circuit debugger supports non-intrusive (requiring no target system resource) in-circuit debugging using the processor's JTAG (IEEE standard 1149.1) interface. 
It also supports optimistic emulation of real processors as well as optimistic simulation, which is crucial for hybrid co-emulation and co-simulation, thereby reducing total co-verification time dramatically. The hardware emulator is FPGA-based, which can be a very cost-effective solution for the medium-to-small sized logic verification typically found in many embedded systems designs. CES basically connects simulators and emulators for cycle-accurate co-simulation and co-emulation. It works for memory-mapped interfaces, by monitoring memory access addresses and passing appropriate messages to the hardware and software sides. By employing CES, the organization of the co-emulator is greatly simplified, and any other simulators or emulators can easily be connected by a small modification of the memory access parts. CES can both observe and control the messages going through the channel. As for prototyping, we have made a prototyping board containing a processor with an ARM7TDMI core and Virtex FPGAs for hardware emulation and board control. The above-mentioned verification tools for software debugging and hardware emulation can also be used for verifying/debugging the design being prototyped. Currently, the in-circuit debugger supports only ARM7-based processors, but it may easily be extended to other processors or DSPs supporting a JTAG-based debugging interface. All the above-mentioned tools run on a Windows platform. In summary, the EVEREST verification system provides efficient verification tools for designing complex embedded systems. It supports hybrid co-simulation and co-emulation by executing the software parts optimistically, which becomes more effective as the software parts grow bigger.


Seoul National University
Low-power Software Design
Power consumption has emerged as one of the most important performance metrics in digital systems. Among the range of power reduction techniques, high-level power optimization is useful for complex digital systems, including microprocessor-equipped systems. Appropriate power consumption models are mandatory for high-level power reduction practices because high-level power optimization techniques do not deal with physical designs. At the last DAC University booth, we demonstrated in-house hardware and software tools that can measure and analyze cycle-accurate energy consumption. Our technique guarantees accuracy when we use a sampling rate of twice the clock frequency under the spiky current draw common in digital systems. It acquires the energy consumption profile in real time without repeated operation of the target systems. We have continued to enhance our tool in both hardware and software aspects. We added more target devices, including flash memory, and developed a PC card-type ARM7TDMI board that is equipped with 16MB of vector memory in order to measure energy consumption with real application programs such as the Linux operating system. The PC card has a high-speed PCI local bus bridge and enables real-time data acquisition without stopping the target program. The PC card runs with a source-level debugger that has a GNU debugger-like look and feel. Software designers are able to debug C source code or assembly code in which each line of source code is associated with energy consumption information. The enhanced version can collect all external bus signals. This feature enables us to provide an energy debugger environment that supports whole systems, including various peripheral devices. 
Our demonstration includes real-time energy measurement and characterization of a microprocessor and memory devices (as shown at the last DAC University booth), additional devices, new PC card-type energy debugger hardware, GNU debugger-like source-level energy debugging software, and some applications. For more information, please visit http://lowpower.snu.ac.kr


Chalmers University of Technology (Gothenburg, Sweden)
Lava HDL
I plan to demonstrate the interactive Lava system and show how to describe hardware in it. I will then show how to prove properties about circuits, and discuss future features, such as proving properties of parameterized circuits instead of instances of them.


Pontificia Bolivariana (UPB)
Video acquisition chip
Because of the volume of data to be manipulated, image processing typically implies either very slow operation or demanding hardware requirements if satisfactory results are to be obtained. Programmable logic arrays make it possible to build a high-speed image capture and processing system that combines the flexibility of a general-purpose DSP with the speed of dedicated hardware. The characteristics of the chip make it possible to address requirements that currently cannot be met simultaneously at the commercial level: minimal hardware requirements; optimized use of space, with a direct effect on the aesthetic presentation; low power consumption; and the highest operating frequencies, guaranteeing excellent performance and introducing the concept of real-time execution. The captured signal is easily processed by the user, who selects options through a keyboard that activates algorithms for presenting pixels on screen. The effects obtained in this way include color elimination, freeze frame, negative, image averaging, and binarization: in short, a small sample of simple real-time image manipulation. Throughout the presentation, a practical demonstration shows what has been achieved (the items described above) after arduous research work on this subject.


University of California, Irvine
IMPACCT
Title: A Power-aware Scheduling Tool for System-Level Power Management
We present a new scheduling tool for system-level power management in embedded systems. This tool addresses the following key issues that are not adequately addressed by previous works. First, the new generation of embedded systems must be designed to be power-aware, rather than just low-power. Second, the power management decisions must be made at the system level, rather than only at the component level. Power-aware systems are those that must not only minimize power when the power budget is low, but also deliver high performance when required. They subsume traditional low-power systems as special cases. The power managers must also track the availability of all power sources and the power usage of all consumers in application-specific operating conditions. These conditions include the different energy models from expensive (e.g., non-rechargeable battery) to free (e.g., solar source), and the variance characteristics of the power consumers in the changing environment. We believe that power-aware designs must be done at the system level, not just at the component level. Amdahl's law applies to power as well as performance. That is, the power saving of a given component must be scaled by its percentage contribution in an entire system. Thus, it is critical to identify where power is being consumed in the context of a system; and a successful power manager must consider both computation domains (e.g., embedded processors, memory) and non-computation domains (e.g., mechanical and thermal subsystems) and coordinate their power usage as a whole system. We present a prototype of a novel power-aware scheduling tool that supports system-level power management. Our underlying application model captures timing constraints on communicating tasks, as well as min/max power constraints on the entire system. 
The core scheduler produces a schedule that satisfies stringent timing constraints and the max power budget; it will also make the best effort to meet the min power goal corresponding to the free energy sources. The result is visually presented to the designers in two views. The time view shows the timing sequence of parallel task execution on multiple resources. The power view shows the system-level power curve of the schedule with corresponding attributes, including different energy sources, major power consumers, expensive energy vs. free energy, etc. Our tool can effectively automate the design space exploration with different energy/performance trade-offs. Moreover, the designers can intervene manually in the automated scheduling process by imposing additional timing constraints in the time view, while observing the results in the power view interactively. Our work is motivated by the NASA/JPL Mars rover. It features several interesting problems that cannot be effectively handled by traditional low-power techniques. First, the rover has different energy sources: a non-rechargeable battery and a solar panel. Second, the major power consumers do not even include the embedded processors; they consist instead of mechanical motors and thermal heaters. By applying the power-aware scheduling techniques to this application, our tool can produce alternative designs that achieve both improved performance and reduced energy cost simultaneously. This tool forms the basis of the IMPACCT system-level framework that will enable designers to explore many power-performance trade-offs with confidence.
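The Amdahl's law argument above can be made concrete with a small calculation: the system-level saving from optimizing one component is its share of total power multiplied by the fraction saved within that component. The sketch below is only an illustration; the power shares and savings are hypothetical numbers, not measurements from the rover or from the IMPACCT tool.

```python
def system_saving(share, component_saving):
    """Fraction of total system power saved by optimizing one component.

    share: the component's fraction of total system power (0..1).
    component_saving: fraction of that component's own power removed (0..1).
    """
    return share * component_saving


# Hypothetical power breakdown of a system (fractions of total power).
shares = {"processor": 0.10, "motors": 0.60, "heaters": 0.30}

# Halving processor power versus trimming motor power by 20%.
cpu_gain = system_saving(shares["processor"], 0.50)
motor_gain = system_saving(shares["motors"], 0.20)

print(f"processor: {cpu_gain:.0%} of system power saved")
print(f"motors:    {motor_gain:.0%} of system power saved")
```

Under these assumed numbers, a modest improvement in the dominant consumer outweighs an aggressive improvement in a minor one, which is why the power manager has to look at the whole system.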


University of California, Irvine
Component Based Design Framework
The design and reuse of hardware system-level components written in C++ are ad hoc and tedious because of the strong emphasis on inheritance as the basic composition mechanism. In the demonstration, we show the BALBOA CAD system, which relies on reuse by dynamic composition rather than reuse by inheritance. A system architect uses this system to build hardware and system-level models by assembling IP library components defined in C++ and SystemC. The key feature of the tool is that system architects do not have to write or modify C++ code directly, but rather use a Component Integration Language (CIL, an extension to OTcl) in which connectivity and typing are abstracted. In this demonstration, we show how to use the BALBOA CIL for the integration of an AMRM Adaptive Memory System (http://www.ics.uci.edu/~amrm/) from IP library components defined in C++ with SystemC. We show the efficiency of the CIL in performing communication refinement and how the typing and connectivity abstractions help the system architect to focus on the essential tasks of architectural exploration. This demonstration should be of interest to DAC attendees because it describes a strategy for efficient high-level modeling of hardware and systems with C++. It demonstrates that C++ should be used to define IP components, but that these components should be manipulated at a higher level of abstraction to hide details and guide the composition. This abstraction layer is implemented through the CIL. There is also design automation in this layer to resolve the typing and connectivity abstractions. A moderately complex design example will also show how the use of the CIL reduces the size of the model.


Princeton Univ.
Chaff SAT Solver
We plan to demonstrate our newly developed Boolean Satisfiability Solver called Chaff. The Chaff solver was developed by Prof. Sharad Malik's group at Princeton University. In our demonstration, we will show the SAT solver working on various SAT problems generated from real EDA applications as well as some artificial instances. The details of the SAT solver itself are described in session 33.1, Engineering a (super) fast SAT solver. Interest in SAT solvers has resurged in the last couple of years because of developments in SAT-based Model Checking and Bounded Model Checking. We developed the SAT solver Chaff to be fast on structured problems from real-world applications. The demo will show the strength of our solver in comparison with other state-of-the-art solvers.
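As background on the problem being solved, the sketch below shows a textbook DPLL-style satisfiability search with unit propagation over CNF clauses. It is not Chaff's implementation and makes no attempt at its performance; the clause encoding (signed integers for literals) and the function names are assumptions made for the example.

```python
def simplify(clauses, literal):
    """Make `literal` true: drop satisfied clauses, shrink the others.

    Returns None if some clause becomes empty (a conflict).
    """
    result = []
    for clause in clauses:
        if literal in clause:
            continue  # clause already satisfied
        reduced = [lit for lit in clause if lit != -literal]
        if not reduced:
            return None  # every literal in this clause is false
        result.append(reduced)
    return result


def dpll(clauses, assignment=None):
    """Return a satisfying {variable: bool} mapping, or None if unsatisfiable."""
    assignment = dict(assignment or {})
    if any(len(clause) == 0 for clause in clauses):
        return None
    # Unit propagation: a one-literal clause forces that literal to be true.
    while True:
        units = [clause[0] for clause in clauses if len(clause) == 1]
        if not units:
            break
        assignment[abs(units[0])] = units[0] > 0
        clauses = simplify(clauses, units[0])
        if clauses is None:
            return None
    if not clauses:
        return assignment  # all clauses satisfied
    # Branch on the first literal of the first remaining clause.
    literal = clauses[0][0]
    for choice in (literal, -literal):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            model = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if model is not None:
                return model
    return None


def satisfies(clauses, model):
    """Check that every clause has at least one literal made true by `model`."""
    return all(any(model.get(abs(lit)) == (lit > 0) for lit in clause)
               for clause in clauses)


# (x1 or x2) and (not x1 or x3) and (not x2 or not x3)
formula = [[1, 2], [-1, 3], [-2, -3]]
model = dpll(formula)
print(model, satisfies(formula, model))
```

Variables that never need a value are left out of the returned mapping; any value for them keeps the formula satisfied.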


UC Berkeley
Ptolemy II
Ptolemy II is a high-level component-based design environment for modeling embedded systems. We demonstrate a variety of component-based system models created using graphical block diagrams. These models are specified using dataflow diagrams, finite-state machines, differential equations, three-dimensional scene graphs, and other useful styles of component interaction. Ptolemy II is unique in its ability to allow these different types of diagrams to be combined hierarchically. This results in systems that are specified more completely, concisely and understandably than would be possible otherwise. We demonstrate models of signal processing, communications, and control systems. We will also show how Vergil, a user interface for Ptolemy II, can now be used to create these models using graphical diagrams. http://ptolemy.eecs.berkeley.edu


University of California, Los Angeles
Strategically Programmable System
During the demonstration, an automated architecture generation tool and a high-level synthesis tool targeting the generated systems will be presented. The architecture generation tool renders a novel programmable architecture called the strategically programmable system (SPS). SPS is a programmable system customized for a given family of applications. The customization is achieved by embedding optimized fixed blocks within a fully reconfigurable fabric. The embedded fixed blocks are chosen according to the needs of the given application set. For a given application set, a pattern extraction task is performed first. In this phase, critical/common operations or clusters of operations are identified. The extracted patterns are displayed on data flow graphs created for each application. Statistical data on the frequencies and distribution of operations and operation patterns are also collected. The collected data is input to the core of the architecture generation. Based on this data, decisions are made regarding which types of fixed blocks to include on the chip. The second part of the demonstration shows how a given application is mapped onto one instance of the SPS architecture. The tool that we will demonstrate performs the scheduling and binding of operations on a given architecture configuration.


KULeuven
RF Mixer Circuits
The demo will show a graphical representation of all the steps in the optimization process of an RF mixer circuit, in which hand calculations, layout generation, and parasitic extraction are combined with the power of the Differential Evolution Algorithm to give a fast and flexible optimization routine. All the steps of the optimization algorithm will be shown graphically. The optimization demo is centered around a graphical interface showing the current status of the optimization parameters and the evolution of the cost function. The process of layout generation and parasitic extraction, which is an integral part of the optimization, will also be demonstrated.
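For readers who have not met Differential Evolution, the sketch below shows the classic DE/rand/1/bin scheme minimizing a cost function. It is a generic illustration, not the demonstrated tool: the quadratic cost stands in for a real circuit evaluation (hand calculations, layout generation, parasitic extraction), and all parameter names, bounds, and settings are assumptions.

```python
import random


def differential_evolution(cost, bounds, pop_size=20, generations=200,
                           f=0.6, cr=0.9, seed=0):
    """Minimize `cost` over box `bounds` with the DE/rand/1/bin scheme."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    costs = [cost(member) for member in population]
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct members other than the current one.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one mutated coordinate
            trial = []
            for d in range(dim):
                if d == forced or rng.random() < cr:
                    value = population[a][d] + f * (population[b][d]
                                                    - population[c][d])
                    lo, hi = bounds[d]
                    value = min(max(value, lo), hi)  # clip to the bounds
                else:
                    value = population[i][d]
                trial.append(value)
            trial_cost = cost(trial)
            # Greedy selection: keep the trial only if it is no worse.
            if trial_cost <= costs[i]:
                population[i], costs[i] = trial, trial_cost
    best = min(range(pop_size), key=costs.__getitem__)
    return population[best], costs[best]


def toy_cost(params):
    """Hypothetical stand-in for a circuit cost; minimum at (3.0, 1.0)."""
    width, length = params
    return (width - 3.0) ** 2 + (length - 1.0) ** 2


best_params, best_cost = differential_evolution(toy_cost,
                                                [(0.0, 10.0), (0.0, 10.0)])
print(best_params, best_cost)
```

Because selection is greedy per member, the best cost in the population never increases from one generation to the next, which is the "evolution of the cost function" a graphical interface would plot.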


University of Technology RWTH Aachen
Flexible Datapath Generator
Given current market constraints, it is highly desirable to combine the significant advantages of a physically oriented design regarding power dissipation, throughput rate, and silicon area with a design effort comparable to that of semi-custom designs. Consequently, a flexible datapath generator (DPG) has been developed that enables the physically oriented design of datapath macros based on a given signal flow graph (SFG) and a small set of optimized leaf cells at a very low design effort. At the booth it will be shown how the DPG exploits the inherent regularity and locality typical of SFGs in digital signal processing. In this way the optimization of silicon area, throughput rate, and especially power dissipation is possible in short iteration cycles down to the physical layout level. The DPG is fully embedded into an industry-standard framework and allows the integration of the generated blocks as hard macros. A detailed description of the DPG-based physically oriented design methodology and industry-related benchmarks are presented at "http://www.eecs.rwth-aachen.de/dpg/info.html". During the presentation we would like to demonstrate the DPG and the resulting design methodology by generating the physically oriented design of a small datapath containing typical DSP basic functions, based on a parameterizable signal flow graph that is designed and modified in interaction with the audience. The DPG run will be interrupted at predefined breakpoints to clarify the idea of deriving abutting basic cells from a small set of leaf cells by powerful DPG routines for layout modification. The presentation will be accompanied by a poster summarizing the DPG design flow and showing industry-related benchmarks.


TIMA -- Grenoble France
SOC Communication Hardware
The demo will show a design flow for the automatic design of hardware-software communication in multi-processor SoCs. The system is initially specified at the system level as an abstract architecture that we call the macro-architecture. This model is composed of a set of modules interconnected through logical wires. Each module represents a processor in the final architecture. This may be a software processor (e.g. a DSP or a microcontroller executing software), a hardware processor (specific hardware) or an existing IP (global memory, peripheral, bus controller, ...). The logical wires are abstract channels that transfer fixed data types (e.g. integer, real) and may implement protocols (e.g. handshake or memory-mapped I/O). The architecture may also include RTL modules. Co-simulation is required to validate the partitioning and the abstract communication. The design flow produces the detailed architecture that we call the micro-architecture. Software modules are mapped onto specific processors. This requires the synthesis of a hardware interface to link the processor to the rest of the architecture. When the software includes parallel tasks, a specific operating system (OS) is synthesized. Hardware blocks are refined to the clock-cycle level. Finally, existing blocks (IPs) are encapsulated within an interface in order to accommodate the final protocols. The communication between the different blocks is made through physical wires that implement the final protocols. The interface block may include controllers and buffering when needed. The demo will show a set of models and tools based on SystemC, including: 1- A target architecture and SystemC-based design representation models that allow for flexible, modular and scalable design of multi-processor systems on chip. 2- A design flow that bridges the gap between system specification and multi-processor system-on-chip implementations. 3- A set of methods and tools that implement the design flow. These include co-simulation and automatic generation of hardware-software interfaces for multi-processor systems on chip.
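
To make the macro-architecture idea more concrete, here is a minimal sketch of two modules connected by one logical wire that carries a fixed data type and uses a request/acknowledge handshake. It is written in plain Python rather than SystemC, and every name in it (HandshakeChannel, Producer, Consumer, simulate) is hypothetical; it illustrates the concept only and does not reproduce the demonstrated models or tools.

```python
class HandshakeChannel:
    """Abstract channel: one fixed data type, request/acknowledge handshake."""

    def __init__(self, dtype):
        self.dtype = dtype
        self._data = None
        self._req = False  # True while a value is waiting to be acknowledged

    def try_put(self, value):
        # The sender may only write once the previous value was acknowledged.
        if self._req:
            return False
        if not isinstance(value, self.dtype):
            raise TypeError("channel carries " + self.dtype.__name__)
        self._data = value
        self._req = True
        return True

    def try_get(self):
        # The receiver reads the value and acknowledges by clearing the request.
        if not self._req:
            return False, None
        value = self._data
        self._data = None
        self._req = False
        return True, value


class Producer:
    """Module that sends a list of items, one per successful handshake."""

    def __init__(self, out, items):
        self.out = out
        self.pending = list(items)

    def step(self):
        if self.pending and self.out.try_put(self.pending[0]):
            self.pending.pop(0)


class Consumer:
    """Module that collects whatever arrives on its input channel."""

    def __init__(self, inp):
        self.inp = inp
        self.received = []

    def step(self):
        ok, value = self.inp.try_get()
        if ok:
            self.received.append(value)


def simulate(modules, steps):
    # Very simple scheduler: every module gets one step per simulation cycle.
    for _ in range(steps):
        for module in modules:
            module.step()
```

For example, creating wire = HandshakeChannel(int), connecting Producer(wire, [1, 2, 3]) and Consumer(wire), and running simulate over both modules for a few cycles moves the three integers across the wire in order. Refinement toward a micro-architecture would replace this abstract channel with physical wires and an interface block that implements the final protocol.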


University Booth Table