GALS @ ETHZ
Series in Microelectronics: Volume 120

Globally-Asynchronous Locally-Synchronous Architectures for VLSI Systems

Abstract

This thesis describes the specification and implementation of a design methodology for globally-asynchronous locally-synchronous (GALS) architectures. Such architectures are a novel approach to the design of complex VLSI systems that suffer from clocking problems and high power consumption. In a GALS architecture the system is partitioned into several independently clocked modules that communicate in a self-timed manner. The functionality of each locally-synchronous (LS) module can thus be described and synthesized using well-established synchronous design flows, and clocking problems are eased by confining them to a moderately sized subsystem. The circuitry necessary to coordinate clock-driven with self-timed operation is contained in an asynchronous wrapper that surrounds each LS module. These wrappers have a modular internal structure that can be assembled from a library of six basic elements: four different port controllers for input and output ports, a memory access controller, and a clock generation unit.

To prevent metastability while transferring data across clock boundaries, the concept employs a pausible clocking scheme, i.e., whenever a data change and a sampling clock edge occur dangerously close to one another, either the clock or the data is delayed. The pausible clock is provided by an on-chip clock generator, which properly arbitrates between numerous concurrent requests for pausing and active clock edges. The frequency of the clock is programmable over a large range of values. Data channels between LS modules, which are established with the basic elements mentioned above, are able to safely transfer over 300 million data words of arbitrary width per second (all measured numbers refer to a 0.25 µm CMOS technology).
The channels have a latency of far less than one clock period, and transmissions in subsequent local clock cycles are possible. These two features and the prevention of metastability distinguish the proposed method from all previous concepts.

The methodology is validated on silicon by realizing a cryptosystem as a GALS architecture. The ASIC encrypts and decrypts data with the SAFER SK-128 algorithm in all standardized block cipher modes. The system contains 58,000 gate equivalents, and scan testability of the synchronous circuitry is fully preserved. This work is the first to demonstrate GALS operation for a system of substantial complexity. A second ASIC implements the same cryptosystem in a conventional globally-synchronous architecture and serves as a reference for comparisons. Both chips have been manufactured on the same wafer. The measured maximum data throughput of the synchronous design is about 30% higher than that of the GALS counterpart. On the other hand, the synchronous chip, which already employs clock gating to save power, dissipates 30% more energy per Mbit of data than the GALS implementation.
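
The pausible clocking idea summarized in the abstract can be illustrated with a small behavioral model. The sketch below is not the circuit from the thesis; it is a simplified Python simulation under stated assumptions: a fixed nominal clock period, a fixed transfer time per request, and an arbiter that lets a request win whenever it arrives before the next scheduled clock edge. The function name `simulate_pausible_clock` and all parameters are hypothetical and chosen only for illustration.

```python
def simulate_pausible_clock(period, n_edges, requests, transfer_time):
    """Return the times of the first n_edges rising clock edges.

    Assumptions (illustrative only, not taken from the thesis):
    - The clock free-runs with the given period unless it is paused.
    - Each request holds the clock for transfer_time, starting at its arrival.
    - A request arriving before the next scheduled edge wins arbitration;
      otherwise the edge wins and the request is handled in the next cycle.
    """
    edges = []
    pending = sorted(requests)
    next_edge = period
    i = 0
    while len(edges) < n_edges:
        if i < len(pending) and pending[i] < next_edge:
            # Request wins: hold the edge back until the transfer is done.
            release = pending[i] + transfer_time
            next_edge = max(next_edge, release)
            i += 1
            continue
        # Edge wins (or no request is pending): emit it and schedule the next.
        edges.append(next_edge)
        next_edge += period
    return edges


if __name__ == "__main__":
    # A request at t=7 stretches the first cycle; the one at t=25 finishes
    # before the next scheduled edge and therefore causes no delay.
    print(simulate_pausible_clock(10, 4, [7, 25], 5))
```

In this model no clock edge ever falls strictly inside a transfer window, which is the property that rules out sampling data while it is changing. The cost is that a clock cycle is occasionally stretched.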