The Totem Project is an effort of ACME Labs within the EE Department of the University of Washington. The goal of the project is to create tools that automatically generate domain-specific reconfigurable architectures based on users' needs. This includes the creation of a high-level description of the architecture, a VLSI layout, and custom tools to provide place and route functionality for the generated architecture.
Reconfigurable hardware has been shown to provide an efficient compromise between post-fabrication flexibility and hardware performance. For this reason, reconfigurable hardware is a good choice to include on SoCs. Although standard FPGA cells could be used, we can optimize the hardware based upon the target applications for the SoC. The resources and degree of reconfigurability can be customized to support a class or a specific set of algorithms, while still providing flexibility within that set. The manual creation of these customized reprogrammable architectures would be a labor-intensive process, leading to high design costs for the SoC. Instead, the Totem Project's focus is to provide automatic generation of specialized reconfigurable architectures.
All of the work done for this project furthers the development of three distinct types of tools: architecture generators, VLSI layout generators, and place and route tools.
The first tool in the flow is the architecture generator, which automatically creates a high-level reconfigurable architecture description based upon the targeted applications and user constraints. The targeted application space is often described in terms of the circuits that need to be supported, along with common user constraints such as speed and area requirements. The architecture description that is generated is sent to both a VLSI layout generator and a place and route tool.
The VLSI layout generator is responsible for producing an efficient VLSI layout from the architecture description. There are many different ways of achieving this depending upon what type of reconfigurable architecture is being created.
The final tool in the Totem flow, the place and route tool, performs placement and routing of application netlists onto the created hardware. This tool accepts a netlist and a description of the architecture, and it outputs a bitstream that can be used to configure the generated architecture with the specified netlist. The bitstream format may require information from the layout tool to obtain the correct ordering of the programming bits.
The Totem tool flow aims to provide a solution for each of the important steps in the design of reconfigurable architectures, from high-level decisions of component organization and connectivity, to low-level VLSI layout, to software tools which facilitate the use of the hardware. In this manner, optimized custom reconfigurable hardware on SoCs becomes achievable without a significant impact on time-to-market or design costs.
Completed Work - Totem RaPiD
To provide an early proof-of-concept, we first focused on coarse-grained architectures that consisted of components such as multipliers and adders connected via a one-dimensional routing structure. Totem RaPiD was a project that leveraged the RaPiD architecture (developed by Carl Ebeling) in order to create application-specific reconfigurable hardware. RaPiD is a one-dimensional coarse-grained architecture that uses components such as multipliers, ALUs, and RAMs in order to achieve efficient computation on word-sized data.
- 1D High-Level Architecture Generation
We have explored a number of methods to generate the overall structure of Totem RaPiD architectures automatically from a given application set [Compton01, Compton02]. We also discovered that the development of application-specific FPGAs presents some unique and difficult challenges regarding how silicon can be most effectively used [Eguro03].
- RaPiD VLSI Layout Generation
This tool allows users to automatically generate fabrication-ready layouts for the custom reconfigurable fabrics that have been specified by the Totem RaPiD high-level architecture generator. We are targeting users that are working within the system-on-a-chip (SoC) design methodology. Our goal in the generation of these custom architectures is to give the user the opportunity to create a unique reconfigurable device that is specific to an application domain. Since the reconfigurable fabric is to be inserted into an SoC, we must ensure that even though a new architecture is being created, we do not interrupt the design flow and thereby increase the time to market. Three different methods are being explored to automate the layout process: standard cells [Phillips02], template reduction [Phillips04], and FPGA-specific circuit generators [Phillips05].
- Place & Route Tool Creation
Totem's high-level architecture generation tool [Sharma01, Compton02] produces an architecture file that describes the various aspects of the generated architecture. This information includes the resource mix, the relative positions of the components within the computational structure, and the routing structure. The place and route tool is parameterized to use this architecture information as the target for an application netlist file. An optimal physical placement for the netlist is created using a detailed cost function based on the area of the design, wire length, cut size, etc. The router then assigns the necessary component connections for the application to the physical wires described in the architecture file.
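As a rough illustration of the kind of cost function described above, the following Python sketch scores a one-dimensional placement by total net span and maximum cut size. The function names, the weights, and the example netlist are assumptions made for illustration. The actual Totem cost function is more detailed and also accounts for area, which is omitted here.

```python
from typing import Dict, List, Tuple

def net_spans(placement: Dict[str, int], nets: List[List[str]]) -> List[Tuple[int, int]]:
    """Return the (leftmost, rightmost) position touched by each net."""
    spans = []
    for net in nets:
        positions = [placement[component] for component in net]
        spans.append((min(positions), max(positions)))
    return spans

def wire_length(placement: Dict[str, int], nets: List[List[str]]) -> int:
    """Total wire length, estimated as the sum of the 1D span of every net."""
    return sum(hi - lo for lo, hi in net_spans(placement, nets))

def max_cut(placement: Dict[str, int], nets: List[List[str]]) -> int:
    """Largest number of nets crossing any boundary between positions p and p+1."""
    spans = net_spans(placement, nets)
    num_positions = max(placement.values()) + 1
    worst = 0
    for boundary in range(num_positions - 1):
        crossing = sum(1 for lo, hi in spans if lo <= boundary < hi)
        worst = max(worst, crossing)
    return worst

def placement_cost(placement: Dict[str, int], nets: List[List[str]],
                   w_len: float = 1.0, w_cut: float = 1.0) -> float:
    """Weighted sum of wire length and cut size (the weights are hypothetical)."""
    return w_len * wire_length(placement, nets) + w_cut * max_cut(placement, nets)

# Hypothetical four-component netlist placed along a single axis.
placement = {"mult0": 0, "alu0": 1, "ram0": 2, "alu1": 3}
nets = [["mult0", "alu0"], ["alu0", "alu1"], ["mult0", "ram0", "alu1"]]
print(placement_cost(placement, nets))  # 8.0
```

A placer would evaluate such a cost after each candidate move and keep the moves that lower it. The cut-size term matters in a one-dimensional architecture because the number of routing tracks available at any point along the axis is fixed.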
Current Work - Totem CPLD
We have expanded our investigation to tools that will automatically create application-specific reconfigurable CPLDs for SoCs. CPLDs are reconfigurable architectures that use PLAs or PALs as their functional units, and which utilize either hierarchical, island-style, or centralized routing structures to route between functional units.
In order to explore the idea of application-specific CPLDs, we felt that it was first necessary to explore the idea of application-specific PLAs and PALs [Holland04]. CPLDs are composed of either PLAs or PALs, so learning how to tailor a PLA/PAL to a specific application domain will give us insight into tailoring a CPLD to a particular domain.
The development of application-specific PLAs/PALs follows the same steps as Totem RaPiD in that it requires an architecture generator, a VLSI layout generator, and a place and route tool. One difference, however, is that the architecture generator and place and route tool are actually implemented together. Circuits that comprise a target application domain are fed into an architecture generator, which creates the PLA or PAL architecture to support the domain and also outputs bitstreams that can be used to implement the circuits that were used to develop the architecture. The architecture description created by the architecture generator is fed to a VLSI layout generator, which creates an optimized VLSI layout from pre-created, optimized layout tiles.
- PLA/PAL Architecture Generation
The architecture generator is responsible for creating a PLA or PAL architecture for the specified application domain. Before the circuits are fed to the architecture generator, they are put through ESPRESSO, which is a two-level logic minimization tool. The minimized circuits are then fed into the architecture generator, which attempts to create the smallest and fastest PLA or PAL that is capable of supporting the circuits provided in the application domain.
The size of the final PLA or PAL is easily determined, as its number of inputs, product terms, and outputs is simply the maximum number of inputs, product terms, or outputs found in any of the input circuits. Also note that when all the circuits to be implemented in the PLA or PAL are known in advance, all the array locations (within the PLA or PAL) that will ever need to be programmed can be determined. By intelligently mapping the input circuits to the PLA and PAL arrays using simulated annealing, we were able to minimize the number of array locations that require programmability, and we were able to remove the unneeded programming points to improve the performance of the arrays.
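The sizing rule and the notion of required programming points can be illustrated with a small Python sketch. The data structures and names below are assumptions made for illustration, and the sketch only varies the assignment of product terms to rows, keeping inputs and outputs in fixed columns. It evaluates the cost that a mapper would try to minimize. The simulated annealing search itself is not shown.

```python
from typing import FrozenSet, List, NamedTuple, Sequence, Set, Tuple

class Term(NamedTuple):
    """One product term of a minimized circuit."""
    literals: FrozenSet[Tuple[int, bool]]  # (input index, True if uncomplemented)
    outputs: FrozenSet[int]                # outputs that use this term

class Circuit(NamedTuple):
    num_inputs: int
    num_outputs: int
    terms: List[Term]

def pla_size(circuits: Sequence[Circuit]) -> Tuple[int, int, int]:
    """(inputs, product terms, outputs): the maximum of each over all circuits."""
    return (max(c.num_inputs for c in circuits),
            max(len(c.terms) for c in circuits),
            max(c.num_outputs for c in circuits))

def programmed_points(circuit: Circuit, row_order: Sequence[int]) -> Set[tuple]:
    """Array locations used when term i is placed on PLA row row_order[i]."""
    points: Set[tuple] = set()
    for term, row in zip(circuit.terms, row_order):
        for inp, polarity in term.literals:
            points.add(("AND", row, inp, polarity))
        for out in term.outputs:
            points.add(("OR", row, out))
    return points

def required_points(circuits: Sequence[Circuit],
                    row_orders: Sequence[Sequence[int]]) -> Set[tuple]:
    """Union of the locations any circuit needs; only these must stay programmable."""
    needed: Set[tuple] = set()
    for circuit, order in zip(circuits, row_orders):
        needed |= programmed_points(circuit, order)
    return needed

# Two hypothetical circuits that share the same terms, listed in a different order.
t_and = Term(frozenset({(0, True), (1, True)}), frozenset({0}))
t_not = Term(frozenset({(0, False)}), frozenset({0}))
circuit_a = Circuit(2, 1, [t_and, t_not])
circuit_b = Circuit(2, 1, [t_not, t_and])

print(pla_size([circuit_a, circuit_b]))                                 # (2, 2, 1)
print(len(required_points([circuit_a, circuit_b], [[0, 1], [0, 1]])))   # 8
print(len(required_points([circuit_a, circuit_b], [[0, 1], [1, 0]])))   # 5
```

In this toy case, swapping the two rows of the second circuit makes its programmed locations coincide with those of the first, so three fewer array locations need to remain programmable.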
- PLA/PAL VLSI Layout Generation
PLAs and PALs are very regular structures, and are thus good candidates for VLSI layout generation. By highly optimizing a small set of layout tiles (an AND-plane tile, OR-plane tile, inverters, buffers, decoders, etc.), and by tiling these optimized layout tiles, the layout generator is able to create a highly optimized full PLA or PAL layout. This VLSI layout can then be provided directly to an SoC designer as a hard core to be integrated into their SoC design.
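The tiling step can be pictured with the short Python sketch below, which computes where each tile would be placed for a PLA of a given size. The tile names, the tile dimensions, and the floorplan (two AND-plane columns per input, with the OR plane directly to the right) are assumptions made for illustration. Peripheral tiles such as inverters, buffers, and decoders are left out.

```python
from typing import List, Tuple

# Hypothetical tile dimensions in arbitrary layout units (not from the Totem tools).
TILE_W = 4
TILE_H = 4

def tile_pla(num_inputs: int, num_terms: int, num_outputs: int) -> List[Tuple[str, int, int]]:
    """Return (tile name, x, y) placements for a simple PLA floorplan.

    Assumed arrangement: each product term occupies one row, the AND plane
    has two columns per input (true and complemented), and the OR plane
    has one column per output, placed to the right of the AND plane.
    """
    placements = []
    and_cols = 2 * num_inputs
    for row in range(num_terms):
        y = row * TILE_H
        for col in range(and_cols):
            placements.append(("and_tile", col * TILE_W, y))
        for col in range(num_outputs):
            placements.append(("or_tile", (and_cols + col) * TILE_W, y))
    return placements

# A PLA with 2 inputs, 2 product terms, and 1 output uses 8 AND tiles and 2 OR tiles.
layout = tile_pla(2, 2, 1)
print(len(layout))   # 10
print(layout[-1])    # ('or_tile', 16, 4)
```

Because every position is a multiple of the tile dimensions, the quality of the full array follows directly from the quality of the few hand-optimized tiles.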
Current Work - Totem2
Using lessons and techniques gleaned from the Totem RaPiD system, we intend to develop a toolflow that can automatically create specialized 2-D island-style FPGAs. During the development of architectures such as the encryption-specialized RaPiD system described in [Eguro03], we quickly realized the benefit and flexibility of 2-D island-style FPGAs. Although the challenges are similar to those of the 1-D case, there are several important differences.
- 2-D High-Level Architecture Generation
Current-generation commercial FPGAs utilize a mixture of small Look-Up Tables (LUTs) embedded within a very rich routing structure. While such an interconnect system lends a great deal of flexibility to the device, it also consumes a very large amount of area. The techniques that we use to generate this more conventional type of FPGA architecture will need to be much more sensitive to improving area utilization while not destroying the inherent flexibility of the 2-D FPGA.
- Island-Style FPGA VLSI Layout Generation
The last few generations of Xilinx and Altera FPGAs have included a number of monolithic dedicated multipliers and block memories embedded throughout the array. This is because these manufacturers have realized that there are several types of commonly used calculations that incur a stiff area and performance penalty when mapped to LUTs. We expect this trend to continue, doubly so for our system of specialized 2-D island-style FPGAs. Because such a system cannot take advantage of many assumptions that we made in the Totem RaPiD toolflow, such as uniform device height, the development of intelligent circuit generators becomes far more difficult.
- Place & Route Tool Creation
Although the FPGA architectures that we generate will be tuned to benefit a given set of applications the most, we would like to maintain as much of the flexibility of the device as possible. As mentioned in [Eguro05], this may include scenarios that are simply not addressed by conventional place and route algorithms. For example, we have shown that existing placement algorithms fail to find valid arrangements when faced with even marginally routing-poor architectures.
This effort is headed by Scott Hauck, Associate Professor in the EE Department at the University of Washington, and director of ACME Labs.
Other researchers include:
- Ken Eguro, University of Washington
- Mark Holland, University of Washington
- Shawn Phillips, University of Washington
- Akshay Sharma, University of Washington
Former group members include:
- Katherine Compton, University of Wisconsin - Madison
K. Compton, S. Hauck, "Automatic Design of Area-Efficient Configurable ASIC Cores", submitted to IEEE Transactions on Computers.
S. Hauck, K. Compton, K. Eguro, M. Holland, S. Phillips, A. Sharma, "Totem: Domain-Specific Reconfigurable Logic", submitted to IEEE Transactions on VLSI Systems.
A. Sharma, Development of a Place and Route Tool for the RaPiD Architecture, M.S. Thesis, University of Washington, Dept. of EE, 2001.
K. Eguro, RaPiD-AES: Developing an Encryption-Specific FPGA Architecture, M.S. Thesis, University of Washington, Dept. of EE, 2002.
K. Compton, Architecture Generation of Customized Reconfigurable Hardware, Ph.D. Thesis, Northwestern University, Dept. of ECE, 2003.
S. Phillips, Automating Layout of Reconfigurable Subsystems for Systems-on-a-Chip, Ph.D. Thesis, University of Washington, Dept. of EE, 2004.
S. Phillips, S. Hauck, "Automatic Layout of Domain-Specific Reconfigurable Subsystems for System-on-a-Chip", ACM/SIGDA Symposium on Field-Programmable Gate Arrays, pp. 165-173, 2002.
K. Compton, A. Sharma, S. Phillips, S. Hauck, "Flexible Routing Architecture Generation for Domain-Specific Reconfigurable Subsystems", International Conference on Field Programmable Logic and Applications, pp. 59-68, 2002.
A. Sharma, C. Ebeling, S. Hauck, "PipeRoute: A Pipelining-Aware Router for FPGAs", ACM/SIGDA Symposium on Field-Programmable Gate Arrays, pp. 68-77, 2003.
K. Eguro, S. Hauck, "Issues and Approaches to Coarse-Grain Reconfigurable Architecture Development", IEEE Symposium on Field-Programmable Custom Computing Machines, pp. 111-120, 2003.
K. Compton, S. Hauck, "Track Placement: Orchestrating Routing Structures to Maximize Routability", International Conference on Field Programmable Logic and Applications, 2003.
K. Compton, S. Hauck, "Flexibility Measurement of Domain-Specific Reconfigurable Hardware", ACM/SIGDA Symposium on Field-Programmable Gate Arrays, pp. 155-161, 2004.
A. Sharma, K. Compton, C. Ebeling, S. Hauck, "Exploration of Pipelined FPGA Interconnect Structures", ACM/SIGDA Symposium on Field-Programmable Gate Arrays, pp. 13-22, 2004.
M. Holland, S. Hauck, "Automatic Creation of Reconfigurable PALs/PLAs for SoC", International Symposium on Field-Programmable Logic and Applications, pp. 536-545, 2004.
S. Phillips, A. Sharma, S. Hauck, "Automating the Layout of Reconfigurable Subsystems Via Template Reduction", International Symposium on Field-Programmable Logic and Applications, pp. 857-861, 2004.
K. Eguro, S. Hauck, A. Sharma, "Architecture-Adaptive Range Limit Windowing for Simulated Annealing FPGA Placement", to appear in Design Automation Conference, 2005.
S. Phillips, S. Hauck, "Automating the Layout of Reconfigurable Subsystems Using Circuit Generators", IEEE Symposium on Field-Programmable Custom Computing Machines, 2005.
M. Holland, S. Hauck, "Automatic Creation of Domain-Specific Reconfigurable CPLDs for SoC", to appear in International Symposium on Field-Programmable Logic and Applications, 2005.
A. Sharma, C. Ebeling, S. Hauck, "Architecture-Adaptive Routability-Driven Placement for FPGAs", to appear in International Symposium on Field-Programmable Logic and Applications, 2005.
A. Sharma, S. Hauck, "Accelerating FPGA Routing Using Architecture-Adaptive A* Techniques", submitted to IEEE International Conference on Field Programmable Technology, 2005.
A. Sharma, C. Ebeling, S. Hauck, "Architecture Adaptive Routability-Driven Placement for FPGAs", ACM/SIGDA Symposium on Field-Programmable Gate Arrays, 2005.
M. Holland, S. Hauck, "Automatic Creation of Domain-Specific Reconfigurable CPLDs for SoC", IEEE Symposium on Field-Programmable Custom Computing Machines, 2005.
K. Eguro, S. Hauck, "Decipher: Architecture Development of Reconfigurable Encryption Hardware", University of Washington, Dept. of EE Technical Report UWEETR-2002-0012, 2002.
K. Compton, S. Hauck, "Track Placement: Orchestrating Routing Structures to Maximize Routability", University of Washington, Dept. of EE Technical Report UWEETR-2002-0013, 2002.
A. Sharma, C. Ebeling, S. Hauck, "PipeRoute: A Pipelining-Aware Router for FPGAs", University of Washington, Dept. of EE Technical Report UWEETR-2002-0018, 2002.
K. Eguro, S. Hauck, "Issues of Wirelength Cost Models in Routing-Constrained FPGAs", University of Washington, Dept. of EE Technical Report UWEETR-2004-0006, 2004.
A. Sharma, C. Ebeling, S. Hauck, "Architecture Adaptive Routability-Driven Placement for FPGAs", 2005.