
Ghostwritten for Altera Corporation for publication in Electronic Engineering Times.

 

Embedded soft cores in PLDs 

In general, integration has been a major strategic pathway to the industry’s Big Three: higher performance, shorter time-to-market, and lower manufacturing costs.  On that bumpy road to ever higher integration, microprocessors and programmable logic devices have evolved separately, but along similar lines.

The drive for ever-higher performance has led to microprocessors with wider data paths capable of handling longer instructions.  On-board memory caching, faster clock rates, and more efficient logic operations have increased speed. At the same time processors have become more complex. To deal with that complexity, designers are using high-level languages such as C, C++, and Java, and processors often come with on-chip debugging tools such as Background Debug Mode, Enhanced JTAG, and N-Wire.

Meanwhile, PLDs have become larger, faster, and cheaper through improved process technology. As geometries continue to shrink, the size of the die remains relatively constant, so it becomes practical to add more elements, such as on-board RAM.  As with microprocessors, greater integration has meant memory embedded in PLD architectures.  And just as increased complexity has led to the use of higher-level languages with microprocessors, complexity in PLDs has made hardware description languages the de facto means of developing PLD designs.  PLD applications have grown larger and more intricate as well, and on-chip debugging tools have followed.

The parallel development of microprocessors and PLDs arises from the fact that both devices use logical operations to solve a set of problems where, within the range of functionality of the device, the exact combination or configuration of operations changes with the user’s individual application.  In general, operations that require the highest performance need direct hardware implementation, while less stringent performance requirements can be met with the slower sequential functioning of a microprocessor.  It should be no surprise, then, that the next logical evolutionary step is the integration of microprocessors and PLDs.

Before this integration could happen, however, PLD technology had to reach a critical point.  The early PLDs were too small to hold processors.  As process technology improved and geometries shrank, there was enough room on the die for a processor, but the resulting devices were too expensive and too slow.  PLDs kept getting bigger, however, and in 1999 a critical point was reached at which a viable soft core could be implemented in a PLD with both price and performance on a par with board-level devices.

With this quantum step in integration, performance is enhanced through the elimination of on-chip/off-chip delays.  Power consumption is reduced, and the smaller die and lower manufacturing expenses together lower overall cost.

Most of the above material down to here has been culled from the Excalibur backgrounder.

A recent happy development is that 32-bit RISC processor cores are now available for programmable logic. These cores are ideal for use in many embedded systems applications because of the higher integration, greater flexibility, and shorter time-to-market they can achieve. This is the development that enabled Altera to introduce the industry's first soft core RISC embedded processor optimized specifically for programmable logic.

One of the considerations a designer has to face in designing an embedded processor PLD is whether to use a soft or hard core.  Soft cores, or portable logic blocks, have many advantages over hard cores for some applications.  Flexibility is the main advantage.  A soft core processor has configurable parameters that enable you to meet a variety of application needs. If you buy an off-the-shelf solution, there will be tradeoffs between your design goal and what is pre-canned in the off-the-shelf product.  You may end up paying for peripherals you don’t want, or you may end up with larger peripherals than you need for the specific task.  In both cases you’re wasting silicon and you’re wasting money.  With a soft core you get and pay for only what you want. 

With hard cores, the standard way to build in configurability is to design in all the desired functionality and then use registers and multiplexers to select which functionality is used for a specific application (called run-time configurability).  That is, when the system boots up it programs the registers to provide the function needed.  With soft cores, configurability is built in at compile time, which saves hardware because you don’t implement features that you’re not going to use.

The configurability of the soft core also makes it easy to make changes, which can dramatically shorten PLD design cycle times. For example, a USB core is used for a variety of applications where each has its own characteristics, which means that each application has to be configured individually. If you had to design each application separately, as you do with a hard core, costs would escalate. 

Configurable parameters mean that there are fundamental performance/area tradeoffs a designer can make.  The designer can choose a 16- or 32-bit data width; whether to include a barrel shifter, which allows multi-bit shifts in a single clock cycle; whether to include m-step, a special instruction that does the add and shift in one clock cycle; and the size of the register file, which maps interrupt routines.  For example, if an application requires a lot of multi-bit shifts, from a performance point of view it would make sense to implement a barrel shifter, even though it might raise your costs and require a larger chip.  On the other hand, if the application doesn’t require a lot of multi-bit shifts, then the barrel shifter could be eliminated to achieve a smaller, less expensive core.  This gives a degree of flexibility as to how fast your processor performs for a given application, versus how much of the chip it consumes. These factors, in turn, affect cost.
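The barrel-shifter and add-and-shift tradeoffs can be made concrete with a small model. The sketch below is a toy cycle count written for illustration only: the function names, the one-bit-per-cycle fallback for a core without a barrel shifter, and the one-step-per-multiplier-bit multiply are assumptions, not a description of any particular core's instruction set.

```python
from typing import Tuple


def shift_cycles(amount: int, has_barrel_shifter: bool) -> int:
    """Clock cycles to shift a word by `amount` bits in this toy model."""
    if amount < 0:
        raise ValueError("shift amount must be non-negative")
    if amount == 0:
        return 0
    # Assumption: a barrel shifter handles any distance in one cycle;
    # without one, the core shifts a single bit per cycle.
    return 1 if has_barrel_shifter else amount


def multiply_by_steps(a: int, b: int, width: int = 16) -> Tuple[int, int]:
    """Unsigned shift-and-add multiply.

    Each loop iteration stands for one add-and-shift step, so the
    step count equals the operand width. Returns (product, steps).
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask
    product = 0
    steps = 0
    for _ in range(width):
        if b & 1:
            product += a   # add the shifted multiplicand
        a <<= 1            # shift the multiplicand left
        b >>= 1            # move to the next multiplier bit
        steps += 1
    return product, steps
```

In this model a 5-bit shift costs one cycle with a barrel shifter and five without, which is the kind of difference that justifies the extra logic only when an application shifts heavily.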

In an off-the-shelf processor the designer is sometimes stuck with a canned peripheral included on the chip that may not offer all the functionality needed.  With a programmable-logic-based processor a customized peripheral can be built with precisely the functionality that is needed, no more, no less. 

However, soft cores have design complexities that must be understood before you can design them into your system.

A soft-core description is usually designed at a behavioral or register-transfer level (RTL) in a hardware-description language (HDL) such as VHDL or Verilog, although some vendors offer soft cores in gate-level or netlist descriptions. A synthesis tool is used to create a gate-level representation of the design, also in an HDL, and target it to a specific technology.

Behaviorally described soft cores have no physical attributes and are appropriate for chips spanning a range of process technologies. You can also often choose the implementation style for a given core: cell-based, gate array, or PLD.

With the growing market for high-density programmable logic chips, PLD vendors are partnering with third-party core vendors to enable you to design core-based PLDs just as you design core-based ASICs. The trade-off is that PLDs cost more per unit than ASICs with comparable numbers of gates, but PLDs offer a faster time to market for proof of concept. Another advantage of PLDs is that their NRE cost is usually lower than that of a comparable ASIC.

Hard core vs. soft core

Hard cores are designed at a physical level. They are predefined blocks with timing specifications for a particular technology. Soft cores, on the other hand, are designed to meet minimum performance specifications over a range of technology implementations, even though core performance varies across technologies.

The well-defined timing parameters of hard cores make them the best choice for timing-critical applications such as high-performance processor engines and high-speed I/O functions. You can also implement analog portions of mixed-signal chips as hard cores.

Hard core logic needs to be redesigned and re-verified for each targeted technology.  A soft core, on the other hand, is technology-independent and requires only simulation and timing verification after synthesis to a target technology, which is another factor that shortens development time.

When you try to solve all your problems with the processor, then everything depends on how sophisticated the processor is and how fast it runs.  But in a programmable logic framework, you have the option of taking sections of code that in a processor would be executed sequentially, and executing them in parallel in the PLD by offloading them to smart peripherals.  For example, DES (Data Encryption Standard) can be implemented in a software loop, but a dedicated DES peripheral can execute encryption at a far faster rate.  So once again, tradeoffs can be made between what is done with the microprocessor and what is done in discrete hardware.  The ability to take pieces of a solution that previously were done in software and move them to hardware profoundly impacts performance. 
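One way to reason about this tradeoff is Amdahl's law, which the article does not name but which fits the argument. The sketch below assumes the processor waits while the peripheral works (no overlap), and the function and parameter names are illustrative.

```python
def overall_speedup(offloaded_fraction: float, peripheral_speedup: float) -> float:
    """Estimate whole-application speedup when part of the work moves to hardware.

    offloaded_fraction: share of the original run time spent in the code
        that is moved to a peripheral (0.0 to 1.0).
    peripheral_speedup: how many times faster the peripheral runs that
        code than the software loop did.
    """
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must be between 0 and 1")
    if peripheral_speedup <= 0:
        raise ValueError("peripheral_speedup must be positive")
    # Time left in software plus the shortened time spent in hardware,
    # both expressed as fractions of the original run time.
    remaining = (1.0 - offloaded_fraction) + offloaded_fraction / peripheral_speedup
    return 1.0 / remaining
```

The formula makes the limit explicit: if encryption accounts for half of the run time, even a very fast peripheral cannot make the application more than twice as fast, so the choice of what to offload matters as much as how fast the hardware is.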

Third-party vs. home-grown cores

It is usually preferable to use a soft core from a supplier rather than develop one in-house.  For one thing, a vendor may have already developed the core you need, which saves you development costs.  Also, the vendor normally provides a synthesis suite, a test bench, and documentation.  Not having to produce these in house dramatically shortens development time.

Another good reason to consider soft cores from third-party vendors is that you are assured of compliance with industry standards. Core vendors also offer predefined Verilog and VHDL core models for design verification.

A soft-core-based design requires an RTL description of the desired logic, a core model to simulate the logic's behavior, a test program to verify the core, and a synthesis script. For cores that must comply with industry standards, you need a compliance-test environment, which the core vendor designs and verifies.

Pressure to shrink design time to cope with shortened product life cycles is making soft-core-based design a fast-growing and sometimes essential part of system-on-a-chip design. Pre-defined soft cores, however, require that you be willing to trade off performance for shortened design time and compliance with industry standards. The tradeoffs have to be balanced against the alternative of designing all the logic from scratch.

If you buy a core from a vendor, the complexity and process/design-implementation flexibility of soft cores mean that you may need design assistance from the core provider or licensee. Most core vendors provide service as well as products.

Core-based design issues

Soft-core models lack interconnect parasitic-delay information, and so they also lack accurate timing information. The vendor may provide sample timing data for a technology. The vendor may also supply a bus-functional model that simulates the core's behavior at the pins between the core and external circuitry without modeling the core's internal configuration.

Core vendors design their products as synchronous, timing-predictable logic blocks to guarantee operation over a range of technologies. Because behavioral or RTL simulation does not verify performance, you can check timing performance only after you synthesize the core to the gate level for a target technology. Then you use vendor-supplied estimated timing delays for that technology. Accurate timing simulation can be done only after place-and-route and back-annotation of interconnect parasitics.

After you’ve chosen and configured your processor, chosen peripherals and provided peripheral stubs or any custom peripherals you’re going to create for this design, you have the task of implementing the internal bus structure that glues it all together. 

It’s important that these elements be generated automatically, because automatically generated logic is correct by design.  That is, the chance for human error in wiring them up is eliminated.

A good system design tool will generate the common elements to connect all the peripherals to the processor, including the chip-select decoder, data return multiplexer, interrupt controller, and wait-state generator.  For example, the Excalibur development tool kit for Altera’s Nios soft core processor, which targets our APEX product line, includes a wizard that automatically configures the peripheral bus module (PBM). It generates HDL, instantiates and connects all the peripherals to the processor, whether they are canned or custom, and creates the necessary glue logic.  The wizard automatically generates wait states, interrupt control, variable bus sizers, and address decoding.  For example, the user can indicate which peripherals interrupt the processor.  For each one that does, the wizard automatically assigns an address in the Interrupt Lookup Table and generates the corresponding interrupt control logic. Bus size converters are used to adapt 32-bit peripherals to 16-bit configurations as needed.  This kind of functionality not only dramatically shortens the design cycle, but also helps ensure that the logic is right the first time.
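The bookkeeping such a tool automates can be sketched in a few lines. The example below is a hypothetical illustration, not the Excalibur wizard's algorithm: the `Peripheral` and `Slot` types, the size-aligned address allocation, and the sequential interrupt numbering are all assumptions made for the sketch.

```python
from typing import Dict, List, NamedTuple, Optional


class Peripheral(NamedTuple):
    name: str
    size_bytes: int    # address span; must be a power of two in this sketch
    interrupts: bool   # does this peripheral interrupt the processor?


class Slot(NamedTuple):
    base: int
    size: int
    irq: Optional[int]  # index into the interrupt table, or None


def build_bus_map(peripherals: List[Peripheral], base: int = 0) -> Dict[str, Slot]:
    """Assign each peripheral an address range and, if needed, an interrupt slot."""
    slots: Dict[str, Slot] = {}
    next_base = base
    next_irq = 0
    for p in peripherals:
        size = p.size_bytes
        if size <= 0 or size & (size - 1):
            raise ValueError(f"{p.name}: size must be a power of two")
        # Align the base to the peripheral's size so that chip select
        # can be decoded from the upper address bits alone.
        aligned = (next_base + size - 1) & ~(size - 1)
        irq = None
        if p.interrupts:
            irq = next_irq
            next_irq += 1
        slots[p.name] = Slot(aligned, size, irq)
        next_base = aligned + size
    return slots


def chip_select(slots: Dict[str, Slot], address: int) -> Optional[str]:
    """Return the name of the peripheral selected by `address`, if any."""
    for name, slot in slots.items():
        if slot.base <= address < slot.base + slot.size:
            return name
    return None
```

Because the map and the decoder come from the same peripheral list, they cannot drift apart, which is the sense in which generated glue logic is correct by design.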

Testing issues

Soft cores present unique testing challenges. Core vendors should implement and prove the design in silicon, and should validate that their products comply with industry-standard specifications and provide acceptable simulation on an accompanying test bench. However, since soft cores can be implemented in many technologies and configurations, you must provide a means for testing them in your particular design.

For cores in HDL format, a test bench defines how to exercise the core logic's nodes after you embed the core into a chip and test the entire chip. A test bench should provide high fault coverage for the core. For a gate-level representation, test-vector suites serve the same function of exercising the core with high fault coverage.
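To show what fault coverage measures, the sketch below computes it for a three-input circuit under the single stuck-at fault model. Both the circuit, y = (a AND b) OR c, and the choice of fault model are assumptions made for illustration. A fault counts as detected when at least one test vector produces an output that differs from the fault-free circuit.

```python
from typing import List, Optional, Tuple

NODES = ("a", "b", "c", "n1", "y")  # n1 is the internal AND output
Fault = Tuple[str, int]             # (node name, stuck-at value)


def simulate(a: int, b: int, c: int, fault: Optional[Fault] = None) -> int:
    """Evaluate y = (a AND b) OR c, optionally with one node stuck at 0 or 1."""
    def observe(node: str, value: int) -> int:
        # A stuck node ignores its driven value.
        if fault is not None and fault[0] == node:
            return fault[1]
        return value

    a = observe("a", a)
    b = observe("b", b)
    c = observe("c", c)
    n1 = observe("n1", a & b)
    return observe("y", n1 | c)


def fault_coverage(vectors: List[Tuple[int, int, int]]) -> float:
    """Fraction of single stuck-at faults detected by the given test vectors."""
    faults = [(node, value) for node in NODES for value in (0, 1)]
    detected = 0
    for fault in faults:
        if any(simulate(*v) != simulate(*v, fault=fault) for v in vectors):
            detected += 1
    return detected / len(faults)
```

In this toy case the single vector (1, 1, 0) detects 4 of the 10 faults, adding (0, 0, 0) raises that to 7, and all eight input combinations together detect every fault. A real core has far too many inputs for exhaustive testing, which is why a well-chosen vector suite matters.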
