
Contributed article for the Embedded Supersection of Electronic Engineering Times.   Ghostwritten for Tektronix.

 

Deep memory simplifies debugging of embedded systems

 

Real-time software problems are difficult to debug because they occur only when the system is running at speed. Debug monitors are unsatisfactory for real-time applications because they stop program execution in order to capture information, and a real-time system usually cannot be stopped while data is arriving without losing a significant amount of that data. Emulators can help, but they sometimes lack the triggering capability or the acquisition memory depth needed to find the problem. Furthermore, debug monitors and emulators display only the status of the program at the moment it is stopped, not the detailed history needed for effective analysis.

A logic analyzer can debug the system in its normal real-time mode of operation, and its ability to capture anomalies in a digital data stream improves significantly when it has "deep memory." Deep memory is a relative term: it describes how long a window of real-time execution history the analyzer can acquire, and it is measured by the number of samples the acquisition memory can hold (ksamples or Msamples).

Logic analyzers with deep memory, such as the Tektronix TLA700 series with a TLA7P4 module (which has a memory depth of 16 Msamples), allow users to perform real-time trace. Real-time trace records the activity of the program running at full speed, without stopping execution, and provides the large amount of historical data needed to identify the problem. Real-time trace is therefore critical for debugging real-time software, and the deeper the trace memory, the better.

A system under test (SUT) often exhibits problems that are neither hardware bugs nor software bugs, but are the result of hardware and software not working together. If the memory of the logic analyzer is not deep enough, the embedded systems developer cannot get enough trace history to see what is happening. Early logic analyzers functioned well with 2 Msamples of memory per channel. However, the speed and complexity of today’s designs demand troubleshooting and analysis tools with ever-increasing power, requiring memory depths up to 16 Msamples.

Figure 1 shows a simplified view of how a logic analyzer captures data. The logic analyzer is connected to a data source and is set to continuously collect data into a circular buffer. When the buffer fills, new data overwrites the oldest data in the buffer. If you stop the data capture at any point, you have a window of data that you can scan to look back and get a picture of what happened up to that point. How far you can look back in time is determined by how much memory you have in the circular buffer.
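The capture scheme described above can be sketched in a few lines of Python. This is only a toy model: the class name `AcquisitionBuffer` and the depths of 4 and 16 samples are assumptions chosen for illustration, not details of any real instrument.

```python
from collections import deque


class AcquisitionBuffer:
    """Toy model of a logic analyzer's circular acquisition memory (hypothetical)."""

    def __init__(self, depth):
        # A deque with maxlen discards the oldest entry once it is full,
        # which mimics new data overwriting the oldest data.
        self.samples = deque(maxlen=depth)

    def capture(self, sample):
        self.samples.append(sample)

    def stop(self):
        # The window of history available when capture stops, oldest first.
        return list(self.samples)


shallow = AcquisitionBuffer(depth=4)
deep = AcquisitionBuffer(depth=16)

# Feed the same 20-sample stream to both buffers.
for sample in range(20):
    shallow.capture(sample)
    deep.capture(sample)

shallow_history = shallow.stop()  # reaches back only to sample 16
deep_history = deep.stop()        # reaches back to sample 4
```

Both buffers saw the same stream, but the deeper one lets you look four times farther back from the point where capture stopped.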

Deep memory is extremely valuable in tracing a number of different types of problems, such as crashes, memory leaks, stack overruns, hardware glitches, and timing errors.

For example, embedded systems differ from computer applications in that they generally have no protection against a stray program crashing the entire system. Computer operating systems have many schemes for isolating the system from a misbehaving application, but embedded systems often do not. Thus, when your embedded software crashes, it frequently takes the whole system down, losing any information that might help determine the cause. Furthermore, as embedded application software grows more complex, the decoupling of cause (problem) and effect (crash) can greatly increase. Logic analyzers can provide the history needed to quickly determine the cause of a crash. The deeper the memory, the more history you have available to analyze, and the farther the source of the problem can be from the actual crash (the more "decoupled" it can be) and still be captured.

A memory leak is an error in a program’s dynamic memory allocation logic that causes it to fail to free up memory that is no longer used, leading to eventual collapse due to memory exhaustion. These leaks often cause immediate crashes in older designs with small fixed-size address spaces. With the increasing amount of memory available in systems today, it may take more time for the crash to occur, making it more difficult to isolate the fault. However, deep memory provides a trace buffer large enough to accommodate the needs of modern designs.

Coupled with deep memory, conditional storage with context capability (contextual storage) can greatly extend the time window acquired by the analyzer. Contextual storage means that the analyzer automatically captures windows of activity around an event of interest. For example, acquiring just the memory allocation/deallocation routines, along with the context immediately prior to and after the routine, can limit the analyzer acquisition to just the areas of interest and greatly extend the capture window.
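As a rough illustration of contextual storage, the sketch below keeps only the samples within a fixed distance of an event of interest. The function name `contextual_store`, the `context` parameter, and the sample trace are all hypothetical; a real analyzer does this qualification in hardware.

```python
from collections import deque


def contextual_store(samples, is_event, context=2):
    """Keep each event plus `context` samples before and after it."""
    stored = []
    pre = deque(maxlen=context)  # rolling window of pre-event context
    post_remaining = 0           # samples still to keep after the last event
    for index, sample in enumerate(samples):
        if is_event(sample):
            stored.extend(pre)   # flush the context leading up to the event
            pre.clear()
            stored.append((index, sample))
            post_remaining = context
        elif post_remaining > 0:
            stored.append((index, sample))
            post_remaining -= 1
        else:
            pre.append((index, sample))
    return stored


# Assumed trace: mostly uninteresting activity around two allocator calls.
trace = ["idle"] * 5 + ["malloc"] + ["idle"] * 6 + ["free"] + ["idle"] * 4
kept = contextual_store(trace, lambda s: s in ("malloc", "free"), context=2)
kept_indices = [index for index, _ in kept]
```

Here 10 of the 17 samples are stored. With sparser events the saving grows, so the same memory depth covers a longer stretch of execution.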

A hardware glitch or timing error can occur when the inputs of a circuit change, causing the outputs to change to some random value for a brief time before they settle down to the correct value. If another circuit inspects the output at the wrong time and reads the random value, the results can be wrong and very hard to debug. If the logic analyzer has deep memory and glitch storage, the user can trigger on the symptom and still acquire the glitch that caused the failure even though it occurred much earlier.

Stack overruns occur when a program attempts to push more information onto the stack than it can hold. The maximum size of a stack is set first by the size of numbers the relevant register can hold, second by the initial value of the stack pointer. If a logic analyzer does not have deep memory, it is difficult to trace historical stack pointer cycles to capture overrun data.

There are three primary selection parameters to consider when evaluating a logic analyzer with deep memory. The first is how the logic analyzer manages the large amount of data it acquires. Sophisticated data handling techniques can eliminate confusion and help you find the exact data you need quickly. Data handling can be evaluated by looking at how quickly the analyzer updates the display after scrolling or zooming, how quickly it searches through the data to find an anomaly, and how quickly it saves a large acquisition to the hard drive.

Hardware acceleration greatly improves the manageability of large amounts of data, so this is an important capability to look for in a logic analyzer. Hardware acceleration provides data so fast that the waveform display can be drawn in seconds rather than minutes, and enables the logic analyzer to quickly search the acquired data to find an anomaly.

The second consideration is timestamping, a capability that significantly increases the usability of deep memory. Logic analyzers with this capability store a separate timestamp with each data sample. When timestamp memory is separate from acquisition memory, it’s easier for the logic analyzer to maintain time-correlation between samples and show the time between samples, which is useful with data qualification. Timestamp information can be used to indicate the elapsed time between samples, or the total time from the beginning of the acquisition or from the trigger.

A second use of the timestamp information is to time-correlate data between different acquisition modules. If a common reference point, such as the start of an acquisition or a system trigger, can be established between modules, the data from those modules can be accurately correlated. We all know how important data correlation is when looking at mixed analog and digital signals. But viewing logic analysis data acquired from multiple modules connected to different bus structures running at different rates can present an even greater challenge. An example would be a microprocessor bus and a peripheral bus such as PCI or Rambus.

Every timestamp counter is, of course, driven by a clock source. But every clock source drifts relative to its designed center frequency. As logic analyzer memory depths increase, the time covered by the acquisition window starts to get long enough that you can see significant timestamp errors between logic analyzer acquisition modules.

For example, suppose there are two logic analyzer modules with 1M samples of memory depth, each using an independent clock source for its timestamp counter. Let's assume that these cards use a 100 MHz oscillator to run the timestamp counter and that the oscillator has 100 ppm accuracy. By the one-millionth sample, each counter can be off by as much as +/- 100 counts. If one logic analyzer's clock source is fast and the other is slow, the two modules can disagree by up to 200 counts.
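The arithmetic behind this example can be checked with a short calculation. It assumes, as a simplification, that one sample is taken per tick of the timestamp counter; the function name `timestamp_drift_counts` is hypothetical.

```python
def timestamp_drift_counts(tick_count, accuracy_ppm):
    """Worst-case counter error, in counts, after `tick_count` ticks."""
    return tick_count * accuracy_ppm / 1_000_000


# Values from the example: 1M samples, 100 MHz oscillator, 100 ppm accuracy.
per_module = timestamp_drift_counts(1_000_000, 100)  # error versus an ideal clock
between_modules = 2 * per_module                     # one module fast, one slow

tick_period_ns = 1e9 / 100e6                         # 10 ns per tick at 100 MHz
per_module_ns = per_module * tick_period_ns          # the same error expressed as time
```

Under these assumptions each counter can drift by 100 counts (about 1 microsecond) from an ideal clock over the acquisition, and two modules drifting in opposite directions can disagree by twice that. The error scales linearly with memory depth, which is why it becomes noticeable only in deep acquisitions.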

This is a problem common with older logic analyzer architectures that often requires rather elaborate workarounds to compensate for the errors. To prevent this problem it is important to use a logic analyzer based on a modern architecture where all of the logic analyzer modules are automatically phase-locked to the same clock source. Without this feature there can be significant correlation errors.

The third important consideration in choosing a logic analyzer is transitional storage. Deep memory applications can usually be classified into two categories: externally clocked (i.e., synchronous) or internally clocked (i.e., asynchronous). When acquiring data synchronously (externally clocked), the raw clock on the target system is often used to acquire data, with the result that large amounts of redundant data are stored. With transitional storage, a logic analyzer can be configured to store data only when a specific channel group has a data change.

For example, if acquiring data from a target system where only one in four samples contains data of interest, a logic analyzer module with 1M depth and transitional storage can effectively store the same amount of data as a logic analyzer module with 4M depth but which doesn’t have transitional storage.

Sampling data asynchronously (internally clocked) with a logic analyzer isn't too different from doing so with an oscilloscope. In both cases, the data should be over-sampled to ensure faithful data reproduction.

With a logic analyzer, one should strive to over-sample by at least 5X the fastest data rate in the target system. However, it often happens that four out of every five samples show the same or unchanging data. With transitional storage, only the data that changed is stored. Each stored sample is timestamped to ensure that the data is accurately displayed, thereby preserving the time relationship.
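The following sketch shows the idea behind transitional storage with timestamps. The function names, the 2 ns sample period, and the 5X over-sampled bit pattern are assumptions made for the example, not a description of how any particular analyzer implements the feature.

```python
def transitional_store(samples, sample_period_ns):
    """Store a (timestamp_ns, value) pair only when the value changes."""
    stored = []
    previous = object()  # sentinel that never equals a real sample
    for index, value in enumerate(samples):
        if value != previous:
            stored.append((index * sample_period_ns, value))
            previous = value
    return stored


def reconstruct(stored, total_samples, sample_period_ns):
    """Rebuild the full sample stream from the stored transitions."""
    rebuilt = []
    for position, (timestamp, value) in enumerate(stored):
        start = timestamp // sample_period_ns
        if position + 1 < len(stored):
            end = stored[position + 1][0] // sample_period_ns
        else:
            end = total_samples
        rebuilt.extend([value] * (end - start))
    return rebuilt


# Assumed target data 0, 1, 1, 0, over-sampled 5X at 2 ns per sample.
raw = [bit for bit in (0, 1, 1, 0) for _ in range(5)]
stored = transitional_store(raw, sample_period_ns=2)
rebuilt = reconstruct(stored, len(raw), sample_period_ns=2)
```

In this example 20 raw samples reduce to three stored entries, and the timestamps carry enough information to redraw the original waveform with its timing intact.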

Whether you are acquiring synchronously or asynchronously, it is important that your logic analyzer not trade memory depth for timestamp data.

 
