By Johannes Roettig, Field Applications Engineer, Missing Link Electronics
Both embedded systems and the next-generation cyber-physical systems sense aspects of their physical environment and act back on it. They do this by reading mostly analog sensor data, performing computations, and driving analog actuators and motors. Signal conditioning for such systems means not just reading the input data, but analyzing the data despite noise and other distortions, to compute proper responses in a timely, cost-efficient manner.
In most applications, signal processing is performed digitally because of the significant cost benefits of digital circuitry over analog components. This article is an introduction to signal processing in embedded systems for digital designers without a signal-processing background. It starts with revisiting key topics from signal theory and the digital processing domain to help you evaluate and validate your design options.
While the world is analog, today’s signal processing is digital. To go from the analog domain to the digital domain, a signal must be converted by repeatedly sampling the analog signal and assigning a number to the magnitude of each sample. Thus the continuous-time analog signal becomes a series of numbers represented by digital values. During this conversion, there are two main sources of error: the quantization of each sample value and the discretization of the signal in time.
In other words, conversion can create errors in magnitude and errors in time.
There are multiple approaches during sampling to cope with the value quantization. If the signal is known to require different resolutions for the values over a large dynamic range, one possibility is to have the digital quantization be non-linear instead of equidistant. This allows for using a higher resolution in the quantization on the parts of the signal that do not change much in value and a lower resolution on the parts of the signal where the change is large.
Take a sine wave, for example. A normalized sine has values that range from -1 to +1. In the range around 0, the rate of change in the signal is large (the derivative of a sine is a cosine), while there are relatively smaller changes in the area around the high and low points of the sine. So for representing the sine with quantized values, it could be better to have larger quantization steps in the area around the value 0 and smaller steps around the values 1 and -1. In this way, the shape of such a signal can be represented much more closely by the digital values. (How and why linear sampling works is discussed in greater detail in [1].)
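The idea of non-equidistant quantization steps can be sketched in a few lines. This is only an illustration: the function names and the sin() warp used to crowd the levels toward -1 and +1 are assumptions made here, not the scheme described in [1], and the sketch makes no claim about the overall error of either level set.

```python
import bisect
import math

def make_levels(n, nonlinear):
    """Return n sorted quantization levels in [-1, 1]."""
    grid = [-1 + 2 * i / (n - 1) for i in range(n)]  # equidistant grid
    if not nonlinear:
        return grid
    # Assumed warp: sin() is steep near 0 and flat near +/-1, so the
    # levels end up widely spaced around 0 and closely spaced at the ends.
    return [math.sin(math.pi * v / 2) for v in grid]

def quantize(x, levels):
    """Map x to the nearest entry of the sorted list `levels`."""
    i = bisect.bisect_left(levels, x)
    if i == 0:
        return levels[0]
    if i == len(levels):
        return levels[-1]
    lo, hi = levels[i - 1], levels[i]
    return lo if x - lo <= hi - x else hi

uniform = make_levels(9, nonlinear=False)
warped = make_levels(9, nonlinear=True)
step_mid = warped[5] - warped[4]   # step next to 0
step_top = warped[8] - warped[7]   # step next to +1
print(step_mid, step_top)
print(quantize(0.95, uniform), quantize(0.95, warped))
```

With nine levels, the warped set has a much smaller step next to +1 than next to 0, so a value near the peak of the sine lands closer to a level than it does on the equidistant grid.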
Another way to get around the problems of sampling slowly changing signals is to add noise to them. At first glance, adding noise seems counterintuitive, but it can help in two ways. If the added noise is somehow deterministic (such as a pseudo noise sequence), the noise can afterwards be mathematically eliminated. In addition, using carefully crafted noise can help overcome the problem of the quantization resolution being too low. (More about this approach, called dithering, is given in [2].)
With these two value-discrete sampling techniques, it is possible to avoid most discreteness-of-values problems when sampling data, but discreteness of values is not the only problem. Digital signals are also discrete in time, which is the problem addressed by the Nyquist-Shannon sampling theorem.
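A small experiment shows why added noise can recover resolution. The sketch below is an assumption-laden illustration, not the method from [2]: it uses a quantizer with a step size of 1, a constant input of 0.3, and uniformly distributed dither of one step width, and then averages many readings.

```python
import math
import random

def quantize(x):
    """Round to the nearest integer level (step size 1)."""
    return math.floor(x + 0.5)

def average_reading(value, n, dither):
    """Average n quantized readings of a constant input."""
    rng = random.Random(0)  # fixed seed so the run is repeatable
    total = 0
    for _ in range(n):
        noise = rng.uniform(-0.5, 0.5) if dither else 0.0
        total += quantize(value + noise)
    return total / n

plain = average_reading(0.3, 20000, dither=False)
dithered = average_reading(0.3, 20000, dither=True)
print(plain, dithered)
```

Without dither, every reading is 0 and no amount of averaging recovers the input. With dither, the quantizer output toggles between 0 and 1 often enough that the average lands close to 0.3.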
This theorem states that if you want to sample a signal and then reconstruct it accurately from the sampled data, you must sample at more than twice the frequency of the highest-frequency component of the signal. If you don’t obey the Nyquist-Shannon sampling theorem, the signal you reconstruct from the sampled data is usually different from the original, because components above half the sample rate fold back (alias) to lower frequencies. The result can be quite striking, as the extreme example in [3] illustrates.
Real-life sampling introduces another source of error: the samples are not exactly equidistant, in contrast to theoretical sampling, where all the samples are equally spaced in time. This error is called jitter. Most of the jitter in a sampling system can be overcome by filtering or, even better, by preventing as much of it as possible prior to the conversion. (See [4] for further details on jitter.)
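The effect of sampling too slowly is easy to reproduce numerically. In this sketch (frequencies and function names chosen here purely for illustration), a 7 Hz sine sampled at 10 Hz produces exactly the same sample values, up to sign, as a 3 Hz sine, so the two cannot be told apart after sampling.

```python
import math

def sample_sine(freq_hz, fs_hz, n):
    """Return n samples of a unit sine of freq_hz taken at rate fs_hz."""
    return [math.sin(2 * math.pi * freq_hz * k / fs_hz) for k in range(n)]

def alias_frequency(freq_hz, fs_hz):
    """Apparent frequency (between 0 and fs/2) of a tone sampled at fs_hz."""
    f = freq_hz % fs_hz
    return min(f, fs_hz - f)

fs = 10.0
fast = sample_sine(7.0, fs, 20)   # violates the sampling theorem
slow = sample_sine(3.0, fs, 20)   # satisfies it
# The 7 Hz samples equal the 3 Hz samples with inverted sign.
indistinguishable = all(abs(a + b) < 1e-9 for a, b in zip(fast, slow))
print(alias_frequency(7.0, fs), indistinguishable)
```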
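To get a feeling for how much error jitter introduces, one can sample a sine at slightly wrong instants and compare against the ideal samples. The sketch below assumes Gaussian timing errors and example values (a 1 MHz tone, 1 ns RMS jitter) that are not taken from the article. For small jitter, the error is roughly the signal slope times the timing error, which gives the prediction used for comparison.

```python
import math
import random

def jitter_error_rms(freq_hz, fs_hz, sigma_s, n=20000, seed=1):
    """RMS difference between ideal and jittered samples of a unit sine."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    acc = 0.0
    for k in range(n):
        t = k / fs_hz
        dt = rng.gauss(0.0, sigma_s)  # assumed Gaussian timing error
        ideal = math.sin(2 * math.pi * freq_hz * t)
        actual = math.sin(2 * math.pi * freq_hz * (t + dt))
        acc += (actual - ideal) ** 2
    return math.sqrt(acc / n)

f, fs, sigma = 1e6, 10.1e6, 1e-9
measured = jitter_error_rms(f, fs, sigma)
# Small-jitter estimate: RMS slope of a unit sine times RMS timing error.
predicted = 2 * math.pi * f * sigma / math.sqrt(2)
print(measured, predicted)
```

Because the error scales with the signal slope, the same amount of jitter hurts more at higher input frequencies.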
Delta-sigma modulation (DSM) is a modern approach for converting between analog and digital data, and can be applied to both analog-to-digital converters (ADCs) and digital-to-analog converters (DACs). There have been many approaches to converting an analog signal into a digital signal over the years. But with the increasing speed and density of digital circuits and the modern preference to use as much digital and as little analog circuitry as possible, DSM techniques have come to dominate the field.
Fundamentally, the DSM converts a continuous-time analog signal into a pulse-frequency-modulated stream of pulses. A DSM does this conversion using a simple negative-feedback control loop, as shown in Figure 1.
Figure 1. First-order DSM.
At a very conceptual level (that is, analog engineers will begin to snicker almost at once), here is how it works. To follow this explanation, assume that the sample rate, Fs in the figure, is significantly faster than the highest frequency in the input signal x(t).
Let’s say that the sampling switch closes, causing a pulse in the 1-bit DAC (basically just a pulse shaper). The 1-bit DAC output is then a narrow pulse that arrives at the negative side of the summing junction. Assume that we have set the pulse generator so that the pulse is large enough to drive the output of the summing junction negative, and long enough that the output of the integrator also goes strongly negative. When the integrator output goes negative, it drives the output of the 1-bit ADC (also known as a comparator) to zero. That zero is sampled by subsequent closures of the sampling switch, so the sampler does not turn on the pulse generator.
With no more pulses entering the summing junction from the 1-bit DAC, the integrator simply integrates the (biased to be always positive) input signal x(t) until the integrator output is again positive. A positive integrator output switches the comparator, again allowing the sampler to trigger the pulse generator, starting the process all over again.
The greater the amplitude of x(t) at a given time, the faster the integrator integrates its way back up to 0 V, releasing the next pulse. Thus the pulse frequency at the DSM output y(t) (or y(z), if you are a fan of z-transforms) is approximately proportional to the input current x(t).
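The loop behavior can be mimicked with a very simplified discrete-time model. This is a sketch under stated assumptions, not the circuit of Figure 1: the input is normalized to the range 0 to 1, each loop iteration stands for one sample clock, and the 1-bit DAC feeds back exactly 0 or 1.

```python
def dsm_first_order(samples):
    """Idealized first-order delta-sigma loop for inputs between 0 and 1."""
    integrator = 0.0
    feedback = 0  # previous output bit, fed back through the 1-bit DAC
    bits = []
    for x in samples:
        integrator += x - feedback             # summing junction + integrator
        feedback = 1 if integrator > 0 else 0  # comparator (1-bit ADC)
        bits.append(feedback)
    return bits

def pulse_density(bits):
    """Fraction of sample clocks on which a pulse was emitted."""
    return sum(bits) / len(bits)

low = pulse_density(dsm_first_order([0.2] * 1000))
high = pulse_density(dsm_first_order([0.7] * 1000))
print(low, high)
```

For a constant input, the fraction of clocks that carry a pulse settles close to the input value, which is the proportionality between input amplitude and pulse frequency described above.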
One of the major advantages of DSM converters hinges on that word “approximately.” Here we could start on a fascinating analysis in the frequency—or even better, the z—domain, but let’s stay in the time domain. You can see intuitively that if the sampling frequency is too low, the input x(t) will change significantly between output pulses. These input changes create not an accurate output signal, but a mess. Conversely, when the sample rate is much, much greater than the highest frequency in the input, so that the input is essentially constant between pulses, the output is a very accurate reflection of the input. Just how accurate depends on how little the input changes before the integrator triggers the next pulse, and on what error there is in reflecting those tiny—that is, high-frequency—changes between output pulses.
Put in conventional terms in the frequency domain, turning up the sample clock is called oversampling. Oversampling shifts the quantization noise (the error between the instantaneous input current and the instantaneous output pulse frequency) up to higher frequencies. If you are clever, you can sample so fast that most of the quantization noise is much higher than the frequencies you actually care about in the signal, and you can then filter it out with a low-pass filter.
Now clever control-theory types have figured out another trick. By changing the circuitry in the DSM loop—technically, by changing the loop transfer function—you can also “shape” the noise. That is, instead of just spreading out the noise across the frequency spectrum, you can push it into a heap at the high end of the spectrum, moving almost all of it out of the band where the signal lives. In practical terms, this change means you can, for example, design an audio DSM converter that has an overall signal-to-noise ratio (SNR) of about 6 dB, but so much of the quantization noise has been shifted out of the audio band that the audible SNR is over 100 dB.
These effects can be seen in Figure 2. Figure 2a shows the Nyquist sampling converter. All quantizing noise is in the desired signal band. A first improvement comes from oversampling (Figure 2b). The noise is equally distributed over the sample spectrum, but now much of it is well above the signal band. A second improvement is noise shaping (Figure 2c), which acts as a high-pass noise filter, piling the noise up at the highest frequencies. Noise shaping is the key enabler for reasonable SNR measures.
Figure 2. (a) Nyquist sampling converter, (b) with oversampling, (c) with noise shaping.
However, there is a problem. The drawback of noise shaping is that you must come up with appropriate parameter sets to operate the inherently unstable DSM. Obtaining a proper parameter set for the DSM is essential to assure precision and stability. The parameter space covers the integrator time constant (which, in this case, is a low-pass time constant) and the sample frequency, which determines the oversampling rate.
We show a heat map of a simulation of a typical DSM in Figure 3. The heat map is plotted by drawing the two different parameters, integrator time constant and sample rate, on the axes and overlaying the resulting SNR value. First, the heat map shows a good parameter set for the desired input frequency. However, it also shows that just using a higher sample frequency won’t automatically give better results unless you consider other parameters, such as the integrator time constant, at the same time. So when parameterizing the hardware part of the DSM, you must first choose your desired SNR and then derive the proper parameter set.
Figure 3. Heat map of the SPICE simulation.
Embedded mixed-signal systems must provide flexible, direct interfaces that support digital and analog connectivity. DSM can be embedded almost entirely inside modern FPGA devices, free of any active peripheral components. However, simply inserting a DSM function block into the FPGA device is not sufficient to make it a viable and dependable option for implementing mixed-signal systems. Instead, what is needed is an application-specific design concept that augments the DSM ADC and the DSM DAC with appropriate signal conditioning via adequate digital filters.
The basic components of an FPGA-based ADC implementation utilizing DSM are shown in Figure 4. The only external components are two resistors and one capacitor. Together with a low-voltage differential signaling (LVDS) receiver and a flip-flop, they form the fundamental architecture of the DSM ADC. The subsequent filter stage is essential to obtain the desired signal. It basically comprises a moving-average low-pass filter, to cut off the high-band noise and to prevent aliasing, and a decimation filter, to convert the signal to the desired sample rate and bit width. Optionally, calibration filters for offset and gain compensation and filters to remove AC components can further improve signal quality.
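The filter stage can be pictured as follows. The sketch combines the moving average and the decimation into one step by averaging non-overlapping blocks of the 1-bit stream, then scales the result to a fixed bit width. The decimation ratio, the bit width, and the function names are illustrative assumptions, not parameters of the design in Figure 4.

```python
def moving_average_decimate(bits, ratio):
    """Average each non-overlapping block of `ratio` bits into one sample."""
    out = []
    for start in range(0, len(bits) - ratio + 1, ratio):
        block = bits[start:start + ratio]
        out.append(sum(block) / ratio)
    return out

def to_code(sample, width):
    """Scale a sample between 0 and 1 to an unsigned integer of `width` bits."""
    full_scale = (1 << width) - 1
    return int(sample * full_scale + 0.5)  # round half up

stream = [1, 0, 0, 0, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 1, 1]
samples = moving_average_decimate(stream, 4)  # 16 bits in, 4 samples out
codes = [to_code(s, 8) for s in samples]
print(samples, codes)
```

A hardware implementation would typically use accumulators rather than divisions, but the arithmetic is the same: more ones in a block mean a larger output code at the lower sample rate.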
Figure 4. An FPGA-based ADC implementation utilizing DSM.
Such an implementation can achieve an SNR of up to 60 dB, a sample rate of up to 500 k samples per second, and an input range of 0 V to 3.3 V with high linearity. A multichannel design with eight ADCs would use about 800 FPGA logic elements (LEs) and 16,000 memory cells on, for example, Altera® Cyclone® IV devices.
Despite overcoming all these analog-to-digital conversion problems, you have just scratched the surface of the important problems. Digital signal processing (DSP) itself is quite error prone, if you are not cautious. In engineering, a typical signal processing approach is transforming a signal from the time domain to the frequency domain. Other common areas of application are spectral analysis, compression of audio and video data, and decomposition of signals.
Widely used examples of implementations for such transformations are fast Fourier transforms (FFTs) and discrete cosine transforms (DCTs), which both use weighted sinusoidal functions to represent a signal, or more general transformations like wavelet transforms. While there are a lot of ready-to-use implementations in this field, we will focus on what must be considered and done with the data before using these implementations. For an FFT, the engineer’s go-to tool, the signals to consider fall into four types: continuous and aperiodic, continuous and periodic, discrete and aperiodic, and discrete and periodic.
In DSP, we can exclude the first two right from the start, as we cannot work on continuous signals. So what about the other two? An aperiodic signal can only be represented in the frequency domain by an infinite number of sinusoidal signals, so there is only one type of signal left for transformation: a discrete and periodic signal. The main problem with such a signal is that there is never enough time to record a periodic signal (as you would need an infinite amount of time and an infinite amount of memory). But if you just record a periodic signal for a short time interval, it is no longer periodic: it begins at zero in the infinite past, suddenly springs to life, runs for the interval, and then returns to zero for the rest of infinite time.
While this sounds as if we have just proved that we can’t do anything, there is a workaround. In practice, every kind of signal is converted into a discrete and periodic signal. Converting an aperiodic signal to a periodic signal is as simple as taking a sample and repeating it forever. But creating that artificial periodic signal creates its own issues.
Repeating even a continuous signal such as a sine wave will most likely change the signal. There is a slight chance of being able to ideally repeat the signal—for instance, if the interval you choose to replicate exactly lines up with the period of the signal. Typically, however, there are jumps in the signal at the edges of the repeating window. These jumps introduce so-called leakage into your signal processing. These leaks can be seen as wider signal peaks and high frequency noise in the Fourier spectra. You can also observe the implications of this leakage in the frequency domain. A jump in the signal at the edge of the window results in very high frequencies in the signal, which makes a representation in the frequency domain with limited functions impossible (and introduces frequencies that are not in the original signal).
To minimize the effect of leakage on a signal, the signal is typically run through a so-called windowing filter prior to replicating it. Good windowing functions all have one thing in common: they reduce the height of the jump at the boundary of the window to a relatively small value, and thus limit the effect of the leakage. Common filters for windowing are Hann (also called Hanning), Hamming, Gaussian, or Blackman windows. (A more in-depth description of these filters and their influence on the signal can be found in [5].)
Apart from the widely known FFT, another standard approach in DSP is cross-correlation. Cross-correlation is a method to compare two signals against each other. In this method, every sampled value of function A is multiplied by the corresponding sampled value of function B, shifted in time by t samples. The results of these multiplications for each data point in A are summed up, and form the result of the cross-correlation operation at time t. Then the process (essentially a dot product of the two sets) is repeated for each value of t. In practice, there are two ways of calculating the cross-correlation: either in the time domain via the sum of the sampled data multiplications, or in the frequency domain via the inverse FFT of the cross-spectral density.
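The benefit of windowing can be checked with a short numerical experiment. The sketch below is an illustration with assumed values: a record of 64 samples that contains 10.5 cycles of a sine (so the record ends mid-period), a Hann window, and a directly computed DFT bin far away from the tone. It compares how much energy leaks into that distant bin with and without the window.

```python
import cmath
import math

def dft_bin(x, k):
    """Magnitude of DFT bin k, computed directly from the definition."""
    n = len(x)
    acc = sum(x[i] * cmath.exp(-2j * math.pi * k * i / n) for i in range(n))
    return abs(acc)

def hann(n):
    """Hann window of length n (periodic form)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

N = 64
cycles = 10.5  # not an integer, so repeating the record creates a jump
signal = [math.sin(2 * math.pi * cycles * i / N) for i in range(N)]
windowed = [s * w for s, w in zip(signal, hann(N))]

far_bin = 25   # well away from the tone, which sits between bins 10 and 11
leak_rect = dft_bin(signal, far_bin)
leak_hann = dft_bin(windowed, far_bin)
print(leak_rect, leak_hann)
```

The price of the reduced leakage is a wider main peak around the tone, which is why the choice among the window types depends on the application.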
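The time-domain calculation can be written down directly. In this sketch, the signals, the restriction to non-negative lags, and the function name are assumptions chosen for illustration. The second signal is a copy of the first delayed by four samples, and the lag with the largest correlation value recovers that delay.

```python
def cross_correlation(a, b, max_lag):
    """Return {lag: sum over n of a[n] * b[n + lag]} for lag = 0..max_lag."""
    result = {}
    for lag in range(max_lag + 1):
        total = 0.0
        for n in range(len(a)):
            if n + lag < len(b):  # ignore products that fall off the end of b
                total += a[n] * b[n + lag]
        result[lag] = total
    return result

pulse = [0, 1, 3, 1, 0, 0, 0, 0, 0, 0]
delayed = [0, 0, 0, 0] + pulse[:6]  # same pulse, four samples later
corr = cross_correlation(pulse, delayed, max_lag=6)
best_lag = max(corr, key=corr.get)
print(best_lag, corr[best_lag])
```

The nested loops cost on the order of N times the number of lags, which is why the frequency-domain route becomes attractive for long records.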
Now, after having touched on the principles of how to get analog data into the digital domain, the next step is to process this data. The usual challenge is that a system design has more than one analog signal to be processed, such as multiple analog inputs in a microphone array.
There are two main alternatives to think about when processing these multiple data streams. One is the sequential approach, where each data sample is processed in order. The other is the parallel approach, where several samples of data are processed simultaneously. The parallel approach offers benefits such as faster data processing or relief for other processing elements. The problem is that not every process is parallelizable, and the possible speedup does not necessarily scale with the number of parallel processing units. Amdahl’s law is one rule for calculating this possible speedup:
M = 1 / ((1 – P) + P/S)
where M is the maximum possible acceleration from parallelization, P is the proportion of the data processing that is parallelizable, and S is the speedup of that parallelizable portion.
N parallel units do not always result in a speedup by the factor N. If the whole system is parallelizable (P = 1) and the speedup of the parallelized part is 2, the whole system’s speedup is 2. If, with the same speedup, only 50% of the system is parallelizable, the overall speedup drops to 1 / (0.5 + 0.5/2), or about 1.33.
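The formula is simple enough to evaluate directly. The following sketch (the function name is chosen here for illustration) reproduces the two cases from the text and adds a third that shows the ceiling set by the non-parallelizable part.

```python
def amdahl_speedup(p, s):
    """Overall speedup M for parallelizable fraction p and partial speedup s."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return 1.0 / ((1.0 - p) + p / s)

fully_parallel = amdahl_speedup(1.0, 2)    # whole system parallelizable
half_parallel = amdahl_speedup(0.5, 2)     # only 50% parallelizable
half_unlimited = amdahl_speedup(0.5, 1e9)  # 50% with near-unlimited units
print(fully_parallel, round(half_parallel, 2), round(half_unlimited, 2))
```

Even with an almost unlimited speedup of the parallel half, the overall gain stays just below 2, because the sequential half still takes its full time.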
Granularity defines how a system can be broken down into smaller independent parts. The granularity is an indicator for the possible speedup of parallelism. The finer the granularity, the better results can be expected from parallelizing the system.
Latency is the time difference between the occurrence of an event and the system’s response, and the aim is usually to keep it low. For example, real-time processing systems need a low latency to be able to respond to a given sample within the deadline. When implementing systems, it is often beneficial to dedicate some specific task to a parallel processing unit to lower its latency. This unit can be the separate baseband unit in a cell phone or some specialized logic in an FPGA.
In addition, often there is a real-time requirement involved when designing systems. Real-time processing means the system has a constraint in response time. A real-time system must guarantee the response within a given time. This response is usually in the region of up to a few milliseconds, but the key is that the system must not take longer to respond.
There are three types of real-time processing systems:
For real-time DSP, for example, the requirement is that the response time to a given sample must be smaller than the sampling interval, which means that the worst-case execution time must be smaller than the sampling interval.
There is a trend for DSP to move to higher bandwidths, for several reasons. For example, crowded spectra create the need for higher frequencies, and higher data rates create the need for higher bandwidths. In addition, systems such as MIMO WLAN technology must process multiple channels of data at once, which raises the bandwidth need further. The result is higher digital processing requirements, which can be met by moving to a faster processor, by using multicore processing, or by moving to a specialized technology like FPGAs.
Control loops are a good example of the above-mentioned principles of real-time processing, parallelization, and multichannel data processing. A control loop is a closed-loop system that tries to compensate for an error and keep the system at a predefined set-point. The DSM in our previous discussion is an example of a control loop.
The most basic control loop can be seen in Figure 5. The process to be controlled has input y(t) and output x(t). The disturbance z(t) interferes. The controller has the input e(t), which is the difference between the set-point w(t) and the feedback signal, and produces the output y(t).
Figure 5. Basic control loop.
In system theory, this process is usually described as a linear continuous-time system:
dx(t)/dt = Ax(t) + Bu(t)
This system can have multiple inputs and outputs. From this point, the system theory calculates the control matrix B from the prerequisites of the control. (The actual calculation of this theory can be read in [5].) These calculations include various parameters such as stability, which is calculated using the Lyapunov stability criteria.
In a digital system, time is not continuous, which prevents the use of these methods unless precautions are taken. A rule of thumb is that the classical methods can be used if the sampling frequency is at least ten times higher than the highest internal frequency. If the sampling frequency is lower, the dead time between samples affects the stability of the control loop. In that case, a z-transform is used to analyze and design these systems.
The z-transform is the counterpart to the Laplace transformation in the standard control loop design. (The interested reader can learn more about z-transforms in [6].)
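How the sample period affects stability can be seen on the smallest possible example. The sketch below makes several assumptions that are not from the article: a scalar system dx/dt = a·x with a = -1 and no input, discretized with the forward Euler rule, which gives x[k+1] = (1 + a·T)·x[k]. In z-domain terms, the discrete system has a single pole at z = 1 + a·T, and it is stable only while that pole lies inside the unit circle.

```python
def euler_pole(a, period):
    """z-domain pole of dx/dt = a*x after forward Euler discretization."""
    return 1.0 + a * period

def simulate(a, period, steps, x0=1.0):
    """Iterate x[k+1] = x[k] + period * a * x[k] and return the final state."""
    x = x0
    for _ in range(steps):
        x = x + period * a * x
    return x

a = -1.0                           # stable continuous-time system
fast = simulate(a, 0.1, 50)        # pole at 0.9: the state decays
slow = simulate(a, 2.5, 50)        # pole at -1.5: the state blows up
print(euler_pole(a, 0.1), fast)
print(euler_pole(a, 2.5), slow)
```

The continuous-time system is stable in both cases. Only the sampled version with the long period diverges, which is the kind of effect the rule of thumb on sampling frequency is meant to avoid.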
With all these prerequisites, it is now possible to model the data flow of multiple channels, such as multiple sensory inputs into a system. These inputs can, as previously described, be processed in parallel, sequence, or both. To be able to process data in parallel efficiently, it is necessary that the data does not have dependencies on each other.
Modern embedded systems that interface between the physical “real” world and the computational “cyber” world add to the challenges of signal processing. To the engineer’s rescue come new powerful digital circuits that effectively push the analog domain boundaries further in favor of digital programmable processing. These modern circuits combine one or more CPUs with DSP and look-up table (LUT) processing to give us more choices for digital signal conditioning. Versatile and fast LVDS pin pairs give us additional design choices for analog-to-digital and digital-to-analog signal conversion.