Design Choices for Cyber-Physical Systems with FPGAs

This Technical Brief describes how to take advantage of Field-Programmable Gate-Array (FPGA) technology with embedded microprocessors in Cyber-Physical Systems design. Special attention will be drawn to multichannel systems and how to take advantage of an FPGA’s true parallel processing capabilities.

First, the traditional approach of designing a control system with a closed feedback loop and interfaces to the real world is explained. The challenges of a sequential implementation using a conventional microcontroller to meet real-time requirements for control are discussed. In the second section, an alternative approach is presented which takes advantage of FPGAs to process data in a truly parallel manner. By completely segregating the timing issues of the real-time control from the interfaces to the physical world and the user interface, the development of the control system becomes considerably easier and more rigorous. At the same time, the FPGA building blocks forming the interface and the control algorithm can be easily exchanged and adapted to various requirements. This results in more flexibility and more degrees of freedom to explore the design space for different control methods and interfaces. Finally, in the last section, we demonstrate how to take advantage of pre-validated system platforms such as the Missing Link Electronics “Soft” Hardware Platform to ease the hardware-to-software interface design challenges and to implement Real-Time Control Systems rapidly.

Copyright © 2012 Missing Link Electronics. All rights reserved. Missing Link Electronics, the stylized Missing Link Electronics MLE logo are the service mark and/or trademark of Missing Link Electronics, Inc. All other product or service names and trademarks are the property of their respective owners.


Cyber-Physical Systems (CPS) and, as a special kind of CPS, so-called Real-Time Control Systems (RTC) must integrate computational and physical processes [Lee2007]. As a result, their design complexity is dominated by managing time and concurrency in the computational part [Marwedel2010]. Additional complexity arises from the many different I/O standards for the sensors and actuators that have to be connected. And, because of the long lifecycles of RTC, which nowadays can span multiple semiconductor technology generations, device obsolescence and long-term parts availability add to the design challenge. Many applications in this realm have demanding real-time requirements. Typically, there is a software application which interacts with a user or an external system via some dedicated interface while at the same time closed-loop control must maintain certain modes of operation and must react in real-time to arbitrary external disturbances.

The challenge is that during control system design, there are two completely different tasks to be fulfilled; both come from different domains with different requirements merged together into one single system: The real-time control and the user software application.

In many control systems the user software application implements a man-machine-interface for system status visualization and for management of certain operational parameters. While user interaction is sporadic, the closed-loop real-time control portion must regulate the environment with minimum jitter at all times. Things get more complicated if the control system consists not only of a single channel but of several channels with different timing constraints, which is a very common setup.

In this Technical Brief we will show the principles of designing closed-loop Real-Time Control Systems taking advantage of inherently parallel computing Field Programmable Gate Array (FPGA) technology. This can avoid real-time problems during the implementation of complete systems, can facilitate the design exploration, and makes production systems more robust and easier to scale.

1 Cyber-Physical Systems

Figure 1 shows an example of the building blocks of a basic control system. The aspect of the physical system to be controlled, which is a state variable like rotational speed, temperature, flow, etc., can be seen on the right-hand side. This physical value is either directly measured by sensors or, if this is not possible for some reason, calculated from other sensor data, and thus obtained indirectly.


Figure 1: Basic Structure of a Multichannel Control System

Once the value of the physical variable is known to the system, it is compared to an externally given setpoint, which is the target value for this variable. This comparison is done by computing the difference, the so-called control deviation, between the setpoint and the current value of the state variable. The control deviation is forwarded to a controller, which then adjusts its outputs according to its inherent control strategy to bring the system closer to the target setpoint. This output controls an actuator which influences the physical behaviour of the system under control. In most systems the setpoint is not fixed, but can be influenced by the user of the system or some control software.
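
The loop described above can be sketched in a few lines of Python. This is only an illustration under assumptions of our own: the proportional control strategy, the gain kp, and the toy plant that simply adds the actuator output to its state are hypothetical and stand in for whatever controller and physical system a real application uses.

```python
def control_deviation(setpoint, measured):
    """Control deviation: difference between target and current value."""
    return setpoint - measured


def proportional_controller(deviation, kp=0.5):
    """Hypothetical control strategy: output proportional to the deviation."""
    return kp * deviation


def simulate(setpoint, initial_state, steps):
    """Run the closed loop against a toy plant that accumulates the actuator output."""
    state = initial_state
    history = []
    for _ in range(steps):
        deviation = control_deviation(setpoint, state)  # compare with setpoint
        actuator = proportional_controller(deviation)   # controller output
        state = state + actuator                        # toy plant response
        history.append(state)
    return history


# The state variable approaches the setpoint over successive cycles.
trace = simulate(setpoint=100.0, initial_state=0.0, steps=20)
```

With these assumed values the state halves its distance to the setpoint in every cycle.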

Also there may be multiple channels with different physical values and different dynamic behavior to be controlled. This is indicated by the stacked layers in Figure 1.

One challenge in designing a closed-loop control system is the selection of the right control algorithm for the application including the right set of parameters for this algorithm. An extensive design exploration trying out different control algorithms with different parameter sets and different speeds of execution is crucial for finding the optimum solution.

1.1 Traditional Approach Using Microcontrollers

Let’s first look at how to implement a closed-loop Real-Time Control System, together with the necessary user application, in software on a microcontroller. The basic layout of such a system is shown in Figure 2.


Figure 2: Architecture of Microcontroller-Based Control System

As the microcontroller has to cope with both, the continuous control and the user interaction, the software has to implement both parts. Being only capable of executing operations sequentially, a microcontroller can only work on one task at a time. Hence the inherently parallel problem — maintaining certain target values while processing a user application — has to be serialized first. This can be implemented by a program structure as shown in Figure 3.


Figure 3: Flowchart of a Microcontroller-Based Control System

The calculation of the controller output has to happen quasi-continuously, or at least on a very regular, deterministic basis with respect to timing. To achieve this, the complete control algorithm including data acquisition and data output is done in an interrupt service routine that is called regularly using a timer interrupt. This timer interrupt divides the program execution into time slices. In every time slice the control algorithm has to be performed to acquire sensor measurements, compute the next control value and propagate this value to the output stage. When these tasks are finished, the user application may use the processor for the rest of the time slice. This concept is similar to time-triggered architectures [Kopetz03] and enables the design of deterministic, distributed systems.

In Figure 4 a visualisation is given on how the flowchart of Figure 3 is handled in the microcontroller using interrupt service routines (ISR) and "normal" program execution.


Figure 4: Timing in an interrupt driven control application

The time slices must be large enough such that the entire control algorithm is guaranteed to be completed within one slice leaving enough computing time to execute the "background" user application. Between the execution of controller code and the user application there is some time needed for the context switch. As the different portions of the computations within each time slice are not exactly the same in each cycle, additionally there has to be some slack for worst case execution times to allow the algorithm to "breathe". Depending on the chosen input and output devices (mostly analog-to-digital and digital-to-analog converters) there may be some additional latency for data acquisition and data output.

As accurate timing is crucial for the system to work, the first step in the software development process is to decide on a specific control algorithm and then to validate the implementation with respect to timing. Furthermore, some interfaces (universal asynchronous receiver/transmitter (UART), Serial Peripheral Interface (SPI), Inter-Integrated Circuit Bus (IIC), etc.) may be supported directly by the microcontroller, while others may need to be implemented via software routines. Typically, this depends on the microcontroller used and can add to the design complexity.

Once the definition and implementation of the control algorithms and the interface routines are done, it is necessary to estimate the worst-case execution time and, thereby, to define the minimum cycle time. This cycle time must be sufficiently long (including slack), as some background processing time must be available to run the user application. Even though this user application is not as time critical as the control algorithm, there must be enough slack in processing time to achieve sufficient responsiveness to the user's input.
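
As a rough illustration of this budgeting step, the following Python sketch adds up the contributions named above. All numbers, the parameter names, the 20 percent slack, and the assumption of one context switch into and one out of the interrupt service routine are hypothetical; real values have to come from measurement or analysis of the actual target.

```python
def minimum_cycle_time_us(wcet_control_us, io_latency_us, context_switch_us,
                          background_budget_us, slack_percent=20):
    """Estimate the shortest safe time slice in microseconds (integer arithmetic)."""
    # Work that must finish in every slice: I/O, control algorithm, and one
    # context switch into and one out of the interrupt service routine.
    busy_us = wcet_control_us + io_latency_us + 2 * context_switch_us
    # Slack on top of the worst case, rounded up to a whole microsecond.
    slack_us = (busy_us * slack_percent + 99) // 100
    # The user application needs its own share of every slice.
    return busy_us + slack_us + background_budget_us


def max_loop_rate_hz(cycle_time_us):
    """Highest control loop repetition rate for a given time slice."""
    return 1_000_000 / cycle_time_us


cycle_us = minimum_cycle_time_us(wcet_control_us=120, io_latency_us=30,
                                 context_switch_us=5, background_budget_us=100)
rate_hz = max_loop_rate_hz(cycle_us)
```

With these made-up figures the time slice comes out at 292 microseconds, which corresponds to a loop rate of roughly 3.4 kHz.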

To the novice control system designer it may sound strange to run the actual user application in the background while having the timer interrupt routine do a major part of the work. Nevertheless, one gets used to writing software in this manner and, after some hands-on experience, most likely gets such code to work after all.

However, it remains difficult to write such code, and a lot of caution is necessary to keep track of all the different timing and execution layers. One common pitfall, for example, is the handling of nested interrupts, which is necessary as the software spends quite a bit of time in the timer interrupt handler. Even with a carefully designed system and a powerful microcontroller it is sometimes not possible to deliver a repetition rate for the closed-loop control faster than the low kHz range. This rate falls approximately in proportion to the number of channels to be controlled.

1.2 Multichannel Control System

Very often, multiple channels must be controlled concurrently by one single control unit. Examples are multiple motors or certain lighting applications synchronised to multi-axis motion control, etc. Such a multichannel closed-loop Real-Time Control System is outlined in Figure 5.


Figure 5: Microcontroller-Based Multichannel Control System

Under normal circumstances it is difficult enough to serialize a closed-loop control system for a single channel onto a sequential microcontroller and maintain all the real-time constraints. But having multiple channels with different time domains can make things really complicated. For example, because the different closed-loop control paths run at different speeds, the interrupt time slice has to be chosen so that it divides every loop period evenly, similar to finding a common denominator.


Figure 6: Flowchart of a Multichannel Microcontroller-Based Control System

As shown in Figure 6, there is no longer a linear, straightforward software flow. Quite the contrary, in every interrupt cycle a decision has to be made as to which task needs to be computed next and which task does not need attention for the moment. Calculating the worst-case execution time becomes a complex design challenge in its own right, which may require queuing theory to handle correctly. In addition to the increased complexity in software development, this implementation inevitably introduces a considerable amount of jitter into each of the individual closed-loop control channels.
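
The per-interrupt decision can be illustrated with a small Python sketch. It assumes, for simplicity, that every channel has a fixed period and that the timer tick is the greatest common divisor of all periods; the function names and the three periods are hypothetical.

```python
from functools import reduce
from math import gcd


def base_tick_us(periods_us):
    """Largest timer period that divides every channel period evenly."""
    return reduce(gcd, periods_us)


def channels_due(tick_index, tick_us, periods_us):
    """Indices of the channels whose control step falls on this tick."""
    now_us = tick_index * tick_us
    return [i for i, period in enumerate(periods_us) if now_us % period == 0]


periods = [200, 500, 1000]      # hypothetical channel periods in microseconds
tick = base_tick_us(periods)    # timer interrupt period
schedule = [channels_due(n, tick, periods) for n in range(10)]
```

In this example some ticks have to serve all three channels while others serve none, so the workload per interrupt varies from tick to tick. This uneven load is one source of the jitter mentioned above.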

2 FPGA-Based Parallel Approach

Instead of serializing the entire control system, one can decouple and segregate the different tasks of the closed-loop control system and run each task independent from each other, concurrently in modular hardware. This is an obvious approach to get out of timing trouble.

In a true parallel system, the user application and the control loop computations can be independent from each other. They may have to exchange data for current setpoints and current system states, for example, but this does not require any tight coupling. Instead it can happen asynchronously through a defined interface such as message passing or shared memory.
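
A loosely coupled exchange of this kind can be sketched in Python with a single shared memory slot. This is an analogy in software, not a description of FPGA hardware: the Mailbox class, the two threads, and the setpoint values are assumptions made for illustration.

```python
import threading


class Mailbox:
    """Single-slot shared memory: the writer overwrites, the reader sees the latest value."""

    def __init__(self, initial):
        self._lock = threading.Lock()
        self._value = initial

    def write(self, value):
        with self._lock:
            self._value = value

    def read(self):
        with self._lock:
            return self._value


setpoint_box = Mailbox(initial=0.0)


def user_application():
    """Sporadically publishes new setpoints; never waits for the control loop."""
    for target in (10.0, 20.0, 30.0):
        setpoint_box.write(target)


def control_loop(cycles, seen):
    """Picks up whatever setpoint is current at the start of each cycle."""
    for _ in range(cycles):
        seen.append(setpoint_box.read())


seen = []
writer = threading.Thread(target=user_application)
reader = threading.Thread(target=control_loop, args=(100, seen))
writer.start()
reader.start()
writer.join()
reader.join()
```

Neither side has to know when the other one runs; the control loop simply uses the most recent setpoint it finds.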

By contrast, in the microcontroller-based approach these two independent tasks become very tightly coupled and influence each other very strongly, because computation is done sequentially with one task possibly blocking the resources of the other task.

One solution to achieve such parallelism is to use Field Programmable Gate-Arrays (FPGA) and to implement each different task of the control system using a separate hardware module which all run concurrently.

FPGAs can be seen as flexible hardware processing devices that may be configured to execute almost any digital computation. The possibilities range from simple glue logic to advanced microprocessor designs that can be implemented in FPGA logic. Due to their programmability and the flexible interconnection inside it is easy to maintain the inherent parallelism of concurrent tasks. One application very well suited for FPGA logic is to run finite state machines as they are fundamental to many of the control problems described above. Additionally, with the possibility to embed one (or more!) microprocessors into an FPGA, sequential software applications can also be handled efficiently by FPGA devices.

During the design phase of a Real-Time Control System, processing can be divided into several tasks with well-defined interfaces to each other. In our example, the tasks of a closed-loop Real-Time Control System (as shown in Figure 2 and Figure 3) are: data acquisition by reading the sensors, signal conditioning, computation of the output control signal, data output to the actuator under control, and application software for user interaction.

The first four tasks can be implemented via independent hardware modules realized directly in configurable FPGA logic, while the application software stays on a sequential processor which is now exclusively available to the user application.


Figure 7: Architecture of an FPGA-Based Control System

As you can see in Figure 7, four different tasks are now implemented in hardware modules that run independently and parallel within the FPGA. Even though there is of course some synchronisation and communication between the hardware modules, this is done inside FPGA logic and not by a sequential program as in Figure 2. As a result, each individual module can now be modified, copied or replaced individually, without influencing the timing behaviour of the other modules.

Modularization also enhances design efficiency as it enables concurrent engineering and early testing of the system. We will now have a closer look at the different hardware modules used to build such an FPGA-based Real-Time Control System.

2.1 Data Acquisition Module

The Data Acquisition Module together with the connected sensors is responsible for measuring the values of the system’s physical state in each control cycle. It can be realized as a simple state machine that acquires data from an external sensor and stores it into a register over and over again. The acquisition speed can be provided externally to the module to synchronize the complete system.
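
The behaviour of such a state machine can be modelled in Python as follows. The four states, the sensor interface with start(), ready() and read(), and the FakeSensor stand-in are hypothetical; in an FPGA the same structure would be written in a hardware description language and advanced by the system clock.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    START_CONVERSION = auto()
    WAIT_READY = auto()
    STORE = auto()


class DataAcquisition:
    def __init__(self, sensor):
        self.sensor = sensor      # hypothetical object with start(), ready(), read()
        self.state = State.IDLE
        self.register = None      # last stored sample

    def clock(self, sample_tick):
        """Advance one clock cycle; sample_tick is the external acquisition trigger."""
        if self.state is State.IDLE:
            if sample_tick:
                self.state = State.START_CONVERSION
        elif self.state is State.START_CONVERSION:
            self.sensor.start()
            self.state = State.WAIT_READY
        elif self.state is State.WAIT_READY:
            if self.sensor.ready():
                self.state = State.STORE
        elif self.state is State.STORE:
            self.register = self.sensor.read()
            self.state = State.IDLE


class FakeSensor:
    """Stand-in sensor that needs two polls before a value is ready."""

    def __init__(self, values):
        self._values = iter(values)
        self._countdown = 0

    def start(self):
        self._countdown = 2

    def ready(self):
        self._countdown -= 1
        return self._countdown <= 0

    def read(self):
        return next(self._values)


daq = DataAcquisition(FakeSensor([42, 43]))
for tick in (True, False, False, False, False):
    daq.clock(tick)
```

After the trigger, the model starts a conversion, waits until the sensor reports ready, stores the sample in its register, and returns to the idle state.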

Depending on the sensors in use it may be necessary to follow a certain protocol to access the sensor. For example, it is very common to connect sensors via analog-to-digital converters (ADC), which in turn support connectivity via SPI or IIC interfaces. Optionally, very fast ADCs can be connected via the FPGA’s LVDS interfaces. Sometimes special protocols are required to interface to the sensors. Such protocols can also be implemented directly within the FPGA logic, so that the entire data acquisition is completely contained within the Data Acquisition Module. The advantage is that changing sensors has little or no impact on the rest of the system. This significantly reduces the design risks for late changes, for example, and is one of the key benefits of using FPGAs for Real-Time Control Systems.


2.2 Signal Conditioning Module

The Signal Conditioning Module converts the acquired data into an internal data format suitable for further processing. Data formats can in many cases be based upon a numerical fixed point representation which is very suitable for FPGA logic. Depending on the measured physical values it may also be necessary to perform certain data pre-processing before passing the value to the control algorithm module. Such digital signal processing operations may include the computation of a derivative, for example to transform a rotational speed of a wheel into a unidirectional speed over ground. Other examples of data pre-processing include low-pass filtering of the signal to cancel out distortions.
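
Both steps, conversion to a fixed-point format and low-pass filtering, can be sketched with integer arithmetic only. The Q-format with 8 fractional bits, the shift-based first-order filter, and the function names are assumptions chosen for illustration; they are one of many possible conditioning schemes.

```python
FRAC_BITS = 8  # assumed fixed-point format with 8 fractional bits


def to_fixed(value):
    """Convert a real value to a fixed-point integer."""
    return int(round(value * (1 << FRAC_BITS)))


def to_float(fixed):
    """Convert a fixed-point integer back to a real value."""
    return fixed / (1 << FRAC_BITS)


def low_pass(samples_fixed, shift=2):
    """First-order IIR low-pass, y += (x - y) / 2**shift, using shifts and adds only."""
    y = 0
    out = []
    for x in samples_fixed:
        y += (x - y) >> shift   # division by a power of two is a plain bit shift
        out.append(y)
    return out


step_input = [to_fixed(1.0)] * 50   # a step from 0 to 1.0
filtered = low_pass(step_input)
```

Filters of this shape need no multiplier, which keeps the logic footprint small. Because the shift truncates, the output settles slightly below the input value; a wider internal word would reduce that offset.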

As we will explain in the next section, for most of those signal conditioning algorithms there exist pre-designed signal processing hardware blocks that can be combined to quickly build very complex pre-processing. Using those hardware blocks, for each sensor input (or more precisely, for each Data Acquisition Module) there can be a corresponding, independent Signal Conditioning Module.


2.3 Control Algorithm Module

A key portion is the implementation of the closed-loop control algorithm. The control algorithm has to compute the next output value for the controlled system depending on the current sensor feedback, and it must be appropriate for the given control system application.

The careful selection and design of the control algorithm is critical to the quality of results and the robustness of the Real-Time Control System. Now, using FPGA technology, the different tasks are decoupled, independently implemented in hardware modules and it becomes much easier to explore different control algorithm strategies using different configurations without interfering with the overall system’s timing behavior.

There exist plenty of different control principles that are more or less suited to a concrete problem and, normally, it is left to the designer to deliver the best suited algorithm for the specific problem. Even after deciding on an appropriate control algorithm, the design exploration is not yet finished. For most of the algorithms there exist different hardware implementation alternatives. Depending on the Real-Time Control System’s cycle time, and utilizing the ability to trade off area against speed in an FPGA implementation, an algorithm can be implemented either in a parallel manner to achieve maximum computation speed or in a more sequential fashion to save FPGA resources. As an example, a comparison of different hardware implementations of the popular PID algorithm can be found in [WEI].

One fast implementation shown there is given in Figure 8; it needs only a single clock cycle to calculate the next control output from the measured input. Compared to a microprocessor-based implementation this is an extremely fast computation and, hence, can lead to extremely short cycle times. This aspect highlights yet another benefit of FPGA-based RTC implementations: the control cycle times can be sped up by orders of magnitude.


Figure 8: Speed-Optimized Control Algorithm Module for PID Control

Another – slower but area-optimized – implementation of the same PID controller is shown in Figure 9.
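
For orientation, a textbook discrete PID step is sketched below in Python. It is not the implementation from [WEI] or from Figures 8 and 9; the class name, the integer gains, and the choice to fold the sample time into the gains are assumptions.

```python
class PID:
    """Textbook discrete PID controller; the sample time is folded into the gains."""

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0
        self.previous_error = 0

    def step(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error
        derivative = error - self.previous_error
        self.previous_error = error
        # The three products do not depend on each other. A speed-optimized
        # hardware design could compute them side by side, while an
        # area-optimized design could reuse one multiplier over several cycles.
        p_term = self.kp * error
        i_term = self.ki * self.integral
        d_term = self.kd * derivative
        return p_term + i_term + d_term


pid = PID(kp=2, ki=1, kd=3)
first_output = pid.step(setpoint=10, measured=4)
second_output = pid.step(setpoint=10, measured=7)
```

The same arithmetic can thus be mapped either to three multipliers working in parallel or to a single multiplier used three times, which is the area versus speed trade-off discussed above.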

2.4 Data Output Module

In its structure the Data Output Module is very similar to the Data Acquisition Module, except that the Data Output Module takes the output of the Control Algorithm Module, possibly does data pre-processing and then drives the actuators. Most often this is implemented as a direct pulse-width modulation or as a sigma-delta converter feeding a simple power output stage like a full- or half-bridge driver. In other system configurations, the data output may drive an external digital-to-analog converter (DAC), connected via SPI or similar protocols. Data pre-processing may be as simple as clipping, to limit the output to a certain valid range. Or, it may involve sophisticated signal processing implemented in FPGA logic. Because of an FPGA’s processing power, digital signal processing algorithms that can be implemented as a DSP program generally also fit into most FPGA devices.
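
Clipping and a simple sigma-delta output stage can be modelled as follows. The sketch assumes a first-order, accumulator-based modulator with an unsigned input range; the function names, the full-scale value of 256, and the stream length are hypothetical.

```python
def clip(value, low, high):
    """Limit the controller output to the valid actuator range."""
    return max(low, min(high, value))


def sigma_delta(level, full_scale, num_cycles):
    """First-order sigma-delta modulator.

    Returns a bit stream whose density of ones approximates level / full_scale.
    """
    accumulator = 0
    bits = []
    for _ in range(num_cycles):
        accumulator += level
        if accumulator >= full_scale:
            accumulator -= full_scale   # carry out: emit a one
            bits.append(1)
        else:
            bits.append(0)
    return bits


command = clip(64, 0, 255)               # controller output after clipping
stream = sigma_delta(command, 256, 256)  # one bit per clock cycle on the output pin
```

Here a command of 64 out of 256 yields a stream in which one quarter of the bits are ones; an external low-pass filter or the inertia of the actuator averages this stream into an analog level.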


Figure 9: Area-Optimized Control Algorithm Module for PID Control


2.5 User Application Software

The last task to be implemented is the user application program which interacts with the user and propagates high level parameters from and to the control system. This portion is best suited to be run as a software program on a microprocessor. In contrast to the traditional microprocessor-based approach, almost all time-critical and performance hungry control and interface tasks are offloaded from the microprocessor into dedicated hardware modules. Now, the microprocessor is available almost exclusively to the user application software.

The effect of such a system partitioning is that the closed-loop control part and the real-world interface with hard real-time requirements are now completely decoupled from the user software, which normally has far less stringent requirements, at least as far as real-time behavior is concerned. The software programmer can now focus on the functionality of the software instead of struggling with complex interrupt timings to force an inherently parallel problem into a sequential compute scheme. As we will show in the last section, the many choices for CPUs embedded inside Xilinx FPGAs make them well suited for running the user application software.


2.6 Better Scalability

Once a control algorithm is implemented for one single channel, it is easy to scale the Real-Time Control System to multiple channels. This is as easy as instantiating another set of hardware modules for the Data Acquisition Module, the Signal Conditioning Module, the Control Algorithm Module and the Data Output Module. Assuming sufficient logic resources in the FPGA, there is no impact on timing and controller cycle time, regardless of whether a single channel or several channels must be controlled. As an example, a three-channel FPGA-based Real-Time Control System is shown in Figure 10.


Figure 10: FPGA-Based Multichannel Control System Using Hardware Replication

Often, a multichannel Real-Time Control System has very different real-time requirements for each channel. Sometimes, the real-time constraints of the individual channels are orders of magnitude apart. In traditional microprocessor-based Real-Time Control System design this poses a serious challenge, as it either complicates the interrupt service routines or requires that slower control channels use the cycle time of the fastest channel. However, in an FPGA-based Real-Time Control System each channel operates in parallel and, thus, can be optimized independently of the other channels.

Therefore, it is possible to implement the most timing constrained channel using a speed-optimized control loop while less demanding channels use more resource optimized control loops.
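
One way to picture independently timed channels is to give each channel its own clock divider, as in the following Python model. The ClockDivider class, the channel names, and the divide ratios are assumptions; in hardware each divider would be a small counter running alongside all the others.

```python
class ClockDivider:
    """Emits an enable pulse every `divide_by` system clock cycles."""

    def __init__(self, divide_by):
        self.divide_by = divide_by
        self.count = 0

    def clock(self):
        self.count += 1
        if self.count == self.divide_by:
            self.count = 0
            return True
        return False


# Each channel owns its divider, so changing one ratio does not affect the others.
dividers = {"fast": ClockDivider(2), "slow": ClockDivider(5)}
pulses = {name: 0 for name in dividers}
for _ in range(20):                      # 20 system clock cycles
    for name, divider in dividers.items():
        if divider.clock():
            pulses[name] += 1
```

Over 20 system clock cycles the fast channel is enabled 10 times and the slow channel 4 times, and neither schedule depends on the other.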

Again, these and other optimizations become possible because of the parallel processing capabilities of FPGA technology, which allow each task of a closed-loop Real-Time Control System to be decoupled and implemented in an independent hardware module.

3 Hardware / Software Co-Design Aspects

When building an FPGA-based Real-Time Control System with different hardware modules and user application software, another challenge lies in the integration of hardware and software together with robust interfaces between both domains.

The “Soft” Hardware Platform from Missing Link Electronics [MLE] aims at solving that problem using a pre-validated and extensible system platform which provides the missing links between the FPGA hardware and the operating system and application software. Most functionality is integrated into one single configurable System-on-Chip microcontroller which is optimized for FPGA implementation and tuned towards running Open Source Linux. The platform comes with a full GNU/Linux software stack pre-installed and tested and supports industrial PC connectivity (USB, SATA, Bluetooth, DVI, AC97), full TCP/IP networking support over Ethernet, WiFi, UMTS/GSM, plus high-speed CAN, configurable LVDS GPIO and more.

The exemplary system architecture in Figure 11 shows how both I/O connectivity and data processing can be extended to accommodate different requirements of different Real-Time Control System applications.


Figure 11: Custom Multichannel closed-loop controller in the MLE 1000 RPS environment.

At the I/O layer, the versatile FPGA pins can be configured to implement Low-Voltage Differential-Signaling (LVDS) and/or single-ended General Purpose IO (GPIO). As we have outlined in section 3, the pins of a Spartan-6 device support different voltage levels and frequencies from 1 Hz to many hundred MHz. This makes it possible to implement a wide variety of configurable interfaces to connect sensors, actuators, or amplifiers: adjustable SPI, IIC or LVDS based external devices can be connected via predesigned hardware-software building blocks. Data output supports direct pulse-width modulation (PWM) or Sigma-Delta converters. These Sigma-Delta converters can be implemented using FPGA logic and basically turn an FPGA pin either into a configurable analog-to-digital or into a digital-to-analog converter with reasonably high quality.

The MLE “Soft” Hardware turns the embedded MicroBlaze CPU into a microcontroller implemented as a configurable System-on-Chip. As Figure 11 illustrates, one or more hardware blocks for closed-loop Real-Time Control System channels can be integrated. To add the missing control system functionality one inserts the additional custom hardware modules (such as the Data Acquisition Modules, Signal Conditioning Modules, Control Algorithm Modules and Data Output Modules) into the platform and connects them to the microcontroller’s system bus. Once corresponding device drivers for the Linux operating system have been added, these hardware-implemented channels can interface to the software world. The MLE “Soft” Hardware comes with templates for GNU/Linux kernel modules to serve as device drivers. For example, using the standard Linux concept of the virtual filesystem /sys, software programs can read data from and write data to the closed-loop real-time control system hardware modules.
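
From the application side, access through /sys amounts to ordinary file reads and writes. The sketch below assumes that a driver exposes each parameter as a text file holding one integer; the attribute name setpoint and the helper functions are hypothetical, and a temporary directory stands in for the real /sys location, which depends on the driver.

```python
import tempfile
from pathlib import Path


def read_attribute(directory, name):
    """Read an integer value from a sysfs-style attribute file."""
    return int((Path(directory) / name).read_text().strip())


def write_attribute(directory, name, value):
    """Write an integer value to a sysfs-style attribute file."""
    (Path(directory) / name).write_text(f"{value}\n")


# A temporary directory stands in for the driver's directory under /sys.
with tempfile.TemporaryDirectory() as fake_sysfs:
    write_attribute(fake_sysfs, "setpoint", 1500)
    readback = read_attribute(fake_sysfs, "setpoint")
```

Because the interface is file based, the same access works from shell scripts or any scripting language available on the platform.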

At the application software layer the MLE “Soft” Hardware ships with hundreds of open source packages, pre-compiled, tested and ready-to-run. These packages include filesystems, TCP/IP networking, scripting languages and more, and help to jumpstart the development of user application software. Therefore, the platform is suitable for prototyping or for implementing customized, production Real-Time Control Systems. Starting from a known-good system significantly reduces time-to-market and design risks and makes it possible to take advantage of FPGA-based Real-Time Control Systems by focusing on the controller functionality.



[AN380]    Herbert Sax, ST Microelectronics:

[Kopetz03]    Hermann Kopetz et al.:
The Time-Triggered Architecture, January 2003.
Proceedings of the IEEE, Vol. 91, No. 1

[Lee2007]    E. A. Lee.
Computing Foundations and Practice for Cyber-Physical Systems: A Preliminary Report, May 2007.
Technical Report UCB/EECS-2007-72, University of California, Berkeley

[Marwedel2010]   Peter Marwedel:
Embedded and Cyber-Physical Systems in a Nutshell, 2010 DAC.COM Knowledge Center Article

[MLE]    Missing Link Electronics, Inc.:
Pre-Validated “Soft” Hardware Platform

[WEI]    Wei Zhao, Byung Hwa Kim, Amy C. Larson, and Richard M. Voyles:
FPGA Implementation of Closed-Loop Control System for Small-Scale Robot, July 2005.