“Mommy, Daddy … Why is the sky blue?”
As you scramble for an answer that lies somewhere between a discussion of refraction in gases and “Oh, look—a doggie!” you already know the response to whatever you say will be a horrifyingly sincere “Why?”
And so on, until your patience lies in tatters at your feet. If you can capture that feeling, welcome to the world of Internet of Things (IoT) security, where no measure is ever quite enough, and no answer is ever unqualified. Just as with a child’s question—which could be idle chatter or the awakening of scientific curiosity—it will ultimately be up to you—the system designer—to say when enough is enough. And just as with that leading question, much will depend on what is actually at stake.
What Is at Risk
Securing an IoT device can become, as we shall see shortly, a nearly infinite regression at nearly infinite cost (Figure 1). But as ARM IoT Services group vice president Ian Ferguson observed at the recent IoT Device Security Summit, most IoT devices have cost targets far below near-infinity.
“The economics are broken,” Ferguson warned. “A secure chip cannot be a $1 chip.”
So the first question in IoT security has to be “what’s it worth to you?” And that depends entirely on what is at stake.
Unfortunately that is not an easy question—and not one that you can answer from just the intended purpose of the device. Ferguson offered the example of an attack that exploited a vulnerability in an aquarium controller to penetrate a casino’s secure network. You have to ask not what the IoT device is supposed to do, but what it will be physically capable of doing. Could it deflect an aileron on a jetliner? Manipulate a high-voltage switch in a power plant? Trigger an unnecessary maintenance procedure? Sneak malware past a firewall?
The governing principle in IoT device security is not to make the device invulnerable. You cannot. It must be to make any successful attack harder than it would be worth to a plausible attacker. Making that determination requires both an understanding of just what damage a compromised device could do, and also a close look at exactly what has to be secured in an IoT device, and how to secure it.
For our purposes, let’s say there are four kinds of objects in an IoT device: the hardware, the software and firmware, data, and messages into and out of the device. Each time the device uses one of these objects, it makes three assumptions: that the object is authentic, that it is unmodified, and that it will behave as intended.
At each use, the device can either trust these assumptions, or it can test them.
Perhaps an illustration is in order. Suppose we have an IoT device that manages a set of sensors on a high-voltage AC power switch and controls the switch position. From this functional description, the device could be a fairly simple microcontroller (MCU). Now let’s have a command arrive at the device’s network port, telling the controller to close the switch. You would expect the controller to read the command, energize the actuator, and when the switch is closed acknowledge the new state in a message back through the network port.
But let’s rewind this scenario and walk through it from the perspective of a secure design. First, the command arrives. You have some choices: do you trust that it is authentic and unmodified, or do you authenticate it? Do you trust its behavior, or do you monitor it?
If you assume the command is legitimate, you are choosing to trust a remote application in a data center—which may be a public cloud—and whatever pieces of the Internet happen to lie along the path the message took. There are some situations that justify such trust. It might be that the results of an incorrect command are inconsequential—not for our power switch, surely, but maybe for a simple consumer toy. Or it might be that your device is connected not directly to the Internet, but to a trusted hub, via a secure connection. Otherwise, you need to authenticate the command.
Fortunately, there is a reasonable way to do this using public-key encryption (Figure 2). This technique uses a pair of encryption keys, one of which is a secret known only to its owner, and the other of which is available to anyone. Holders of either key can decrypt anything encrypted with the matching key. So, in our case, a server can create a hash of the command it intends to send, encrypt it with its private key, and then send the command and the encrypted hash to your device. You can then hash the command yourself, and decrypt the encrypted version from the server using the server’s public key. If the hash you computed matches the decrypted hash, you can say with very high probability that the command came from the holder of the private key, and that it arrived unmodified.
In fact, you don’t need the hash—you can encrypt and decrypt the entire command. But public-key algorithms are compute-intensive, so devices that lack hardware crypto accelerators and lots of memory try to minimize the length of strings they must process. And the hash will be a lot shorter than any but a very concise command.
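The hash-and-sign exchange described above can be sketched in a few lines. This is a toy, not an implementation: it uses textbook RSA with tiny, insecure primes solely to make the arithmetic visible, and the command string is invented. A real device would use a vetted crypto library with full-size keys.

```python
import hashlib

# Hypothetical server key pair: textbook RSA with tiny primes (toy only).
p, q = 1009, 1013
n = p * q
e = 65537                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))      # private exponent

def digest(command: bytes) -> int:
    # Hash the command, then reduce into the RSA modulus range.
    return int.from_bytes(hashlib.sha256(command).digest(), "big") % n

def sign(command: bytes) -> int:
    # Server side: "encrypt" the hash with the private key.
    return pow(digest(command), d, n)

def verify(command: bytes, signature: int) -> bool:
    # Device side: recover the hash with the public key and compare.
    return pow(signature, e, n) == digest(command)

cmd = b"CLOSE_SWITCH"
sig = sign(cmd)
assert verify(cmd, sig)                  # authentic command passes
assert not verify(b"OPEN_SWITCH", sig)   # a tampered command fails
```

Note that only the short hash goes through the expensive modular exponentiation, which is exactly why the article's point about hashing before signing matters on a small MCU.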
So you have authenticated and verified the command. You trust that it is legitimate. But why? How do you know the command and the public key you used didn’t both come from an attacker instead of from your server? The answer is a Certification Authority: a trusted third party who has attested to the identity of the originator of that public key. But why do you trust the Certification Authority’s certificate? Because you authenticated it using the same public-key algorithm with a key they sent you. But why … well, you get the idea. The chain keeps going until you choose to trust a party and the key they gave you. Sometimes this comes down to a human being handing a document to another human being.
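That chain-walking process can be sketched as well: each certificate’s signature is checked with the key from the certificate above it, until we reach a root key we have already chosen to trust. The textbook-RSA stand-in, the three parties, and the certificate layout below are all invented for illustration.

```python
import hashlib

def toy_keypair(p, q, e=65537):
    # Textbook RSA with tiny primes -- illustration only.
    n = p * q
    return (n, e), (n, pow(e, -1, (p - 1) * (q - 1)))

def h(data: bytes, n: int) -> int:
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % n

def sign(priv, data):
    n, d = priv
    return pow(h(data, n), d, n)

def verify(pub, data, sig):
    n, e = pub
    return pow(sig, e, n) == h(data, n)

# Hypothetical three-party hierarchy: root CA -> intermediate CA -> server.
root_pub, root_priv = toy_keypair(1009, 1013)
inter_pub, inter_priv = toy_keypair(1019, 1021)
server_pub, server_priv = toy_keypair(1031, 1033)

def cert(subject, subject_pub, issuer_priv):
    payload = subject.encode() + repr(subject_pub).encode()
    return {"subject": subject, "pub": subject_pub,
            "payload": payload, "sig": sign(issuer_priv, payload)}

chain = [cert("server", server_pub, inter_priv),      # leaf
         cert("intermediate", inter_pub, root_priv)]  # signed by root

def chain_valid(chain, trusted_root_pub):
    # Each certificate must verify under the next one's key;
    # the last link must verify under the pre-trusted root key.
    for child, parent in zip(chain, chain[1:]):
        if not verify(parent["pub"], child["payload"], child["sig"]):
            return False
    return verify(trusted_root_pub, chain[-1]["payload"], chain[-1]["sig"])

assert chain_valid(chain, root_pub)
```

The `trusted_root_pub` argument is the point where the regression stops: that one key arrives out of band, sometimes literally hand to hand.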
At this point a genuine paranoid might recognize that we are still making some sweeping assumptions here. What if someone has tampered with your crypto application code, or your operating system, to trick you into accepting the command? What if someone has altered your folder of public keys? Or even more sinister, what if someone has put a secret back door in your hardware?
Each of these questions can seem implausibly neurotic if there is little at risk, or if you are not familiar with the history of attacks that have actually taken place on high-value systems. After all, who would tamper with the application code on, say, a cheap consumer device? Well, someone did, attacking thousands of IoT baby monitors to create a botnet that was then used to launch one of the most devastating denial-of-service attacks in history. It’s not neurosis: even if you aren’t designing an aircraft control system or a nuclear plant, you need to think about these issues.
Verifying software and firmware can be as simple as checking a hash of the object against a hash that you trust from when you first received and authenticated the object. Updates must be authenticated as we described above for our command. Many secure systems perform this task every time they load OS and application code. But why do you trust that hash you saved? For that matter, why do you trust your hashing function? Eventually, this regression brings us to an important concept: the root of trust.
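The load-time check described above amounts to very little code: hash the image and compare against a value recorded when the object was first authenticated. The firmware bytes below are a stand-in; in a real device the trusted hash would live in protected storage inside the root of trust.

```python
import hashlib

# Hypothetical application image and its trusted hash, recorded when the
# image was first received and authenticated.
firmware = b"\x7fELF...application image bytes..."
trusted_hash = hashlib.sha256(firmware).hexdigest()

def load_firmware(image: bytes) -> bool:
    # Re-hash the image at every load and compare against the trusted value.
    return hashlib.sha256(image).hexdigest() == trusted_hash

assert load_firmware(firmware)              # intact image loads
assert not load_firmware(firmware + b"x")   # one changed byte is rejected
```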
The Root of the Solution
Many CPU architectures now include a secure operating mode—one in which you can be reasonably sure that the code is authentic, the data correct, and all tasks are properly authorized. This mode allows you to have confidence in your device’s encryption and authentication processes, and it gives you a way to store and protect keys and certificates so that they are much more difficult for an attacker to alter.
Implementing a root of trust is not a simple job for CPU designers: they face the same infinite-regression problem users do. A root of trust must begin with trusted hardware booting trusted code. Now, “trusted hardware” is a relative term. If you are designing something harmless, it can simply mean you bought the MCU from a major vendor. If you are working on fuses for nuclear weapons, it can mean that you designed the chip yourself using rigorous standards for formal verification and functional safety; that you supervised its fabrication in a domestic fab controlled or inspected by your organization; and that you incorporated into the design tamper protection, side-channel attack prevention, and a physically unclonable function (PUF) to give the chip a secret ID number that cannot be copied. Then you can trust the hardware.
Secure boot can be equally challenging. You must be sure that the boot code is authentic and unmodified. We can of course use the same solution we used for the incoming command: public-key decryption and hashing. But that is circular—how do we decrypt signatures and compute hashes before we have loaded any code?
In large systems, the answer is a hardware security module (HSM): a board or secured box, protected by tamper and intrusion detection, that contains secure key storage and a trusted, often hardware-based crypto engine (Figure 3). The HSM supervises the secure boot process, authenticating each code module before it is enabled.
But in an MCU? The industry is still wrestling with how to reduce a box-level HSM to an IP block on a low-cost chip. Ferguson, for one, believes it cannot be done. Others argue that simpler measures, such as keeping boot and crypto code in on-die non-volatile memory, are sufficient for most systems.
However you protect the boot process, there is a third fundamental component to a root of trust: memory protection. There must be areas of memory—for trusted code, key storage, system software, and data structures—that can only be read or modified from within the root of trust. This requires a hardware memory protection unit, which itself can only be set up from within the root of trust. Clearly this memory protection unit would have to initialize to a state that enabled a secure boot.
In practice the system would work something like this. At initialization a secure boot process would create a trusted region in memory and load into it trusted code and data, including crypto routines, a trusted light-weight hypervisor, encryption keys, and certificates. This hypervisor would then load operating-system and application code in encrypted form, assigning to each task physically protected areas of memory, and decrypting and authenticating each block of code with the hash and public key method before activating it. In this way you can be relatively certain that all the active code in the device is from a trusted source and uncontaminated. And the memory protection hardware ensures that even if something goes wrong, no task can read or write another task’s code or data.
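The memory-protection idea at the heart of that flow can be sketched in miniature: every access is checked against the region the trusted configuration assigned to the task. The region addresses and task names below are invented for illustration; real protection would of course be enforced in hardware, not Python.

```python
# Minimal sketch of a memory protection unit (MPU).
class MemoryFault(Exception):
    pass

class ProtectionUnit:
    def __init__(self):
        self.regions = {}  # task -> (base, limit)

    def assign(self, task, base, limit):
        # In hardware, only code inside the root of trust could do this.
        self.regions[task] = (base, limit)

    def check(self, task, addr):
        # Any access outside the task's assigned region raises a fault.
        base, limit = self.regions.get(task, (0, -1))
        if not (base <= addr <= limit):
            raise MemoryFault(f"{task} touched {addr:#x}")

mpu = ProtectionUnit()
mpu.assign("sensor_task", 0x1000, 0x1FFF)
mpu.assign("network_task", 0x2000, 0x2FFF)

mpu.check("sensor_task", 0x1800)      # in bounds: allowed
try:
    mpu.check("sensor_task", 0x2400)  # another task's region: fault
except MemoryFault:
    pass
```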
That last feature also makes it possible for your device to run code that you don’t trust. Memory protection ensures that even malicious code will be unable to branch out of its assigned region, to corrupt its neighbors, or to perform I/O without the consent of the hypervisor.
So how do legitimate applications do I/O? Through system calls, of course. But once again you have a choice. You could simply say that all the applications are authenticated so let them use the system I/O calls as they wish. Or you could require I/O requests—and inter-task communications, for that matter—to go to the hypervisor for approval. The hypervisor could then check rules to see if the request is appropriate for the requesting task. It could even demand from the task a certificate proving its right to make the request.
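The hypervisor-mediated approach reduces, at its simplest, to a policy-table lookup. The task and resource names below are hypothetical, and a real hypervisor would also authenticate the requester rather than trust its name.

```python
# Sketch of hypervisor-mediated I/O: each task may touch only the
# resources its policy entry grants.
POLICY = {
    "switch_task": {"actuator", "sensor_bus"},
    "logger_task": {"network_port"},
}

def request_io(task: str, resource: str) -> bool:
    # The hypervisor approves only requests the policy table allows;
    # unknown tasks get nothing.
    return resource in POLICY.get(task, set())

assert request_io("switch_task", "actuator")
assert not request_io("logger_task", "actuator")  # denied by policy
```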
All of these measures are intended to prevent compromised code, data, or commands from getting into the device in the first place, or from doing something illegal once they get there. But in critical systems you may need yet another level of security: active monitoring of the device’s behavior.
The concept is simple: a trusted source knows what the device is supposed to do and not do, and it monitors the behavior of the device. At its simplest, this monitor can be a block of code, or even a hardware state machine running in the trusted region of the device. This approach makes sense when you can define correct behavior by a compact set of rules. Memory protection is an example, actually: a set of rules determines what code can access what addresses in memory, and the protection unit’s hardware enforces those rules.
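For the power-switch example, such a monitor could be as small as a table of legal state transitions checked by trusted code. The states and rules below are invented for illustration.

```python
# Rule-based behavior monitor: a compact set of legal transitions for the
# hypothetical power switch, enforced from the trusted region.
LEGAL = {
    ("open", "closing"), ("closing", "closed"),
    ("closed", "opening"), ("opening", "open"),
}

def monitor(transitions):
    # Cry foul at the first transition the rules do not allow.
    for prev, new in transitions:
        if (prev, new) not in LEGAL:
            return f"violation: {prev} -> {new}"
    return "ok"

assert monitor([("open", "closing"), ("closing", "closed")]) == "ok"
assert monitor([("open", "closed")]).startswith("violation")  # skipped a step
```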
As the rules get more complex, it may no longer be feasible to do monitoring by rules evaluated in trusted software. Or the system behavior may be too complex to successfully capture in a set of rules. One suggestion to relieve this problem is machine learning. In principle, you could train a neural network to recognize good and bad behavior without having to figure out rules. Then the network could continuously evaluate the actions of the system and cry foul when it saw something it didn’t like.
But a deep-learning network presents its own problems. People with the required design and training skills are hard to find in today’s job market. It can be hard—or impossible—to assemble the tens of thousands of tagged examples you would need for supervised learning. And the process of building the example set is itself subject to attack: a malefactor could induce bias into your example set that would cause the network to ignore a particular attack. Finally, the computing and memory resources necessary to run the trained network in inference mode are not trivial. It’s not a job for that $1 MCU.
For the most critical systems there is an extreme step: heterogeneous redundancy. In this approach you have three—or if halting and disabling the device during operation is an option, two—different systems, each designed and programmed to implement the device. Each uses different hardware, and each is programmed by a different team using different algorithms and libraries.
The three systems run in parallel, determining the behavior of the device by majority vote. To successfully hack the device, an attacker would have to penetrate two of these quite different systems simultaneously. It sounds nearly absurd to go to these lengths, but such redundant systems may be required on some military and other lives-at-stake devices.
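The voting arrangement can be sketched as three independently written checks feeding a majority function. The three one-line implementations below are stand-ins for what would really be separately designed hardware and software, built by different teams.

```python
# Heterogeneous redundancy in miniature: three independent implementations
# of the same (hypothetical) over-threshold decision, majority-voted.
def impl_a(sensor): return sensor > 100          # team A: direct compare
def impl_b(sensor): return not (sensor <= 100)   # team B: same spec, different logic
def impl_c(sensor): return min(sensor, 101) == 101  # team C: yet another route

def majority_vote(sensor):
    # The device acts only on the majority decision; one compromised or
    # faulty implementation cannot flip the outcome by itself.
    votes = [impl_a(sensor), impl_b(sensor), impl_c(sensor)]
    return votes.count(True) >= 2

assert majority_vote(150) is True
assert majority_vote(50) is False
```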
Paying the Bill
We’ve seen that developing a secure IoT device can require trusted hardware and a functional-safety-compliant methodology. It can mean crypto hardware acceleration, a tamper-proof place to store keys, trusted operating modes, real-time device monitors, or even full redundancy.
But none of this makes any sense on the $1 MCU. What to do?
Some will ignore the problem, or will offer some token, easily defeated attempt at security. Others will escalate the responsibility for security to IoT hubs, raising a security umbrella over the endpoint devices, and in effect converting the IoT design into an edge-computing network. As potential sales volume grows and the reality of the threat sinks in, still other vendors will use the vast transistor counts of advanced process nodes to attack the problem. At 7 nm, a triple-redundant MCU with some form of internal HSM could be feasible.
But ARM’s Ferguson takes a more radical view. At the Device Security Summit, he advocated recognizing up-front the real cost of securing IoT devices and changing the industry’s business model to spread the cost across the supply chain. This could be done by assessing royalties or tariffs on systems that talk to IoT devices. Or semiconductor vendors could capture a portion of the revenue or savings produced by the finished IoT systems. Or, perhaps more likely, secure devices will just cost $25 instead of $1, and really secure systems will be deployed by vertically integrated companies that can absorb the device cost into the system cost, and then recover it from service contracts, maintenance, and the value of the data collected from the systems.
Compared to the old merchant semiconductor market for MCUs, it is a whole new world. But it’s a world with a whole new level of threats, and sooner or later they have to be taken seriously.