Tuesday, July 17, 2012

The Death of Hard Power Off

You've already noticed this: when you hit the power button, it takes several seconds for the device to go dark. If it has a display, your device might show a little spinning or blinking icon to indicate it is doing something. This is true for your hand-held mobile devices: your tablet, your mobile phone. It is also true for your laptop and your desktop, and for the rack-mounted servers at your data center. It is also the case for consumer devices you perhaps don't give that much thought to: your digital video recorder, your MP3 player, or, perhaps, even your automobile.

Increasingly, digital devices implement a soft power off. Which is to say, when you press the power off button, you are not turning the power off. You are informing a piece of software of your desire for it to turn the power off. A million lines of executed code later, the power turns off. Usually.

Compare this to a hard power off, which is more like a simple light switch: the instant the purely mechanical switch separates a set of metallic contacts, the electrical circuit is interrupted and power is immediately removed from the device.

Soft power off has permeated our digital devices, more or less without us users thinking about it, for one reason: the need to maintain a consistent state.

This state could be your device remembering your web page history. Or its position in your music play-list. Or where you paused your movie. Or the last number you dialed. But state can also be something a lot more abstract, data the device has to save as part of some function or service it is doing on your behalf, or even something in the realm of routine maintenance, the details of which might make your eyes glaze over if you actually had to know about it. For example, devices with global positioning system capabilities - which is nearly everything now - like to save information about the GPS satellites used during the last position fix because this can vastly speed up acquisition of the same satellites the next time you turn the device on, providing you haven't moved very far or it hasn't been turned off for very long. You appreciate this capability even if you don't know about it.

This state could be saved on a remote server for network attached devices, whether they are wireless or wireful. But more often than not these days, state is saved on a persistent  read-write storage drive embedded directly in your device. The growth of read-write storage in embedded devices has exploded in recent years. Very early mobile digital devices actually had tiny surface-mount spinning disk drives. But the introduction of less expensive flash memory, read-write persistent semiconductor memory with no mechanical parts, now dominates the mobile device market, and is beginning to dominate even the less mobile laptop market.

Sometimes this flash memory is used directly by the device; the operating system uses a file system implementation like Journalling Flash File System 2 (JFFS2) and Yet Another Flash File System (YAFFS) that makes the flash behave less like memory and more like a disk drive, and which provides the usual functional capabilities like directories and files and permission bits and the like.

Some read-write persistent storage devices, like a USB memory stick, or a microSD memory card, offer a slightly more disk-like hardware interface on top of the underlying flash, and the operating system conspires to make the storage device behave like a disk to the application software. My little shirt-pocket-sized ODROID-A4, a battery-powered and WiFi-connected platform reference device produced by Hardkernel for developers writing low level code for Samsung's Galaxy Android smartphones and tablets, uses a microSD card for its persistent storage. But the A4 layers on top of it disk partitioning and multiple EXT4 file systems, something you would have in the past expected to find on a server at the data center.

Solid state disks (SSDs) are storage devices which emulate a full blown disk hardware interfaces on top of the underlying flash memory. Not even the operating system may be able to tell the difference between the SSD and a spinning disk. I've built embedded products using SSDs that used the stock disk drivers in Linux. On these systems I had no choice but to use file system implementation tailored for disk drives, like EXT3, because that's the hardware interface I had to work with.

The introduction of read-write persistent disk-like semantics to mobile devices brings with it not just all the convenience and capabilities of having spinning disks, but all the issues the plague our mobile devices' bigger cousins that traditionally use those spinning disks. I've already written about issues of data remanence and solid-state storage devices. But here, I'm talking about basic reliability.

Perhaps you have learned the hard way to put your desktop system on an uninterruptible power supply. You may not appreciate the fact that your laptop has its own built-in UPS, but you depend on that fact just the same. And pulling the power cord out of a running server at your data center is a good way to get escorted to the door by your organization's security apparatus. There is a reason why all of these devices now implement soft power off. And why Google added a twelve volt battery to each individual server.

The reason is that as application software has become more and more complex, its demands of its underlying storage system has increasingly become more and more like a database transaction, either in fact (because it uses an actual database) or in function (because it requires atomically consistent behavior to be reliable). It is for this reason that, no matter what the nature of the underlying storage device, file system implementations like EXT3 and EXT4 have borrowed from the database world and are journalled file systems: a single atomic write to a sequential file or journal on the storage device is first done to record the intent of the following more complex multiple write operations which may be spread across the storage device. If a failure occurs during the multiple writes, the journal is consulted during the restart to repair the file system. (Log-structured file systems do away with the subsequent multiple write operations completely and merely reconstruct the vision of the file system as seen by the application software from the sequential log file as a kind of dynamic in-memory hallucination, with some performance penalty.)

Update 2016-03-09: Something I failed to make clear here is that in the case of journalled file systems, only the meta-data -- that is, only the writes done to modify the structure the file system itself -- are saved in the journal, not the writes of the data blocks. This allows the file system to be repaired following a power cycle, such that the file system structure is intact and consistent. But the data writes in progress at the time of the power cycle are lost. One of the symptoms I've seen of this is zero-length files. The file entry in the directory was completed from its record in the journal, but the actual data payload was not.

The need for consistent file system semantics has lead to a lot of research in file system architectures, because techniques like journalling are not perfect, and sometimes not adequate for applications software that have more complex consistency requirements than just knowing whether a particular disk block has been committed reliably to the storage device. But more practically, it has lead to the end of hard power off as a hardware design. Soft power off gives the software stack time to commit pending writes to storage to insure a consistent state on restart. (And for network connected devices which may depend on consistent state on remote servers, it allows for a more orderly notification of the far-end and shutdown of communication channels.)

The web is full of woeful tales of users who bricked their devices by cutting the power to them at an inopportune moment. And I have my own horror stories of products on which I've worked with read-write persistent storage but architected with only hard power off.

Hard power off is such an issue in maintaining the integrity of SSDs that the more reliable ones (by which I mean, the only ones you should ever use) implement their own soft power off, in the form of a capacitor-based energy storage system, to keep the device running long enough to reach a consistent internal state. There are a lot of SSDs that don't do this. Those SSDs are crap. As you will learn the hard way once you've cycled power on them just a few times. (If you are using SSDs in any context, adding a UPS to the system in which they are used is not sufficient. As long as power is applied, the tiny controller inside the SSD is doing all sorts of stuff, all the time, asynchronously, whether or not your system is using it, even if your system has been shutdown. Like garbage collecting flash sectors for erasure as a background task. Only the controller inside the SSD knows when it's reached a consistent state; neither the operating system nor even the BIOS has any visibility into that activity.)

This is just going to get worse. The decreasing cost of solid-state read-write persistent storage makes it more likely that it will be used in less and less expensive (and hence a greater and greater number of) small digital devices. Increasing memory sizes on digital devices allows more complex software, which places greater demands on the storage system. Larger memory also increases the amount of data cached there, typically for reasons of performance, which stretches the latency in committing the modified data to storage, and increases the likelihood that an inconsistency will happen should a failure should occur. (One of the principle differences between the EXT3 and the EXT4 file systems is the latter caches data more aggressively.) We should have expected this just by looking at the disparate technology growth curves.

As you consider expanding your product line to include more digital control in your embedded products, it will occur to you that adding some solid-state read-write persistent storage would be a really good thing: to store user settings, to allow firmware to be updated, to implement more and smarter features. Once you take that step, remember that you now face the issue of soft power off where perhaps you didn't before. Because with today's digital devices, hard power off is dead.

Update 2016-03-09: Nearly four years after having written this article, I am still trying to convince my clients that they cannot design their embedded products with a hard power off switch, even as the complexity of these products, in both software and hardware, evolves to look more and more like data center servers. And failing that, helping them try to figure out how to make their products more recoverable in the field when their file systems -- much much larger than they were four years ago -- are hopelessly scrogged.

No comments: