Monday, May 22, 2017

The Cost of Doing Business

My software development career over the past forty-plus years has been all over the map: supercomputers, eight-bit microcontrollers, vast distributed telecommunications systems, multi-core microprocessors. The one thing they all have in common - at least at the level I work at, near bare metal - is that the skill sets are all basically the same. But one thing that isn't the same is the cost of the tools to debug, functionally test, and validate my work.

Perhaps twenty years ago, I was writing firmware in C and C++ for optical networking products based on ATM and SONET. One night I found myself working late in a lab trying to debug some recalcitrant real-time code. I was surrounded by an H-P broadband analyzer, two H-P ATM protocol analyzers, and a tangle of multimode optical fiber. I realized I had maybe US$300,000 worth of test equipment dedicated to this one task, never mind my high-end workstation and the room-filling racks of telecommunications equipment. I paused for a moment to realize that an organization smaller than Bell Labs might have had a tough time tackling this product development effort, the investment in infrastructure was so significant.

Moore's Law hasn't really changed things that much, because as the tools got cheaper, the systems we develop and debug have gotten larger and more powerful right along with them. Just the other day I was working on my stratum-0 atomic clock project and I realized that my little hobbyist oscilloscope - one of the most useful tools I have ever purchased - was completely inadequate for doing any kind of jitter analysis on the output of the device. After some reflection, I decided I probably didn't know anyone that had an oscilloscope that could measure the jitter of my chip-scale atomic clock. Any jitter that might be detected would surely be in the oscillator used by the measurement instrument itself.

See, our measurement of time is a funny thing. It is a completely synthetic artifact, a singularly human invention. Thanks to natural variation in the Earth's rotation and in its orbit, time isn't even tied to the movement of the Sun any longer. It's defined, in the International System of Units, to the beat of an oscillator based on the cesium-133 time standard. And the thing I was trying to measure was itself an oscillator based on the cesium-133 time standard. The only way we can measure the precision and accuracy of a clock is by using a more precise and more accurate clock. I'm not likely to ever have one of those.

The larger lesson here is this: before you decide to tackle a project, make sure you can get the tools you need for the entire life-cycle of the product. You may find out that the economics of developing a product are substantially different from the economics of debugging, testing, validating, and supporting that product.

Wednesday, May 17, 2017

Engines of Time

In My Stratum-1 Desk Clock and My Stratum-0 Atomic Clock I described the two clocks that I built: O-1 a.k.a. "Hourglass", and O-2 a.k.a. "Astrolabe". Both act as stratum-1 NTP servers on my home network, along with two commercial units, "Waterclock" and "Sundial", that I have previously described. I mounted both clocks on inexpensive acrylic fixtures from office supply stores that just happened to be about the right size and shape to serve as stands to display my handiwork.

Hourglass on a Stand

Hourglass is my desk clock sitting on a shelf in my home office. It is a Raspberry Pi 3 running Raspbian, the GPS daemon, and the NTP daemon. It connects to my home network wirelessly, although the wired port works as well. It uses the NMEA output and 1PPS strobe from a Uputronics GPS board to discipline its system clock. It has a real-time clock board that with the push of a button can save the time to a battery-backed clock to preserve its state should it need to be powered off. Thanks to the standard Linux kernel and GNU software, it automatically adjusts to Daylight Saving Time, and GPS itself automatically adjusts to the occasional leap second. As long as it can receive GPS signals from an amplified antenna sitting in my office window, it will be within milliseconds (at least) of Coordinated Universal Time (UTC).

Astrolabe sits on a small table in the living room. It is virtually identical to Hourglass in both hardware and software, but replaces the Uputronics GPS board with a Jackson Labs Technologies CSAC GPSDO, a board that has a Microsemi Chip-Scale Atomic Clock that it phase locks to GPS time. Astrolabe receives NMEA output from the GPS chip on the GPSDO board, but gets its 1PPS strobe from the CSAC cesium-133 oscillator. Once the CSAC is initially disciplined to the GPS 1PPS, and the Pi's system clock is disciplined to both GPS and 1PPS, even if it subsequently cannot receive GPS signals from the matchbook-sized amplified antenna sitting in the living room window, it will still be within milliseconds or even microseconds of UTC. Astrolabe also includes a couple of test points: a BNC connector to the 10 MHz output of the CSAC, and an RS-232 serial port to the command interface of the GPSDO.

This is the most accurate and precise timepiece I will ever own, perhaps could ever own, my own stratum-0 atomic clock. Far far better clocks exist, but I am unlikely to ever own them, much less have them in my living room. Its time is traceable to the atomic clocks in the U.S. NAVSTAR GPS satellite constellation in orbit, which are traceable to the U.S. Air Force Master Control Station clock in Colorado Springs Colorado, which is traceable to the Washington Master Clock at the U.S. Naval Observatory in Washington D.C., which is traceable to the NIST-F2 clock at NIST in Boulder Colorado, which is synchronized with atomic clocks all over the world coordinated by the International Bureau of Weights and Measures in Paris.

I have been fascinated by time, time measurement, and time keeping, for decades. Having spent the past forty-plus years working on real-time systems close to bare metal, this was perhaps inevitable. The ability for information to cross the boundary between the real world and the digital world fundamentally depends on the quality of our frequency sources. And our interpretation of reality rests on the accuracy and precision of our clocks.

Tuesday, May 09, 2017

My Stratum-0 Atomic Clock

In My Stratum-1 Desk Clock I wrote about my quest for accurate and precise timekeeping that led me to hack together a desk clock that disciplined its system clock against time taken from the U.S. global positioning system (GPS). I have come to think of this first effort as "O-1", inspired by John Harrison, the eighteenth-century carpenter who invented the marine chronometer and many of the fundamental innovations that remain in quality mechanical clocks today. As detailed in Dava Sobel's excellent book Longitude: The True Story of a Lone Genius Who Solved the Greatest Scientific Problem of His Time, Harrison designed and built a series of chronometers which he named H-1 (for Harrison-1) through H-5, in competition for the prize offered by England's Board of Longitude, an organization that in many ways was the DARPA and NSF of its day. Some other horologists of his time adopted his nomenclature for their own entries into that competition.

But O-1 has a flaw: it is only as good as its signal from the GPS satellites. Should the GPS transmissions, from which the one Hertz 1PPS pulse was synthesized, be unavailable, due to weather or what not, O-1 would be only as good as the standard quartz oscillator used by its Raspberry Pi 3. In time, O-1 would drift away from UTC. What good is that? What I needed was a better, more stable oscillator.

What I needed for O-2 was an atomic clock.

Atomic clocks are not clocks in the sense of telling the time of day or measuring time duration. They are instead extremely stable frequency sources, a mechanism to generate very regular ticks, tied to a fundamental physical property of the universe. Modern atomic clocks have the usual trade-off of cost versus accuracy and precision, so that they can range from the big refrigerator-sized units fielded by the U.S. National Institute of Standards and Technology (NIST) costing hundreds of thousands of dollars, to the suitcase-sized commercial units found in telecommunications systems and which cost tens of thousands of dollars (and which from time to time you can find used on eBay for around U.S. twelve to fifteen grand).

All of those were outside my price range.

In 2001, and again in 2008, emboldened by developments in practical manufacturing of micro-electro-mechanical systems (MEMS), and by research by organizations like NIST into exploiting quantum effects to eliminate some of the bulkier portions of classic atomic clock designs, the U.S. Defense Advanced Research Projects Agency (DARPA) notified the R&D community, in BAA-01-32 and BAA-08-32 respectively, that it was interested in funding research in, and development of, a chip-scale atomic clock (CSAC). A CSAC would be an atomic clock - specifically, a cesium-133 reference oscillator - packaged in a surface-mount device.

The implications of such a device are wide ranging and profound. Precision frequency sources are ubiquitous components in both military and civilian applications in the realm of telecommunications, sensors and instrumentation, navigation, and even munitions. A relatively inexpensive, easily portable, and - it must be said - economically destructible cesium reference oscillator would open up vast possibilities of new mobile devices requiring precision timing and frequency standards.

Your tax dollars at work: San Jose-based Symmetricom, who had already purchased the frequency and time standard business unit of Agilent, formerly Hewlett-Packard, the folks who made the suitcase-sized cesium atomic clocks, developed the SA.45s CSAC. (Symmetricom has since been acquired by its California neighbor Microsemi.) Symmetricom with Las Vegas-based Jackson Labs Technologies, a developer of GPS-disciplined oscillators (GPSDO), integrated the CSAC "physics package" (as it is euphemistically called) with a board that incorporates a u-blox GPS chip and a 32-bit ARM-based microcontroller, to create a CSAC GPSDO. This provides the time of day via the output of the GPS chip, and phase locks the cesium oscillator with the 1PPS derived from the GPS system.

Once the output of the CSAC is initially disciplined against GPS, the board can maintain better than 0.3 parts per billion stability in the absence of a GPS signal. An O-2 built with the JLT CSAC GPSDO could be within a few microseconds of UTC per day, even without GPS.

It sure wasn't cheap, but it was doable.

O-2 runs virtually the same software as O-1, with only minor tweaks. The function of the O-1 GPS board has been replaced by the JLT GPSDO, but the interface to the Raspberry Pi 3 is the same: NMEA 0183 sentences arrive via the serial port, and the 1PPS via a GPIO pin. The interface between the two boards was only complicated by the need for level translation: from RS-232 on the GPSDO to TTL on the O-2 for NMEA, and from 5V logic on the GPSDO to 3.3V logic on the O-2 for 1PPS.

The ARM microcontroller on the GPSDO has its own LCD display (on the right) in addition to the O-2 LCD display (on the left). Here I show it - somewhat redundantly - displaying the time, but a button on the GPSDO can be used to cycle through a number of status displays.

The GPSDO has two serial ports. One is read-only and provides the raw NMEA stream from the u-blox chip, which I use to drive the GPS daemon on the Raspberry Pi. The other is a read-write port used for ASCII commands to the ARM microcontroller on the GPSDO, to which I attached an FTDI serial-to-USB converter connected to my desktop. The GPSDO exports a second serial control interface via its own FTDI chip over the USB connection to the Raspberry Pi that is also used to power the device.

Like O-1 (a.k.a. "hourglass"), O-2 (a.k.a. "astrolabe") doubles as a stratum-1 NTP server on my local area network, accessible via either WiFi or by a CAT5 cable.

astrolabe

(I now have four stratum-1 NTP servers on my LAN, all requiring GPS antennas. I dunno, is that too many?)

Four GPS Antennas

O-1 got mounted on a fixture to become a desk clock in my home office, costing hundreds of dollars in materials. I anticipate O-2 similarly becoming a mantel or wall clock, costing thousands of dollars. But both projects were sure awesome learning experiences.

Acknowledgements

Thanks to reader Fazal Majid who shamed me into developing the O-2. Thanks also to Said Jackson and the folks at Jackson Labs Technologies who graciously put up with some questions.

Repositories

https://github.com/coverclock/com-diag-astrolabe

https://www.ntpsec.org/white-papers/stratum-1-microserver-howto/clockmaker

git://git.savannah.nongnu.org/gpsd.git

https://gitlab.com/NTPsec/ntpsec.git

Sources

R. L. Beard, J. D. White, J. A. Murray, "Military Applications of Time and Frequency", Proc. of the 28th Annual Precise Time and Time Interval Systems and Applications Meeting, Reston, Virginia, 1996

Jackson Labs, CSAC GPSDO User Manual, ver. 1.6, 80200506, 2014-03-16

Jackson Labs, CSAC GPSDO Release Notes, rel. 1.2, 80200505, 2012-01-30

Jackson Labs, CSAC GPSDO Operating Recommendations, rev. 1.1, 2016-04-13

J. Jespersen, J. Fitz-Randolph, From Sundials to Atomic Clocks: Understanding Time and Frequency, 2nd ed., Dover, 1999

A. Lal, "Integrated Micro Primary Atomic Clock Technology (IMPACT)", DARPA BAA-08-32, 2008-04-21

Microsemi, SA.45s Chip-Scale Atomic Clock User Guide, rev. C, 098-00055-000, 2016-07

G. Miller, E. Raymond, GPSD Time Service HOWTO, ver. 2.10, 2016-09

E. Raymond et al., Building GPSD from source, 2016-04-10

E. Raymond et al., Hacker's Guide to GPSD, 2016-09-02

K. Shenoi, "Clock, Oscillators, and PLLs: An introduction to synchronization and timing in telecommunications", Workshop on Synchronization in Telecommunication Systems, San Jose, CA, 2013-04

D. Sobel, Longitude: The True Story of a Lone Genius Who Solved the Greatest Scientific Problem of His Time, Bloomsbury USA, 1995

Symmetricom, "Introduction to Symmetricom's QUANTUM(tm) Chip Scale Atomic Clock SA.45s CSAC", 2012 International Technical Meeting of The Institute of Navigation, Newport Beach, CA, 2012-01

Symmetricom, The SA.45S Chip-Scale Atomic Clock, 2011 Sanford PNT Symposium, Menlo Park, CA, 2011-11-18

W. Tang, "A Chip-Scale Atomic Clock", DARPA BAA-01-32, 2001-07-06

Wednesday, April 05, 2017

Tick

Regardless of the technology, our measurement of time is based on a tick. Every timepiece ticks. But not all ticks are created equal. An error in the period of the tick causes a clock to run too fast or too slow - drift - or introduces uncorrelated high frequency - jitter - or low frequency - wander - variation. This error is sometimes described in dimensionless units like parts per million or PPM: a one part per million error in a clock with a tick of one second is equivalent to an error of about thirty-two seconds each year, because there are approximately thirty-two million seconds in a year. One PPM is equivalent to an error of 0.000001 or 0.0001%.
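
As a rough sketch of that arithmetic in C: the function name drift_seconds and the choice of a 365.25-day year are assumptions made for illustration, and the model assumes the fractional error stays constant over the whole interval, which no real oscillator quite does.

```c
/* Seconds in a year of 365.25 days (an assumption for illustration). */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 60.0 * 60.0)

/*
 * Hypothetical helper: the accumulated clock error, in seconds, after
 * running for elapsed_seconds with a constant fractional frequency
 * error expressed in parts per million.
 */
double drift_seconds(double ppm, double elapsed_seconds)
{
    return (ppm / 1000000.0) * elapsed_seconds;
}
```

For instance, drift_seconds(1.0, SECONDS_PER_YEAR) comes to roughly 31.6 seconds, and drift_seconds(50.0, 86400.0) to about 4.3 seconds over one day. Note that the result depends only on the fractional error and the elapsed time, not on how fast the clock ticks.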

The inverse of the period of the tick is, of course, its frequency, what a scientist or engineer would measure in units of Hertz or cycles per second, but what horologists refer to as beats per second.

Rolex Oyster Perpetual Date GMT Master II Automatic Chronometer

My beloved Rolex GMT Master II wristwatch - originally designed by Rolex for Pan American flight crews on long haul flights - ticks at a frequency of eight beats per second: if you listen very carefully, for every second the second hand marks off, you can hear it tick eight times. That's the sound of the mechanical escapement and balance wheel - you can think of them as a kind of tiny pendulum - inside my Rolex, driven by a mainspring that is automatically wound every time I move my arm. Eight beats per second is pretty typical of a good mechanical watch. The mechanism, or movement, of the watch handles the translation of the 8Hz beat into the motion of all of the hands on the watch.

A one PPM error rate in my Rolex would equate to it being off about thirty-two seconds a year, or less than a tenth of a second a day. In practice, no mechanical watch is anywhere near that good. Rolex aims for a few seconds a day, perhaps in the neighborhood of fifty PPM. Still, my Rolex is a certified Swiss chronometer, a certification of stability, accuracy, precision, and reliability that can be traced back to the invention of chronometers for celestial navigation in the days of sail. My Rolex is perfectly adequate for navigating by sextant, should the need arise. Also, it doesn't need batteries.

Quartz watches work like this too. But instead of a mechanical escapement, they use the vibration of a quartz crystal, called an oscillator, driven by an electric current, to count off ticks. (The audible click you may hear in an analog quartz watch isn't the quartz crystal, which is vibrating much too fast to discern, but that of the electro-mechanical actuator of the watch that moves its hands.) The quartz watch movement exploits the piezoelectric effect: a minute electrical current applied to a quartz crystal causes it to vibrate at a known rate. (And vice versa: applying mechanical stress to a quartz crystal generates a minute electrical current. Science!)

Todd Snyder Timex Mod Quartz Watch

The quartz crystal in my affordable yet oh so stylish Todd Snyder Timex Mod wristwatch vibrates at a frequency of 32,768 beats per second, 4,096 times more often than my Rolex. So the period of the Timex tick is a tiny fraction of that of my Rolex.

Even so, quartz crystals vary in their exact rate of vibration according to their temperature, which can vary widely in a wristwatch. My Timex may have an error rate of perhaps four PPM. In applications which require a higher degree of precise timing, a crystal oscillator may be embedded in an oven that thermostatically controls its temperature. Such oven-controlled crystal oscillators (OCXO) can be surprisingly small surface mount components.

Casio F-91W Quartz Digital Watch

The use of crystal oscillators as timing sources, or frequency standards, is ubiquitous. There is at least one crystal oscillator in every digital device you own. The accuracy, precision, and reliability of the oscillator in this US$10 Casio F-91W digital watch - that, plus its back is easily removed, giving access to its innards, and, of course, it's cheap as dirt and easily available - makes it a favorite amongst fabricators of improvised explosive devices (IED), or so I've read.

A rubidium atomic clock - an atomic clock is really an extraordinarily precise oscillator - has a frequency of 6,834,682,610.904 beats per second. That astonishingly high frequency is why atomic clocks are so good: we are hard pressed to even measure a relative error in a period that short, except by comparing two atomic clocks. (Which, by the way, is one way in which time dilation in the Theory of Relativity has been experimentally verified.)

A cesium atomic clock has a frequency of exactly 9,192,631,770 beats per second. Why exactly? Because 9,192,631,770 beats of a cesium atomic clock is the definition of a second in the International System of Units.

And a strontium atomic clock - the most accurate timekeeping device ever constructed by humankind - beats at about 430 trillion times a second. It will lose a second perhaps every fifteen billion years. That's an error rate better than any number for which we have a name.

Tuesday, April 04, 2017

Some Stuff That Has Worked For Me In C

Diminuto (Spanish for "tiny") is a library of C functions and macros that support the kind of low level embedded, real-time, and systems programming work I am typically called upon to do on the Linux/GNU platform. It's no longer so tiny. I started it way back in 2008 and it is still going strong. Its open-source (FSF LGPL 2.1) code has found its way into a number of shipping commercial products for more than one of my clients. Diminuto can be found on GitHub along with eighteen (so far) other repositories (some private) of my work.

My forty-year career has been a long, strange, and marvelous trip. Along the way, I've found a number of techniques of architecture, design, and implementation in C and C++ that have worked well for me, solving a number of recurring problems. I don't claim that these techniques are the best, or that I am the only, or even the first, developer to have thought of them. Some of them I shamelessly borrowed, sometimes from other languages or operating systems, because I try to know a good idea when I see one.

This article documents some of those ideas.

Standard Headers and Types

Just about every big closed-source project I've worked on defined its own integer types. Even the Linux kernel does this. Types like u32 for unsigned 32-bit integers, or s8 for signed 8-bit integers. Historically - and I'm talking ancient history here - that was a good idea, way back when C only had integer types like int and char, and there weren't dozens and dozens of different processors with register widths that might be 8, 16, 32, or 64 bits.

But this kind of stuff is now available in standard ANSI C headers, and Diminuto leverages the heck out of them.

stdint.h defines types like uint32_t, and int8_t, and useful stuff like uintptr_t, which is guaranteed to be able to hold a pointer value, which on some platforms is 32-bits, and on others 64-bits.

stddef.h defines size_t, which is guaranteed to be able to hold the value returned by the sizeof operator. Its signed counterpart ssize_t is a POSIX type defined in sys/types.h. Both are used by a variety of POSIX and Linux system calls and functions.

stdbool.h defines the bool type and constant values for true and false (although, more about that in a moment).

In at least one client project, I talked them into redefining their own proprietary types via typedef to use these ANSI types, which simplified porting their code to new platforms.
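
Here is a minimal sketch of what that kind of redefinition might look like. The legacy names u32 and s8 stand in for a project's proprietary types, the function name is hypothetical, and the compile-time checks assume a C11 compiler and 8-bit bytes.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical legacy names, redefined in terms of the standard types. */
typedef uint32_t u32;
typedef int8_t s8;

/* Compile-time checks (C11) that the widths are what the names promise. */
_Static_assert(sizeof(u32) == 4, "u32 must be four bytes");
_Static_assert(sizeof(s8) == 1, "s8 must be one byte");
_Static_assert(sizeof(uintptr_t) >= sizeof(void *), "uintptr_t must hold a pointer");

/*
 * Hypothetical helper: convert a pointer to an integer wide enough to
 * hold it, convert it back, and report whether it survived the trip.
 */
bool pointer_survives_round_trip(const void * pointer)
{
    uintptr_t bits = (uintptr_t)pointer;
    return ((const void *)bits == pointer);
}
```

Existing code that uses the legacy names keeps compiling unchanged, while new code can use the standard names directly.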

Boolean Values

Even though I may use the bool type defined in stdbool.h, I don't actually like the constants true and false. For sure, false is always 0. But what value is true? Is it 1? Is it all ones, e.g. 0xff? Is 9 true? How about -1?

In C, true is anything that is not 0. So that's how I code a true value: !0. I don't care how the C standard or the C compiler encodes true, because I know for sure that !0 represents it.

I normalize any value I want to use as a boolean using a double negation. For example, suppose I have two variables alpha and beta that are booleans. Do I use
(alpha == beta)
to check if they are equal? What if alpha is 2 and beta is 3? Both of those are true, but the comparison will fail. Do I use
(alpha && beta)
instead? No, because I want to know if they are the same, both true, or both false, not if they are both true. If C had a logical exclusive OR operator, I'd use that - or, actually, the negation of that - but it doesn't. I could do something like
((alpha && beta) || ((!alpha) && (!beta)))
but that hurts my eyes.

Double negation addresses this: !!-1 equals !0 equals whatever the compiler uses for true. I use
((!!alpha) == (!!beta))
unless I am positive that both alpha and beta have already been normalized.

I also do this if I am assigning a value to a boolean
alpha = !!beta
unless I am very sure that beta has already been normalized.
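
The comparison can be wrapped up in a small function. This is only a sketch of the idea; the name same_truth is made up for the example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: returns true if alpha and beta have the same
 * truth value, regardless of which non-zero values encode "true".
 */
bool same_truth(int alpha, int beta)
{
    return ((!!alpha) == (!!beta));
}
```

With this, same_truth(2, 3) is true even though (2 == 3) is false, and same_truth(0, -1) is false.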

Inferring Type

If I want to check if a variable being used as a boolean - no matter how it was originally declared - is true, I code the if statement this way.
if (alpha)
But if the variable is a signed integer and I want to know if it is not equal to zero, I code it this way.
if (alpha != 0)
And if it's an unsigned integer, I code it this way.
if (alpha > 0)
When I see a variable being used, I can usually infer its intended type, no matter how it might be declared elsewhere.

Parentheses and Operator Precedence

You may have already noticed that I use a lot of parentheses, even where they are not strictly speaking necessary. Can I remember the rules of operator precedence in C? Maybe. Okay, probably not. But here's the thing: I work on big development projects, hundreds of thousands or even millions of lines of code, with a dozen or more other developers. And if I do my job right, the stuff I work on is going to have a lifespan long after I leave the project and move on to something else. Just because I can remember the operator precedence of C, the next developer that comes along may not. So I want to make my assumptions explicit and unambiguous when I write expressions.

Also: sure, I may be writing C right now. But this afternoon, I may be elbow deep in C++ code. And tomorrow morning I may be writing some Python or Java. Tomorrow afternoon, sadly, I might be hacking some JavaScript that the UI folks wrote, even though I couldn't write a usable line of JavaScript if my life depended on it. Even if I knew the rules of operator precedence for C, that's not good enough; I need to know the rules for every other language I may find myself working in.

But I don't. So I use a lot of parentheses.

This is the same reason I explicitly code the access permissions - private, protected, public - when I define a C++ class. Because the default rules in C++ are different for class versus struct (yes, you can define access permissions in a struct in C++), and different yet again for Java classes.

Exploiting the sizeof Operator

I never hard-code a value when it can be derived at compile time. This is especially true of the size of structures or variables, or values which can be derived from those sizes.

For example, let's suppose I need two arrays that must have the same number of elements, but not necessarily the same number of bytes.
int32_t alpha[4];
int8_t beta[sizeof(alpha)/sizeof(alpha[0])];
The number of array positions of beta is guaranteed to be the same as the number of array positions in alpha, even though they are different types, and hence different sizes. The expression
(sizeof(alpha)/sizeof(alpha[0]))
divides the total number of bytes in the entire array by the number of bytes in a single array position.

This has been so useful that I defined a countof macro in a header file to return the number of elements in an array, providing it can be determined at compile time. I've similarly defined macros like widthof, offsetof, memberof, and containerof.
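
A macro along those lines might look like the sketch below. This is an illustration of the technique, not necessarily the exact definition in my header files, and the widthof shown here assumes that "width" means the size in bits.

```c
#include <stddef.h>
#include <stdint.h>
#include <limits.h>

/*
 * Number of elements in an array whose size is known at compile time.
 * This only works on a true array; applied to a pointer it silently
 * yields the wrong answer.
 */
#define countof(ARRAY) (sizeof(ARRAY) / sizeof((ARRAY)[0]))

/* Width in bits of a type or a variable. */
#define widthof(TYPE) (sizeof(TYPE) * CHAR_BIT)

/* The two arrays from the example above, now sized with the macro. */
int32_t alpha[4];
int8_t beta[countof(alpha)];
```

If the dimension of alpha ever changes, beta follows along automatically, and so does any loop bounded by countof(alpha).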

Doxygen Comments

Doxygen is a documentation generator for C and C++, inspired by Java's documentation generator Javadoc. You write comments in a specific format, run the doxygen program across your source code base, and it produces an API document in HTML. You can use the HTML pages directly, or convert them using other tools into a PDF file or other formats.

/**
 * Wait until one or more registered file descriptors are ready for reading,
 * writing, or accepting, a timeout occurs, or a signal interrupt occurs. A
 * timeout of zero returns immediately, which is useful for polling. A timeout
 * that is negative causes the multiplexer to block indefinitely until either
 * a file descriptor is ready or one of the registered signals is caught. This
 * API call uses the signal mask in the mux structure that contains registered
 * signals.
 * @param muxp points to an initialized multiplexer structure.
 * @param timeout is a timeout period in ticks, 0 for polling, <0 for blocking.
 * @return the number of ready file descriptors, 0 for a timeout, <0 for error.
 */
static inline int diminuto_mux_wait(diminuto_mux_t * muxp, diminuto_sticks_t timeout)
{
    return diminuto_mux_wait_generic(muxp, timeout, &(muxp->mask));
}

Even if I don't use the doxygen program to generate documentation, the format enforces a useful discipline and consistent convention in documenting my code. If I find myself saying "Wow, this API is complicated to explain!", I know I need to revisit the design. I document the public functions in the .h header file, which is the de facto API definition. If there are private functions in the .c source file, I put Doxygen comments for those functions there.

Namespaces (updated 2017-04-25)

Most of the big product development projects I've worked on have involved writing about 10% new code, and 90% integrating existing closed-source code from prior client projects. When that integration involved using C or C++ code from many (many) different projects, name collisions were frequently a problem, either with symbols in the source code itself (both libraries have a global function named log), or in the header file names (e.g. both libraries have header files named logging.h).

In my C++ code, such as you'll find in my Grandote project on GitHub (a fork from Desperadito, itself a fork from Desperado, both since deprecated), there is an easy fix for the symbol collisions: C++ provides a mechanism to segregate compile-time symbols into namespaces.

namespace com {
namespace diag {
namespace grandote {

Condition::Condition()
{
 ::pthread_cond_init(&condition, 0);
}

Condition::~Condition() {
 ::pthread_cond_broadcast(&condition);
 ::pthread_yield();
 ::pthread_cond_destroy(&condition);
}

int Condition::wait(Mutex & mutex) {
 return ::pthread_cond_wait(&condition, &mutex.mutex); // CANCELLATION POINT
}

int Condition::signal() {
 return ::pthread_cond_broadcast(&condition);
}

}
}
}

It doesn't matter that another package defines a class named Condition, because the fully-qualified name of my class is actually com::diag::grandote::Condition. Any source code that is placed within the namespace com::diag::grandote will automatically default to using my Condition, not the other library's Condition, eliminating the need for me to specify that long name every time.

Note too that the namespace incorporates the domain name of my company in reverse order - borrowing an idea from how Java conventionally organizes its source code and byte code files. This eliminates any collisions that might have occurred because another library itself has the same library, class, or module name as part of its own namespace.

This works great for C++, but the problem remains for C. So there, I resort to just prepending the library name and the module name to the beginning of every function. You already may have noticed that in the diminuto_mux_wait function in the Doxygen example above: the wait operation in the mux module of the diminuto library.
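
As a small illustration of the convention, here is a sketch using a made-up library called example with a module called log. None of these names come from Diminuto; they exist only to show the pattern.

```c
#include <stdio.h>

/* Hypothetical library "example", module "log": every public symbol
 * carries the example_log_ (or EXAMPLE_LOG_) prefix. */
typedef enum {
    EXAMPLE_LOG_LEVEL_INFO,
    EXAMPLE_LOG_LEVEL_ERROR
} example_log_level_t;

/*
 * Write a message to the stream with a level label in front of it.
 * Returns whatever fprintf returns: the number of characters written,
 * or a negative value on error.
 */
int example_log_write(FILE * fp, example_log_level_t level, const char * message)
{
    const char * label = (level == EXAMPLE_LOG_LEVEL_ERROR) ? "ERROR" : "INFO";
    return fprintf(fp, "%s: %s\n", label, message);
}
```

A bare function named log would collide with the one in the standard math library; example_log_write cannot, and a reader can tell at a glance which library and module it belongs to.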

But one problem that neither C++ namespaces nor my C naming convention solves is header file collisions. So in both C and C++ I organize header file directories in a hierarchical fashion so that #include statements look like this, from my Assay project.

#include <stdio.h>
#include <errno.h>
#include "assay.h"
#include "assay_parser.h"
#include "assay_fixup.h"
#include "assay_scanner.h"
#include "com/diag/assay/assay_scanner_annex.h"
#include "com/diag/assay/assay_parser_annex.h"
#include "com/diag/diminuto/diminuto_string.h"
#include "com/diag/diminuto/diminuto_containerof.h"
#include "com/diag/diminuto/diminuto_log.h"
#include "com/diag/diminuto/diminuto_dump.h"
#include "com/diag/diminuto/diminuto_escape.h"
#include "com/diag/diminuto/diminuto_fd.h"
#include "com/diag/diminuto/diminuto_heap.h"

The first two header files, stdio.h and errno.h, are system header files. The next four header files, whose names begin with assay, are private header files that are not part of the public API and which are in the source directory with the .c files being compiled. The next two header files are in a com/diag/assay subdirectory that is part of the Assay project. The remaining header files are in a com/diag/diminuto subdirectory that is part of the Diminuto project.

Here's another example, from the application gpstool that makes use of both my Hazer and Diminuto projects.

#include <assert.h>
#include <unistd.h>
#include <stdio.h>
#include <string.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include "com/diag/hazer/hazer.h"
#include "com/diag/diminuto/diminuto_serial.h"
#include "com/diag/diminuto/diminuto_ipc4.h"
#include "com/diag/diminuto/diminuto_ipc6.h"
#include "com/diag/diminuto/diminuto_phex.h"

This convention is obviously not so important in my C header files, where the project name is part of the header file name. But it becomes a whole lot more important in C++, where my convention is to name the header file after the name of the class it defines, as this snippet from Grandote illustrates.

#include <unistd.h>
#include <sys/stat.h>
#include <fcntl.h>
#include "com/diag/grandote/target.h"
#include "com/diag/grandote/string.h"
#include "com/diag/grandote/Platform.h"
#include "com/diag/grandote/Print.h"
#include "com/diag/grandote/DescriptorInput.h"
#include "com/diag/grandote/DescriptorOutput.h"
#include "com/diag/grandote/PathInput.h"
#include "com/diag/grandote/PathOutput.h"
#include "com/diag/grandote/ready.h"
#include "com/diag/grandote/errno.h"
#include "com/diag/grandote/Grandote.h"

Once I came upon this system of organizing header files, not only did it become much easier to integrate multiple projects into a single application, but the source code itself became more readable, because it was completely unambiguous where everything was coming from. This strategy worked so well, I use it not just for C and C++, but to organize my Python, Java, and other code too. I also organize my GitHub repository names, and, when I use Eclipse, my project names, in a similar fashion:

  • com-diag-assay,
  • com-diag-grandote,
  • com-diag-diminuto,
  • com-diag-hazer

etc.

This is more important than it sounds. Digital Aggregates Corporation (www.diag.com) is my consulting company that has been around since 1995. It holds the copyright on my open-source code that may find its way into my clients' products. The copyright of my closed-source code is held by another of my companies, Cranequin LLC (www.cranequin.com). So if I see the header file directory com/cranequin or a project or repository with the prefix com-cranequin, I am reminded that I am dealing with my own proprietary code under a different license.

Time

The way time is handled in POSIX - whether you are talking about time of day, a duration of time, or a periodic event with an interval time - is a bit of a mess. Let's see what I mean.

gettimeofday, which returns the time of day, uses the timeval structure that has a resolution of microseconds. Its cousin time uses an integer that has a resolution of seconds.

clock_gettime, which when used with the CLOCK_MONOTONIC_RAW argument returns an elapsed time suitable for measuring duration, uses the timespec structure that has a resolution of nanoseconds.

setitimer, which is used to invoke a periodic interval timer, uses the itimerval structure that has a resolution of microseconds.

select, which is used to multiplex input/output with a timeout, uses the timeval structure and has a resolution of microseconds. Its cousin pselect uses the timespec structure that has a resolution of nanoseconds.

poll, which can also be used to multiplex input/output with a timeout, uses an int argument that has a resolution of milliseconds. Its cousin ppoll uses the timespec structure that has a resolution of nanoseconds.

nanosleep, which is used to delay the execution of the caller, uses the timespec structure that has a resolution of nanoseconds. Its cousin sleep uses an unsigned int argument that has a resolution of seconds. Its other cousin usleep uses a useconds_t argument that is a 32-bit unsigned integer and which has a resolution of microseconds.

Seconds, milliseconds, microseconds, nanoseconds. Six different types. It's too much for me.

Diminuto has two types in which time is stored: diminuto_ticks_t and diminuto_sticks_t. Both are 64-bit integers, the only difference being one is unsigned and the other is signed. (I occasionally regret even that.) All time is maintained in a single unit of measure: nanoseconds. This unit is generically referred to as a Diminuto tick.

diminuto_time_clock returns the time of day in Diminuto ticks. There is another function in the time module to convert the time of day from ticks since the POSIX epoch into year, month, day, hour, minute, second, and fraction of a second in - you guessed it - Diminuto ticks.

diminuto_time_elapsed returns the elapsed time in Diminuto ticks suitable for measuring duration.

diminuto_timer_periodic invokes a periodic interval timer, and diminuto_timer_oneshot invokes an interval timer that fires exactly once, both using Diminuto ticks to specify the interval.

diminuto_mux_wait uses pselect for I/O multiplexing, and diminuto_poll_wait does the same but uses ppoll, both specifying the timeout in Diminuto ticks. (I believe pselect is now implemented in the Linux kernel using ppoll, but that wasn't true when I first wrote this code.)

diminuto_delay delays the caller for the specified number of Diminuto ticks.
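
To make the idea of a single unit of time concrete, here is a minimal sketch of the kind of conversions such a library has to do at its edges. It is not Diminuto's actual code: the type ticks_t and the helper names are hypothetical stand-ins invented for this illustration, and only the POSIX structures timespec and timeval are real.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>
#include <time.h>

/* Hypothetical stand-in for a single 64-bit time type: one tick is one nanosecond. */
typedef int64_t ticks_t;

static const ticks_t TICKS_PER_SECOND = 1000000000LL;
static const ticks_t TICKS_PER_MILLISECOND = 1000000LL;
static const ticks_t TICKS_PER_MICROSECOND = 1000LL;

/* timespec (seconds plus nanoseconds), as used by clock_gettime, nanosleep, pselect, ppoll. */
static inline ticks_t ticks_from_timespec(const struct timespec *ts)
{
    assert(ts != NULL);
    return ((ticks_t)ts->tv_sec * TICKS_PER_SECOND) + (ticks_t)ts->tv_nsec;
}

/* timeval (seconds plus microseconds), as used by gettimeofday, select, setitimer. */
static inline ticks_t ticks_from_timeval(const struct timeval *tv)
{
    assert(tv != NULL);
    return ((ticks_t)tv->tv_sec * TICKS_PER_SECOND)
         + ((ticks_t)tv->tv_usec * TICKS_PER_MICROSECOND);
}

/* Convert a non-negative tick count back into a timespec. */
static inline struct timespec ticks_to_timespec(ticks_t ticks)
{
    struct timespec ts;
    assert(ticks >= 0);
    ts.tv_sec = (time_t)(ticks / TICKS_PER_SECOND);
    ts.tv_nsec = (long)(ticks % TICKS_PER_SECOND);
    return ts;
}

/* Convert to the int milliseconds that poll expects, rounding up so that a
 * short but nonzero timeout does not silently become zero. */
static inline int ticks_to_poll_milliseconds(ticks_t ticks)
{
    ticks_t ms;
    assert(ticks >= 0);
    ms = (ticks + TICKS_PER_MILLISECOND - 1) / TICKS_PER_MILLISECOND;
    return (ms > INT_MAX) ? INT_MAX : (int)ms;
}
```

With helpers like these, a timespec of 2 seconds and 500 nanoseconds becomes 2,000,000,500 ticks, and a 1-tick timeout handed to poll becomes 1 millisecond rather than 0. Rounding up is a design choice of this sketch, made on the assumption that waiting slightly too long is safer than spinning.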

Floating Point

If you really need to use floating point, you'll probably know it. But using floating point is problematic for lots of reasons. Diminuto avoids using floating point except in a few applications or unit tests where it is useful for reporting final results.

Even though Diminuto uses a single unit of time, that doesn't mean the underlying POSIX or Linux implementation can support all possible values in that unit. So every module in Diminuto that deals with time has an inline function that the application can call to find out what resolution the underlying implementation supports. And every one of those functions returns that value in a single unit of measure: Hertz. Hertz - cycles per second - is used because it can be expressed as an integer. Its inverse is the smallest time interval supported by the underlying implementation.

diminuto_frequency returns 1,000,000,000 ticks per second, the base frequency used by the library. The inverse of this is one billionth of a second or one nanosecond.

diminuto_time_frequency returns 1,000,000,000, the inverse being the resolution of timespec in seconds.

diminuto_timer_frequency returns 1,000,000, the inverse being the resolution of setitimer.

diminuto_delay_frequency returns 1,000,000,000, the inverse being the resolution of timespec.

With just some minor integer arithmetic - which can often be optimized out at compile-time - the application can determine the smallest number of ticks the underlying implementation can meaningfully support in each module. (Note that this is merely the resolution representable in the POSIX or Linux API; the kernel or even the underlying hardware may support only a coarser resolution.)
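
The integer arithmetic in question can be sketched as follows. The frequencies are the ones quoted above, but the function names here are hypothetical and are not Diminuto's API; the sketch only shows how dividing the base frequency by a module's frequency yields that module's granularity in ticks, with no floating point involved.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a single 64-bit time type: one tick is one nanosecond. */
typedef int64_t ticks_t;

/* Frequencies in Hertz, using the values quoted in the text. */
static inline ticks_t base_frequency(void)  { return 1000000000LL; } /* ticks per second */
static inline ticks_t timer_frequency(void) { return 1000000LL; }    /* setitimer: microseconds */
static inline ticks_t delay_frequency(void) { return 1000000000LL; } /* timespec: nanoseconds */

/* Smallest number of ticks that a module with the given frequency can express. */
static inline ticks_t ticks_granularity(ticks_t module_hz)
{
    assert((module_hz > 0) && (module_hz <= base_frequency()));
    return base_frequency() / module_hz;
}

/* Round a non-negative request up to the next value the module can express. */
static inline ticks_t ticks_round_up(ticks_t ticks, ticks_t module_hz)
{
    ticks_t granularity = ticks_granularity(module_hz);
    assert(ticks >= 0);
    return ((ticks + granularity - 1) / granularity) * granularity;
}
```

For the interval timer the granularity works out to 1,000 ticks, so a request for 1,500 ticks can only be honored as 2,000 (or 1,000, if the caller prefers to round down); for the delay it is a single tick. Because the frequency functions return constants, a compiler is free to fold all of this at compile time.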

Remarks

I first saw the countof technique in a VxWorks header file perhaps twenty years ago while doing embedded real-time development in C++ at Bell Labs. Similarly, a lot of these techniques have been picked up - or learned the hard way - while taking this long strange trip that has been my career. I have also benefitted greatly from having had a lot of mentors, people smarter than I am, who are so kind and generous with their time. Many of these techniques gestated in my C++ library Desperado, which I began putting together in 2005.

I started developing and testing the Diminuto C code for a number of reasons.
  • I found myself solving the same problems over and over again, sometimes for different clients, sometimes even for different projects for the same client. For one reason or another - sometimes good, sometimes not so good - the proprietary closed-source code I developed couldn't be shared between projects. But open-source code could easily be integrated into the code base.
  • I wanted to capture some of the design patterns I had found useful in my closed-source work. Working, unit tested, open-source code, developed in a clean room environment, and owned by my own company, was a useful way to do that.
  • I needed a way to get my head around some of the evolving C, C++, POSIX, and Linux APIs. Since I can only really learn by doing, I had to do, which for me meant writing and testing code.
  • I wanted to make an API that was more consistent, more easily interoperable, and less prone to error than was offered by raw POSIX, Linux, and GNU.
  • In the past I have been a big proponent of using C++ for embedded development. I have written hundreds of thousands of lines of C++ while doing such work over the years. But more recently, I have seen a gradual decline in the use of C++ by my clients, with a trend more towards segregating development into systems code in C and application code in languages like Python or Java. Although I miss C++, I have to agree with the economics of that. And I fear that C++ has evolved into a language so complex that it is beyond the ken of developers that my clients can afford.
I am not a "language lawyer". I get paid to develop product that ships and generates revenue. I believe I have been successful in part because I am a very hands-on and evidence-based person, but also because I am extremely pragmatic about what works and what doesn't.

This is some stuff that, over the span of many years, has worked.

Update (2017-04-14)

I recently forked Desperadito (which itself is a mashed up fork of both Desperado and Hayloft) into Grandote (https://github.com/coverclock/com-diag-grandote). What's different is Grandote uses the Diminuto C library as its underlying platform abstraction. It requires that you install Diminuto, Lariat (a Google Test helper framework), and Google Test (or Google Mock). That's some effort, but at least for me it wasn't overly burdensome when I built it all this morning on an Ubuntu 16.04 system. The other improvement over Desperadito is that both the old Desperado hand-coded unit tests, and the newer Hayloft Google Test unit tests, work. Grandote would be my C++ framework going forward, if I were to need my own C++ framework. (And Grandote doesn't preclude using STL or Boost as well.)

Friday, March 31, 2017

My Stratum-1 Desk Clock

Long time readers who remember my article Does anybody really know what time it is? from way back in 2006 will appreciate that I have a deep and abiding interest in timekeeping. So it might come as a surprise to those folks that it has taken me this long to build a stratum-1 desk clock. This is a desk clock so accurate and precise that it is within a few milliseconds of atomic clocks, which aren't really clocks in the usual sense but oscillators that make super precise frequency standards, the best sources of clock ticks that humankind knows how to make.

My Stratum-1 Desk Clock

I cobbled together this little project from
All of the heavy lifting is done by open source software written by other folks. I followed Eric S. Raymond's Stratum-1-Microserver HOWTO and used his clockmaker Python script to do most of the software build and install. As a result, the Pi runs the gpsd Global Positioning System (GPS) daemon and the ntpd Network Time Protocol (NTP) daemon. I administered the WiFi radio on the desk clock so that it can be queried by NTP daemons running on other systems on my local area network. As far as I know, this is the only desk clock I've ever owned that has virtual memory and supports ssh.

The GPS daemon reads the National Marine Electronics Association (NMEA) sentences output from the GPS board via the Pi's serial port. These sentences contain the computed time and date based on timestamps received from both the U. S. NAVSTAR GPS and the Russian GLONASS satellites in medium Earth orbit. The daemon also synchronizes with the GPS board's One Pulse Per Second (1PPS) output via a General Purpose Input/Output (GPIO) pin. This 1Hz heartbeat is derived from the GNSS signals, and is the high-technology digital equivalent of the "when you hear the tone, the time will be..." from the old time-of-day telephone service.

The Real-Time Clock (RTC) saves the current time in non-volatile memory when the Pi is shut down, and keeps ticking off the seconds even when the Pi is powered off, thanks to a battery backup. The time is restored from the RTC when the Pi boots up. This allows the desk clock to know the time and date (within a fraction of a second anyway) before the GPS board can achieve a lock on enough satellites (four at least) to compute the time.

The LCD display is driven by a modest little Python script that I wrote that just queries the Linux system clock five times a second. The NTP daemon keeps the system clock synchronized to GPS Time (GPST) plus the appropriate number of leap seconds offset from Coordinated Universal Time (UTC) as received from the satellites. The GNU library handles the conversion from UTC to local time, depending on the administered time zone, as well as any arcane conversions necessary for Daylight Saving Time (DST).

This is remarkably complex, but all handled by the open source software. The Python script that I wrote is perhaps a dozen lines of code at most. By virtue of the fact that the time maintained by the GNSS satellites is stratum-0 - we'll get to what that means in a moment - and the desk clock is constantly being corrected to follow that clock source, my desk clock is effectively a stratum-1 clock.

There was a time when owning a stratum-1 clock would have been a major financial investment. Today, it's a hobby.

You can find the additional files I modified or added on the Raspberry Pi on GitHub: https://github.com/coverclock/com-diag-hourglass .

Too Much Information

A little background might be called for to really appreciate my desk clock. I haven't discovered the optimal order to present these factlets, so you will have to forgive a few forward references.

GNSS. There are two fully operational Global Navigation Satellite Systems in medium Earth Orbit (about 12,000 miles altitude). The Navigation Satellite Timing and Ranging (NAVSTAR) Global Positioning System (GPS) constellation was the first one, launched and supported by the U. S. Department of Defense. The Globalnaya Navigazionnaya Sputnikovaya Sistema (GLONASS) is the Russian constellation. BeiDou-2 is the constellation that China is building. Galileo is the constellation the European Union is building. Some GPS receivers only receive NAVSTAR signals; some (like the one in my desk clock) receive both NAVSTAR and GLONASS signals. (Henceforth I shall use the initialism GPS as a generic term.)

Accuracy versus precision. (Updated 2017-05-07) Accuracy is a measure of systematic error. Precision is a measure of random error. The classic example from statistics class is if you fire arrows at a target and the arrows scatter in and around the bullseye, you are accurate but not precise; if the arrows all hit near the same spot that is not the bullseye, you are precise but not accurate.



This is accurate, but not precise (image: Wikimedia).



This is precise, but not accurate (image: Wikimedia).

Clocks can be described the same way. Clocks that are synchronized to GPS are accurate because of the timestamps transmitted by the GPS satellites, which can be read from the NMEA data stream presented by the GPS receiver. They are precise because of the 1Hz 1PPS pulse derived from the GPS transmissions that indicates exactly when the timestamp is correct. (Not all GPS receivers emit the 1PPS pulse.)

(Update 2017-04-27) I have recently learned that folks that think about timekeeping a lot deeper than I do prefer the term "stability" to the more statistical "precision". I personally find that term to be a little ambiguous in my line of work. But it's important to know the lingo.

(Update 2017-05-08) ISO 5725-1:1994(E) defines the term "trueness" to express how close the mean of a set of measurements is to the accepted reference value, and "accuracy" to be how close a single measurement is to the reference value. This departs from what I was taught in statistics and engineering classes eons ago. But I can see how this is a useful distinction.

Clock strata. Clocks in information and telecommunication systems are conventionally classified as to their precision, accuracy, and stability into strata. (This classification predates GPS and its use as a clock source.) Stratum-0 clocks are the most accurate and precise clocks we know how to construct: "atomic" clocks that discipline the duration of their ticks by measuring the physical properties of streams of cesium or rubidium atoms. Stratum-1 clocks are those that derive their timing from stratum-0 clocks, and so forth.

(Updated 2017-04-11) Although I don't see it referred to much, ANSI T1.101 (6.1, p. 24, 1987) specifies the minimum accuracy for clocks of each stratum in dimensionless units, omitting stratum-0; this was in the days of copper (e.g. DS-1) and occasionally optical (e.g. OC-3) telecommunications channels.
  • Stratum-1: 1 x 10^-11 (read as "1 part per 10^11")
  • Stratum-2: 1.6 x 10^-8 (16 parts per billion or "16 ppb")
  • Stratum-3: 4.6 x 10^-6 (4.6 parts per million or "4.6 ppm")
  • Stratum-4: 3.2 x 10^-5 (32 ppm)
Clocks of each stratum are kept accurate and precise by a combination of continual adjustment to their higher-quality lower-stratum clock source, and by their own physical properties. For example, digital electronic clocks discipline the period of their ticks on the vibration of a crystal, which is more consistent if it is in a temperature controlled environment. I have worked on embedded systems which had surface mount clock crystals contained in a tiny oven - a thermostatically controlled chamber that kept the crystal at a constant operating temperature to ensure consistent behavior - called an Oven Controlled Crystal Oscillator (OCXO). The inherent quality of a clock is a measure of its ability to holdover - to maintain the correct time during any period when it temporarily loses contact with its lower-stratum clock source.
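
To get a feel for what those dimensionless accuracies mean, here is a small worked sketch that turns them into a worst-case accumulated error over an interval. It rests on an assumption that is not in the standard as quoted here: that a clock cut off from its source runs freely at exactly the limit of its stratum's minimum accuracy. The names are hypothetical, and the accuracies are scaled to parts per 10^12 so that everything stays in integers.

```c
#include <assert.h>
#include <stdint.h>

/* The minimum accuracies listed above, scaled to parts per 10^12. */
enum {
    STRATUM1_PARTS_PER_1E12 = 10,       /* 1 x 10^-11 */
    STRATUM2_PARTS_PER_1E12 = 16000,    /* 1.6 x 10^-8 */
    STRATUM3_PARTS_PER_1E12 = 4600000,  /* 4.6 x 10^-6 */
    STRATUM4_PARTS_PER_1E12 = 32000000  /* 3.2 x 10^-5 */
};

/* Worst-case accumulated error, in nanoseconds, of a free-running clock:
 * seconds * 10^9 ns/s * parts / 10^12, which reduces to seconds * parts / 1000. */
static inline int64_t worst_case_drift_ns(int64_t seconds, int64_t parts_per_1e12)
{
    assert((seconds >= 0) && (parts_per_1e12 >= 0));
    return (seconds * parts_per_1e12) / 1000LL;
}
```

Over one day (86,400 seconds) this gives 864 nanoseconds for stratum-1, about 1.4 milliseconds for stratum-2, about 0.4 seconds for stratum-3, and about 2.8 seconds for stratum-4.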

Time of day, duration, and frequency. When we talk about clocks we are really talking about three fundamentally different kinds of things: an instrument which measures the time of day; one which measures time duration; and one that serves as a frequency standard. Sometimes the same device can be used for all three, but in digital information and telecommunications systems, typically not.

When we think of a clock, we probably first think of one that tells us the time of day. In fact, some computer systems refer to such a device as the Time Of Day (TOD) clock. (POSIX likes to refer to this as the real-time clock, but this time is no more, and sometimes a lot less, real than other ways of measuring time.) A TOD clock is useful for knowing when to meet your friends for lunch. But its usefulness as a measurement of time duration breaks down when
  • it is sprung forward or fallen backward for Daylight Saving Time, or
  • the NTP daemon adjusts the clock to synchronize it with a remote time source, or
  • a leap second is inserted (or subtracted) to re-align the TOD clock with the solar day.
Time Of Day clocks are useful for time stamping events in log files, especially if the clocks on different systems are more or less synchronized, and are even more useful if such timestamps are kept in a common time zone, like Coordinated Universal Time, so that related events in different systems can be easily correlated. But such time stamps are at best only approximately useful for measuring duration, since the TOD clock may be routinely adjusted forwards or backwards while the system is running. The POSIX standard defines the gettimeofday(2) system call to acquire the time of day, but this call is not appropriate for measuring duration for just the reasons cited.

When we think of measuring time duration, we might think of a stopwatch. The POSIX standard defines just such a capability in the clock_gettime(2) system call when it is invoked using the CLOCK_MONOTONIC_RAW argument. (Strictly speaking, POSIX specifies CLOCK_MONOTONIC; CLOCK_MONOTONIC_RAW is a Linux extension that is also free of NTP rate adjustments.) This provides access to a monotonically increasing hardware clock with which duration can be measured by subtracting prior values from successive values. It is useless for telling time of day, but it is the right way to measure a time duration since it does not change as the TOD clock is adjusted.
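
The stopwatch pattern can be sketched in a few lines of C. This sketch uses the portable CLOCK_MONOTONIC rather than CLOCK_MONOTONIC_RAW so that it builds on any POSIX system; on Linux the latter can be substituted. The function names monotonic_ns and elapsed_ns are made up for this illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdint.h>
#include <time.h>

/* Read the monotonic clock as a single count of nanoseconds.
 * Returns a negative value if the clock cannot be read. */
static inline int64_t monotonic_ns(void)
{
    struct timespec now;
    if (clock_gettime(CLOCK_MONOTONIC, &now) != 0) {
        return -1;
    }
    return ((int64_t)now.tv_sec * 1000000000LL) + (int64_t)now.tv_nsec;
}

/* Duration is the later reading minus the earlier reading. */
static inline int64_t elapsed_ns(int64_t before, int64_t after)
{
    return after - before;
}
```

A caller reads monotonic_ns once before the work and once after it, and passes both readings to elapsed_ns. The absolute values mean nothing as a time of day; only their difference is useful.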

All conventional digital electronic systems are synchronous: actions in integrated circuits are triggered by digital pulses generated, directly or indirectly, by the tick of a common frequency standard. Typically this frequency standard is a quartz crystal oscillator - basically the same hardware as in your quartz wristwatch. (It is this oscillator function - the thing that provides the tick of the clock - that the atomic clock provides.) Sometimes it is a periodic digital pulse arriving from an external source, like a communication channel. The entire global telecommunication infrastructure - landlines, internet connections, cellular systems - is in effect synchronized using a shared frequency standard. A Bell Labs colleague once described this as "getting everyone in the world to jump up and down at the same time". Before GPS was available to provide such a shared standard, using a common constellation of satellites containing atomic clocks, AT&T owned several ground-based atomic clocks, which were used as high quality frequency standards for their network and others. The POSIX standard defines the setitimer(2) system call that can be used to access an interval timer that fires with a caller-defined period.

Update (2017-04-25): Although frequency, duration, and time of day are three very different things, they may be derived from one another. Time duration may be derived from counting ticks from a frequency standard. And time of day can be thought of as a nomenclature for naming a duration from a fixed point in time, such as midnight or noon. But in hardware and software systems, they are typically all treated very differently.

Offset, jitter, and drift. These are terms used to describe variation between two clocks, or between successive ticks of the same clock. Offset is easy: it is the absolute difference between two clocks. The hard part is that the offset is likely to change as time goes by; that's where jitter and drift come in. The exact definitions of jitter and drift are somewhat open to debate, but here is how I have come to think about them: jitter is a high-frequency, short-term, statistical variation in a clock; drift is a low-frequency, long-term, systematic bias in a clock. Or: jitter is about precision, drift is about accuracy.

Jitter and Drift

Jitter in real-time data streams like audio or voice is a small variation in the expected inter-arrival time of the packets, either a little too early or a little too late. With jitter, there is no correlation between successive packets; the next packet is just as likely to be early as late, so that the mean arrival time is correct. As long as jitter is small, packets arriving early can be accommodated by using jitter buffers; the receiver just holds the data until its expected arrival time occurs, then plays it out. However, if a packet of data arrives too late, it may have to be discarded because its "best if used before" time has passed and the next packet in the stream is right on its tail, and the user will notice an audible or visible artifact in the playback of the stream.

Drift is most typically unidirectional, although it can also be bidirectional. Let's say you have an interval timer that is supposed to fire once per second - 1Hz. So you expect a timer event to occur at 1 second, 2 seconds, 3 seconds, etc. on the timeline. But the clock that is driving the interval timer is running slow, such that its period is 10% too long; the first timer event arrives at 1.1 seconds, the second at 2.2 seconds, the third at 3.3 seconds. The timeline of the events is shifting in the increasing direction. Something similar happens if the clock is 10% too fast, except the events on the time line shift in the decreasing direction. 10% is a big error; if the error in the clock is really tiny, you might not notice the shift in the timeline until a lot of events go by. This kind of thing can happen with the old school RS-232 serial connections because the transmitter and the receiver had no common reference clock. (My digital design friends would say that each end of the serial connection is in a different clock domain.) The received data would seem just fine - because the receiver was still sampling the incoming pulses within the window of correctness - until its clock drifted its sampling into the next character frame, at which point that character would be read incorrectly. That's why RS-232 has start and stop bits at the beginning and end of each character frame, whose purpose is to identify such a framing error so that the bad character can be discarded.
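
The arithmetic of the interval timer example can be sketched in integer C. The function names are hypothetical, and the error is expressed as a period error in parts per million (positive means the period is too long), so the 10% case above is 100,000 ppm.

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds at which the n-th event actually fires when the
 * timer's period is off by period_error_ppm parts per million. */
static inline int64_t event_time_ns(int64_t n, int64_t nominal_period_ns, int64_t period_error_ppm)
{
    int64_t actual_period_ns;
    assert((n >= 0) && (nominal_period_ns > 0));
    actual_period_ns = nominal_period_ns + ((nominal_period_ns * period_error_ppm) / 1000000LL);
    return n * actual_period_ns;
}

/* Number of events that go by before the accumulated shift amounts to
 * one whole nominal period: the ceiling of 10^6 / |ppm|. */
static inline int64_t events_until_one_period_shift(int64_t period_error_ppm)
{
    int64_t magnitude;
    assert(period_error_ppm != 0);
    magnitude = (period_error_ppm < 0) ? -period_error_ppm : period_error_ppm;
    return (1000000LL + magnitude - 1) / magnitude;
}
```

With a one second nominal period and a 100,000 ppm error, the first three events land at 1.1, 2.2, and 3.3 seconds, and the timeline has slipped by a whole period after only 10 events. With a more realistic error of 50 ppm it takes 20,000 events to slip that far, which is why small drift goes unnoticed for so long.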

(Update 2017-05-08) I was recently reminded that my colleagues who worry about synchronization (getting two clocks to agree on the time of day) and syntonization (getting two clocks to agree on frequency) in telecommunication systems seem to use the term "drift" to talk about unidirectional long-term variation, and "wander" to talk about bidirectional long-term variation. Jitter typically refers to variations above 10 Hz and wander to variations below 10 Hz (admittedly a completely arbitrary convention). Sometimes they use the term "skew" where I might use drift, as in "clock skew". Historically, drift is the term I see used by horologists. To make it more confusing, different standards documents and organizations do not seem to agree on this terminology.

Navigation and time. Global navigation satellite systems know nothing about position. What they know about is time. Every GPS satellite has multiple redundant atomic clocks - cesium and/or rubidium - which are the most precise timepieces humankind knows how to construct. Every GPS satellite is accurate because the onboard atomic clocks are synchronized to within a few nanoseconds of each other by commands from ground stations.

Each GPS satellite continually transmits its unique identity number, a timestamp, and its ephemeris (a description of its orbit). By receiving this information from four different satellites, the receiver can solve four equations in four unknowns - its x, y and z position in space, and its t position in time - because it can compute its exact distance from each of the four satellites. The receiver then uses an abstract model of the earth - the World Geodetic System - to convert the x, y and z spatial coordinates into something humans find more useful: latitude, longitude, and altitude above mean sea level.

Each satellite also transmits the offset between GPS time (which does not recognize leap seconds) and UTC (which does). This offset changes each time the International Earth Rotation and Reference Systems Service (IERS) adds a leap second to UTC. This is done to keep our wall clocks in sync with the rotation of the Earth as it gradually slows down due to the drag of the tidal forces of the Sun and the Moon.

The use of time for navigation in this way is a very old idea. The technology of mechanical clocks themselves was driven by the need for an accurate and precise portable time standard with which to compute longitude when using celestial navigation during the age of sail. Later, in the age of steam, ships' captains noted when they heard foghorns, which ran off mechanical clockwork automation, and could use the offset from their ship's clock plus knowledge of the speed of sound to compute their distance from shore. In WWII, Long Range Navigation (LORAN) used radio beacons to similar effect, although before digital computers it required the ship carry an oscilloscope - which in those days was the size of a dorm refrigerator and took three men and a little boy to carry - tied to the LORAN receiver; the operator would measure distances between pulses from LORAN transmitters at known locations to triangulate the ship's position.

Distributed systems and time. Whether you are running a vast distributed network of devices, a few computers in an organization, or just a laptop, there are a lot of advantages to synchronizing the Time Of Day clock on a computer with the Internet at large. Besides, a lot of network protocols depend on it. That's what the Network Time Protocol (NTP), and (on Linux/GNU systems anyway) the NTP daemon (ntpd), do. NTP on the client periodically exchanges time stamps with a remote server, and from those time stamps it computes a clock offset from the server and a round trip time between it and the server. This allows NTP to gradually synchronize the clock on the local client with that on the remote server, often to within just a few milliseconds, by inserting or removing ticks of the TOD clock. It can do this with a number of remote servers and, by measuring the quality of the received timestamps in terms of offset, jitter, and drift, decide which remote server has the most accurate and precise clock and the best network connection.

While the statistical filters and algorithms used by NTP are beyond my ken, the protocol is not in principle complicated. Measurements are based on four time stamps:
  • T1 is when the client transmitted the request packet,
  • T2 is when the server received the request packet,
  • T3 is when the server transmitted the response packet, and
  • T4 is when the client received the response packet.
The time offset (which can be positive or negative) between the client and the server clocks is
Toffset = ((T2 - T1) + (T3 - T4)) / 2
(the trick to this algebra is the assumption that the travel time for the packet is the same in either direction) and the round trip time between the client and server is
Trtt = (T4 - T1) - (T3 - T2).
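
The two formulas above translate directly into integer C. This is only a sketch of the arithmetic, not of the filtering that a real NTP implementation layers on top of it; the structure and function names are hypothetical, and the four timestamps are assumed to share one unit, such as nanoseconds.

```c
#include <stdint.h>

/* The four timestamps of one request/response exchange, all in the same unit. */
typedef struct {
    int64_t t1; /* client transmitted the request  (client clock) */
    int64_t t2; /* server received the request     (server clock) */
    int64_t t3; /* server transmitted the response (server clock) */
    int64_t t4; /* client received the response    (client clock) */
} ntp_exchange_t;

/* Offset of the server clock relative to the client clock, assuming the
 * travel time is the same in both directions. Positive means the client
 * is behind the server. C integer division truncates toward zero. */
static inline int64_t ntp_offset(const ntp_exchange_t *x)
{
    return ((x->t2 - x->t1) + (x->t3 - x->t4)) / 2;
}

/* Round trip time on the network, excluding the server's processing time. */
static inline int64_t ntp_round_trip(const ntp_exchange_t *x)
{
    return (x->t4 - x->t1) - (x->t3 - x->t2);
}
```

As a check, suppose the client clock is 50 units behind the server, each network leg takes 10 units, and the server spends 10 units processing. Then T1 = 100, T2 = 160, T3 = 170, and T4 = 130, and the functions return an offset of 50 and a round trip time of 20.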

Buying Instead of Building

It took a bit of effort to put this desk clock together. Besides the basic hardware assembly and minor software hacking, the Adafruit LCD board was a kit, so I had to break out the Weller soldering station.

Some Assembly Is Required

But the principle of operation of my desk clock is so simple, you could be forgiven for wondering "Hey, why can't I just buy one of these devices?" Indeed, you can. And for a lot less money than my desk clock cost me to build, taking my hourly rate into account. Cheap enough that you might be tempted, as I was, to actually purchase a couple of inexpensive GPS-based NTP servers and try them out on your home network, comparing them to your home brew device and to each other.

Communication Systems Solutions TM 1000A. The CSS TM 1000A (the TM stands for Time Machines) is a US$300 GPS-disciplined NTP server in (literally) a black box.

CSS TM 1000A Time Machine NTP Server

Its front panel has three LEDs indicating power, satellite lock, and the 1PPS heartbeat.

CSS TM 1000A Time Machine NTP Server (back)

The rear panel has a round jack for its dedicated power brick, an RJ45 Ethernet jack, a DB9S serial port for debugging, and an SMA connector for the GPS antenna.

The SMA connector provides a voltage bias that is used to power the amplifier in the remote GPS antenna.

The TM 1000A is (mostly) easily administered using a password protected web page at its initial address of 192.168.1.15. I say "mostly" because it took about an hour of fiddling on my part to understand that the device's embedded web server didn't work with the Safari browser on my desktop Mac. It wasn't until I plugged a USB-to-serial adaptor into the TM 1000A's debug serial port and watched the diagnostic output that I switched to using Firefox and everything worked just fine. Had I started with Firefox, device setup would have been a matter of five minutes or so.

One peculiarity of the web page is that it reports the signal strength of three satellites it uses to compute its time and space solution. My understanding is that four satellites are necessary for a solution since there are four unknowns: x, y, z and t. Not a big deal, but it does call into question my own understanding of the theory behind all of this.

The TM 1000A responds just fine to requests from NTP daemons on other computers on my LAN. It does not however seem to respond to queries from ntpq, the NTP administration tool. That's not a deal breaker, but it can make debugging NTP problems more difficult. I fired off a query to the CSS tech support email address, but have not heard back yet.
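
To show what an ordinary time request looks like on the wire, here is a minimal Python sketch of a client-mode (mode 3) NTP exchange. The function names build_client_request, parse_reply, and query are my own inventions, and this is a bare-bones illustration rather than what a real NTP daemon does. The ntpq tool speaks a different message type, NTP control messages (mode 6), so a server can plausibly answer one kind of request and not the other.

```python
import socket
import struct

# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_TO_UNIX = 2208988800


def build_client_request(version=4):
    """Build a 48-byte NTP request: LI=0, the given version, mode 3 (client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_reply(packet):
    """Extract the mode, stratum, and transmit timestamp from an NTP reply."""
    if len(packet) < 48:
        raise ValueError("NTP packet too short")
    mode = packet[0] & 0x07
    stratum = packet[1]
    # The transmit timestamp occupies bytes 40..47: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    unix_time = seconds - NTP_TO_UNIX + fraction / 2**32
    return {"mode": mode, "stratum": stratum, "unix_time": unix_time}


def query(host, port=123, timeout=2.0):
    """Send one client request over UDP and parse whatever comes back."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_client_request(), (host, port))
        data, _ = sock.recvfrom(512)
    return parse_reply(data)
```

On a LAN like mine, something like query("waterclock") would be the way to try it, assuming that host name resolves. A GPS-disciplined server should answer with mode 4 (server) and stratum 1.
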

Update 2017-04-03: The CSS folks have verified that the Safari browser is not supported, and neither is ntpq.

I haven't tried this, but the TM 1000A can be configured to emit its 1PPS digital pulse on one of the pins on the serial connector. This means the device can in principle be used to discipline digital clocks on other devices using (for example) a phase-locked loop (PLL).
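
To make the phase-locked loop idea concrete, here is a toy simulation in Python. It is only a sketch under assumptions of my own: the function name discipline, the gains kp and ki, and the 20 ppm frequency error are all made up for illustration, and nothing here reflects how the TM 1000A or any real hardware does it. Once per simulated 1PPS edge, it measures how far a drifting local clock has wandered from the reference and steers the clock's frequency with a proportional-integral correction.

```python
def discipline(freq_error_ppm, seconds, kp=0.5, ki=0.1):
    """Simulate steering a drifting local clock toward a 1PPS reference.

    Returns the phase error in microseconds observed at each 1PPS edge.
    All parameters are hypothetical and chosen only for illustration.
    """
    phase_us = 0.0        # local clock minus reference, in microseconds
    integral = 0.0        # running sum of phase errors (the integral term)
    correction_ppm = 0.0  # frequency correction currently being applied
    history = []
    for _ in range(seconds):
        # 1 ppm of net frequency error drifts the clock 1 microsecond per second.
        phase_us += freq_error_ppm - correction_ppm
        history.append(phase_us)
        # Proportional-integral loop filter, updated on each 1PPS edge.
        integral += phase_us
        correction_ppm = kp * phase_us + ki * integral
    return history


if __name__ == "__main__":
    errors = discipline(freq_error_ppm=20.0, seconds=60)
    print("first edge: %.3f us, last edge: %.6f us" % (errors[0], errors[-1]))
```

With these particular gains the loop settles within a minute of simulated time: the integral term ends up cancelling the oscillator's frequency error, and the phase error at each pulse shrinks toward zero.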

The best part about the TM 1000A is that you can order it from Amazon.com and have it at your front door in a couple of days.

Leo Bodnar Electronics LeoNTP. The Leo Bodnar LeoNTP is a US$325 GPS-disciplined NTP server that would fit in your shirt pocket.

Leo Bodnar LeoNTP NTP Server

Its front panel has an LCD display and a simple turn/push control whose use is reminiscent of the scroll wheel on the original Apple iPod.

Leo Bodnar LeoNTP NTP Server (back)

The rear panel has a BNC connector, an RJ45 Ethernet jack, a female USB type C connector for power, and an SMA connector for the GPS antenna.

I didn't test it, but the BNC connector can be administered to output the 1PPS digital pulse, allowing it in principle to be used to discipline other digital clocks.

I didn't test it, but the Ethernet jack supports Power over Ethernet (PoE), a common feature of Ethernet switches used in many telecommunications applications, which eliminates the need for a power brick. (Lots of VoIP phones, for example, use PoE, requiring only that an Ethernet cable be run to the device from a PoE-capable switch.)

The SMA connector provides a voltage bias that is used to power the amplifier in the remote GPS antenna.

Leo Bodnar LeoNTP Stratum-1 NTP Server (operational)

The LCD display and the scroll button are the sole means of administering the device. It is so simple to use that it makes you wonder what all the fuss about web pages is about. Administering the device's static IP address was a matter of just a few seconds' work. The downside is that anyone walking by the device can administer it; there is no security. That's not a problem for me (unless the cats figure it out), but it might be for you. The color of the display, white or orange, indicates whether the device has a satellite lock (white is good). A green dot in the upper left corner of the display blinks at 1 Hz to indicate the 1PPS signal. A bar graph indicates the network load on the device as it processes NTP requests.

In theory, the front panel can be administered to display the local time by setting a time zone. However, not my time zone. Which is just as well, since to actually be correct, the device would have to understand Daylight Saving Time in my neck of the woods, which the powers that be are apt to change, requiring a firmware upgrade of the device. I'm happy with it displaying UTC.

The LeoNTP slowly responds to ntpq queries, which is useful for debugging. The LeoNTP reports that it contains a 32-bit ARMv7E-M microcontroller. This suggests that the LeoNTP runs some flavor of real-time operating system (RTOS), or possibly even just a task loop, instead of Linux, which (as we shall see) is the likely reason for its higher precision and lower jitter when reporting the time via NTP.

I ordered my LeoNTP from Uputronics in England (and had it at my front door near Denver, Colorado via UPS in just a few days - yet more evidence that I am living in the future), but it's sold in the U.S. and U.K. by other dealers as well.

Results

I now have both off-the-shelf NTP servers wired up to the Ethernet switch in my home office to which everything but my Mac laptop is connected. Both of these servers are plugged into a small UPS so that they can continue to keep time in the event of a brief power outage. The home brew desk clock communicates over WiFi and has a real-time clock with a battery backup.

The TM 1000A (left) is named "waterclock". The LeoNTP (right) is named "sundial". My home brew desk clock is named "hourglass".

I ran coaxial cable behind the bookcases and whatnot in my home office so that I can place three GPS antennas - two patch antennas and an outdoor marine antenna - in my second story office window where they can get a clear view of the sky. (This stage of the project may require an understanding spousal unit.) The transmissions from the GPS and GLONASS satellite constellations are remarkably low power - I'm told they have about the same received power as a night light - and their transmissions can be obstructed by as little as heavy tree cover. I was concerned about the glass window (and now I'm concerned about replacing this window with a more energy efficient model), but this setup seems to work fine: all three NTP servers quickly achieve a satellite lock. Early testing with the GPS board on the Raspberry Pi with a small patch antenna revealed it routinely can see as many as twelve satellites amongst the GPS and GLONASS constellations. The LeoNTP with the big waterproof marine antenna reports it can see fifteen satellites.

I have the NTP daemon on my Ubuntu development system "mercury" configured to query all three NTP servers, plus some other publicly accessible NTP servers on the Internet.
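
For anyone wanting to try the same thing, a configuration along these lines is roughly what that looks like. This is a hypothetical sketch of an ntp.conf fragment, not a copy of my actual file. The host names are the ones I gave the three servers, and the pool line is just a placeholder for whatever public servers you prefer.

```
# Hypothetical /etc/ntp.conf fragment: three local stratum-1 servers
# plus a public pool as a fallback.

# LeoNTP
server sundial iburst
# TM 1000A
server waterclock iburst
# Home brew desk clock
server hourglass iburst
# Placeholder for publicly accessible servers on the Internet
pool pool.ntp.org iburst
```

The iburst option only speeds up the initial synchronization. The daemon still decides for itself which source to trust.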

Here is the output of an ntpq query on mercury. Using its own algorithms to measure the quality of each NTP server as a timing source, the NTP daemon on mercury has chosen the LeoNTP as its primary time reference, with the TM 1000A and the Pi as its backups. You can see the wired LeoNTP and TM 1000A have substantially lower round trip delay than the wireless Pi, and all three are a lot lower than any of the Internet servers. The LeoNTP wins in terms of measured jitter; that's why I suspect that the LeoNTP is running an RTOS with a lot less software latency and scheduling variation than you see with a typical Linux system. The use of stratum-1 time sources on the local network causes the NTP daemon on mercury to declare itself a stratum-2 time source.
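
The delay and offset figures that ntpq reports come from four timestamps exchanged with each server. Here is a sketch of the arithmetic defined in RFC 5905 (listed in the sources below). The function name ntp_offset_delay and the example timestamps are made up for illustration.

```python
def ntp_offset_delay(t1, t2, t3, t4):
    """Compute clock offset and round trip delay from one NTP exchange.

    t1: client transmit time (client clock)
    t2: server receive time  (server clock)
    t3: server transmit time (server clock)
    t4: client receive time  (client clock)
    All values are in seconds. A positive offset means the client is behind.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2.0
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay


if __name__ == "__main__":
    # Hypothetical exchange: client 10 ms behind the server, 0.5 ms each way
    # on the wire, 0.1 ms of processing time inside the server.
    offset, delay = ntp_offset_delay(100.0, 100.0105, 100.0106, 100.0011)
    print("offset %.4f s, delay %.4f s" % (offset, delay))
```

The offset calculation assumes the network delay is the same in both directions. Any asymmetry shows up as an offset error of at most half the round trip delay, which is one reason a low-delay server on the local wired network makes a better reference than one across the Internet.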

All three NTP servers have been running for days now with no issues.

Conclusions

I'd be happy with either the TM 1000A or the LeoNTP as a time server on my home network. But the LeoNTP won me over with its clever user interface design, its pretty front panel LCD display, its support of ntpq, and its marginally better performance. Its BNC connector could be a plus in the future; BNC and coax are a common mechanism for exporting precision timing pulses to other devices. Its lack of security wasn't a concern for me.

There can be no doubt that it is a lot cheaper to go with either of the off-the-shelf NTP servers than building your own. But maybe not as fun. Plus, now, when asked "Does anybody really know what time it is?" I can definitively say "I do!"

Sources

ANSI, "Telecommunications - Synchronization Interface Standards for Digital Networks", T1.101, 1987

ISO, "Accuracy (trueness and precision) of measurement methods and results - Part 1: General principles and definitions", ISO 5725-1, 1994

E. Kaplan (ed.), Understanding GPS Principles and Applications, Artech House Publishers, 1996

J. Laird, Clock Synchronization Terminology, InterOperability Laboratory, University of New Hampshire, 2012-06

G. Miller, E. Raymond, "GPSD Time Service HOWTO", http://www.catb.org/gpsd/gpsd-time-service-howto.html

D. Mills, Computer Network Time Synchronization (2nd ed.), CRC Press, 2011

D. Mills, J. Martin, J. Burbank, W. Kasch, "Network Time Protocol Version 4: Protocol and Algorithm Specification", RFC 5905, 2010-06

J. Mogul, D. Mills, J. Brittenson, J. Stone, U. Windl, "Pulse-Per-Second API for UNIX-like Operating Systems" (Version 1.0), RFC 2783, 2000-03

C. Overclock, "Does anybody really know what time it is?", http://coverclock.blogspot.com/2006/11/does-anybody-really-know-what-time-it.html

Rakon Limited, "Timekeeping with Quartz Crystals", 2009

Raltron Electronics Corporation, "Stratum Levels Defined", http://www.raltron.com/products/pdfspecs/sync_an02-stratumleveldefined.pdf

E. Raymond, "Stratum-1-Microserver HOWTO", https://www.ntpsec.org/white-papers/stratum-1-microserver-howto/

D. Sobel, W. Andrewes, The Illustrated Longitude, Walker & Company, 2003

Wikipedia, "Accuracy and precision", https://en.wikipedia.org/wiki/Accuracy_and_precision

Wikipedia, "Atomic clock", https://en.wikipedia.org/wiki/Atomic_clock

Wikipedia, "Error analysis for the Global Positioning System", https://en.wikipedia.org/wiki/Error_analysis_for_the_Global_Positioning_System

Wikipedia, "Foghorn", https://en.wikipedia.org/wiki/Foghorn

Wikipedia, "Global Positioning System", https://en.wikipedia.org/wiki/Global_Positioning_System

Wikipedia, "GPS signals", https://en.wikipedia.org/wiki/GPS_signals

Wikipedia, "History of longitude", https://en.wikipedia.org/wiki/History_of_longitude

Wikipedia, "John Harrison", https://en.wikipedia.org/wiki/John_Harrison

Wikipedia, "LORAN", https://en.wikipedia.org/wiki/LORAN

Wikipedia, "Marine chronometer", https://en.wikipedia.org/wiki/Marine_chronometer

Wikipedia, "Network Time Protocol", https://en.wikipedia.org/wiki/Network_Time_Protocol