SIOT: ADR: Time Validation

Some thinking on this:

@bminer, @bmoyers input welcome!

Just a brain dump of thoughts to add to the discussion based on my past experiences across numerous projects…

Time and time synchronization are much tougher problems than most people give them credit for. I’ve had to address the beast on several projects and ended up solving it in different ways depending on the required functionality.

Client Side:

  • There are many scenarios and use cases to cover here, so I think we may struggle to come up with a one-size-fits-all solution.
  • I like the idea of the system being able to say or know for certain that it has accurate time. Typically this would come from NTP if it is an Internet-facing system, but could be from a GPS or satellite modem on completely offline systems. This can be problematic though, since you need to be certain the time can be invalidated across reboots, etc., which suggests some sort of state machine with known transitions.
  • Knowing you have a valid state machine and whether or not you can trust the system time is just one component of the problem though - some of the systems I’ve built needed to run and function whether the time was correct or not.
  • What happens when you have a process that is running, collecting data with the incorrect time, and then the system time gets updated? Typically, I’ve just purged the data to keep data integrity on the client, but what if it’s actually mission critical information just with an inaccurate timestamp?
  • I’ve used systemd-time-wait-sync.service to spin up services that can’t run without accurate time (works well if you know you’ll be connected to the Internet and quickly get NTP on boot).
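
As a rough illustration of the "state machine with known transitions" idea above, here is a minimal Go sketch. Everything in it is hypothetical and not taken from SIOT: the names `TimeState`, `TimeEvent`, and `Next`, and the choice of just two states. It assumes a reboot or any explicit invalidation drops the clock back to "unknown" until a trusted source (NTP, GPS, satellite modem) confirms it again.

```go
package main

import "fmt"

// TimeState is a hypothetical label for whether the system clock can be trusted.
type TimeState int

const (
	TimeUnknown TimeState = iota // nothing has confirmed the clock yet
	TimeValid                    // a trusted source has confirmed the clock
)

// String makes the state readable when printed.
func (s TimeState) String() string {
	if s == TimeValid {
		return "valid"
	}
	return "unknown"
}

// TimeEvent is a hypothetical set of events that can change the state.
type TimeEvent int

const (
	EvBoot       TimeEvent = iota // system (re)started
	EvSourceSync                  // NTP, GPS, or similar reported a good fix
	EvInvalidate                  // something made the clock untrustworthy
)

// Next returns the state that follows s when event e occurs.
// Assumption: a boot or an invalidation always resets trust,
// and only a sync from a trusted source can restore it.
func Next(s TimeState, e TimeEvent) TimeState {
	switch e {
	case EvSourceSync:
		return TimeValid
	case EvBoot, EvInvalidate:
		return TimeUnknown
	}
	return s
}

func main() {
	s := TimeUnknown
	for _, e := range []TimeEvent{EvSourceSync, EvInvalidate, EvSourceSync, EvBoot} {
		s = Next(s, e)
		fmt.Println(s)
	}
}
```

Services that must keep running with a wrong clock could read this state and tag the data they collect instead of blocking on it, which leaves the decision about what to do with that data for later.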

Server Side:

  • This can be just as tricky, since sometimes you can’t just look at timestamps of data flowing in and compare it to the current system time on the server. A client that has been offline for several hours or days might be collecting data and then do a huge push once it comes back online.
  • There may need to be some type of flag in the data indicating whether the client verified it was collected with accurate time. Still, can this be trusted?

More to come as I have new ideas, but I just wanted to get some of my thoughts down while they were fresh in my mind.

Thanks for the great feedback! I’ll try to process it all as I continue to work through this.

I’d like to design the system to run in offline mode (in case we have rules, etc. that need to run at the edge), so maybe we’ll keep this option as a last resort.

I had the same issue. Even though systemd-time-wait-sync worked really well, I wasn’t guaranteed that the system would be connected to the Internet, so the offline-mode requirement won out.