A brief history of ChromeOS time

ChromeOS, the operating system that powers one of the fastest-growing sets of devices in the world, is one of the very few operating systems designed from the ground up for security and simplicity. It may not have all the bells and whistles of the latest Windows or macOS, but it does have something the others don’t: it is arguably the most secure consumer operating system today, and it can hold its own without expensive antivirus software.

Building a secure operating system is not easy because it requires many components to work together. This book touches on one tiny component of this operating system: the device clock and how it maintains time.

Tracking time

Tracking “time” is one of the most fundamental features of every computing device today, but unfortunately it doesn’t get enough recognition. The massive effort towards the end of the last century to address the Y2K [Ref 1] problem did bring some of this to the forefront, but the explosion of the internet has brought on new challenges which don’t get talked about as much.

The accuracy and precision of “time” needed can vary significantly from task to task. Most mechanical wristwatches lose a few seconds every day, and for the most part this gets ignored. My dad used to manually synchronize his watch to the time on the radio every few months, and this was good enough for almost everything he did.

The stock markets, on the other hand, rely on millisecond-level accuracy and constantly synchronize their time with a more trusted source. With the advent of GPS and cellular phones, most of our connected devices do this as well in their own way, but each of them requires a significant amount of engineering to make it happen.

Importance of timestamps in distributed systems

Distributed systems are often significantly cheaper to build than monolithic systems. Having accurate timestamps (after calibrating the clocks) is important because it allows different parts of the system to operate independently, in parallel, on the same set of problems.

For example, a chain of supermarkets could have multiple customers buying potatoes at any given time, and there could also be daily shipments of the same product coming in every morning to each of the stores. The shopkeepers can build an accurate chronological log of every transaction to know exactly how close they got to a stock-out and what time of the day it happens. Knowing this state allows the shopkeeper to predict and schedule the size of the shipments they would need to prevent a situation where the shelves are completely empty.

This ability to aggregate inventory logs to build a single chronologically timestamped log is not that different from a postmortem which also depends on “time”. Knowing exactly what order a series of events happened in is critical to understand the overall state of the system.
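As a minimal sketch of the idea, the snippet below merges two per-store inventory logs into one chronological log and finds how close a store got to a stock-out. The store names, quantities, and the helper names `merge_logs` and `lowest_stock` are made up for illustration, and the sketch assumes the clocks of both stores are already synchronized.

```python
import heapq
from datetime import datetime, timezone


def ts(hour, minute):
    """Build a UTC timestamp on an arbitrary example day."""
    return datetime(2022, 8, 20, hour, minute, tzinfo=timezone.utc)


# Hypothetical per-store logs: (timestamp, store, change in potato stock).
# Each log is already sorted by time, as a local log naturally would be.
store_a = [(ts(8, 0), "A", +100), (ts(9, 30), "A", -40), (ts(17, 45), "A", -55)]
store_b = [(ts(8, 15), "B", +80), (ts(12, 0), "B", -30)]


def merge_logs(*logs):
    """Merge already-sorted per-store logs into one chronological log."""
    return list(heapq.merge(*logs, key=lambda entry: entry[0]))


def lowest_stock(log, store):
    """Return (lowest stock level, time it was reached) for one store."""
    level, low, low_time = 0, None, None
    for when, name, delta in log:
        if name != store:
            continue
        level += delta
        if low is None or level < low:
            low, low_time = level, when
    return low, low_time


combined = merge_logs(store_a, store_b)
for when, name, delta in combined:
    print(when.isoformat(), name, delta)
```

If one store's clock were off by even a few minutes, the merged order (and any conclusion drawn from it) could be wrong, which is why the clocks need to be calibrated first.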

How timestamps are used today

When the first routing protocols were written, the creators were worried about packets going around the internet in circles without ever expiring. This can happen due to bugs in a routing protocol, and if it does happen, it could completely overwhelm the network. To avoid this, they intentionally added a “counter”, also sometimes referred to as “hop” or “ttl” [Ref 12], to count how many routers a packet has jumped through. The routers were programmed to discard packets which hit a certain count.

Similarly, routers which passed routing maps to their peers used a “timestamp” to keep track of the latest routing maps, and were designed to discard the ones which were stale or outdated after a certain amount of time.
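The hop counter can be sketched in a few lines. This is an illustration rather than real router code: the `Packet` class, the `route` function, and the router names are hypothetical. As in IP, the counter here starts at a fixed value and counts down, and the packet is dropped when it reaches zero.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    payload: str
    ttl: int  # remaining hops before the packet must be discarded


def route(packet, next_hop, start, destination):
    """Follow next_hop (a dict mapping router -> next router) from start.

    Returns (delivered, routers_visited). The packet is dropped once its
    TTL reaches zero, so a routing loop cannot keep it alive forever.
    """
    current = start
    visited = [current]
    while current != destination:
        packet.ttl -= 1
        if packet.ttl <= 0:
            return False, visited  # discarded by the current router
        current = next_hop[current]
        visited.append(current)
    return True, visited


# A buggy routing table that sends packets around in a circle.
loop = {"r1": "r2", "r2": "r3", "r3": "r1"}
print(route(Packet("hello", ttl=5), loop, "r1", "r9"))
```

Without the counter, the `while` loop above would never terminate for the circular table, which is the software equivalent of a packet circling the network forever.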

Most of our low-latency APIs today have some form of “timestamp” component. As an example, when you submit your credit card information on an eCommerce site, you want that request to either execute quickly or expire and fail, so that you can confidently resubmit it without worrying about being charged twice.

Timestamps and “timeouts” have become a critical part of our everyday APIs. The devices on the internet today not only need to keep track of their own time, but also have to stay synchronized with billions of other devices.

NTP: The most popular time protocol

When the architects of the early internet started working on routing protocols, they quickly realized that a distributed computing environment requires all the devices to agree on “time”.
The first attempt to discuss how to synchronize time was made when a new routing protocol, HELLO [Ref 6], was being introduced. In 1981 COMSAT Laboratories published an RFC-like document, IEN-173, “Time Synchronization in DCNET Hosts” [Ref 7].

…describes an alternative mechanism using local-net protocols to synchronize a logical clock in each of a set of internet hosts to a single physical clock, such as an NBS radio clock. The mechanism has been incorporated as an integral component of the DCNET network routing algorithm and depends for its accuracy upon the careful control of link delays.

Ref: https://www.eecis.udel.edu/~mills/database/rfc/ien-173.txt

The proposal for the first version of NTP (Network Time Protocol) was made a few years later as part of RFC-958 in 1985. It’s currently on version 4 (RFC-5905).

NTP provides the protocol mechanisms to synchronize time in principle to precisions in the order of nanoseconds while preserving a non-ambiguous date, at least for this century. The protocol includes provisions to specify the precision and estimated error of the local clock and the characteristics of the reference clock to which it may be synchronized. However, the protocol itself specifies only the data representation and message formats and does not specify the synchronizing algorithms or filtering mechanisms.

Ref: https://www.rfc-editor.org/rfc/rfc5905.html

The protocol worked on a simple principle that it’s hard to maintain very accurate physical clocks, so it relied on a tiered network of devices to connect to each other to distribute the current time across the network. The “Stratum 1” servers synchronized their clocks with very precise physical clocks (some of them were based on atomic clocks). “Stratum 2” servers synchronized with “Stratum 1”… and so on.

“Time” prevents crime

As we mentioned before, putting timestamps on communication packets allows the systems to order events happening across the system. This capability was used over time to prevent a form of attack which is often called a “replay attack”.

Imagine issuing a signed check to a service provider who then photocopies it and tries to submit it 10 times to your bank. This particular attack is hard to pull off, because banks track check numbers very closely and would never allow the same check to be cashed twice.

Most modern communication protocols today include a component of time (often called a “timestamp”) which documents when the message was created or when it arrived. And most devices today check that the timestamp on the action they are performing is recent and falls within an acceptable time window.
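The check described above can be sketched as follows. This is a simplified illustration, not any particular protocol: the `ReplayGuard` class, the message ids, and the five-minute window are assumptions, and a real protocol would also verify a signature covering the id and timestamp.

```python
import time


class ReplayGuard:
    """Accept each message id at most once, and only if its timestamp is recent."""

    def __init__(self, window_seconds=300):
        self.window = window_seconds
        self.seen = {}  # message id -> timestamp it carried

    def accept(self, message_id, timestamp, now=None):
        now = time.time() if now is None else now
        # Forget ids whose timestamps have fallen out of the window. They can
        # be dropped safely because the freshness check rejects them anyway.
        self.seen = {m: t for m, t in self.seen.items() if now - t <= self.window}
        if abs(now - timestamp) > self.window:
            return False  # too old, or too far in the future
        if message_id in self.seen:
            return False  # same "check number" presented twice
        self.seen[message_id] = timestamp
        return True


guard = ReplayGuard(window_seconds=300)
print(guard.accept("check-1001", timestamp=1000, now=1010))  # first submission
print(guard.accept("check-1001", timestamp=1000, now=1020))  # photocopy
```

Note that the receiver only has to remember ids for the length of the window, which is what makes the timestamp useful. It also means the receiver's own clock has to be roughly right, or valid messages will be rejected and stale ones accepted.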

Timestamp + Public key encryption to prevent crime

The early days of the Internet allowed any client to spoof [Ref 13] any other client. TCP/IP’s “3-way-handshake” [Ref 14] and its ability to randomize sequence numbers [Ref 15] protected against some of these attacks.

But it was the introduction of public key cryptography which took this to the next level. Allowing servers and clients to authenticate each other and securely sign timestamped messages significantly reduced the ability of attackers to spoof.

URLs that begin with “https” use the SSL (Secure Sockets Layer) protocol, since succeeded by TLS, and are a perfect example of this infrastructure, which provides identity and encryption. Unless you have been living under a rock for the last few years, you will have noticed that most web services today use SSL/HTTPS by default. Some ISPs that were able to see what their customers [Ref 16] were searching for have lost this ability (which is good) due to this migration towards SSL.

Unfortunately enforcement of HTTPS presented two new challenges which required more work:

Problem 1: Clients could be convinced to use HTTP instead of HTTPS by devices in the middle

Since not all browsers supported HTTPS initially, it was normal for many web servers to maintain both HTTP and HTTPS versions of their services. In many cases, the only action an attacker in the network had to take was to rewrite hyperlinks from “https://…” to “http://…”. This would effectively force the clients to continue using the insecure HTTP protocol and open them up to sniffing and injection attacks by any device the traffic goes through. This was very powerful, because the same HTTP traffic included passwords and cookies which could be stolen by attackers for misuse.

HSTS [Ref 17] was introduced as a feature of HTTPS to protect against this attack. Using HSTS, servers could instruct browsers to always use HTTPS, which dramatically reduced the attack surface. Once a browser received these instructions, it ignored the HTTP endpoint for that service and picked the HTTPS URL even if the link it was asked to navigate to said HTTP. And if the HTTPS endpoint was not available, the browser would refuse to connect at all, which protected the user from ever sharing confidential information in clear text and signaled that something may be wrong on the network.
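A rough sketch of the browser side of HSTS follows. The server sends a `Strict-Transport-Security` response header with a `max-age` in seconds, and the browser remembers it per host. The `HstsStore` class and its method names are hypothetical, and the sketch ignores details such as `includeSubDomains`, preload lists, and explicit port numbers.

```python
from urllib.parse import urlsplit


class HstsStore:
    """Remember which hosts asked for HTTPS-only access, and until when."""

    def __init__(self):
        self._expiry = {}  # host -> unix time at which the policy expires

    def note_header(self, host, header_value, now):
        """Record a Strict-Transport-Security header received over HTTPS."""
        for directive in header_value.split(";"):
            name, _, value = directive.strip().partition("=")
            if name.lower() == "max-age":
                self._expiry[host] = now + int(value.strip('"'))

    def upgrade(self, url, now):
        """Rewrite http:// to https:// while the host's policy is still active."""
        parts = urlsplit(url)
        expiry = self._expiry.get(parts.hostname)
        if parts.scheme == "http" and expiry is not None and now < expiry:
            return parts._replace(scheme="https").geturl()
        return url


store = HstsStore()
store.note_header("example.com", "max-age=31536000; includeSubDomains", now=1000)
print(store.upgrade("http://example.com/login", now=2000))
```

The policy is stored as an expiry time, so it is only as trustworthy as the device clock: a clock pushed far enough into the future makes the browser believe the policy has lapsed.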

Problem 2: An attacker could use expired/broken certificate to attack a victim

While attackers may not be able to decrypt HTTPS traffic, they may still be able to find an old, expired or invalidated certificate to attack a victim.

Every client does at least three checks to validate new SSL traffic:

  1. Verify certificate period – Every certificate has a start/end timestamp during which it’s valid. Any certificate used outside this period should be assumed untrustworthy and rejected.
  2. Verify root authority – Every certificate is signed by one of the root CA providers. Here is the list for Chrome. Note that every cert included has a start/stop date during which it’s valid. Any certificate signed by a CA provider cert which is outside this period should be assumed untrustworthy and rejected.
  3. Check against CRL (Certificate Revocation List): If a certificate is on this list, it should be assumed untrustworthy and rejected. Certificates can end up on the CRL if they were compromised in any way. Since such certificates are otherwise still within their validity period, most browsers will trust them unless something like a CRL blocks them. Similarly, since revocation is not needed for expired certificates, it’s fair to assume that certificates could be dropped from the CRL after they expire.

“Time” as you see is part of each of these three checks and for a sufficiently motivated attacker, modifying system time on the victim’s device could help them compromise the victim’s device.
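To make the dependence on time concrete, here is a minimal sketch of the validity-period and revocation checks. It is not how any browser implements them: the certificate is modelled as a plain dictionary, the function name `check_certificate` is made up, and signature and root-authority verification are left out entirely.

```python
from datetime import datetime, timezone


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


def check_certificate(cert, revoked_serials, now):
    """Apply the validity-period and revocation checks against the local clock."""
    if not (cert["not_before"] <= now <= cert["not_after"]):
        return "rejected: outside validity period"
    if cert["serial"] in revoked_serials:
        return "rejected: revoked"
    return "accepted"


# A hypothetical certificate that expired at the start of 2020. Because it
# has expired, assume it has also been dropped from the revocation list.
old_cert = {"serial": "01", "not_before": utc(2019, 1, 1), "not_after": utc(2020, 1, 1)}

print(check_certificate(old_cert, set(), now=utc(2022, 8, 20)))  # correct clock
print(check_certificate(old_cert, set(), now=utc(2019, 6, 1)))   # clock rewound
```

With a correct clock the old certificate is rejected. If an attacker can rewind the victim's clock into the certificate's validity period, the same certificate passes both checks.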

HSTS is a blessing, but can’t be implemented if devices don’t have accurate time

While HTTPS/SSL significantly raised the bar for security, the protections introduced against MITM (man-in-the-middle) attacks and replay attacks have forced devices to always maintain good time. It’s common to see connectivity failures on devices which have incorrect time.

For example: if your device is being asked by HSTS to use HTTPS to connect to this blog, but the time on your device is off by a few years, your browser will reject the perfectly valid certificate because, according to the device clock, the certificate is outside its validity period.

How do modern computers maintain and update time?

Most desktops and laptops include a BIOS [Ref 20] which allows the user to set the device time before the OS boots up. They also come with a “CMOS battery” [Ref 21] which allows the RTC [Ref 22] included in the BIOS to keep track of time even if the device is fully powered off. While these batteries do need replacement every few years, they are for the most part user serviceable and easy to replace if needed.

Microsoft Windows and macOS also support NTP, which allows them to fetch the current time from public NTP servers. However, NTP doesn’t work in every environment, because it depends on a protocol (UDP) which is not always allowed through every network.

Why doesn’t ChromeOS use NTP?

It may be a surprise to some of you that, unlike other operating systems, ChromeOS doesn’t use NTP. There are three very strong reasons for this:

  1. NTP has a 30 year old code base: The NTP code base still has fragments of code which are close to 30 years old. This code has a lot of bells and whistles which are almost useless for most ChromeOS devices. That introduces code complexity and the potential for security bugs, which goes against the basic design principle of an OS designed from the ground up with security in mind.
  2. NTP is a 30 year old protocol: The NTP protocol has some known issues which could be abused in interesting ways. For example, every NTP client is also an NTP server, and the fact that the NTP protocol currently requires the server to specifically mention which server it’s syncing with exposes NTP clients to DoS attacks.
  3. NTP uses UDP: Chromebooks are heavily used in enterprises and schools, where they have had trouble communicating over UDP for the QUIC protocol. That suggests that relying on NTP for such a critical service could cripple a device which has no battery backup for the RTC.

How ChromeOS tracks time

ChromeOS is unlike most other operating systems we have seen in the past. The BIOS on the Chromebooks you get today has no user interface for updating the time. None of the Chromebooks available today have a CMOS battery either, which means that when you completely power off a Chromebook, the RTC will probably reset to Jan 1, 1970 at the next boot.

Interestingly, ChromeOS devices will never show Jan 1, 1970 as the date, because of a few tricks they use to correct the time.

#1: Chromebooks don’t completely power off when you shut them down

When you power off a Chromebook, ChromeOS does shut down. However, the BIOS inside keeps running, thanks to the battery which is included with every laptop. In this suspended state the device continues to power the RTC, so the device will see the right time at the next power-up.

Note that this is only possible on Chromebooks. It won’t help Chromeboxes or Chromebases which don’t have batteries.

#2: ChromeOS logs the latest time before shutdown

Since ChromeOS devices have permanent storage which persists across reboots, the OS stores the last known good time at shutdown. At reboot, ChromeOS compares the time reported by the RTC with the last known good time it logged and picks the later one. If the RTC was reset to 1970, it will always be earlier than the last known good time. In such situations the device will come back up with the time it had when it was last powered off.

#3: ChromeOS will use the OS build time

If the device is being powered up for the first time, it may not have any log from a previous boot cycle. In such cases, the device will use the timestamp of the OS image as its current date. For example, let’s say you just booted up a device for the first time and it has ChromeOS M105: you can be fairly confident that the device’s timestamp will not be older than Aug 20, 2022, because that is approximately when the first M105 build was built.
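Tricks #2 and #3 boil down to never letting the clock start earlier than a known lower bound. The sketch below is a simplification and not the actual ChromeOS logic: the function name `initial_time` is made up, and treating the build time as just another floor (rather than only a first-boot fallback) is an assumption of this example.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def initial_time(rtc_time, last_known_good=None, build_time=None):
    """Pick the boot-time clock value: the latest of all available lower bounds."""
    candidates = [rtc_time]
    if last_known_good is not None:
        candidates.append(last_known_good)  # saved at the previous shutdown
    if build_time is not None:
        candidates.append(build_time)       # timestamp of the OS image
    return max(candidates)


build = datetime(2022, 8, 20, tzinfo=timezone.utc)
saved = datetime(2022, 11, 3, 21, 30, tzinfo=timezone.utc)

# RTC lost power and reset, but a saved time exists from the last shutdown.
print(initial_time(EPOCH, last_known_good=saved, build_time=build))
# Very first boot: no saved time, so the build time is the best lower bound.
print(initial_time(EPOCH, build_time=build))
```

The result can still be hours or months behind the real time, but it is usually close enough for certificate validity checks to pass, after which a network time source can correct it.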

#4: ChromeOS will use tlsdate

ChromeOS was designed to be secure from the ground up, and the architects of the OS avoided including any non-critical software component they could do without. Instead of using NTP to fetch time, they relied on “httpdate” to do the same job in a much simpler way. “httpdate” worked on a very simple principle: since every HTTP response already carries a timestamp (the “date” header), the device could quickly fetch the current time by reaching out to any trusted server. I believe the early versions of ChromeOS used www.google.com as the time server to fetch time from.
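The principle is easy to demonstrate with the Python standard library. This is only an illustration of reading the “date” header, not the httpdate tool itself; the function names are made up, and `fetch_server_time` needs network access when called.

```python
from email.utils import parsedate_to_datetime
from urllib.request import Request, urlopen


def parse_date_header(value):
    """Turn an HTTP Date header such as 'Sat, 20 Aug 2022 10:15:30 GMT'
    into a timezone-aware datetime."""
    return parsedate_to_datetime(value)


def fetch_server_time(url, timeout=5):
    """Send a HEAD request and return the server's Date header as a datetime."""
    request = Request(url, method="HEAD")
    with urlopen(request, timeout=timeout) as response:
        return parse_date_header(response.headers["Date"])


print(parse_date_header("Sat, 20 Aug 2022 10:15:30 GMT").isoformat())
# With network access: fetch_server_time("https://www.google.com")
```

The header only has one-second resolution and the network round trip adds further error, so this approach trades precision for simplicity. Over plain HTTP the header could also be altered by any device in the middle.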

A few years ago httpdate was replaced by “tlsdate”, which was even smaller in terms of lines of code but also provided significantly better protection against MITM attacks, since every connection could be validated and encrypted.

Note that tlsdate is not 100% foolproof: there is a slight risk of attack during the first tlsdate fetch, when the device cannot yet check certificate validity periods, but this is still a significantly better implementation than httpdate.

#5: User will be prompted to update time

There is always a possibility that none of the four options gets the device into a state where it will accept certificates, because the time is still incorrect. In such cases the device will prompt the user to update the time.

Conclusion

Security experts often design protocols and features with the assumption that time is always reliable. Unfortunately, reality is more complex, and I expect the next generation of protocols to have better protections against time-related hacks.

But until then, OS manufacturers will have to keep striking a fine balance between making time accurate and precise, and reducing the attack surface by synchronizing time with something significantly more secure but less accurate. ChromeOS devices are not yet used for activities which need very precise time, and I’m sure that the ChromeOS development team will switch to something more precise before we get there.
If you liked this book, please consider reading other articles on my blog [https://www.flagthis.com/] and consider subscribing to it.

References

