Nicholas Merriam Author of Embedded Answers

How to read RISC-V mtime without an infinite loop

The RISC-V mtime register is a 64-bit timer. For 32-bit RISC-V profiles, there is an interesting challenge involved in simply reading it. We cannot read the 64-bit timer atomically because it is too long for a 32-bit read. If we read the low half as 0xFFFFFFFF and then read the upper half (mtimeh), we cannot be sure if the upper half was read just before the lower half overflowed, or just after. The difference is over four billion ticks, so it makes a big difference.
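To put a number on that ambiguity, here is a minimal sketch of the concatenation step. JoinHalves is a hypothetical helper name, not something defined by RISC-V or by any vendor header.

```c
#include <stdint.h>

/* Hypothetical helper: concatenate a high half and a low half
   into one 64-bit tick count. */
static inline uint64_t JoinHalves(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}
```

If the low half reads as 0xFFFFFFFF while mtimeh goes from 4 to 5, the two candidate results are JoinHalves(4, 0xFFFFFFFF) and JoinHalves(5, 0xFFFFFFFF). They are exactly 2^32 ticks apart, and only the first is correct.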

Hardware latching

This is a very common problem with more than one possible solution. It has existed ever since 8-bit processors were reading 16-bit timers. Some processors “latch” one half when you read the other half. This means that when you read one half, the hardware makes a copy of the value of the other half at the time of that first read. The other half continues to count, but reading its register returns not the current value but the latched copy, which is consistent with the first read.

A problem with hardware latching is that you have to be careful not to get interrupted between the two reads. Unless every pair of reads is correctly protected from interrupts, no pair of reads knows the state of the latch on entry. If you have third-party software components of differing levels of safety, in a mixed criticality environment, this creates a real headache.

Hardware latching is not necessarily a bad solution, but it does have the unattractive aspect of appearing to make the software easier, while retaining a dangerous wrinkle. I presume that it is for this reason that RISC-V does not specify hardware latching.

Pi Pico2

According to Raspberry Pi's own website:

Raspberry Pi Pico 2 is a low-cost, high-performance microcontroller board with flexible digital interfaces.

The Pi Pico2 datasheet describes a recipe for reading all 64 bits of mtime. To paraphrase, they sandwich the read of the low half between two reads of the high half, and repeat until the two reads of the high half are the same.
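The recipe as paraphrased above can be sketched in C as follows. This is an illustration of the general pattern, not the datasheet's own listing. The function name ReadMtimeLoop is hypothetical, and the register addresses are passed in as pointers so that the sketch assumes nothing about the memory map.

```c
#include <stdint.h>

/* Hypothetical helper: read a 64-bit timer through two 32-bit registers.
   mtimeLo and mtimeHi point at the low and high halves; their addresses
   are platform-specific and are assumptions of the caller. */
static inline uint64_t ReadMtimeLoop(const volatile uint32_t *mtimeLo,
                                     const volatile uint32_t *mtimeHi)
{
    uint32_t hiBefore;
    uint32_t lo;
    uint32_t hiAfter;

    /* Sandwich the low read between two high reads, and retry
       until the high half did not change across the low read. */
    do {
        hiBefore = *mtimeHi;
        lo = *mtimeLo;
        hiAfter = *mtimeHi;
    } while (hiBefore != hiAfter);

    return ((uint64_t)hiAfter << 32) | lo;
}
```

The volatile qualifier matters here: without it, the compiler would be free to merge the two reads of the high half into one.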

How (not) to loop forever

This is a well-known recipe that has been used successfully for many years. It has one unfortunate characteristic. While nobody seriously thinks that this loop might run forever, it is annoyingly hard to prove that it terminates. This recipe is intended for use in any system, and without knowing the system, we cannot rule out that on every iteration an interrupt arrives between the two reads of the high half and takes so long that they read different values. We can claim that it is impossibly unlikely, but we cannot prove that it will not happen.

In a safety-critical system, allowing a loop for which we cannot prove termination creates painful expense. Static analysis tools that attempt to prove loop termination must be assuaged. Code reviews must cross-reference the justification of this particular loop.

In the end, the more cost-effective solution is to allow this loop only in environments where interrupts are disabled. Without interrupts, we get at most two iterations of the loop. The first iteration will sometimes see the high half increment, but then the second iteration can be proven to complete before the high half increments again.

No loop

Knowing that the loop always needs either one or two iterations raises the question of whether we really need a loop. Can we have single-path code?

Yes, and the solution is quite simple. With just one loop iteration, we have three timer reads:

  1. mtimehBefore
  2. mtime
  3. mtimehAfter

The difference mtimehAfter - mtimehBefore is either zero or one. If it is zero, we know that mtimeh did not change and we can simply concatenate it with mtime to make a 64-bit result. If the difference is one, we know that mtime belongs with one of mtimehBefore and mtimehAfter, and we have to work out which one. If mtime is small, it belongs with mtimehAfter, because mtime was read just after the overflow. Otherwise, it belongs with mtimehBefore. So when the difference is one, subtracting the most significant bit of mtime from mtimehAfter gives us the correct value to concatenate with mtime.
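A minimal sketch of this calculation follows. It is not necessarily the author's exact listing. It takes "small" to mean that the most significant bit of mtime is clear, and it relies on the difference being zero or one, which holds when interrupts are disabled. CombineMtime is a hypothetical helper name, and passing the register addresses as pointers is an assumption made to keep the sketch independent of any memory map.

```c
#include <stdint.h>

/* Hypothetical helper: choose the high half that matches the low half.
   Assumes mtimehAfter - mtimehBefore is 0 or 1. */
static inline uint64_t CombineMtime(uint32_t mtimehBefore,
                                    uint32_t mtimeLo,
                                    uint32_t mtimehAfter)
{
    uint32_t wrapped = mtimehAfter - mtimehBefore; /* 0 or 1 */
    uint32_t msb = mtimeLo >> 31;                  /* 1 if mtime is large */

    /* Step back to mtimehBefore only if the high half changed
       and mtime was read before the overflow. */
    uint32_t hi = mtimehAfter - (wrapped & msb);

    return ((uint64_t)hi << 32) | mtimeLo;
}

/* Three reads, no loop. The caller is assumed to hold the interrupt lock. */
static inline uint64_t GetMtime64(const volatile uint32_t *mtimeLo,
                                  const volatile uint32_t *mtimeHi)
{
    uint32_t mtimehBefore = *mtimeHi;
    uint32_t lo = *mtimeLo;
    uint32_t mtimehAfter = *mtimeHi;

    return CombineMtime(mtimehBefore, lo, mtimehAfter);
}
```

For example, with mtimehBefore = 5 and mtimehAfter = 6, a low half of 0xFFFFFFF0 yields 0x5FFFFFFF0, while a low half of 0x00000010 yields 0x600000010. The mask with wrapped is what keeps the result correct when mtimeh did not change but mtime happens to have its top bit set.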

The C implementation compiles to branchless code, and is an excellent candidate for inlining. I have ignored the interrupt lock, which might be the responsibility of the caller or might be included in GetMtime64.