How to read RISC-V mtime without an infinite loop
The RISC-V mtime register is a 64-bit timer. On 32-bit RISC-V (RV32),
simply reading it is an interesting challenge. We cannot read the 64-bit timer atomically because
it is too wide for a single 32-bit read, so it must be read as two halves. If we read
the low half as 0xFFFFFFFF and then read the upper half (mtimeh), we cannot
be sure if the upper half was read just before the lower half overflowed,
or just after. The difference is over four billion ticks, so it makes a
big difference.
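To put a number on that ambiguity, here is a minimal sketch in C. The helper names `Join` and `TornReadAmbiguity` are made up for this illustration and are not part of any RISC-V API; the code only does the arithmetic on the two candidate values.

```c
#include <stdint.h>

/* Join a high half and a low half into one 64-bit tick count. */
static uint64_t Join(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * The low half was read as 0xFFFFFFFF. The high half was sampled either
 * just before the carry (hiBeforeCarry) or just after (hiBeforeCarry + 1).
 * Return the distance between the two candidate results.
 */
static uint64_t TornReadAmbiguity(uint32_t hiBeforeCarry)
{
    uint64_t ifHighReadBeforeCarry = Join(hiBeforeCarry, 0xFFFFFFFFu);
    uint64_t ifHighReadAfterCarry = Join(hiBeforeCarry + 1u, 0xFFFFFFFFu);
    return ifHighReadAfterCarry - ifHighReadBeforeCarry;
}
```

Whatever the high half happens to be, the two candidates are exactly 2^32 ticks apart, and nothing in the two values read tells us which candidate is the right one.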
Hardware latching
This is a very common problem with more than one possible solution. It has existed ever since 8-bit processors were reading 16-bit timers. Some processors “latch” one half when you read the other half. This means that when you read one half, the hardware makes a copy of the value of the other half at the time of that first read. The other half continues to count, but reading its register returns the latched copy, which is consistent with the first read, rather than the current count.
A problem with hardware latching is that you have to be careful not to get interrupted between the two reads. Unless every pair of reads is correctly protected from interrupts, no pair of reads knows the state of the latch on entry. If you have third-party software components of differing levels of safety, in a mixed criticality environment, this creates a real headache.
Hardware latching is not necessarily a bad solution, but it does have the unattractive aspect of appearing to make the software easier, while retaining a dangerous wrinkle. I presume that it is for this reason that RISC-V does not specify hardware latching.
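The interrupt hazard can be shown with a small host-side simulation. This is a hypothetical model, not the behaviour of any particular device: it assumes that reading the low half latches the high half, and every name in it (`sim_now`, `ReadLowAndLatch`, `ReadLatchedHigh`, and so on) is invented for the sketch.

```c
#include <stdint.h>

/* Simulated latching timer (hypothetical model). */
static uint64_t sim_now;    /* the free-running 64-bit count */
static uint32_t sim_latch;  /* copy of the high half taken on a low read */

/* Reading the low half also latches the current high half. */
static uint32_t ReadLowAndLatch(void)
{
    sim_latch = (uint32_t)(sim_now >> 32);
    return (uint32_t)sim_now;
}

/* Reading the high half returns the latch, not the live count. */
static uint32_t ReadLatchedHigh(void)
{
    return sim_latch;
}

/* A complete, uninterrupted pair of reads. */
static uint64_t ReadTimer(void)
{
    uint32_t lo = ReadLowAndLatch();
    uint32_t hi = ReadLatchedHigh();
    return ((uint64_t)hi << 32) | lo;
}

/*
 * The same pair of reads, but a simulated interrupt handler reads the
 * timer between them, `delay` ticks after our low-half read.
 */
static uint64_t ReadTimerInterrupted(uint64_t delay)
{
    uint32_t lo = ReadLowAndLatch();
    sim_now += delay;        /* time passes before the handler runs */
    (void)ReadTimer();       /* the handler's read overwrites the latch */
    uint32_t hi = ReadLatchedHigh();
    return ((uint64_t)hi << 32) | lo;
}
```

In this model, starting at 0x00000001FFFFFFF0 and being interrupted for 0x20 ticks returns 0x00000002FFFFFFF0: the low half from before the carry paired with the high half that the handler latched after it.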
Pi Pico2
According to Raspberry Pi's own website:
“Raspberry Pi Pico 2 is a low-cost, high-performance microcontroller board with flexible digital interfaces.”
The Pi Pico2 datasheet
describes a recipe for reading all 64 bits of mtime. To paraphrase,
they sandwich the read of the low half between two reads of the high
half, and repeat until the two reads of the high half are the same.
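As a rough illustration, the recipe might look like the following in C. This is a sketch of the general pattern, not the datasheet's own listing. `ReadMtime` and `ReadMtimeh` are hypothetical accessors, backed here by a simulated counter so that the example runs on a host; on real hardware they would be volatile reads of the platform's timer registers. `GetMtime64Loop` is likewise a name chosen for this sketch.

```c
#include <stdint.h>

/* Simulated timer (assumption): each register read takes sim_step ticks. */
static uint64_t sim_now;
static uint64_t sim_step = 1;

static uint32_t ReadMtime(void)
{
    uint32_t value = (uint32_t)sim_now;
    sim_now += sim_step;
    return value;
}

static uint32_t ReadMtimeh(void)
{
    uint32_t value = (uint32_t)(sim_now >> 32);
    sim_now += sim_step;
    return value;
}

/* Sandwich the low read between two high reads; retry if they differ. */
static uint64_t GetMtime64Loop(void)
{
    uint32_t hi;
    uint32_t lo;
    uint32_t hiAgain;

    do {
        hi = ReadMtimeh();
        lo = ReadMtime();
        hiAgain = ReadMtimeh();
    } while (hi != hiAgain);

    return ((uint64_t)hi << 32) | lo;
}
```

If the simulated counter starts at 0x00000001FFFFFFFF, the first pass sees the high half change from 1 to 2 and retries; the second pass reads a consistent pair and returns 0x0000000200000003.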
How (not) to loop forever
This is a well-known recipe that has been used successfully for many years. It has one unfortunate characteristic. While nobody seriously thinks that this loop might run forever, it is annoyingly hard to prove that it terminates. This recipe is intended for use in any system, and without knowing the system, we cannot prove that we do not continuously get an interrupt between the two reads of the high half that uses so much time that they read different values. We can claim that it is impossibly unlikely, but we cannot prove that it will not happen.
In a safety-critical system, allowing a loop for which we cannot prove termination creates painful expense. Static analysis tools that attempt to prove loop termination must be assuaged. Code reviews must cross-reference the justification of this particular loop.
In the end, the more cost-effective solution is to allow this loop only in environments where interrupts are disabled. Without interrupts, we get at most two iterations of the loop. The first iteration will sometimes see the high half increment, but then the second iteration can be proven to complete before the high half increments again.
No loop
Knowing that the loop always needs either one or two iterations raises the question of whether we really need a loop. Can we have single path code?
Yes, and the solution is quite simple. With just one loop iteration, we have three timer reads:
- mtimehBefore
- mtime
- mtimehAfter
The difference mtimehAfter - mtimehBefore is either one or zero. If
it is zero, we know that mtimeh did not change and we can simply
concatenate mtimehAfter and mtime to make a 64-bit result.
If the difference is one, we know that we can concatenate mtime
with one of mtimehBefore and mtimehAfter and we have to work out
which one. If mtime is small, it belongs with mtimehAfter, because
mtime was read just after the overflow. Otherwise, it belongs with
mtimehBefore. In this case, subtracting the most significant bit of mtime from
mtimehAfter gives us the correct value to concatenate with mtime.
To cover both cases with one expression, the subtraction must happen only
when the difference is one, for example by ANDing the most significant
bit of mtime with the difference before subtracting.
The C implementation compiles to
branchless code, and is an excellent candidate for inlining. I have
ignored the interrupt lock, which might be the responsibility of the
caller or might be included in GetMtime64.
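For readers without the listing to hand, here is a minimal sketch of what such a function could look like. It is reconstructed from the description above, not copied from the original implementation. `ReadMtime` and `ReadMtimeh` are hypothetical accessors backed by a simulated counter so that the sketch runs on a host; on a real target they would be volatile reads of the memory-mapped registers. The interrupt lock is assumed to be held by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated timer (assumption): each register read takes sim_step ticks. */
static uint64_t sim_now;
static uint64_t sim_step = 1;

static uint32_t ReadMtime(void)
{
    uint32_t value = (uint32_t)sim_now;
    sim_now += sim_step;
    return value;
}

static uint32_t ReadMtimeh(void)
{
    uint32_t value = (uint32_t)(sim_now >> 32);
    sim_now += sim_step;
    return value;
}

/* Single-path 64-bit read. The caller is assumed to have interrupts disabled. */
static inline uint64_t GetMtime64(void)
{
    uint32_t mtimehBefore = ReadMtimeh();
    uint32_t mtime = ReadMtime();
    uint32_t mtimehAfter = ReadMtimeh();

    /* 0 if the high half did not change, 1 if the low half overflowed. */
    uint32_t carried = mtimehAfter - mtimehBefore;

    /* Debug-only precondition check; compiled out when NDEBUG is defined. */
    assert(carried <= 1u);

    /*
     * If the low half overflowed and mtime is large (top bit set), mtime
     * was read before the overflow and belongs with mtimehBefore, which
     * is mtimehAfter - 1. In every other case it belongs with mtimehAfter.
     */
    uint32_t high = mtimehAfter - (carried & (mtime >> 31));

    return ((uint64_t)high << 32) | mtime;
}
```

The AND with `carried` matters: when the high half did not change, a large mtime must still be paired with mtimehAfter. With the simulated counter starting at 0x00000001FFFFFFFE the function returns 0x00000001FFFFFFFF, and starting at 0x00000001FFFFFFFF it returns 0x0000000200000000.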