Obviously gettimeofday is not precise enough because it returns a timeval structure (defined in time.h), which only has a microsecond resolution:
struct timeval {
  time_t      tv_sec;       /* seconds */
  suseconds_t tv_usec; /* microseconds */
};Instead, the timespec structure (also defined in time.h) seems perfect with a nanosecond resolution:
struct timespec {
  time_t   tv_sec;  /* seconds */
  long     tv_nsec;   /* nanoseconds */
};So, I stumbled upon kernel POSIX timers which can be accessed from user-land with the following functions (triggering syscalls):
#include <time.h> int clock_gettime(clockid_t clk_id, struct timespec *tp);According to the manual, a system-wide realtime clock is provided (CLOCK_REALTIME), perfect here.
When you look at the libc and kernel source kernel/posix-timers.c, it is interesting how these clocks are implemented... and definitely not trivial!
Now how to use it? Consider the following code which measures the latency between two gettime calls, for each type of clock.
#include <stdio.h>
#include <time.h>
int main(int argc, char **argv, char **arge) {
  struct timespec tps, tpe;
  if ((clock_gettime(CLOCK_REALTIME, &tps) != 0)
  || (clock_gettime(CLOCK_REALTIME, &tpe) != 0)) {
    perror("clock_gettime");
    return -1;
  }
  printf("%lu s, %lu ns\n", tpe.tv_sec-tps.tv_sec,
    tpe.tv_nsec-tps.tv_nsec);
  return 0;
}Compile and link with librt (for clock_gettime).$ gcc -Wall -lrt -o time time.cNow here is what I get on a Core i7 950, Linux 2.6.33, libc 2.10.2-6, Debian sid.
$ ./time 0 s, 113 nsPretty good!
Interesting thing, if in another shell if I run a do-nothing loop (« while(1); »), I have better timings:
$ ./time 0 s, 58 ns
Apparently, it comes from a feature of modern CPU: the ability to shutdown ("relax") when idle. And it takes some time for the CPU to wake-up. You can modify this behaviour by using the idle=poll kernel command line. This way, the CPU never relaxes and is always ready (polling) to work.
 
hmm interesting thing ;)
ReplyDeleteAh! thanks for the post!
ReplyDeleteI spent like couple of hours trying to figure out what the hell is going on... (I was getting quite different results when running same code on the same input)
Stupid power saving mode...
Im wondering if it is really ns precisely... But it is impressive !
ReplyDeletereally impressive and nice to know. I have been searching some way of calculating elapsed time of a function in nanoseconds. love it.
ReplyDeleteOn newer versions of gcc, the gcc command above will fail with "undefined reference to clock_gettime" - move the -lrt to after the source file name to fix this, e.g.:
ReplyDeletegcc -Wall -o time time.c -lrt
You have to take care of the condition when tpe.tv_nsec < tps.tv_nsec.
ReplyDelete