Friday, March 12, 2010

Nanosecond time measurement with clock_gettime, CPU idling

Recently, I had to do latency testing of message passing between kernel-land and user-land, so I first looked after time structures and functions available in both the kernel and the libc, and then I looked after possible tuning of the kernel.

Obviously gettimeofday is not precise enough because it returns a timeval structure (defined in time.h), which only has a microsecond resolution:
struct timeval {
  time_t      tv_sec;       /* seconds */
  suseconds_t tv_usec; /* microseconds */
};

Instead, the timespec structure (also defined in time.h) seems perfect with a nanosecond resolution:
struct timespec {
  time_t   tv_sec;  /* seconds */
  long     tv_nsec;   /* nanoseconds */
};

So, I stumbled upon kernel POSIX timers which can be accessed from user-land with the following functions (triggering syscalls):
#include <time.h>
int clock_gettime(clockid_t clk_id, struct timespec *tp);
According to the manual, a system-wide realtime clock is provided (CLOCK_REALTIME), perfect here.

When you look at the libc and kernel source kernel/posix-timers.c, it is interesting how these clocks are implemented... and definitely not trivial!

Now how to use it? Consider the following code which measures the latency between two gettime calls, for each type of clock.
#include <stdio.h>
#include <time.h>

int main(int argc, char **argv, char **arge) {
  struct timespec tps, tpe;
  if ((clock_gettime(CLOCK_REALTIME, &tps) != 0)
  || (clock_gettime(CLOCK_REALTIME, &tpe) != 0)) {
    perror("clock_gettime");
    return -1;
  }
  printf("%lu s, %lu ns\n", tpe.tv_sec-tps.tv_sec,
    tpe.tv_nsec-tps.tv_nsec);
  return 0;
}
Compile and link with librt (for clock_gettime).
$ gcc -Wall -lrt -o time time.c
Now here is what I get on a Core i7 950, Linux 2.6.33, libc 2.10.2-6, Debian sid.
$ ./time
0 s, 113 ns
Pretty good!

Interesting thing, if in another shell if I run a do-nothing loop (« while(1); »), I have better timings:
$ ./time
0 s, 58 ns

Apparently, it comes from a feature of modern CPU: the ability to shutdown ("relax") when idle. And it takes some time for the CPU to wake-up. You can modify this behaviour by using the idle=poll kernel command line. This way, the CPU never relaxes and is always ready (polling) to work.

6 comments:

  1. Ah! thanks for the post!
    I spent like couple of hours trying to figure out what the hell is going on... (I was getting quite different results when running same code on the same input)
    Stupid power saving mode...

    ReplyDelete
  2. Im wondering if it is really ns precisely... But it is impressive !

    ReplyDelete
  3. really impressive and nice to know. I have been searching some way of calculating elapsed time of a function in nanoseconds. love it.

    ReplyDelete
  4. On newer versions of gcc, the gcc command above will fail with "undefined reference to clock_gettime" - move the -lrt to after the source file name to fix this, e.g.:

    gcc -Wall -o time time.c -lrt

    ReplyDelete
  5. You have to take care of the condition when tpe.tv_nsec < tps.tv_nsec.

    ReplyDelete