Manually unpacking a UPX packed binary with radare2 (Part_1)

“Kitten in a cardboard box”, by Revital Salomon, licensed under CC BY-SA 4.0

I recently attempted a binary challenge involving a packed executable. However, because the binary was packed with UPX, the challenge becomes fairly trivial once you know you can unpack it by running upx -d.

I decided to look into the process of manually unpacking UPX. Almost all the tutorials I found still use OllyDbg on Windows.
In this post I will run through one possible method for manually unpacking a binary packed with a modern version of UPX, using radare2 on Linux.

First let’s compile and pack a simple hello world:

#include <stdio.h>

int main() {
    printf("Hello, World!");
    return 0;
}

Compile and pack:

$ gcc -o hello hello.c 
$ upx -o hello_upx hello
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2018
UPX 3.95 Markus Oberhumer, Laszlo Molnar & John Reiser Aug 26th 2018
File size Ratio Format Name
-------------------- ------ ----------- -----------
upx: hello: NotCompressibleException
Packed 1 file: 0 ok, 1 error.

UPX reports an error: the packing has failed.
This makes sense. Our hello world is tiny and UPX is primarily a compression tool, so there is very little for it to compress.
The easiest way around this is to inflate the size of our compiled executable by statically linking it.
Compiling again with static linking turned on, we see the size of the compiled binary go from 17 KiB to 852 KiB.

$ ls -lah
total 48K
drwxrwxr-x 2 user user 4,0K May 7 22:20 .
drwxrwxr-x 5 user user 4,0K Apr 28 18:34 ..
-rwxrwxr-x 1 user user 17K May 7 22:20 hello
-rw-rw-r-- 1 user user 76 May 7 20:15 hello.c
$ rm hello
$ gcc -static -o hello hello.c
$ ls -la
total 880
drwxrwxr-x 2 user user 4096 May 7 22:20 .
drwxrwxr-x 5 user user 4096 Apr 28 18:34 ..
-rwxrwxr-x 1 user user 871688 May 7 22:20 hello
-rw-rw-r-- 1 user user 76 May 7 20:15 hello.c
$ ls -lah
total 880K
drwxrwxr-x 2 user user 4,0K May 7 22:20 .
drwxrwxr-x 5 user user 4,0K Apr 28 18:34 ..
-rwxrwxr-x 1 user user 852K May 7 22:20 hello
-rw-rw-r-- 1 user user 76 May 7 20:15 hello.c
$ upx -o hello_upx hello
Ultimate Packer for eXecutables
Copyright (C) 1996 - 2018
UPX 3.95 Markus Oberhumer, Laszlo Molnar & John Reiser Aug 26th 2018
File size Ratio Format Name
-------------------- ------ ----------- -----------
871688 -> 335268 38.46% linux/amd64 hello_upx
Packed 1 file.
$ ls -la
total 1216
drwxrwxr-x 2 user user 4096 May 7 22:21 .
drwxrwxr-x 5 user user 4096 Apr 28 18:34 ..
-rwxrwxr-x 1 user user 871688 May 7 22:20 hello
-rw-rw-r-- 1 user user 76 May 7 20:15 hello.c
-rwxrwxr-x 1 user user 335268 May 7 22:20 hello_upx

Packing with UPX now succeeds, and the packed ELF is 335268 bytes in size.
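As a quick sanity check on the Ratio column, the percentage can be recomputed from the two sizes UPX printed. This is only a sketch; the helper name ratio is made up for illustration and is not part of UPX.

```shell
# Hypothetical helper: packed size as a percentage of the original size.
ratio() {
    awk -v packed="$1" -v orig="$2" 'BEGIN { printf "%.2f%%\n", 100 * packed / orig }'
}

# Sizes taken from the UPX output above (871688 -> 335268).
ratio 335268 871688   # prints 38.46%
```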

DIE (Detect It Easy) is a great tool similar to PEiD. A quick look at the file in DIE shows us straight away that it is packed with UPX.
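If DIE is not to hand, a much cruder check is to look for the literal "UPX!" marker that UPX leaves in packed files. The sketch below assumes the marker has not been stripped or altered, which is easy to do, so a negative result proves nothing. The helper name has_upx_magic is made up for illustration.

```shell
# Hypothetical helper: succeeds if the file contains the literal "UPX!" marker.
has_upx_magic() {
    # -a: treat binary data as text, -q: no output, only an exit status
    grep -aq 'UPX!' "$1"
}

# Only run the check if the packed file from this post is present.
if [ -f ./hello_upx ]; then
    if has_upx_magic ./hello_upx; then
        echo "UPX marker found"
    else
        echo "no UPX marker"
    fi
fi
```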

Now let’s try to unpack it!

There are many different approaches to unpacking, such as memory breakpoints and pattern recognition, but in this post I will try a slightly simpler approach. (It helps that we have access to the packer, but the technique can still be useful without it.) We will use strace, which runs an executable and logs the syscalls it makes, to get an idea of what the packer is doing and work from there.

First run the original, unpacked executable with strace:

$ strace ./hello
execve("./hello", ["./hello"], 0x7fffe8ead6f0 /* 61 vars */) = 0
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffd64a7d430) = -1 EINVAL (Invalid argument)
brk(NULL) = 0x2433000
brk(0x24341c0) = 0x24341c0
arch_prctl(ARCH_SET_FS, 0x2433880) = 0
uname({sysname="Linux", nodename="pqrxyz", ...}) = 0
readlink("/proc/self/exe", "/home/user/c4_p"..., 4096) = 41
brk(0x24551c0) = 0x24551c0
brk(0x2456000) = 0x2456000
mprotect(0x4bd000, 12288, PROT_READ) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x5), ...}) = 0
write(1, "Hello, World!", 13Hello, World!) = 13
exit_group(0) = ?
+++ exited with 0 +++

It looks like there is some standard setup going on, and then we see the syscalls that correspond to our print statement. Now let’s do the same for the packed binary:

$ strace ./hello_upx 
execve("./hello_upx", ["./hello_upx"], 0x7ffe71a36cc0 /* 61 vars */) = 0
open("/proc/self/exe", O_RDONLY) = 3
mmap(NULL, 308235, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fd176ca0000
mmap(0x7fd176ca0000, 307874, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0) = 0x7fd176ca0000
mprotect(0x7fd176cea000, 5131, PROT_READ|PROT_EXEC) = 0
readlink("/proc/self/exe", "/home/user/"..., 4095) = 45
mmap(0x400000, 802816, PROT_NONE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mmap(0x400000, 1304, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x400000
mprotect(0x400000, 1304, PROT_READ) = 0
mmap(0x401000, 603617, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0x1000) = 0x401000
mprotect(0x401000, 603617, PROT_READ|PROT_EXEC) = 0
mmap(0x495000, 157093, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0x95000) = 0x495000
mprotect(0x495000, 157093, PROT_READ) = 0
mmap(0x4bd000, 21008, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0xbc000) = 0x4bd000
mprotect(0x4bd000, 21008, PROT_READ|PROT_WRITE) = 0
mmap(0x4c3000, 2432, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4c3000
mmap(NULL, 4096, PROT_READ, MAP_PRIVATE, 3, 0) = 0x7fd176c9f000
close(3) = 0
munmap(0x7fd176ca0000, 308235) = 0
arch_prctl(0x3001 /* ARCH_??? */, 0x7ffecd4988d0) = -1 EINVAL (Invalid argument)
brk(NULL) = 0x2244000
brk(0x22451c0) = 0x22451c0
arch_prctl(ARCH_SET_FS, 0x2244880) = 0
uname({sysname="Linux", nodename="pqrxyz", ...}) = 0
readlink("/proc/self/exe", "/home/user/"..., 4096) = 45
brk(0x22661c0) = 0x22661c0
brk(0x2267000) = 0x2267000
mprotect(0x4bd000, 12288, PROT_READ) = 0
fstat(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x5), ...}) = 0
write(1, "Hello, World!", 13Hello, World!) = 13
exit_group(0) = ?
+++ exited with 0 +++

We see a bunch of calls that were not there before. Let’s try and break down what exactly is going on here.

open("/proc/self/exe", O_RDONLY)        = 3

This gets a file descriptor for the currently running process’s binary. The next two calls to mmap allocate 308235 bytes of anonymous memory and then map 307874 bytes of the binary’s file data over it.

readlink("/proc/self/exe", "/home/user"..., 4095) = 45

Reads the actual path of the executable into memory. We then see some more calls to mmap and mprotect.

At a glance it looks like a blob of data around the size of our original unpacked ELF has been written to memory starting at address 0x400000.
Finally the original file descriptor is closed and munmap is called on the copy of the packed binary that was mapped into memory at the start.
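To make the layout at 0x400000 a bit more concrete, we can do the address arithmetic on the numbers from the strace output: one PROT_NONE reservation of 802816 bytes, followed by five MAP_FIXED mappings. The sketch below only checks that each mapping ends inside the reservation; the function name segment_report is made up for illustration.

```shell
# Hypothetical helper: prints each mapping's range and whether it fits
# inside the region reserved by the PROT_NONE mmap.
segment_report() {
    base=$((0x400000))      # address of the PROT_NONE reservation
    reserved=802816         # its length in bytes
    limit=$((base + reserved))
    printf 'reserved: 0x%x-0x%x\n' "$base" "$limit"

    # start:length pairs copied from the MAP_FIXED mmap calls in the trace
    for seg in 0x400000:1304 0x401000:603617 0x495000:157093 \
               0x4bd000:21008 0x4c3000:2432; do
        start=$(( ${seg%%:*} ))
        len=${seg##*:}
        end=$((start + len))
        if [ "$end" -le "$limit" ]; then status=inside; else status=OUTSIDE; fi
        printf '0x%x-0x%x %s\n' "$start" "$end" "$status"
    done
}

segment_report
```

All five mappings should land inside the reserved range, which fits the idea that the stub first reserves address space for the whole original image and then fills it in segment by segment.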

So in summary, at a high level it looks like the packed program maps a copy of itself into memory. Execution then jumps to the new memory region and the program unpacks itself to 0x400000, overwriting the original data. It then closes the file descriptor and unmaps the temporary copy. Presumably it then jumps to the unpacked code and begins executing.

So at the point of the call to munmap we would expect that the unpacking is finished, the process address space contains the unpacked binary, and we are close to jumping to the original entry point. Let’s open the thing in radare and take a look!

Open the packed elf with radare2 in debug mode:

$ r2 -d ./hello_upx 
Process with PID 33060 started...
= attach 33060 33060
bin.baddr 0x00400000
Using 0x400000
asm.bits 64
-- Hang in there, Baby!
[0x0044a970]>

Running aaa and pdf to analyze the binary and print the function at the current entry point does not tell us very much.

From the strace output we saw about 17 syscalls before the call to munmap. Running 15 dcs continues execution through 15 syscalls, and we then run dcs a couple more times until we reach munmap.
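Rather than counting by eye, the number of syscalls between execve and the first munmap can be pulled out of a saved trace. This is a rough sketch: it assumes the trace was written with strace -o to a file I am calling packed.log (a made-up name), and that every call sits on its own line as in the output above. The function name is also made up.

```shell
# Hypothetical helper: reads strace output on stdin and prints how many
# syscalls appear after execve and before the first munmap.
syscalls_before_munmap() {
    awk -F'(' '
        $1 == "execve" { next }      # the debugger hands us control after execve
        $1 == "munmap" { exit }      # stop counting at the first munmap
        /^[a-z_0-9]+\(/ { n++ }      # any other line that looks like a call
        END { print n + 0 }
    '
}

# Assumed workflow: strace -o packed.log ./hello_upx
if [ -f packed.log ]; then
    syscalls_before_munmap < packed.log
fi
```

For the trace shown earlier this gives 17, which is where the "15 plus a couple more" comes from.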

Using v to switch to visual mode, we see we are about to pop a value off the stack and then return to the address 0x00401bc0, which is indeed our OEP (original entry point).
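Since we still have the original binary in this exercise, one way to cross-check that address is to read the entry point from the ELF header of the unpacked hello. The sketch below assumes readelf from binutils is installed; the helper name entry_point is made up for illustration.

```shell
# Hypothetical helper: reads `readelf -h` output on stdin and prints the
# value of the "Entry point address" field.
entry_point() {
    awk '/Entry point address:/ { print $NF }'
}

# Only attempt the lookup if readelf and the original binary are available.
if command -v readelf >/dev/null 2>&1 && [ -f ./hello ]; then
    readelf -h ./hello | entry_point
fi
```

If the address radare2 shows really is the OEP, the two values should agree. On a real target where the original is not available, this check is of course not possible.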

After stepping a couple of times, we are at the start of execution of the original executable.

If you just want to analyze the unpacked code, that’s it. We can pretty much stop here: the executable is unpacked in memory (although stripped) and we are at the OEP.

However, if we want to reconstruct an executable file, we must dump the various regions of the process’s memory and piece them back together.

That is covered in part 2.
