Because I wanted to learn more about reverse engineering, I worked through the MalwareTech Reversing Challenges and wrote up my solutions.

Strings 1

Opening the binary in Ghidra shows the following disassembly of the function entry.

image-20200223212334261

The memory address on the stack points to the flag FLAG{CAN-I-MAKE-IT-ANYMORE-OBVIOUS}.

image-20200223212636518

Strings 2

Opening the binary in Ghidra shows the following disassembly of the function entry. The flag is pushed onto the stack letter by letter, which spells out FLAG{STACK-STRINGS-ARE-BEST-STRINGS}.

image-20200223214620536
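
To make the idea of a stack string concrete, here is a minimal sketch that reassembles one from raw instruction bytes. It assumes that every character is written with MOV byte ptr [EBP + disp8], imm8 (encoded as C6 45 disp8 imm8). The example bytes and the name extract_stack_string are made up for illustration and are not taken from the challenge binary.

```python
def extract_stack_string(code):
    """Collect the immediates of `MOV byte ptr [EBP + disp8], imm8` instructions.

    Assumption: every character is written with the encoding C6 45 disp8 imm8.
    This is a naive byte scan, not a real disassembler.
    """
    chars = {}
    i = 0
    while i + 3 < len(code):
        if code[i] == 0xC6 and code[i + 1] == 0x45:
            # disp8 is a signed offset relative to EBP.
            disp = code[i + 2] - 0x100 if code[i + 2] >= 0x80 else code[i + 2]
            chars[disp] = code[i + 3]
            i += 4
        else:
            i += 1
    # Lower stack offsets hold the earlier characters of the string.
    return "".join(chr(chars[d]) for d in sorted(chars))


# Hypothetical bytes: four MOVs that write "FLAG" to [EBP-0x28]..[EBP-0x25].
example = bytes.fromhex("c645d846 c645d94c c645da41 c645db47")
stack_string = extract_stack_string(example)
print(stack_string)
```

A byte scan like this can produce false matches on real code, so reading the disassembly in Ghidra remains the more reliable route.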

Strings 3

The pseudo-code of the function entry is shown below.

image-20200225212630630

The API call FindResource determines the location of the resource identified by rc.rc. The flag is loaded from the resource section by the identifier 272 (0x110) using LoadStringA, which Ghidra resolves in the disassembly view based on references.

image-20200225195733062

image-20200225213643139

FLAG{RESOURCES-ARE-POPULAR-FOR-MALWARE}
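
As a side note on the identifier 272: Windows stores string resources in blocks of 16 strings, so a string ID maps to a string table block and an index inside that block. The sketch below only shows that arithmetic; the function name string_table_location is made up for illustration.

```python
def string_table_location(string_id):
    """Map a string resource ID to (string table block ID, index within the block)."""
    block_id = (string_id >> 4) + 1   # 16 strings per block, block IDs start at 1
    index = string_id & 0x0F          # position of the string inside its block
    return block_id, index


block_id, index = string_table_location(0x110)
print(block_id, index)
```

For ID 272 (0x110) this gives block 18, index 0, which is where a resource viewer would be expected to list the string.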

Turn on WindowsResourceReference so that Ghidra resolves the string in the disassembly view.

image-20200225195817403

Shellcode 1

The entry function shows a string that’s moved to the heap.

image-20200306234357029

After moving the string to the heap, the entry function copies a few bytes (which I renamed to shellcode) into memory and calls them like a function with CALL dword ptr [EBP + shellcode_in_virtualmem].

image-20200306234442778

The shellcode bytes are shown in the disassembly view and byte viewer of Ghidra:

image-20200306215907716

To view the bytes as disassembly, select the bytes -> right-click -> Clear Code Bytes -> Disassemble. Ghidra then shows the bytes as instructions.

image-20200306225420943

The shellcode loops over the string and rotates each byte left by 5 bits with the rol instruction (https://www.aldeid.com/wiki/X86-assembly/Instructions/rol). Because the challenge did not allow debugging, I recreated the function in Python.

# Encoded flag bytes copied from the binary.
string = [0x32, 0x62, 0x0a, 0x3a, 0xdb, 0x9a, 0x42, 0x2a, 0x62, 0x62, 0x1a, 0x7a, 0x22, 0x2a, 0x69, 0x4a, 0x9a, 0x72, 0xa2, 0x69, 0x52, 0xaa, 0x9a, 0xa2, 0x69, 0x32, 0x7a, 0x92, 0x69, 0x2a, 0xc2, 0x82, 0x62, 0x7a, 0x4a, 0xa2, 0x9a, 0xeb]

# Rotate val left by r_bits within a value that is max_bits wide.
# https://www.falatic.com/index.php/108/python-and-bitwise-rotation
rol = lambda val, r_bits, max_bits=8: \
    (val << r_bits%max_bits) & (2**max_bits-1) | \
    ((val & (2**max_bits-1)) >> (max_bits-(r_bits%max_bits)))

# The shellcode rotates every byte left by 5 bits.
flag = ""
for s in string:
    flag += chr(rol(s, 5))

print(flag)

python flag.py
FLAG{SHELLCODE-ISNT-JUST-FOR-EXPLOITS}
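
As a sanity check on the rotation, here is a small sketch of 8-bit rol together with its inverse ror. The helper names rol8 and ror8 are my own; the only values taken from the challenge are the first encoded byte 0x32 and the rotation count 5.

```python
def rol8(value, count):
    """Rotate an 8-bit value left by count bits."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF


def ror8(value, count):
    """Rotate an 8-bit value right by count bits (the inverse of rol8)."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF


first_char = chr(rol8(0x32, 5))    # first encoded byte of the string
encoded_again = ror8(ord("F"), 5)  # rotating right restores the encoded byte
print(first_char, hex(encoded_again))
```

Rotating the first encoded byte 0x32 left by 5 gives 0x46, the letter F, and rotating F right by 5 gives 0x32 again.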

Shellcode 2

The function entry starts with a few characters that are moved to the stack. It then creates a pointer to the heap and moves four values to it.

  • heap_pointer[0] = LoadLibrary
  • heap_pointer[1] = GetProcAddress
  • heap_pointer[2] = flag
  • heap_pointer[3] = 0x24

image-20200413222814614

After that, it copies shellcode from the data segment to memory and executes the shellcode with the heap_pointer as the first argument.

image-20200413224415880

Let’s analyze the shellcode to retrieve the flag. To make the shellcode easier to analyze in Ghidra, I selected it -> right-click -> Create Function. After doing this, Ghidra names the characters that are moved to the stack, which comes in handy when naming the variables later on. Here is the disassembly view of the string msvcrt.dll being moved to the stack. The strings below are built the same way.

  • msvcrt.dll
  • kernel32.dll
  • fopen
  • fread
  • fseek
  • fclose
  • GetModuleFileNameA
  • rb

Even without looking at the rest of the shellcode this looks suspicious, because kernel32.dll is already loaded into every process on Windows. The shellcode is trying to use library functions without them showing up in the imports of the executable.

image-20200413225653313

After moving the strings to the stack, the shellcode uses them to import library functions. First it gets the base addresses of msvcrt.dll and kernel32.dll with LoadLibrary. With those module addresses it then uses GetProcAddress to get the addresses of the functions fopen, fseek, fread and fclose in memory.

image-20200414203221362

After the function addresses are stored in memory, the path of the currently running executable is written to file_name using GetModuleFileNameA. The file is then opened, fseek moves to offset 0x4e and fread reads 0x26 bytes from the file. These bytes are the string This program cannot be run in DOS mode.

image-20200414203347328

To get the string from the file I recreated the instructions in Python:

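# Read the 0x26 bytes at offset 0x4e of the executable, like the shellcode does.
with open("shellcode2", "rb") as file:
    file.seek(0x4e)
    string = file.read(0x26)

print(string.decode("ascii"))

python .\flag.py
This program cannot be run in DOS mode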

Before executing a while loop, a string is moved from the heap_pointer to the register EDI.

The loop executes the instruction xor string[EDX], file_content[EDX] and increments the value in register EDX until EDX equals 0x24.

image-20200414203434568

To get the flag from the file I recreated the instructions in Python:

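# The key is the DOS stub message read from the executable itself.
with open("shellcode2", "rb") as file:
    file.seek(0x4e)
    file_content = bytearray(file.read(0x26))
print(file_content.decode("ascii"))

# Encoded flag bytes referenced by heap_pointer[2].
string = [0x12, 0x24, 0x28, 0x34, 0x5b, 0x23, 0x26, 0x20, 0x35, 0x37, 0x4c, 0x28, 0x76, 0x26, 0x33, 0x37, 0x3a, 0x27, 0x3d, 0x6e, 0x25, 0x48, 0x6f, 0x3c, 0x58, 0x3a, 0x68, 0x2c, 0x43, 0x73, 0x10, 0x0e, 0x10, 0x6b, 0x10, 0x6f]

flag = ""

# XOR the first 0x24 bytes pairwise; indexing a bytearray yields integers on Python 2 and 3.
for i in range(0x24):
    flag += chr(string[i] ^ file_content[i])

print(flag)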
PS C:\Users\re\Documents\shellcode2> python .\flag.py
This program cannot be run in DOS mode
FLAG{STORE-EVERYTHING-ON-THE-STACK}
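
The same decoding can be sketched without the executable on disk. This variant assumes that the bytes at offset 0x4e are the standard DOS stub message, as the output above suggests, and hardcodes it as the XOR key. The names DOS_STUB_MESSAGE, encoded and decoded are my own.

```python
# Assumption: the 0x26 bytes read from the file are the usual DOS stub message.
DOS_STUB_MESSAGE = b"This program cannot be run in DOS mode"

# The 0x24 encoded bytes from the listing above.
encoded = bytes([
    0x12, 0x24, 0x28, 0x34, 0x5b, 0x23, 0x26, 0x20, 0x35, 0x37, 0x4c, 0x28,
    0x76, 0x26, 0x33, 0x37, 0x3a, 0x27, 0x3d, 0x6e, 0x25, 0x48, 0x6f, 0x3c,
    0x58, 0x3a, 0x68, 0x2c, 0x43, 0x73, 0x10, 0x0e, 0x10, 0x6b, 0x10, 0x6f,
])

# zip() stops after the 0x24 encoded bytes, like the loop bound in the shellcode.
decoded = bytes(e ^ k for e, k in zip(encoded, DOS_STUB_MESSAGE))

# The last decoded byte is a NUL terminator, so cut the string there.
flag = decoded.split(b"\x00")[0].decode("ascii")
print(flag)
```

If the binary carried a modified DOS stub, the hardcoded key would be wrong, so reading the bytes from the file as in the script above is the safer default.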

VM1

The entry function starts by copying 0x1fb bytes from the data segment to the heap and then calls unkown_function.

image-20200415202851791

The function creates 3 variables arg_1, arg_2 and arg_3 based on the heap_pointer at offset 0xff.

image-20200416223303489

The values moved to the variables are moved from the memory address [EDX + ECX * 0x1 + 0xff].

  • EDX=heap_pointer=0x404040
  • ECX=count=0
  • 0x404040 + 0xff = 0x40413f

Looking at the binary at the address 0x40413f shows the following.

image-20200415235633013

Comparing the string_offset in the binary with the file ram.bin shows that they both contain the same values. ram.bin is a copy of the memory from when vm1 was executed.

image-20200415235954903

After creating the variables FUN_00402270 is called with the variables as arguments.

image-20200416223346070

Based on the value of arg_1 it jumps to a specific memory address labelled with the value.

image-20200416223517276

image-20200416223600752

To better understand what the function does, I recreated it in Python. Instead of opening the executable, the script opens ram.bin, which was provided with the challenge.

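# ram.bin is the memory dump that was provided with the challenge.
with open("ram.bin", "rb") as file:
    heap_pointer = bytearray(file.read(0x1fb))

counter = 0  # offset of the current instruction
random = 0   # the single register of the VM

while True:
    # Every instruction is three bytes long; the code starts at offset 0xff.
    arg_1 = heap_pointer[counter + 0xff]
    arg_2 = heap_pointer[1 + counter + 0xff]
    arg_3 = heap_pointer[2 + counter + 0xff]

    if arg_1 == 1:    # store the immediate arg_3 at address arg_2
        heap_pointer[arg_2] = arg_3
    elif arg_1 == 2:  # load the byte at address arg_2 into the register
        random = heap_pointer[arg_2]
    elif arg_1 == 3:  # XOR the byte at address arg_2 with the register
        heap_pointer[arg_2] ^= random
    elif arg_1 > 3:   # opcodes above 3 stop the VM
        break

    counter += 3

print(heap_pointer)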
PS C:\Users\re\Documents\vm1> python.exe .\flag.py
FLAG{VMS-ARE-FOR-MALWARE}
[...]
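
Since ram.bin is not included here, the following sketch runs the same three opcodes on a tiny hand-written program. The memory layout (code at offset 16), the bytecode and the name run_vm are invented for illustration; only the meaning of opcodes 1, 2 and 3 follows the script above.

```python
def run_vm(memory, code_offset):
    """Interpret three-byte instructions stored inside memory itself."""
    pc = 0
    register = 0
    while code_offset + pc + 2 < len(memory):
        base = code_offset + pc
        opcode, operand_1, operand_2 = memory[base:base + 3]
        if opcode == 1:      # store the immediate operand_2 at address operand_1
            memory[operand_1] = operand_2
        elif opcode == 2:    # load the byte at address operand_1 into the register
            register = memory[operand_1]
        elif opcode == 3:    # XOR the byte at address operand_1 with the register
            memory[operand_1] ^= register
        elif opcode > 3:     # opcodes above 3 stop the VM
            break
        pc += 3
    return memory


# Hypothetical memory image: 16 data bytes followed by 16 code bytes.
memory = bytearray(32)
memory[0] = ord("H") ^ 0x2A   # "HI" stored XOR-encoded with the key 0x2A
memory[1] = ord("I") ^ 0x2A
program = [
    1, 8, 0x2A,   # memory[8] = 0x2A
    2, 8, 0,      # register = memory[8]
    3, 0, 0,      # memory[0] ^= register
    3, 1, 0,      # memory[1] ^= register
    4, 0, 0,      # stop
]
memory[16:16 + len(program)] = bytes(program)

result = run_vm(memory, code_offset=16)
print(result[:2].decode("ascii"))
```

The length check in the loop condition is an addition of mine so that the sketch cannot read past the end of the memory image.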

Ransomware

The entry function pushes two zeros onto the stack and calls FUN_00401000. These are the 2 function arguments; here they contain the value 0, but normally the call would be FUN_00401000(filename_unencrypted, encryption_key).

image-20200418144149860

The function starts by creating a new filename called filename_encrypted. It opens filename_unencrypted and copies the content to buffer_file1. In short: filename_unencrypted -> encryption_routine -> filename_encrypted.

image-20200418194821084

The function loops over the bytes of the file and XORs each byte with a key byte: buffer_file[i] ^= key[i % 0x20] (i = byte index). The key is at most 0x20 bytes long, because the key index is the loop counter modulo 0x20 (the division by 0x20 that involves EDX).

image-20200418200245892

I don’t have access to the key, but the challenge provided a few images (JPG) that were encrypted with this binary. Because the encryption method uses XOR, we can calculate the key from an encrypted file and the expected content of its unencrypted original: key = encrypted file ^ expected content of the unencrypted file.
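
The known-plaintext idea can be sketched end to end with made-up data. Everything in this example (the key, the file contents and the names xor_with_key and recover_key) is invented for illustration; only the repeating 0x20-byte XOR scheme follows the description above.

```python
KEY_LENGTH = 0x20


def xor_with_key(data, key):
    """XOR every byte with the key, which repeats every KEY_LENGTH bytes."""
    return bytes(b ^ key[i % KEY_LENGTH] for i, b in enumerate(data))


def recover_key(encrypted, known_plaintext):
    """ciphertext ^ plaintext = key, so the first KEY_LENGTH bytes are enough."""
    return bytes(e ^ p for e, p in zip(encrypted[:KEY_LENGTH],
                                       known_plaintext[:KEY_LENGTH]))


# Made-up stand-ins for the real key, a known picture and the secret file.
secret_key = bytes(range(0x41, 0x41 + KEY_LENGTH))
known_file = bytes(range(64))
secret_file = b"made-up secret message"

encrypted_known = xor_with_key(known_file, secret_key)
encrypted_secret = xor_with_key(secret_file, secret_key)

# With one plaintext/ciphertext pair the key falls out directly.
recovered_key = recover_key(encrypted_known, known_file)
decrypted = xor_with_key(encrypted_secret, recovered_key)
print(decrypted)
```

Because XOR is its own inverse, the same function encrypts and decrypts, so a single known plaintext/ciphertext pair of at least 0x20 bytes is enough to decrypt every other file.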

Later on, I figured out that the folder Sample Pictures is a standard folder in Windows 7, so I downloaded the pictures from archive.org.

image-20200418200845764

The script below opens the unencrypted file and XORs it with the encrypted file, which yields the encryption key. It then uses the key to decrypt the encrypted flag file flag.txt_encrypted.

# The first 0x20 bytes of the original picture and of its encrypted copy.
koala_unencrypted = bytearray(open("koala.jpg", "rb").read(0x20))
koala_encrypted = bytearray(open("Koala.jpg_encrypted", "rb").read(0x20))

# plaintext ^ ciphertext = key
key = []
for i in range(0x20):
    key.append(koala_encrypted[i] ^ koala_unencrypted[i])

with open("flag.txt_encrypted", "rb") as file:
    buffer_file = bytearray(file.read())

# Decrypt the flag with the recovered key, which repeats every 0x20 bytes.
flag = ""
for i in range(len(buffer_file)):
    flag += chr(buffer_file[i] ^ key[i % 0x20])

print(flag)
re@DESKTOP-S7VKEO8:/mnt/c/Users/re/Documents/ransomware1/EncryptedFiles/Documents$ python flag.py
FLAG{XOR-MAKES-KNOWN-PLAINTEXT-AND-FREQUENCY-ANALYSIS-EASY}
