Month: November 2018

Intro to Radare2 for Malware Analysis

Introduction

In recent years, a variety of inexpensive or free disassemblers and debuggers have gained serious momentum, including radare2 (a.k.a. “r2”), Cutter (GUI for radare2), Binary Ninja, Hopper, and x64dbg. If you have a license for IDA Pro and are happy with the experience, you may have little reason to explore other options. However, if you are still in the early stages of your career in malware analysis, or you are working with a small budget, you may not have access to this relatively pricey product. Regardless of your background, disassembler preference, or budgetary restrictions, each tool listed above provides a different reverse engineering experience, and each is worth trying once. At the very least, a test drive can clarify your preferences and bring an appreciation for the tool(s) you choose to use.

This post focuses on an initial workflow for performing static code analysis using radare2. Specifically, it will cover how to load a PE file into radare2, identify an imported API of interest, find a reference to the API, view assembly at that location, and begin to assess the code’s purpose.

Radare2 is a project that contains multiple tools for binary analysis, and radare2 (yep, same name) is the primary tool that performs disassembling and debugging. It is command line driven, which may be daunting after extensive use of other disassemblers that provide a GUI. In fact, the official R2 book depicts the level of effort required to learn radare2 like this:

learning_curve

Source: https://radare.gitbooks.io/radare2book/content/first_steps/intro.html

Let there be no doubt – learning how to use radare2 is complicated. However, the best way to tackle a difficult task is to get started.

Installing Radare2

Radare2 binaries and source for a variety of operating systems are available here. I used a 64-bit Windows VM environment for my analysis, so I downloaded and ran the appropriate binary. Specifically, I’m using the Windows VM we distribute in the SANS FOR610 Reverse Engineering Malware course, so you will see references to the “REM” user.

Analyzing a File with Radare2

Loading a binary

For this post, we will use a GandCrab ransomware sample. If you want to follow along, you can download the sample here (password: malware).

To load the file into radare2, simply type radare2 followed by the name of the file, as shown below.

load

We now have a radare2 shell waiting for additional commands. Notice the shell indicates we are at the address 0x004044bb, which is the entry point for this executable (more on that in a moment).

To navigate an executable within radare2, you will use text-based commands to initiate processing and query information. Along the way, using the question mark (“?”) will provide help about command options. For example, type a question mark ? and hit return. Below is an excerpt of the initial output.

question_mark.jpg

If you scroll down the output on your screen, you will find a reference to the i command:

i_option

The i command provides information about a file. For more detail on the type of information we can query, type i?:

i?.jpg

For example, typing ie will provide information about the executable’s entry point. Below is an excerpt of this command’s output.

ie

Notice the virtual address (“vaddr”) matches the address of our location within the radare2 shell, confirming that we currently reside at the executable’s entry point.
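To see where that “vaddr” value comes from, here is a minimal Python sketch that derives an entry point address directly from a PE header. It relies on the standard PE layout, in which the entry point's virtual address is the ImageBase plus the AddressOfEntryPoint field of the optional header. The function name pe_entry_point is my own for illustration; it is not part of radare2.

```python
import struct


def pe_entry_point(data: bytes) -> int:
    """Return the entry point virtual address of a PE image.

    The result is ImageBase + AddressOfEntryPoint, both read from the
    optional header.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")

    # e_lfanew (offset 0x3C in the DOS header) points to the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")

    # The optional header follows the 4-byte signature and the
    # 20-byte COFF file header.
    opt = pe_offset + 4 + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    (entry_rva,) = struct.unpack_from("<I", data, opt + 16)

    if magic == 0x10B:      # PE32: 4-byte ImageBase at offset 28
        (image_base,) = struct.unpack_from("<I", data, opt + 28)
    elif magic == 0x20B:    # PE32+: 8-byte ImageBase at offset 24
        (image_base,) = struct.unpack_from("<Q", data, opt + 24)
    else:
        raise ValueError("unexpected optional header magic")

    return image_base + entry_rva
```

Calling pe_entry_point(open(path, "rb").read()) on a PE file should yield the same kind of address that ie reports as vaddr, provided radare2 maps the file at the image base recorded in its header.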

Initiating code analysis

To begin our code analysis with radare2, we must first kick off some automated analysis. Depending upon your prior exposure to radare2, you may be surprised to know that, by default, radare2 does not perform any analysis at startup. Other disassemblers and debuggers like IDA Pro and x64dbg will automatically analyze the binary to identify functions, code, and data. The author of radare2 (pancake), however, takes a different approach. He details his case here, but the basic point is that radare2 aims to run on various platforms with varying levels of computing power, and it is capable of analyzing many different binary architectures. As a result, no single type of analysis suits every case, so it’s up to the analyst to determine what types of processing are relevant. While some may reject this approach, it forces the analyst to be more deliberate in their work. In fact, the entire radare2 experience reinforces this by requiring explicit commands to view information and navigate the code.

While we won’t discuss the details of all possible commands (see this resource), you can see options by typing aa? and aaa?. If you want to take a leap of faith and perform a variety of analyses against a file, I suggest using the aaa command. After executing this command, you will see a variety of output messages as radare2 analyzes the binary (excerpt below):

aaa

Viewing imports

Now that the initial auto analysis is complete, it’s time for us to manually navigate the code. One approach is to first find Windows API references that support malicious functionality. To view functions imported by the suspect binary, we can type ii:

ii.jpg

There are many APIs we could explore, but for this post, we will focus on CreateToolhelp32Snapshot (not shown above). This API is used to capture a snapshot of running processes on a system. Malware often uses this functionality to enumerate running processes and identify specific process names. To find this imported function in the import list, type ii~CreateToolhelp32Snapshot (the tilde searches the output of ii for the specified text):

ii_CreateTool
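The same import lookup can be scripted. The sketch below is written against r2pipe, radare2's scripting interface, whose cmdj method runs a command and parses its JSON output; iij is the JSON form of ii. The helper name find_imports is mine, and the "name" and "plt" keys are my assumption about the JSON layout, which may differ between radare2 versions.

```python
def find_imports(r2, needle):
    """Return (address, name) pairs for imports whose name contains `needle`.

    `r2` is any object with a cmdj() method, such as an r2pipe session.
    The "name" and "plt" keys are assumed; check the iij output of your
    radare2 version before relying on them.
    """
    imports = r2.cmdj("iij") or []
    return [
        (imp.get("plt"), imp.get("name"))
        for imp in imports
        if needle in imp.get("name", "")  # case-sensitive substring match
    ]


# Usage against a real binary (requires radare2 and `pip install r2pipe`):
#   import r2pipe
#   r2 = r2pipe.open("sample.exe")   # hypothetical file name
#   print(find_imports(r2, "CreateToolhelp32Snapshot"))
#   r2.quit()
```

Because the helper only needs an object with a cmdj method, it can be exercised with a stand-in session when radare2 is not installed.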

Finding an API reference

Next, we want to locate references to this API. To query this information, we will type the letters a (analysis), x (cross references), and t (find references to the specified address), followed by the address of the imported function:

axt

In the output above, we see references to two CALL instructions, which represent instructions that call CreateToolhelp32Snapshot. We could explore both references, but for this post we will only jump to the first reference address using the s (seek) command:

s_
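The cross-reference lookup and the seek can also be scripted with r2pipe. This is a minimal sketch, assuming analysis (for example aaa) has already been run in the session. The helper names call_sites and seek_and_print are mine, and the "from" and "type" keys are my assumption about the layout of the axtj JSON output, which may vary by radare2 version.

```python
def call_sites(r2, target):
    """Return the addresses of CALL instructions that reference `target`.

    `target` is an address (or flag name) given as a string, exactly as
    you would type it after axt. The "from" and "type" keys are assumed.
    """
    xrefs = r2.cmdj(f"axtj {target}") or []
    return [x["from"] for x in xrefs if x.get("type", "").upper() == "CALL"]


def seek_and_print(r2, addr, count=1):
    """Seek to `addr` and return `count` lines of disassembly as text."""
    r2.cmd(f"s {addr:#x}")          # s: seek to the address
    return r2.cmd(f"pd {count}")    # pd N: print N disassembly lines


# Usage against a real binary (requires radare2 and `pip install r2pipe`):
#   import r2pipe
#   r2 = r2pipe.open("sample.exe")       # hypothetical file name
#   r2.cmd("aaa")
#   sites = call_sites(r2, "0x...")      # address of the import, from ii
#   print(seek_and_print(r2, sites[0]))
```

Filtering on the reference type keeps only CALL instructions, so data references to the same import are ignored.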

Notice the address in the prompt changed to our destination address, a signal that we have arrived at the desired location. To confirm this, we can use the pd (print disassembly) command and its subcommands:

pd?

First, let’s make sure we reside at a CALL to CreateToolhelp32Snapshot using the command pd N, where N indicates number of disassembly lines to print (pardon the small text size to maintain formatting):

pd1

Notice the autogenerated comments in red are unhelpful in this case, but they are included for completeness.

Understanding the code

To view a summary of the function where we currently reside, we can type pds (print disassembly summary):

pds

As indicated by the help information, pds focuses on strings, calls, jumps, and references to provide an overview of the function. Looking at the above output, notice the APIs CreateToolhelp32Snapshot, Process32First, lstrcmpiA, and Process32Next. This progression of CALLs is often used to capture a snapshot of running processes, begin iterating through the list, compare process names to one or more predefined names, and continue through the list, respectively.

To understand precisely how these calls are used and what decision points are encountered, we need more information about the function. We could print the entire function body with the command pdf (print disassembled function), but this prints a rather large amount of output that you have to scroll through in the terminal.

One approach to evaluating the context of the CreateToolhelp32Snapshot CALL is to view the instructions that occur right before it. To view the 10 instructions before the CALL, we can type pd -10 (only the command and disassembly are shown due to space constraints):

pd-10

I mentioned earlier that when malware uses CreateToolhelp32Snapshot to capture and evaluate running processes, it often compares the process list snapshot to predefined values. The process name strings referenced in the instructions above are likely that predefined group. Considering this sample is ransomware, it makes sense that the malicious code would want to check for these process names, as they may have a lock on files worth encrypting.

Another approach to understanding the context of this code is to use radare2’s visual mode, which allows you to browse the assembly much as you would in a GUI-based disassembler. Type V to enter visual mode, and use the HJKL keys (similar to vi/vim) to navigate the code:

V

Since this approach allows for easy exploration of the code, it is my preferred method for code analysis. Typing a question mark will provide a help menu specific to this interface, although that output is not shown here.

Another perspective on this code and its flow of execution is achieved by entering graph mode with the command VV. Once in graph mode, the HJKL keys will allow you to browse the decision points that occur throughout the function. Below, the lstrcmpiW, OpenProcess, and TerminateProcess CALLs shown in graph mode provide insight into what happens if the program matches a process name against its predefined list.

VV

Specifically, if a string match is found, the program will access the target process via OpenProcess and then terminate it.
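The logic this function appears to implement can be modeled in a few lines of Python. This is a simulation of the control flow only, not the sample's code and not a call into the Windows API: the snapshot is a plain list of (pid, name) pairs, and the process names in the usage note are hypothetical placeholders rather than the sample's actual list.

```python
def processes_to_terminate(snapshot, targets):
    """Model the enumerate-compare-terminate loop described above.

    snapshot: iterable of (pid, exe_name) pairs, standing in for the
              entries walked with Process32First / Process32Next.
    targets:  predefined process names to look for.
    Returns the (pid, exe_name) pairs that would be opened and terminated.
    """
    wanted = {name.lower() for name in targets}
    matches = []
    for pid, exe_name in snapshot:
        # lstrcmpi performs a case-insensitive comparison, so compare
        # lowercase forms here.
        if exe_name.lower() in wanted:
            # In the sample, OpenProcess and TerminateProcess follow a match.
            matches.append((pid, exe_name))
    return matches
```

For example, given the snapshot [(4, "System"), (1200, "Example_DB.exe")] and the target list ["example_db.exe"], only the second entry is selected, because the comparison ignores case.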

Closing Thoughts

This post introduced radare2 and explored a basic workflow to load a binary and begin analysis. There is certainly much more to learn about radare2, but I hope this jumpstarts your journey.

For more information on radare2, I encourage you to explore these resources:

If you would like to learn more about malware analysis strategies, join me at an upcoming SANS FOR610 course.

-Anuj Soni / @asoni