REMnux v6 for Malware Analysis (Part 1): VolDiff

Introduction

As you may have heard, Lenny Zeltser recently released version 6 of his popular REMnux malware analysis Linux distribution. I’m a big fan of REMnux because it reduces some of the overhead associated with malware analysis. Rather than spending hours downloading software, installing tools, and navigating through dependency hell, this distribution gives you access and exposure to numerous tools quickly. Once you see the value of a tool for yourself, you can then dive into the code and configuration files to develop a deeper understanding of its inner workings and customize it to your needs.

This is the first in a series of posts where I will highlight my favorite new additions to REMnux and why you should include them in your malware analysis process.

VolDiff

One quick, effective approach to assessing a suspicious file is to capture a snapshot of system activity, execute the file, capture another snapshot, and then compare the two system states to determine the impact of execution. The popular regshot tool uses this approach to log registry and file system changes after an event like double-clicking malware. VolDiff, included in REMnux v6, allows us to perform similar analysis against memory dumps. Developed by @aim4r, VolDiff is a Python script that uses the Volatility memory analysis framework to analyze two memory dumps and output the differences between them. When applied to memory analysis, this script will focus your attention on memory artifacts generated after, and possibly as a result of, code execution. This can expedite your analysis of large memory dumps to detect activity such as code injection and provide visibility into packed or obfuscated code. However, keep in mind that memory is in a state of flux, so changes included in the diff results are not necessarily caused by executing the suspect file.
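The snapshot-and-compare idea can be sketched in a few lines of Python. This is only a minimal illustration of diffing two text outputs, not VolDiff's actual implementation; the function name new_lines and the simplified process listings are hypothetical.

```python
from typing import List


def new_lines(baseline_output: str, infected_output: str) -> List[str]:
    """Return lines found in the infected output but not in the baseline.

    Order follows the infected output, and duplicates are reported once.
    """
    baseline = {line.strip() for line in baseline_output.splitlines() if line.strip()}
    seen = set()
    result = []
    for line in infected_output.splitlines():
        line = line.strip()
        if line and line not in baseline and line not in seen:
            seen.add(line)
            result.append(line)
    return result


if __name__ == "__main__":
    # Hypothetical, simplified process listings (name and PID only).
    before = "System 4\nexplorer.exe 1234\n"
    after = "System 4\nexplorer.exe 1234\nsvchost.exe 2976\n"
    for line in new_lines(before, after):
        print(line)
```

A plain line comparison like this treats any changed field as a new line, which is one reason diff results still need an analyst's judgment.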

VolDiff resides on REMnux in /opt/remnux-scripts/, but you can run it from anywhere since this location is included in the PATH environment variable. If you do not have the latest version of VolDiff (v2.1 at the time of this writing), you can update your remnux-scripts directory by running the commands sudo apt-get update and then sudo apt-get install remnux-scripts.

An Example with VolDiff

Let’s explore the value of VolDiff with a malware sample. I used a file named funfile.exe, and if you want to follow along, you can download the sample here (password: infected).

A few words about my test environment: since it’s advisable to perform malware analysis within a virtual machine, I used VMware Fusion. For this analysis, I started 1) a REMnux v6 VM with host-only networking and 2) a 32-bit Windows 8.1 VM with host-only networking. Within the Windows VM, I configured the Default Gateway and Preferred DNS Server with REMnux’s IP address. Lastly, I verified connectivity by pinging each host from the other. Note: If you have dedicated more than a couple of GB of memory to your Windows VM, consider decreasing this value, or you may wait hours for VolDiff processing to complete.

Some initial behavioral analysis indicated that this sample generated network traffic to an IP address. Since our goal is to assess memory artifacts, I chose to launch several “fake” services in REMnux to encourage activity. Specifically, I ran the following from a REMnux terminal:

  • accept-all-ips start: This bash shell script written by Lenny Zeltser redirects network traffic destined for any IP address to the REMnux VM.
  • inetsim: This tool simulates a variety of network services, including HTTP, HTTPS, FTP, and SMTP. If my suspect file expected to contact a web server, for example, I wanted it to do so to facilitate additional activity.

To compare memory dumps using VolDiff, we need to capture a memory image before and after infecting a sacrificial host. With VMware, one approach to obtaining a memory image is to use the snapshot feature. Whenever a snapshot is created, VMware saves a “.vmem” file that includes the contents of memory at the time the snapshot was created. This file can then be analyzed using a memory analysis tool like Volatility. To create the memory dumps VolDiff requires, I followed these steps:

  • I copied funfile.exe to the Windows VM desktop.
  • I created a VM snapshot and noted the new “.vmem” file name on my host.
  • In the Windows VM, I right-clicked funfile.exe and selected “Run as administrator” to execute the sample with admin rights.
  • After giving the sample a couple minutes to run, I created another VM snapshot and noted this second “.vmem” file name.

I then copied these files into REMnux for analysis. While there are several ways to do this, I chose to start the SSH server on REMnux and SCP the files into the VM. To ensure I did not confuse the two “.vmem” files, I renamed my baseline file to “baseline.vmem” and my second snapshot to “infected.vmem”.

To kick off VolDiff against my two memory dumps, I ran the command shown below. Note that the command requires the correct OS profile for the memory images.

Figure 1: VolDiff command to compare two memory dumps

VolDiff processed my two 2 GB memory images in about 45 minutes. The result was a directory of output, but the critical file to review is VolDiff-report.txt. This file contained the key differences between the two memory dumps. My entire output file can be viewed here, but let’s discuss some excerpts.

Figure 2: VolDiff malfind results

The output above shows new malfind results. The malfind Volatility plugin helps identify injected code, and in this case it discovered a suspicious memory segment within the svchost.exe process with PID 2976. Looking at the ASCII representation of the first few bytes of this segment, you may recognize the “MZ” string. This likely indicates we are looking at injected, executable code. It’s important to note that running malfind does sometimes result in hits even on a clean system; running the malfind plugin against my baseline image produced one hit. However, VolDiff’s diff operation focused my efforts only on new activity.
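To make the “MZ” observation concrete, here is a minimal sketch of how you might test whether a dumped memory segment looks like a Windows executable. It assumes you have already extracted the segment's bytes to a buffer; the function name looks_like_pe is hypothetical. It checks the DOS “MZ” magic and then follows the e_lfanew field at offset 0x3C to look for the “PE\0\0” signature. A match is only a hint, not proof of injection.

```python
import struct


def looks_like_pe(data: bytes) -> bool:
    """Return True if data starts with 'MZ' and points to a 'PE\\0\\0' signature."""
    # The DOS header is 0x40 bytes long and begins with the 'MZ' magic.
    if len(data) < 0x40 or data[:2] != b"MZ":
        return False
    # e_lfanew: 4-byte little-endian offset of the PE header, stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    # An out-of-range offset yields a short slice, which fails the comparison.
    return data[pe_offset:pe_offset + 4] == b"PE\x00\x00"


if __name__ == "__main__":
    # Synthetic buffer built for illustration only: DOS header plus PE signature.
    sample = b"MZ" + b"\x00" * 0x3A + struct.pack("<I", 0x40) + b"PE\x00\x00"
    print(looks_like_pe(sample))
```

Memory-resident code is sometimes stored with its headers wiped, so a segment that fails this check is not necessarily benign.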

Let’s look at some more output:

Figure 3: VolDiff netscan results

The output above shows new netscan entries. The netscan Volatility plugin locates network artifacts in memory. Running this plugin against my “infected.vmem” alone revealed 57 connection artifacts. Since VolDiff highlights changes in the victim system’s state, it trimmed my analysis data set to only two connections, one of which I have included above. This output clearly shows that the suspicious svchost.exe (based on malfind output) established a TCP connection over port 443.
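One way to picture the correlation between the malfind and netscan findings is to filter the new connection rows by the PIDs already flagged as suspicious. The sketch below assumes the rows have already been parsed into dictionaries. The field names, the function name connections_for_pids, and the sample rows (which use documentation-reserved IP addresses) are hypothetical and are not taken from the report.

```python
from typing import Dict, Iterable, List, Set


def connections_for_pids(rows: Iterable[Dict[str, object]],
                         suspicious_pids: Set[int]) -> List[Dict[str, object]]:
    """Keep only the connection rows owned by one of the suspicious PIDs."""
    return [row for row in rows if row["pid"] in suspicious_pids]


if __name__ == "__main__":
    # Made-up rows for illustration; only PID 2976 comes from the article.
    new_rows = [
        {"proto": "TCPv4", "remote": "192.0.2.10:443", "pid": 2976, "owner": "svchost.exe"},
        {"proto": "UDPv4", "remote": "*:*", "pid": 1100, "owner": "explorer.exe"},
    ]
    for row in connections_for_pids(new_rows, {2976}):
        print(row["owner"], row["pid"], row["remote"])
```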

VolDiff also includes a --malware-checks option to look for anomalous activity in an infected memory dump. You can run this option against a single memory dump if you do not have a baseline, or you can simply add it to the command line to both perform a diff and check the infected memory dump for potentially malicious behavior:

Figure 4: VolDiff --malware-checks option

Much of the output mirrors the earlier VolDiff-report.txt, but it includes additional checks that compare the infected memory dump against characteristics of a known-good Windows system. You can view the entire output file here, but let’s look at one example included in the report:

Figure 5: VolDiff --malware-checks result excerpt

In this case, VolDiff indicates that svchost.exe is running in an unexpected session. Session 0 is reserved for system processes and services, and a legitimate svchost.exe process should be running in that session. However, the svchost.exe with PID 2976 is running in session 1, which is associated with a user session. In this way, VolDiff goes beyond simply diffing two memory snapshots and includes built-in heuristics to identify potentially malicious activity. At its core, this is an even more powerful diff operation, because it relies on certain absolutes (e.g., a legitimate svchost.exe always runs in session 0) and makes no assumptions about the state of your baseline image.
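The session rule is simple enough to express as code. The following is a minimal sketch of that single heuristic, assuming the process list has already been parsed into records; it is not how VolDiff implements its checks. The Process type, the function name, and every sample record except PID 2976 in session 1 are hypothetical.

```python
from typing import Iterable, List, NamedTuple


class Process(NamedTuple):
    name: str
    pid: int
    session: int


def svchost_outside_session_zero(processes: Iterable[Process]) -> List[Process]:
    """Flag svchost.exe instances that are not running in session 0."""
    return [
        proc for proc in processes
        if proc.name.lower() == "svchost.exe" and proc.session != 0
    ]


if __name__ == "__main__":
    # Illustrative records; only PID 2976 in session 1 comes from the article.
    listing = [
        Process("svchost.exe", 700, 0),
        Process("svchost.exe", 2976, 1),
        Process("explorer.exe", 1500, 1),
    ]
    for proc in svchost_outside_session_zero(listing):
        print(proc.name, proc.pid, "session", proc.session)
```

Because the rule depends only on the infected image, it works even when no baseline dump is available.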

In case you’re wondering, this sample has a 39/53 detection rate on VirusTotal. Microsoft identifies it as a Win32/Tofsee variant, a spambot that is commonly spread via email. As we suspected, it launches and injects executable code into svchost.exe and attempts to connect to IP addresses for command and control.

Closing Thoughts

Diffing two system states is a powerful malware analysis technique because it shines a spotlight on new activity. VolDiff, included in REMnux v6, uses this approach to focus your analysis on memory artifacts most likely associated with code execution. I encourage you to explore this and other REMnux v6 tools on your own, or join me at the upcoming FOR610 Reverse-Engineering Malware course in Virginia Beach this August.

-Anuj Soni


About the Author:
Anuj Soni is a Senior Threat Researcher at Cylance, where he performs malware research and reverse engineering. He is also a SANS Certified Instructor and co-author of the course FOR610:Reverse-Engineering Malware. If you would like to learn more about malware analysis strategies, join him at an upcoming SANS FOR610 course.

Key Questions to Guide Malware Analysis

Introduction

Performing malware analysis during incident response can be an exciting, creative exercise. But it can also be a nebulous process, with no clear beginning and no defined end state. While there are numerous articles, books, and tools that cover the topic, the sheer volume of resources can sometimes lead to decision fatigue, and the question becomes: What do I do next? To focus your attention and guide your analysis, begin by answering the four key questions set forth below.

Ask Yourself

To be clear, my intent is not to create a comprehensive list of questions, but to highlight the ones that will yield the most value. If you work on answering these questions first, you will stay on task, make real progress, and better understand the next few steps in the context of your specific incident.

1) What are the artifacts of execution?

This question will fuel your static and behavioral analysis of the sample. Your precise goal is to document activity on the file system, in memory, and across the network. This includes launched processes, created and deleted files, modified registry entries, and command and control network traffic. Assume the malware sample has the highest level of privilege on your network and has access to all the local and online resources it needs. Also, consider interacting with your analysis environment by launching enterprise applications, browsing to common sites, and rebooting the machine to facilitate activity. Be sure to record any activity you observe.

2) What is the potential impact of code execution?

This question expands upon the first question by requiring you to dig deeper and piece together observed activity to determine functionality and purpose. For example, perhaps you observed a file created during behavioral analysis. You must now determine what this file is used for. Does it log keystrokes? Perhaps it stores encoded configuration data that the malware relies upon. Answering this question often requires iterative testing (an important reason to use virtual machines and create snapshots throughout analysis). Reaching a solution may be as simple as completing a few Google searches or may involve more complex code analysis.

3) What is the potential impact of code execution in your environment?

Notice the difference between this question and the second question. While understanding the absolute potential impact is important, in the context of an enterprise security incident, your management is most concerned about the impact on the corporate network. Becoming equipped with the answer to this question and differentiating it from the previous one is a distinguishing factor between a shining, proficient malware analyst and a reckless one.

4) What are the sample’s key host and network indicators of compromise (IOCs)?

Review your artifacts of execution, including registry keys, file names and locations, hashes, C&C specifics, and strings to highlight key information which will allow you to seek out similar activity across the network. This information will not only expedite the detection of other compromised machines on the network, but it will also feed into generating valuable threat intel for your organization and any information sharing partners.
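File hashes are usually the quickest indicator to produce. Here is a minimal sketch that computes MD5 and SHA-256 for a sample using Python's standard hashlib module; the function name file_hashes and the choice of these two algorithms are assumptions for illustration, so substitute whichever hashes your detection tools expect.

```python
import hashlib
from typing import Dict


def file_hashes(path: str, chunk_size: int = 1024 * 1024) -> Dict[str, str]:
    """Compute MD5 and SHA-256 of a file, reading it in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return {"md5": md5.hexdigest(), "sha256": sha256.hexdigest()}
```

Calling file_hashes("funfile.exe") returns a dictionary you can record alongside the file names, registry keys, and network indicators gathered during analysis.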

Final Thoughts

As you try to answer these questions, remember that malware analysis is an iterative process. You may not answer each question in totality with 100% certainty before moving on to the next. This is why performing malware analysis is similar to the practice of an art: not because it is indescribable or intangible (these are usually symptoms of a poor process), but because it requires patience and the discipline to know what approaches are working, which ones are not, when you need to start over, and when you need to step away and refocus on the problem at another time.

Clearly, there are other important questions to answer, but investigating the ones listed above will get you moving in the right direction, and answering them as quickly as possible will arm you with the information management often needs once an incident kicks off.

-Anuj Soni

