SMDA: Static Malware Disassembly Analysis Library

2.5.3 · active · verified Sat Apr 11

SMDA is a minimalist recursive disassembler library optimized for accurate Control Flow Graph (CFG) recovery, particularly from memory dumps. Built upon Capstone, it currently supports x86/x64 Intel machine code. It processes arbitrary memory dumps (ideally with known base address) to output a structured collection of functions, basic blocks, and instructions, including their respective edges. The library is actively maintained, with the current stable version being 2.5.3.

Warnings

Install

Imports

Quickstart

This quickstart demonstrates how to use `smda` to disassemble a file. It creates a simple dummy binary, disassembles it, and then prints the detected functions and their instructions. The resulting `SmdaReport` object provides programmatic access to the disassembly results, which can also be converted to a dictionary for JSON output.

import os
from smda.Disassembler import Disassembler
from smda.common.SmdaReport import SmdaReport

# Create a dummy file for demonstration purposes
dummy_file_path = "dummy_binary.bin"
# A very simple x64 'ret' instruction (0xc3) as binary content
# In a real scenario, this would be a full executable or memory dump
dummy_binary_content = b"\xc3"

try:
    with open(dummy_file_path, "wb") as f:
        f.write(dummy_binary_content)

    # Initialize the disassembler
    disassembler = Disassembler()

    # Disassemble the dummy file
    # For a real binary, replace dummy_file_path with an actual path, e.g., "/bin/ls"
    # For a memory dump, use disassembleBuffer(buffer, base_address)
    report: SmdaReport = disassembler.disassembleFile(dummy_file_path)

    print(f"\nDisassembly Report for '{dummy_file_path}':")
    if report.functions:
        print(f"Detected {len(report.functions)} function(s).")
        for fn in report.getFunctions():
            print(f"Function at 0x{fn.offset:08x}:")
            for ins in fn.getInstructions():
                print(f"  0x{ins.offset:08x}: {ins.mnemonic} {ins.operands}")
            print("-" * 20)
    else:
        print("No functions detected.")

    # The report can be converted to a dictionary for JSON serialization
    # json_report = report.toDict()
    # print(json_report) # Uncomment to see the full JSON representation

finally:
    # Clean up the dummy file
    if os.path.exists(dummy_file_path):
        os.remove(dummy_file_path)

view raw JSON →