VBA P-Code Disassembler
pcodedmp is a Python library and command-line tool that disassembles VBA p-code from Microsoft Office documents. It supports several Office formats (e.g., `.docm`, `.xlsm`, `.pptm`) and gives a detailed view of embedded VBA macros for analysis. The current version is 1.2.6; releases typically ship bug fixes or p-code parsing improvements.
Warnings
- gotcha Releases before v1.2.1 only officially supported Python 2 (2.6+). Running those older releases under Python 3.x will likely fail with compatibility errors.
- gotcha Releases before v1.2.3 could not be installed reliably with `pip install pcodedmp` and required manual setup. Expect difficulties when installing or using those older releases.
- gotcha Support for 64-bit Office documents and VBA7 features (like `PtrSafe`) improved in versions 1.1.0 and 1.2.1, but the library's README still notes known limitations. Disassembly of some complex or obscure 64-bit-specific p-code instructions may be incomplete or incorrect.
- deprecated In v1.2.5, output functions such as `dump_file` gained an `output_file` parameter that controls where disassembly output is written. Output still goes to `sys.stdout` by default, but explicitly passing `output_file=sys.stdout` or an open file handle is the recommended, more robust approach.
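The `output_file` pattern from the last warning can be sketched without a real document. Everything named here is hypothetical: `capture_disassembly` and `fake_dump` are illustrative helpers, and the only assumption about pcodedmp is the one stated above, that `dump_file` accepts an `output_file` keyword from v1.2.5 onward.

```python
import io


def capture_disassembly(dump_func, path):
    """Call a dump function with an in-memory buffer and return the text.

    dump_func is assumed to accept (path, output_file=...), the shape this
    page describes for pcodedmp's dump_file in v1.2.5 and later.
    """
    buffer = io.StringIO()
    dump_func(path, output_file=buffer)
    return buffer.getvalue()


def fake_dump(path, output_file):
    """Stand-in for a real dump function so the sketch runs anywhere."""
    output_file.write(f"Processing file: {path}\n")


# Demonstrate the capture pattern with the stand-in function.
text = capture_disassembly(fake_dump, "example.docm")
print(text, end="")
```

With the library installed, passing `dump_file` in place of `fake_dump` should return the disassembly as a string, with no need to swap out `sys.stdout`.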
Install
-
pip install pcodedmp
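After installing, it can be useful to confirm that the installed release is new enough for the features listed under Warnings. The following is a minimal sketch, assuming Python 3.8+ (for `importlib.metadata`) and plain dotted version strings; the helper names and the `FEATURE_MIN_VERSION` table are illustrative and not part of pcodedmp.

```python
from importlib import metadata

# Version thresholds taken from the Warnings section of this page.
FEATURE_MIN_VERSION = {
    "python3": (1, 2, 1),
    "pip_install": (1, 2, 3),
    "output_file": (1, 2, 5),
}


def parse_version(text):
    """Turn '1.2.6' into (1, 2, 6); assumes plain dotted integers."""
    return tuple(int(part) for part in text.split("."))


def supports(feature, version_text):
    """Return True if version_text is at or above the feature's threshold."""
    return parse_version(version_text) >= FEATURE_MIN_VERSION[feature]


def installed_version(package="pcodedmp"):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("pcodedmp is not installed")
else:
    ok = supports("output_file", version)
    print(f"pcodedmp {version}, output_file supported: {ok}")
```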
Imports
- dump_file
from pcodedmp.pcodedmp import dump_file
- dump_stream
from pcodedmp.pcodedmp import dump_stream
Quickstart
import io

from pcodedmp.pcodedmp import dump_file

# pcodedmp processes existing Office documents that contain VBA macros.
# Replace the placeholder below with the path to a real file, for example
# a .docm that holds a simple macro such as MsgBox "Hello".
document_path = "path/to/your/document.docm"

print(f"Attempting to disassemble VBA p-code from: {document_path}")

# dump_file writes to sys.stdout by default. Passing output_file (v1.2.5+)
# captures the disassembly in memory without replacing sys.stdout, so the
# status messages printed here stay visible on the console.
# To write to disk instead:
#     with open("output.txt", "w") as f:
#         dump_file(document_path, output_file=f)
captured_output = io.StringIO()
try:
    dump_file(document_path, output_file=captured_output)
except FileNotFoundError:
    print(f"ERROR: Input file not found at '{document_path}'.")
    print("Replace the placeholder with a valid path to an Office document containing VBA.")
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    output_lines = captured_output.getvalue().strip().split("\n")
    print("--- Disassembly Attempt Result ---")
    if output_lines[0]:
        # Show the first 10 lines of the disassembly.
        print("\n".join(output_lines[:10]))
        if len(output_lines) > 10:
            print("...")
        print(f"(Full output length: {len(output_lines)} lines)")
    else:
        print("No disassembly output was produced.")
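Before handing a file to the disassembler, it can help to check that the document carries a VBA project at all. Macro-enabled Open XML files (`.docm`, `.xlsm`, `.pptm`) are ZIP archives that store the project as a `vbaProject.bin` part. The sketch below uses only the standard library; `has_vba_project` is a hypothetical helper and not part of pcodedmp, and the archive built for the demonstration holds placeholder bytes, not a real VBA project.

```python
import io
import zipfile


def has_vba_project(source):
    """Return True if an Open XML document contains a vbaProject.bin part.

    source may be a path or a binary file-like object. Legacy OLE files
    (.doc, .xls) are not ZIP archives and are reported as False here.
    """
    if not zipfile.is_zipfile(source):
        return False
    with zipfile.ZipFile(source) as archive:
        return any(
            name.lower().endswith("vbaproject.bin")
            for name in archive.namelist()
        )


# Build a tiny in-memory archive that mimics the layout of a .docm file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/vbaProject.bin", b"placeholder")

print(has_vba_project(buffer))
```

A `False` result only means no `vbaProject.bin` part was found in a ZIP container; older binary formats need a different check.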