pyvex
PyVEX is a Python interface to libVEX, Valgrind's VEX Intermediate Representation (IR) engine. It provides bindings that translate machine code from various architectures into a common, architecture-agnostic, side-effect-free IR, which facilitates static and dynamic program analysis. PyVEX is a foundational component of the angr binary analysis framework and is actively maintained, with frequent releases that typically ship alongside the broader angr project.
Common errors
- ImportError: libvex failed to initialize
cause The underlying C library (libVEX) that `pyvex` binds to could not be loaded or initialized. Typical reasons are a corrupted installation, missing shared libraries, or environmental issues.
fix Reinstall `pyvex` (`pip uninstall pyvex && pip install pyvex`). If the issue persists, check the system-wide dependencies that libVEX might rely on (e.g., C runtime libraries).
- pyvex.errors.LiftingException: Empty IRSB passed to SimIRSB.
cause This error, often seen when using `angr` or when manipulating `IRSB` objects directly, occurs when a lifted block of code (IRSB) is empty. This can happen if `pyvex.lift` fails to decode any instructions, or if lifting parameters (such as `max_inst=1` on architectures with delay slots) prevent a meaningful block from being generated.
fix Review the input bytes, base address, and architecture passed to `pyvex.lift`. If you pass `max_inst` or `max_bytes`, consider increasing them or removing the limits so that `pyvex` can lift a complete, valid block. Explicitly check `len(irsb.statements)` after lifting.
- error: Unable to build libVEX. ... fatal error C1083: Cannot open include file: 'stdarg.h'
cause During `pip install pyvex` on Windows, the C components of libVEX failed to compile because the compiler could not find a standard C header file (`stdarg.h`) or other essential build tools.
fix Ensure that the 'Desktop development with C++' workload is installed for your Visual Studio version. Launch an 'x64 Native Tools Command Prompt for VS' and run `pip install pyvex` from within that environment, so that the correct compiler and include paths are set.
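The empty-block error above can be caught before the block reaches downstream analysis. The following is a minimal sketch, not part of PyVEX itself: `block_problem` and `lift_checked` are hypothetical helper names, and the checks rely only on the `size`, `jumpkind`, and `statements` attributes of the lifted IRSB. It assumes that undecodable input yields a zero-size block whose `jumpkind` is `Ijk_NoDecode`.

```python
def block_problem(irsb):
    """Return a short reason string if the lifted block looks unusable, else None."""
    if irsb.size == 0:
        # Assumption: a failed decode is reported through the jump kind.
        if irsb.jumpkind == "Ijk_NoDecode":
            return "no instruction could be decoded"
        return "block covers zero bytes"
    if not irsb.statements:
        return "block has no statements"
    return None


def lift_checked(data, addr, arch, **lift_kwargs):
    """Hypothetical wrapper: lift `data`, raising ValueError for an unusable block."""
    import pyvex  # imported lazily so a broken install fails at call time

    irsb = pyvex.lift(data, addr, arch, **lift_kwargs)
    problem = block_problem(irsb)
    if problem is not None:
        raise ValueError(f"lift at {addr:#x} failed: {problem}")
    return irsb
```

Calling `lift_checked(code, 0x400000, archinfo.ArchAMD64(), max_inst=1)` would then either return a non-empty block or raise with a reason, instead of silently handing an empty IRSB to later stages.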
Warnings
- breaking Major version bumps (e.g., from 8.x to 9.x) in pyvex, especially as part of the broader angr project, can introduce API changes and require specific dependency versions. This can necessitate updating related libraries like `archinfo` and `angr` in tandem to maintain compatibility.
- gotcha When lifting certain instruction types (e.g., MIPS branches or jumps) with `max_inst=1` (limiting to one instruction), PyVEX might return an empty IRSB because these instructions often require 'delay slots' to be processed together. This can lead to a `SimIRSBError` if an empty block is passed to downstream analysis.
- gotcha PyVEX provides a *syntactic* representation of a basic block. This means it describes the operations and control flow but does not inherently provide semantic context like the actual data written by a store instruction or the live values of registers at a given point without further analysis.
- gotcha Windows installations might encounter C compiler errors, often related to missing include files (e.g., 'stdarg.h') during the build process of the underlying libVEX C component, even with Visual Studio installed.
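For the version-compatibility warning above, a quick way to spot a mismatched environment is to compare the installed versions of the related packages. This is a sketch under one stated assumption: that `pyvex`, `archinfo`, and `angr` are released with matching version numbers, so differing versions indicate a likely incompatibility. The helper names `installed_versions` and `versions_aligned` are hypothetical; only the standard-library `importlib.metadata` is used.

```python
from importlib import metadata


def installed_versions(packages=("pyvex", "archinfo", "angr")):
    """Map each package name to its installed version, or None if it is absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def versions_aligned(versions):
    """True when every installed package reports the same version string."""
    present = {v for v in versions.values() if v is not None}
    return len(present) <= 1


if __name__ == "__main__":
    found = installed_versions()
    status = "aligned" if versions_aligned(found) else "MISMATCH"
    print(found, status)
```

Packages that are not installed are ignored by the comparison, so the check also works in environments that use `pyvex` without `angr`.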
Install
- pip install pyvex
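After installing, it can help to confirm that the package and its native library actually load, since a broken build tends to surface only as an `ImportError` at import time. The snippet below is a small sketch using only the standard library; `try_import` is a hypothetical helper, and reading `__version__` is done defensively in case a module does not expose it.

```python
import importlib


def try_import(module_name):
    """Return (True, version) on success or (False, error message) on failure."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        return False, str(exc)
    return True, getattr(module, "__version__", "unknown")


if __name__ == "__main__":
    for name in ("archinfo", "pyvex"):
        ok, detail = try_import(name)
        print(f"{name}: {'ok' if ok else 'FAILED'} ({detail})")
```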
Imports
- lift
import pyvex
import archinfo
irsb = pyvex.lift(b'\x90', 0x400000, archinfo.ArchAMD64())
- IRSB
from pyvex import IRSB
# ... or access via a lifted block:
# irsb = pyvex.lift(...)
# type(irsb) == pyvex.IRSB
Quickstart
import pyvex
import archinfo
# Binary code: 5 NOPs (0x90) for AMD64
binary_code = b"\x90\x90\x90\x90\x90"
# Base address for the code
base_address = 0x400400
# Architecture definition
architecture = archinfo.ArchAMD64()
# Lift the binary code into a VEX Intermediate Representation Super-Block (IRSB)
irsb = pyvex.lift(binary_code, base_address, architecture)
print("--- Lifted IRSB ---")
irsb.pp() # Pretty-print the IRSB
print("\n--- IRSB Statements ---")
for stmt in irsb.statements:
    stmt.pp()
print("\n--- Next IR Expression (Jump Target) ---")
irsb.next.pp()
print(f"Jump Kind: {irsb.jumpkind}")