Striga: Lifting x86 to LLVM IR with Python
Background

While discussing lifting BinaryShield to LLVM IR with eversinc33, I decided it would be useful to write a basic lifter in Python that can lift x86_64 instructions to LLVM IR. He has since released his blog post: Writing a Naive LLVM-based Devirtualizer, which I highly recommend you check out! This post assumes familiarity with the basics of LLVM IR. You can find some references at the end of this post.

Over the years I noticed that a lot of people get stuck exploring lifters, because existing tooling is too difficult to compile. In October 2025 I spent around a month redoing Remill’s build system (remill#723) and earlier this month I did the same for the Dna project (Dna#9). Last year I also started working on Python bindings for LLVM, which I wanted to use for a real project. You can find the lifter at LLVMParty/striga.

The goal of this post is to lower the barrier of entry and let you experiment with lifting to LLVM IR. For inspiration you can look at the Static Devirtualization of Themida post that was just released by Back Engineering Labs, as well as the Pushan: Trace-Free Deobfuscation of Virtualization-Obfuscated Binaries paper by ASU researchers published in March.

If you enjoy this article and would like to learn more, see my website for information about my in-person trainings.

Lifting

Lifting is the process of translating assembly instructions to some kind of intermediate representation (IR). The motivation is usually that directly analyzing and manipulating (x86) assembly instructions is complex and error-prone. The lifter translates the underlying instruction semantics directly to an IR that is easier to reason about (and therefore easier to manipulate as well).
A few popular IRs:

- SMT-LIB, used by Triton (symbolic execution)
- VEX, used by angr
- Miasm IR
- Sleigh, used by Ghidra, Remill and Icicle
- LLVM IR, used by Rellume, revng and Remill
- Microcode, used by IDA (proprietary)
- BNIL, used by Binary Ninja (proprietary)

For this project I picked LLVM IR, because I am the most familiar with it and it has a well-established ecosystem. LLVM already has all of the common compiler optimizations and it is used and maintained by teams at large corporations.

Architecture

The architecture of the lifter is very much inspired by Remill, but I simplified some things to make it easier to follow.

In LLVM a register is actually an SSA value, which means we can only assign to it once. CPU registers are variables that can be assigned to multiple times. We model this by creating a State structure in memory that represents the x86 CPU state:

    struct State {
        uint64_t rax;
        uint64_t rbx;
        uint64_t rcx;
        uint64_t rdx;
        // ... GPRs
        uint8_t cf;
        uint8_t zf;
        uint8_t of;
        // ... Flags
        // ... XMM
    };

Instructions that read or write RAX will load from or store to State->rax. If we play our cards right, the optimizer will use the mem2reg pass to translate this into SSA form for us and enable further optimizations.

An important difference from an actual CPU is that flags are modelled as independent 8-bit registers. This makes them easier to reason about than a packed bitfield. For instance, it helps the optimizer to perform dead store elimination and propagation.

In addition to the State, we need an opaque memory pointer that helps us differentiate a load/store in the State from memory accesses by the x86 CPU. In short: the State pointer is used to model the CPU and the memory pointer is used to model the RAM.

While lifting, the prototype of the lifted function is void lifted(State* state, void* memory). Later on we will perform brightening, to turn this into something we can recompile.
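To make the State/memory split concrete, here is a minimal Python emulation of it. This is only a sketch and not how striga represents anything: the `ctypes` layout, the `bytearray` standing in for RAM, and the names `lifted_mov_rax_rcx`, `lifted_mov_rax_mem` and `field_index` are all assumptions made for illustration.

```python
import ctypes
import struct


class State(ctypes.Structure):
    """Toy x86 CPU state: a few GPRs, plus flags as independent bytes."""

    _fields_ = [
        ("rax", ctypes.c_uint64),
        ("rbx", ctypes.c_uint64),
        ("rcx", ctypes.c_uint64),
        ("rdx", ctypes.c_uint64),
        ("cf", ctypes.c_uint8),
        ("zf", ctypes.c_uint8),
        ("of", ctypes.c_uint8),
    ]


def field_index(name: str) -> int:
    """Position of a member in State (hypothetical helper)."""
    return [field for field, _ in State._fields_].index(name)


def lifted_mov_rax_rcx(state: State, memory: bytearray) -> None:
    # mov rax, rcx: only the State is touched, memory is never accessed.
    state.rax = state.rcx


def lifted_mov_rax_mem(state: State, memory: bytearray) -> None:
    # mov rax, qword [rbx+42]: the address comes from the State,
    # the data comes from the separate memory object.
    state.rax = struct.unpack_from("<Q", memory, state.rbx + 42)[0]


def demo() -> tuple[int, int]:
    state = State(rbx=8, rcx=0x1122334455667788)
    memory = bytearray(64)
    struct.pack_into("<Q", memory, 8 + 42, 0xDEADBEEF)

    lifted_mov_rax_rcx(state, memory)
    after_reg_move = state.rax
    lifted_mov_rax_mem(state, memory)
    after_mem_read = state.rax
    return after_reg_move, after_mem_read
```

Both toy functions share the `(state, memory)` prototype described above, which keeps CPU accesses and RAM accesses on two distinct objects.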
Below is the LLVM IR for the instruction mov rax, rcx, with comments in pseudo-C:

    define internal void @lifted_0x140001000(ptr %state, ptr %memory) {
    initialize:
      ; uint64_t* rcx = &state->rcx;
      %rcx = getelementptr inbounds nuw %State, ptr %state, i32 0, i32 2
      ; uint64_t* rax = &state->rax;
      %rax = getelementptr inbounds nuw %State, ptr %state, i32 0, i32 0
      ; Jump to the first instruction
      br label %insn_0x140001000

    insn_0x140001000:                                 ; preds = %initialize
      ; uint64_t v0 = *rcx;
      %0 = load i64, ptr %rcx, align 4
      ; *rax = v0;
      store i64 %0, ptr %rax, align 4
      ; Jump to the next instruction
      br label %insn_0x140001003

    insn_0x140001003:                                 ; preds = %insn_0x140001000
      ; Block terminator to keep the IR valid
      ret void
    }

We start out with the initialize block, which is used to get pointers to the relevant State members. Then every instruction gets its own basic block named insn_<address>. Every instruction is responsible for emitting an unconditional branch to its successors. The basic block for the successor is created with just a ret terminator, to keep the module verifier happy.

To illustrate memory accesses, here is the LLVM IR for mov rax, qword [rbx+42]:

    define internal void @lifted_0x140001000(ptr %state, ptr %memory) {
    initialize:
      %rbx = getelementptr inbounds nuw %State, ptr %state, i32 0, i32 1
      %rax = getelementptr inbounds nuw %State, ptr %state, i32 0, i32 0
      br label %insn_0x140001000

    insn_0x140001000:                                 ; preds = %initialize
      ; uint64_t v0 = *rbx;
      %0 = load i64, ptr %rbx, align 4
      ; uint64_t v1 = v0 + 42;
      %1 = add i64 %0, 42
      ; uint8_t* v2 = &memory[v1];
      %2 = getelementptr i8, ptr %memory, i64 %1
      ; uint64_t v3 = *(uint64_t*)v2;
      %3 = load i64, ptr %2, align 1
      ; *rax = v3;
      store i64 %3, ptr %rax, align 4
      br label %insn_0x140001004

    insn_0x140001004:                                 ; preds = %insn_0x140001000
      ret void
    }

Here you can see the getelementptr i8, ptr %memory, i64 %1 instruction, which uses memory as a base, signaling that this is a read from the x86 memory (we will clean this up later).
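The block layout for a register-to-register mov can be reproduced with plain string formatting, which may help when reading lifter output. This is a sketch under assumptions, not striga's API: `STATE_FIELDS` and `emit_mov_reg_reg` are hypothetical names, the field order is taken from the State struct shown earlier, and the two registers are assumed to be different.

```python
# Assumed member order, matching the State struct from the Architecture section.
STATE_FIELDS = ["rax", "rbx", "rcx", "rdx"]


def emit_mov_reg_reg(address: int, size: int, dst: str, src: str) -> str:
    """Build textual LLVM IR for `mov dst, src` (dst and src must differ)."""
    insn = f"insn_{hex(address)}"
    succ = f"insn_{hex(address + size)}"

    lines = [
        f"define internal void @lifted_{hex(address)}(ptr %state, ptr %memory) {{",
        "initialize:",
    ]
    # One pointer per register used, computed once in the initialize block.
    for reg in (src, dst):
        index = STATE_FIELDS.index(reg)
        lines.append(
            f"  %{reg} = getelementptr inbounds nuw %State, "
            f"ptr %state, i32 0, i32 {index}"
        )
    lines += [
        f"  br label %{insn}",
        "",
        f"{insn}:",
        f"  %0 = load i64, ptr %{src}, align 4",
        f"  store i64 %0, ptr %{dst}, align 4",
        f"  br label %{succ}",
        "",
        # Placeholder successor block, kept valid with a lone terminator.
        f"{succ}:",
        "  ret void",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(emit_mov_reg_reg(0x140001000, 3, "rax", "rcx"))
```

The generated text only mirrors the listing above; it omits the %State type definition, so it is not a complete module on its own.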
The lifter itself is contained in a ~500 line Semantics class with these main functions (some are omitted for brevity):

    # src/striga/semantics.py
    class Semantics:
        def __init__(self, module: Module): ...

        # Lifting
        def begin(self, address: int) -> Function: ...
        def get_or_create_block(self, address: int) -> BasicBlock: ...
        def lift_bytes(self, address: int, code: bytes) -> list[Successor]: ...

        # Semantic helpers
        def reg_read(self, name: str) -> Value: ...
        def reg_write(self, name: str, value: Value): ...
        def mem_read(self, addr: Value, ty: Type) -> Value: ...
        def mem_write(self, addr: Value, value: Value): ...
        def op_mem(self, op: X86Op) -> Value: ...
        def op_read(self, index: int) -> Value: ...
        def op_write(self, index: int, value: Value): ...
        def flag_read(self, name: str) -> Value: ...
        def flag_write(self, name: str, value: Value): ...

        # State (simplified)
        module: Module
        function: Function
        ir: Builder
        insn: CsInsn

The begin(address) function is used to create the lifted_<address> function in LLVM IR and create the initialize block with a branch to the first instruction:

    def begin(self, address: int) -> Function:
        name = f"lifted_{hex(address)}"
        fn = self.module.get_function(name)
        if fn is None:
            fn = self.module.add_function(name, self.lifted_ty)
            fn.param_attributes(0).add("noalias")
            fn.param_attributes(1).add("noalias")
            state, memory = fn.params
            memory.name = "memory"
            state.name = "state"
            self.function = fn
            self.reg_ptrs = {}
            self.insn_blocks = {}
            entry = fn.append_basic_block("initialize")
            assert fn.last_basic_block == entry
            with entry.create_builder() as ir:
                ir.br(self.get_or_create_block(address))
        else:
            ...  # Omitted for brevity
        return self.function

To create the instruction block, get_or_create_block is used:

    def get_or_create_block(self, address: int) -> BasicBlock:
        block = self.insn_blocks.get(address)
        if block is None:
            block = self.function.append_basic_block(f"insn_{hex(address)}")
            with block.create_builder() as ir:
                ir.ret_void()
            self.insn_blocks[address] = block
        assert block.function == self.function
        return block

As mentioned above, an empty block is not valid LLVM IR, so we populate it with a ret instruction. When actually lifting into the basic block, that instruction will be replaced with the lifted code.

To lift a single instruction we pass its address and bytes to lift_bytes, which is responsible for producing LLVM IR:

    def lift_bytes(self, address: int, code: bytes) -> list[Successor]:
        # Ensure we have a function to lift into
        if not hasattr(self, "function"):
            self.begin(address)

        insn = self.cs_disasm(address, code)
        if self.verbose:
            print(";", hex(insn.address), insn.mnemonic, insn.op_str)

        # Skip lifting if the block is already populated
        block = self.get_or_create_block(address)
        assert block.first_instruction
        if block.first_instruction.opcode == Opcode.Ret:
            block.first_instruction.erase_from_parent()
        else:
            return []

        with block.create_builder() as ir:
            # State used by semantic handlers
            self.ir = ir
            self.insn = insn

            handler = _semantics.get(insn.mnemonic)
            if handler is None and insn.mnemonic.startswith("lock "):
                # LOCK preserves the single-threaded architectural result; the
                # lifter does not model inter-thread atomicity separately.
                handler = _semantics.get(insn.mnemonic.removeprefix("lock "))
            if handler is None:
                raise NotImplementedError(insn.mnemonic)

            successors = handler(self)
            if successors is None:
                # Linear fallthrough - handler didn't emit a terminator.
                fallthrough = address + insn.size
                ir.br(self.get_or_create_block(fallthrough))
                successors = [Successor(address, self.const64(fallthrough))]

        # Make sure the handler produced valid IR
        self.module.verify_or_raise()
        return successors

The function first ensures an empty insn block by removing the temporary ret instruction. Then it creates an IR Builder and calls the handler responsible for producing IR for the instruction being lifted (more on that below).
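The placeholder-and-replace flow of get_or_create_block and lift_bytes can be mimicked without LLVM at all. The sketch below is a toy model under assumptions: blocks are plain lists of strings, `ToyLifter`, `ToySuccessor` and `PLACEHOLDER` are hypothetical names, only the linear fallthrough case is handled, and the successor destination is a plain int rather than an LLVM Value.

```python
from typing import NamedTuple

PLACEHOLDER = "ret void"


class ToySuccessor(NamedTuple):
    src: int
    dst: int  # concrete address only; the real lifter uses an LLVM Value


class ToyLifter:
    def __init__(self) -> None:
        # Maps an instruction address to the textual body of its block.
        self.blocks: dict[int, list[str]] = {}

    def get_or_create_block(self, address: int) -> list[str]:
        block = self.blocks.get(address)
        if block is None:
            # A new block holds just a terminator so it is never empty.
            block = [PLACEHOLDER]
            self.blocks[address] = block
        return block

    def lift(self, address: int, size: int, body: list[str]) -> list[ToySuccessor]:
        block = self.get_or_create_block(address)
        if block != [PLACEHOLDER]:
            return []  # already populated, nothing to do

        # Replace the placeholder with the lifted body.
        block.clear()
        block.extend(body)

        # Linear fallthrough: branch to the (possibly new) next block.
        fallthrough = address + size
        self.get_or_create_block(fallthrough)
        block.append(f"br label %insn_{hex(fallthrough)}")
        return [ToySuccessor(address, fallthrough)]


if __name__ == "__main__":
    lifter = ToyLifter()
    print(lifter.lift(0x1000, 3, ["; mov rax, rcx"]))
    print(lifter.lift(0x1000, 3, ["; mov rax, rcx"]))  # second call is a no-op
    print(lifter.blocks)
```

Lifting the same address twice returns an empty successor list the second time, which is what lets a caller revisit addresses freely while exploring control flow.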
If the handler does not return successors, lift_bytes handles the common fallthrough case by creating a basic block for the next instruction. It is up to the caller to handle the list of Successor tuples:

    class Successor(NamedTuple):
        src: int
        dst: Value

We use an LLVM Value for the branch destination, because it is not always concrete (for example jmp reg).

The semantic handlers are registered globally:

    # src/striga/semantic.py
    SemanticFn: TypeAlias = Callable[["Semantics"], list[Successor] | None]
    _semantics: dict[...