Last year at Igalia we started coding pflua, a high-performance packet filter which runs on top of LuaJIT. Pflua is now included in Snabb Switch as an external library and is used for all the packet filtering tasks that were initially done using libpcap. Pflua compiles pflang expressions to Lua functions, performing several optimizations along the way. One of my first tasks in pflua was obtaining the machine code that LuaJIT produces for a translated Lua function. In this post, I take a high-level look at LuaJIT’s disassembler, the piece of code that allows LuaJIT to print out the machine code that it produces.
LuaJIT 2.1, currently in beta stage, introduced two very useful tools: a statistical profiler and a code dumper. Both tools are actually Lua modules. They can be used either from source code or externally when calling LuaJIT. In the latter case, it’s necessary to add the parameters -jp=options and -jdump=options, respectively.
LuaJIT’s profiler helps us understand what code is hot, in other words, what parts of the program consume most of the CPU cycles. The dumper helps us understand what code is produced for every trace. Using different flags, it’s possible to peek at the resulting Lua bytecode, at LuaJIT’s SSA IR (static single assignment, an intermediate representation form often used in compilers for procedural languages such as Java and C) and at the machine code. For both the profiler and the code dumper, all the possible flags are best documented in the headers of their respective source code files: src/jit/p.lua and src/jit/dump.lua. In the particular case of the code dumper, the flag ‘m’ prints out machine code for a trace (for example, -jdump=m on the command line).
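The dumper can also be switched on from Lua code instead of from the command line. The following is a minimal sketch, assuming LuaJIT 2.1 with its bundled jit.dump module on the package path. The output file name trace.txt and the sum function are made up for illustration.

```lua
-- Enable the trace dumper from inside a program (needs LuaJIT, not plain Lua).
local dump = require("jit.dump")

-- "m" selects machine code only. The second argument is an output file
-- (hypothetical name); leave it out to dump to standard output.
dump.on("m", "trace.txt")

-- A loop hot enough to get compiled into a trace.
local function sum(n)
  local x = 0
  for i = 1, n do x = x + i end
  return x
end

local total = sum(1e6)

dump.off()  -- stop dumping and close the output file
print(total)
```

Whether a trace is actually recorded depends on the JIT being enabled for the platform, so the dump file may be empty if it is not.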
LuaJIT uses its own disassembler to print out machine code. There’s one for every architecture supported: ARM, MIPS, PPC, x86 and x64. LuaJIT’s disassemblers are actually Lua modules, written in Lua (some of them as small as 500 LOC), and live in src/jit/dis_XXX.lua. Mike Pall comments in the header of dis_x86.lua that an external disassembler could have been used to do the actual disassembling, with its result later integrated into the dumper module, but that design would have been more fragile. So he decided to implement his own disassemblers.
As they are coded as modules, it is possible to reuse them in other programs. Each disassembler module exports three functions: create, disass and regname. The disass function creates a new context and disassembles a chunk of code starting at a given address. So it is possible to pass a section of a binary and get it decoded using the disass function.
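A minimal sketch of the simple API, assuming LuaJIT with its jit.dis_x64 module on the package path. The byte string is a hand-assembled example and the start address 0x400000 is an arbitrary value chosen for illustration.

```lua
-- Needs LuaJIT: jit.dis_x64 is not available in plain Lua.
local disasm = require("jit.dis_x64")

-- Machine code as a Lua string: push rbp; mov rbp, rsp; pop rbp; ret
local code = string.char(0x55, 0x48, 0x89, 0xe5, 0x5d, 0xc3)

-- disass(code, addr, out) decodes the whole string. `addr` is the address
-- shown for the first byte. `out` receives each formatted line and
-- defaults to io.write when omitted, as it is here.
disasm.disass(code, 0x400000)

-- regname maps a register number to its name.
local first_reg = disasm.regname(0)
print(first_reg)
```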
In the example below, I’m going to use LuaJIT’s x86-64 disassembler to print out the .text section of a binary, which is the section that contains the program’s executable machine code. I use the binary of the following hello-world program as input.
I need to know the offset and size of the .text section:
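Tools such as readelf or objdump report these two numbers, but they can also be read straight from the ELF section headers. The sketch below is one way to do that in Lua, assuming a 64-bit little-endian ELF file. The helper names read_le and find_section are made up for this example, and corner cases such as extended section numbering are ignored.

```lua
-- Read an unsigned little-endian integer of `n` bytes at 0-based offset `ofs`.
local function read_le(data, ofs, n)
  local value = 0
  for i = n, 1, -1 do
    value = value * 256 + data:byte(ofs + i)
  end
  return value
end

-- Return the virtual address, file offset and size of the section called
-- `wanted`, or nil if there is no such section. `data` is the whole file.
local function find_section(data, wanted)
  assert(data:sub(1, 4) == "\127ELF", "not an ELF file")
  assert(data:byte(5) == 2 and data:byte(6) == 1,
         "this sketch only handles 64-bit little-endian ELF")

  local shoff     = read_le(data, 0x28, 8)  -- e_shoff: section header table
  local shentsize = read_le(data, 0x3A, 2)  -- e_shentsize: size of one header
  local shnum     = read_le(data, 0x3C, 2)  -- e_shnum: number of headers
  local shstrndx  = read_le(data, 0x3E, 2)  -- e_shstrndx: name table index

  -- File offset of the string table that holds the section names.
  local names = read_le(data, shoff + shstrndx * shentsize + 0x18, 8)

  for i = 0, shnum - 1 do
    local sh = shoff + i * shentsize
    local first = names + read_le(data, sh, 4) + 1  -- 1-based start of name
    local nul = data:find("\0", first, true) or (#data + 1)
    if data:sub(first, nul - 1) == wanted then
      return read_le(data, sh + 0x10, 8),  -- sh_addr
             read_le(data, sh + 0x18, 8),  -- sh_offset
             read_le(data, sh + 0x20, 8)   -- sh_size
    end
  end
  return nil
end

-- Usage (hypothetical file name):
--   local f = assert(io.open("hello", "rb"))
--   local addr, offset, size = find_section(f:read("*a"), ".text")
--   f:close()
```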
Now all I have to do is read that chunk of the binary and pass it to disass.
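That step could look roughly like the sketch below, assuming LuaJIT with jit.dis_x64 available. The function name disass_file_range, the file name hello and the variables in the usage comment are placeholders, not values taken from a real binary.

```lua
-- Needs LuaJIT: jit.dis_x64 is not available in plain Lua.
local disasm = require("jit.dis_x64")

-- Disassemble `size` bytes of the file at `path`, starting at file `offset`.
-- `addr` is the address printed for the first instruction (defaults to the
-- file offset). `out` is an optional callback called with each line.
local function disass_file_range(path, offset, size, addr, out)
  local f = assert(io.open(path, "rb"))
  assert(f:seek("set", offset))
  local code = f:read(size)
  f:close()
  assert(code and #code == size, "file is shorter than offset + size")
  disasm.disass(code, addr or offset, out)
end

-- Usage with placeholder names; substitute the numbers for your own binary:
--   disass_file_range("hello", text_offset, text_size, text_addr)
```

Passing the section's virtual address as addr makes the printed addresses line up with what other tools show. Passing nothing prints file offsets instead.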
And this is what I get when I print out the first 10 lines of the .text section:
To validate that the output is correct, I compared it with the output produced by ndisasm and objdump.
The ndisasm output looks almost the same as that of LuaJIT’s disassembler, which is because LuaJIT’s disassembler follows the ndisasm format, as stated in its source code. Objdump produces slightly different but semantically equivalent output:
It is possible to do the same thing by instantiating a context via create and calling context:disass() to disassemble a chunk of machine code. This approach allows us to have finer control over the output, as create is passed a callback that is called for each disassembled line. We could accumulate the disassembled lines in a variable or print them to standard output, as in this example.
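A minimal sketch of the context-based approach, assuming LuaJIT with jit.dis_x64 available. The byte string is a hand-assembled example, and the numbered output format is chosen only for illustration.

```lua
-- Needs LuaJIT: jit.dis_x64 is not available in plain Lua.
local disasm = require("jit.dis_x64")

-- Machine code as a Lua string: push rbp; mov rbp, rsp; pop rbp; ret
local code = string.char(0x55, 0x48, 0x89, 0xe5, 0x5d, 0xc3)

-- Accumulate the disassembled lines instead of printing them right away.
local lines = {}
local ctx = disasm.create(code, 0, function(line)
  lines[#lines + 1] = line
end)

-- disass(ofs, len) decodes `len` bytes starting `ofs` bytes into the code.
-- With no arguments it decodes the whole string.
ctx:disass()

-- Each collected line already ends in a newline.
for i, line in ipairs(lines) do
  io.write(("%2d: %s"):format(i, line))
end
```

Because the callback owns the output, the same context could just as well filter lines, count instructions or write to a file.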