Observability

I’m realizing that this blog is mostly about debugging, since figuring out how to debug SpiderMonkey is taking up most of my time. Right now, the actual compilation algorithms I’m implementing are straightforward; if I was working on performance tuning an existing implementation or something, I’m sure it would be different. What’s actually hard is finding ways to make generated code more observable. I’m sure that more experienced SpiderMonkey hackers have their own techniques, but since there isn’t a lot of documentation to go on, I’m making it up mostly from scratch.

Having gotten empty records to work, I started working on code generation for the AddRecordProperty opcode, so we can have non-empty records. When I initially tested my code with input x = #{"a": 1}, I got an assertion failure:

Assertion failure: !this->has(reg), at /home/tjc/gecko-fork/js/src/jit/RegisterSets.h:680

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x00005555580486c9 in js::jit::SpecializedRegSet<js::jit::LiveSetAccessors<js::jit::TypedRegisterSet >, js::jit::TypedRegisterSet >::add (
    this=0x7fffffffc768, reg=...) at /home/tjc/gecko-fork/js/src/jit/RegisterSets.h:680
680     MOZ_ASSERT(!this->has(reg));

AddRecordProperty takes three operands (the record, the field name, and the value for the field). When I dug into it, it turned out that two of the three operands had been allocated to the same register, which wasn’t right. The CacheIRCompiler module handles register allocation, spilling registers to the stack when necessary, so you can implement new operations without having to worry about those details. But in this case, I had to worry about that detail. Internally, the register allocator keeps a list, operandLocations_ that maps integers (operands are numbered from 0 onward) to physical registers. It turned out that operands 0 and 2 were both allocated to %rax, which I figured out in gdb (irrelevant details snipped):

(gdb) p operandLocations_[0]
$13 = (js::jit::OperandLocation &) @0x7fffffffc680: {
  kind_ = js::jit::OperandLocation::PayloadReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rax, 
                ...
(gdb) p operandLocations_[1]
$14 = (js::jit::OperandLocation &) @0x7fffffffc690: {
  kind_ = js::jit::OperandLocation::ValueReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rbx, 
                ...
(gdb) p operandLocations_[2]
$15 = (js::jit::OperandLocation &) @0x7fffffffc6a0: {
  kind_ = js::jit::OperandLocation::ValueReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rax,  

The BaselineCacheIRCompiler::init() method (code) is called each time a new opcode is compiled, and it assigns a virtual register for each operand. I compared the code I’d added for AddRecordProperty with the code for the other opcodes that also have 3 operands (such as SetElem), I noticed that all of them passed the third operand on the stack. When I changed AddRecordProperty to work the same way, the problem was fixed. I don’t know how I would have figured this out except for a lucky guess! I’m guessing it’s this way because of the small number of registers on x86, but the internal error-checking here seems lacking; it doesn’t seem good to silently alias distinct operands to each othre.

Once I changed the init() method and modified my emitAddRecordPropertyResult() method to expect the third operand on the stack, I found that for the compiled code to be run and not just emitted, I had to embed the code in a function once again:
function f() { x = #{"a":1}; }

(I still don’t understand the details as to when the JIT decides to generate native code.)

The result was a segfault, which was okay for the time being; it means that my code was at least able to generate code for setting a record property, without internal errors.

I realized that I wasn’t using the protocol for calling from the JIT into C++ functions (the callVM() method in the baseline compiler) correctly, but took a break before fixing it.

After taking a break, I fixed that problem only to run into another one: callVM() doesn’t automatically save registers. The macro assembler provides another method, callWithABI() (code), that makes it easier to save and restore registers. Evidently, ABI calls are less efficient than VM calls, but I’m not worried about performance at the moment. I switched it around to use callWithABI() and had to do some debugging by setting breakpoints both in the C++ code for the callee function and in the generated assembly code to make sure all the data types matched up. That worked reasonably well, but alas, I had to switch back to callVM() because I couldn’t figure out how to pass a HandleObject (see the comment at the beginning of RootingAPI.h for a fairly good explanation) into an ABI function. I added code to manually save and restore registers.

That worked okay enough, and my generated code got almost all the way through without a segfault. At the very end, my code calls the existing emitStoreDenseElement method. Internally (in the current implementation; this is all subject to change), records contain a sorted array of keys. Separately from storing the actual value at the correct offset in the record, the key has to be added to the array. (Since records are immutable, the actual sorting only has to happen once, when the FinishRecord opcode executes.)

I was getting a segfault because there was something wrong about the array that I was passing into the generated code for storing dense elements. I haven’t found the problem yet, but (going back to the beginning of this post), I’m trying to find ways to make the generated code more transparent. In this case, I knew that the emitStoreDenseElement code was getting its array input in the %rsi register, so I put a breakpoint in the right place and looked at it in gdb:


Thread 1 "js" received signal SIGTRAP, Trace/breakpoint trap.
0x00001dec1e58ab70 in ?? ()

(gdb) p *((JSObject*) $rsi)
$3 = {<js::gc::CellWithTenuredGCPointer> = { = {
      header_ = {<mozilla::detail::AtomicBaseIncDec> = {<mozilla::detail::AtomicBase> = {
            mValue = {<std::__atomic_base> = {static _S_alignment = 8, _M_i = 6046507808}, 
              static is_always_lock_free = true}}, }, }, static RESERVED_MASK = 7, static FORWARD_BIT = 1}, }, 
  static TraceKind = JS::TraceKind::Object, static MAX_BYTE_SIZE = 152}
(gdb) p *((JSObject*) $rsi).getHeader()
Attempt to take address of value not located in memory.
(gdb) p ((JSObject*) $rsi)->header_
$4 = {<mozilla::detail::AtomicBaseIncDec> = {<mozilla::detail::AtomicBase> = {
      mValue = {<std::__atomic_base> = {static _S_alignment = 8, _M_i = 6046507808}, 
        static is_always_lock_free = true}}, }, }
(gdb) p ((JSObject*) $rsi)->slots_
There is no member or method named slots_.
(gdb) p ((js::HeapSlot*) ($rsi + 8))
$5 = (js::HeapSlot *) 0x3a6c5de007f8
(gdb) p ((js::HeapSlot*) ($rsi + 8))[0]
$6 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 93824997702264}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 8))[1]
$7 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 64237105842200}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 8))[2]
$8 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 0}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 16))[0]
$9 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 64237105842200}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 16))[0].value
$10 = {asBits_ = 64237105842200}
Slot*) ($rsi + 16))))
No symbol "ObjectElements" in current context.
(gdb) p (js::ObjectElements::fromElements (((js::HeapSlot*) ($rsi + 16))))
$11 = (js::ObjectElements *) 0x3a6c5de007f0
(gdb) p (js::ObjectElements::fromElements (((js::HeapSlot*) ($rsi + 16))))->initializedLength
$12 = 1
(gdb) 

I’m posting this debugging transcript to show some of the things I’ve discovered so far. In gdb, you can cast a pointer to an arbitrary type and use that to print out the contents according to the physical layout of that type; hence, printing ((JSObject*) $rsi. Weirdly, the NativeObject class (code) seems to be inaccessible in gdb; I’m not sure if this is because NativeObject inherits from JSObject (in this case, I know that the object pointed to by %rsi is a NativeObject, or because of something different about that class. (None of NativeObject, js::NativeObject, or JS::NativeObject seem to be names that gdb recognizes.) That makes it harder to print some of the other fields, but the debugger does know about the header_ field, since it’s defined in the Cell class (code) that JSObject inherits from. The getHeader() method is just supposed to return the header_ field, so I’m not sure why gdb reported the “Attempt to take address of value not located in memory” error.

I knew that NativeObjects have a slots_ field and an elements_ field, physically laid out right after the header_ field, but due to the problem I mentioned with the NativeObject class, gdb won’t recognize these field names. I knew that slots_ would be at offset 8 and elements would be at offset 16, so using some pointer math and casting the pointers to the type that both slots_ and elements_ have (js::HeapSlot*; the HeapSlot class is defined here), I was able to make gdb treat these addresses as HeapSlot arrays. HeapSlot is basically an alias for the Value type, which represents all runtime values in SpiderMonkey (see Value.h).

The slots_ array contains “fixed slots” (attributes, basically); this object’s slots array has two elements that are initialized (the one at index 2 is printed as if its asBits_ field is 0, meaning it’s just a 0 and is probably uninitialized). For array-like objects (which this one is — if you’ve lost track, it’s the array of sorted record keys that’s part of a record), the elements_ array is an array of, well, elements. NativeObject has a getElements method, which casts the HeapSlot array into an ObjectElements object (which has some additional methods on it), but I can’t call any NativeObject methods from gdb, so instead I called the ObjectElements::fromElements method directly to cast it. Since the initializedLength() method on the resulting object returns 1, which is correct (the previous generated code sets the initialized length to 1, since we’re adding a property to a previously empty record), it seems like the layout is vaguely correct.

That’s as far as I got. Debugging tools for generated code that don’t require all these manual casts and have more access to type information that would make pretty-printing easier would be great! Maybe that exists and I haven’t found it, or maybe it’s just that compiler implementors are a small enough group that making tools more usable isn’t too compelling for anyone. In any case, the hard part about all of this is finding good ways to observe data at runtime and relate it to types that exist in the compiler.

Tags: ,

Leave a Reply