Further adventures in gdb

So, that updating-every-day plan hasn’t exactly worked out. Working on the code is so much fun that it’s hard to stop, and then I have to choose between going to bed and writing a blog post.

I’m currently working on generating code for the three opcodes related to records: AddRecord, FinishRecord, and AddRecordProperty. The work I wrote about in previous posts wasn’t exactly complete, because it allowed code generation in the presence of records, but I just plugged in no-op implementations for the tryAttachStub() methods in CacheIR.cpp that actually do the work. I still had to change some code generation methods so that existing code could work with records and tuples, but the really fun stuff is what I’m doing now!

Rather than try to cover everything, I’ll summarize my notes from the end of last week. I had added two methods to BaselineCacheIRCompiler.cpp, which is the part of the baseline compiler that generates baseline code from CacheIR code. As an initial test case, I was using a function that just returns an empty record literal:

function f() { x = #{}; }

since I hadn’t yet implemented code generation for adding record properties. The JavaScript bytecode for this function looks like:

loc     op
-----   --
00000:  BindGName "x"                   # GLOBAL
00005:  InitRecord 0                    # GLOBAL 
00010:  FinishRecord                    # GLOBAL 
00011:  SetGName "x"                    # 
00016:  Pop                             # 
00017:  RetRval                         # 

The InitRecord 0 instruction creates an uninitialized record of length 0, and the FinishRecord instruction sets internal flags to make the record read-only. In a non-empty record, one or more AddRecordProperty instructions would appear in between; it would be a dynamic error to invoke another AddRecordProperty instruction again after the FinishRecord.

The bytecode above is from executing dis(f) in the interpreter; I also discovered that you can do disnative(f) after CacheIR and then native code are generated for f. This prints out the generated assembly code; it’s actually much easier to read if you execute print(disnative(f)). Invoking the interpreter with the environment variable IONFLAGS=codegen also prints out the code as it’s generated, which is sometimes more useful since it can be interleaved with other debug messages that show which methods are generating which bits of code.

I was getting a segfault, meaning that the code being generated by my emitNewRecordResult() method didn’t match the expectations that the code generated by my emitFinishRecordResult() method had about the data. So I fired up gdb, as I had done before. Trying to zero in on the problem, I was looking at some generated code that took its input in the %rax register (gdb uses the $rax syntax instead.) With careful use of casts and calling accessor methods, it’s possible to look at the C++ representations of the objects that generated code creates. I found it helpful to look at the value of this expression:

(gdb) p (*(*((JSObject*) $rax)).shape()).getObjectClass()
$9 = (const JSClass *) 0x5555590a6130 

From context, I was able to figure out that 0x5555590a6130 was the address of the JSClass structure for the ArrayObject type. This wasn’t what I expected, since the input for this code was supposed to be a record (with type RecordType in C++).

Since this was a few days ago, I’ve lost track of the logical steps in between, but eventually, I put a breakpoint in the createUninitializedRecord() method that I’d added to MacroAssembler.cpp; shape and arrayShape are variable names referring to two registers I allocated within that method.

Thread 1 "js" hit Breakpoint 1, js::jit::MacroAssembler::createUninitializedRecord (
    this=0x7fffffffae00, result=..., shape=..., arrayShape=..., temp=..., temp2=..., 
    recordLength=0, allocKind=js::gc::AllocKind::OBJECT4_BACKGROUND, 
    initialHeap=js::gc::DefaultHeap, fail=0x7fffffffaa38, allocSite=...)
    at /home/tjc/gecko-fork/js/src/jit/MacroAssembler.cpp:493
warning: Source file is more recent than executable.
493   allocateObject(result, temp, allocKind, 0, initialHeap, fail, allocSite);
(gdb) p shape
$10 = {reg_ = js::jit::X86Encoding::rcx, static DefaultType = js::jit::RegTypeName::GPR}
(gdb) p arrayShape
$11 = {reg_ = js::jit::X86Encoding::rcx, static DefaultType = js::jit::RegTypeName::GPR}
(gdb) quit

This debug output shows that the register allocator assigned both shape and arrayShape to the same register, %rcx. As a result, the code I generated to initialize the shapes was using the same shape for both the record object itself, and the internal sortedKeys array pointed to by the record’s sortedKeys field — causing createUninitializedRecord() to return something that looked like an array rather than a record.

The reason this was happening was the following code that I wrote:

AutoScratchRegisterMaybeOutput shape(allocator, masm, output);
AutoScratchRegisterMaybeOutput arrayShape(allocator, masm, output);

The AutoScratchRegisterMaybeOutput class (defined in CacheIRCompiler.h) re-uses the output register if possible, so calling it twice allocates both symbolic names to the same register. That wasn’t what I wanted.

The solution was to use the AutoScratchRegister class instead, telling the register allocator that shape and arrayShape should live in distinct registers. After making that change, repeating the original experiment gives:

(gdb) p (*(*((JSObject*) $rcx)).shape()).getObjectClass()
$5 = (const JSClass *) 0x5555590b5690 
(gdb) c
(gdb) p (*(*((JSObject*) $rbx)).shape()).getObjectClass()
$8 = (const JSClass *) 0x5555590a60f0 

(To be sure that $rcx and $rbx were the registers allocated to shape and arrayShape, I had to set a breakpoint in my createUninitializedRecord() method again and look at the values of these variables.)

I hope this might be marginally useful to somebody trying to debug SpiderMonkey baseline code generation, or if nothing else, useful to me if I ever have to set aside this work for a few months (let’s be honest, weeks) and come back to it!

Tags: ,

Leave a Reply