Adventures in gdb

November 10th, 2021

I picked up from yesterday wanting to see what code was being generated for record initialization. A colleague pointed me to a page of SpiderMonkey debugging tips. This was helpful, but required being able to run the JS interpreter inside GDB and type some code into the REPL. The problem is that before it got to that point, the interpreter was trying to compile all the self-hosted code; I knew that this wasn’t going to succeed since I’ve only implemented one of the record/tuple opcodes. I wanted to be able to just do:

> x = #{}

(binding the variable x to an empty record literal) and see the generated code. But because the much-more-complicated self-hosted code has to get compiled first, I never get to that point.

Another colleague suggested looking at the IONFLAGS environment variable. This, in turn, seems to only have an effect if you build the compiler with the --enable-jitspew option. Once I did that, I was able to find out more:

$ IONFLAGS=zzzz mach run
obj-x64-debug/dist/bin/js
found tag: zzzz
Unknown flag.

usage: IONFLAGS=option,option,option,... where options can be:

  aborts        Compilation abort messages
  scripts       Compiled scripts
  mir           MIR information
  ...

And so on.

I found that IONFLAGS=codegen mach run would cause the interpreter to print out all the generated assembly code, including all the code for self-hosted methods. This wasn’t entirely helpful, since it was hard to see where the boundaries were between different methods.

I decided to try a different strategy and see what I could do inside gdb. I’ve avoided using debuggers as much as possible throughout my programming career. I’m a fan of printf-style debugging. So much so that I created the printf-style debugging page on Facebook. (This made more sense back when Facebook pages were “fan pages”, so you could be a “fan of” printf-style debugging.) I’ve always had the feeling that any more sophisticated debugging technology wasn’t worth the difficulty of use. Working on a compiler implemented in C++, though, it seems I’m finally having to suck it up and learn.

The first question was how to set a breakpoint on a templated function. I found the rbreak command in gdb, which takes a regular expression. I realized I could also just do:

(gdb) info functions .*emit_InitR.*
All functions matching regular expression ".*emit_InitR.*":

File js/src/jit/BaselineCodeGen.cpp:
2590:   bool js::jit::BaselineCodeGen::emit_InitRecord();
2590:   bool js::jit::BaselineCodeGen::emit_InitRecord();

File js/src/jit/BaselineIC.cpp:
2454:   bool js::jit::FallbackICCodeCompiler::emit_InitRecord();
(gdb)

So I set a breakpoint on the method I wrote to generate code for the InitRecord opcode:

(gdb) b js::jit::BaselineCodeGen::emit_InitRecord
Breakpoint 1 at 0x555558093884: file /home/tjc/gecko-fork/js/src/jit/BaselineCodeGen.cpp, line 2591.
(gdb) b js::jit::FallbackICCodeCompiler::emit_InitRecord
Breakpoint 2 at 0x5555580807b1: file /home/tjc/gecko-fork/js/src/jit/BaselineIC.cpp, line 2455.
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/tjc/gecko-fork/obj-x64-debug/dist/bin/js 
[snip]

Thread 1 "js" hit Breakpoint 2, js::jit::FallbackICCodeCompiler::emit_InitRecord (this=0x7fffffffd1b0)
    at /home/tjc/gecko-fork/js/src/jit/BaselineIC.cpp:2455
2455      EmitRestoreTailCallReg(masm);
(gdb) 

Finally! At this point, I was hoping to be able to view the code that was being generated for the empty record literal. Stepping through the code from here gave me what I was looking for:

(gdb) s
js::jit::FallbackICCodeCompiler::tailCallVMInternal (
    this=0x7fffffffd1b0, masm=..., 
    id=js::jit::TailCallVMFunctionId::DoInitRecordFallback)
    at /home/tjc/gecko-fork/js/src/jit/BaselineIC.cpp:510
510   TrampolinePtr code = cx->runtime()->jitRuntime()->getVMWrapper(id);
(gdb) n
511   const VMFunctionData& fun = GetVMFunction(id);
(gdb) n
512   MOZ_ASSERT(fun.expectTailCall == TailCall);
(gdb) n
513   uint32_t argSize = fun.explicitStackSlots() * sizeof(void*);
(gdb) n
514   EmitBaselineTailCallVM(code, masm, argSize);
(gdb) n
515   return true;
(gdb) p code
$18 = {value = 0x1e4412b875e0 "H\277"}
(gdb) p code.value
$19 = (uint8_t *) 0x1e4412b875e0 "H\277"
(gdb) x/64i code.value
   0x1e4412b875e0:  movabs $0x7ffff4219000,%rdi
   0x1e4412b875ea:  mov    0x1c0(%rdi),%rax
   0x1e4412b875f1:  mov    %rsp,0x70(%rax)
   0x1e4412b875f5:  movabs $0x55555903de60,%r11
   0x1e4412b875ff:  push   %r11
   0x1e4412b87601:  lea    0x18(%rsp),%r10
   0x1e4412b87606:  movabs $0xfff9800000000000,%r11
     

So that’s the generated code for DoInitRecordFallback (the fallback method implemented in the inline cache module of the baseline compiler), but I realized this wasn’t really what I was hoping to find. I wanted to see the intermediate representation first.

From there, I realized I was barking up the wrong tree, since the baseline compiler just goes straight from JS to assembly; only the more sophisticated compilers (which weren’t being invoked at this point) use MIR and LIR. (A blog post from Matthew Gaudet, “A Beginners Guide To SpiderMonkey’s MacroAssembler”, explains some of the pipeline.)

So at least I knew one way to get to the generated assembly code for one opcode, but it wasn’t particularly helpful. My co-worker suggested putting in no-op implementations for the other opcodes so that it would be able to compile all the self-hosted code (even if the generated code wouldn’t work). This seemed like the fastest way to get to a functioning REPL so I could experiment with simpler code snippets, and it worked. After just adding a no-op emit_ method in BaselineCodeGen.cpp for each opcode, the interpreter was able to start up.
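To make the difference between the two kinds of placeholder concrete, here is a minimal C++ sketch. It is not the SpiderMonkey source: the macro names UNSUPPORTED_OPCODE_SKETCH and NOOP_OPCODE_SKETCH and both structs are hypothetical, and where the real macro crashes the process, this version throws so the failure can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical macro in the spirit of UNSUPPORTED_OPCODE: it defines an
// emit_ method that fails as soon as it is reached. This sketch throws
// instead of crashing the process.
#define UNSUPPORTED_OPCODE_SKETCH(OP)                                \
  bool emit_##OP() {                                                 \
    throw std::runtime_error("Record and Tuple not supported: " #OP); \
  }

// Hypothetical macro for a placeholder: it emits nothing and reports
// success, so compilation of the surrounding code can continue.
#define NOOP_OPCODE_SKETCH(OP) \
  bool emit_##OP() { return true; }

// Stand-in for the code generator before the stubs were added.
struct CodeGenBeforeSketch {
  bool emit_InitRecord() { return true; }
  UNSUPPORTED_OPCODE_SKETCH(AddRecordProperty)
};

// Stand-in for the code generator after adding no-op stubs.
struct CodeGenAfterSketch {
  bool emit_InitRecord() { return true; }
  NOOP_OPCODE_SKETCH(AddRecordProperty)
  NOOP_OPCODE_SKETCH(FinishRecord)
};
```

Calling emit_AddRecordProperty() on CodeGenBeforeSketch throws, while the same call on CodeGenAfterSketch returns true without emitting anything. That is enough to get past startup, but it says nothing about whether the generated code would run correctly.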

When I typed code into the REPL, I could tell it was only being interpreted, not compiled, since everything still worked, and I would expect anything that used records/tuples except for an empty record literal to fail. I found the --baseline-eager flag with a little bit of digging, and:

obj-x64-debug/dist/bin/js --baseline-eager
js> function f() { return #{}; }
function f() { return #{}; }
js> f()
f()
Assertion failure: !BytecodeOpHasIC(op) (Missing entry in OpToFallbackKindTable for JOF_IC op), at js/src/jit/BaselineIC.cpp:353
Segmentation fault
$

Excellent! This pointed to something I didn’t change yesterday (since the compiler didn’t make me) — I had to update the OpToFallbackKindTable in BaselineIC.cpp.

Once I did that, I realized that I couldn’t get very far with just InitRecord, since I wouldn’t expect even the empty record to compile without being able to compile the FinishRecord opcode. (Since records are immutable, Nicolò’s implementation adds three opcodes for creating records: one to initialize the empty record, one to add a new record field, and one to finish initialization, the last of which marks the record as immutable so that no more fields can be added.)
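The three-step construction can be modeled in a few lines of C++. This is only a sketch of the protocol described above, not the engine's data structure: RecordSketch, its methods, and makeEmptyRecord are hypothetical names, and field values are plain ints to keep the example small.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a record under construction.
class RecordSketch {
 public:
  // Models InitRecord: start with an empty, still-mutable record.
  static RecordSketch init() { return RecordSketch(); }

  // Models AddRecordProperty: only legal before the record is finished.
  void addProperty(const std::string& key, int value) {
    if (finished_) {
      throw std::logic_error("record is already finished");
    }
    fields_[key] = value;
  }

  // Models FinishRecord: after this, no more fields can be added.
  void finish() { finished_ = true; }

  bool isFinished() const { return finished_; }
  std::size_t size() const { return fields_.size(); }

 private:
  RecordSketch() = default;
  std::map<std::string, int> fields_;
  bool finished_ = false;
};

// The empty literal #{} needs only the first and last steps.
inline RecordSketch makeEmptyRecord() {
  RecordSketch r = RecordSketch::init();
  r.finish();
  assert(r.size() == 0);
  return r;
}
```

In this model, even the empty record needs the finishing step, which matches the observation that InitRecord alone is not enough to compile #{}.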

So I implemented FinishRecord, similarly to the work from yesterday. Now what? I was able to type in an empty record literal without errors:

> x = #{}
#{}

But how do I know that x is bound to a well-formed record that satisfies its interface? There’s not too much you can do with an empty record. I decided to check that typeof(x) worked (it should return “record”), and got an assertion failure in the emitGuardNonDoubleType() method in CacheIRCompiler.cpp. It took me some time to make sense of various calls through generated code, but the issue was the TypeOfIRGenerator::tryAttachStub() method in CacheIR.cpp:

AttachDecision TypeOfIRGenerator::tryAttachStub() {
[...snip...]
  TRY_ATTACH(tryAttachPrimitive(valId));
  TRY_ATTACH(tryAttachObject(valId));

  MOZ_ASSERT_UNREACHABLE("Failed to attach TypeOf");
  return AttachDecision::NoAction;
}

This code decides, based on the type of the operand (valId), whether to use the typeOf code for primitives or for objects. The record/tuple implementation adds “object primitives”, which share some qualities with objects but aren’t objects (since, among other things, objects are mutable). The tryAttachPrimitive() call was successfully selecting the typeOf code for primitives, since the isPrimitive() method on the Value type returns true for object primitives. Because there was no explicit case in the code for records, the code for double values was getting called as a fallback, and that’s where the assertion failure was coming from. Tracking this down took much more time than actually implementing typeOf for records, which I proceeded to do. And now I can get the type of a record-valued variable in compiled code:

js> x = #{}
#{}
js> typeof(x)
"record"
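The shape of the typeof bug can be reproduced in a small C++ sketch. None of this is the CacheIR code: TagSketch, doublePathSketch, and the two typeOfPrimitive functions are hypothetical, and the failing guard is modeled as an exception rather than an assertion failure.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical value tags; the engine's Value type is far richer.
enum class TagSketch { Int32, Double, String, Object, Record };

// Records count as primitives here, even though they resemble objects.
inline bool isPrimitiveSketch(TagSketch tag) { return tag != TagSketch::Object; }

// Stand-in for the double path, which insists it was handed a double.
inline std::string doublePathSketch(TagSketch tag) {
  if (tag != TagSketch::Double) {
    throw std::logic_error("guard expected a double");
  }
  return "number";
}

// Before the fix: any primitive without its own case falls through to
// the double path.
inline std::string typeOfPrimitiveBeforeFix(TagSketch tag) {
  assert(isPrimitiveSketch(tag));
  switch (tag) {
    case TagSketch::Int32: return "number";
    case TagSketch::String: return "string";
    default: return doublePathSketch(tag);
  }
}

// After the fix: records get an explicit case.
inline std::string typeOfPrimitiveAfterFix(TagSketch tag) {
  assert(isPrimitiveSketch(tag));
  switch (tag) {
    case TagSketch::Int32: return "number";
    case TagSketch::String: return "string";
    case TagSketch::Record: return "record";
    default: return doublePathSketch(tag);
  }
}
```

Passing TagSketch::Record to the first function reaches the double guard and fails there, far from the missing case that caused it. That distance is roughly why tracking the problem down took longer than fixing it.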

This provides at least some evidence that the code I’m generating is laying out records properly. Next up, I’ll try implementing the opcode that adds record properties, so that I can test out non-empty records!

Adding record and tuple support to the JIT

November 9th, 2021

Today I started working on implementing the Record and Tuples proposal for JavaScript in the JIT in SpiderMonkey. All of this work is building on code written by Nicolò Ribaudo, which isn’t merged into SpiderMonkey yet but can be seen in patches linked from the Bugzilla bug.

Up until now, SpiderMonkey would automatically disable the JIT if you built it with the compile-time flag that enables records and tuples. Currently, the interpreter implements records and tuples, but not the compiler. I started by searching through the code to figure out how to re-enable the JIT, but realized it would be faster to look through the commit history, and found it in js/moz.configure. (If you try to follow along, you won’t be able to see some of the code I’m referring to since it’s in unapplied patches, but I’m including some links anyway to give context.)

I saw that if I just pass in the --enable-jit build flag explicitly, it should override what the config file said, and it indeed did. I decided to operate on the assumption that the compiler error messages would tell me what I needed to implement, which isn’t always a safe assumption when working in C/C++, but seems to have served me okay in my SpiderMonkey work so far.

The first set of compiler errors I got had to do with adding the IsTuple() built-in method to the LIR. (The MIR and LIR, two of the intermediate languages used in SpiderMonkey, are explained briefly on the SpiderMonkey documentation page.) This involved implementing the EmitObjectIsTuple() and visitIsTuple() methods in CodeGenerator.cpp, which translates LIR to assembly for the optimizing compiler (the documentation also explains the various compilers that make up the JIT). That was straightforward, since IsTuple() is just a predicate that returns true for tuple arguments and false for arguments of any other type. When I implemented this method before, I chose to implement it as a JS_INLINABLE_FN, not knowing what I was getting myself into. With the JIT disabled at compile time, the compiler made me implement it down to the MIR level, but now I had to implement it in LIR as well.

Once that was done, I ran the interpreter and got an assertion failure: "Hit MOZ_CRASH(Record and Tuple are not supported by jit) at gecko-fork/js/src/jit/BaselineCodeGen.cpp:2589". This was excellent, since it told me exactly where to start. When I looked at BaselineCodeGen.cpp, I saw that the seven opcodes for records and tuples were all defined with the UNSUPPORTED_OPCODE macro, so I planned to proceed by removing each of the UNSUPPORTED_OPCODE calls one-by-one and seeing what that forced me to implement.

I started with the InitRecord opcode, which, as you might guess, creates a new record with a specified number of fields. As a strategy, I followed the pattern for the existing NewArray and NewObject opcodes, since creating new arrays and objects is similar to creating new records.

By following the error messages, I found the files that I needed to change; I’m putting this list in logical order rather than in the order that the compile errors came up, which was quite different.

  • VMFunctionList-inl.h — added the RecordType::createUninitialized C++ function to the list of functions that can be called from the JIT
  • VMFunctions.h — added a TypeToDataType case for the RecordType C++ type
  • BaselineCodeGen.cpp, where I added an emit method for InitRecord
  • BaselineIC.cpp and CacheIR.cpp, where I added code to support inline caching for InitRecord.
  • MIROps.yaml, the file that defines all MIR opcodes; a lot of other code is automatically generated from this file. I had to add a new InitRecord opcode.
  • MIR.h, where I had to define a new MInitRecord class, and MIR.cpp, where I had to implement its methods.
  • Lowering.cpp, where I added code to translate the MIR representation for an InitRecord call to LIR.
  • LIROps.yaml, similarly to MIROps.yaml.
  • CodeGenerator.cpp, where I added the visitInitRecord method that translates the LIR code to assembly.
  • Recover.cpp — while I don’t understand this code very well, I think it’s what implements the “bailout” mechanism described in the docs. Similarly to the other modules, I had to add methods for InitRecord and a new class to the accompanying header file.

I love compiler errors! Without static typechecking, I wouldn’t have any information about what parts of the code I needed to change to add a new feature. As a functional programmer, I normally don’t give C++ a lot of credit for static typechecking, but whether it’s about modern language features or the coding style used in SpiderMonkey (or both), I actually find that I get a lot of helpful information from type error messages when working on SpiderMonkey. Without static type errors, I would have had to understand the JIT from the top down to know what parts I needed to change, maybe by reading through the code (slow and tedious) or maybe by reading through documentation (likely to be out of date). Types are documentation that can’t fall out of date, since the compiler won’t generate code for you if you give it something that doesn’t typecheck.

Once everything compiled and I started the interpreter again, I got a different assertion failure:

"Assertion failure: BytecodeOpHasIC(op), at /home/tjc/gecko-fork/js/src/jit/BaselineCodeGen.cpp:649"

This pointed to the final change, in BytecodeLocation.h. I had added the code for inline caching, but hadn’t updated the opcode table defined in this file to indicate that the InitRecord opcode had an inline cache. Since the relationship between this table and the code itself exists only in the programmers’ heads, there’s no way for the compiler to check this for us.
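Here is a minimal sketch of how such a mismatch can arise, assuming a hand-maintained table on one side and the code generator's behavior on the other. OpSketch, kHasICTable, and the three functions are hypothetical and much simpler than the real opcode table.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical opcodes; Count is only used to size the table.
enum class OpSketch : std::size_t { NewArray, NewObject, InitRecord, Count };

// Hand-maintained table: does this opcode have an inline cache?
// Nothing ties these entries to emitsICSketch below.
constexpr std::array<bool, static_cast<std::size_t>(OpSketch::Count)>
    kHasICTable = {
        true,   // NewArray
        true,   // NewObject
        false,  // InitRecord: stale entry, not updated with the IC code
};

// What the table claims about an opcode.
inline bool opHasICSketch(OpSketch op) {
  return kHasICTable[static_cast<std::size_t>(op)];
}

// What the code generator actually does for each opcode.
inline bool emitsICSketch(OpSketch op) {
  switch (op) {
    case OpSketch::NewArray:
    case OpSketch::NewObject:
    case OpSketch::InitRecord:
      return true;
    case OpSketch::Count:
      break;
  }
  return false;
}

// The two sources of truth can only be compared at run time.
inline bool tableAgreesWithCode(OpSketch op) {
  return opHasICSketch(op) == emitsICSketch(op);
}
```

Both halves typecheck on their own, so the disagreement for InitRecord only surfaces when a check like tableAgreesWithCode runs, much like the assertion that fired at startup.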

Once I fixed this and started the interpreter again, I got a new error:

Hit MOZ_CRASH(Record and Tuple are not supported by jit) at /home/tjc/gecko-fork/js/src/jit/BaselineCodeGen.cpp:2604
Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x000055555809ce62 in js::jit::BaselineCodeGen::emit_AddRecordProperty (this=0x7fffffffd080)
    at /home/tjc/gecko-fork/js/src/jit/BaselineCodeGen.cpp:2604
2604      UNSUPPORTED_OPCODE(AddRecordProperty)

This is just saying that AddRecordProperty is an unsupported opcode, which is what I would expect since I only implemented one of the record/tuple opcodes. So that means that after my changes, SpiderMonkey was able to generate code for the InitRecord opcode. (The reason why these errors showed up as soon as I launched the interpreter, without having to execute any code, is that at startup time with JIT enabled, the interpreter compiles all the self-hosted libraries, which are implemented in JavaScript. Since on my working branch, there is library code that uses the Record and Tuple types, that means that the code path leading to those UNSUPPORTED_OPCODES was guaranteed to be reached.)

So what do I know now? The JIT seems to be able to generate code for the InitRecord opcode, at least for the first occurrence of it in the self-hosted libraries. Whether that code works (that is, implements the semantics in the spec) is a separate question. To know the answer, I would have to look at the generated code — I won’t be able to actually test any code in the interpreter until I implement all the opcodes, since each one will subsequently fail with the same error message as above. But that’s for another day.

Hello, world!

November 8th, 2021

It’s been a long time since I’ve blogged regularly, especially about software. When I worked on the Rust team, I wrote an update post at the end of every day about what I’d worked on, every day I possibly could. I’m going to try to do that again. I joined the Compilers team at Igalia this past September and am currently working on implementing new JavaScript features in the SpiderMonkey JavaScript engine; at the moment, that means the Records and Tuples proposal, which would add immutable data types to JavaScript. As much as possible, I’m going to document how I spend each work day and what problems arise. This is mostly for me (so that I don’t look back and wonder what I did all month), but if anyone else happens to find it interesting, that’s an added bonus.