on impostor syndrome, or: worry dies last

June 24th, 2022

According to Sedgwick, it was just this kind of interchange that fueled her emotional re-education. She came to see that the quickness of her mind was actually holding back her progress, because she expected emotional change to be as easy to master as a new theory: “It’s hard to recognize that your whole being, your soul doesn’t move at the speed of your cognition,” she told me. “That it could take you a year to really know something that you intellectually believe in a second.” She learned “how not to feel ashamed of the amount of time things take, or the recalcitrance of emotional or personal change.”

Maria Russo, “The reeducation of a queer theorist”, 1999

My colleague Ioanna Dimitriou told me “worry dies last”, and it made me remember this passage from an interview with Eve Kosofsky Sedgwick.

It’s especially common in fields where people’s work is constantly under review by talented peers, such as academia or Open Source Software, or taking on a new job.

Geek Feminism Wiki, “Impostor Syndrome”

At the end of 2012/beginning of 2013 I wrote a four-part blog post about my experiences with impostor syndrome. That led to me getting invited to speak on episode 113 of the “Ruby Rogues” podcast, which was dedicated to impostor syndrome. (Unfortunately, from what I can tell, their web site is gone.)

Since then, my thinking about impostor syndrome has changed.

“Impostor syndrome” is an entirely rational behavior for folks who do get called impostors (ie. many underrepresented people). It’s part coping mechanism, part just listening to the feedback you’re getting….

We call it “impostor syndrome”, but we’re not sick. The real sickness is an industry that calls itself a meritocracy but over and over and over fails to actually reward merit.

This is fixable. It will take doing the work of rooting out bias in all its forms, at all levels – and critically, in who gets chosen to level up. So let’s get to work.

Leigh Honeywell, “Impostor Syndrome”, 2016

I agree with everything Leigh wrote here. Impostor syndrome, like any response to past or ongoing trauma, is not a pathology. It’s a reasonable adaptation to an environment that places stresses on your mind and body that exhaust your resources for coping with those demands. I wrote a broader post about this point in 2016, called “Stop Moralizing About Personality Traits”.

Acceptance is the first step towards change. By now, I’ve spent over a decade consciously reckoning with the repercussions of growing up and into young adulthood without emotional support, both on the micro-level (family and intimate relationships) and the macro-level (being a perennial outsider with no home on either side of a variety of social borders: for example, those of gender, sexuality, disability, culture, and nationality). When I started my current job last year, I wasn’t over it. That made it unnecessarily hard to get started and put up a wall between me and any number of people who might have offered help if they’d only known what I was going through. I’m still not over it.

To recognize, and name as a problem, the extent to which my personality has been shaped by unfair social circumstances: that was one step. Contrary to my acculturation as an engineer, the next step is not “fix the problem”. In fact, there is no patch you can simply apply to your own inner operating system, because all of your conscious thoughts run in user space. Maybe you can attach a debugger to your own kernel, but some changes can’t be made to a running program without a cold reboot. I don’t recommend trying that at home.

Learning to identify impostor syndrome (or, as you might call it, “dysfunctional environment syndrome”; or, generalizing, “complex trauma” or “structural violence”) is one step, but a bug report isn’t the same thing as a passing regression test. As with free software, improvement has to come a little bit at a time, from many different contributors; there are few successful projects with a single maintainer.

I am ashamed of the amount of time things take, of looking like a senior professional on the outside as long as my peers don’t know (or aren’t thinking about) how I’ve never had a single job in tech for more than two years, of what it was like for me to move from job to job, never picking up enough momentum to accomplish anything that felt real to me. I wonder whether they think I’ve got it all figured out, which I don’t, but it often feels easier to just let people think that and suffer in silence. Learning to live with trauma requires trusting relationships; you can’t do it on your own. But the trauma itself makes it difficult, if not impossible, to trust and to enter into genuine relationships.

I am not exaggerating when I say that my career has been traumatic for me; it has both echoed much older traumas and created entirely new ones. That’s a big part of why I had to share how I felt about finally meeting more of my co-workers in person. I’m 41 years old and I feel like I should be better at this by now. I’m not. But I’ll keep trying, if it takes a lifetime.

we belong

June 23rd, 2022

I am about 70% robot and 30% extremely sentimental and emotional person, generally in series rather than in parallel. But last week’s Igalia summit was a tidal wave of feelings, unexpected but completely welcome. Some of those feelings are ones I’ve already shared with the exact people who need to know, but there are some that I need to share with the Internet. I am hoping I’m not the only one who feels this way, though I don’t think I am.

A lot of us are new and this was our first summit. Meeting 60 or 70 people who were previously 2D faces on a screen for half an hour a week, at best, was intense. I was told by others that reuniting with long-time friends/colleagues/comrades/whatever words you want to use (and it’s hard to find the exact right one for a workplace like this) who they hadn’t seen since pre-pandemic was intense as well.

For me, there was more to it. I doubt I’m alone in this either, but it might explain why I’m feeling so strongly.

I tried to quit tech in 2015. I couldn’t afford to in the end, and I went to Google. They fired me for (allegedly) discriminating against white men, in late 2017. I decided it was time to quit again. I became an EMT and then a patient care coordinator, and applied to nursing schools. I got rejected. I decided I didn’t want to try again because I had learned that unless I became a physician, working in health care would never give me the respect I need. Unfortunately, I have an ego. I like to think that I balance it out with empathy more than some people in tech do, but it’s still there.

I got a DM in 2018 from some guy I knew from Twitter asking if I wanted to apply to Igalia, and I waited three years to take him up on it. Now I’m here.

Getting started wasn’t easy. The two weeks working from the office before the summit wasn’t easy either. But it all fell away sometime between Wednesday and Friday of last week, and quite unexpectedly, I decided I’m moving to Europe as soon as I can, probably to A Coruña (where Igalia’s headquarters is) at first but who knows where life will take me next? Listing all the reasons would take too long. But: I found a safe space, one where I felt welcome, accepted, like I belonged. It’s not a perfect one; I stood up during one of the meetings and expressed my pain at the dissonance between the comfort I feel here and the knowledge that most of the faces in the room were white and most belonged to men. I want to say we’re working on it, but our responsibility is to finish the work, not to feel good that we’ve started it. That’s true for writing code to deliver to a customer, and it’s true for achieving fairness.

I am old enough now to accommodate multiple conflicting truths. My desire to improve the unfairness, and to get other people to open their hearts enough to risk all-consuming rage at just how unfair things can be, those things coexist with my joy about finding a group of such consistently caring, thoughtful, and justice-minded people — ones who don’t seem to mind having me around.

I’m normally severely allergic to words like “love” and “family” in a corporate context. As an early childhood trauma survivor, these words are fraught, and I would rather keep things at work a bit more chill. At the same time, when I heard Igalians use these words during the summit to talk about our collective, it didn’t feel as menacing as it usually does. Maybe the right word to use here — the thing that we really mean when we generalize the words “love” and “family” because we’ve been taught (incorrectly) that it can only come from our lovers or parents — is “safety”. Safety is one of the most underrated concepts there is. Feeling safe means believing that you can rely on the people around you, that they know where you’re coming from or else if they don’t, that they’re willing to try to find out, that they’re willing to be changed by what happens if they do find out. I came in apprehensive. But in little ways and in big ways, I found safe people, not just one or two but a lot.

I could say more, but if I did, I might never stop. To channel the teenaged energy that I’m feeling right now (partly due to reconnecting with that version of myself who loved computers and longed to find other people who did too), I’ll include some songs that convey how I feel about this week. I don’t know if this will ring true for anyone else, but I have to try.

Allette Brooks, “Silicon Valley Rebel”

We lean her bike along the office floor
They never know what to expect shaved into the back of her head when she walks in the door
And she says ‘I don’t believe in working like that for a company
It’s not like they care about you
It’s not like they care about me’

Please don’t leave us here alone in this silicon hell, oh
Life would be so unbearable without your rebel yell...

Vienna Teng, “Level Up”

Call it any name you need
Call it your 2.0, your rebirth, whatever –
So long as you can feel it all
So long as all your doors are flung wide
Call it your day number 1 in the rest of forever

If you are afraid, give more
If you are alive, give more now
Everybody here has seams and scars

Namoli Brennet, “We Belong”

Here's to all the tough girls
And here's to all the sensitive boys
We Belong
Here's to all the rejects
And here's to all the misfits
We Belong

Here's to all the brains and the geeks
And here's to all the made up freaks, yeah
We Belong

And when the same old voices say
That we'd be better off running away
We belong, We belong anyway

The Nields, “Easy People”

You let me be who I want to be

Bob Franke, “Thanksgiving Eve”

What can you do with your days
But work and hope
Let your dreams bind your work to your play

And most of all, the Mountain Goats, “Color in Your Cheeks”

They came in by the dozens, walking or crawling
Some were bright-eyed, some were dead on their feet
And they came from Zimbabwe, or from Soviet Georgia
East Saint Louis, or from Paris, or they lived across the street
But they came, and when they finally made it here
It was the least that we could do to make our welcome clear

Come on in
We haven't slept for weeks
Drink some of this
This'll put color in your cheeks

This is a different kind of post from the ones I was originally planning to do on this blog. And I never thought I’d be talking about my job this way. Life comes at you fast. To return to the Allette Brooks lyric: at a co-op, there’s no “they” that’s separate from “you” and “me”. It’s just “you” and “me”, and we care about each other. It turns out that safe spaces and cooperative structure aren’t just political ideas that happen to correlate — in a company, neither can exist without the other. It’s not a safe space if you can get fired over one person’s petty grievance, like being reminded that white men don’t understand everything. Inversely, cooperative structure can’t work without deep trust. Trust is hard to scale, and as Igalia grows I worry about what will happen (doubtless, the people who were here when it was a tenth the size have a different view). There is no guarantee of success, but I want to be one of the ones to try.

And we’re hiring. I know how hard it is to try again when you’ve been humiliated, betrayed, and disappointed at work before, something that’s more common than not in tech when you don’t look like, sound like, or feel like everybody else. I’m here because somebody believed in me. I’m glad both that they did and that I was able to return that leap of faith and believe that I truly was believed in. And I would like to pass along the favor to you. Of course, to do that, I have to get to know you a little bit first. As long as I continue to have some time, I want to talk to people in groups that are systematically underrepresented in tech who may be intrigued by what I wrote here but aren’t sure if they’re good enough. My email is tjc at (obvious domain name) and the jobs page is at https://www.igalia.com/jobs. Even if you don’t see a technical role there that exactly fits you, please don’t let that stop you from reaching out; what matters most is willingness to learn and to tolerate the sometimes-painful, always-rewarding process of creating something together with mutual consent rather than coercion.

Records and objects: both the same and different

January 29th, 2022

I wrote in my most recent “daily” post about the choices involved in representing records in MIR and LIR, Warp’s intermediate languages. Since then, more questions occurred to me.

MIR has a simple type system and some internal typechecking, which is helpful. LIR erases most type details but does have an even simpler type system that distinguishes JavaScript Values from raw integers/pointers/etc. I already added a Record type to MIR; an alternative would have been to give the NewRecord operation the result type Object, reflecting that records are (currently) represented as objects, but I thought that more internal type checks are always a good thing. Without a separate Record type, it’s theoretically possible that compiler bugs could result in record fields being mutable (of course, there are still ways such bugs could arise even with MIR typechecking, but it’s one more thing that makes it less likely).

I realized there’s a problem with this approach: if I want to take option 2 in the previous post — translating FinishRecord into MIR by breaking it down into smaller operations like Elements, LoadFixedSlot, etc. — it’s not possible to generate type-correct MIR. These operations expect an Object as an operand, but I want them to work on Records.

The point of representing records as objects was (at least in part) to reuse existing code for objects, but that goal exists in tension with distinguishing records from objects and keeping the MIR type system simple. Of course, we could add some form of subtyping to MIR, but that’s probably a bad idea.

So orthogonally to the three options I mentioned in my previous post, there’s another set of three options:

  1. Add Record as an MIR type, like I started out doing; don’t re-use MIR instructions like Elements for records, instead making those operations explicit in CodeGenerator, the final translation from LIR to assembly.
  2. Don’t add any new MIR types; give NewRecord the result type Object and FinishRecord the argument and result types Object
  3. Compile away records when translating JS to MIR, adding whatever new MIR instructions are necessary to make this possible; in this approach, records would desugar to Objects and any operations needed to enforce record invariants would be made explicit.

Option 1 rules out some code re-use and loses some internal checking, since more of the heavy lifting happens in the untyped world. Option 2 allows more code re-use, but also loses some internal checking, since it would be unsound to allow every Object operation to work on records. Option 3 has the advantage that only the JS-to-MIR translation (WarpBuilder / WarpCacheIRTranspiler) would need to be modified, but seems riskier than the first two options with respect to type-correctness bugs — it seems like debugging could potentially be hard if records were collapsed into objects, too.

A colleague pointed out that MIR already has some types that are similar to other types, but where certain additional invariants need to be upheld, such as RefOrNull (used in implementing WebAssembly); this type is effectively equivalent to “pointer”, but keeping it a separate type allows for the representation to change later if the WebAssembly implementors want to. Record seems very similar; we want to avoid committing to the Object representation more than necessary.

My colleague also suggested adding an explicit RecordToObject conversion, which would allow all the Object operations to be re-used for records without breaking MIR typechecking. As long as there’s no need to add an ObjectToRecord conversion (which I don’t think there is), this seems like the most appealing choice, combined with option 1 above. There’s still potential for bugs, e.g. if a compiler bug generates code that applies RecordToObject to a record and then does arbitrary object operations on it, such as mutating fields. Ideally, RecordToObject would produce something that can only be consumed by reads, not writes, but this isn’t practical to do within MIR’s existing type system.
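To make this concrete in the made-up MIR syntax from my previous post, a record field read could then reuse the existing object instructions, with RecordToObject as the only new op (the exact instruction shapes here are a sketch, not actual MIR):

  o = RecordToObject r
  e = Elements o
  v = LoadElement e i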

(All of the same considerations come up for tuples as well; I just started with records, for no particular reason.)

I don’t think there’s any way to know which approach is best without having a working prototype, so I’m just going to try implementing this and seeing what happens.

JavaScript is a very interesting language

January 28th, 2022

The other day I learned about indexed setters in JavaScript. It hadn’t occurred to me that you can do the following:

var z = 5;
Object.defineProperty(Array.prototype, 0, { set: function(y) { z = 42; }});

and the interpreter will happily accept it. This code mutates the global object called Array.prototype, overriding the (implicit) setter method for the property '0'. If I’m to compare this with a more conventional programming language (using C-like syntax for illustrative purposes), this would be like if you could write:

int n = /* ... */;
int a[n];
a[0] = 5;

but the semantics of the last assignment statement could involve arbitrary side effects, and not necessarily mutating the 0th element of a, depending on the value of some global variable.

Most languages don’t allow you to customize this behavior. In JavaScript, it wasn’t an intentional design decision (as far as I know), but a consequence of a few design choices interacting:

  • Everything is an object, which is to say a bag of key-value pairs (properties)
  • What’s more, properties aren’t required to be simple lookups, but can be defined by arbitrary accessor and mutator functions (setters and getters) that can interact with global mutable state
  • Numbers and strings can be cast to each other implicitly
  • Inheritance is prototype-based, prototypes are mutable, and built-in types aren’t special (their operations are defined by prototypes that user code can mutate)

It’s not immediately obvious that together, these decisions imply user-defined semantics for array writes. But if an array is just a bag of key-value pairs where the keys happen to be integers (not a contiguous aligned block of memory), and if numbers are implicitly coerced to strings when used as property keys (so 0 and “0” name the same property), and if specifications for built-in types can be dynamically modified by anybody, then there’s nothing to stop you from writing custom setters for array indices.
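Here’s a small example of these decisions conspiring; it runs in any standard JavaScript engine (print() is the shell’s output function):

var z = 5;
// The property key 0 is implicitly converted to the string "0"
Object.defineProperty(Array.prototype, 0, { set: function(v) { z = v; } });

var a = [];
a[0] = 42;       // a has no own property "0", so the write finds the
                 // inherited setter on Array.prototype and calls it
print(z);        // 42
print(a[0]);     // undefined: no element was ever stored in a
print(a.length); // 0: the array itself was never touched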

To someone used to modern statically typed languages, this is remarkable. Programming is communication between humans before it’s communication with machines, and communication is hard when you say “rat” when you really mean “bear” but you won’t tell your conversation partners that that’s what you mean and instead expect them to look it up in a little notebook that you carry around in your pocket (and regularly erase and rewrite). If that’s what you’re going to do, you had better be prepared for your hiking partners to react with less alarm than is warranted when you say “hey, there’s a rat over there.” If your goal is to be understood, it pays to think about how other people will interpret what you say, even if that conflicts with being able to say whatever you want.

And if you’re trying to understand someone else’s program (including if that someone else is you, a month ago), local reasoning is really useful; mental static analysis is hard if random behavior gets overridden in the shadows. More concretely, the kind of dynamic binding that JavaScript allows makes it hard to implement libraries in user space. For example, if you want to build a linked list abstraction and use arrays as the underlying representation, the semantics of your library is completely dependent on whatever modifications anybody in scope has made to the meaning of the array operations. So that’s not an abstraction at all, since it forces people who just want to use your data structure to think about its representation.
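As a concrete version of that point, here’s a sketch of a tiny stack “library” backed by an array, and a client that breaks it from a distance:

function makeStack() {
  var items = [];
  return {
    push: function(x) { items[items.length] = x; },
    pop: function() { return items.pop(); },
  };
}

// Meanwhile, in completely unrelated code:
Object.defineProperty(Array.prototype, 0, { set: function(v) { /* discard */ } });

var s = makeStack();
s.push("hello");  // the write to items[0] hits the prototype setter instead
print(s.pop());   // undefined: the representation leaked through the API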

This came up for me in my work in progress implementing the tuple prototype methods — the most straightforward ways to re-use existing array library code to build an immutable tuple abstraction don’t work, because of the ability users have to override the meaning of any property of an array, including length and indexed elements. The workarounds I used depend on the code being in a self-hosted library, meaning that it has to live inside the compiler. It makes sense for the record and tuple implementations to live inside the compiler for other reasons (it would be hard for a userland library to guarantee immutability), but you can easily imagine data structures that don’t need to be built into the language, but nevertheless can’t take full advantage of code re-use if they’re implemented as libraries.

Now, plenty of people who use and implement JavaScript agree with me; perhaps even most. The problem is that existing code may depend on this behavior, so it’s very difficult to justify changing the language to reject programs that were once accepted. And that’s all right; it’s a full employment guarantee for compiler writers.

Even so, I can’t help but wonder what code that depends on this functionality — and it must be out there — is actually doing; the example I showed above is contrived, but surely there’s code out there that would break if array setters couldn’t be overridden, and I’d like to understand why somebody chose to write their code that way. I’m inclined to think that this kind of code springs from dependence on prototype inheritance as the only mechanism for code reuse (how much JS imposes that and how much has to do with individual imaginations, I can’t say), but maybe there are other reasons to do this that I’m not thinking of.

I also wonder if anyone would stand up for this design choice if there was an opportunity to magically redesign JS from scratch, with no worries about backwards compatibility; I’m inclined to think there must be someone who would, since dynamic languages have their advocates. I have yet to find an argument for dynamic checking that wasn’t really just an argument for better type systems; if you’re writing a program, there’s a constraint simpler than the program itself that describes the data the program is supposed to operate on. The “static vs. dynamic” debate isn’t about whether those constraints exist or not — it’s about whether they exist only in the mind of the person who wrote the code, or if they’re documented for others to understand and potentially modify. I want to encourage people to write code that invites contributors rather than using tacit knowledge to concentrate power within a clique, and that’s what motivates me to value statically typed languages with strong guarantees. I’ve never seen a substantive argument against them.

But easier than persuading people is to wear my implementor hat and use types as much as possible underneath an abstraction barrier, improving both performance and reliability while still making it look like users are getting what they want.

“The sad truth is, there’s very little that’s creative in creativity. The vast majority is submission–submission to the laws of grammar, to the possibilities of rhetoric, to the grammar of narrative, to narrative’s various and possible structurings. In a society that privileges individuality, self-reliance, and mastery, submission is a frightening thing.”
— Samuel R. Delany, “Some Notes for the Intermediate and Advanced Creative Writing Student”, in About Writing

Even when languages offer many degrees of freedom, most people find it more manageable to pick a set of constraints and live within them. Programming languages research offers many ways for people who want to declare their constraints to do so within the framework of an existing, less constrained language, as well as ways to infer those constraints when they are left implicit. When it comes to putting those ideas into practice, there’s more than enough work to keep me employed in this industry for as long as I want to be.

Implementing records in Warp

January 20th, 2022

Toward the goal of implementing records and tuples in Warp, I’m starting with code generation for empty records. The two modules I’ve been staring at the most are WarpBuilder.cpp and WarpCacheIRTranspiler.cpp.

While the baseline compiler translates bytecode and CacheIR directly into assembly code (using the MacroAssembler), the optimizing compiler (Warp) uses two intermediate languages: it translates bytecode and CacheIR to MIR; MIR to LIR; and then LIR to assembly code.

As explained in more detail here, WarpBuilder takes a snapshot (generated by another module, WarpOracle.cpp) of running code, and for each bytecode, it generates MIR instructions, either from CacheIR (for bytecode ops that can have inline caches), or directly. For ops that can be cached, WarpBuilder calls its own buildIC() method, which in turn calls the TranspileCacheIRToMIR() method in WarpCacheIRTranspiler.

A comment in WarpBuilderShared.h says “Because this code is used by WarpCacheIRTranspiler we should generally assume that we only have access to the current basic block.” From that, I’m inferring that WarpCacheIRTranspiler maps each CacheIR op onto exactly one basic block. In addition, the addEffectful() method in WarpCacheIRTranspiler enforces that each basic block contains at most one effectful instruction.

In the baseline JIT implementation that I already finished, the InitRecord and FinishRecord bytecodes each have their own corresponding CacheIR ops; I made this choice by looking at how existing ops like NewArray were implemented, though in all of these cases, I’m still not sure I fully understand what the benefit of caching is (rather than just generating code) — my understanding of inline caching is that it’s an optimization to avoid method lookups when polymorphic code is instantiated repeatedly at the same type, and in all of these cases, there’s no type-based polymorphism.

I could go ahead and add InitRecord and FinishRecord into MIR and LIR as well; this would be similar to my existing code where the BaselineCacheIRCompiler compiles these operations to assembly. To implement these operations in Warp, I would add code to CodeGenerator.cpp (the module that compiles LIR to assembly) similar to what is currently in the BaselineCacheIRCompiler.

But, MIR includes some lower-level operations that aren’t present in CacheIR — most relevantly to me, operations for manipulating ObjectElements fields: Elements, SetInitializedLength, and so on. Using these operations (and adding a few more similar ones), I could translate FinishRecord to a series of simpler MIR operations, rather than adding it to MIR. To be more concrete, it would look something like:

(CacheIR)

FinishRecord r

== WarpCacheIRTranspiler ==>

(MIR)

e = Elements r
Freeze e
sortedKeys = LoadFixedSlot r SORTED_KEYS_SLOT
sortedKeysElements = Elements sortedKeys
CallShrinkCapacityToInitializedLength sortedKeys
SetNonWritableArrayLength sortedKeysElements
recordInitializedLength = InitializedLength r
SetArrayLength sortedKeysElements recordInitializedLength
CallSort sortedKeys

(I’m making up a concrete syntax for MIR.)

This would encapsulate the operations involved in finishing a record, primarily sorting the keys array and setting flags to ensure that the record and its sorted keys array are read-only. Several of these are already present in MIR, and the others would be easy to add, following existing operations as a template.

The problem with this approach is that FinishRecord in CacheIR would map onto multiple effectful MIR instructions, so I can’t just add a case for it in WarpCacheIRTranspiler.

I could also push the lower-level operations up into CacheIR, but I don’t know if that’s a good idea, since presumably there’s a reason why it hasn’t been done already.

To summarize, the options I’m considering are:

  1. Pass down InitRecord and FinishRecord through the pipeline by adding them to MIR and LIR
  2. Open up FinishRecord (InitRecord isn’t as complicated) in the translation to MIR, which might involve making FinishRecord non-cacheable altogether
  3. Open up FinishRecord in the translation to CacheIR, by adding more lower-level operations into CacheIR

I’ll have to do more research and check my assumptions before making a decision. A bigger question I’m wondering about is how to determine if it’s worth it to implement a particular operation in CacheIR at all; maybe I’m going about things the wrong way by adding the record/tuple opcodes into CacheIR right away, and instead I should just be implementing code generation and defer anything else until benchmarks exist?

Tuple prototype methods ☑️

January 18th, 2022

It’s been a little while. But since my last post, Nicolò’s patches implementing records and tuples in SpiderMonkey landed, which means that I was able to submit my patches implementing the prototype methods for the Tuple type. That code is awaiting review.

Some initial review comments that I received focused on concerns about changing existing code that implements Arrays. Immutable and mutable sequences support a lot of the same operations, so it made sense to re-use existing Array code in SpiderMonkey rather than reinventing the wheel. Section 8.2.3 of the Records and Tuples proposal lists the tuple prototype methods and, for many of them, specifies that the behavior should be the same as what’s already specified for the equivalent Array method.

In some cases, code reuse required changing existing code. Built-in JavaScript types can have their methods implemented in SpiderMonkey in two different ways: self-hosted (implemented in JavaScript as a library) or native (implemented inside the compiler, in C++). I implemented most of the tuple methods as self-hosted methods, which can easily call existing self-hosted methods for arrays. In some cases, I thought a native implementation would be more efficient: for example, for the toReversed() method, which returns a reversed copy of a given tuple. (Since tuples are immutable, it can’t reverse in-place.) There’s already an efficient C++ implementation of reverse for arrays, and because of how tuples are currently implemented in SpiderMonkey (using the NativeObject C++ type, which is the same underlying type that represents arrays and most other built-in types), making it work for tuples as well just required changing the method’s precondition, not the implementation. However, the maintainers were reluctant to allow any changes to the built-in array methods. I replaced that with a straightforward self-hosted implementation of toReversed(); it seems a shame to not take advantage of existing optimized code, but perhaps I was falling prey to premature optimization. We don’t have any performance benchmarks for code that uses tuples yet, and without profiling, it’s impossible to know what will actually be performance-critical.
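For illustration, the self-hosted version is morally something like the following sketch; ThisTupleValue and NewTupleFromArray are hypothetical stand-ins for whatever intrinsics the real patch uses (DefineDataProperty is a real self-hosting intrinsic):

function TupleToReversed() {
  var tuple = ThisTupleValue(this);     // hypothetical: check and unwrap `this`
  var len = tuple.length;
  var elements = [];
  for (var i = 0; i < len; i++)
    DefineDataProperty(elements, i, tuple[len - 1 - i]);
  return NewTupleFromArray(elements);   // hypothetical constructor intrinsic
}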

While I await further review, I’ll be learning about WarpMonkey, the next tier of optimizing JIT in SpiderMonkey. The work I described in my previous posts was all about implementing records and tuples in the baseline compiler, and that work is finished (I’ll hold off on submitting it for review until after the prototype methods code has been accepted). I expect this to be where the real fun is!

Fun with pointer arithmetic

December 9th, 2021

[Diagram: the layout of the NativeObject and RecordType types]

A picture is worth a thousand words, they say.

Where I left off yesterday, I was trying to figure out why my generated AddRecordProperty code was crashing. I was still using the simplest possible test case, a record with one field:

function f() { x = #{"a": 1}; }

Fixed slots

My code was writing a literal zero into the record’s “initialized length” slot:

store32(Imm32(0), Address(result, NativeObject::getFixedSlotOffset(RecordType::INITIALIZED_LENGTH_SLOT)));

But this should have been:

  storeValue(Int32Value(0), Address(result, NativeObject::getFixedSlotOffset(RecordType::INITIALIZED_LENGTH_SLOT)));

In the drawing, I indicated that offset 24 of a RecordType is a Value (JS::Value) denoting the number of elements that have been initialized so far. While it’s an invariant that this will actually be an integer value, as far as the compiler is concerned, the representation is of a Value, which has a different bit pattern from the integer 0.

Some existing code in RecordType::createUninitializedRecord() (this is code that isn’t upstream yet) should have been a clue:

  uint32_t length = getFixedSlot(INITIALIZED_LENGTH_SLOT).toPrivateUint32();

To get an unsigned int32, we call the Value method toPrivateUint32(), which returns an integer when called on an integer value.

Moreover, the getFixedSlot() method of NativeObject also returns a Value, which should have been a pretty good hint to me that fixed slots are Values:

const Value& getFixedSlot(uint32_t slot) const;

(NativeObject.h)

Observing the length field

Supposing that the register %rcx points to a record, I would like to be able to execute:

call js::DumpValue(((RecordType*) $rcx)->getFixedSlot(INITIALIZED_LENGTH_SLOT))

in gdb. (Where INITIALIZED_LENGTH_SLOT is defined as 0, since it happens to be the first fixed slot in this object.) Casting the value in %rcx to RecordType is necessary to tell gdb where the struct fields begin and end, but from there, I would have thought there would be enough debug information for it to know that RecordType inherits from NativeObject, which has a getFixedSlot() method.

Since I can’t do that, the next best thing is:

(gdb) call js::DumpValue( (JS::Value::fromRawBits (*($rcx + 24)) ))
0

And that works — it prints 0, which is what I would expect for a record with no initialized fields. Effectively, I inlined getFixedSlot(), which accesses offset 24 from the object. Then, JS::Value::fromRawBits decodes the tagged pointer that represents a Value, and DumpValue() pretty-prints it.

Observing the sortedKeys field

Looking at the picture again, records have a second fixed slot that’s a Value that is guaranteed (assuming the compiler works) to correspond to an ArrayObject, which just contains the record keys, in sorted order. I knew that my code was temporarily storing the value of this slot in register %rbx (as before, I figured this out by putting breakpoints in the code generation methods and looking at the values of various variables), so if I do:

call js::DumpValue ((JS::Value::fromRawBits($rbx)))

in gdb, I get output that’s something like <Array object at 232d468007f8>

But for more detail, I can do:

(gdb) call js::DumpObject (& ((JS::Value::fromRawBits($rbx)).toObject()))
object 29574ae007f0
  global 21bd66c40030 [global]
  class 5555590a8770 Array
  shape 21bd66c66320
  flags:
  proto 
  properties:
    [Latin 1]"length" (map 21bd66c62670/0 writable )

which interprets the Value as an Object and uses DumpObject to print out more details about its representation.

Observing the record itself

Having seen that the individual fixed slots of the record seemed to be correct, I wanted to debug my generated code for creating uninitialized records to see what the entire record object looked like. Knowing that the record was stored in %rcx, I figured out that I could do:

(gdb) call js::DumpObject (& ((JS::Value::fromRawBits($rcx)).toExtendedPrimitive()))
object 29574ae007b0
  global 21bd66c40030 [global]
  class 5555590b7d10 record
  shape 21bd66c663c0
  flags:
  proto null
  reserved slots:
      0 : 0
      1 : 
      2 : false
  properties:
  elements:
      0: false
      1: Assertion failure: (asBits_ & js::gc::CellAlignMask) == 0 (GC pointer is not aligned. Is this memory corruption?), at /home/tjc/gecko-fork/obj-x64-debug/dist/include/js/Value.h:622

“Extended primitive” is a provisional name for records and tuples, which are not objects, but are (in our prototype implementation) represented internally in the compiler as objects; that’s why I’m able to use DumpObject to print out the fields. Under “reserved slots”, it’s showing me the values of the three reserved slots shown at offsets 24, 32, and 40 in the picture above.

Obviously, it’s a warning sign that trying to print out the elements array causes an assertion failure. I would like to be able to print it out using:

call js::DumpValue(((RecordType*) $rcx)->getElements()[0])

since getElements() is a NativeObject method that represents an array of Values. But this doesn’t work in gdb, so knowing that the offset of the elements_ field is 16, I did the following:

(gdb) p *(ObjectElements::fromElements((HeapSlot*) ($rcx + 16)))
$10 = { flags = 3952501696,  initializedLength = 13525, capacity = 1437286072, length = 21845}
    (gdb) 

I deleted some of the output so as to show only the dynamic fields. The picture above makes this much clearer, but once we access the HeapSlot array stored in the elements_ field, we can call the ObjectElements::fromElements method on it to access the elements header, which is stored physically before the array itself. This header consists of four int32 fields: flags, initializedLength, capacity, and length. This output makes it seem like garbage got written into the object, since the initialized length shouldn’t be 13525.
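(If you’re wondering how fromElements can do that: the ObjectElements header is laid out immediately before the elements array in memory, so the method is just pointer subtraction. Paraphrasing it from memory:)

  static ObjectElements* fromElements(HeapSlot* elems) {
    return reinterpret_cast<ObjectElements*>(uintptr_t(elems) -
                                             sizeof(ObjectElements));
  }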

Looking at the slots_ array of the record (which is at offset 8), I observed:

p ((HeapSlot*) (*($rcx + 8)))
$22 = (js::HeapSlot *) 0x55ab3eb8
(gdb) 

This corresponds to this MacroAssembler call:

  storePtr(ImmPtr(emptyObjectSlots), Address(result, NativeObject::offsetOfSlots()));

result here represents the register that contains the record, and offsetOfSlots() is 8. I had copied/pasted this code from existing code that initializes arrays, without looking at it too carefully, but when I read it again, I noticed that emptyObjectSlots is a special value. A comment in NativeObject.h mentions that special singleton values are used when the elements_ and slots_ arrays are empty.

Initializing the elements

And that’s how I realized that some other code that I had copied and pasted out of the existing array initialization code was misplaced:

  // Initialize elements pointer for fixed (inline) elements.
  computeEffectiveAddress(
      Address(result, NativeObject::offsetOfFixedElements()), temp);
  storePtr(temp, Address(result, NativeObject::offsetOfElements()));

This works fine for what it does, but I needed to do the equivalent of the NativeObject::setEmptyElements() method:

  void setEmptyElements() { elements_ = emptyObjectElements; }

(code)

So what I really wanted was:

  // Initialize elements pointer
  storePtr(ImmPtr(emptyObjectElements),
           Address(result, NativeObject::offsetOfElements()));
                     

And after making that change, the elements header looked a lot better:

(gdb) p (*((ObjectElements*) ($rcx)))
$8 = {flags = 0, initializedLength = 0, capacity = 1, length = 1, static VALUES_PER_HEADER = 2}
(gdb)  call ((JSObject*) $rcx)->dump()
object 17d67b3007b0
  global 2689ff340030 [global]
  class 5555590b7b80 record
  shape 2689ff3663c0
  flags:
  proto null
  reserved slots:
      0 : 0
      1 : 
      2 : false
  properties:
    (gdb) 

There’s more to recount, but I don’t have any more time today.

Observability

December 8th, 2021

I’m realizing that this blog is mostly about debugging, since figuring out how to debug SpiderMonkey is taking up most of my time. Right now, the actual compilation algorithms I’m implementing are straightforward; if I was working on performance tuning an existing implementation or something, I’m sure it would be different. What’s actually hard is finding ways to make generated code more observable. I’m sure that more experienced SpiderMonkey hackers have their own techniques, but since there isn’t a lot of documentation to go on, I’m making it up mostly from scratch.

Having gotten empty records to work, I started working on code generation for the AddRecordProperty opcode, so we can have non-empty records. When I initially tested my code with input x = #{"a": 1}, I got an assertion failure:

Assertion failure: !this->has(reg), at /home/tjc/gecko-fork/js/src/jit/RegisterSets.h:680

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x00005555580486c9 in js::jit::SpecializedRegSet<js::jit::LiveSetAccessors<js::jit::TypedRegisterSet >, js::jit::TypedRegisterSet >::add (
    this=0x7fffffffc768, reg=...) at /home/tjc/gecko-fork/js/src/jit/RegisterSets.h:680
680     MOZ_ASSERT(!this->has(reg));

AddRecordProperty takes three operands (the record, the field name, and the value for the field). When I dug into it, it turned out that two of the three operands had been allocated to the same register, which wasn’t right. The CacheIRCompiler module handles register allocation, spilling registers to the stack when necessary, so you can implement new operations without having to worry about those details. But in this case, I had to worry about that detail. Internally, the register allocator keeps a list, operandLocations_, that maps integers (operands are numbered from 0 onward) to physical registers. It turned out that operands 0 and 2 were both allocated to %rax, which I figured out in gdb (irrelevant details snipped):

(gdb) p operandLocations_[0]
$13 = (js::jit::OperandLocation &) @0x7fffffffc680: {
  kind_ = js::jit::OperandLocation::PayloadReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rax, 
                ...
(gdb) p operandLocations_[1]
$14 = (js::jit::OperandLocation &) @0x7fffffffc690: {
  kind_ = js::jit::OperandLocation::ValueReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rbx, 
                ...
(gdb) p operandLocations_[2]
$15 = (js::jit::OperandLocation &) @0x7fffffffc6a0: {
  kind_ = js::jit::OperandLocation::ValueReg, data_ = {payloadReg = {reg = {
        reg_ = js::jit::X86Encoding::rax,  

The BaselineCacheIRCompiler::init() method (code) is called each time a new opcode is compiled, and it assigns a virtual register for each operand. When I compared the code I’d added for AddRecordProperty with the code for the other opcodes that also take three operands (such as SetElem), I noticed that all of them passed the third operand on the stack. When I changed AddRecordProperty to work the same way, the problem was fixed. I don’t know how I would have figured this out except for a lucky guess! I’m guessing it’s this way because of the small number of registers on x86, but the internal error-checking here seems lacking; it doesn’t seem good to silently alias distinct operands to each other.
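From memory, the relevant pattern in init() looks roughly like this (paraphrased, not the exact code in the tree):

  case CacheKind::SetElem:
    MOZ_ASSERT(numInputs == 3);
    allocator.initInputLocation(0, R0);
    allocator.initInputLocation(1, R1);
    // The third operand is passed on the stack rather than in a register.
    allocator.initInputLocation(2, BaselineFrameSlot(0));
    break;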

Once I changed the init() method and modified my emitAddRecordPropertyResult() method to expect the third operand on the stack, I found that for the compiled code to be run and not just emitted, I had to embed the code in a function once again:
function f() { x = #{"a":1}; }

(I still don’t understand the details as to when the JIT decides to generate native code.)

The result was a segfault, which was okay for the time being; it means that my code was at least able to generate code for setting a record property, without internal errors.

I realized that I wasn’t using the protocol for calling from the JIT into C++ functions (the callVM() method in the baseline compiler) correctly, but took a break before fixing it.

After taking a break, I fixed that problem only to run into another one: callVM() doesn’t automatically save registers. The macro assembler provides another method, callWithABI() (code), that makes it easier to save and restore registers. Evidently, ABI calls are less efficient than VM calls, but I’m not worried about performance at the moment. I switched it around to use callWithABI() and had to do some debugging by setting breakpoints both in the C++ code for the callee function and in the generated assembly code to make sure all the data types matched up. That worked reasonably well, but alas, I had to switch back to callVM() because I couldn’t figure out how to pass a HandleObject (see the comment at the beginning of RootingAPI.h for a fairly good explanation) into an ABI function. I added code to manually save and restore registers.

That worked okay enough, and my generated code got almost all the way through without a segfault. At the very end, my code calls the existing emitStoreDenseElement method. Internally (in the current implementation; this is all subject to change), records contain a sorted array of keys. Separately from storing the actual value at the correct offset in the record, the key has to be added to the array. (Since records are immutable, the actual sorting only has to happen once, when the FinishRecord opcode executes.)

I was getting a segfault because there was something wrong about the array that I was passing into the generated code for storing dense elements. I haven’t found the problem yet, but (going back to the beginning of this post), I’m trying to find ways to make the generated code more transparent. In this case, I knew that the emitStoreDenseElement code was getting its array input in the %rsi register, so I put a breakpoint in the right place and looked at it in gdb:


Thread 1 "js" received signal SIGTRAP, Trace/breakpoint trap.
0x00001dec1e58ab70 in ?? ()

(gdb) p *((JSObject*) $rsi)
$3 = {<js::gc::CellWithTenuredGCPointer> = { = {
      header_ = {<mozilla::detail::AtomicBaseIncDec> = {<mozilla::detail::AtomicBase> = {
            mValue = {<std::__atomic_base> = {static _S_alignment = 8, _M_i = 6046507808}, 
              static is_always_lock_free = true}}, }, }, static RESERVED_MASK = 7, static FORWARD_BIT = 1}, }, 
  static TraceKind = JS::TraceKind::Object, static MAX_BYTE_SIZE = 152}
(gdb) p *((JSObject*) $rsi).getHeader()
Attempt to take address of value not located in memory.
(gdb) p ((JSObject*) $rsi)->header_
$4 = {<mozilla::detail::AtomicBaseIncDec> = {<mozilla::detail::AtomicBase> = {
      mValue = {<std::__atomic_base> = {static _S_alignment = 8, _M_i = 6046507808}, 
        static is_always_lock_free = true}}, }, }
(gdb) p ((JSObject*) $rsi)->slots_
There is no member or method named slots_.
(gdb) p ((js::HeapSlot*) ($rsi + 8))
$5 = (js::HeapSlot *) 0x3a6c5de007f8
(gdb) p ((js::HeapSlot*) ($rsi + 8))[0]
$6 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 93824997702264}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 8))[1]
$7 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 64237105842200}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 8))[2]
$8 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 0}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 16))[0]
$9 = {<js::WriteBarriered> = {<js::BarrieredBase> = {value = {
        asBits_ = 64237105842200}}, <js::WrappedPtrOperations<JS::Value, js::WriteBarriered, void>> = {}, }, }
(gdb) p ((js::HeapSlot*) ($rsi + 16))[0].value
$10 = {asBits_ = 64237105842200}
(gdb) p (ObjectElements::fromElements (((js::HeapSlot*) ($rsi + 16))))
No symbol "ObjectElements" in current context.
(gdb) p (js::ObjectElements::fromElements (((js::HeapSlot*) ($rsi + 16))))
$11 = (js::ObjectElements *) 0x3a6c5de007f0
(gdb) p (js::ObjectElements::fromElements (((js::HeapSlot*) ($rsi + 16))))->initializedLength
$12 = 1
(gdb) 

I’m posting this debugging transcript to show some of the things I’ve discovered so far. In gdb, you can cast a pointer to an arbitrary type and use that to print out the contents according to the physical layout of that type; hence, printing *((JSObject*) $rsi). Weirdly, the NativeObject class (code) seems to be inaccessible in gdb; I’m not sure if this is because NativeObject inherits from JSObject (in this case, I know that the object pointed to by %rsi is a NativeObject), or because of something different about that class. (None of NativeObject, js::NativeObject, or JS::NativeObject seem to be names that gdb recognizes.) That makes it harder to print some of the other fields, but the debugger does know about the header_ field, since it’s defined in the Cell class (code) that JSObject inherits from. The getHeader() method is just supposed to return the header_ field, so I’m not sure why gdb reported the “Attempt to take address of value not located in memory” error.

I knew that NativeObjects have a slots_ field and an elements_ field, physically laid out right after the header_ field, but due to the problem I mentioned with the NativeObject class, gdb won’t recognize these field names. I knew that slots_ would be at offset 8 and elements_ would be at offset 16, so using some pointer math and casting the pointers to the type that both slots_ and elements_ have (js::HeapSlot*; the HeapSlot class is defined here), I was able to make gdb treat these addresses as HeapSlot arrays. HeapSlot is basically an alias for the Value type, which represents all runtime values in SpiderMonkey (see Value.h).
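Spelled out as a struct, the layout I’m relying on looks like this; this is a hypothetical mirror for illustration, matching the offsets used above, not the real class definitions:

  struct NativeObjectLayout {
    uintptr_t header_;       // offset 0: GC cell header
    js::HeapSlot* slots_;    // offset 8: fixed slots ("attributes")
    js::HeapSlot* elements_; // offset 16: elements array; its ObjectElements
                             //   header sits immediately before it in memory
  };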

The slots_ array contains “fixed slots” (attributes, basically); this object’s slots array has two elements that are initialized (the one at index 2 is printed as if its asBits_ field is 0, meaning it’s just a 0 and is probably uninitialized). For array-like objects (which this one is — if you’ve lost track, it’s the array of sorted record keys that’s part of a record), the elements_ array is an array of, well, elements. NativeObject has a getElements method, which casts the HeapSlot array into an ObjectElements object (which has some additional methods on it), but I can’t call any NativeObject methods from gdb, so instead I called the ObjectElements::fromElements method directly to cast it. Since the initializedLength field of the resulting object is 1, which is correct (the previous generated code sets the initialized length to 1, since we’re adding a property to a previously empty record), it seems like the layout is vaguely correct.

That’s as far as I got. Debugging tools for generated code that don’t require all these manual casts and have more access to type information that would make pretty-printing easier would be great! Maybe that exists and I haven’t found it, or maybe it’s just that compiler implementors are a small enough group that making tools more usable isn’t too compelling for anyone. In any case, the hard part about all of this is finding good ways to observe data at runtime and relate it to types that exist in the compiler.

Further adventures in gdb

December 7th, 2021

So, that updating-every-day plan hasn’t exactly worked out. Working on the code is so much fun that it’s hard to stop, and then I have to choose between going to bed and writing a blog post.

I’m currently working on generating code for the three opcodes related to records: InitRecord, FinishRecord, and AddRecordProperty. The work I wrote about in previous posts wasn’t exactly complete, because it allowed code generation in the presence of records, but I just plugged in no-op implementations for the tryAttachStub() methods in CacheIR.cpp that actually do the work. I still had to change some code generation methods so that existing code could work with records and tuples, but the really fun stuff is what I’m doing now!

Rather than try to cover everything, I’ll summarize my notes from the end of last week. I had added two methods to BaselineCacheIRCompiler.cpp, which is the part of the baseline compiler that generates baseline code from CacheIR code. As an initial test case, I was using a function that just returns an empty record literal:

function f() { x = #{}; }

since I hadn’t yet implemented code generation for adding record properties. The JavaScript bytecode for this function looks like:

loc     op
-----   --
main:
00000:  BindGName "x"                   # GLOBAL
00005:  InitRecord 0                    # GLOBAL 
00010:  FinishRecord                    # GLOBAL 
00011:  SetGName "x"                    # 
00016:  Pop                             # 
00017:  RetRval                         # 

The InitRecord 0 instruction creates an uninitialized record of length 0, and the FinishRecord instruction sets internal flags to make the record read-only. In a non-empty record, one or more AddRecordProperty instructions would appear in between; it would be a dynamic error to invoke an AddRecordProperty instruction after the FinishRecord.

The bytecode above is from executing dis(f) in the interpreter; I also discovered that you can do disnative(f) after CacheIR and then native code are generated for f. This prints out the generated assembly code; it’s actually much easier to read if you execute print(disnative(f)). Invoking the interpreter with the environment variable IONFLAGS=codegen also prints out the code as it’s generated, which is sometimes more useful since it can be interleaved with other debug messages that show which methods are generating which bits of code.
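A session looks something like this (with --baseline-eager, a single call is enough to trigger compilation; exact thresholds vary otherwise):

js> function f() { x = #{}; }
js> f()
js> print(disnative(f))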

I was getting a segfault, meaning that the code being generated by my emitNewRecordResult() method didn’t match the expectations that the code generated by my emitFinishRecordResult() method had about the data. So I fired up gdb, as I had done before. Trying to zero in on the problem, I was looking at some generated code that took its input in the %rax register (gdb uses the $rax syntax instead.) With careful use of casts and calling accessor methods, it’s possible to look at the C++ representations of the objects that generated code creates. I found it helpful to look at the value of this expression:

(gdb) p (*(*((JSObject*) $rax)).shape()).getObjectClass()
$9 = (const JSClass *) 0x5555590a6130 

From context, I was able to figure out that 0x5555590a6130 was the address of the JSClass structure for the ArrayObject type. This wasn’t what I expected, since the input for this code was supposed to be a record (with type RecordType in C++).

Since this was a few days ago, I’ve lost track of the logical steps in between, but eventually, I put a breakpoint in the createUninitializedRecord() method that I’d added to MacroAssembler.cpp; shape and arrayShape are variable names referring to two registers I allocated within that method.

Thread 1 "js" hit Breakpoint 1, js::jit::MacroAssembler::createUninitializedRecord (
    this=0x7fffffffae00, result=..., shape=..., arrayShape=..., temp=..., temp2=..., 
    recordLength=0, allocKind=js::gc::AllocKind::OBJECT4_BACKGROUND, 
    initialHeap=js::gc::DefaultHeap, fail=0x7fffffffaa38, allocSite=...)
    at /home/tjc/gecko-fork/js/src/jit/MacroAssembler.cpp:493
warning: Source file is more recent than executable.
493   allocateObject(result, temp, allocKind, 0, initialHeap, fail, allocSite);
(gdb) p shape
$10 = {reg_ = js::jit::X86Encoding::rcx, static DefaultType = js::jit::RegTypeName::GPR}
(gdb) p arrayShape
$11 = {reg_ = js::jit::X86Encoding::rcx, static DefaultType = js::jit::RegTypeName::GPR}
(gdb) quit

This debug output shows that the register allocator assigned both shape and arrayShape to the same register, %rcx. As a result, the code I generated to initialize the shapes was using the same shape for both the record object itself, and the internal sortedKeys array pointed to by the record’s sortedKeys field — causing createUninitializedRecord() to return something that looked like an array rather than a record.

The reason this was happening was the following code that I wrote:

 
AutoScratchRegisterMaybeOutput shape(allocator, masm, output);
AutoScratchRegisterMaybeOutput arrayShape(allocator, masm, output);

The AutoScratchRegisterMaybeOutput class (defined in CacheIRCompiler.h) re-uses the output register if possible, so calling it twice allocates both symbolic names to the same register. That wasn’t what I wanted.

The solution was to use the AutoScratchRegister class instead, telling the register allocator that shape and arrayShape should live in distinct registers.
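Concretely, the change looked something like this (a sketch, assuming the two-argument AutoScratchRegister constructor declared in CacheIRCompiler.h):

AutoScratchRegister shape(allocator, masm);       // claims its own register
AutoScratchRegister arrayShape(allocator, masm);  // guaranteed distinct from shape

After making that change, repeating the original experiment gives: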

(gdb) p (*(*((JSObject*) $rcx)).shape()).getObjectClass()
$5 = (const JSClass *) 0x5555590b5690 
(gdb) c
Continuing....
(gdb) p (*(*((JSObject*) $rbx)).shape()).getObjectClass()
$8 = (const JSClass *) 0x5555590a60f0 
(gdb) 

(To be sure that $rcx and $rbx were the registers allocated to shape and arrayShape, I had to set a breakpoint in my createUninitializedRecord() method again and look at the values of these variables.)

I hope this might be marginally useful to somebody trying to debug SpiderMonkey baseline code generation, or if nothing else, useful to me if I ever have to set aside this work for a few months (let’s be honest, weeks) and come back to it!

The emotional roller coaster that is programming

November 19th, 2021

I skipped a few days’ worth of updates; turns out it’s a bit difficult to fit in time to write an entire post when your work schedule isn’t very consistent.

In the meantime, I finished implementing all the record and tuple opcodes in the JIT. Having done some manual testing, it was time to start running the existing test suite with the compiler enabled. Fortunately, I figured out the flag to pass in so that I wouldn’t have to add it to each test file by hand (which would have been bad practice anyway):

mach jstests Record --args=--baseline-eager

This runs all the tests with Record in the name and adds the --baseline-eager flag in to the JavaScript shell.

At this stage, failures are good — it means there’s still something interesting left to work on. Yay, a failure!

Hit MOZ_CRASH(Unexpected type) at /home/tjc/gecko-fork/js/src/jit/CacheIR.cpp:7745
REGRESSION - non262/Record/equality.js
[7|1|0|0] 100% ======================================================>|   1.1s
REGRESSIONS
    non262/Record/equality.js
FAIL

Narrowing down the code that caused the failure, I got:

js> Object.is(#{x: +0}, #{x: -0})
Hit MOZ_CRASH(Unexpected type) at /home/tjc/gecko-fork/js/src/jit/CacheIR.cpp:7745

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x00005555583763f8 in js::jit::CallIRGenerator::tryAttachObjectIs (this=0x7fffffffcd10, callee=...)
    at /home/tjc/gecko-fork/js/src/jit/CacheIR.cpp:7745
7745            MOZ_CRASH("Unexpected type");
(gdb) 

So this told me that I hadn’t yet implemented the cases for comparing records/tuples/boxes to each other in Object.is() in the JIT.

Fixing the problem seemed straightforward. I found the CallIRGenerator::tryAttachObjectIs() method in CacheIR.cpp. CallIRGenerator takes care of generating code for built-in methods as they're called; each time a known method is called on a particular combination of operand types that the baseline compiler supports, it generates a stub (a specialized version of the method) that gets called the next time around, instead of either interpreting the code or re-generating it from scratch.

For example, this code snippet from tryAttachObjectIs() shows that the first time Object.is() is called with two int32 operands, the compiler generates a version of Object.is() specialized to that case, which saves calling a more generic method and doing more type checks. Of course, the generated code has to check that the operand types actually are int32, and either call a different generated method or generate a new stub if they aren't.

    MOZ_ASSERT(lhs.type() == rhs.type());
    MOZ_ASSERT(lhs.type() != JS::ValueType::Double);

    switch (lhs.type()) {
      case JS::ValueType::Int32: {
        Int32OperandId lhsIntId = writer.guardToInt32(lhsId);
        Int32OperandId rhsIntId = writer.guardToInt32(rhsId);
        writer.compareInt32Result(JSOp::StrictEq, lhsIntId, rhsIntId);
        break;
      }

The existing code handles the cases where both arguments have type Int32, String, Symbol, Object, and so on. So it was easy to follow that structure and add a case where both operands have a box, record, or tuple type. After a fun adventure through the MacroAssembler, I had all the pieces implemented and the test passed; I was able to apply Object.is() to records (etc.) with the baseline compiler enabled.
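The added branch would have looked something like the sketch below (hypothetical, not my actual patch: JS::ValueType::ExtendedPrimitive matches the JSVAL_TYPE_EXTENDED_PRIMITIVE tag that comes up later in this post, guardToExtendedPrimitive mirrors guardToObject, and the op performing the actual comparison almost certainly differs in the real code):

      case JS::ValueType::ExtendedPrimitive: {
        // Guard that both operands really are records/tuples/boxes, so this
        // stub only gets re-used for operands of the same type.
        ObjOperandId lhsObjId = writer.guardToExtendedPrimitive(lhsId);
        ObjOperandId rhsObjId = writer.guardToExtendedPrimitive(rhsId);
        // Hypothetical op name, by analogy with compareInt32Result above.
        writer.compareExtendedPrimitiveResult(JSOp::StrictEq, lhsObjId, rhsObjId);
        break;
      }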

After that, all the tests for records passed, which isn’t too surprising since there aren’t many methods for records. Next, I tried running the tests for what’s currently called Box in the Records and Tuples proposal (subject to change), and got more failures; still a good thing.

mach-with record-tuple-with-jit jstests Box --args=--baseline-eager
[1|0|0|0]  20% ==========>                                            |   1.2s

Hit MOZ_CRASH(unexpected type) at /home/tjc/gecko-fork/js/src/jit/CacheIRCompiler.cpp:1930
REGRESSION - non262/Box/unbox.js
[1|1|0|0]  40% =====================>                                 |   1.2s

Hit MOZ_CRASH(unexpected type) at /home/tjc/gecko-fork/js/src/jit/CacheIRCompiler.cpp:1930
REGRESSION - non262/Box/json.js
[1|2|0|0]  60% ================================>                      |   1.3s

Hit MOZ_CRASH(unexpected type) at /home/tjc/gecko-fork/js/src/jit/CacheIRCompiler.cpp:1930
REGRESSION - non262/Box/constructor.js
[2|3|0|0] 100% ======================================================>|   1.3s
REGRESSIONS
    non262/Box/unbox.js
    non262/Box/json.js
    non262/Box/constructor.js
FAIL

The common cause: generating code for any method call on a Box invokes GetPropIRGenerator::tryAttachPrimitive() (also in CacheIR.cpp, as above), which didn't have a case for records/tuples/boxes. (In JavaScript, a method is just another property on an object; the GetProp bytecode operation extracts the property, and calling it is a separate instruction.) Similarly to the above, I added a case, and the code worked; I was able to successfully call Box({}).unbox() with the compiler enabled.

The next test failure, in json.js, was harder. I minimized the test case to one line, but wasn’t able to get it any simpler than this:

JSON.stringify(Box({}), (key, value) => (typeof value === "box" ? {x: value.unbox() } : value))

This code calls the JSON.stringify() standard library method on the value Box({}) (a box wrapped around an empty object); the second argument is a function that's applied to the value of each property in the structure before converting it to a string. The fix that got unbox.js passing eliminated the MOZ_CRASH(unexpected type) failure, but replaced it with a segfault.

It took me too many hours to figure out that I had made the mistake of copying/pasting code without fully understanding it. The cached method stubs rely on “guards”, which is to say, runtime type checks, to ensure that we only call a previously-generated method in the future if the types of the operands match the ones from the past (when we generated the code for this particular specialization of the method). When making the change for Object.is(), I had looked at CacheIRCompiler.cpp and noticed that the CacheIRCompiler::emitGuardToObject() method generates code that tests whether an operand is an object or not:

bool CacheIRCompiler::emitGuardToObject(ValOperandId inputId) {
  JitSpew(JitSpew_Codegen, "%s", __FUNCTION__);
  if (allocator.knownType(inputId) == JSVAL_TYPE_OBJECT) {
    return true;
  }

  ValueOperand input = allocator.useValueRegister(masm, inputId);
  FailurePath* failure;
  if (!addFailurePath(&failure)) {
    return false;
  }
  masm.branchTestObject(Assembler::NotEqual, input, failure->label());
  return true;
}

The generated code contains a “failure” label that this code branches to when the operand inputId is not an object. (It’s up to the caller to put appropriate code under the “failure” label so that this result will be handled however the caller wants.) I copied and pasted this code to create an emitGuardToExtendedPrimitive() method (“extended primitives” are what we’re calling records/tuples/boxes for now), and changed JSVAL_TYPE_OBJECT to JSVAL_TYPE_EXTENDED_PRIMITIVE so that the code would check for the “extended primitive” runtime type tag instead of the “object” type tag. The problem is that I also needed to use something else instead of branchTestObject. As it was, whenever a stub that expects a record/tuple/box as an argument was generated, it would be re-used for operands that are objects. This is obviously unsound and, looking at the failing test case again, we can see why this code exposed the bug:

JSON.stringify(Box({}), (key, value) => (typeof value === "box" ? {x: value.unbox() } : value))

The first time the (key, value) anonymous function is called, the name value is bound to Box({}). So a stub gets generated that's a version of the typeof operation, specialized to Box things (actually to anything that's a record, tuple, or box, for implementation-specific reasons). The stub checks that the operand is a record/tuple/box, and if so, returns the appropriate type tag string (such as “box”). Except that, because of the bug I introduced, this stub got re-used for any object operand. The way the JSON stringify code (JSON.cpp) works, it calls the “replacer” (i.e. the anonymous (key, value) function) on the value of each property — but then it calls the replacer again on the replaced value. So my generated stub that worked perfectly well for Box({}) was subsequently called on {x: {}}, which has an entirely different type; hence the segfault.

Finding this bug took a long time (partly because I couldn't figure out how to enable the “CacheIR spew” code that prints out generated CacheIR code, so I was trying to debug generated code without being able to read it…), but I experimented with commenting out various bits of code and eventually deduced that typeof was probably the problem; once I read over the code related to typeof, I spotted that my emitGuardToExtendedPrimitive() method was calling branchTestObject(). Adding a branchTestExtendedPrimitive() method to the macro assembler was easy, but tedious, since the code is architecture-specific. It would be nice if dynamic type-testing code were generated automatically, since the code that tests for the tags denoting different runtime types is all basically the same. But rather than trying to automate that, I decided it was better to bite the bullet, since I already had enough cognitive load trying to understand the compiler as it is.
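Putting the pieces together, the corrected guard is essentially emitGuardToObject() with the two substitutions described above (reconstructed from that description, not copied from my patch):

bool CacheIRCompiler::emitGuardToExtendedPrimitive(ValOperandId inputId) {
  JitSpew(JitSpew_Codegen, "%s", __FUNCTION__);
  // If the operand is statically known to be a record/tuple/box, no
  // runtime guard is needed.
  if (allocator.knownType(inputId) == JSVAL_TYPE_EXTENDED_PRIMITIVE) {
    return true;
  }

  ValueOperand input = allocator.useValueRegister(masm, inputId);
  FailurePath* failure;
  if (!addFailurePath(&failure)) {
    return false;
  }
  // The fix: branch on the extended-primitive type tag, not the object tag.
  masm.branchTestExtendedPrimitive(Assembler::NotEqual, input, failure->label());
  return true;
}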

It turned out that the json.js test case, despite being designed to test something else, was perfect for catching this bug, since it involved applying the same method first to a Box and then to an object. Once I’d fixed the problem with guards, this test passed. The constructor.js test still fails, but that just means I’ll have something interesting to work on tomorrow.

Perhaps the swings from despair to elation are why programming can be so habit-forming. While trying to track down the bug, I felt like I was the dullest person in the world and had hit my limit of understanding, and would never make any further progress. When I found the bug, for a moment I felt like I was on top of the world. That’s what keeps us doing it, right? (Besides the money, anyway.)

By the way, I still don’t fully understand inline caching in SpiderMonkey, so other resources, such as “An Inline Cache Isn’t Just A Cache” by Matthew Gaudet, are better sources than my posts. I mean for this blog to be more of a journal of experimentation than a definitive source of facts about anything.