Implementing records in Warp

Toward the goal of implementing records and tuples in Warp, I’m starting with code generation for empty records. The two modules I’ve been staring at the most are WarpBuilder.cpp and WarpCacheIRTranspiler.cpp.

While the baseline compiler translates bytecode and CacheIR directly into assembly code (using the MacroAssembler), the optimizing compiler (Warp) uses two intermediate languages: it translates bytecode and CacheIR to MIR; MIR to LIR; and then LIR to assembly code.

As explained in more detail here, WarpBuilder takes a snapshot (generated by another module, WarpOracle.cpp) of running code, and for each bytecode, it generates MIR instructions, either from CacheIR (for bytecode ops that can have inline caches), or directly. For ops that can be cached, WarpBuilder calls its own buildIC(), method, which in turn calls the TranspileCacheIRToMIR() method in WarpCacheIRTranspiler.

A comment in WarpBuilderShared.h says “Because this code is used by WarpCacheIRTranspiler we should generally assume that we only have access to the current basic block.” From that, I’m inferring that WarpCacheIRTranspiler maps each CacheIR op onto exactly one basic block. In addition, the addEffectful() method in WarpCacheIRTranspiler enforces that each basic block contains at most one effectful instruction.

In the baseline JIT implementation that I already finished, the InitRecord and FinishRecord bytecodes each have their own corresponding CacheIR ops; I made this choice by looking at how existing ops like NewArray were implemented, though in all of these cases, I’m still not sure I fully understand what the benefit of caching is (rather than just generating code) — my understanding of inline caching is that it’s an optimization to avoid method lookups when polymorphic code is instantiated repeatedly at the same type, and in all of these cases, there’s no type-based polymorphism.

I could go ahead and add InitRecord and FinishRecord into MIR and LIR as well; this would be similar to my existing code where the BaselineCacheIRCompiler compiles these operations to assembly. To implement these operations in Warp, I would add similar code to CodeGenerator.cpp (the module that compiles LIR to assembly) as what is currently in the BaselineCacheIRCompiler.

But, MIR includes some lower-level operations that aren’t present in CacheIR — most relevantly to me, operations for manipulating ObjectElements fields: Elements, SetInitializedLength, and so on. Using these operations (and adding a few more similar ones), I could translate FinishRecord to a series of simpler MIR operations, rather than adding it to MIR. To be more concrete, it would look something like:

(CacheIR)

FinishRecord r

== WarpCacheIRTranspiler ==>

(MIR)

e = Elements r
Freeze e
sortedKeys = LoadFixedSlot r SORTED_KEYS_SLOT
sortedKeysElements = Elements sortedKeys
CallShrinkCapacityToInitializedLength sortedKeys
SetNonWritableArrayLength sortedKeysElements
recordInitializedLength = InitializedLength r
SetArrayLength sortedKeysElements recordInitializedLength
CallSort sortedKeys

(I’m making up a concrete syntax for MIR.)

This would encapsulate the operations involved in finishing a record, primarily sorting the keys array and setting flags to ensure that the record and its sorted keys array are read-only. Several of these are already present in MIR, and the others would be easy to add, following existing operations as a template.

The problem with this approach is that FinishRecord in CacheIR would map onto multiple effectful MIR instructions, so I can’t just add a case for it in WarpCacheIRTranspiler.

I could also push the lower-level operations up into CacheIR, but I don’t know if that’s a good idea, since presumably there’s a reason why it hasn’t been done already.

To summarize, the options I’m considering are:

  1. Pass down InitRecord and FinishRecord through the pipeline by adding them to MIR and LIR
  2. Open up FinishRecord (InitRecord isn’t as complicated) in the translation to MIR, which might involve making FinishRecord non-cacheable altogether
  3. Open up FinishRecord in the translation to CacheIR, by adding more lower-level operations into CacheIR

I’ll have to do more research and check my assumptions before making a decision. A bigger question I’m wondering about is how to determine if it’s worth it to implement a particular operation in CacheIR at all; maybe I’m going about things the wrong way by adding the record/tuple opcodes into CacheIR right away, and instead I should just be implementing code generation and defer anything else until benchmarks exist?

Tags: ,

Leave a Reply