add-ir-instruction
Adding a New IR Instruction to Hermes
When adding a new IR instruction, you must touch a specific set of files. This skill describes each file, the pattern to follow, and important conventions.
Checklist of Files to Modify
- doc/IR.md: Documentation (the only place for doc-comments)
- include/hermes/IR/Instrs.def: Instruction registration
- include/hermes/IR/Instrs.h: Class definition (NO doc-comments here)
- include/hermes/IR/IRBuilder.h: Builder declaration
- lib/IR/IRBuilder.cpp: Builder implementation
- lib/IR/IRVerifier.cpp: Verification logic
- lib/Optimizer/Scalar/TypeInference.cpp: Type inference stub
- lib/BCGen/HBC/ISel.cpp: HBC instruction selection (stub or implementation)
- lib/BCGen/SH/SH.cpp: Static Hermes codegen (stub or implementation)
- lib/BCGen/facebook/Mins/Mins.cpp: Mins codegen (stub or implementation)
- Tests: At minimum, update or add tests in test/
If the instruction needs lowering (i.e., it does not map directly to a bytecode opcode), you also need:
- include/hermes/BCGen/Lowering.h: Lowering pass declaration
- lib/BCGen/Lowering.cpp: Lowering pass implementation
- lib/BCGen/HBC/LoweringPipelines.cpp: Register pass in HBC pipeline
- lib/BCGen/SH/SH.cpp (in lowerModuleIR): Register pass in SH pipeline
Step-by-Step Guide
1. Document in doc/IR.md
This is the ONLY place to put documentation for the instruction. Do NOT add
doc-comments to Instrs.h.
Add a markdown table entry in the appropriate section:
### MyNewInst
MyNewInst | _
--- | --- |
Description | Brief description of what the instruction does.
Example | `MyNewInst %arg1, %arg2 : type`
Arguments | *%arg1* is ... *%arg2* is ...
Semantics | Describe the semantics, referencing the spec where appropriate.
Effects | Describe side effects (e.g., "May read and write memory.", "May read memory and throw.", "Does not read or write memory.").
2. Register in Instrs.def
Add a DEF_VALUE entry. Place it near related instructions:
DEF_VALUE(MyNewInst, Instruction)
If it's a subclass of another instruction, use the parent as the second argument.
If it's a terminator, use TERMINATOR instead of DEF_VALUE.
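The reason a single DEF_VALUE line is enough to register an instruction is the X-macro technique: the same list of entries is expanded several times with different macro definitions. The following is a minimal, self-contained sketch of that idea, not the actual Hermes headers. The `SKETCH_INSTRS` list macro, the `OtherInst` entry, and the `kindName` helper are hypothetical names used only for illustration; in the real tree the entries live in Instrs.def.

```cpp
#include <cassert>
#include <cstring>

namespace xmacro_sketch {

// Hypothetical stand-in for the contents of a .def file. Each entry names an
// instruction and its parent class.
#define SKETCH_INSTRS(DEF_VALUE)    \
  DEF_VALUE(Instruction, Value)     \
  DEF_VALUE(MyNewInst, Instruction) \
  DEF_VALUE(OtherInst, Instruction)

// First expansion: one enumerator per entry, with a "Kind" suffix appended.
enum class ValueKind {
#define MAKE_ENUM(NAME, PARENT) NAME##Kind,
  SKETCH_INSTRS(MAKE_ENUM)
#undef MAKE_ENUM
};

// Second expansion of the same list: a printable name for each kind.
inline const char *kindName(ValueKind kind) {
  switch (kind) {
#define MAKE_CASE(NAME, PARENT) \
  case ValueKind::NAME##Kind:   \
    return #NAME;
    SKETCH_INSTRS(MAKE_CASE)
#undef MAKE_CASE
  }
  return "unknown";
}

} // namespace xmacro_sketch
```

Adding one more entry to the list would extend both the enum and the name table without touching either expansion site, which is why the registration step is a one-line change.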
3. Define the class in Instrs.h
Do NOT add doc-comments to this file. Documentation belongs in doc/IR.md.
Follow this exact pattern:
class MyNewInst : public Instruction {
  MyNewInst(const MyNewInst &) = delete;
  void operator=(const MyNewInst &) = delete;
 public:
  enum { Arg1Idx, Arg2Idx };
  explicit MyNewInst(Value *arg1, Value *arg2)
      : Instruction(ValueKind::MyNewInstKind) {
    // Optional assertions on operand types:
    // assert(arg2->getType().isSomeType() && "message");
    // Set the result type:
    setType(Type::createNoType()); // for instructions with no output
    // or: setType(Type::createFoo()); for typed instructions
    pushOperand(arg1);
    pushOperand(arg2);
  }
  explicit MyNewInst(
      const MyNewInst *src,
      llvh::ArrayRef<Value *> operands)
      : Instruction(src, operands) {}
  Value *getArg1() const {
    return getOperand(Arg1Idx);
  }
  Value *getArg2() const {
    return getOperand(Arg2Idx);
  }
  static bool hasOutput() {
    return false; // true if the instruction produces a value
  }
  static bool isTyped() {
    return false; // true if the output type is meaningful
  }
  SideEffect getSideEffectImpl() const {
    // Compose side effects. Common patterns:
    //   return {};                                        // pure
    //   return SideEffect{}.setReadHeap();                // reads memory
    //   return SideEffect{}.setReadHeap().setWriteHeap(); // reads+writes
    //   return SideEffect{}.setThrow().setReadHeap();     // may throw + read
    return SideEffect{}.setThrow().setReadHeap();
  }
  static bool classof(const Value *V) {
    ValueKind kind = V->getKind();
    return kind == ValueKind::MyNewInstKind;
  }
};
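The static classof method in the pattern above is what makes kind-based casts such as llvh::dyn_cast work: the cast asks the target class whether a given value is one of its instances. Below is a minimal sketch of that mechanism with hand-written stand-ins. Everything in the `rtti_sketch` namespace, including the simplified `dyn_cast` and the `OtherInst` class, is hypothetical and much smaller than the real llvh and Hermes definitions.

```cpp
#include <cassert>

namespace rtti_sketch {

enum class ValueKind { OtherInstKind, MyNewInstKind };

// Base class that stores the kind tag set by each subclass constructor.
class Value {
  ValueKind kind_;

 public:
  explicit Value(ValueKind kind) : kind_(kind) {}
  ValueKind getKind() const {
    return kind_;
  }
};

class MyNewInst : public Value {
 public:
  MyNewInst() : Value(ValueKind::MyNewInstKind) {}
  // Answers "is this Value really a MyNewInst?" by inspecting the tag.
  static bool classof(const Value *V) {
    return V->getKind() == ValueKind::MyNewInstKind;
  }
};

class OtherInst : public Value {
 public:
  OtherInst() : Value(ValueKind::OtherInstKind) {}
  static bool classof(const Value *V) {
    return V->getKind() == ValueKind::OtherInstKind;
  }
};

// Simplified stand-in for a checked downcast: returns nullptr on mismatch.
template <typename To>
To *dyn_cast(Value *V) {
  return To::classof(V) ? static_cast<To *>(V) : nullptr;
}

} // namespace rtti_sketch
```

In this sketch a cast to `MyNewInst` succeeds only for values tagged `MyNewInstKind`, so a classof that compares against the wrong kind would silently make every cast fail or, worse, succeed on the wrong type.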
4. Add IRBuilder declaration in IRBuilder.h
MyNewInst *createMyNewInst(Value *arg1, Value *arg2);
5. Add IRBuilder implementation in IRBuilder.cpp
MyNewInst *IRBuilder::createMyNewInst(Value *arg1, Value *arg2) {
  auto *inst = new MyNewInst(arg1, arg2);
  insert(inst);
  return inst;
}
6. Add verification in IRVerifier.cpp
Add a visit method that checks invariants:
bool Verifier::visitMyNewInst(const MyNewInst &Inst) {
  AssertIWithMsg(
      Inst,
      Inst.getArg2()->getType().isSomeType(),
      "MyNewInst::Arg2 must be of SomeType");
  return true;
}
7. Add type inference in TypeInference.cpp
Add an infer method. For instructions without output, return createNoType():
Type inferMyNewInst(MyNewInst *inst) {
  return Type::createNoType();
}
8. Add code generation stubs
If the instruction is lowered before codegen, add fatal stubs. Otherwise, implement the actual code generation.
HBC ISel (lib/BCGen/HBC/ISel.cpp):
void HBCISel::generateMyNewInst(MyNewInst *Inst, BasicBlock *next) {
  hermes_fatal("MyNewInst should have been lowered.");
}
SH (lib/BCGen/SH/SH.cpp):
void generateMyNewInst(MyNewInst &inst) {
  hermes_fatal("MyNewInst should have been lowered");
}
Mins (lib/BCGen/facebook/Mins/Mins.cpp):
Unless asked to, do not implement Mins codegen for new instructions.
Leave it as a stub.
void generateMyNewInst(MyNewInst &inst) {
  unimplemented(inst);
}
9. (If needed) Add a lowering pass
Declare in Lowering.h:
/// Brief description of what the lowering does.
Pass *createLowerMyNewInst();
Implement in Lowering.cpp:
Pass *hermes::createLowerMyNewInst() {
  class ThisPass : public FunctionPass {
   public:
    explicit ThisPass() : FunctionPass("LowerMyNewInst") {}
    bool runOnFunction(Function *F) override {
      IRBuilder builder{F};
      bool changed = false;
      // Collect instructions first to avoid iterator invalidation.
      llvh::SmallVector<MyNewInst *, 4> insts;
      for (auto &BB : *F) {
        for (auto &I : BB) {
          if (auto *MNI = llvh::dyn_cast<MyNewInst>(&I))
            insts.push_back(MNI);
        }
      }
      for (auto *MNI : insts) {
        // Replace MNI with lowered IR...
        MNI->eraseFromParent();
        changed = true;
      }
      return changed;
    }
  };
  return new ThisPass();
}
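The pass above collects matching instructions into a worklist before erasing any of them, because erasing an element while a range-based loop is walking the same list would invalidate the loop's iterator. The sketch below shows the same collect-then-mutate shape on a plain std::list. It is an illustration under stated assumptions, not Hermes code: the `Inst` struct, the `Kind` enum, and `lowerBlock` are hypothetical stand-ins for instructions, instruction kinds, and the body of runOnFunction.

```cpp
#include <cassert>
#include <list>
#include <vector>

namespace lowering_sketch {

// Hypothetical instruction kinds, for illustration only.
enum class Kind { Other, MyNew, Lowered };

struct Inst {
  Kind kind;
};

// Replaces every MyNew instruction in the block with a Lowered one.
// Returns true if the block was modified.
inline bool lowerBlock(std::list<Inst> &block) {
  // Phase 1: collect the targets without mutating the list.
  std::vector<std::list<Inst>::iterator> worklist;
  for (auto it = block.begin(); it != block.end(); ++it) {
    if (it->kind == Kind::MyNew)
      worklist.push_back(it);
  }
  // Phase 2: mutate. std::list keeps iterators to other elements valid
  // across insert and erase, so the saved iterators remain usable.
  bool changed = false;
  for (auto it : worklist) {
    block.insert(it, Inst{Kind::Lowered}); // emit the replacement first
    block.erase(it); // then remove the original
    changed = true;
  }
  return changed;
}

} // namespace lowering_sketch
```

Returning `changed` only when something was rewritten mirrors the pass above, where the return value of runOnFunction reports whether the function was modified.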
Register in pipelines:
In lib/BCGen/HBC/LoweringPipelines.cpp and in
lib/BCGen/SH/SH.cpp (lowerModuleIR), add PM.addPass(createLowerMyNewInst());
at the appropriate point (before any pass that would need to process instructions
introduced by the lowering).
10. Add or update tests
Add lit tests in the appropriate test/ subdirectory. Use %FileCheck or
%FileCheckOrRegen to verify the IR output. If existing tests cover the
feature, update their expected output to reflect the new instruction.
Which subdirectory to use depends on how the instruction interacts with the compiler pipeline — not every instruction needs tests in every directory:
- test/IRGen/: When the instruction is generated directly from JavaScript source. Tests here verify that IRGen produces the expected IR.
- test/Optimizer/: When the instruction affects or interacts with optimization passes.
- test/BCGen/: Not used as often, but if this IR instruction is used in combination with new bytecode instructions, then place bytecode gen tests here.
Key Conventions
- No doc-comments in Instrs.h. All documentation goes in doc/IR.md. The class in Instrs.h should have no /// or /** */ comments describing what the instruction does. Brief inline comments explaining non-obvious implementation details (like side effects) are fine.
- Placement matters. Place the new instruction near related instructions in every file (e.g., private field instructions are grouped together).
- Consistent naming. The instruction name (e.g., FooBarInst) must be consistent across all files: Instrs.def, Instrs.h, IRBuilder.h/cpp, IRVerifier.cpp, TypeInference.cpp, and all codegen files.
- The ValueKind is derived automatically. When you add DEF_VALUE(MyNewInst, Instruction) to Instrs.def, the enum value ValueKind::MyNewInstKind is generated automatically.