Changes in CIRCT

Changes in CIRCT: consuming the tywaves annotations in the debug dialect

Once the debug type information has been propagated to FIRRTL through the Tywaves annotation, firtool can use it to create a debug file that links the Chisel, FIRRTL and Verilog representations together. The CIRCT project already supports emitting some debug information through the debug dialect and HGLDD. The debug dialect tracks the correspondence between values, types and hierarchy in the IR. This correspondence is eventually encoded in a file format called HGLDD (Hardware Generator Language Debug Database), from which external tools can read and reconstruct the original source view.

CIRCT completes Chisel compilation by transforming FIRRTL to Verilog. As a result, the debug information currently relates to FIRRTL and not to Chisel. Enriching FIRRTL with the Tywaves annotations allows CIRCT to attach extra source language information to the debug dialect and HGLDD. Although FIRRTL keeps the same hierarchies and variable names as Chisel, it does not express any Scala meta-programming information such as the Scala types. Without that information, any viewer would be merely a FIRRTL waveform viewer and not a true Chisel viewer.

This section explains how I updated CIRCT to consume the Tywaves annotations from FIRRTL, materialize them in the debug dialect using updated operations that store the new information in the compiler, and add new fields to HGLDD without breaking existing functionality or compatibility with other tools that use it. The last section discusses the opportunities for extending Tywaves to other languages thanks to the CIRCT project.

Consuming the Tywaves annotations

All annotations present in a FIRRTL file must be handled by a FIRRTL compiler. CIRCT does this in the LowerAnnotations pass. Annotations with targets are attached to the corresponding MLIR operations of the FIRRTL dialect. For Tywaves, the annotation is applied to its target with the addAnnotation function.

After that, the type information is attached to the FIRRTL operations as metadata, as shown here:

firrtl.circuit "Foo" {
firrtl.module @Foo(
  in %inA: !firrtl.uint<42> [{class = "chisel3.tywaves.TywavesAnnotation", target = "~Foo|Foo>inA", typeName = "IO[UInt<42>]"}],
  in %inB: !firrtl.bundle<a: sint<19>, b: clock> [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "IO[MyBundle]"},  // Target is not required anymore in this pass
                                                    {circt.fieldID = 1 : i32, class = "chisel3.tywaves.TywavesAnnotation", typeName = "IO[SInt<19>]"},
                                                    {circt.fieldID = 2 : i32, class = "chisel3.tywaves.TywavesAnnotation", typeName = "IO[SomeClockType]"}],
  out %outC: !firrtl.vector<asyncreset, 2> [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "IO[Vec<AsyncReset>]"},
                                            {circt.fieldID = 1 : i32, class = "chisel3.tywaves.TywavesAnnotation", typeName = "IO[AsyncReset]"}],
) {
  %c0_ui17 = firrtl.constant 0 : !firrtl.uint<17>
  %c0_clock = firrtl.specialconstant 0 : !firrtl.clock

  %someWire = firrtl.wire {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "Wire[SInt<17>]", params = [{name="size", typeName="int", value="17"}]}]} : !firrtl.uint<17>
  %someNode = firrtl.node %c0_ui17 {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "UInt<17>"}]} : !firrtl.uint<17>
  %someReg = firrtl.reg %c0_clock {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "Reg[SInt<17>]"}]} : !firrtl.clock, !firrtl.uint<17>
  }
}

After that, the MaterializeDebugInfo pass creates the debug dialect operations. This pass looks at the FIRRTL ports and types, and creates corresponding tracking operations, such that the FIRRTL perspective is preserved throughout the pipeline. In the updated version, this pass also looks at the annotations and creates updated dbg operations next to the already existing dbg operations.
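
To make the lookup step concrete, here is a minimal sketch of how a type name could be selected for a port or for one of its fields. It is not CIRCT code: `TywavesAnnotation`, `lookupTypeName` and `inBAnnotations` are hypothetical names, and the sketch assumes that an annotation without a `circt.fieldID` entry refers to the whole value, modeled here as field ID 0.

```cpp
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a Tywaves annotation attached to a
// FIRRTL port or operation. This is not a CIRCT type.
struct TywavesAnnotation {
  unsigned fieldID;     // assumed to be 0 when circt.fieldID is absent
  std::string typeName; // e.g. "IO[MyBundle]"
};

// Return the source language type name recorded for the given field ID,
// or std::nullopt when no annotation targets that field.
std::optional<std::string>
lookupTypeName(const std::vector<TywavesAnnotation> &annotations,
               unsigned fieldID) {
  for (const TywavesAnnotation &anno : annotations)
    if (anno.fieldID == fieldID)
      return anno.typeName;
  return std::nullopt;
}

// The annotations on port %inB from the FIRRTL example above.
std::vector<TywavesAnnotation> inBAnnotations() {
  return {{0, "IO[MyBundle]"}, {1, "IO[SInt<19>]"}, {2, "IO[SomeClockType]"}};
}
```

With these definitions, `lookupTypeName(inBAnnotations(), 1)` yields the type name of field `a`, while a field ID with no annotation yields an empty optional, which corresponds to emitting the debug operation without the extra attributes.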

New debug dialect operations

In MLIR, operations make it possible to model different levels of abstraction and computation. Each operation has a set of attributes, zero or more arguments and zero or more results, which can be used to express information in the IR.

Specifically, the debug dialect contains four operations:

dbg.variable: represents a named variable declared in the source code (it does not represent fields of aggregates).
dbg.struct and dbg.array: preserve the hierarchical representation of struct-like and array-like aggregates.
dbg.scope: defines a scope in the source code.

The dbg.variable operation can be combined with the dbg.struct and dbg.array operations to represent and fully reconstruct the original aggregates.

To include the tywaves annotations in the debug dialect, the debug dialect has been updated as follows:

dbg.variable gains two new attributes representing the type name and the type parameters.
Two new operations have been created: dbg.subfield and dbg.moduleinfo.

All the operations and attributes that materialize the Tywaves annotations are generated only if the corresponding annotation is present in the FIRRTL file. This way, if the extra debug level is not wanted or needed, the debug dialect behaves as before.
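
The "only if present" rule can be illustrated with a small sketch. The helper below is hypothetical and not part of CIRCT. It only renders the textual form of a dbg.variable line as used in the examples later in this section, and it adds the `typeName` attribute only when a type name is supplied.

```cpp
#include <optional>
#include <string>

// Hypothetical helper, not a CIRCT API: render a dbg.variable line in the
// textual form used by the examples in this section.
std::string renderVariable(const std::string &name, const std::string &value,
                           const std::string &type,
                           const std::optional<std::string> &typeName) {
  std::string line = "dbg.variable \"" + name + "\", " + value;
  // The extra attribute appears only when a Tywaves annotation was found.
  if (typeName)
    line += " {typeName = \"" + *typeName + "\"}";
  return line + " : " + type;
}
```

Calling `renderVariable("inA", "%inA", "!firrtl.uint<42>", std::nullopt)` gives the line without any attribute, while passing `std::string("IO[UInt<42>]")` as the last argument gives the annotated form.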

Updated `dbg.variable` operation

The dbg.variable operation has been updated to include two new attributes: typeName and params.

def VariableOp : DebugOp<"variable"> {
    let arguments = (ins
        StrAttr:$name,
        AnyType:$value,
+       OptionalAttr<StrAttr>:$typeName,
+       OptionalAttr<ArrayAttr>:$params,
        Optional<ScopeType>:$scope
    );
    // No result
}

`dbg.subfield` operation

Supporting source language types and constructor parameters for both top-level variables and (possibly nested) subfields required an additional operation, since dbg.variable is not appropriate for elements of aggregates. dbg.variable explicitly represents the "top" variables in a module. For instance, if an aggregate variable a with fields x and y is declared in a module, dbg.variable represents only the aggregate a, and the only information about x and y is what the FIRRTL operations contain.

The dbg.subfield operation makes it possible to track debug information for the subfields of an aggregate separately from the parent variable. It has the same attributes as the updated dbg.variable to store a value, source language name, type name and type parameters but, unlike dbg.variable, it also returns a value of type SubFieldType that other operations can use as an operand. This allows it to be inserted into dbg.struct and dbg.array operations to represent the type information of subfields.

The dbg.subfield operation does not have a scope operand, because it is always a descendant of a dbg.variable. Its definition is shown below:

def SubFieldOp : DebugOp<"subfield"> {
    let arguments = (ins
        StrAttr:$name,                  // The name of the subfield
        AnyType:$value,                 // The value of the subfield
        OptionalAttr<StrAttr>:$typeName,// The type name of the subfield
        OptionalAttr<ArrayAttr>:$params // The type parameters of the subfield
    );

    // The result of the operation to be used in other operations (dbg.struct, dbg.array)
    let results = (outs SubFieldType:$result);
}
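
The examples later in this section name each dbg.subfield after its parent, such as "inB.a" for a bundle field and "outC[0]" for a vector element. The following sketch reproduces that naming scheme. The helper names are hypothetical, and composing them for nested aggregates is an assumption, not something the examples show.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helpers, not CIRCT APIs: build the name given to a
// dbg.subfield from the name of its parent.
std::string bundleFieldName(const std::string &parent,
                            const std::string &field) {
  return parent + "." + field;
}

std::string vectorElementName(const std::string &parent, std::size_t index) {
  return parent + "[" + std::to_string(index) + "]";
}
```

For example, `bundleFieldName("inB", "a")` returns `inB.a` and `vectorElementName("outC", 1)` returns `outC[1]`.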

`dbg.moduleinfo` operation

Tywaves also supports type names and parameters for modules and instances. The dbg.moduleinfo operation stores this information for a module. It is materialized once per module and is only a descriptor operation that stores debug information.

def ModuleInfoOp : DebugOp<"moduleinfo"> {
    // Arguments storing the tywaves information
    let arguments = (ins
        StrAttr:$typeName,
        OptionalAttr<ArrayAttr>:$params
    );
}

Example without the updated operations

This is the output of the MaterializeDebugInfo pass without the updated operations. The same output is produced if the Tywaves annotations are not present in the FIRRTL file. If the -g option is not passed to the compiler, the Tywaves annotations are ignored.

// ...
dbg.variable "inA", %inA : !firrtl.uint<42>

%0 = firrtl.subfield %inB[a] : !firrtl.bundle<a: sint<19>, b: clock>
%1 = firrtl.subfield %inB[b] : !firrtl.bundle<a: sint<19>, b: clock>
%2 = dbg.struct {"a": %0, "b": %1} : !firrtl.sint<19>, !firrtl.clock
dbg.variable "inB", %2 : !dbg.struct

%3 = firrtl.subindex %outC[0] : !firrtl.vector<asyncreset, 2>
%4 = firrtl.subindex %outC[1] : !firrtl.vector<asyncreset, 2>
%5 = dbg.array [%3, %4] : !firrtl.asyncreset
dbg.variable "outC", %5 : !dbg.array

// ...
%c0_ui17 = firrtl.constant 0 : !firrtl.uint<17>
%c0_clock = firrtl.specialconstant 0 : !firrtl.clock

%someWire = firrtl.wire : !firrtl.uint<17>
dbg.variable "someWire", %someWire : !firrtl.uint<17>

%someNode = firrtl.node %c0_ui17 : !firrtl.uint<17>
dbg.variable "someNode", %someNode : !firrtl.uint<17>

%someReg = firrtl.reg %c0_clock : !firrtl.clock, !firrtl.uint<17>
dbg.variable "someReg", %someReg : !firrtl.uint<17>

Example with the updated operations

Here is the output of the MaterializeDebugInfo pass with the updated operations.

// ...
- dbg.variable "inA", %inA : !firrtl.uint<42>
+ dbg.variable "inA", %inA {typeName = "IO[UInt<42>]"} : !firrtl.uint<42>


- %0 = firrtl.subfield %inB[a] : !firrtl.bundle<a: sint<19>, b: clock>
+ %0 = firrtl.subfield %inB[a] : !firrtl.bundle<a: sint<19>, b: clock>
+ %1 = dbg.subfield "inB.a", %0 {typeName = "IO[SInt<19>]"} : !firrtl.sint<19>
- %1 = firrtl.subfield %inB[b] : !firrtl.bundle<a: sint<19>, b: clock>
+ %2 = firrtl.subfield %inB[b] : !firrtl.bundle<a: sint<19>, b: clock>
+ %3 = dbg.subfield "inB.b", %2 {typeName = "IO[SomeClockType]"} : !firrtl.clock
- %2 = dbg.struct {"a": %0, "b": %1} : !firrtl.sint<19>, !firrtl.clock
+ %4 = dbg.struct {"a": %1, "b": %3} : !dbg.subfield, !dbg.subfield
- dbg.variable "inB", %2 : !dbg.struct
+ dbg.variable "inB", %4 {typeName = "IO[MyBundle]"} : !dbg.struct

- %3 = firrtl.subindex %outC[0] : !firrtl.vector<asyncreset, 2>
+ %5 = firrtl.subindex %outC[0] : !firrtl.vector<asyncreset, 2>
+ %6 = dbg.subfield "outC[0]", %5 {typeName = "IO[AsyncReset]"} : !firrtl.asyncreset
- %4 = firrtl.subindex %outC[1] : !firrtl.vector<asyncreset, 2>
+ %7 = firrtl.subindex %outC[1] : !firrtl.vector<asyncreset, 2>
+ %8 = dbg.subfield "outC[1]", %7 {typeName = "IO[AsyncReset]"} : !firrtl.asyncreset
- %5 = dbg.array [%3, %4] : !firrtl.asyncreset
+ %9 = dbg.array [%6, %8] : !dbg.subfield
- dbg.variable "outC", %5 : !dbg.array
+ dbg.variable "outC", %9 {typeName = "IO[Vec<AsyncReset>]"} : !dbg.array

// ...
%c0_ui17 = firrtl.constant 0 : !firrtl.uint<17>
%c0_clock = firrtl.specialconstant 0 : !firrtl.clock

- %someWire = firrtl.wire : !firrtl.uint<17>
- dbg.variable "someWire", %someWire : !firrtl.uint<17>
+ %someWire = firrtl.wire {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "Wire[SInt<17>]", params = [{name="size", typeName="int", value="17"}]}]} : !firrtl.uint<17>
+ dbg.variable "someWire", %someWire {typeName = "Wire[SInt<17>]", params = [{name="size", typeName="int", value="17"}]} : !firrtl.uint<17>

- %someNode = firrtl.node %c0_ui17 : !firrtl.uint<17>
- dbg.variable "someNode", %someNode : !firrtl.uint<17>
+ %someNode = firrtl.node %c0_ui17 {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "UInt<17>"}]} : !firrtl.uint<17>
+ dbg.variable "someNode", %someNode {typeName = "UInt<17>"} : !firrtl.uint<17>

- %someReg = firrtl.reg %c0_clock : !firrtl.clock, !firrtl.uint<17>
- dbg.variable "someReg", %someReg : !firrtl.uint<17>
+ %someReg = firrtl.reg %c0_clock {annotations = [{class = "chisel3.tywaves.TywavesAnnotation", typeName = "Reg[SInt<17>]"}]} : !firrtl.clock, !firrtl.uint<17>
+ dbg.variable "someReg", %someReg {typeName = "Reg[SInt<17>]"} : !firrtl.uint<17>

New HGLDD fields

The MLIR dialects, including the debug dialect, are used internally in the compiler, and their serialization is not meant to be parsed externally. Therefore, they cannot be used directly by an external waveform viewer. HGLDD aims to be a more standardized format for this purpose. It is based on JSON, so adding fields does not break compatibility with existing parsers, since unhandled fields can simply be ignored.

The HGLDD format is updated with a new field source_lang_type_info. This field is emitted for modules, variables and fields of aggregates.

"source_lang_type_info": {
    "type_name": "<The type of the variable or module>"
},

or

"source_lang_type_info": {
    "type_name": "<The type of the variable or module>",
    "params": [
        {
            "name": "<The name of the parameter>",
            "typeName": "<The source language type of the parameter>",
            "value": "<The value actually used>"
        }
    ]
},
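
As an illustration of the two shapes above, the sketch below renders the object stored under "source_lang_type_info" and omits "params" when there are none. It is not how CIRCT emits HGLDD: `Param`, `quote` and `renderTypeInfo` are hypothetical names, and the string escaping is deliberately minimal.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified mirror of one constructor parameter.
struct Param {
  std::string name, typeName, value;
};

// Quote a string for JSON, escaping only '"' and '\\'.
// Control characters are not handled, to keep the sketch short.
std::string quote(const std::string &s) {
  std::string out = "\"";
  for (char c : s) {
    if (c == '"' || c == '\\')
      out += '\\';
    out += c;
  }
  return out + "\"";
}

// Render the value of "source_lang_type_info". The "params" key is emitted
// only when at least one parameter exists.
std::string renderTypeInfo(const std::string &typeName,
                           const std::vector<Param> &params) {
  std::string json = "{\"type_name\": " + quote(typeName);
  if (!params.empty()) {
    json += ", \"params\": [";
    for (std::size_t i = 0; i < params.size(); ++i) {
      if (i != 0)
        json += ", ";
      json += "{\"name\": " + quote(params[i].name) +
              ", \"typeName\": " + quote(params[i].typeName) +
              ", \"value\": " + quote(params[i].value) + "}";
    }
    json += "]";
  }
  return json + "}";
}
```

A parser that does not know about this field can skip it entirely, which is why the addition stays compatible with existing HGLDD consumers.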

HGLDD is not dumped directly from MLIR operations. The operations are first converted into a DI data structure, which is then used to emit the file. This data structure therefore also had to be updated to include the new fields in DIModule and DIVariable.

+ struct DISourceLang {
+   /// The name of the type.
+   StringAttr typeName;
+   /// The constructor parameters of the type.
+   ArrayAttr params;
+ };

struct DIModule {
  /// The operation that generated this level of hierarchy.
  Operation *op = nullptr;
  /// The name of this level of hierarchy.
  StringAttr name;
  /// Levels of hierarchy nested under this module.
  SmallVector<DIInstance *, 0> instances;
  /// Variables declared within this module.
  SmallVector<DIVariable *, 0> variables;
  /// If this is an extern declaration.
  bool isExtern = false;
  /// If this is an inline scope created by a `dbg.scope` operation.
  bool isInline = false;

+ /// The source language type of this module.
+ DISourceLang sourceLangType;
};

struct DIVariable {
  /// The name of this variable.
  StringAttr name;
  /// The location of the variable's declaration.
  LocationAttr loc;
  /// The SSA value representing the value of this variable.
  Value value = nullptr;

+ /// The source language type of this variable.
+ DISourceLang sourceLangType;
};

Opportunities for future work

The CIRCT project does not target only Chisel but also other HDLs. All the languages in CIRCT share the same internal compiler. Therefore, the debug dialect and HGLDD are not specific to Chisel and FIRRTL. If other languages were able to create the same annotations for FIRRTL, the compiler could emit exactly the same information, and all viewers reading the same HGLDD format could make use of it.

In this sense, Tywaves is not only a project for Chisel but one that looks toward all HDLs.

However, every language has its own constructs and characteristics. This means that Tywaves does not remove the need to update viewers when support for a new language is required. Such updates would nevertheless be easier thanks to HGLDD, and the shared infrastructure makes it possible to include the type information for all those languages.
