package api

Type Members

  1. abstract class ChecksumLayer extends ChecksumLayerBase

    A checksum layer computes a numeric value from a region of the data stream.

    The term checksum is used generically here to subsume all sorts of CRCs, check digits, data hash, and digest calculations.

    This abstract base is suitable only for checksums computed over small sections of data. It is not for large data streams or whole large files. The entire region of data the checksum is being computed over will be pulled into a byte buffer in memory.

    The resulting checksum is the return value of the #compute method.

    This result is delivered into a DFDL variable for use by the DFDL schema. This DFDL variable can have any name such as 'crc', 'digest', or 'dataHash'.

    The derived implementation class must also define a getter method whose name is based on the name of the DFDL variable that will be assigned the checksum value. For example, if the checksum is actually a specific digest/hash calculation and the DFDL variable is named digest, then this getter must be defined:

        int getLayerVariableResult_digest() {
          return this.digest; // usually returns a data member
        }

    This will be called automatically to retrieve the integer value that was returned from the compute method, and the DFDL variable named digest will be assigned that value.

    The derived class implementing a checksum layer must call

        setLength(len); // sets the length in bytes

    to specify the length of the data region in bytes. Normally this would be called from the layer's implementation of the setLayerVariableParameters method:

        void setLayerVariableParameters(...) {
            setLength(len); // len is a constant,
                            // or is computed from a parameter variable
        }

    See the documentation of the Layer class for a description of how DFDL variables are passed to the arguments of the setLayerVariableParameters method.

    See Layer for more details about layers generally as most of its documentation is relevant to this derived abstract base class as well.
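
    As an illustration of the kind of calculation such a layer performs, the following is a minimal sketch that computes a CRC-32 over a small in-memory region. The class Crc32Sketch is a hypothetical stand-in and does not extend the real ChecksumLayer; the long result type is an assumption chosen to match an xs:unsignedInt variable, and the variable name crc is only an example.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Hypothetical stand-in: it does NOT extend the real ChecksumLayer class.
class Crc32Sketch {
    private long crc; // most recently computed checksum

    // Computes a CRC-32 over the remaining bytes of the region.
    long compute(ByteBuffer region) {
        CRC32 crc32 = new CRC32();
        // duplicate() shares the bytes but keeps the caller's position unchanged
        crc32.update(region.duplicate());
        crc = crc32.getValue();
        return crc;
    }

    // Follows the getLayerVariableResult_ naming convention for a
    // DFDL variable assumed to be named "crc".
    long getLayerVariableResult_crc() {
        return crc;
    }

    public static void main(String[] args) {
        Crc32Sketch sketch = new Crc32Sketch();
        byte[] data = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.println(Long.toHexString(sketch.compute(ByteBuffer.wrap(data))));
    }
}
```

    Because the whole region is held in a single buffer, this approach only suits small regions, which is the same limitation noted above.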

  2. abstract class Layer extends AnyRef

    This is the primary API class for writing layers.

    All layers are derived from this class.

    This class is used directly as a base class to define transforming layers. To simplify the definition of checksum layers, a specialized sub-class org.apache.daffodil.runtime1.layers.api.ChecksumLayer is also available. Many of the requirements for layers, such as the naming conventions for layer variables, are described here, but they apply equally to checksum layers.

    Derived classes will be dynamically loaded by Java's Service Provider Interface (SPI) system. The names of concrete classes derived from Layer are listed in a metadata resource file named for this class (that is, the file name is the fully-qualified class name of this class: resources/META-INF/services/org.apache.daffodil.runtime1.layers.api.Layer). This file contains lines where each line contains one fully qualified class name of a derived layer class. More than one line in the file denotes that the Jar file contains the definitions of more than one derived layer class. This file is incorporated in the compiled jar file for the derived Layer class so that the class path can be searched and Jars containing layer classes can be dynamically loaded.
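
    The discovery mechanism can be sketched with java.util.ServiceLoader, the standard Java entry point for SPI lookups. This is an assumption-laden illustration, not Daffodil's actual loading code: HypotheticalLayer stands in for the real Layer base class, and com.example.MyLayer in the comment is an invented provider name.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

class ServiceLoaderSketch {
    // Hypothetical stand-in for the real Layer base class.
    static abstract class HypotheticalLayer { }

    // Returns the class names of every provider found on the class path.
    // A provider is found only if some jar contains a resource named
    // META-INF/services/<binary name of HypotheticalLayer>
    // with one fully qualified provider class name per line, for example:
    //   com.example.MyLayer   (invented name)
    static List<String> discover() {
        List<String> found = new ArrayList<>();
        for (HypotheticalLayer layer : ServiceLoader.load(HypotheticalLayer.class)) {
            found.add(layer.getClass().getName());
        }
        return found;
    }

    public static void main(String[] args) {
        // With no matching resource file on the class path, nothing is listed.
        System.out.println(discover());
    }
}
```

    ServiceLoader instantiates each provider through its no-arg constructor, which is why a derived layer class needs one.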

    The SPI creates an instance of the class by calling a default (no-arg) constructor, which should be the only constructor.

    Instances of derived layer classes can be stateful. They are private to threads, and each time a layer is encountered during parse/unparse, an instance is created for that situation.

    Layer instances should not share mutable state (such as via singleton objects).

    About Layer Variables

    Layer logic may read and write DFDL variables. These variables are associated with the layer implementation class by using Java/Scala reflection to find matches (case-sensitive) between the names of DFDL variables and method names and method arguments of the layer's Java/Scala code. Hence, the layer's DFDL variables must have names suitable for use as Java/Scala identifiers.

    The layer namespace is used only for its layer. All DFDL variables defined in that namespace are either used to pass parameters to the layer code, or receive results (such as a checksum) back from the layer code. This is enforced. If a layer namespace contains a DFDL variable and there is no corresponding usage of that variable name in the layer code (following the conventions below), then it is a Schema Definition Error when the layer code is loaded.

    A layer that does not define any DFDL variables does not have to define a DFDL schema that defines the layer's target namespace, but any layer that uses DFDL variables *must* define a schema with the layer's namespace as its target namespace, and with the variables declared in it (using dfdl:defineVariable).

    Every DFDL Variable in the layer's target namespace is used either at the start of the layer algorithm as a parameter to the layer or at the end of the layer algorithm where it is assigned a return value (such as a checksum or flag) from the layer.

    Variables being read must have values before being read by the layer, and this is true for both parsing and unparsing. This happens when the Daffodil processor begins parsing/unparsing the layered sequence. When unparsing, variables being read cannot be forward-referencing to parts of the DFDL infoset that have not yet been unparsed.

    A layer that wants to read parameters declares a special setter named setLayerVariableParameters which has arguments where each has a name and type that match a corresponding dfdl:defineVariable in the layer's target namespace.

    For example, if the layer logic has a DFDL variable for a parameter named direction of type xs:string and another DFDL variable for a parameter named wordCount of type xs:unsignedShort, then the derived Layer class must have a setLayerVariableParameters method with arguments corresponding in name and type to these two variables. This setter will be called, passing the values of the variables, immediately after the layer instance is constructed. The arguments to setLayerVariableParameters can be in any order:

        void setLayerVariableParameters(String direction, int wordCount) {
            // usually this setter will assign the values to data members
            this.direction = direction;
            this.wordCount = wordCount;
        }

    Besides initializing local members, this setter is also an initializer for the layer class instance. Any exception thrown becomes a Schema Definition Error.

    If there are no parameter variables, then this setter, with no arguments, can be used purely for initialization.
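
    Since an exception thrown from this setter is reported as a Schema Definition Error, the setter is a natural place to validate parameters. The following is a minimal sketch of that idea. ParameterSetterSketch is a hypothetical stand-in that does not extend the real Layer class, and the accepted values ("forward", "reverse", and the 0 to 65535 range) are illustrative assumptions.

```java
// Hypothetical stand-in: it does NOT extend the real Layer class.
class ParameterSetterSketch {
    private String direction;
    private int wordCount;

    // Argument names and types mirror the DFDL variables direction (xs:string)
    // and wordCount (xs:unsignedShort) from the example in the text.
    void setLayerVariableParameters(String direction, int wordCount) {
        // Illustrative validation rules; a real layer defines its own.
        if (!"forward".equals(direction) && !"reverse".equals(direction)) {
            throw new IllegalArgumentException(
                "direction must be 'forward' or 'reverse', got: " + direction);
        }
        if (wordCount < 0 || wordCount > 65535) {
            throw new IllegalArgumentException(
                "wordCount is outside the xs:unsignedShort range: " + wordCount);
        }
        this.direction = direction;
        this.wordCount = wordCount;
    }

    String direction() { return direction; }

    int wordCount() { return wordCount; }
}
```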

    A DFDL variable used to return a result from a layer must be undefined, since variables in DFDL are single-assignment. Usually this means the use of the layer must be surrounded by a dfdl:newVariableInstance annotation which creates a new instance of the layer result variable, over a limited scope of use. The variable is assigned by the layer, and it is then available for reading by the DFDL schema until the end of the dfdl:newVariableInstance scope.

    To return a value into a DFDL variable, the layer implementation defines a special recognizable getter method. The name of the getter is formed from prefixing the DFDL variable name with the string "getLayerVariableResult_". The return type of the getter must correspond to the type of the variable.

    For example, a result value getter for a DFDL variable named total of type xs:unsignedShort would be:

         int getLayerVariableResult_total() {
             // returns the value created by the layer's algorithm.
             // commonly this returns the value of a data member.
             return this.total; // assumes a data member named total
         }

    Layers can have multiple result variables, but a single variable is most common, generally for returning checksums. See the ChecksumLayer class, which is designed to facilitate creation of checksum/CRC/hash/digest layers.
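
    To make the naming convention concrete, the sketch below uses Java reflection to list the result variables that a class exposes through getLayerVariableResult_ getters. It illustrates the convention only and is not Daffodil's actual matching code. TotalLayerSketch and its overflow variable are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;
import java.util.TreeMap;

class ResultGetterScanSketch {
    static final String PREFIX = "getLayerVariableResult_";

    // Hypothetical layer-like class; it does NOT extend the real Layer class.
    static class TotalLayerSketch {
        private int total = 0;
        private boolean overflow = false;

        int getLayerVariableResult_total() { return total; }

        boolean getLayerVariableResult_overflow() { return overflow; }

        void unrelatedHelper() { } // ignored: the name lacks the prefix
    }

    // Maps each result variable name to the Java return type of its getter.
    static Map<String, Class<?>> resultVariables(Class<?> layerClass) {
        Map<String, Class<?>> found = new TreeMap<>(); // sorted by variable name
        for (Method m : layerClass.getDeclaredMethods()) {
            String name = m.getName();
            if (name.startsWith(PREFIX) && m.getParameterCount() == 0) {
                found.put(name.substring(PREFIX.length()), m.getReturnType());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(resultVariables(TotalLayerSketch.class));
    }
}
```

    The variable name is whatever follows the prefix, which is why DFDL variable names used with layers must be valid Java identifiers.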

    The Java types to use for the setter arguments and getter return types correspond to the DFDL variable types according to this table:

    DFDL Schema Type         Java Type
    xs:byte                  byte
    xs:short                 short
    xs:int                   int
    xs:long                  long
    xs:integer               java.math.BigInteger
    xs:decimal               java.math.BigDecimal
    xs:unsignedInt           long
    xs:unsignedByte          short
    xs:unsignedShort         int
    xs:unsignedLong          java.math.BigInteger
    xs:nonNegativeInteger    java.math.BigInteger
    xs:double                double
    xs:float                 float
    xs:hexBinary             byte[]
    xs:boolean               boolean
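
    In the table, each unsigned DFDL type maps to the next wider Java type, since Java's byte, short, int, and long are all signed. The sketch below shows how raw bits read as a signed Java value can be widened to the type listed in the table using standard JDK methods. The class and method names are hypothetical helpers, not part of the layer API.

```java
import java.math.BigInteger;

class UnsignedWideningSketch {
    // xs:unsignedByte (0..255) fits in a Java short
    static short fromUnsignedByte(byte raw) {
        return (short) Byte.toUnsignedInt(raw);
    }

    // xs:unsignedShort (0..65535) fits in a Java int
    static int fromUnsignedShort(short raw) {
        return Short.toUnsignedInt(raw);
    }

    // xs:unsignedInt (0..4294967295) fits in a Java long
    static long fromUnsignedInt(int raw) {
        return Integer.toUnsignedLong(raw);
    }

    // xs:unsignedLong exceeds the range of long, so BigInteger is needed
    static BigInteger fromUnsignedLong(long raw) {
        return new BigInteger(Long.toUnsignedString(raw));
    }

    public static void main(String[] args) {
        System.out.println(fromUnsignedByte((byte) 0xFF));
        System.out.println(fromUnsignedShort((short) 0xFFFF));
        System.out.println(fromUnsignedInt(0xFFFFFFFF));
        System.out.println(fromUnsignedLong(-1L));
    }
}
```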

    Layer Algorithm

    The rest of the Layer class implements the layer decode/encode logic.

    The actual algorithm of the layer is not implemented in methods of the derived layer class, but rather in the layer's input decoder and output encoder. These extend their respective base classes to actually handle the data.

    Every layer must implement the #wrapLayerInput and #wrapLayerOutput methods, which provide the input decoder and output encoder instances to the Daffodil layer framework. When parsing/unparsing, the derived Layer class itself is concerned with setup and tear-down of the layer's input decoder and output encoder, with providing access to/from DFDL variables, and with reporting errors effectively.
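
    The decoder/encoder pairing can be sketched as a pair of stream wrappers. This sketch assumes that the input decoder and output encoder are ordinary java.io stream wrappers, which the text above does not confirm. The XOR transform, the KEY value, and the helper names wrapInput and wrapOutput are all hypothetical and only illustrate the shape of a transforming layer.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

class XorStreamSketch {
    static final int KEY = 0x5A; // illustrative transform key

    // Input decoder: undoes the transform while the data is being read.
    static class XorDecoder extends FilterInputStream {
        XorDecoder(InputStream in) { super(in); }

        @Override public int read() throws IOException {
            int b = in.read();
            return b < 0 ? b : (b ^ KEY) & 0xFF; // pass end-of-stream through
        }

        @Override public int read(byte[] buf, int off, int len) throws IOException {
            int n = in.read(buf, off, len);
            for (int i = off; i < off + n; i++) buf[i] ^= KEY;
            return n;
        }
    }

    // Output encoder: applies the transform while the data is being written.
    // FilterOutputStream routes array writes through write(int).
    static class XorEncoder extends FilterOutputStream {
        XorEncoder(OutputStream out) { super(out); }

        @Override public void write(int b) throws IOException { out.write(b ^ KEY); }
    }

    static InputStream wrapInput(InputStream in) { return new XorDecoder(in); }

    static OutputStream wrapOutput(OutputStream out) { return new XorEncoder(out); }

    static byte[] encode(byte[] plain) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (OutputStream enc = wrapOutput(sink)) {
            enc.write(plain);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    static byte[] decode(byte[] encoded) {
        ByteArrayOutputStream plain = new ByteArrayOutputStream();
        try (InputStream dec = wrapInput(new ByteArrayInputStream(encoded))) {
            int b;
            while ((b = dec.read()) >= 0) plain.write(b);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return plain.toByteArray();
    }
}
```

    Keeping the transform inside the stream wrappers leaves the layer class itself responsible only for setup, variables, and error reporting, as described above.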

    Layer Exception Handling

    The method setProcessingErrorException allows the layer to specify that particular exceptions or runtime exceptions thrown by the layer are to be converted into processing errors. This eliminates most of the need for layer code to contain try-catch blocks. For example, registering IOException with this method informs the DFDL processor that an IOException thrown from the layer is to be treated as a processing error.
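
    The effect of registering an exception type can be modeled with a small stand-in. ProcessingErrorSketch is hypothetical and does not extend the real Layer class. The setter's signature (taking a Class object) and the matching of subclasses are assumptions, and the classify method exists only for this illustration.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in: it does NOT extend the real Layer class.
class ProcessingErrorSketch {
    private final List<Class<? extends Exception>> processingErrorTypes = new ArrayList<>();

    // Same name as the method in the text; the signature here is an assumption.
    void setProcessingErrorException(Class<? extends Exception> type) {
        processingErrorTypes.add(type);
    }

    // Illustrative only: reports how a thrown exception would be treated.
    // Matching subclasses of a registered type is an assumption.
    String classify(Exception thrown) {
        for (Class<? extends Exception> type : processingErrorTypes) {
            if (type.isInstance(thrown)) {
                return "processing error";
            }
        }
        return "fatal error";
    }

    public static void main(String[] args) {
        ProcessingErrorSketch layer = new ProcessingErrorSketch();
        layer.setProcessingErrorException(IOException.class);
        System.out.println(layer.classify(new FileNotFoundException("missing")));
        System.out.println(layer.classify(new IllegalStateException("bug")));
    }
}
```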

    Unhandled exceptions thrown by the layer code are treated as fatal errors.