package api
Type Members
-
abstract class ChecksumLayer extends ChecksumLayerBase
A checksum layer computes a numeric value from a region of the data stream.
The term checksum is used generically here to subsume all sorts of CRCs, check digits, data hash, and digest calculations.
This abstract base is suitable only for checksums computed over small sections of data. It is not for large data streams or whole large files. The entire region of data the checksum is being computed over will be pulled into a byte buffer in memory.
The resulting checksum is the return value of the compute method. This result is delivered into a DFDL variable for use by the DFDL schema. This DFDL variable can have any name, such as 'crc', 'digest', or 'dataHash'.
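As a rough illustration of the kind of calculation a compute method might perform, the sketch below computes a CRC-32 over a small in-memory buffer using java.util.zip.CRC32. The class Crc32Sketch and its method checksumOf are hypothetical and stand alone; they do not extend ChecksumLayer, and the actual signature of compute should be taken from the ChecksumLayer API itself.

```java
import java.nio.ByteBuffer;
import java.util.zip.CRC32;

// Hypothetical stand-alone helper; not part of the Daffodil API.
final class Crc32Sketch {
    // Computes a CRC-32 over the remaining bytes of the buffer.
    // The whole region is assumed to be in memory already.
    static long checksumOf(ByteBuffer region) {
        CRC32 crc = new CRC32();
        // duplicate() shares the bytes but has its own position,
        // so the caller's buffer position is left untouched.
        crc.update(region.duplicate());
        return crc.getValue(); // unsigned 32-bit value held in a long
    }
}
```

A CRC-32 is an unsigned 32-bit quantity, which is why the sketch returns a long rather than an int.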
The derived implementation class must also define a getter method based on the name of the DFDL variable which will be assigned the checksum value. For example, if the checksum is actually a specific digest/hash calculation and the DFDL variable is named digest, then this getter must be defined:

    int getLayerVariableResult_digest() {
      return this.digest; // usually returns a data member
    }

The getter supplies the value returned from the compute method, and the DFDL variable named digest will be assigned that value.

The derived class implementing a checksum layer must call

    setLength(len); // sets the length in bytes

typically from within its setLayerVariableParameters method:

    void setLayerVariableParameters(...) {
      ...
      setLength(len); // len is a constant,
                      // or is computed from a parameter variable
      ...
    }

See the Layer class for a description of how DFDL variables are passed to the arguments of the setLayerVariableParameters method.

See Layer for more details about layers generally, as most of its documentation is relevant to this derived abstract base class as well.
-
abstract class Layer extends AnyRef
This is the primary API class for writing layers.
All layers are derived from this class.
This class is used directly as a base class to define transforming layers. To simplify the definition of checksum layers, a specialized sub-class, org.apache.daffodil.runtime1.layers.api.ChecksumLayer, is also available. Many of the requirements for layers, such as the naming conventions for layer variables, are described here, but they apply equally to checksum layers.
Derived classes will be dynamically loaded by Java's Service Provider Interface (SPI) system. The names of concrete classes derived from Layer are listed in a metadata resource file named for this class; that is, the file name is the fully-qualified class name of this class: resources/META-INF/services/org.apache.daffodil.runtime1.layers.api.Layer. Each line of this file contains one fully-qualified class name of a derived layer class. More than one line in the file denotes that the jar file contains the definitions of more than one derived layer class. This file is incorporated in the compiled jar file for the derived Layer class so that the class path can be searched and jars containing layer classes can be dynamically loaded.
The SPI creates an instance of the class by calling a default (no-arg) constructor, which should be the only constructor.
Instances of derived layer classes can be stateful. They are private to threads, and each time a layer is encountered during parse/unparse, an instance is created for that situation.
Layer instances should not share mutable state (such as via singleton objects).
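The instantiation behavior described above can be sketched without Daffodil itself. The example below creates instances through a no-arg constructor by reflection and shows that each instance carries its own state. CountingLayerSketch and NoArgInstantiationSketch are hypothetical names used only for illustration, and this is not how Daffodil itself is implemented. Note that java.util.ServiceLoader expects a public provider class with a public zero-argument constructor; the classes here are package-private only so the sketch fits in a single file.

```java
// Hypothetical layer-like class; a real layer would extend Daffodil's Layer.
class CountingLayerSketch {
    private int bytesSeen = 0;      // per-instance mutable state is fine

    CountingLayerSketch() { }       // the only constructor takes no arguments

    void record(int n) { bytesSeen += n; }

    int bytesSeen() { return bytesSeen; }
}

final class NoArgInstantiationSketch {
    // Creates a fresh instance through the no-arg constructor, which is
    // roughly what a service loader does for each provider class it finds.
    static <T> T newInstanceOf(Class<T> cls) {
        try {
            return cls.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("no usable no-arg constructor on " + cls.getName(), e);
        }
    }
}
```

Because every use gets its own instance, state such as the bytesSeen counter never leaks from one use of the layer to another, provided no static or singleton state is involved.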
About Layer Variables
Layer logic may read and write DFDL variables. These variables are associated with the layer implementation class by using Java/Scala reflection to find matches (case-sensitive) between the names of DFDL variables and method names and method arguments of the layer's Java/Scala code. Hence, the layer's DFDL variables must have names suitable for use as Java/Scala identifiers.
The layer namespace is used only for its layer. All DFDL variables defined in that namespace are either used to pass parameters to the layer code, or receive results (such as a checksum) back from the layer code. This is enforced. If a layer namespace contains a DFDL variable and there is no corresponding usage of that variable name in the layer code (following the conventions below), then it is a Schema Definition Error when the layer code is loaded.
A layer that does not define any DFDL variables does not have to define a DFDL schema that defines the layer's target namespace, but any layer that uses DFDL variables *must* define a schema with the layer's namespace as its target namespace, and with the variables declared in it (using dfdl:defineVariable).
Every DFDL variable in the layer's target namespace is used either at the start of the layer algorithm as a parameter to the layer or at the end of the layer algorithm where it is assigned a return value (such as a checksum or flag) from the layer.
Variables being read must have values before being read by the layer, and this is true for both parsing and unparsing. This happens when the Daffodil processor begins parsing/unparsing the layered sequence. When unparsing, variables being read cannot be forward-referencing to parts of the DFDL infoset that have not yet been unparsed.
A layer that wants to read parameters declares a special setter named setLayerVariableParameters, whose arguments each have a name and type matching a corresponding dfdl:defineVariable in the layer's target namespace.
For example, if the layer logic has a DFDL variable for a parameter named direction of type xs:string and another DFDL variable for a parameter named wordCount of type xs:unsignedShort, then the derived Layer class must have a setLayerVariableParameters with arguments corresponding in name and type to these two variables. This setter will be called, passing the values of the variables, immediately after the layer instance is constructed. The arguments to setLayerVariableParameters can be in any order:

    void setLayerVariableParameters(String direction, int wordCount) {
      // usually this setter will assign the values to data members
      this.direction = direction;
      this.wordCount = wordCount;
    }

Besides initializing local members, this setter is also an initializer for the layer class instance. Any exception thrown becomes a Schema Definition Error.
If there are no parameter variables, then this setter, with no arguments, can be used purely for initialization.
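Since any exception thrown from this setter is reported as a Schema Definition Error, the setter is a natural place to validate parameters. The following is a minimal stand-alone sketch of that idea. The class name WordLayerParamsSketch, the permitted direction values, and the lower bound on wordCount are all assumptions made for illustration; a real layer would extend Layer.

```java
// Hypothetical stand-alone class; a real layer would extend Daffodil's Layer.
class WordLayerParamsSketch {
    private String direction;
    private int wordCount;

    // Argument names match the DFDL variable names.
    // An xs:unsignedShort parameter arrives as a Java int.
    void setLayerVariableParameters(String direction, int wordCount) {
        // Assumed set of legal values, for illustration only.
        if (!"forward".equals(direction) && !"reverse".equals(direction)) {
            throw new IllegalArgumentException("direction must be 'forward' or 'reverse': " + direction);
        }
        // 65535 is the xs:unsignedShort maximum; requiring at least 1 is an assumption.
        if (wordCount < 1 || wordCount > 65535) {
            throw new IllegalArgumentException("wordCount out of range: " + wordCount);
        }
        this.direction = direction;
        this.wordCount = wordCount;
    }

    String direction() { return direction; }

    int wordCount() { return wordCount; }
}
```

Rejecting bad values here means the problem surfaces when the layer is set up, before any data has been decoded or encoded.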
A DFDL variable used to return a result from a layer must be undefined, since variables in DFDL are single-assignment. Usually this means the use of the layer must be surrounded by a dfdl:newVariableInstance annotation, which creates a new instance of the layer result variable over a limited scope of use. The variable is assigned by the layer, and it is then available for reading by the DFDL schema until the end of the dfdl:newVariableInstance scope.
To return a value into a DFDL variable, the layer implementation defines a special recognizable getter method. The name of the getter is formed by prefixing the DFDL variable name with the string "getLayerVariableResult_". The return type of the getter must correspond to the type of the variable.
For example, a result value getter for a DFDL variable named total of type xs:unsignedShort would be:

    int getLayerVariableResult_total() {
      // returns the value created by the layer's algorithm.
      // commonly this returns the value of a data member.
      return this.total;
    }

See also the ChecksumLayer class, which is designed to facilitate creation of checksum/CRC/hash/digest layers.
The Java types to use for the setter arguments and getter return types correspond to the DFDL variable types according to this table:
DFDL Schema Type         Java Type
xs:byte                  byte
xs:short                 short
xs:int                   int
xs:long                  long
xs:integer               java.math.BigInteger
xs:decimal               java.math.BigDecimal
xs:unsignedInt           long
xs:unsignedByte          short
xs:unsignedShort         int
xs:unsignedLong          java.math.BigInteger
xs:nonNegativeInteger    java.math.BigInteger
xs:double                double
xs:float                 float
xs:hexBinary             byte[]
xs:anyURI                java.net.URI
xs:boolean               boolean
xs:dateTime              com.ibm.icu.util.ICUCalendar
xs:date                  com.ibm.icu.util.ICUCalendar
xs:time                  com.ibm.icu.util.ICUCalendar
Layer Algorithm
The rest of the Layer class implements the layer decode/encode logic.
The actual algorithm of the layer is not implemented in methods of the derived layer class, but rather is implemented in the layer's input decoder and output encoder. These extend the java.io.InputStream and java.io.OutputStream base classes to actually handle the data.
Every layer must implement the wrapLayerInput and wrapLayerOutput methods, which provide the input decoder and output encoder instances to the Daffodil layer framework. When parsing/unparsing, the derived Layer class itself is concerned with setup and tear-down of the layer's input decoder and output encoder, with providing access to/from DFDL variables, and with reporting errors effectively.
Layer Exception Handling
The method setProcessingErrorException allows the layer to specify that if the layer throws specific exceptions or runtime exceptions, they are converted into processing errors. This eliminates most of the need for layer code to contain try-catch blocks. For example:

    setProcessingErrorException(IOException.class);
Unhandled exceptions thrown by the layer code are treated as fatal errors.
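To make the decoder/encoder idea from the Layer Algorithm section concrete, here is a minimal sketch of an input decoder and output encoder pair for a made-up transform: the layer data is ASCII hex text, two digits per byte. Both class names are hypothetical, and the exact signatures of wrapLayerInput and wrapLayerOutput are not shown because they should be taken from the Layer API. The decoder reports malformed data by throwing IOException, the exception type used in the setProcessingErrorException example above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical input decoder: turns pairs of ASCII hex digits into bytes.
class HexDecodingInputStream extends InputStream {
    private final InputStream in;

    HexDecodingInputStream(InputStream in) { this.in = in; }

    @Override
    public int read() throws IOException {
        int hi = in.read();
        if (hi < 0) return -1;                    // clean end of data
        int lo = in.read();
        if (lo < 0) throw new IOException("odd number of hex digits");
        int h = Character.digit(hi, 16);
        int l = Character.digit(lo, 16);
        if (h < 0 || l < 0) throw new IOException("not a hex digit pair");
        return (h << 4) | l;
    }

    @Override
    public void close() throws IOException { in.close(); }
}

// Hypothetical output encoder: writes each byte as two ASCII hex digits.
class HexEncodingOutputStream extends OutputStream {
    private static final char[] DIGITS = "0123456789ABCDEF".toCharArray();
    private final OutputStream out;

    HexEncodingOutputStream(OutputStream out) { this.out = out; }

    @Override
    public void write(int b) throws IOException {
        out.write(DIGITS[(b >> 4) & 0xF]);
        out.write(DIGITS[b & 0xF]);
    }

    @Override
    public void flush() throws IOException { out.flush(); }

    @Override
    public void close() throws IOException { out.close(); }
}
```

Extending InputStream and OutputStream directly, rather than the Filter variants, means the inherited bulk read and write methods are routed through the single-byte methods above, so the transform cannot be bypassed.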
This is the documentation for the Apache Daffodil Scala API.
Package structure
org.apache.daffodil.sapi - Provides the classes necessary to compile DFDL schemas, parse and unparse files using the compiled objects, and retrieve results and parsing diagnostics
org.apache.daffodil.udf - Provides the classes necessary to create User Defined Functions to extend the DFDL expression language
org.apache.daffodil.runtime1.layers.api - Provides the classes necessary to create custom Layer extensions to DFDL.