nrrd (Nearly Raw Raster Data) format SpecificationCompiled by Paul Bourke
Nrrd is a library and file format designed to support scientific visualization and image processing involving N-dimensional raster data. Nrrd stands for "nearly raw raster data". Besides dimensional generality, nrrd is flexible with respect to type (8 integral types, 2 floating point types), encoding of written files (raw, ascii, hex, or gzip or bzip2 compression), and endianness (the byte order of data is explicitly recorded when the type or encoding expose it). Besides the NRRD format, the library can read and write PNG, PPM, and PGM images, as well as some VTK "STRUCTURED_POINTS" datasets. About two dozen operations on raster data are implemented, including simple things like quantizing, slicing, and cropping, as well as fancier things like projection, histogram equalization, and filtered resampling (up and down) with arbitrary seperable kernels. This document defines the NRRD file format. It has been updated to reflect the latest version of the format, supported by Teem version 1.9 and later. Since this document aims to be a self-contained reference for the NRRD file format, some of the material here repeats ideas found elsewhere in the nrrd documentation. Besides defining the format, this document also seeks to supply some rationale. This page has been written with a view towards keeping it useful upon printing. When saved, the filenames for NRRD files should end in ".nrrd". Detached headers, discussed in Section 3, should end in ".nhdr". Suggested suffixes for data files associated with detached headers (".raw", ".txt", etc) are listed in the encoding part of Section 5. The nrrdRead()/nrrdLoad() and nrrdWrite()/nrrdSave() functions of the nrrd library are intended to completely support the format as described here, but as yet there is no test suite. 1. General Information1.1 Format versions and magicsNRRD files come in two forms: with an attached header (in which the header and the data are in the same file), and with a detached header (header and data are in two separate files). The attached-header form is described first, and the minor variation that enables detached headers is described in Section 3.The general format of a NRRD file (with attached header) is: NRRD000X <field>: <desc> <field>: <desc> # <comment> ... <field>: <desc> <key>:=<value> <key>:=<value> <key>:=<value> # <comment> <data><data><data><data><data><data>... The very first line contains nothing but the NRRD "magic". The magic is what identifies the file as a NRRD file, for the benefit of readers that have to handle multiple file formats. For NRRD files, the first four characters are always "NRRD", and the remaining characters give information about the file format version. The "X" in "NRRD000X" identifies which version of the file format is being used. As new features have been added to the format, the X as been incremented to ensure that readers in old binaries don't try to parse newer headers. Briefly, the different magics are:
1.2 Basic header structureBecause the NRRD format uses a multi-line text header, some mention should be made of what exactly a line is. When Windows (and sometimes Cygwin) creates a text file, each line is terminated by a pair of characters, "\r\n". When everyone else creates a text file, each line is terminated by just "\n". NRRD readers must be able to handle both types of line terminations; NRRD writers can use one or the other or both. The line termination is ignored once the line has been read from file, so when the description herein says "the line contains ...", it is referring to everything prior to the character(s) comprising the line termination. In the current definition of the format, there is no limit to how long a line can be. The content, labels, key/value pairs, and comment fields (described below) allow for arbitrarily long strings. ASCII encoding is assumed for the NRRD header.Each of the "<field>: <desc>" lines specifies information about one of the fields in the nrrd. Each of these lines is called a "field specification", or more loosely, a "field". Each field specification is contained in exactly one line. Each field specification may appear no more than once in the nrrd header. All the field specifications have the same structure: a string "<field>" identifying the field (called the field identifier), then a colon followed by a single space ": ", and then the information describing the field "<desc>" (called the field descriptor). All field identifiers are case insensitive. Many of the field descriptors are also case insensitive, the exceptions are the field descriptors containing strings: the content, labels, units, space units, and sample units fields. Whitespace (that is not part of the previous line's termination) is not allowed before a field identifier. Extra whitespace after the field descriptor and before the line termination is ignored. Each of the "<key>:=<value>" lines specifies a key/value pair in the nrrd. These can appear in NRRD0002 (and higher version) files, but not NRRD0001 files. The key and value strings are delimited by the first ":=" to appear on the line: any spaces before or after ":=" are assumed part of the key or value, respectively. The length of the key string must be at least one character, the value string can be zero-length. Because both the key and value must appear on exactly one line, a minimal escaping scheme is required, which readers must interpret and writers must generate:
Comment lines start with a pound, "#", with no proceeding whitespace. The comment string itself starts with the first character which is a not a pound or a space (" "). Comment lines with a zero-length comment string should be ignored. Comment lines may be interspersed with field specifications in any order. This allows field specifications to be commented out and commented upon easily. Gracious NRRD readers should store all the comments seen in the header, but this is not a requirement. Comments are case sensitive. The magic, field specifications, key/value pairs, and comments comprise the NRRD header. After the header, there is a single blank line containing zero characters. This separates the header from the data, which follows. The header, the blank line, and the data comprise the NRRD file. A single NRRD file can store the information and data for a single array. There is currently no facility for storing multiple arrays in a single NRRD file. The NRRD format is complicated by the fact that some fields are always necessary, while some fields are always optional, and some fields are necessary only some of the time. Whether or not a field is necessary depends on previous information in the header. This is why there is no standard template for all NRRD headers, or a context-free grammar for NRRD headers. 1.3 "Per-axis" versus "basic" fields, and field specification orderingSome field specifications give a piece of information about each and every axis in a nrrd; these are called "per-axis field specifications". An example is the sizes field that identifies how many samples there are along each dimension of the array. Per-axis specifications must have as many components as there are dimensions in the array, so that they can identify information for every axis. There is no partial or implied per-axis information. Other field specifications are general with respect to array dimension, such as the type of the array; these are called "basic field specifications".An important and necessary basic field specification is the one giving the dimension of the nrrd. The format is: <int> can be any integer greater than 0. NRRD readers may not have the ability to represent absolutely any dimension, but they must be able to handle nrrds with dimension 16 or less, which is what the current nrrd implementation can do.dimension: <int> The number of samples along each axis is the only necessary per-axis specification. The format is: <size[i]> is the number of samples along axis i, with axis ordering going from fastest to slowest. As with all the other per-axis field identifiers, sizes ends with "s" to emphasize the plurality of the field it specifies. The field identifiers do not change, however, for one-dimensional nrrds.sizes: <size[0]> <size[1]> ... <size[dim-1]> Basic specifications can appear both before and after the dimension field. Per-axis specifications can only appear after the dimension field specification. This simplifies the task of parsing per-axis specifications, since we know how many pieces of information need to be parsed (the same as the dimension), as well as avoiding attempts at cleverness in the form of guessing the dimension from the first per-axis specification. There is a similar field specification ordering constraint associated with the orientation information described in Section 4: the space in which the array conceptually "lives" has to be identified (or at least its dimension given) prior to parsing all the other orientation-related fields which have one piece of information per space coordinate. A final constraint on field specification ordering is that the "data file: LIST" form of the data file field. must be the last field specification in the header. Within these constraints, the field specification may appear in any order. 1.4 Axis ordering, and a minimalist headerThe issue of axis ordering is fundamental. In memory and on disk, there is a strict linear order of all the values in an array, so that each value has a single integer address. Conceptually, however, each value has one or more integer coordinates which identify its position within the array (as many coordinates as there are dimensions in the array). The "fastest" axis is the one associated with the coordinate which increments fastest as the samples are traversed in linear order. For example, the typical raster ordering of interleaved RGB 2-dimensional image data is actually a three-dimensional array. The fastest axis is the color axis (only three samples long), followed by the horizontal axis, with the vertical axis being the slowest. All the per-axis field specifications identify information for each axis, and the axis ordering is always (reading left to right) fastest to slowest. NRRD does not assume any names for the axes, such as "X", "Y", "Z" or "I", "J", "K": they are identified solely by their location in the ordering from fastest to slowest. Besides dimension, there are two other always-necessary basic specifications: the type and the encoding specifications. Their format is: The possible values for type include the C identifiers you would probably use to identify a type: "int" means a 32-bit signed integer, "float" means 32-bit floating point, and so on. Useful variants like "uchar" (same as "unsigned char") are allowed. There is also the block type, which is used to represent some chunk of opaque memory, of user-specified size; see the type specification in Section 5 for all the details.type: <type> encoding: <encoding> The encoding tells how the data (following the blank line after the header) is written out; "ascii" and "raw" are common values, but "hex" allows extraction of images from some PostScript files, and compression is also supported; see the encoding specification (Section 5). NRRD readers must be able to support raw and ASCII encoding, everything else is optional. The field specifications described so far provide the means of writing a minimal NRRD header: This is identical in meaning to the PPM header:NRRD0001 # my first nrrd type: uchar dimension: 3 sizes: 3 640 480 encoding: raw P6 # my first nrrd 640 480 255 The field specifiers described so far, and illustrated above, are the only ones which are always necessary. However, other field specifications become necessary as a function of other fields: if the type was "float", and the encoding is something other than "ascii", then the endian of the data would have to be recorded. The details of which fields require which other fields are spelled out in Section 5 and Section 6. 1.5 Optional information and its representationOne of the things that makes reading a NRRD header complicated is that the absence of optional information is just as important to record as the information itself when it is present. That is, whatever data structure is created or updated as the result of reading a NRRD header must enable the writer to write the exact same NRRD header, up to field specification ordering. If values for an optional field were never specified by an input NRRD header, then specific values for this field must not be invented in an output NRRD header, and the field should not be saved at all. Keeping headers as concise as possible makes them easier to understand when read by humans, and eliminates the risk that misleading information is invented solely for the sake of conforming to a file format.In order to implement this in NRRD, each of the optional fields must have a way of representing the idea of "don't know"-- a state distinct from knowing a specific default value, or knowing the value specified in the header. All of optional fields can be initialized to "don't know", and only after "known" values are specified in an input header does the field become worthy of being saved in an output NRRD header. For optional fields with string values (content, labels, and units), the empty string ("") is the obvious choice for "don't know". For centers and kinds, the strings "???" and "none" (and the value used to represent them) means "don't know". In contrast, the optional fields with integer values (line skip and byte skip) actually have sensible a sensible "known" default value, namely zero. But how does one represent "don't know" with optional floating point data? NRRD uses NaN, or Not-a-Number. NaN is a value that can be represented in the ubiquitous IEEE 754 floating point standard, as the result of doing undefined arithmetic operations, such as zero divided by zero. While it may seem overly cute or clever to use NaN as a flag for "don't know", this is in fact exactly in keeping with the purpose of NaN as described in the original 754 standard. Furthermore, as described in the documentation for the air library, it is possible to generate a NaN at compile time (so that it doesn't have to be produced as a result of doing an undefined arithmetic operation), and it is possible to quickly test if a given number is NaN. Even if operations involving NaN are not implemented in the floating point hardware, but in software emulation supplied by the operating system, they will never be the bottle neck in reading and writing a NRRD file. Because of NaN's important role as signifier of "don't know", the NRRD reader must be able to interpret the case-insensitive string "nan" as a NaN, even if this is not already the behavior of sscanf() on a given platform (it probably isn't). Writing a small wrapper function around sscanf() is a very small price to pay for the representational convenience of NaN. Section 2 gives the details for how NRRD readers and writers should handle ASCII encoding of the the IEEE 754 special values. 1.6 Miscellaneous infoIn Section 5 and Section 6, when a field specification is described as harmless, it means that the field specification probably shouldn't be in the header, because its information is either irrelevant or meaningless in that context. However, for the sake of implementation simplicity, its presence shouldn't count as an error. The information in harmless field specifications must be ignored by NRRD readers, but it is okay to complain if the field specification is malformed and unparsable.In Section 5 and Section 6, numeric field specification descriptions include a "Type", which identifies the minimum precision with which the information must be represented by the NRRD reader. In this context, "int" means a 32-bit signed integer, and "double" means a 64-bit floating point number. Field specifications with alternate equivalent forms are listed together (for example, "block size" is the same as blocksize"). Equivalent field descriptors are listed together in the table enumerating the meanings of the various descriptors (for example, "uchar is the same as "unsigned char"). Note that quotes are used to delimit the field descriptors in the explanation of their meaning; quotes are not part of the descriptor itself (except for the labels and various units specifications, in which the descriptors (strings) are delimited by quotes). 2. ASCII Encoding of Floating Point ValuesFloat point field descriptors in the NRRD header, and floating point data written with ASCII encoding, have some special rules regarding their interpretation. Unfortunately, these are not always consistent with how the sscanf() C library call works on all platforms. The rules described here are a minimalist way of making sure that the basic IEEE 754 special values can be reliably read and written in NRRD files, across a variety of platforms. The rules described here apply to both 32-bit and 64-bit floating point values, or rather, the strings which are used to represent these values:
The difference between a quiet and signaling NaN is a detail of IEEE 754 which was left implementation-specific, so different platforms have different ways of distinguishing between quiet and signaling NaN, and some don't distinguishing between them at all. The intent was that quiet NaNs represent an indeterminate value, as in 0/0, or inf/inf, meaning simply that arithmetic doesn't define a single value for the result. On the other hand, signaling NaNs represent an invalid value, to signal that a non-existent or uninitialized floating point value was accessed, or that the input parameters to a function were so botched that no valid output can be generated; the signaling NaN is supposed to signal "someone goofed". Based on the fact that different portions of 754 can be implemented in software, or hardware, or a combination of the two, there may be performance considerations between the two kinds of NaNs. But in any case, its basically all moot, since unfortunately, there is no cross-platform standard API for the floating point exception handlers which can interact with signaling NaNs. Given this, in the NRRD file format (and in the nrrd library), a NaN is a NaN is a NaN, with no difference between signaling and quiet, and no recognition of the integer value in the mantissa field of the NaN. If the signaling/quiet distinction mattered, then when writing raw floating point data, not only would endianness have to be recorded, but also the convention for representing quiet NaN, and if the data came from a platform that knows the difference between the two NaNs. Readers would have to possibly traverse the whole array after input to detect and switch NaN representations. Doing this checking is not practical or efficient, and the consequences of not doing it are either moot or non-existent. Thankfully, there are unique and fully specified bit patterns for positive and negative infinity. NRRD writers should verify that their printf() function behaves in accordance with these rules. 3. Detached Headers with "data file:"Detached headers allow data stored in one or more separate files, with or without headers of another format, to be accessed by a NRRD header, leaving the original file(s) intact. The line skip and byte skip fields are especially useful for these cases. Detached headers are also very useful in situations where very large amounts of data are to be read or written with direct IO, a very fast method of IO in which the device driver transfers data directly between blocks on disk and user-space memory (nrrd currently supports direct IO on SGIs, via the air library). Direct IO requires special alignment between the data segment beginning and the block boundaries on disk, which makes using attached headers rather inconvenient. Detached headers are also the simplest way to deal with NRRD readers which do not support the optional compressed encodings, since a stand-alone program (such as "gzip/gunzip" or "bzip2/bunzip2") will be able to process the separate data file.There is one new field specification which is required in detached headers, using the "data file" identifier. This field specification can take one of three possible forms (the second and third were copied from the MetaImage format). "datafile:" is also valid as the field identifier. The addition of this field is the only difference between attached headers and detached headers. The magic at the beginning of the header is the same, so there is currently no way to immediately detect if the header being parsed is attached or detached. Detached headers may end with the the last field specification, or with a single blank line following the last field specification (in which case anything following the blank line is ignored). The meaning of the three forms of the field descriptor are:
When there are multiple separate datafiles (second and third forms above), the amount of data in each file has to be determined somehow. By default, each file is assumed to contain one slice along the slowest axis of the nrrd. That is, the dimension of data in each file is D-1, where D is the dimension of the full array (given by the dimension: field). A different datafile dimension (besides D-1) can be communicated with the optional <subdim> value. This value can be between 1 and D. When <subdim> is less than D-1 (for example, giving a 4-D volume one 2-D slice at a time), the number of data files can be determined by the product of one or more of the slowest axes. When <subdim> is equal to D, the data is assumed to be a set of equal-sized "slabs", based on cuts along the slowest axis. The number of "slabs" must divide into the number of samples along the slowest axis. Breaking the dataset into a header and one or more data files raises a new concerns, namely that the header file can't know if the data file has been erased, renamed, or moved. NRRD provides no means to overcome these problems once they've been created. On the other hand, moving the header and data files together to a new place is a common operation, and is supported by the special semantics associated with the data filename:
When using multiple data files, the line skip and byte skip fields describe how to data is to be accessed in all the files (separately): within every file, line skipping and byte skipping is used to get at the data. The encoding and endian fields must hold for all data files. 4. Space and Orientation InformationStaring with NRRD0004 files, NRRD headers can describe the orientation of the raster grid relative to some surrounding "space". This allows, for example, a NRRD file to specify how an MRI-scanned volume is oriented relative to a patient-based coordinate system such as right-anterior-superior. The orientation information in a NRRD header defines a relationship between the individual axes of the array, and the basis vectors of the (single) space in which the array conceptually lives. Because orientation information is a self-contained topic, all the related field specifications are described here, instead of in Section 5 and Section 6. For the purposes of this section, this surrounding space in which the image grid is logically oriented will be called "world space" (nonwithstanding the fact that application-specific uses of NRRDs may use "world space" to refer to yet another reference frame).Orientation information is defined by a combination of basic and per-axis field specifications, which accomplish four things:
When orientation information is defined by the fields below, some of the other per-axis fields may not be used. Specifically, on a per-axis basis, there is mutual exclusion between setting a space direction and using a non-NaN value for "mins", "maxs", and "spacings", or a non-empty string in the "units" field. However, there is no such exclusion with the "thicknesses", "centers", "kinds", and "labels" fields. space: <space>
It is important to recognize that the identification of the space (or
rather its basis vectors) is not the same as identifying which
axes of the array are aligned with which space basis vectors. That
is, in some other formats, "RAS" implies something about axis
ordering: that the coordinates along the fastest axis increase in the
left to right direction, and that the second and third array
coordinates increase along the anterior and superior directions. The
NRRD format, in contrast, is careful to separate the issue of axis
ordering from the task of identifying the space in which the array is
oriented. It is possible to reorder the axes, and/or the samples
along an axis, while keeping the spatial locations of the samples
unchanged. None of this changes the identification (such as
right-anterior-superior) of the world space.
space dimension: <int>
space units: "<unit[0]>" "<unit[1]>" ... "<unit[dim-1]>"
space units: "mm" "mm" "mm" space origin: <vector>
The format of the <vector> is as follows. The vector is delimited by "(" and ")", and the individual components are comma-separated. This is an example of a three-dimensional origin specification: space origin: (0.0,1.0,0.3) space directions: <vector[0]> <vector[1]> ... <vector[dim-1]>
space directions: none (1,0,0) (0,1,0) (0,0,3) measurement frame: <vector[0]> <vector[1]> ... <vector[spaceDim-1]>
The measurement frame is a basic (per-array) field specification (not per-axis), which identifies a spaceDim-by-spaceDim matrix, where spaceDim is the dimension of world space (implied by space or given by space dimension). The matrix transforms (a column vector of) coordinates in the measurement frame to coordinates in world space. vector[i] gives column i of the measurement frame matrix. Just as the space directions field gives, one column at a time, the mapping from image space to world space coordinates, the measurement frame gives the mapping measurement frame to world space coordinates, also one column at a time. Somewhat confusingly and unfortunately, there are currently no semantics defined which relate the measurement frame matrix to any kind of any axis (such as "3-vector"), even though it might seem natural. That is, the matrix defined by the "measurement frame" field is always a square matrix whose size is entirely determined by the dimension of world space, even if an axis identifies its kind as something which would seem to call for a different number of measurement frame dimensions. This is unfortunately not a clear-cut issue. Basically, it is not the job of the "measurement frame" field to reconcile an illogical combination of world space dimension and per-axis kind, which can arise with or without a measurement frame being defined. Even when there is no inconsistency, there is no graceful way of identifying which axes' kinds have coordinates which may be mapped to world space, since this is logically per-axis information, but having a per-axis measurement frame is certainly overkill. There is also the possibility that a measurement frame should be recorded for an image even though it is storing only scalar values (e.g., a sequence of diffusion-weighted MR images has a measurement frame for the coefficients of the diffusion-sensitizing gradient directions, and the measurement frame field is the logical store this information). Experience and time may clarify this situation. 5. Basic Field Specificationsdimension: <int>
type: <type>
The type descriptors used are valid type declarations in C, C99, Matlab, Microsoft-land, or some other program. Notice that "char" is not a NRRD type descriptor, to avoid potential confusion associated with the inherent signed/unsigned ambiguity of the "char" C type. If the platform has different C type names for the types described, there will have to be a disconnect between the type implied by the type descriptor, and the actual types used. In other words, the NRRD format requires a binding between the first two columns in the chart above. The third column is just what the current nrrd implementation uses on most supported platforms; this has proven surprisingly portable. In Windows, however, there is no "long long", so "__int64" is used instead. As currently defined, NRRD is simply not portable to platforms on which all the types described above (second column) are not available via some C type declaration or another. We will eventually have many computers in which the minimum addressable unit is larger than 8 bits, in which case NRRD will either have to be expanded to allow types with unaddressable values (in which case a bit type might as well be added), or, some rules will have to be defined for converting a smaller type into an addressable type during data read. Having individually addressable data samples vastly simplifies the task of implementing array operations. The block type is unlike the others. It is included for completeness in representation of the types available in the nrrd library, which uses this type to represent C structs or C++ objects: opaque chunks of memory that can be copied and permuted, but not interpreted as (or generated from) scalar values. The size of that chunk is given in the block size field specification. But block is not safe as a cross-platform general purpose type. Here are the special considerations:
block size: <int> blocksize: <int>
encoding: <encoding>
The formatting for hex is mostly the same as the ASCIIHexDecode and the ASCIIHexEncode filters of PostScript, but they are not identical: PostScript allows multiple filters (data can be run-length encoded as well as hex-encoded), null ('\0') characters count as whitespace, and the end of the data is explicitly indicated by a ">". However, in combination with the line skip specifier, it is usually possible to extract 8-bit image data from PostScript files, assuming you understand enough PostScript to determine the image dimensions. The "standard detached suffix" is the filename suffix that should be used by NRRD writers producing a separate data file in conjunction with a detached header. This is most important for the compression encodings, since the stand-alone programs expect certain suffixes when decompressing (".gz" and ".bz2" for gzip and bzip2, respectively), and these suffixes are stripped after decompression. Because the result of decompressing compressed data from a NRRD is always raw (as opposed to compressed ascii text), the suffix for the detached file includes ".raw" as well. NRRD readers, however, should not care about the filename suffix of a detached data file. Data file contents remaining after all data has been read should be ignored. This sanctions the strategy of using a detached nrrd header to refer to some smaller chunk of data in a separate larger data file. Data before the region of interest can be passed over with line skip and/or byte skip. See the byte skip specification for information about how compression encoding changes its meaning.
There is complete orthogonality between the encoding of the data, and
whether the header is attached or detached. The header is never
compressed- it is necessarily straight ASCII text.
endian: <endian>
The convention with NRRD files is that non-ascii data
should reflect the byte ordering of the current platform.
There is no preference for one endian or the other in NRRD files, and
NRRD writers should never have to worry about fixing endianness, only
recording it when necessary. Fixing endianness is the responsibility
of the NRRD reader. This way, NRRD readers and writers used within
one platform never pay the overhead of fixing endianness. That
overhead should only be incurred when going between platforms with
different endiannesses.
content: <string>
This field is intended as the place to store a very concise textual description of the information in the array, similar to the second line (the "title") of a VTK file format header. The nrrd library, for instance, uses content to store a textual representation of a summary of the operations applied to a nrrd. If nrrdSlice() slices a nrrd with content "engine" along axis 0 at position 50, then the content of the result will be "slice(engine,0,50)". min: <min>
max: <max>
old min: <min> oldmin: <min>
For example, if a floating point nrrd with values ranging from 0.0 to 1.0 is quantized to 8 bits, old min will be 0.0. This is not the middle of the range of values that were all mapped to the lowest output integer, but the lowest of those values.
Infinite values are not valid, "nan" means "don't know".
old max: <max> oldmax: <max>
data file: <filename> datafile: <filename> data file: <format> <min> <max> <step> [<subdim>] datafile: <format> <min> <max> <step> [<subdim>] data file: LIST [<subdim>] datafile: LIST [<subdim>]
This is always optional, but it is the only means of distinguishing
from an attached or a detached NRRD header. When it is present, it is
interpreted according to Section 3, and the
header is considered finished at the EOF, or at the blank line
following the last field, whichever comes first. If the header ends
with a blank line, any data after the blank line is ignored. If
this field is not present, the data is assumed to be in the same file
as the header, following the blank line marking the end of the header.
line skip: <skip> lineskip: <skip>
When used in combination with byte skip, the line skipping is done before the byte skipping. The meaning of line skip is not affected by the encoding field. byte skip: <skip> byteskip: <skip>
If skip is greater than or equal to zero, it tells how many bytes to skip in a data file in order to get to the beginning of the data. By definition, the bytes are skipped according to the action of fgetc(). When used in combination with line skip, the byte skipping is done after the line skipping. When this field does not appear, skip is taken to be zero. As an idiom copied from the MetaImage file format, the value of skip can be -1. This is valid only with raw encoding. The action of this byte skip is to fseek() backwards from the end of the data file, to the beginning of the data. The distance to seek is calculated from the nrrd type and axis sizes. This is a useful trick for getting at binary data in other formats with unknown (or variable) length binary headers, such as DICOM, TIFF, and BMP, but only if the data is uncompressed, and only if the end of the data is contiguous with the end of the file (which can fail to be the case in DICOM and TIFF). If skip is -1, the action of lineskip is entirely moot. The interpretation of byte skip changes according to whether or not the encoding used is a form of compression or not. The only compressions currently supported gzip and bzip2. In uncompressed encodings, the byte skipping is done just like the line skipping: within the data file, so as to locate the beginning of the data, and prior to the decoding of any data. In compressed encodings however, the line skipping is done first, and then the decompression begins. The byte skipping is done within the stream of decompressed data.
The reason for skipping bytes but not lines in the decompressed stream
is basically motivated by the conceptual difference between ASCII and
binary headers. One reason to write headers in ASCII is to make them
human readable, so they probably shouldn't be compressed to begin
with. Also, ASCII headers (such as in PNM images) often allow
multiple lines of optional comments, so the number of lines to skip
has to be determined on a per-file basis by looking at the
(uncompressed) file, at which point the data might as well be written
out as a NRRD file. In contrast, binary headers are very often fixed
length, and not human readable, which means that when the header and
data are compressed together, the beginning of the data can be easily
found via a byte skip offset. This also applies to large
datasets written by FORTRAN programs, for which even "raw" data can be
proceeded by a four-byte representation of the data length.
number: <string>
sample units: <string> sampleunits: <string>
6. Per-axis Field SpecificationsThe individual field descriptors in these specifications are delimited by one or more spaces (" ") or tabs ("\t"), or some combination of the two, but no other kinds of white space delimiters are valid.sizes: <size[0]> <size[1]> ... <size[dim-1]>
spacings: <space[0]> <space[1]> ... <space[dim-1]>
Because there must be one spacing for each axis, spacings must be given for axes which don't logically have a spatial component, such as the RGB axis of color image data, which is usually axis 0. Rather than invent a value (such as 1.0) for sample spacing where no value is sensible, a spacing value of "nan" should be used instead. In addition, "nan" can represent the fact that spacing information would be sensible here, but simply isn't known. Of course, if spacings are NaN for every axis, the field probably shouldn't be in the header.
The meaning and interpretation of the spacings field is
independent of the centers, axis min and axis
maxs fields, even though mutually incompatible settings are
possible.
thicknesses: <thickness[0]> <thickness[1]> ... <thickness[dim-1]>
It is likely the case that only one axis has this information
associated with it, so "nan" should be used for the other
axes. The compelling reason to make thickness a per-axis field is
that if it were a basic field, one would need a separate basic field
to identify which axis is the slice axis, and this information would
become invalid if the array axes were permuted.
axis mins: <min[0]> <min[1]> ... <min[dim-1]> axismins: <min[0]> <min[1]> ... <min[dim-1]>
Infinite values are not valid as axis mins. Any non-infinite values, including zero, are valid. As with spacings information, the use of "nan" as an axis min value is probably preferable to inventing one where no value is meaningful or known.
Presence of the axis mins field does not require presence of
the axis maxs field, although it is often useful for these to
appear together. However, using the axis mins field alone
can emulate the ORIGIN field of the VTK file format header.
axis maxs: <max[0]> <max[1]> ... <max[dim-1]> axismaxs: <max[0]> <max[1]> ... <max[dim-1]>
The settings of axis mins and axis maxs would seem
to imply a value for spacings, but this also depends on the
values of centers. Mutually incompatible settings of
these fields are possible to save in a NRRD header, but is not the job
of the NRRD reader to ensure their consistency, only to check that the
individual values in isolation are sensible (for instance, an axis max
can't be infinite).
centers: <center[0]> <center[1]> ... <center[dim-1]> centerings: <center[0]> <center[1]> ... <center[dim-1]>
labels: "<label[0]>" "<label[1]>" ... "<label[dim-1]>"
As shown above, each label is delimited by double quotes. Within each label, double quotes may be included by escaping them (\"), but no other form of escaping is supported. For axes with no labels, use a quoted empty string ("").
There is no fixed limit on how long the line containing the
labels field can be.
units: "<unit[0]>" "<unit[1]>" ... "<unit[dim-1]>"
kinds: <kind[0]> <kind[1]> ... <kind[dim-1]>
The possible values for <kind[i]> are as follows. Prior to the release of Teem 1.9, numerous kinds were added; future kinds will be added with more skepticism and caution. As NRRD files can represent sampled functions, images, fields, or maps of various types, one has to keep in mind the difference between axes which represent the domain of the function (or, an independent variable), versus the range of the function (or, a dependent variable). The only kinds below which represent a domain are:
One issue that arises with storing vectors and matrices in NRRDs is identifying the coordinate frame in which their coefficients are measured. The way to do this in NRRD is to use the measurement frame field in combination with the space directions field: which relates the measurement frame to the orientation of image as a whole. Without the measurement frame, field there are no semantics for what the measurement frame is (it isn't safe to assume that its the same as the image orientation). |