Format of Gwyddion Files

A Gwyddion native data file (GWY) consists of a tree-like structure of serialized objects. Generally, these objects can be of various kinds and can contain other embedded objects (hence the tree-like structure). It can be instructive to play for a while with gwydump, a simple file structure visualizer available on the project's website, and examine the contents of various files. If you plan to read and/or write GWY files in independent software, also have a look at libgwyfile, a small standalone embeddable library for GWY file handling.

The Two Layers

This file format description specifies in fact two different things that are sometimes useful to distinguish:

  • The physical structuring of the files: byte order, data kinds and their representation, how the sizes of various bits of data are determined, how data objects are nested to form the tree, etc. It is called the generic GWY file format in libgwyfile. In principle it can be used for something completely different from SPM data by some completely unrelated software – and it would still be called the GWY file format.
  • The representation of particular SPM data by Gwyddion, specifying further conventions and interpretation on top of the physical structure. For instance, we describe that images are stored as objects called GwyDataField that have resolutions stored this way, image data stored that way, etc. Libgwyfile calls it the Gwyddion GWY file format.

It should usually be quite clear where we talk about the generic format and where about specific Gwyddion data objects (occasionally it will be noted explicitly). We will start with the physical file structure.

Byte Order

All data are stored in little-endian (also known as LSB or Intel) byte order.

File Header

The file header consists of four bytes (magic number) with the values of ASCII characters GWYP.

This is the new file format. An older version of the file format, with the magic header GWYO, also exists. It will not be discussed here as it is mostly extinct nowadays.
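
As an illustration, the header check might be sketched in Python as follows. The function name check_magic and its return values are made up for this example; only the two magic strings come from the description above.

```python
import io

GWY_MAGIC = b"GWYP"      # current file format
GWY_MAGIC_OLD = b"GWYO"  # older, mostly extinct file format

def check_magic(stream):
    """Read the four header bytes and report which format they announce."""
    magic = stream.read(4)
    if magic == GWY_MAGIC:
        return "current"
    if magic == GWY_MAGIC_OLD:
        return "old"
    raise ValueError(f"not a GWY file: header {magic!r}")

# The serialized top-level object follows immediately after the header.
print(check_magic(io.BytesIO(b"GWYPGwyContainer\0")))
```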

File Data

The rest of a Gwyddion file consists of a serialized GwyContainer object that contains all the data. This top-level object is stored exactly the same way as any other object, that is as described in the next section.

A generic GWY file can have some other object as the top-level object.

Object Layout

An object consists of three parts (in the following order):

  1. Type name, stored as a NUL-terminated string of ASCII characters. In a Gwyddion file this is the type name in GObject type system. Generally the name should be a valid C identifier.
  2. Serialized data size, stored as an unsigned 32bit integer. It includes neither the size of the type name nor the size of the size field itself.
  3. Component list. Components are named parts of object data, each of particular data type: an atomic type, an array of atomic types, or again an object. They are stored in no particular order.
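
The first two parts of an object can be read as in the following Python sketch. The helper names read_cstring and read_object_header are illustrative only. The example bytes encode a GwySIUnit with a single string component, so the stored size covers exactly the bytes that follow the size field.

```python
import io
import struct

def read_cstring(stream, encoding="ascii"):
    """Read bytes up to (and consuming) the terminating NUL."""
    raw = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("string is not NUL-terminated")
        if byte == b"\0":
            return raw.decode(encoding)
        raw += byte

def read_object_header(stream):
    """Return (type_name, data_size); the stream is left at the first component."""
    type_name = read_cstring(stream)
    size_bytes = stream.read(4)
    if len(size_bytes) != 4:
        raise EOFError("truncated object size")
    (data_size,) = struct.unpack("<I", size_bytes)  # unsigned 32bit, little-endian
    return type_name, data_size

# A GwySIUnit holding the single string component unitstr = "m".
components = b"unitstr\0" + b"s" + b"m\0"
blob = b"GwySIUnit\0" + struct.pack("<I", len(components)) + components

stream = io.BytesIO(blob)
type_name, data_size = read_object_header(stream)
remaining = len(blob) - stream.tell()
print(type_name, data_size, remaining)  # GwySIUnit 11 11
```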

Components

Each component consists of three parts (in the following order):

  1. Name, stored as a NUL-terminated UTF-8 encoded string.
  2. Type, stored as a single unsigned byte (character). The table of possible component types is presented below.
  3. Data, stored as whatever is appropriate for a particular type.

Components are often called items in libgwyfile and Gwyddion libraries.

Skipping an unknown (or uninteresting) component while reading a GWY file can be done by first reading the component name and type. Then there are several possibilities:

  • If the type is atomic and fixed-size, the number of bytes to skip is given by the type table.
  • For strings, you have to skip past the terminating NUL character.
  • If the type is an object, you need to read its type name and size, and the size then tells you exactly how many bytes to skip further.
  • If the type is an array of simple atomic types, read the array length and multiply the atomic type size by the length. This gives the number of bytes to skip.
  • For an array of strings, read the array length and repeat string skipping (move past NUL) according to the array length.
  • Finally, the most complex scenario is an array of objects, where you need to read the array length and then repeat the object skipping the specified number of times.
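
The skipping rules above might be implemented along these lines. This is a minimal Python sketch that assumes a seekable stream and does no bounds checking against the enclosing object size; all function and table names are invented for the example, and the type characters and sizes are those of the type tables in the Data Types section.

```python
import io
import struct

# Sizes in bytes of fixed-size atomic types and of fixed-size array items.
ATOMIC_SIZES = {"b": 1, "c": 1, "i": 4, "q": 8, "d": 8}
ARRAY_ITEM_SIZES = {"C": 1, "I": 4, "Q": 8, "D": 8}

def _read_u32(stream):
    raw = stream.read(4)
    if len(raw) != 4:
        raise EOFError("truncated 32bit value")
    return struct.unpack("<I", raw)[0]

def _read_cstring(stream):
    raw = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("string is not NUL-terminated")
        if byte == b"\0":
            return raw.decode("utf-8")
        raw += byte

def _skip_object(stream):
    _read_cstring(stream)                        # type name
    stream.seek(_read_u32(stream), io.SEEK_CUR)  # size covers all components

def skip_component(stream):
    """Skip one component and return its (name, type character)."""
    name = _read_cstring(stream)
    ctype = stream.read(1).decode("latin-1")
    if ctype in ATOMIC_SIZES:
        stream.seek(ATOMIC_SIZES[ctype], io.SEEK_CUR)
    elif ctype == "s":
        _read_cstring(stream)
    elif ctype == "o":
        _skip_object(stream)
    elif ctype in ARRAY_ITEM_SIZES:
        stream.seek(_read_u32(stream) * ARRAY_ITEM_SIZES[ctype], io.SEEK_CUR)
    elif ctype == "S":
        for _ in range(_read_u32(stream)):
            _read_cstring(stream)
    elif ctype == "O":
        for _ in range(_read_u32(stream)):
            _skip_object(stream)
    else:
        raise ValueError(f"unknown component type {ctype!r}")
    return name, ctype
```

Calling skip_component() repeatedly until the position reaches the end of the enclosing object lists the names and types of all its components without interpreting any of the data.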

Data Types

Available atomic data types are listed in the following table:

Type | Character | Size | Note
boolean | b | 1 byte | Zero is false, nonzero (normally 1) is true.
character | c | 1 byte |
32bit integer | i | 4 bytes |
64bit integer | q | 8 bytes |
double | d | 8 bytes | Finite IEEE 754 double precision floating point number, i.e. files must not contain infinities and not-a-numbers.
string | s | variable | NUL-terminated and UTF-8 encoded.
object | o | variable | Nested serialized object as described above.

Each atomic type except boolean has its array counterpart. The type character of an array type is the same as that of the corresponding atomic type, except that it is uppercase. An array is stored as an unsigned 32bit integer representing the number of items, followed by the item values. The number of items must be positive; empty arrays are not stored. Array data types are listed in the following table:

Type | Character | Note
array of characters | C | In general neither NUL-terminated nor UTF-8 encoded (s is used for text strings).
array of 32bit integers | I |
array of 64bit integers | Q |
array of doubles | D |
array of strings | S |
array of objects | O | Uppercase Oh, not zero.
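
For writing, the component and object layouts combine as in the following Python sketch. The names pack_string, pack_component and pack_object are hypothetical, character types (c and C) are left out for brevity, and treating i and q as signed integers is an assumption that the tables above do not state.

```python
import math
import struct

# struct codes of the numeric types; signedness of "i" and "q" is an assumption.
NUMERIC_CODES = {"i": "i", "q": "q", "d": "d"}

def pack_string(text):
    return text.encode("utf-8") + b"\0"

def _check_finite(values):
    if not all(math.isfinite(v) for v in values):
        raise ValueError("doubles must be finite")

def pack_component(name, ctype, value):
    """Serialize one component: name, type character, data."""
    head = pack_string(name) + ctype.encode("ascii")
    if ctype == "b":
        return head + (b"\x01" if value else b"\x00")
    if ctype in NUMERIC_CODES:
        if ctype == "d":
            _check_finite([value])
        return head + struct.pack("<" + NUMERIC_CODES[ctype], value)
    if ctype == "s":
        return head + pack_string(value)
    if ctype == "o":
        return head + value  # bytes produced by pack_object()
    if ctype in ("I", "Q", "D", "S", "O"):
        if len(value) == 0:
            raise ValueError("empty arrays are not stored")
        count = struct.pack("<I", len(value))
        if ctype == "S":
            return head + count + b"".join(pack_string(s) for s in value)
        if ctype == "O":
            return head + count + b"".join(value)
        if ctype == "D":
            _check_finite(value)
        code = NUMERIC_CODES[ctype.lower()]
        return head + count + struct.pack(f"<{len(value)}{code}", *value)
    raise ValueError(f"unsupported component type {ctype!r}")

def pack_object(type_name, components):
    """Serialize an object from already packed components."""
    payload = b"".join(components)
    return (type_name.encode("ascii") + b"\0"
            + struct.pack("<I", len(payload)) + payload)

unit = pack_object("GwySIUnit", [pack_component("unitstr", "s", "m")])
line = pack_object("GwyDataLine", [
    pack_component("res", "i", 2),
    pack_component("real", "d", 1e-6),
    pack_component("si_unit_x", "o", unit),
    pack_component("data", "D", [0.0, 1.5]),
])
```

Because the size field counts only the packed components, nested objects can be serialized bottom-up: the innermost objects first, then their parents.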

Top-Level GwyContainer

GwyContainer is a general dictionary-like data object that can hold arbitrary components of arbitrary types. This permits incorporating unforeseen data structures into GWY files in a relatively sane manner.

The names (keys) of data objects in a GwyContainer representing a Gwyddion GWY file strongly resemble UNIX file names, i.e. they have the form of /-separated paths and form a sort of tree-like structure. For instance, the title of the first image, numbered 0, is stored under the key /0/data/title. Note that some data or information is found under keys that may not seem logical; the reason is usually historical.

The following sections describe the organisation of interesting data and information in the top-level GwyContainer. The list is not necessarily complete. However, since all data items in the file consistently specify their name, type and size in bytes, it is always possible to skip unknown data types or data you are not interested in and extract only the desired data items.

Images

The following table summarises the common keys of image-related data in the top-level container for image number 0. For other images, the number 0 has to be replaced with the corresponding image number. Note that images are often numbered sequentially, starting from 0. However, they can have any numbers, and the set of image numbers does not have to be contiguous.

Key | Type | Meaning
/0/data | GwyDataField | Channel data.
/0/data/title | string | Channel title, as shown in the data browser.
/0/data/visible | boolean | Whether the image should be displayed in a window when the file is loaded.
/0/data/realsquare | boolean | Whether the image should be displayed as Physically square (as opposed to Pixelwise square).
/0/base/palette | string | Name of the false color gradient used to display the image.
/0/base/range-type | 32bit integer | False color mapping type (as set by the Color range tool), the value is from GwyLayerBasicRangeType enum.
/0/base/min | double | Minimum value for user-set display range.
/0/base/max | double | Maximum value for user-set display range.
/0/mask | GwyDataField | Mask data. The pixel dimensions of this data field must match those of the image data.
/0/mask/red | double | Red component of the mask color.
/0/mask/green | double | Green component of the mask color.
/0/mask/blue | double | Blue component of the mask color.
/0/mask/alpha | double | Alpha (opacity) component of the mask color.
/0/show | GwyDataField | Presentation data. The pixel dimensions of this data field must match those of the image data.
/0/meta | GwyContainer | Channel metadata. The keys are directly the names as displayed in the metadata browser and the string values are the values.
/0/data/log | GwyStringList | Channel log as a list of string log entries. They have the format type::function(param=value, …)@time.
/0/select/foo | a GwySelection subclass | Selection data. Each kind of selection has (usually) a different object type and is stored under a different name; the specific name foo is the same as shown in the selection manager.

Channels are represented as GwyDataField objects. The components of a GwyDataField are summarised in the following table:

Component | Type | Meaning
xres | 32bit integer | Horizontal size in pixels.
yres | 32bit integer | Vertical size in pixels.
xreal | double | Horizontal dimension in physical units.
yreal | double | Vertical dimension in physical units.
xoff | double | Horizontal offset of the top-left corner in physical units. It usually occurs only if non-zero.
yoff | double | Vertical offset of the top-left corner in physical units. It usually occurs only if non-zero.
si_unit_xy | GwySIUnit | Unit of lateral dimensions.
si_unit_z | GwySIUnit | Unit of data values.
data | array of doubles | Field data, stored as a flat array of size xres×yres, from top to bottom and from left to right.
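
The flat data array can be turned back into image rows as in the following Python sketch. It reads the stated order as row-major, i.e. the value at column col and row row sits at index row×xres + col; the function names are illustrative only.

```python
def field_rows(xres, yres, data):
    """Split the flat data array into yres rows of xres values (top row first)."""
    if len(data) != xres * yres:
        raise ValueError("data length does not match xres*yres")
    return [list(data[row * xres:(row + 1) * xres]) for row in range(yres)]

def pixel_size(xres, yres, xreal, yreal):
    """Physical size of one pixel, in the units given by si_unit_xy."""
    return xreal / xres, yreal / yres

rows = field_rows(3, 2, [0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
print(rows)                         # [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]]
print(pixel_size(4, 2, 2.0, 1.0))   # (0.5, 0.5)
```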

Graphs

The following table summarises the common keys of graph-related data in the top-level container for graph number 1. For other graphs, the number 1 has to be replaced with the corresponding graph number. Note that graphs are often numbered sequentially, starting from 1. However, they can have any positive numbers, and the set of graph numbers does not have to be contiguous. The number 0 in the prefix of graph keys is a historical relic that does not mean anything; it is always 0.

Key | Type | Meaning
/0/graph/graph/1 | GwyGraphModel | Graph model object data.
/0/graph/graph/1/visible | boolean | Whether the graph should be displayed in a window when the file is loaded.

Graphs are represented as GwyGraphModel objects. The components of a GwyGraphModel are summarised in the following table:

Component | Type | Meaning
curves | array of GwyGraphCurveModels | Individual graph curves.
title | string | Graph title as displayed in the data browser.
x_unit | GwySIUnit | Unit of the abscissa.
y_unit | GwySIUnit | Unit of the ordinate.
top_label | string | Label on the top axis.
bottom_label | string | Label on the bottom axis.
left_label | string | Label on the left axis.
right_label | string | Label on the right axis.
x_is_logarithmic | boolean | Whether the abscissa has a logarithmic scale.
y_is_logarithmic | boolean | Whether the ordinate has a logarithmic scale.
x_min | double | User-set minimum value of the abscissa.
x_min_set | boolean | Whether user-set minimum value of the abscissa should be used (otherwise the range is determined automatically).
x_max | double | User-set maximum value of the abscissa.
x_max_set | boolean | Whether user-set maximum value of the abscissa should be used (otherwise the range is determined automatically).
y_min | double | User-set minimum value of the ordinate.
y_min_set | boolean | Whether user-set minimum value of the ordinate should be used (otherwise the range is determined automatically).
y_max | double | User-set maximum value of the ordinate.
y_max_set | boolean | Whether user-set maximum value of the ordinate should be used (otherwise the range is determined automatically).
grid-type | 32bit integer | Type of grid shown. The value is from GwyGraphGridType enum.
label.has_frame | boolean | Whether the graph key has a frame.
label.frame_thickness | 32bit integer | Width of graph key frame.
label.reverse | boolean | Whether to reverse the graph key.
label.visible | boolean | Whether the graph key is visible.
label.position | 32bit integer | The position (corner) where the graph key is placed. The value is from GwyGraphLabelPosition enum.

Graph curves are represented as GwyGraphCurveModel objects. The components of a GwyGraphCurveModel are summarised in the following table:

Component | Type | Meaning
xdata | array of doubles | Abscissa points. The number of points must match ydata.
ydata | array of doubles | Ordinate points. The number of points must match xdata.
description | string | Curve description (name).
type | 32bit integer | Curve mode (points, lines, etc.). The value is from GwyGraphCurveType enum.
color.red | double | Red component of the curve color.
color.green | double | Green component of the curve color.
color.blue | double | Blue component of the curve color.
point_type | 32bit integer | Type of symbols representing data points. The value is from GwyGraphPointType enum.
point_size | 32bit integer | Size of symbols representing data points.
line_type | 32bit integer | Type of lines connecting data points. The value is from GwyGraphLineType enum.
line_size | 32bit integer | Width of lines connecting data points.

Spectra

The following table summarises the common keys of spectra-related data in the top-level container for spectra set number 0. For other spectra, the number 0 has to be replaced with the corresponding spectra set number. Note that spectra sets are often numbered sequentially, starting from 0. However, they can have any numbers, and the set of spectra set numbers does not have to be contiguous.

Key | Type | Meaning
/sps/0 | GwySpectra | Spectra data.

Sets of spectra of one kind are represented as GwySpectra objects. The components of a GwySpectra are summarised in the following table:

Component | Type | Meaning
title | string | Spectra title as displayed in the data browser.
si_unit_xy | GwySIUnit | Unit of spectrum position coordinates.
coords | array of doubles | Coordinates of points where the spectra were taken, in physical units. Each spectrum takes two items: for the horizontal and vertical coordinate. The number of coordinates must match the number of curves in data.
data | array of GwyDataLines | Individual spectra curves.
selected | array of 32bit integers | Indices of selected spectra curves.
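
Pairing the flat coords array with the curves might look like the following Python sketch. The function name is hypothetical, and the horizontal coordinate is taken to come first in each pair, following the order in which the table mentions them.

```python
def spectra_positions(coords, ncurves):
    """Return one (x, y) position per spectrum curve."""
    if len(coords) != 2 * ncurves:
        raise ValueError("coords must hold two items per curve")
    return list(zip(coords[0::2], coords[1::2]))

positions = spectra_positions([1.0, 2.0, 3.0, 4.0], 2)
print(positions)  # [(1.0, 2.0), (3.0, 4.0)]
```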

Individual curves in spectra are represented as GwyDataLine objects. The GwyDataLine object is a one-dimensional counterpart of GwyDataField and is used also for other regular one-dimensional data. The components of a GwyDataLine are summarised in the following table:

Component | Type | Meaning
res | 32bit integer | Number of data points.
real | double | Length in physical units.
off | double | Offset of the beginning in physical units. It usually occurs only if non-zero.
si_unit_x | GwySIUnit | Unit of abscissa.
si_unit_y | GwySIUnit | Unit of data values.
data | array of doubles | Line data, stored as an array of res values, from left to right.

Volume data

The following table summarises the common keys of volume-related data in the top-level container for volume data number 0. For other volume data, the number 0 has to be replaced with the corresponding volume data number. Note that volume data are often numbered sequentially, starting from 0. However, they can have any numbers, and the set of volume data numbers does not have to be contiguous.

Key | Type | Meaning
/brick/0 | GwyBrick | Volume data.
/brick/0/preview | GwyDataField | Two-dimensional data shown when the volume data are displayed in a window.
/brick/0/title | string | Volume data title, as shown in the data browser.
/brick/0/visible | boolean | Whether the volume data should be displayed in a window when the file is loaded.
/brick/0/preview/palette | string | Name of the false color gradient used to display the preview data.
/brick/0/meta | GwyContainer | Volume data metadata. The keys are directly the names as displayed in the metadata browser and the string values are the values.
/brick/0/log | GwyStringList | Volume data log as a list of string log entries. They have the format type::function(param=value, …)@time.

Volume data are represented as GwyBrick objects. The components of a GwyBrick are summarised in the following table:

Component | Type | Meaning
xres | 32bit integer | Horizontal size in pixels.
yres | 32bit integer | Vertical size in pixels.
zres | 32bit integer | Depth (number of levels) in pixels.
xreal | double | Horizontal dimension in physical units.
yreal | double | Vertical dimension in physical units.
zreal | double | Depthwise dimension in physical units.
xoff | double | Horizontal offset of the top-left corner in physical units. It usually occurs only if non-zero.
yoff | double | Vertical offset of the top-left corner in physical units. It usually occurs only if non-zero.
zoff | double | Depthwise offset of the top-left corner in physical units. It usually occurs only if non-zero.
si_unit_x | GwySIUnit | Unit of horizontal lateral dimensions.
si_unit_y | GwySIUnit | Unit of vertical lateral dimensions.
si_unit_z | GwySIUnit | Unit of depthwise dimensions.
si_unit_w | GwySIUnit | Unit of data values.
data | array of doubles | Field data, stored as a flat array of size xres×yres×zres, from the zeroth to the last plane, top to bottom and from left to right.
calibration | GwyDataLine | Calibration of the z axis to represent non-linear sampling in this dimension. The number of points must be equal to zres. This component is present only if non-linear sampling is used.
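
The storage order of the volume data can be expressed as an index formula, sketched here in Python. It interprets the stated order as plane by plane, each plane row by row, so the value at (x, y, z) sits at index (z×yres + y)×xres + x; the function names are made up for the example.

```python
def brick_index(xres, yres, x, y, z):
    """Index of voxel (x, y, z) in the flat data array."""
    return (z * yres + y) * xres + x

def brick_plane(xres, yres, zres, data, z):
    """Extract plane z as a list of yres rows with xres values each."""
    if len(data) != xres * yres * zres:
        raise ValueError("data length does not match xres*yres*zres")
    if not 0 <= z < zres:
        raise IndexError("plane index out of range")
    n = xres * yres
    plane = data[z * n:(z + 1) * n]
    return [list(plane[row * xres:(row + 1) * xres]) for row in range(yres)]

data = list(range(12))  # xres=2, yres=3, zres=2
print(brick_index(2, 3, 1, 2, 1))     # 11
print(brick_plane(2, 3, 2, data, 1))  # [[6, 7], [8, 9], [10, 11]]
```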

XYZ data

The following table summarises the common keys of XYZ-related data in the top-level container for XYZ data number 0. For other XYZ data, the number 0 has to be replaced with the corresponding XYZ data number. Note that XYZ data are often numbered sequentially, starting from 0. However, they can have any numbers, and the set of XYZ data numbers does not have to be contiguous.

Key | Type | Meaning
/xyz/0 | GwySurface | XYZ data.
/xyz/0/preview | GwyDataField | Regularised preview shown when the XYZ data are displayed in a window. Note that although XYZ previews are stored in GWY files, they are commonly re-created and updated when the data are displayed, so it is rarely useful to add them when you are writing a GWY file.
/xyz/0/title | string | XYZ data title, as shown in the data browser.
/xyz/0/visible | boolean | Whether the XYZ data should be displayed in a window when the file is loaded.
/xyz/0/preview/palette | string | Name of the false color gradient used to display the preview data.
/xyz/0/meta | GwyContainer | XYZ data metadata. The keys are directly the names as displayed in the metadata browser and the string values are the values.
/xyz/0/log | GwyStringList | XYZ data log as a list of string log entries. They have the format type::function(param=value, …)@time.

XYZ data are represented as GwySurface objects. The components of a GwySurface are summarised in the following table:

Component | Type | Meaning
si_unit_xy | GwySIUnit | Unit of horizontal lateral dimensions.
si_unit_z | GwySIUnit | Unit of data values.
data | array of doubles | XYZ data, stored as a flat array whose size is a multiple of 3. Each XYZ triplet is stored together, leading to the following data order: x₀, y₀, z₀, x₁, y₁, z₁, x₂, y₂, z₂, etc.
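
Regrouping the flat array into points is straightforward; a minimal Python sketch (the function name is illustrative only):

```python
def surface_points(data):
    """Group the flat GwySurface data array into (x, y, z) triplets."""
    if len(data) % 3 != 0:
        raise ValueError("data length must be a multiple of 3")
    return [tuple(data[i:i + 3]) for i in range(0, len(data), 3)]

points = surface_points([0.0, 0.0, 1.5, 1.0, 0.0, 2.5])
print(points)  # [(0.0, 0.0, 1.5), (1.0, 0.0, 2.5)]
```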

Other Items

The main GwyContainer also contains items that do not pertain to any specific channel, graph or other data type.

Key | Type | Meaning
/filename | string | The name of the file the GwyContainer is currently associated with. If it was saved to a file then the item contains the corresponding file name. Otherwise it contains the name of the file from which the data were loaded or imported. It may also be unset, for instance for newly created synthetic data.
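
The key conventions of the preceding sections can be collected into a small classifier, sketched here in Python. The function name, the kind labels and the regular expressions are made up for this example; the set of image sub-keys covers only the prefixes listed in the Images table, so other keys found in real files come out as unclassified.

```python
import re

# (kind, pattern); each pattern captures the data number and the rest of the key.
_KEY_PATTERNS = [
    ("graph", re.compile(r"^/0/graph/graph/(\d+)(?:/(.*))?$")),
    ("spectra", re.compile(r"^/sps/(\d+)(?:/(.*))?$")),
    ("volume", re.compile(r"^/brick/(\d+)(?:/(.*))?$")),
    ("xyz", re.compile(r"^/xyz/(\d+)(?:/(.*))?$")),
    ("image", re.compile(
        r"^/(\d+)/((?:data|base|mask|show|meta|select)(?:/.*)?)$")),
]

def classify_key(key):
    """Return (kind, number, subkey) for a container key, or None if unknown."""
    for kind, pattern in _KEY_PATTERNS:
        match = pattern.match(key)
        if match:
            return kind, int(match.group(1)), match.group(2) or ""
    return None

print(classify_key("/0/data/title"))             # ('image', 0, 'data/title')
print(classify_key("/0/graph/graph/1/visible"))  # ('graph', 1, 'visible')
print(classify_key("/filename"))                 # None
```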

Auxiliary Objects

The components of a GwySIUnit are summarised in the following table:

Component | Type | Meaning
unitstr | string | Textual representation of the unit, e.g. "A" or "m^-1" (as base SI unit, prefixes are ignored).

The components of a GwySelection are summarised in the following table. Some selection types can have other data members; refer to the documentation of specific selection classes for how to interpret the data.

Component | Type | Meaning
max | 32bit integer | Maximum number of objects the selection can hold (this is the number set by gwy_selection_set_max_objects()).
data | array of doubles | Selection data. The number of items that form one selection object is determined by the selection type.

The components of a GwyStringList are summarised in the following table. Note that if a GwyStringList is used to represent a log, the strings have the specific structure described above.

Component | Type | Meaning
strings | array of strings | List of string items.
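
When a GwyStringList holds a log, each string can be split according to the format type::function(param=value, …)@time. The following Python sketch does only this coarse split. The regular expression is an assumption derived from that format alone, the parameter list and the time are kept as opaque strings, and the sample entry is invented for illustration.

```python
import re

# Coarse pattern for "type::function(params)@time"; not an official grammar.
_LOG_ENTRY = re.compile(
    r"^(?P<type>[^:]+)::(?P<function>[^(]+)\((?P<params>.*)\)@(?P<time>.*)$")

def parse_log_entry(entry):
    """Split one log string into its four parts, or return None if it does not fit."""
    match = _LOG_ENTRY.match(entry)
    if match is None:
        return None
    return match.groupdict()

# Hypothetical entry; real logs may use other names and another time format.
parsed = parse_log_entry("proc::level(mode=0, exclude=False)@2024-01-01 12:00:00")
print(parsed["type"], parsed["function"], parsed["params"], parsed["time"])
```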