gwydump
gwydump displays the structure, and optionally the contents, of .gwy files in textual form, and it can also extract raw data components from them. It is useful for getting acquainted with the file structure, obtaining an overview of file contents from the command line, debugging problems, or writing a program that reads .gwy files.
gwydump can handle both Gwyddion 1 and 2 files and, tentatively, also the Gwyddion 3 serialisation format.
gwydump depends only on GLib and does not use any Gwyddion library. It is also quite simple: currently it is under 700 physical lines of code.
Note that a library for reading and writing Gwyddion GWY files, called libgwyfile, now also exists. It is written in pure C and has a simple permissive license, which makes it suitable for embedding in your programs.
Source code tarball (xz): | gwydump-2.1.tar.xz | 2011-02-04 | 12.4 kB |
Win64 executable: | gwydump-win64.exe | | |
Win32 executable: | gwydump-win32.exe | | |
The Win32/Win64 executable is not an installer; it is just the program. It requires GLib, so it is best to put it into the same directory where the corresponding gwyddion.exe (i.e. 32bit or 64bit) is installed (or to set an App Path key in the MS Windows registry).
If run without any options, gwydump
prints an overview of all
objects and their components in the file. They are indented to represent their
nesting:
$ gwydump btz82-3.gwy
Header GWYP
""
    "/filename"
    "/0/data/visible"
    "/0/data"
        "xres"
        "yres"
        "xreal"
        "yreal"
        "si_unit_xy"
            "unitstr"
        "si_unit_z"
            "unitstr"
        "data"
        "cache_bits"
        "cache_data"
    "/0/data/title"
    "/0/select/pointer"
        "max"
    "/0/graph/lastid"
The first line shows the file magic header: it is GWYO for version 1 files and GWYP for version 2 files. The second line is the unnamed root object; the rest are its components, their components, and so on.
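A program that reads .gwy files can make the same distinction by looking at the first four bytes. The following is a minimal Python sketch, not part of gwydump; the function name `gwy_magic_version` is hypothetical, and it recognises only the two magic values named above.

```python
def gwy_magic_version(path):
    """Return 1 or 2 according to the four-byte magic header, or None if unknown."""
    with open(path, "rb") as f:
        magic = f.read(4)
    # Only the two headers mentioned in the text are known to this sketch.
    return {b"GWYO": 1, b"GWYP": 2}.get(magic)
```

Returning None instead of raising an error leaves it to the caller to decide how to treat files with an unrecognised header.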
To get an overview of just the top-level items, we can limit the maximum printed depth to 1 with -d:
$ gwydump -d 1 -i 0 btz82-3.gwy
Header GWYP
""
"/filename"
"/0/data/visible"
"/0/data"
"/0/data/title"
"/0/select/pointer"
"/0/graph/lastid"
The foregoing example also demonstrates the effect of option -i, which controls the amount of indentation per nesting level.
But we want more; component names are rarely enough. Component types can be printed with -t, their values with -v, and sizes of objects (components of atomic types have known fixed sizes) with -s. For instance, the first example with types and values would look like this:
$ gwydump -tv btz82-3.gwy
Header GWYP
"" object=GwyContainer
    "/filename" string="/home/yeti/opt/gwyddion/bin/btz82-2.gwy"
    "/0/data/visible" boolean=TRUE
    "/0/data" object=GwyDataField
        "xres" int32=512
        "yres" int32=512
        "xreal" double=4e-06
        "yreal" double=4e-06
        "si_unit_xy" object=GwySIUnit
            "unitstr" string="m"
        "si_unit_z" object=GwySIUnit
            "unitstr" string="m"
        "data" double array of length 262144
        "cache_bits" int32=3
        "cache_data" double array of length 30
    "/0/data/title" string="Topography"
    "/0/select/pointer" object=GwySelectionPoint
        "max" int32=1
    "/0/graph/lastid" int32=0
This gave us a fairly good picture of what is in the file. The type of the root object is GwyContainer, as expected.
Fortunately, option -v did not cause a dump of the 262144 values of the "data" field of "/0/data". It enables printing of atomic type values only. The content of arrays can be printed too; see the next section.
If we want to know not only what is inside but also where it is, we use option -o to print offsets in the file (they are printed in hexadecimal). Alternatively, option -a combines all the additional information mentioned so far:
$ gwydump -a btz82-3.gwy
00000000: Header GWYP
00000004: "" object=GwyContainer size=2097749
00000015:     "/filename" string="/home/yeti/opt/gwyddion/bin/btz82-2.gwy"
00000048:     "/0/data/visible" boolean=TRUE
0000005a:     "/0/data" object=GwyDataField size=2097557
00000074:         "xres" int32=512
0000007e:         "yres" int32=512
00000088:         "xreal" double=4e-06
00000097:         "yreal" double=4e-06
000000a6:         "si_unit_xy" object=GwySIUnit size=11
000000c0:             "unitstr" string="m"
000000cb:         "si_unit_z" object=GwySIUnit size=11
000000e4:             "unitstr" string="m"
000000ef:         "data" double array of length 262144
002000f9:         "cache_bits" int32=3
00200109:         "cache_data" double array of length 30
00200209:     "/0/data/title" string="Topography"
00200223:     "/0/select/pointer" object=GwySelectionPoint size=9
0020024c:         "max" int32=1
00200255:     "/0/graph/lastid" int32=0
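The printed offsets can be cross-checked by hand. The Python sketch below assumes a component layout that is not spelled out on this page: a NUL-terminated name, one type byte, then the payload, where an object payload begins with its NUL-terminated type name and a 32-bit size. The helper names are illustrative only.

```python
def atomic_size(name, payload_len):
    """Bytes taken by a component: NUL-terminated name, type byte, payload."""
    return len(name) + 1 + 1 + payload_len


def object_header_size(name, type_name):
    """Bytes before an object's first component:
    NUL-terminated name, type byte, NUL-terminated type name, 32-bit size."""
    return len(name) + 1 + 1 + len(type_name) + 1 + 4


# "xres" (int32) sits at 0x74, so "yres" should follow at 0x7e.
assert 0x74 + atomic_size("xres", 4) == 0x7E
# "xreal" (double) sits at 0x88, so "yreal" should follow at 0x97.
assert 0x88 + atomic_size("xreal", 8) == 0x97
# "data" holds a 32-bit item count and 262144 doubles; "cache_bits" follows at 0x2000f9.
assert 0xEF + atomic_size("data", 4 + 262144 * 8) == 0x2000F9
# "/0/data" starts at 0x5a and its first component "xres" is at 0x74.
assert 0x5A + object_header_size("/0/data", "GwyDataField") == 0x74
```

Under the same assumption, the size=11 reported for each GwySIUnit is exactly its single "unitstr" component: 8 bytes of name, 1 type byte and the 2-byte string "m".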
Arrays, or their leading elements, can be printed with -l. Continuing the example from the previous section:
$ gwydump -vl 3 btz82-3.gwy
Header GWYP
"" GwyContainer
    "/filename" "/home/yeti/opt/gwyddion/bin/btz82-2.gwy"
    "/0/data/visible" TRUE
    "/0/data" GwyDataField
        "xres" 512
        "yres" 512
        "xreal" 4e-06
        "yreal" 4e-06
        "si_unit_xy" GwySIUnit
            "unitstr" "m"
        "si_unit_z" GwySIUnit
            "unitstr" "m"
        "data" array of length 262144
            [0] 5.10979e-06
            [1] 5.11051e-06
            [2] 5.11122e-06
            [3..262143] ...
        "cache_bits" 3
        "cache_data" array of length 30
            [0] 5.08298e-06
            [1] 5.12205e-06
            [2] 0
            [3..29] ...
    "/0/data/title" "Topography"
    "/0/select/pointer" GwySelectionPoint
        "max" 1
    "/0/graph/lastid" 0
We see that [number] is used in place of the component name for array items; the remaining items are replaced with an ellipsis.
Arrays of objects (as opposed to atomic types) are not controlled with -l. They are always printed because the output always includes all named components (if not limited by depth), and objects have named components regardless of whether they are named components themselves. A more complex example with a graph demonstrates this:
$ gwydump -tv btz82-4.gwy
Header GWYP
"" object=GwyContainer
    "/0/graph/lastid" int32=1
    "/filename" string="/home/yeti/opt/gwyddion/bin/btz82-3.gwy"
    "/0/data/visible" boolean=TRUE
    "/0/data" object=GwyDataField
        "xres" int32=512
        "yres" int32=512
        "xreal" double=4e-06
        "yreal" double=4e-06
        "si_unit_xy" object=GwySIUnit
            "unitstr" string="m"
        "si_unit_z" object=GwySIUnit
            "unitstr" string="m"
        "data" double array of length 262144
        "cache_bits" int32=3
        "cache_data" double array of length 30
    "/0/data/title" string="Topography"
    "/0/select/line" object=GwySelectionLine
        "max" int32=12
        "data" double array of length 8
    "/0/select/pointer" object=GwySelectionPoint
        "max" int32=1
    "/0/graph/graph/1" object=GwyGraphModel
        "has_x_unit" boolean=FALSE
        "has_y_unit" boolean=FALSE
        "x_is_logarithmic" boolean=FALSE
        "y_is_logarithmic" boolean=FALSE
        "x_unit" object=GwySIUnit
            "unitstr" string="m"
        "y_unit" object=GwySIUnit
            "unitstr" string="m"
        "title" string="Profiles"
        "top_label" string=""
        "bottom_label" string="x"
        "left_label" string="y"
        "right_label" string=""
        "x_reqmin" double=0
        "y_reqmin" double=5.08e-06
        "x_reqmax" double=2e-06
        "y_reqmax" double=5.12e-06
        "label.position" int32=0
        "label.has_frame" boolean=TRUE
        "label.frame_thickness" int32=1
        "curves" object array of length 2
            [0] object=GwyGraphCurveModel
                "xdata" double array of length 135
                "ydata" double array of length 135
                "description" string="Profile 1"
                "color.red" double=0
                "color.green" double=0
                "color.blue" double=0
                "type" int32=2
                "point_type" int32=0
                "point_size" int32=8
                "line_style" int32=0
                "line_size" int32=1
            [1] object=GwyGraphCurveModel
                "xdata" double array of length 197
                "ydata" double array of length 197
                "description" string="Profile 2"
                "color.red" double=0.812
                "color.green" double=0
                "color.blue" double=0
                "type" int32=2
                "point_type" int32=0
                "point_size" int32=8
                "line_style" int32=0
                "line_size" int32=1
Besides human-readable dumps, gwydump can extract components in raw binary form. Extraction is selected by -x, which disables all other output. Since the extracted data are dumped to the standard output too, remember to redirect it:
$ gwydump -x '/"/0/data"/"yres"' btz82-3.gwy >out
$ xxd out
0000000: 0002 0000                                ....
xxd is a very useful tool for looking at binary dumps, distributed as a part of Vim. It showed us that the file consists of four bytes: 0x00, 0x02, 0x00, 0x00.
So we managed to extract the y-resolution of the first data field, 512, to a file as binary data. All data extracted with -x is stored exactly as in a .gwy file: that is, the byte order is little endian, booleans are stored as bytes, strings are NUL-terminated, etc. The output is essentially just cut out of the input file.
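Turning such extracted bytes back into values takes only a few lines. This is a minimal Python sketch using the standard struct module; the function names are hypothetical, and two details are assumptions rather than statements from this page: strings are decoded as UTF-8, and any non-zero byte counts as a true boolean.

```python
import struct


def decode_int32(raw):
    """Decode a 4-byte little-endian signed integer."""
    (value,) = struct.unpack("<i", raw)
    return value


def decode_double(raw):
    """Decode an 8-byte little-endian IEEE double."""
    (value,) = struct.unpack("<d", raw)
    return value


def decode_boolean(raw):
    """Decode a one-byte boolean (assumption: non-zero means true)."""
    return raw[0] != 0


def decode_string(raw):
    """Decode a NUL-terminated string (assumption: UTF-8 encoded)."""
    return raw.split(b"\x00", 1)[0].decode("utf-8")


# The four bytes shown by xxd above.
print(decode_int32(bytes([0x00, 0x02, 0x00, 0x00])))  # prints 512
```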
How did we arrive at the ugly expression '/"/0/data"/"yres"'? The paths in principle look like Unix file paths, starting with / and using it as the component separator too. Component names are quoted strings: "/0/data" and "yres" (special characters are represented with the common escape sequences \n, \t, \\ …). The fact that the names can be /-separated paths themselves (/0/data) makes the notation a bit harder to read, but there's no mystery in it.
Array indices form path components of the form [42] (no quotes).
And finally, as the path tends to contain lots of shell metacharacters, we put the entire path in single quotes: 'path'.
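When paths are generated by a script, these rules can be wrapped in a small helper. The Python sketch below is illustrative only: `gwy_path` is a hypothetical name, and escaping a double quote inside a name as \" is an assumption (only \n, \t and \\ are named above). Shell quoting is left to the standard shlex.quote.

```python
import shlex

# Escape sequences applied inside quoted names; the entry for '"' is an assumption.
ESCAPES = {"\\": "\\\\", "\n": "\\n", "\t": "\\t", '"': '\\"'}


def gwy_path(*components):
    """Build a gwydump path from component names (str) and array indices (int)."""
    parts = []
    for comp in components:
        if isinstance(comp, int):
            parts.append("[%d]" % comp)          # array index, no quotes
        else:
            escaped = "".join(ESCAPES.get(ch, ch) for ch in comp)
            parts.append('"%s"' % escaped)       # quoted component name
    return "/" + "/".join(parts)


path = gwy_path("/0/data", "yres")
print(path)               # /"/0/data"/"yres"
print(shlex.quote(path))  # '/"/0/data"/"yres"'
```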
If this all sounds too confusing, we can simply use -p to print the full paths in place of component names and choose the one we wish:
$ gwydump -p -i 0 btz82-3.gwy
Header GWYP
""
/"/filename"
/"/0/data/visible"
/"/0/data"
/"/0/data"/"xres"
/"/0/data"/"yres"
/"/0/data"/"xreal"
/"/0/data"/"yreal"
/"/0/data"/"si_unit_xy"
/"/0/data"/"si_unit_xy"/"unitstr"
/"/0/data"/"si_unit_z"
/"/0/data"/"si_unit_z"/"unitstr"
/"/0/data"/"data"
/"/0/data"/"cache_bits"
/"/0/data"/"cache_data"
/"/0/data/title"
/"/0/select/pointer"
/"/0/select/pointer"/"max"
/"/0/graph/lastid"
Can we do something more impressive than the extraction of the y-resolution? Sure. As anything that has a path can be extracted, we can extract, for example, the second graph curve:
$ gwydump -x '/"/0/graph/graph/1"/"curves"/[1]' btz82-4.gwy >out
What exactly is in out now? It is a serialised GwyGraphCurveModel object. But wait, .gwy files are little more than serialised Gwyddion objects (particularly GwyContainers). Shouldn't gwydump be able to tell us something about the contents of out then? Indeed, it can. We only have to use option -r to indicate a raw serialised object without any header:
$ gwydump -ra out
00000000: "" object=GwyGraphCurveModel size=3330
00000017:     "xdata" double array of length 197
0000064a:     "ydata" double array of length 197
00000c7d:     "description" string="Profile 2"
00000c94:     "color.red" double=0.812
00000ca7:     "color.green" double=0
00000cbc:     "color.blue" double=0
00000cd0:     "type" int32=2
00000cda:     "point_type" int32=0
00000cea:     "point_size" int32=8
00000cfa:     "line_style" int32=0
00000d0a:     "line_size" int32=1
We see the object starts immediately at offset 00000000, but otherwise nothing has changed.
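A program can read the beginning of such a raw object in much the same way. The Python sketch below assumes, consistently with the offsets printed above but not stated explicitly on this page, that a serialised object starts with its NUL-terminated type name followed by a 32-bit little-endian size. The function name is hypothetical and the test data is synthetic.

```python
import struct


def read_object_header(raw, pos=0):
    """Return (type_name, size, offset_of_first_component) of a serialised object."""
    end = raw.index(b"\x00", pos)                 # end of the type name
    type_name = raw[pos:end].decode("ascii")
    (size,) = struct.unpack_from("<I", raw, end + 1)
    return type_name, size, end + 1 + 4


# Synthetic header imitating the dump above; the object body is omitted.
header = b"GwyGraphCurveModel\x00" + struct.pack("<I", 3330)
print(read_object_header(header))  # ('GwyGraphCurveModel', 3330, 23)
```

The returned offset 23 is 0x17, the position where gwydump reports the first component, "xdata".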
Arrays can be extracted as well: take the x-resolution and y-resolution, then append the data array
$ gwydump -x '/"/0/data"/"xres"' btz82-3.gwy >out
$ gwydump -x '/"/0/data"/"yres"' btz82-3.gwy >>out
$ gwydump -x '/"/0/data"/"data"' btz82-3.gwy >>out
and voilà, a new simple data file was created!
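Reading such a file back could look like the following Python sketch. It is not stated here whether an extracted array still carries its 32-bit item count, so the sketch accepts both layouts and decides by file size; the function name `read_simple_field` is hypothetical.

```python
import struct


def read_simple_field(path):
    """Read xres and yres (int32) followed by xres*yres doubles, all little endian."""
    with open(path, "rb") as f:
        raw = f.read()
    xres, yres = struct.unpack_from("<ii", raw, 0)
    n = xres * yres
    rest = len(raw) - 8
    if rest == 8 * n:
        offset = 8       # bare values follow the two resolutions
    elif rest == 4 + 8 * n:
        offset = 12      # values are preceded by a 32-bit item count
    else:
        raise ValueError("file size does not match xres * yres doubles")
    data = struct.unpack_from("<%dd" % n, raw, offset)
    return xres, yres, data
```

Checking the size first guards against passing in a file assembled from components in a different order.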