Saving and Loading Data
PATKIT distinguishes between importing and loading, as well as between exporting and saving. Importing and exporting refer (as consistently as possible) to operations on data coming in from sources other than PATKIT and going out for use by other systems. Saving and loading, on the other hand, refer to data produced by PATKIT for its own use.
Basic design principles for saving and loading
- Keep files human-readable when possible. This does not necessarily mean
human-editable, but comes close to it.
- High degree of modularity. We want each bit of metadata stored at the correct
level. If the metadata refers to a Modality, it should be in that Modality’s
meta file; if it refers to the Recording, in the Recording’s meta file; if to
the whole Trial, in the Trial’s meta file; etc.
- Keep redundancy as low as possible. For example, this means that the frame
rate of an ultrasound video is stored only in the (externally generated)
ultrasound parameter file and not as part of PATKIT-generated metadata.
- Break the rules when they make life difficult. For example, in the future the
sampling frequency of a wav file (and possibly its duration) might be stored in
the Recording’s metafile so that the .wav file need not be opened just to get
this sort of implicit metadata. A valid reason for doing something like this
would be that the overhead from multiple disk reads starts to matter, although
this is very unlikely with modern drives.
- Backwards compatibility. This will mainly be taken care of by keeping
importers for old versions of the files as part of PATKIT when changes are
made to how the save formats work. For this purpose the metafiles will always
contain a PATKIT file version number (separate from the PATKIT version
number), which will tell PATKIT which importer to use. It will also tell an
old version of PATKIT that it is outdated and unable to open a given save, if
that is the case.
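The backwards-compatibility principle above can be illustrated with a minimal sketch of version-based importer dispatch. This is not PATKIT's actual implementation: the key `file_version`, the version numbers, and all function names are hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, Tuple

# Hypothetical: the newest file format version this build can read and write.
CURRENT_FILE_VERSION = (1, 1)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a version string such as '1.0' into a comparable tuple (1, 0)."""
    return tuple(int(part) for part in text.split("."))


def import_v1_0(meta: dict) -> dict:
    """Upgrade (hypothetical) version 1.0 metadata to the current layout."""
    upgraded = dict(meta)
    upgraded["file_version"] = "1.1"
    return upgraded


def import_v1_1(meta: dict) -> dict:
    """Current-version metadata needs no conversion, only a copy."""
    return dict(meta)


# One importer per known file format version.
IMPORTERS: Dict[Tuple[int, ...], Callable[[dict], dict]] = {
    (1, 0): import_v1_0,
    (1, 1): import_v1_1,
}


def load_meta(meta: dict) -> dict:
    """Pick the importer matching the file version stored in the metadata."""
    version = parse_version(meta["file_version"])
    if version > CURRENT_FILE_VERSION:
        # The file was written by a newer program: report that we are outdated.
        raise RuntimeError(
            f"File version {meta['file_version']} is newer than this "
            "program supports; please update."
        )
    if version not in IMPORTERS:
        raise ValueError(f"No importer for file version {meta['file_version']}.")
    return IMPORTERS[version](meta)
```

Keeping the dispatch table separate from the importers means that adding a new format version only requires one new function and one new table entry.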
Versioning
Starting from version 1.0, PATKIT file formats will be versioned separately from
versions of PATKIT itself. Until then, no attempt is made to maintain backwards
compatibility.
File names
The file names are made up of two or three parts separated by dots:
- Basename - this is the name of the corresponding .wav file.
- Modality name - included only if the file is specific to a Modality: the Modality’s name attribute with whitespace converted to underscores.
- Suffix - either .npz or .patkit_meta. The former is for data stored in NumPy’s zip format and the latter is for metadata stored as NestedText.
Example:
File005.PD_l2_on_RawUltrasound.patkit. This is the data for PD calculated with the l2 metric on raw ultrasound data for Recording File005. The metadata for the same Modality will be File005.PD_l2_on_RawUltrasound.patkit_meta, while the enclosing Recording’s metadata will be stored in File005.patkit_meta.
PATKIT does not currently use the file names themselves to figure out what a
given file contains. Rather, the information in the names is there only for
human readability.
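The naming scheme described in this section can be sketched as a small helper. The function name `save_file_name` and its parameters are hypothetical and not part of PATKIT's API. Collapsing a run of whitespace into a single underscore is also an assumption.

```python
import re
from typing import Optional


def save_file_name(
    basename: str, suffix: str, modality_name: Optional[str] = None
) -> str:
    """Join basename, optional Modality name, and suffix into a file name."""
    parts = [basename]
    if modality_name is not None:
        # Assumption: each run of whitespace becomes one underscore.
        parts.append(re.sub(r"\s+", "_", modality_name))
    # The suffix already starts with a dot, so it is appended directly.
    return ".".join(parts) + suffix


# Metadata file for a Modality, and for the enclosing Recording.
modality_meta = save_file_name(
    "File005", ".patkit_meta", "PD l2 on RawUltrasound"
)
recording_meta = save_file_name("File005", ".patkit_meta")
```

Here `modality_meta` is `File005.PD_l2_on_RawUltrasound.patkit_meta` and `recording_meta` is `File005.patkit_meta`, matching the metadata names in the example above.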
File name suffixes in code
PATKIT defines an Enum for valid suffixes in patkit.constants.
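As a rough illustration of what such an Enum might look like, the sketch below defines one with the two suffixes listed earlier. The class name `SaveSuffix`, its member names, and the helper `has_valid_suffix` are hypothetical; the actual definition in patkit.constants may differ.

```python
from enum import Enum


class SaveSuffix(Enum):
    """Hypothetical stand-in for the suffix Enum in patkit.constants."""

    DATA = ".npz"
    META = ".patkit_meta"

    def __str__(self) -> str:
        # Lets a member be used directly when building file names.
        return self.value


def has_valid_suffix(file_name: str) -> bool:
    """Check whether a file name ends with one of the known save suffixes."""
    return any(file_name.endswith(suffix.value) for suffix in SaveSuffix)
```

Using Enum members instead of bare strings keeps the set of valid suffixes in one place and makes typos in suffix strings fail early.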