Note file formats
MPKEdit can save and load individual Note files (saves) which use the .note file extension.
In general, there are two types of .note files:
- Basic Note File: `NoteEntry` data, followed by actual save data
- Extended Note File: `MPKNote` header, followed by Basic Note File
This page describes the data structure of these files.
The `MPKNote` header was made to store text comments, as well as an internal timestamp. It is used by default.
The current format version is `0x01`.
Offset | Size | Description |
---|---|---|
0x00 | 1 | Format version number |
0x01 | 7 | Magic string "MPKNote", ASCII-encoded |
0x08 | 2 | Reserved for future use |
0x0A | 5 | Timestamp (big-endian 40-bit Unix time) |
0x0F | 1 | n, Number of 16-byte blocks allocated for comment data |
0x10 | n × 16 | Comment data (UTF-8 string, NUL-terminated) |
– | – | Basic Note file data |
If your application does not support these extra features, simply skip to the Basic note file offset.
The Comment data can have a length of up to 4080 bytes, defined by n at offset `0x0F`, using the formula `n * 16` where n can be 0-255. The comment data must be terminated by a NUL character.
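The layout above can be read with a few lines of JavaScript. This is a minimal sketch, not MPKEdit's own parser: the function name `parseExtendedNote` and the shape of the returned object are made up for illustration, and it only handles format version `0x01`.

```javascript
// Sketch: split an Extended Note File into its header fields and Basic Note data.
// `parseExtendedNote` and its return shape are illustrative names, not MPKEdit's API.
function parseExtendedNote(bytes) {
  if (bytes.length < 0x10) return null;            // too short to hold the fixed header
  const MAGIC = "MPKNote";                         // 7 ASCII bytes at offset 0x01
  for (let i = 0; i < MAGIC.length; i++) {
    if (bytes[0x01 + i] !== MAGIC.charCodeAt(i)) return null; // no MPKNote header
  }
  const version = bytes[0x00];
  if (version !== 0x01) return null;               // only the current layout is handled here

  const blocks = bytes[0x0F];                      // n, 0-255
  const noteOffset = 0x10 + blocks * 16;           // where the Basic Note File starts
  if (bytes.length < noteOffset) return null;      // truncated comment area

  // The comment ends at the first NUL inside the allocated blocks.
  const region = bytes.subarray(0x10, noteOffset);
  const nul = region.indexOf(0);
  const text = region.subarray(0, nul === -1 ? region.length : nul);
  const comment = new TextDecoder("utf-8").decode(text);

  return { version, comment, noteOffset, basicNote: bytes.subarray(noteOffset) };
}
```

An application that does not care about comments only needs `noteOffset` (or `basicNote`) from the result.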
The timestamp is a truncated 64-bit Unix timestamp, stored as an unsigned 40-bit integer in big-endian byte order. Its maximum value represents a date of Feb 19 36812 at 7:36:15 PM, so it remains valid for roughly 34.8 thousand years. In case it's not obvious, big endian means the most significant byte comes first: it sits at offset `0x0A`, and the least significant byte sits at `0x0E`. The two reserved bytes immediately before the timestamp (at `0x08`) are the remaining high bytes; they are reserved for potential future use, but currently undefined.
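Reading and writing the 40-bit field needs a little care in JavaScript, because the bitwise operators only work on 32 bits. The sketch below uses multiplication and division instead; the helper names `readTimestamp40` and `writeTimestamp40` are illustrative, not taken from MPKEdit.

```javascript
// Sketch: 40-bit big-endian Unix time at offsets 0x0A-0x0E.
// Helper names are hypothetical; the offset default matches the header table.
function readTimestamp40(bytes, offset = 0x0A) {
  let seconds = 0;
  for (let i = 0; i < 5; i++) {
    // Multiply instead of shifting: << would truncate to 32 bits.
    seconds = seconds * 256 + bytes[offset + i];
  }
  return seconds;
}

function writeTimestamp40(bytes, seconds, offset = 0x0A) {
  let rest = Math.floor(seconds);
  for (let i = 4; i >= 0; i--) {
    bytes[offset + i] = rest % 256;   // least significant byte lands at offset + 4
    rest = Math.floor(rest / 256);
  }
}
```

The result of `readTimestamp40` is in seconds, so `new Date(seconds * 1000)` gives a JavaScript date.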
In general, the timestamp represents the earliest possible time of saving. MPKEdit attempts to choose the earliest timestamp available when saving a note file. If a previously loaded .MPK has a `date-modified` attribute, it will use this date.
This is the actual timestamp priority:
- The stored timestamps in the `MPKMeta` extension data itself (yet to be implemented).
- The stored internal timestamp found in an imported .note file.
- The `date-modified` of an imported .note file without any internal timestamp.
- The overall `date-modified` of an MPK file containing notes.
- The time of saving the note file (technically impossible to trigger).
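The priority list above amounts to "take the first source that exists". A minimal sketch follows, assuming each candidate is passed in as Unix seconds or left `undefined`; the function `chooseTimestamp` and the property names are invented for this example and do not reflect MPKEdit's internals.

```javascript
// Sketch: pick a timestamp following the priority order listed above.
// All names here are hypothetical; values are Unix seconds or undefined.
function chooseTimestamp(sources) {
  const order = [
    "mpkMeta",          // MPKMeta extension data (not yet implemented upstream)
    "noteInternal",     // internal timestamp of an imported .note file
    "noteFileModified", // date-modified of an imported .note file
    "mpkFileModified",  // date-modified of the MPK file
  ];
  for (const key of order) {
    const value = sources[key];
    if (Number.isFinite(value)) return value;
  }
  return Math.floor(Date.now() / 1000); // last resort: time of saving
}
```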
Note: The initial Format version number of `0x00` was used for a short period during initial implementation. It is obsolete, but MPKEdit supports it just in case (Import only). Supporting it is completely optional (see code if desired), because likely no one even used it, and those who did can simply use MPKEdit to re-save. I only mention it here for completeness. It's essentially the same as version `0x01`, using the same magic string, but the comment data has a fixed length of 256 bytes, and stores an 8-bit NUL-terminated string using the `iso-8859-1` encoding (basically just extended ASCII). This was chosen to match the way DexDrive stored comments.
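For importers that want to handle version `0x00` anyway, the comment can be decoded byte by byte, since ISO-8859-1 maps each byte directly to the code point of the same value. This sketch assumes the comment still starts at offset `0x10` and that the Basic Note File follows at `0x110`; the text above does not state those offsets, so treat them as assumptions and check MPKEdit's code before relying on them. `readLegacyComment` is an illustrative name.

```javascript
// Sketch: read the fixed 256-byte ISO-8859-1 comment of a version 0x00 file.
// ASSUMPTION: the comment starts at 0x10, as in version 0x01.
function readLegacyComment(bytes) {
  const start = 0x10;
  const length = 256;
  let comment = "";
  for (let i = start; i < start + length && i < bytes.length; i++) {
    if (bytes[i] === 0) break;                 // NUL terminator
    comment += String.fromCharCode(bytes[i]);  // byte value equals the code point
  }
  return { comment, noteOffset: start + length };
}
```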
The Basic Note File is a simple binary format for representing notes as files.
Offset | Size | Name |
---|---|---|
0x00 | 32 | Note Entry |
0x20 | n × 256 | Raw save data |
The Note Entry is a raw data structure taken directly from MPK data. However, the two-byte IndexTable entry point located at offset `0x07` is not relevant in a note export, and has been specially repurposed as a file identifier, using the magic number `0xCAFE`. Immediately after this is the actual save data itself, of which the length will always be a multiple of 256. When importing this data, `0xCAFE` must be replaced with the correct entry point.
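The export and import steps described above could be sketched as follows. Two things here are assumptions rather than statements from this page: that the two-byte field occupies bytes `0x06` and `0x07` (so that its low byte is the `0x07` mentioned above), and that it is stored high byte first. The function names are illustrative too.

```javascript
// Sketch: swap the entry point for the 0xCAFE identifier on export, and back on import.
// ASSUMPTION: the two-byte field spans 0x06-0x07, stored high byte first.
const ENTRY_POINT_OFFSET = 0x06;

function exportBasicNote(noteEntry, saveData) {
  if (noteEntry.length !== 32) throw new RangeError("Note Entry must be 32 bytes");
  if (saveData.length === 0 || saveData.length % 256 !== 0) {
    throw new RangeError("save data must be a non-empty multiple of 256 bytes");
  }
  const file = new Uint8Array(32 + saveData.length);
  file.set(noteEntry, 0);
  file.set(saveData, 32);
  file[ENTRY_POINT_OFFSET] = 0xCA;       // overwrite the entry point with the magic
  file[ENTRY_POINT_OFFSET + 1] = 0xFE;
  return file;
}

function importBasicNote(file, entryPoint) {
  const noteEntry = file.slice(0, 32);   // copy, so the input stays untouched
  noteEntry[ENTRY_POINT_OFFSET] = (entryPoint >> 8) & 0xFF;
  noteEntry[ENTRY_POINT_OFFSET + 1] = entryPoint & 0xFF;
  return { noteEntry, saveData: file.slice(32) };
}
```

The `entryPoint` passed to `importBasicNote` is whatever index the importing application allocated for the note in the target MPK.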
To detect Basic Note files, it is only required to check for the existence of `0xCAFE` at offset `0x07`.
Optionally, you can check if filesize minus 32 is a multiple of 256. Going even further, check that bytes `0x0A` and `0x0B` both equal zero, which can be used to strengthen the file magic.
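A detection routine combining the required and optional checks might look like this. As a hedge: the sketch assumes the magic occupies bytes `0x06` and `0x07` with `0xCA` first, which this page does not spell out, and `looksLikeBasicNote` with its `strict` flag is an invented helper.

```javascript
// Sketch: detect a Basic Note File.
// ASSUMPTION: 0xCAFE is stored as 0xCA at 0x06 and 0xFE at 0x07.
function looksLikeBasicNote(bytes, strict = false) {
  if (bytes.length < 32) return false;              // no room for a Note Entry
  const magic = (bytes[0x06] << 8) | bytes[0x07];
  if (magic !== 0xCAFE) return false;               // the only required check
  if (!strict) return true;

  // Optional checks that strengthen the file magic.
  const sizeOk = (bytes.length - 32) % 256 === 0;   // save data is a multiple of 256
  const zeroOk = bytes[0x0A] === 0 && bytes[0x0B] === 0;
  return sizeOk && zeroOk;
}
```

For an Extended Note File, run the check on the data that follows the `MPKNote` header rather than on the whole file.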
- Checksum likely to be added in the unused 16-bit space. Also likely to use existing hash algorithm MurmurHash128. Rationale for inclusion: It is an early warning system for potential data corruption. It will not fix problems, and problems are likely to be rare in most contexts, but it can alert the user that something may have changed, either minor or major, that could compromise the file. It's a safety fallback, likely to not be needed, but if it's ever needed, it could be a godsend. Reasons for accepting 16-bit instead of 32-bit: Technically the checksum is optional and only a nicety, and we also don't rely on it for heavy parsing/validation. Therefore, as a pure checksum, 16 bits really are sufficient, even for save files which may take up as much as 123 pages.
- Considerations: While MurmurHash128 is technically documented elsewhere, I fixed a few issues that make it incompatible with the original function. Implementers may see this as 'too much work' to deal with, and might prefer something 'more standard', or 'more simple'. CRC-16 comes with its own issues of quality and implementation ambiguity. And a simple checksum, even with certain basic improvements, still might be subject to additional implementation debugging time. The best course of action is to provide test strings with their expected values, and perhaps provide an example port in plain C as a guide (though JS is very C-like anyway, some syntax is less clear).
- There is also the option to forgo a checksum entirely. A 16-bit checksum, while sufficient, is certainly somewhat weak. And the remaining 16 bits could come in handy for other uses, such as bit flags that may end up important in the future. Also, it may be more practical to simply archive/compress save files and rely on the CRC verification of that. It also feels a bit wrong to use a 16-bit truncation of a 128-bit hash. I think I will hold off.