bryc edited this page Sep 29, 2023 · 62 revisions

MPKEdit can save and load individual Note files (saves) which use the .note file extension.

In general, there are two types of .note files:

  1. Basic Note File: NoteEntry data, followed by actual save data
  2. Extended Note File: MPKNote header, followed by Basic Note File

This page describes the data structure of these files.

MPKNote header

The MPKNote header was made to store text comments, as well as an internal timestamp. It is used by default.

Current format version is 0x01.

Offset         Size     Description
0x00           1        Format version number
0x01           7        Magic string "MPKNote", ASCII-encoded
0x08           2        Reserved for future use
0x0A           5        Timestamp (big endian 40-bit unixtime)
0x0F           1        n, Number of 16-byte blocks allocated for comment data
0x10           n × 16   Comment data (UTF-8 string, NUL-terminated)
0x10 + n × 16  (rest)   Basic Note file data

If your application does not support these extra features, simply skip to the Basic Note file, which starts at offset 0x10 + n × 16.
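
Skipping the header could look roughly like the sketch below. The function name basic_note_offset is made up for illustration and is not part of MPKEdit; the sketch only handles format version 0x01 and treats anything without the magic string as a plain Basic Note file.

```python
MPKNOTE_MAGIC = b"MPKNote"  # magic string at offset 0x01


def basic_note_offset(data: bytes) -> int:
    """Return the offset at which the Basic Note file data begins.

    Hypothetical helper: returns 0 when no MPKNote header is present.
    """
    if len(data) >= 0x10 and data[0x01:0x08] == MPKNOTE_MAGIC:
        if data[0x00] != 0x01:
            raise ValueError("unsupported MPKNote format version")
        n = data[0x0F]        # number of 16-byte comment blocks
        return 0x10 + n * 16  # fixed header part plus comment data
    return 0                  # no header: Basic Note file starts at 0
```

A reader that ignores comments and timestamps only needs data[basic_note_offset(data):].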

The comment data can be up to 4080 bytes long. Its allocated length is n × 16 bytes, where n is the value at offset 0x0F and can range from 0 to 255. The comment data must be terminated by a NUL character.
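
As a rough illustration of the block arithmetic, here is a minimal sketch of packing and unpacking a comment. The helper names are hypothetical, and storing an empty comment as n = 0 with no bytes at all is an assumption of this sketch rather than something stated above.

```python
def encode_comment(text: str):
    """Return (n, comment_bytes) for the MPKNote header.

    Hypothetical helper. Assumption: an empty comment is stored as n = 0.
    """
    if not text:
        return 0, b""
    raw = text.encode("utf-8") + b"\x00"  # NUL terminator is required
    n = (len(raw) + 15) // 16             # round up to whole 16-byte blocks
    if n > 255:
        raise ValueError("comment does not fit in 255 blocks (4080 bytes)")
    return n, raw.ljust(n * 16, b"\x00")  # pad the last block with NULs


def decode_comment(field: bytes) -> str:
    """Decode the comment field up to (not including) the first NUL."""
    return field.split(b"\x00", 1)[0].decode("utf-8")
```

Because the terminator counts toward the allocated space, the longest text that fits is 4079 bytes of UTF-8.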

The timestamp is a truncated 64-bit Unix timestamp, essentially an unsigned 40-bit integer in big-endian byte order. Its maximum value represents a date of Feb 19 36812 at 7:36:15 PM, so it is effectively valid for about 34.8 thousand years. Big endian means the most significant byte comes first: it is stored at offset 0x0A, and the least significant byte is stored last, at offset 0x0E. The two bytes directly before it (offset 0x08), which would be its next-higher bytes, are reserved for potential future use, but currently undefined.
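
The byte order is easiest to see in code. The following is a minimal sketch (function names are hypothetical) that converts between a Unix time in seconds and the 5-byte field at offset 0x0A.

```python
TIMESTAMP_OFFSET = 0x0A
TIMESTAMP_SIZE = 5  # 40 bits


def encode_timestamp(unix_seconds: int) -> bytes:
    """Pack a Unix time as an unsigned 40-bit big-endian integer."""
    # to_bytes raises OverflowError for negative or too-large values.
    return unix_seconds.to_bytes(TIMESTAMP_SIZE, "big")


def decode_timestamp(header: bytes) -> int:
    """Read the 40-bit timestamp out of an MPKNote header."""
    field = header[TIMESTAMP_OFFSET:TIMESTAMP_OFFSET + TIMESTAMP_SIZE]
    return int.from_bytes(field, "big")
```

For example, 1695945600 (Sep 29 2023, 00:00:00 UTC) is stored as the bytes 00 65 16 13 80, with 0x00 at offset 0x0A and 0x80 at offset 0x0E.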

In general, the timestamp represents the earliest possible time of saving. MPKEdit attempts to choose the earliest timestamp available when saving a note file. If a previously loaded .MPK has a date-modified attribute, it will use this date.

This is the actual timestamp priority:

  1. The stored timestamps in the MPKMeta extension data itself (yet to be implemented).
  2. The stored internal timestamp found in imported .note file.
  3. The date-modified of an imported .note file without any internal timestamp.
  4. The overall date-modified of an MPK file containing notes.
  5. The time at which the note file is saved (technically impossible to trigger).
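
The priority list above amounts to "use the first source that exists". A minimal sketch, with all parameter names invented for illustration (None stands for "not available"); this is not MPKEdit's actual code:

```python
from typing import Optional


def choose_timestamp(
    meta_ts: Optional[int],           # 1. MPKMeta extension data
    note_internal_ts: Optional[int],  # 2. timestamp inside imported .note
    note_file_mtime: Optional[int],   # 3. date-modified of the .note file
    mpk_file_mtime: Optional[int],    # 4. date-modified of the MPK file
    now: int,                         # 5. time of saving, last resort
) -> int:
    """Return the highest-priority timestamp that is available."""
    for candidate in (meta_ts, note_internal_ts, note_file_mtime, mpk_file_mtime):
        if candidate is not None:
            return candidate
    return now
```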

Note: The initial format version number of 0x00 was used for a short period during initial implementation. It is obsolete, but MPKEdit still supports it for import only, just in case. Supporting it is completely optional (see the code if desired), because likely no one ever used it, and those who did can simply use MPKEdit to re-save. I only mention it here for completeness. It's essentially the same as version 0x01 and uses the same magic string, but the comment data has a fixed length of 256 bytes and holds an 8-bit NUL-terminated string in the ISO-8859-1 encoding (basically just extended ASCII). This was chosen to match the way DexDrive stored comments.
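
For anyone who does want to read version 0x00 files, a sketch might look like this. It assumes the comment field still starts at offset 0x10, as in version 0x01; the text above does not state this explicitly, so check it against the MPKEdit code before relying on it. The function name is hypothetical.

```python
V0_COMMENT_OFFSET = 0x10  # assumption: same position as in version 0x01
V0_COMMENT_SIZE = 256     # fixed length in version 0x00


def read_v0_comment(data: bytes):
    """Return (comment, basic_note_offset) for a version 0x00 file."""
    if len(data) < 0x08 or data[0x00] != 0x00 or data[0x01:0x08] != b"MPKNote":
        raise ValueError("not a version 0x00 MPKNote header")
    end = V0_COMMENT_OFFSET + V0_COMMENT_SIZE
    field = data[V0_COMMENT_OFFSET:end]
    # 8-bit text, cut at the first NUL
    comment = field.split(b"\x00", 1)[0].decode("iso-8859-1")
    return comment, end
```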

Basic Note file

This is a simple binary format for representing notes as files.

Offset  Size     Name
0x00    32       Note Entry
0x20    n × 256  Raw save data

The Note Entry is a raw data structure taken directly from MPK data. However, the two-byte IndexTable entry point located at offset 0x07 is not relevant in a note export, and has been specially repurposed as a file identifier, using the magic number 0xCAFE. When importing this data, 0xCAFE must be replaced with the correct entry point. The actual save data follows immediately after the Note Entry, and its length is always a multiple of 256 bytes.
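
A minimal sketch of the export and import steps follows. The function names are hypothetical. The offset is taken from the text above, and writing both the magic number and the entry point as big-endian byte pairs (0xCA then 0xFE) is an assumption of this sketch.

```python
NOTE_ENTRY_SIZE = 32
ENTRY_POINT_OFFSET = 0x07  # offset as given in the text above
FILE_MAGIC = b"\xCA\xFE"   # assumption: stored big-endian


def export_basic_note(note_entry: bytes, save_data: bytes) -> bytes:
    """Build a Basic Note file from a raw Note Entry and its save data."""
    if len(note_entry) != NOTE_ENTRY_SIZE:
        raise ValueError("Note Entry must be 32 bytes")
    if len(save_data) % 256 != 0:
        raise ValueError("save data must be a multiple of 256 bytes")
    entry = bytearray(note_entry)
    entry[ENTRY_POINT_OFFSET:ENTRY_POINT_OFFSET + 2] = FILE_MAGIC
    return bytes(entry) + save_data


def import_basic_note(note_file: bytes, entry_point: int):
    """Split a Basic Note file and restore a real entry point."""
    entry = bytearray(note_file[:NOTE_ENTRY_SIZE])
    # Replace the 0xCAFE marker with the entry point chosen by the importer.
    entry[ENTRY_POINT_OFFSET:ENTRY_POINT_OFFSET + 2] = entry_point.to_bytes(2, "big")
    return bytes(entry), note_file[NOTE_ENTRY_SIZE:]
```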

To detect Basic Note files, the only required check is that 0xCAFE is present at offset 0x07. Optionally, you can also check that the file size minus 32 is a multiple of 256. To strengthen the file magic even further, check that bytes 0x0A and 0x0B are both zero.
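
The three checks can be sketched as a single function. The name and the strict flag are invented for illustration, and the magic number is assumed to be stored as the bytes 0xCA, 0xFE in that order.

```python
def looks_like_basic_note(data: bytes, strict: bool = True) -> bool:
    """Heuristically detect a Basic Note file (hypothetical helper)."""
    if len(data) < 32:
        return False
    # Required check: magic number at the offset given above.
    if data[0x07:0x09] != b"\xCA\xFE":
        return False
    if not strict:
        return True
    # Optional check: save data length is a multiple of 256.
    if (len(data) - 32) % 256 != 0:
        return False
    # Optional check: bytes 0x0A and 0x0B are both zero.
    return data[0x0A] == 0 and data[0x0B] == 0
```

For an Extended Note File, the same function would be applied to the data that follows the MPKNote header.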


Misc TODO (ignore)

  • Checksum likely to be added in the unused 16-bit space. Also likely to use existing hash algorithm MurmurHash128. Rationale for inclusion: It is an early warning system for potential data corruption. It will not fix problems, and problems are likely to be rare in most contexts, but it can alert the user that something may have changed, either minor or major, that could compromise the file. It's a safety fallback, likely to not be needed, but if it's ever needed, it could be a godsend. Reasons for accepting 16-bit instead of 32-bit: Technically the checksum is optional and only a nicety, and we also don't rely on it for heavy parsing/validation. Therefore, as a pure checksum, 16 bits really are sufficient, even for save files which may take up as much as 123 pages.
    • Considerations: While MurmurHash128 is technically documented elsewhere, I fixed a few issues that make it incompatible with the original function. Implementers may see this as 'too much work' to deal with, and might prefer something 'more standard', or 'more simple'. CRC-16 comes with its own issues of quality and implementation ambiguity. And a simple checksum, even with certain basic improvements, still might be subject to additional implementation debugging time. The best course of action is to provide test strings with their expected values, and perhaps provide an example port in plain C as a guide (though JS is very C-like anyway, some syntax is less clear).
  • There is also the option to forgo a checksum entirely. A 16-bit checksum, while sufficient, is certainly somewhat weak. And the remaining 16 bits could come in handy for other uses, such as bit flags that may end up important in the future. Also, it may be more practical to simply archive/compress save files and rely on the CRC verification of that. It also feels a bit wrong to use a 16-bit truncation of a 128-bit hash. I think I will hold off.