Registering new file formats for the ContentsManager #1456

martinRenou · 2024-08-30T07:54:06Z

Problem

When building extensions like JupyterGIS or JupyterCAD, we quickly need to support advanced file formats such as .qgz and .fcstd. As of today, the default ContentsManager supports text files, binary files as base64, and notebooks as JSON. Because those formats are binary, the ContentsManager returns the base64-encoded version of the file when its content is requested.

In the case of FreeCAD's .fcstd format, we were able to work around that problem ourselves by handling the base64 source with FreeCAD's Python library, but this means the file is read and written multiple times. It also only works in the specific case of jupyter-collaboration, where we can hook into the file-loading logic to turn the original source into something we understand for the collaboration logic (a JSON representation of the content).

Proposed Solution

It would be great to handle those specific file formats ourselves directly at the ContentsManager level, which would avoid the issues mentioned above.

One solution could be to provide our own ContentsManager, but because we are building libraries, we don't want to override a custom ContentsManager the user may have already set. Our libraries would also collide with each other, since only one ContentsManager can be active at a time.

I would like to suggest making the ContentsManager configurable (maybe through traitlets) with custom "file adapters" that would handle reading and writing for specific file extensions.
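To make the idea concrete, here is a minimal sketch of what such a registry could look like. Everything in it is hypothetical: `FileAdapter`, `JsonAdapter`, `AdapterRegistry` and the `.demo` extension are illustration names, not an existing Jupyter Server API, and a real version would presumably be wired in through traitlets rather than called directly.

```python
import base64
import json
from pathlib import PurePosixPath


class FileAdapter:
    """Hypothetical default adapter: mirrors today's base64 fallback."""

    format = "base64"

    def read(self, raw: bytes):
        # Raw bytes from disk -> content stored in the contents model.
        return base64.b64encode(raw).decode("ascii")

    def write(self, content) -> bytes:
        # Content from the contents model -> raw bytes to write to disk.
        return base64.b64decode(content)


class JsonAdapter(FileAdapter):
    """Hypothetical adapter exposing a file's content as parsed JSON."""

    format = "json"

    def read(self, raw: bytes):
        return json.loads(raw.decode("utf-8"))

    def write(self, content) -> bytes:
        return json.dumps(content).encode("utf-8")


class AdapterRegistry:
    """Hypothetical registry mapping file extensions to adapters."""

    def __init__(self):
        self._adapters = {}
        self._default = FileAdapter()

    def register(self, extension: str, adapter: FileAdapter) -> None:
        key = extension.lower()
        if key in self._adapters:
            # Two libraries claiming the same extension is an explicit error.
            raise ValueError(f"an adapter is already registered for {key}")
        self._adapters[key] = adapter

    def adapter_for(self, path: str) -> FileAdapter:
        # Unknown extensions fall back to the base64 behaviour.
        suffix = PurePosixPath(path).suffix.lower()
        return self._adapters.get(suffix, self._default)


registry = AdapterRegistry()
registry.register(".demo", JsonAdapter())
```

With this shape, each library registers only its own extensions, so several libraries can coexist on top of whatever ContentsManager the user has configured.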

Additional context

Notes from the jupyter-server meeting on August 29th, 2024, where we discussed this:

Could the base ContentsManager register custom file types?

Today, we set the type of content using two keys in the contents REST API model, type and format.

Extending would mean adding new type/format combinations.
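As a rough illustration of the type/format point above, the sketch below lists the combinations the default contents API is understood to produce today and checks a model against them. `KNOWN_COMBINATIONS`, `is_known` and the `"cad-json"` format are hypothetical names used only for this example; they are not part of Jupyter Server.

```python
import base64

# (type, format) pairs assumed to be what the default contents API returns today.
KNOWN_COMBINATIONS = {
    ("file", "text"),
    ("file", "base64"),
    ("notebook", "json"),
    ("directory", "json"),
}


def is_known(model: dict) -> bool:
    """Return True if the model uses one of the existing type/format pairs."""
    return (model.get("type"), model.get("format")) in KNOWN_COMBINATIONS


# How a binary file is represented today: opaque base64 text.
binary_model = {
    "name": "part.fcstd",
    "path": "part.fcstd",
    "type": "file",
    "format": "base64",
    "content": base64.b64encode(b"\x00\x01").decode("ascii"),
}

# A hypothetical new combination that a registered file type could introduce.
custom_model = {
    "name": "part.fcstd",
    "path": "part.fcstd",
    "type": "file",
    "format": "cad-json",  # invented format name, for illustration only
    "content": {"objects": []},
}
```

Under this sketch, `is_known(binary_model)` holds while `is_known(custom_model)` does not, which is the gap a registration point would need to close.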

Problems we observed:

The contents handler in the Jupyter Server owns the logic to check these keys, not the manager.

There is a discrepancy between the formats allowed in the ContentsHandlers and those allowed in the ContentsManager. For example, "json" is not a valid format in the handler, but it is valid in the contents manager.

There is a lot of logic in the handler that should be pushed down into the manager.

The contents manager raises HTTP errors? That's strange (and probably wrong).

In general, it would be ideal to have a registration/plugin point to bring your own file types to Jupyter Server without requiring folks to bring a custom contents manager to support them. Instead, we want to keep using the default contents manager while being able to act on files that aren't notebooks, text, or base64 strings.

Comments in the chat:

The limitation of the contents GET approach will always be that it does not support streaming the content, unless we have any ideas.

Maybe the "adapter" could have a public serialize() method and, in the future, an async aserialize() if we ever support streaming?

Compressed notebooks are also an idea.
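The serialize()/aserialize() suggestion from the chat could look roughly like the sketch below. Only the two method names come from the discussion; the `Base64Adapter` class, its `chunk_size` parameter and the chunking strategy are assumptions made for illustration, and nothing here reflects an existing Jupyter Server interface.

```python
import asyncio
import base64
from typing import AsyncIterator


class Base64Adapter:
    """Hypothetical adapter with a one-shot and a streaming serializer."""

    def __init__(self, chunk_size: int = 3 * 1024):
        # A chunk size divisible by 3 keeps base64 padding out of every chunk
        # except the last, so the chunks concatenate into one valid string.
        if chunk_size <= 0 or chunk_size % 3 != 0:
            raise ValueError("chunk_size must be a positive multiple of 3")
        self.chunk_size = chunk_size

    def serialize(self, raw: bytes) -> str:
        # One-shot path: the whole payload is encoded in memory.
        return base64.b64encode(raw).decode("ascii")

    async def aserialize(self, raw: bytes) -> AsyncIterator[str]:
        # Streaming path: yield one encoded chunk at a time.
        for start in range(0, len(raw), self.chunk_size):
            chunk = raw[start:start + self.chunk_size]
            yield base64.b64encode(chunk).decode("ascii")
            # Give other tasks a chance to run between chunks.
            await asyncio.sleep(0)


async def collect(adapter: Base64Adapter, raw: bytes) -> str:
    """Join the streamed chunks back into a single string."""
    return "".join([part async for part in adapter.aserialize(raw)])
```

Keeping serialize() synchronous and adding aserialize() later would let existing callers stay unchanged while a streaming endpoint, if one is ever added, consumes the async generator.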


martinRenou added the enhancement label Aug 30, 2024
