ld+yaml media-type #8

Closed
4 tasks done
ioggstream opened this issue Jan 18, 2022 · 16 comments

@ioggstream
Collaborator

ioggstream commented Jan 18, 2022

I expect

To register the application/ld+yaml mediatype.
This includes:

  • providing considerations about yaml features that are not round-trip safe when serializing to json, including
    • Complex mapping keys, since keys must be str, int, float, bool or None, not tuple
    • NaN and Infinity, which JSON does not support
  • providing considerations about yaml features that cannot be used since there is no json counterpart (eg. explicit typing !!str)
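A minimal sketch of the two round-trip problems listed above, using only Python's standard `json` module as a stand-in for whatever serializer a YAML-LD tool would use. The helper name `json_round_trip_problem` and the sample values are hypothetical; the tuple key merely plays the role of an in-memory complex mapping key.

```python
import json


def json_round_trip_problem(value):
    """Return None if value serializes to strict JSON, else a short reason."""
    try:
        # allow_nan=False makes the encoder reject NaN and +/-Infinity,
        # which the JSON grammar does not permit.
        json.dumps(value, allow_nan=False)
    except (TypeError, ValueError) as exc:
        return f"{type(exc).__name__}: {exc}"
    return None


# Hypothetical in-memory values of the kind the list above describes.
examples = {
    "plain mapping": {"a": [1, 2, 3]},
    "complex mapping key": {("x", "y"): 1},
    "not a number": {"a": float("nan")},
    "infinity": {"a": float("inf")},
}

for label, value in examples.items():
    print(label, "->", json_round_trip_problem(value) or "ok")
```

Only the first sample serializes; the others are rejected before any JSON is produced, which is the sense in which those features are not round-trip safe.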

@dlongley @msporny @gkellogg

ioggstream added a commit that referenced this issue Jan 18, 2022
@msporny

msporny commented Jan 18, 2022

> To register the application/ld+yaml mediatype

Great! Please make sure it's completely round-trippable between application/ld+json :) -- creating any deviation would be problematic. That is, as much as it might be really awesome to use some of YAML's features that are different from JSON... doing so, in a non-round-trippable way, would probably harm the initiative.

Clearly, we should support YAML features like comments... but can't expect them to round trip to JSON-LD and back. :)

If you get this sorted, we'd love to put together some examples for Verifiable Credentials, Decentralized Identifiers, and Data Integrity Proofs -- digitally signed YAML documents with very little effort. :)

@ioggstream
Collaborator Author

@msporny @pchampin @gkellogg first and foremost, thanks for your replies! I'm following up on the discussion here so that comments won't get lost if json-ld/json-ld.org#9 is superseded by another PR.

Agree with @msporny : we need to identify the admissible yaml subset and to clarify that:

  • some features won't be round-trip-safe;
  • some features won't fit json anyway.

For example:

  • comments won't be preserved;
  • allowed datatypes: mappings, scalars, string, int, float, null, ...
  • only one document --- [...] ...
  • no explicit typing (e.g. !!str), for simplicity, though we can discuss admitting types
  • anchors and merge-keys allowed

I saw that you have quite a long list of examples here...
Do you already have a list of those features?

@gkellogg

Those examples were created automatically by the script that validates examples in the spec, simply re-serializing them in YAML. No real work has gone into specifically identifying the YAML subset, but a YAML-LD spec should do so.

As JSON-LD specifically relies on an internal representation using Infra types, and we can expect other -LD variations (e.g., CBOR-LD), a future version of a core JSON-LD spec may provide some extension points for non-JSON data types, but that is entirely speculative.

@ioggstream
Collaborator Author

ioggstream commented Jan 28, 2022

@gkellogg @msporny I stubbed some general considerations here https://github.com/ietf-wg-httpapi/mediatypes/pull/15/files
your feedback would be great!

> internal representation using Infra types

How does this differ from RFC8259?

> extension points for non-JSON data types

Shouldn't any alternative representation have a well-defined mapping to JSON?

@pchampin

pchampin commented Feb 9, 2022

> how does this differ with RFC8259

INFRA defines an abstract data model; RFC8259 (JSON) defines an exchange format. Granted, the boundary is a bit blurry: first, because JSON is based on JS, and therefore is often assumed to describe JS's data format. Second, because RFC8259 has a number of remarks about interoperability, which somehow constrain the underlying data formats used by JSON implementations.

> shouldn't any alternative representation have a well-defined mapping to JSON?

That would make sense if JSON was a data model, which it is not (strictly speaking). INFRA, on the other hand, is.

@ioggstream
Collaborator Author

> > shouldn't any alternative representation have a well-defined mapping to JSON
>
> That would make sense if JSON was a data model

I mean that the infra spec defines a way to convert an infra value to a json compatible javascript value for a string, boolean, number, null, list, or string-keyed map.
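As a rough illustration of that value set, here is a sketch that walks an in-memory value and reports the first place it leaves the "string, boolean, number, null, list, or string-keyed map" family. The function name and the path notation are hypothetical, and treating non-finite floats as incompatible is an assumption made because JSON cannot express them.

```python
import math


def first_incompatible_path(value, path="$"):
    """Return the path of the first value outside the JSON-compatible set, or None."""
    # Scalars: null, string, boolean and integer are always acceptable.
    if value is None or isinstance(value, (str, bool, int)):
        return None
    # Assumption: NaN and infinities are rejected because JSON has no syntax for them.
    if isinstance(value, float):
        return None if math.isfinite(value) else path
    if isinstance(value, list):
        for index, item in enumerate(value):
            found = first_incompatible_path(item, f"{path}[{index}]")
            if found is not None:
                return found
        return None
    if isinstance(value, dict):
        for key, item in value.items():
            # Only string-keyed maps have a JSON counterpart.
            if not isinstance(key, str):
                return f"{path}[key {key!r}]"
            found = first_incompatible_path(item, f"{path}.{key}")
            if found is not None:
                return found
        return None
    # Anything else (sets, tuples, dates, custom objects) is outside the set.
    return path


print(first_incompatible_path({"a": [1, "x", None, True, 1.5]}))
print(first_incompatible_path({"a": [1, {2: "b"}]}))
```

A check like this could report what will not be interoperable with JSON without forbidding it, which matches the "provide guidance and leave the choice to the user" position in this comment.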

While extension points can be difficult to manage with JSON, yaml could define specific tags (eg. !!infra:mynewtype ) and the associated syntax for that.

As of now, I think it would be beneficial to try to describe the current json mapping and address extension points when they are defined.

Probably, instead of restricting how to use yaml for encoding json-ld objects, it is better to provide guidance on what will be interoperable or not with respect to JSON, and leave the choice to the user. cc: @msporny

@msporny

msporny commented Feb 9, 2022

> Probably, instead of restricting how to use yaml for encoding json-ld objects, it is better to provide guidance on what will be interoperable or not with respect to JSON, and leave the choice to the user. cc: @msporny

I'll merely note that the Decentralized Identifier WG ended up having to define the core data model in INFRA and then demonstrate mappings to JSON and JSON-LD. There was work to map the core data model to CBOR, but that work failed due to disinterest.

See this section on core data model that uses INFRA:

https://www.w3.org/TR/did-core/#data-model

... and how we mapped it to JSON:

https://www.w3.org/TR/did-core/#production

... which the JSON-LD representation then built on top of:

https://www.w3.org/TR/did-core/#production-0

The end result, IMHO, was an awful mess... we should have never done it. A few loud opinions in the WG wanted it, and it was clear that there were going to be formal objections if we didn't do it, so we tried the experiment and I did my best to make it make sense in the specification. I have always been of the opinion that we should've just used the JSON-LD data model and be done with it (it had direct mappings to JSON, CBOR, etc.)... so, just a word of warning for anyone that wants to try to get INFRA working w/ JSON-LD... it's possible, but the end result is hard for many to grasp and understand... though, it's not much better for RDF, is it? :)

@pchampin

pchampin commented Feb 9, 2022

@msporny although I sympathize with your pain re. DID, I think the problem here is very different.

If YAML is to be used as a representation for a JSON-LD document, then one has to specify how to represent the JSON-LD internal representation in YAML. And that internal representation is based on INFRA.

For the DID core model, I agree with you that it would have been better to define it in terms of the JSON-LD data model, which can then be serialized in JSON, and inherit from all other JSON-LD representations to come, including YAML...

@ioggstream
Collaborator Author

@pchampin @msporny given the situation, I think in this media type registration we should aim at registering a usage that is already in the wild: people who want to use and content-negotiate yaml serializations of json-ld documents.

I am not excluding that thorough work based on the json-ld data model can be done in the future. In fact, using explicit tagging with yaml it is already possible to declare a json-ld namespace and make each value's type explicit, e.g.

%TAG !! tag:json-ld.org,2000:types/
---
!!map a: !!array [1, 2, 3]

To support json-ld usage in web APIs, I think this proposal should be based on what json-ld is now, and not be hindered by what it could be one day.

@msporny

msporny commented Feb 9, 2022

> If YAML is to be used as a representation for a JSON-LD document, then one has to specify how to represent the JSON-LD internal representation in YAML. And that internal representation is based on INFRA.

JSON-LD already has an abstract data model defined, why can't YAML-LD just use that: https://w3c.github.io/json-ld-syntax/#data-model

I will admit that if something changed drastically in JSON-LD 1.1 that makes this impossible, I'm unaware of it. :)

> To support json-ld usage in web APIs, I think this proposal should be based on what json-ld is now, and not be hindered by what it could be one day.

This is the approach we took with CBOR-LD -- just define how to map to/from JSON-LD and you're done -- (and I fully admit that some might find the approach controversial).

Introduction to the concepts in CBOR-LD:

https://docs.google.com/presentation/d/1ksh-gUdjJJwDpdleasvs9aRXEmeRvqhkVWqeitx5ZAE/edit

The CBOR-LD "specification" (if you can call a collection of ramblings that):

https://digitalbazaar.github.io/cbor-ld-spec/

If you define a clean mapping to/from JSON-LD, you can end up with a much smaller specification. Again, just throwing this out there having given YAML-LD some random thought throughout the last couple of years but then never acting on it. YMMV. :)

@pchampin

pchampin commented Feb 9, 2022

@msporny I'm confused

> JSON-LD already has an abstract data model defined, why can't YAML-LD just use that: https://w3c.github.io/json-ld-syntax/#data-model

(...)

> This is the approach we took with CBOR-LD -- just define how to map to/from JSON-LD and you're done -- (and I fully admit that some might find the approach controversial).

As I read it, CBOR-LD defines how to map to/from JSON-LD's internal representation: it works at the level of maps, key-value pairs... not at the level of nodes, arcs and graphs, as described by the data model.

So I think we are in violent agreement here :)

> I will admit that if something changed drastically in JSON-LD 1.1 that makes this impossible, I'm unaware of it. :)

I don't think that's the case :-)

Although the notion of "internal representation" was not explicitly defined in JSON-LD 1.0, it was implicitly there, but referred simply to RFC4627. Moving from RFC4627's terminology to INFRA's terminology was deemed cleaner because, again, RFC4627 does not really define a data model (only a syntax), while INFRA does.

@gkellogg

gkellogg commented Feb 9, 2022

If I were to revisit anything in the JSON-LD data model, it would be the interpretation of JSON numbers to allow for decimal values. As it is now, JSON numbers are either interpreted as integers (long) or doubles based on the range of the number. But, in JSON-LD 1.1, we use The JSON Canonicalization Scheme (RFC8785) as a way to represent numbers in the rdf:JSON datatype serialization, which allows for a serialization form of either integer, decimal, or double. This really only comes into play in JSON-LD when creating RDF literals from native JSON numbers (something which is generally a bad design point, but is there to allow a reasonable interpretation of native JSON forms), but could also come into play when representing those numbers in the data model, and thus in serializations to forms such as YAML.
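To make the integer-or-double split concrete, here is a hedged sketch of how a native JSON number might be assigned an RDF datatype. The rule used (a non-zero fractional part, or an absolute value of at least 10^21, yields a double; anything else an integer) is an assumption recalled from the JSON-LD 1.1 API and should be checked against that specification; the function and constant names are hypothetical.

```python
import json

# Assumption: magnitude at which a number is treated as a double,
# as recalled from the JSON-LD 1.1 API; verify against the specification.
DOUBLE_THRESHOLD = 10 ** 21


def native_number_datatype(number):
    """Classify a parsed JSON number as xsd:integer or xsd:double."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(number, bool) or not isinstance(number, (int, float)):
        raise TypeError("expected a JSON number")
    has_fraction = number % 1 != 0
    if has_fraction or abs(number) >= DOUBLE_THRESHOLD:
        return "xsd:double"
    return "xsd:integer"


for text in ["1", "1.0", "1.5", "1e21"]:
    print(text, "->", native_number_datatype(json.loads(text)))
```

Under this rule no input can come out as a decimal, which is the gap the comment above points at: a value written as 1.5 has only the double interpretation available.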

@darrelmiller
Contributor

In the OpenAPI specification, we limit the use of YAML to the "JSON Schema" https://yaml.org/spec/1.2.2/#json-schema (not to be confused with JSON Schema). Specifying this constraint has proven sufficient for us to ensure tooling can roundtrip between YAML and JSON.

@darrelmiller darrelmiller linked a pull request Mar 19, 2022 that will close this issue
@ioggstream
Collaborator Author

@darrelmiller we will address the issues with "yaml json schema" (see the interoperability considerations), so that might need tweaking the OAS spec too.

@ioggstream
Collaborator Author

Moved to https://github.com/json-ld/yaml-ld/
