Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spec progress #78

Merged
merged 31 commits into from
Sep 13, 2022
Merged

Spec progress #78

merged 31 commits into from
Sep 13, 2022

Conversation

gkellogg
Copy link
Member

@gkellogg gkellogg commented Aug 21, 2022

  • Flesh out more conversion operations.
  • Describe conversion relative to the YAML Representation Graph.
  • Stub a definition of the YAML-LD JSON profile.
  • Describe the YAML-LD extended profile.
  • Stub updated descriptions to the JSOn-LD API

For #75.


Preview | Diff

* Flesh out more conversion operations.
* Describe conversion relative to the YAML Representation Graph.
* Stub a definition of the YAML-LD JSON profile.
* Describe the YAML-LD extended profile.
* Stub updated descriptions to the JSOn-LD API
@gkellogg gkellogg added the spec Issue on specification label Aug 21, 2022
@gkellogg
Copy link
Member Author

Note that some references depend on broken ReSpec xrefs as discussed in https://github.com/w3c/respec/issues/4243.

Apologies for a dump of changes, as I expect my time to become more limited once the RCH and RDF-star working groups get underway.

PR is in progress, but early reviews and change requests to this PR are welcome.

spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
Suggestions from @TallTed, who never tires in his errors to make technical specifications more readable.

Co-authored-by: Ted Thibodeau Jr <[email protected]>
@TallTed
Copy link
Contributor

TallTed commented Aug 22, 2022

I actually LOLed at this typo --

Suggestions from @TallTed, who never tires in his errors to make technical specifications more readable.

@gkellogg
Copy link
Member Author

I actually LOLed at this typo --

Suggestions from @TallTed, who never tires in his errors to make technical specifications more readable.

I blame aggressive auto-correct on my Mac. "endeavors" :)

… representation. Of course, this section requires further expansion.
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
@TallTed
Copy link
Contributor

TallTed commented Aug 25, 2022

It's difficult to track the changes in the last commit, |var| changes vs much more potentially significant textual changes. If there are further markup changes that impact the whole doc, I'd appreciate it if those could be made distinctly from textual revision.

@gkellogg
Copy link
Member Author

It's difficult to track the changes in the last commit, |var| changes vs much more potentially significant textual changes. If there are further markup changes that impact the whole doc, I'd appreciate it if those could be made distinctly from textual revision.

Well, this is a work in progress. Given the amount that's being touched, it's impractical to break it up into multiple PRs, which would all then need to be re-based against each other, and would significantly affect that amount of work that can be done.

Probably best to wait until the status changes from "work in progress". Still, it does touch a lot of the document, and I apologize for that. Changes for things like <var>x</var> to |x| impact the difficulty of making these changes, and most of the algorithmic text is introduced in the PR.

@TallTed
Copy link
Contributor

TallTed commented Aug 25, 2022

I'm afraid that waiting until this already giant PR (GitHub won't tell me how many lines are touched, though I can easily see that ~550 lines have been added) becomes even bigger means "review the entire document" rather than "review the changes", and that's going to be a big time sink when we get there.

That said, I'm (generally) OK with large PRs (though smaller remain preferable). In this case, doing the |var| changes in one commit, and the other edits from de22b88 in another commit, would have been sufficient for me (and others, I imagine) to more carefully review the textual changes, and less carefully review the |var| changes.

None of which is to say, stop what you're doing. More to ask that maybe this omnibus PR be merged, and another (smaller, hopefully) omnibus begun, sooner than later.

I do understand the pain of addressing near-inevitable conflicts and rebasing PRs that have impacted one another. But it's better than building these docs in a wiki, as we used to do...

@gkellogg gkellogg marked this pull request as ready for review August 25, 2022 22:28
@gkellogg
Copy link
Member Author

Again, apologies for touching so much of the document, but I think it is now useful for developers to start implementing something.

I've completed the work I intended for this PR.

Copy link
Contributor

@TallTed TallTed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor stuff...

spec/index.html Outdated Show resolved Hide resolved
@@ -943,6 +943,15 @@ <h3>Converting a <a>YAML scalar</a></h3>
the conversion result is a <a>language-tagged string</a>
with value taken from |n|,
ane a <a>language tag</a> taken from the suffix of |t|.
<div class="note">
<a>Node tags</a> including an underscore (`"_"`),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<a>Node tags</a> including an underscore (`"_"`),
<a>Node tags</a> including an underscore ("`_`"),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went back and forth on this. The JSON-LD specs use "_" instead of "_" for equivalent quoted characters (e.g., in https://www.w3.org/TR/json-ld11/#the-i18n-namespace) That may make it more readable, if not be as accurate. If we do make the change, it needs to be done comprehensively, so I'll leave this out of this batch.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough. There may be some W3-ish style guide or examples that provides guidance, or others in the WG with opinions. I can live with the "_" style if there's general agreement that it's better, though it's not to my taste.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There were about four points in the document that use this style for "_", "#", and "%23". To my mind it's easier to see these as it exists easily, given how the spec styles code blocks.

spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
Copy link
Contributor

@pchampin pchampin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I sympathize with @TallTed's argument that discussing on big PRs is not ideal, so I am in favour of merging it, and then discussing more specific points, using more targetted PRs.

In particular, I am still uncomfortable with extending the internal representation, but this can be discussed later, on the basis of the changes proposed here.

spec/index.html Outdated
the document SHOULD use the `%YAML` directive with
version set to at least `1.2`
and a `%TAG` directive appropriate for each
`RDF literal`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be "for each RDF literal", but "for each namespace of datatype(s) used in RDF literals of the input RDF graph", although this phrasing is not very clear either.

Also: how are implementation chose the name of such tags?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should not be "for each RDF literal", but "for each namespace of datatype(s) used in RDF literals of the input RDF graph", although this phrasing is not very clear either.

Well, in terms of iteration, it is for each "RDF literal" with the description of specifically looking at the "datatype IRIs" of those literals left unsaid.

Also: how are implementation chose the name of such tags?

As with most other RDF serialization formats, the choice of namespace prefix to use is up to a given implementation and not something that should be specified. It's perfectly legitimate to not use any %TAG prefixes and have the node tags be full IRIs. I don't think we can say more than this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in terms of iteration, it is for each "RDF literal"

Ok, now that you explain it, I understand what you meant. It is however very easy to misread the current phrasing as "there should be 1 TAG declaration per RDF literal". Furthermore...

It's perfectly legitimate to not use any %TAG prefixes and have the node tags be full IRIs.

agreed, and that's why a SHOULD seems to strong for me.

I don't think we can say more than this.

I think we should even say less 😉 i.e. and replace the SHOULD by a MAY. E.g. replacing the end of the sentence with "and MAY use %TAG directives to make the serialization of some RDF literals more concise."

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated language and added an example in 53837c8. See if that helps.

spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated
Comment on lines 1301 to 1308
<li>
All other literals a transformed to a native <a>string</a>.
</li>
</ul>

<p class="ednote">
An alternative would be to transform such literals to
JSON-LD <a>value objects</a>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I reallyy don't see the benefit of losing information at this stage, while the (standard) internal representation is capable of representing RDF literals through value objects...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern was for round-tripping: how do you distinguish from something specified using a value object (or through context coercion) from something expressed natively? How would you know to express a value object using node tags? Easiest to not do that at all, and allow direct representation within the (expanded) internal representation. At least, this was the conclusion I came to when thinking about it over time.

Being able to represent more native values in the internal representation (also potentially IRIs and blank nodes) better supports the greater expressivity of alternative formats such as YAML and CBOR. And, it's largely transparent to the JSON-LD processing algorithms, other than the to/from RDF bits, so it seems a natural solution.

This was discussed in #17 (comment) on #17, and should be the topic of further discussion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another problem is that the serialization phase comes after compaction and framing, so trying to interpret what nodes represent value objects and are eligible for being serialized using node tags could require recursively processing the internal definition of the graph as it is done in expansion in order to identify which keys represent @value, @type, and @language, so the process of finding those, and replacing them with a scalar with node tag becomes much more difficult.

Consider the following:

{
  "@context": {
    "@vocab": "http://example.org/",
    "prop": {
      "@context": {
        "xsd": "http://www.w3.org/2001/XMLSchema#",
        "val": "@value",
        "dt": "@type"
      }
    }
  },
  "prop": {
    "val": "2022-08-27",
    "dt": "xsd:date"
  }
}

The use of an expanded internal representation retains the nature of the literal values so that they are more easily resolved into scalars with node tags.

Of course, when serialized to JSON, or using the YAML-LD JSON schema, these get flattened. The text in this PR simply turns them into strings (or numbers of boolean), but we could decide to turn them into value objects as well, which is also a reasonable direction. At that point, re-compacting might be necessary to get an expression suiting the in-scope context.

spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
gkellogg and others added 11 commits August 28, 2022 15:46
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
Co-authored-by: Ted Thibodeau Jr <[email protected]>
spec/index.html Outdated Show resolved Hide resolved
Copy link
Contributor

@TallTed TallTed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of minor tweaks above...

gkellogg and others added 2 commits August 29, 2022 10:58
Co-authored-by: Ted Thibodeau Jr <[email protected]>
…ich apparently is the right way to modify a term defined in another spec.
@gkellogg
Copy link
Member Author

This was discussed on today's meeting. Not ready to merge yet.

@niklasl suggested an alternative to using RDF literals as part of an extended internal representation would be to simply use YAML node tags on scalars to encode value objects. This could only really work where value objects (containing @value and @type or @language, but not other tags). For example:

{
  "property": {
    "@value": "2022-08-31",
    "@type": "http://www.w3.org/2001/XMLSchema#date"
  }
}

could be serialized as:

---
property: !<http://www.w3.org/2001/XMLSchema%23date>  "2022-08-31"

If we allow some examination of the context, we could also probably handle:

{
  "@context": {
    "xsd": "http://www.w3.org/2001/XMLSchema#"
  },
  "property": {
    "@value": "2022-08-31",
    "@type": "xsd:date"
  }
}

That could be considered as an alternative to the suggested algorithms for Converting a YAML scalar (still requiring some extendedYAML flag) and Converting From the Internal Representation, which would use node-tags on scalars for value objects with the extendedYAML flag, and potentially look at the top-level context for terms used in @type IRIs.

spec/index.html Outdated
Comment on lines 1124 to 1127
Although allowed within the YAML Grammar, some current YAML parsers
do not allow the use of `"#"` within a tag URI &mdash; substituting
the `"%23"` escape is a workaround for this problem, that will
hopefully become unnecessary as implementations are updated.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gkellogg I suggest to collect all these notes under "interoperability consideration", so that implementers can find them all in the same place.

Comment on lines +761 to 764
<p class="issue" data-number="63">
The current text does not support this, and only supports
a single <a>YAML document</a>.
This is inconsistent with the processing description in <a href="#convert-stream" class="sectionRef"></a>.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gkellogg I suggest to say that:

  1. ld+yaml can contain multiple documents even in the basic profile;
  2. they should be processed in isolation. For further details see https://github.com/ietf-wg-httpapi/mediatypes/blob/main/draft-ietf-httpapi-yaml-mediatypes.md#yaml-streams-int-yaml-streams

This is because aggregating multiple docs in a single stream is a common practice and a useful native YAML feature addressed by editors and libraries.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the implications for the API would be, as it doesn't really provide for separate documents.

I think this is something we come back to later.

Copy link
Contributor

@ioggstream ioggstream left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gkellogg positive feelings provided that the comments are addressed.

On YAML -> Internal representation processing I just trust you, but I think that we will have time for tuning after releasing of a draft version.

spec/index.html Outdated Show resolved Hide resolved
spec/index.html Outdated Show resolved Hide resolved
@gkellogg
Copy link
Member Author

gkellogg commented Sep 13, 2022

Discussed at today's Face to Face and resolved to merge.

Resolution: https://json-ld.org/minutes/2022-09-12/#resolution-1

@gkellogg gkellogg merged commit 312e028 into main Sep 13, 2022
@gkellogg gkellogg deleted the spec-progress branch September 13, 2022 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
spec Issue on specification
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants