Incomplete content_type subsection. #42

Bonnarel · 2020-04-20T16:57:53Z

Recent semantic discussion addressed the use case of linking sibling or alternate science datasets to the main item. It was eventually decided that the right place to specify the dataproduct_type of the datasets is a standardized media type parameter in the content_type FIELD. This has to be explained in the section. See PR #43

pdowler · 2020-11-04T07:23:27Z

The PR discussion brought up the issue that mime type parameters really should be defined by the same authority that defines the mime type, so the idea of adding content={dataproduct_type} (or some other ivoa vocabulary value) to application/fits (eg) now seems unacceptable.

The next best alternative would be to add a new (optional in 1.1) field, say content_qualifier, where one could use a vocabulary term to describe the logical content (as opposed to the format in content_type). Since the ObsCore standard looks to be moving toward dataproduct_type being a vocabulary (and we could treat the current list of words as such now), allowing vocabulary terms would satisfy the current and future use cases.

Detail: do we define a default vocabulary and allow terms from that to be used "unqualified" (eg image or #image instead of http://ivoa.net/ObsCore/dataproduct_type#image <-- totally made up fully qualified vocabulary term) -- or do we always require fully qualified values? Having a default vocabulary kind of anchors us (again) to the idea that DataLink is for data and files and not more generically "links to resources"... something we've tried to generalise in the current revision.

So, do we allow bare vocabulary terms from any (IVOA) vocabulary -- image (or #image) and galaxy (or #galaxy) -- or fully qualified vocabulary terms (identifiers)?

pdowler · 2020-11-04T07:25:07Z

I can volunteer to write this and create a PR, but I'd like to wait for PR #50 because that introduces optional fields and this would definitely create a merge conflict if done in parallel.

msdemlei · 2020-11-11T08:22:12Z

The next best alternative would be to add a new (optional in 1.1)
field, say content_qualifier where one could use a vocabulary term to

I always cringe when "about the same thing" is done differently in two
different standards. So... what's the difference between this and
obscore dataproduct_type? Is the rationale for this difference really
so significant to justify inventing something new and forcing adopters
to learn yet another thing?

Detail: do we define a default vocabulary and allow terms from that to
be used "unqualified" (eg image or #image instead of
http://ivoa.net/ObsCore/dataproduct_type#image <-- totally made up

There's http://www.ivoa.net/rdf/product-type that ought to become
adopted with SimpleDALRegExt. And I'm pretty sure we should just use
that.

So, do we allow bare vocabulary terms from any (IVOA) vocabulary --
image (or #image) and galaxy (or #galaxy) -- or fully qualified
vocabulary terms (identifiers)?

This is a bit tricky -- for internal (datalink) consistency, I'd say we
should do it like with semantics: What's in there is a URI relative to
http://www.ivoa.net/rdf/product-type. This will make #image just work,
and if people really want, they can add fully qualified URIs.

Given that's what datalink does elsewhere, I'd say we can't really do it
differently here.
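
A minimal sketch of the "URI relative to the vocabulary" reading described above, using Python's standard `urllib.parse.urljoin`. The function name `resolve_term` is made up for illustration, and treating this base URI as the default is the proposal under discussion, not settled DataLink text.

```python
from urllib.parse import urljoin

# Base URI quoted in the comment above; using it as the default is an
# assumption taken from this discussion.
PRODUCT_TYPE_BASE = "http://www.ivoa.net/rdf/product-type"


def resolve_term(value: str, base: str = PRODUCT_TYPE_BASE) -> str:
    """Resolve a column value as a URI reference relative to `base`."""
    return urljoin(base, value)


# A fragment-only reference attaches to the base vocabulary URI.
print(resolve_term("#image"))
# A fully qualified URI is returned unchanged.
print(resolve_term("http://example.org/vocab#galaxy"))
```

Under standard relative-reference resolution a bare `image` (without the `#`) would replace the last path segment of the base rather than name a term in the vocabulary, which is one reason the `#image` form matters in this reading.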

If we started from scratch, I'd not do it this way again and instead say
"it's terms from product-type, full stop, no fooling around with
concatenating URIs".

This is because I'm now convinced that hierarchy-aware matching
("anything that is image or narrower") is an important use case in this
kind of thing; and that, really, won't ever work when you allow terms
from everywhere. That's why I'm against repeating the # hack in, say,
obscore, or in SimpleDALRegExt. I might add some text explaining why
datalink differs in this respect from what's done elsewhere in the VO if
VocinVO2 becomes REC before Datalink 1.1, just so adopters don't curse
us too badly.
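
To make "anything that is image or narrower" concrete, here is a small sketch of hierarchy-aware matching over a single controlled vocabulary. The terms and their arrangement are hypothetical and are not taken from the real product-type vocabulary; the names `WIDER` and `is_same_or_narrower` are also made up for this example.

```python
# Toy hierarchy: each term maps to its wider term (None for a top-level
# term). Hypothetical content, for illustration only.
WIDER = {
    "image": None,
    "sky-map": "image",
    "spectrum": None,
    "echelle-spectrum": "spectrum",
}


def is_same_or_narrower(term: str, target: str) -> bool:
    """True if `term` equals `target` or lies below it in the tree."""
    current = term
    while current is not None:
        if current == target:
            return True
        # Terms unknown to the vocabulary have no wider term, so the
        # walk stops and the match fails.
        current = WIDER.get(current)
    return False


print(is_same_or_narrower("sky-map", "image"))
print(is_same_or_narrower("http://example.org/v#picture", "image"))
```

A term from an arbitrary external vocabulary never matches here because the client has no place for it in the tree, which is the limitation described above.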

Bonnarel · 2020-11-18T18:52:19Z

Hi all,

The next best alternative would be to add a new (optional in 1.1)
field, say content_qualifier where one could use a vocabulary term to

I always cringe when "about the same thing" is done differently in two
different standards. So... what's the difference between this and
obscore dataproduct_type? Is the rationale for this difference really
so significant to justify inventing something new and forcing adopters
to learn yet another thing?

Well, I tend to agree with Pat there. I think we have to be cautious about adding new columns all the time in the future. So having two fields to qualify the content of the link independently of its relation to "#this" (content_type and content_qualifier) should be enough. We will still have plenty of use cases where we will not use a dataproduct_type to qualify the target because it's simply inappropriate. But if the target is a voevent, the content can be a classification tag of that voevent, or if the semantics is "metadata" the content_qualifier could tell us: "provenance" record, obscore record, ssa record, proprietary, etc.
This more or less requires integrating the vocabulary namespace in the value of this new content_qualifier field.

Detail: do we define a default vocabulary and allow terms from that to
be used "unqualified" (eg image or #image instead of
http://ivoa.net/ObsCore/dataproduct_type#image <-- totally made up

There's http://www.ivoa.net/rdf/product-type that ought to become
adopted with SimpleDALRegExt. And I'm pretty sure we should just use
that.

So, do we allow bare vocabulary terms from any (IVOA) vocabulary --
image (or #image) and galaxy (or #galaxy) -- or fully qualified
vocabulary terms (identifiers)?

This is a bit tricky -- for internal (datalink) consistency, I'd say we
should do it like with semantics: What's in there is a URI relative to
http://www.ivoa.net/rdf/product-type. This will make #image just work,
and if people really want, they can add fully qualified URIs.

Given that's what datalink does elsewhere, I'd say we can't really do it
differently here.

So http://www.ivoa.net/rdf/product-type has to be the default namespace for this field (stated so in the spec or advertised "à la" xsd namespace at the beginning of the VOTable).

So anything which is not a dataproduct_type from the IVOA vocab has to contain an explicit namespace.

If we started from scratch, I'd not do it this way again and instead say
"it's terms from product-type, full stop, no fooling around with
concatenating URIs".

This is because I'm now convinced that hierarchy-aware matching
("anything that is image or narrower") is an important use case in this
kind of thing; and that, really, won't ever work when you allow terms
from everywhere. That's why I'm against repeating the # hack in, say,
obscore, or in SimpleDALRegExt. I might add some text explaining why
datalink differs in this respect from what's done elsewhere in the VO if
VocinVO2 becomes REC before Datalink 1.1, just so adopters don't curse
us too badly.

Bonnarel · 2020-11-19T05:17:34Z

For a while I also volunteered to write this one
I didn't create a pull request because Pat writes there may be a conflict with PR #50
so here is the proposal
The value may be null (blank)
if unknown and will typically be null for links to services.

\subsubsection{content_qualifier}

The content_qualifier column is optional. If it is present, it tells the client the nature of the thing or service they will receive or access if they use the link, in other words the target. If the target is a dataproduct, the field SHOULD contain one of the terms defined in the IVOA dataproduct_type vocabulary, considered as the default vocabulary. For other kinds of target the field MAY contain a term defined in another IVOA or proprietary vocabulary, referred to by its URI.

\subsection{Successful Requests}

Bonnarel · 2020-12-09T17:45:31Z

I eventually created the PR for the small subsection because it is not in conflict with the table where optional FIELDS will be listed. See discussion on this PR#50

Bonnarel · 2020-12-09T17:51:05Z

Possible solution in PR #56 (DataLink-#51)

pdowler · 2021-05-13T01:14:01Z

Coming back to this now that the other PRs are merged. Before we discuss the name of the column in the links table, the more fundamental question is whether there is a single vocabulary which defines the values or are there several, some of which have not been created yet?

If we started from scratch, I'd not do it this way again and instead say
"it's terms from product-type, full stop, no fooling around with
concatenating URIs".

I'm not sure what you mean by "do it this way again". (I thought) I understand that there are two orthogonal vocabulary concepts:
0. you have a vocabulary with a set of words (hierarchical: wider and narrower)
1. you have multiple vocabularies with different sets of words (different namespace)
2. you can have a vocabulary that extends another (adds words), usually to add narrower terms (don't know if there is a way to add a new base term and for that to be any different from just a term in a different vocabulary)

Are you saying you don't like allowing terms from multiple vocabularies in this new "content_qualifier" column? (using #1) If so, the "fooling around" is caused by allowing unqualified bare terms from a default vocabulary.

Are you saying you don't like extensions of a single-mandated vocabulary? (using #2). If so, the "fooling around" (like in semantics column) is because we allowed the unqualified bare terms #this which seemed kind of cute at the time. Maybe we don't have a well specified way for people to declare and use extensions but that's really important so people can put prototype terms into use...

I just don't see how the product-type vocabulary can satisfy all the use cases and I don't see adding a column for each new vocabulary, so:

My position right now:

* semantics continues to mandate the single vocabulary, therefore unqualified terms are allowed
* content_qualifier (not in love with the name) allows fully qualified terms from any vocabulary; no default; I could possibly get behind restricting to "any ivoa vocabulary", depending on your position on extensions

We (CADC) use fully qualified terms in semantics that are not in the core vocab; I consider them prototype in nature and just haven't got around to the VEP stage.

msdemlei · 2021-05-14T09:15:29Z

On Wed, May 12, 2021 at 06:14:16PM -0700, Patrick Dowler wrote:
Coming back to this now that the other PRs are merged. Before we discuss the name of the column in the links table, the more fundamental question is whether there is a single vocabulary which defines the values or are there several, some of which have not been created yet?

The most important thing is to get the use case clear. I'm assuming it's "route links through SAMP". If we're thinking of anything else, this would be the moment to say so (and properly define it: "A client wants to do X").

Are you saying you don't like allowing terms from multiple vocabularies in this new "content_qualifier" column? (using #1) If

I don't like pretending you can post your private vocabulary somewhere and terms in it will work as well as if they were defined in the IVOA vocabulary. While RDF would make that possible (because you can define relationships between vocabularies), all kinds of technicalities make that completely unrealistic in practice (I'll elaborate if you want). That is: We have the choice between allowing resources from all over the place ("full URIs", which then in effect are opaque strings) and concept trees with proper metadata (labels, description, preliminary/deprecated flags). In that choice, the trees and metadata to me are overwhelmingly more important in almost all relevant use cases for consensus vocabularies. And hence I propose as a good default policy: When you have a field with values from a controlled vocabulary, say "terms are from http://www.ivoa.net/rdf/thisvoc". You can always say "prefix by an x- for something experimental that won't resolve" or so; in practice, people will just fall back to something ugly for unknown terms anyway, and you won't be in the hierarchy -- that's reasonably sensible behaviour.
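
A rough sketch of how a client might apply such a default policy. The set of known terms is a hypothetical snapshot (a real client would load the vocabulary named by the standard), and the `x-` prefix is only the informal convention floated above, not an adopted rule.

```python
# Hypothetical snapshot of terms from the single mandated vocabulary.
KNOWN_TERMS = {"image", "spectrum", "cube"}


def classify_term(term: str) -> str:
    """Sort a column value into known, experimental, or unknown."""
    if term in KNOWN_TERMS:
        return "known"          # full metadata and hierarchy available
    if term.startswith("x-"):
        return "experimental"   # declared prototype, will not resolve
    return "unknown"            # client falls back to generic handling


for value in ("image", "x-lightcurve-bundle", "galaxy"):
    print(value, "->", classify_term(value))
```

Both the experimental and the unknown case end up outside the hierarchy, so a client can only treat them as opaque labels.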

Are you saying you don't like extensions of a single-mandated vocabulary? (using #2). If so, the "fooling around" (like in

They're built to be extended; VEPs are supposed to be cheap exactly to make people extend soon and extend early.

#this which seemed kind of cute at the time. Maybe we don't have a well specified way for people to declare and use extensions but that's really important so people can put prototype terms into use...

...and of course you can always just stick in terms illegally for *really* early prototyping and bear with the consequences laid out above.

I just don't see how the product-type vocabulary can satisfy all the use cases and I don't see adding a column for each new

What additional use cases are these? Me, I think columns usually should be per-use case (when these use cases are sufficiently different, of course). Having columns be useful sometimes for A and sometimes for B in my experience ends up making them not useful for either A and B -- and that's independent of the question of vocabularies. Having said that, having some wild "twitter-like tags" obviously is a valid use case in microblogs. Perhaps they may work for datalinks as well (I'd need some serious convincing here, though). In that case, though, I think one would find that these things don't need any sort of vocabulary and work, twitter-like, by spontaneous agreement of certain subcommunities.

My position right now:
* semantics continues to mandate the single vocabulary, therefore unqualified terms are allowed
* content_qualifier (not in love with the name) allows fully qualified terms from any vocabulary; no default; I could possibly get behind restricting to "any ivoa vocabulary", depending on your position on extensions

But how would that solve the (IMHO valid) SAMP routing use case? Note that hierarchy plays a major role there, as a single client might handle an entire branch of product types.

pdowler · 2021-05-14T23:00:47Z

OK, I get the objection to the wild west of arbitrary full URLs to something on the internet; I don't think it would magically work either and they are just opaque identifiers to s/w (a human could go get the definition of a term).

I re-read what I think is the original post on this (issue #44) and in there I noted a couple of rather simple things that maybe are enough to get by for some time. First, there is a (proposed) "tabular" or "table" value in the product-type vocabulary; assuming such a VEP was accepted this would nominally be the way to link to "records" (query results).

If you saw a links response with:

id   semantics    product_type  content_type      ...
id1  #this        #image        application/fits  ...
id1  #derivation  #table        application/fits  ...

You could infer that the second link was to a fits file with a table in it, but does #derivation tell you what's in the table? what is a row in that table? is it clear that it is an extracted source? if not, how could we make that clear?

The answer could be a narrower term than #derivation that said something about what kind of derivation: same data but processed to be "better" vs information extracted vs astronomical sources extracted ...

So I guess if both datalink/core and product-type vocabularies grow sufficiently, aren't too rigid and don't become a huge mess then we'd be OK with a product_type column restricted to values from that vocabulary. The combinations from two vocabularies will make this quite flexible... I suspect 3 such things would be too much.
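
As an illustration of combining the two vocabularies, the sketch below selects rows from a links response by both semantics and product_type. The rows mirror the example table in this comment; the `product_type` column is still only a proposal, and `select_links` is a made-up helper name.

```python
# Rows copied from the example links response above; product_type is
# the proposed (not yet standardized) column.
rows = [
    {"id": "id1", "semantics": "#this", "product_type": "#image",
     "content_type": "application/fits"},
    {"id": "id1", "semantics": "#derivation", "product_type": "#table",
     "content_type": "application/fits"},
]


def select_links(rows, semantics, product_type):
    """Return the rows matching both the relation and the kind of thing."""
    return [
        row for row in rows
        if row["semantics"] == semantics
        and row["product_type"] == product_type
    ]


for row in select_links(rows, "#derivation", "#table"):
    print(row["id"], row["content_type"])
```

The selection says "a derived table in some format" but still cannot say what a row of that table represents, which is the gap a narrower semantics term would have to fill.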

Francois - do you think this will work for the use cases from Ada and others you mentioned?

Aside: At CADC we have a handful of astronomers and data-scientists that use our services a lot; they are pseudo-representative of the community (pseudo because they know too much now). I am keenly aware of how much they hate it when things change and if you give them something simple they get used to you can never go back and generalize it in a way that makes it more complex. As a result, I am extremely leery of simple-looking things that look like short cuts unless I have sketched out the general solution and I know the shortcut is not going to bite me later. So like Markus, I don't think I grok the general problem here (lack of use cases) and that makes me a little worried that we'll regret something. OTOH, if we just think about it as "used to be able to say one thing about a link" and "now you can say two things about a link" then that helps.

Bonnarel · 2021-05-20T15:23:06Z

OK, I get the objection to the wild west of arbitrary full URLs to something on the internet; I don't think it would magically work either and they are just opaque identifiers to s/w (a human could go get the definition of a term).

I re-read what I think is the original post on this (issue #44) and in there I noted a couple of rather simple things that maybe are enough to get by for some time. First, there is a (proposed) "tabular" or "table" value in the product-type vocabulary; assuming such a VEP was accepted this would nominally be the way to link to "records" (query results).

If you saw a links response with:

id   semantics    product_type  content_type      ...
id1  #this        #image        application/fits  ...
id1  #derivation  #table        application/fits  ...

You could infer that the second link was to a fits file with a table in it, but does #derivation tell you what's in the table? what is a row in that table? is it clear that it is an extracted source? if not, how could we make that clear?

The answer could be a narrower term than #derivation that said something about what kind of derivation: same data but processed to be "better" vs information extracted vs astronomical sources extracted ...

So I guess if both datalink/core and product-type vocabularies grow sufficiently, aren't too rigid and don't become a huge mess then we'd be OK with a product_type column restricted to values from that vocabulary. The combinations from two vocabularies will make this quite flexible... I suspect 3 such things would be too much.

Francois - do you think this will work for the use cases from Ada and others you mentioned?

Well, there are two levels of answer:
1 ) if we consider the original use case where #this is a "source" or "detection" in a catalog and #link is a timeseries "of #this", then for sure product_type combined with one of the (coderived, derived, counterpart, progenitor) semantics terms is enough. And content_type will tell us about the format.
But
2 ) - when #link is not a dataproduct, product_type is useless. It is not a problem per se; we can leave it empty. But maybe we want to say more about what it is in that case. Imagine #link is "Documentation". Is that a tutorial? A refereed article? A simple html page? A github repository? Where do we put this information if the new field is reserved for the dataproduct_type vocabulary?
- in Ada's proposal there were 4 levels :
Level 0 - Data-format (fits, VOTable, PDF, png, …)
Level 1 - Data-type (tabular, image, spectrum, cube, text, …)
Level 2 - Data-information (Documentation, Calibration, Log, Preview, …)
Level 3 - Data-relation (Derived from, Progenitor of, Sibling of, ...)
0 and 1 will be covered by content_type and product_type. 3 is obviously covered by semantics. My personal opinion is that the examples for level 2 are also a kind of relationship between #this and the #link, so well covered by semantics. But it may happen that something in her level 2 could be covered by data-type (the very nature of documentation, for example). And I think we will sooner or later need a new "metadata" semantics term, the nature of which could be an "obscore record" or a "provenance record" or ...

---> could we find a more generic term than product_type for describing the nature of the #link? (I understand that content_qualifier is ruled out)
---> can we consider that the default vocabulary there is the dataproduct_type one and that we allow alternative complete-URI IVOA terms if needed?

Aside: At CADC we have a handful of astronomers and data-scientists that use our services a lot; they are pseudo-representative of the community (pseudo because they know too much now). I am keenly aware of how much they hate it when things change and if you give them something simple they get used to you can never go back and generalize it in a way that makes it more complex. As a result, I am extremely leery of simple-looking things that look like short cuts unless I have sketched out the general solution and I know the shortcut is not going to bite me later. So like Markus, I don't think I grok the general problem here (lack of use cases) and that makes me a little worried that we'll regret something. OTOH, if we just think about it as "used to be able to say one thing about a link" and "now you can say two things about a link" then that helps.

Not sure I catch this "aside". What do you consider as a shortcut there? Using the same field for different vocabularies?

pdowler · 2021-05-20T21:13:06Z

As long as the product-type vocabulary, which says "what something is" expands to include terms beyond what ObsCore uses (different kinds of science data) it could be a general purpose way to augment the content_type.

The level 3 and 4 examples above are both using terms from the datalink/core vocabulary; it could be that we have created some confusion with the content of that vocabulary... is there a use case where you would want to specify one of those level 3 and one of those level 4 terms? If so, is it feasible to split the datalink/core vocab into two actually distinct vocabularies (I'm skeptical)? What about simply allowing multiple terms to be used to describe a link that has a complicated multi-faceted relationship to #this? I do in fact have a use case that suggests this and I don't want to get that mixed up with use of product-type, but in general being able to put multiple terms might be an alternative.

On the aside: the "simple thing" I am potentially nervous about is being strict about product_type column being just for terms from the product-type vocab, and then future evolution of that vocab is also strict and not being able to use it for other use cases. The other obvious thing I could see doing is linking to an instance of a data model and for that I'd expect to say content_type=application/x-votable+xml product_type="instance(s) of ObsCore" or something like that. So do we eventually add a base term "model" and narrower terms like "ObsCore" and "Source" and "Cube" to the product-type vocabulary? We could go that way and I'd feel a lot better about adding a strict product_type column to links now if I heard "heh, that sounds cool - we could do some VEPs for that in the near future".

msdemlei · 2021-05-21T07:17:08Z

On Thu, May 20, 2021 at 08:23:28AM -0700, Bonnarel wrote:
2 ) - when #link is not a dataproduct product_type is useless. It is not a problem per se. we can leave it empty. But maybe we want to say more about what it is in that case. Imagine #link is "Documentation". Is that a tutorial ? a refered article ? a simple html page ? a github repository ? Where do we put this information if the new field is reserved for dataproduct_type vocabulary ?

I'd say the right way to go about answering this question is to figure out: What client is supposed to consume this information, and what is it going to do with it? Following established use in linguistics, I'd call this the pragmatics of the field (cf. https://en.wikipedia.org/wiki/Pragmatics). Once we've understood that, we'll have a much better chance of figuring out if (and how) a product_type column works for the use case or if this needs to be addressed in some other way, and what kind of semantics should be put in place. Incidentally, the "other way" could also include "allow terms from a second vocabulary". Since both of them would be controlled, we can guarantee that there are no collisions between the terms from the two vocabularies, and clients could, by inspecting what vocabulary a term comes from, even figure out if something is a product-type, a (say) documentation-type or just the odd out-of-vocabulary thing that you always have to reckon with. I'm not saying that's a good idea here -- as I said, we first have to figure out exactly what the pragmatics of whatever you're after here is. But it *might* be a good idea.
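
A sketch of the "allow terms from a second vocabulary" idea, where a client works out which controlled vocabulary a term belongs to. Both term sets are hypothetical (a documentation-type vocabulary does not exist at the time of this discussion), and the helper names are invented for the example.

```python
# Hypothetical controlled vocabularies; only the idea of keeping them
# collision-free comes from the comment above.
VOCABULARIES = {
    "product-type": {"image", "spectrum"},
    "documentation-type": {"tutorial", "article"},
}


def build_index(vocabularies):
    """Map each term to its vocabulary, refusing colliding terms."""
    index = {}
    for name, terms in vocabularies.items():
        for term in terms:
            if term in index:
                raise ValueError(f"term {term!r} is in {index[term]} and {name}")
            index[term] = name
    return index


TERM_TO_VOCAB = build_index(VOCABULARIES)


def vocabulary_of(term):
    """Return the vocabulary name, or None for an out-of-vocabulary term."""
    return TERM_TO_VOCAB.get(term)


print(vocabulary_of("image"))
print(vocabulary_of("tutorial"))
print(vocabulary_of("whatever"))
```

The collision check stands in for the guarantee that controlled vocabularies can give; with uncontrolled external vocabularies no such guarantee would be available.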

Bonnarel · 2021-05-23T20:41:25Z

On 20/05/2021 at 23:13, Patrick Dowler wrote:

As long as the product-type vocabulary, which says "what something is" expands to include terms beyond what ObsCore uses (different kinds of science data) it could be a general purpose way to augment the content_type.

Which means that ObsCore will only use a reduced part of the new dataproduct_type vocabulary.

The level 3 and 4 examples above are both using terms from the datalink/core vocabulary; it could be that we have created some confusion with the content of that vocabulary... is there a use case where you would want to specify one of those level 3 and one of those level 4 terms? If so, is it feasible to split the datalink/core vocab into two actually distinct vocabularies (I'm skeptical)? What about simply allowing multiple terms to be used to describe a link that has a complicated multi-faceted relationship to #this? I do in fact have a use case that suggests this and I don't want to get that mixed up with use of product-type, but in general being able to put multiple terms might be an alternative.

For level 3 and 4 I may differ from Ada, so we would have to poke her to know what she actually meant. To me both 3 and 4 were actually qualifying the relationship between #this and #link. It's actually splitting the current semantics field in two parts. And I was afraid that if we add a 4th field to tackle content_type, product_type, relationship and information, we will end up with most of them empty in many use cases. I imagined that we should wait for a recommended data model annotation mechanism and try to solve this by adding an annotation on top of the current table. The use case I see is the one discussed in VEP006. Imagine we have a new relationship term "ancestor", a term of wider extent than progenitor (see VEP006 discussion), able to encompass #dark and #flat fields used in the calibration process to obtain #this, etc., as well as #progenitors. Then we could have both ancestor and dark? While calibration and dark could be for a dark file which can be used for calibrating the current #this? But what are the other consequences of having two terms in the semantics field?

On the aside: the "simple thing" I am potentially nervous about is being strict about product_type column being just for terms from the product-type vocab, and then future evolution of that vocab is also strict and not being able to use it for other use cases. The other obvious thing I could see doing is linking to an instance of a data model and for that I'd expect to say content_type=application/x-votable+xml product_type="instance(s) of ObsCore" or something like that. So do we eventually add a base term "model" and narrower terms like "ObsCore" and "Source" and "Cube" to the product-type vocabulary? We could go that way and I'd feel a lot better about adding a strict product_type column to links now if I heard "heh, that sounds cool - we could do some VEPs for that in the near future".

In principle I agree with your concern. But ObsCore and Source will not relate to the same semantics value. ObsCore should be metadata (as provenance), while Source could be "derived" or "target" if it existed.


Bonnarel · 2021-05-23T21:05:37Z

On 21/05/2021 at 09:17, msdemlei wrote:

On Thu, May 20, 2021 at 08:23:28AM -0700, Bonnarel wrote:
> 2 ) - when #link is not a dataproduct product_type is useless. It is not a problem per se. we can leave it empty. But maybe we want to say more about what it is in that case. Imagine #link is "Documentation". Is that a tutorial ? a refered article ? a simple html page ? a github repository ? Where do we put this information if the new field is reserved for dataproduct_type vocabulary ?

I'd say the right way to go about answering this question is to figure out: What client is supposed to consume this information, and what is it going to do with it? Following established use in linguistics, I'd call this the pragmatics of the field (cf. https://en.wikipedia.org/wiki/Pragmatics).

Something very basic: give a more accurate characterization of what kind of documentation is going to be retrieved. At the DataLink table display level this can be only for selection of, let's say, "references". When you retrieve, it could also be used to announce the nature of the output on a retrieval page.

Once we've understood that, we'll have a much better chance of figuring out if (and how) a product_type column works for the use case or if this needs to be addressed in some other way, and what kind of semantics should be put in place. Incidentally, the "other way" could also include "allow terms from a second vocabulary". Since both of them would be controlled, we can guarantee that there are no collisions between the terms from the two vocabularies, and clients could, by inspecting what vocabulary a term comes from, even figure out if something is a product-type, a (say) documentation-type or just the odd out-of-vocabulary thing that you always have to reckon with. I'm not saying that's a good idea here -- as I said, we first have to figure out exactly what the pragmatics of whatever you're after here is. But it *might* be a good idea.

So, my proposal was: by default dataproduct_type, and if needed a full vocabulary term with URI.
Pat's proposal is: extend the scope of the dataproduct_type vocabulary.
Your proposal: let the client recognize the terms coming from different IVOA vocabularies. Hopefully there is no possible confusion?


msdemlei · 2021-05-25T12:30:07Z

On Sun, May 23, 2021 at 02:05:50PM -0700, Bonnarel wrote:
On 21/05/2021 at 09:17, msdemlei wrote:
> I'd say the right way to go about answering this question is to figure out: What client is supposed to consume this information, and what is it going to do with it? Following established use in linguistics, I'd call this the pragmatics of the field (cf. https://en.wikipedia.org/wiki/Pragmatics).

Something very basic : give a more accurate characterization of what kind of documentation is going to be retrieved.

Sure -- but wouldn't a client just send the URL to a web browser regardless of that more accurate characterisation? If so, I don't see a need for machine-readable information, and whatever is in description is enough (because the recipient is a human). If the intended pragmatics are different (i.e., it's not about "figure out where to send this link"), we may need something else, but as I said we'd first need to figure out that other pragmatics.

msdemlei · 2021-05-25T12:30:21Z

On Sun, May 23, 2021 at 01:41:37PM -0700, Bonnarel wrote:
> The use case I see is the one discussed in VEP006. Imagine we have a new relationship term "ancestor", a term of wider extent than progenitor (see the VEP006 discussion), able to encompass #dark and #flat fields used in the calibration process to obtain #this, etc., as well as #progenitors. Then we could have both ancestor and dark? While calibration and dark could both apply to a dark which can be used for calibration?

Well, this problem immediately goes away when we correctly construct the vocabulary, which is what the discussion on VEP-006 is all about: When the vocabulary is a tree, either dark ⊂ ancestor, in which case there's no need to give it (all dark-s are ancestor-s, and the machine knows it), or dark ∩ ancestor = ∅, in which case it cannot be both. It *is* a bit of an effort to construct vocabularies that way (as evinced by the VEP-006 discussion), but the payoff is that machines can figure out these things, and that's a huge payoff given that proper annotation is difficult for humans.
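The tree argument can be sketched in a few lines. This is only an illustration: the `WIDER` table below is hypothetical and shows the "dark ⊂ ancestor" case from the question above, not the actual outcome of VEP-006 or the content of any published vocabulary.

```python
# Hypothetical tree: each term has at most one wider (parent) term.
# The placement of terms is an assumption made for illustration only.
WIDER = {
    "dark": "ancestor",
    "flat": "ancestor",
    "progenitor": "ancestor",
    "ancestor": None,
    "calibration": None,
}


def is_a(term, candidate, wider=WIDER):
    """True if term equals candidate or candidate is one of its wider terms."""
    while term is not None:
        if term == candidate:
            return True
        # Walk one step up the tree; unknown terms end the walk.
        term = wider.get(term)
    return False
```

A client that sees a link labelled dark can then infer ancestor on its own via `is_a("dark", "ancestor")`, so the annotator never has to give both terms.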

Bonnarel · 2021-06-30T09:26:37Z

As long as the product-type vocabulary, which says "what something is" expands to include terms beyond what ObsCore uses (different kinds of science data) it could be a general purpose way to augment the content_type.

The level 3 and 4 examples above are both using terms from the datalink/core vocabulary; it could be that we have created some confusion with the content of that vocabulary... is there a use case where you would want to specify one of those level 3 and one of those level 4 terms? If so, is it feasible to split the datalink/core vocab into two actually distinct vocabularies (I'm skeptical)? What about simply allowing multiple terms to be used to describe a link that has a complicated multi-faceted relationship to #this? I do in fact have a use case that suggests this and I don't want to get that mixed up with use of product-type, but in general being able to put multiple terms might be an alternative.

Well, I think it could be interesting to allow some combination of terms in semantics. I see some use cases, as I explained on the semantics mailing list for the VEP006 discussion. But if it is to be useful for clients, should we not restrict the allowed combinations to some predefined list?

On the aside: the "simple thing" I am potentially nervous about is being strict about the product_type column being just for terms from the product-type vocab, and then future evolution of that vocab also being strict and not being able to use it for other use cases. The other obvious thing I could see doing is linking to an instance of a data model and for that I'd expect to say content_type=application/x-votable+xml product_type="instance(s) of ObsCore" or something like that. So do we eventually add a base term "model" and narrower terms like "ObsCore" and "Source" and "Cube" to the product-type vocabulary? We could go that way and I'd feel a lot better about adding a strict product_type column to links now if I heard "heh, that sounds cool - we could do some VEPs for that in the near future".

pdowler · 2021-10-14T16:13:19Z

optional content_qualifier field added in PR 57 to resolve this issue.

Bonnarel · 2021-10-14T16:14:11Z

great !!! On 14/10/2021 at 18:13, Patrick Dowler wrote:

…

optional content_qualifier field added in PR 57 to resolve this issue.

pdowler added the TBD label May 4, 2020

Bonnarel mentioned this issue May 5, 2020

adding possibility of completing mime-type by dataproduct_type parameter #43

Closed

pdowler added the 1.1 label May 6, 2020

Bonnarel mentioned this issue Dec 9, 2020

Datalink-#51 #56

Closed

pdowler closed this as completed Oct 14, 2021

Bonnarel commented May 20, 2021

---> could we find a more generic term than product_type for describing the nature of the #link. (I understand that content_qualifier is ruled out)
---> can we consider that the default vocabulary there is the dataproduct_type one and that we allow alternative complete uri ivoa terms if needed ?
