{links} response content-type header #91
Answer by Mark Taylor:
Given that the proposed text changes are rather convoluted,
Unless the incoming request included a RESPONSEFORMAT parameter
So: use the datalink content-type unless you've got a good reason not to.
This would technically be a breaking change from DL 1.0 to 1.1.
Mark
|
Hum, François
|
I'm inclined to agree with this change, although it's a shame to make
changes because browsers don't actually support the standards (VOTable
section 8 lists the RFCs that justify the application/x-votable+xml mime
type).
Given the content in the above section of VOTable, DataLink should say the
minimum necessary. Specifying that the output content-type header has to
match a specified RESPONSEFORMAT (or fail) should be sufficient and that
also means one can make a links URL intended to be opened in a browser...
the section on RESPONSEFORMAT in DALI already says that quite explicitly
and for this very reason:
"If a client requests a format by specifying the media type (as opposed to
one of the short forms), the response that delivers that content must set
that media type in the Content-Type header. This is only an issue when a
format has multiple acceptable media types (e.g., VOTable). This allows the
client to control the Content-Type so that it can reliably cause specific
applications to handle the response (e.g., a browser rendering a VOTable
generally requires the text/xml media type)."
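The DALI rule quoted above could be sketched on the server side roughly as follows. This is only an illustration: the function name, the set of media types treated as VOTable, and the handling of the "votable" short form are assumptions, not text from DALI or DataLink.

```python
from typing import Optional

# Default media type for a {links} response, as discussed in this thread.
DATALINK_TYPE = "application/x-votable+xml;content=datalink"

# Media types this sketch treats as VOTable (assumption, not a normative list).
VOTABLE_TYPES = {"application/x-votable+xml", "text/xml"}


def response_content_type(responseformat: Optional[str]) -> str:
    """Pick the Content-Type header for a hypothetical {links} response."""
    # No RESPONSEFORMAT, or the short form: use the datalink media type.
    if responseformat is None or responseformat.strip().lower() == "votable":
        return DATALINK_TYPE
    requested = responseformat.strip()
    base = requested.split(";", 1)[0].strip().lower()
    # A media type was requested explicitly: echo it back unchanged.
    if base in VOTABLE_TYPES:
        return requested
    # Anything else is not supported by this sketch, so fail.
    raise ValueError("unsupported RESPONSEFORMAT: " + requested)
```

With this, a link meant to be opened in a browser would carry RESPONSEFORMAT=text/xml, and everything else keeps the datalink media type.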
The only "change" needed in DataLink is to relax and say that any valid
VOTable mimetype can be used.
For completeness: there is also
ivoa-std/VOTable#15 which is about the content
param itself.
So, generally +1 from me but it's mainly clarifying/relaxing the DataLink
spec because details are in DALI and VOTable.
--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada
On Mon, 9 Jan 2023 at 08:42, François Bonnarel wrote:
From Markus Demleitner today 👍
there's one skeleton in the closet
that recently came up again, and perhaps we can still somehow bury
it before RFC.
The problem is the following text:
Unless the incoming request included a RESPONSEFORMAT parameter
requesting a different format, the content-type header of the
response MUST be ``application/x-votable+xml'' with the ``content''
parameter set to ``datalink'', with the canonical form given in
\ref{sec:mime} strongly recommended.
The purpose of this language is that clients can (relatively) easily
work out that they are dealing with a Datalink document regardless of
where they get it from (as long as it's http). I think that's a good
idea, although I'm not aware of a client that actually looks at
content-type when retrieving things that could be datalink documents.
But at the same time this is blocking an important use case:
Displaying datalink documents in the browser (Background:
http://mail.ivoa.net/pipermail/dal/2021-April/008426.html and
https://github.com/msdemlei/datalink-xslt). When I wrote the XSLT
for that in ~2016, I planned it as a temporary hack until there are
good datalink clients, but now I think letting people open datalinks
with the browser and getting something actually usable is a major use
case in itself.
The trouble with this: Web browsers will not apply the XSLT to
documents with a media type of
application/x-votable+xml;content=datalink. I have to give them
text/xml to start the whole magic.
I hence at the moment have the choice of violating the standard or
breaking a use case important to me. I weaseled around that first by
inspecting user agent strings and only returning text/xml if the user
agent looked as if I was dealing with a web browser, praying nobody would
notice. But that broke rather quickly (I forget the details), and I
switched to inspecting the accept header. If I find a text/html in
there, I return text/xml (yeah, it's that twisted), otherwise I'm
compliant with the datalink spec.
But it's still a violation of the standard. I had hoped programmatic
use would not be impacted, but it turns out that, for instance, the
JVMs earlier than 11 actually indicate acceptance of text/html, too.
Sigh.
So... it's trouble, and I have not found any solution that doesn't
make me cringe. But I increasingly have the impression that ignoring
the problem will only make matters worse.
The least horrible proposal I have would be to replace the text
quoted above:
When a datalink service returns a datalink VOTable (i.e., absent a
RESPONSEFORMAT parameter requesting something else), it MUST
indicate that in the response's content-type header. When the
request's accept header includes ``application/x-votable+xml'', then it
MUST be ``application/x-votable+xml'' with the ``content'' parameter set to
``datalink'', with the canonical form given in
\ref{sec:mime} strongly recommended. Otherwise, any legal VOTable
media type, including text/xml, is allowed.
That is: clients wishing to do dispatch based on the datalink media
type must indicate that they accept VOTable. It's a pretty safe bet
that major browsers won't do that (and potential future VO-enabled
browsers wouldn't need the XSLT, I'm sure). And although HTTP
content negotiation isn't as popular as it should be, I think it's
implementationally not very intrusive.
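A minimal sketch of what option one would ask of a server, assuming a deliberately simplified Accept parser (it ignores q-values, including q=0) and hypothetical function names:

```python
from typing import Optional

DATALINK_TYPE = "application/x-votable+xml;content=datalink"


def accepts_votable(accept_header: str) -> bool:
    """True if the Accept header lists application/x-votable+xml.

    Media type parameters and q-values are ignored in this sketch.
    """
    for item in accept_header.split(","):
        media_type = item.split(";", 1)[0].strip().lower()
        if media_type == "application/x-votable+xml":
            return True
    return False


def content_type_option_one(accept_header: Optional[str],
                            fallback: str = "text/xml") -> str:
    """Datalink media type for VOTable-aware clients, else any VOTable type."""
    if accepts_votable(accept_header or ""):
        return DATALINK_TYPE
    # "Any legal VOTable media type" is allowed here; text/xml is chosen
    # so that browsers apply the XSLT.
    return fallback
```

A client that wants to dispatch on the media type would then send something like `Accept: application/x-votable+xml;content=datalink`, while a browser's default Accept header falls through to text/xml.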
The only alternative I could come up with would be to codify what I'm
currently doing:
Unless the incoming request included a RESPONSEFORMAT parameter
requesting a different format, and unless the user agent indicates
it will accept text/html, the content-type header of the response
MUST be ``application/x-votable+xml'' with the ``content''
parameter set to ``datalink'', with the canonical form given in
\ref{sec:mime} strongly recommended.
We could then have a footnote explaining what the text/html exception
is supposed to do. The downside here is that it's really an ugly
hack to return text/xml when accept has text/html, and there's too
much library code that wantonly sticks text/html into accept behind
the programmers' backs.
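The text/html rule, and the downside just described, can be illustrated with a short sketch. The function names are hypothetical, and the "library default" Accept string below is only an assumed example of the kind of header an HTTP library might send without the programmer asking for it.

```python
from typing import Optional

DATALINK_TYPE = "application/x-votable+xml;content=datalink"


def accepts_html(accept_header: str) -> bool:
    """True if text/html appears anywhere in the Accept header."""
    for item in accept_header.split(","):
        if item.split(";", 1)[0].strip().lower() == "text/html":
            return True
    return False


def content_type_option_two(accept_header: Optional[str]) -> str:
    """Return text/xml to anything claiming to accept text/html."""
    if accepts_html(accept_header or ""):
        return "text/xml"
    return DATALINK_TYPE


# A browser-like header and an assumed library default: both get text/xml,
# although only the first one is actually a browser.
browser = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
library_default = "text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2"
for header in (browser, library_default):
    print(content_type_option_two(header))
```

The second case is the problem: a programmatic client that never asked for HTML loses the datalink media type.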
I think given the media type hasn't seen too much use so far anyway
and when a client wants to use it, it would be new code anyway, I'd
go for option one.
But if anyone had a less painful idea, that'd be even better. Does
anyone?
|
On Mon, Jan 09, 2023 at 12:46:50PM -0800, Patrick Dowler wrote:
> The only "change" needed in DataLink is to relax and say that any valid
> VOTable mimetype can be used.
Hm, no, I don't think I'd like that. You see, with the current
regulation a client can know that it gets a *datalink* document (as
opposed to an SDM-compliant spectrum, say, or a TAP result) as soon
as it parses the HTTP headers. Without it, it has a very hard time
figuring this out even when it has the full document; DALI circulates
the idea to have a standardID info for that purpose, but not even
DaCHS does this yet (it will in the future).
Side note: *If* we relax the media type thing in Datalink 1.1, we
really must make the standardID INFO mandatory at the same time.
So, I think we need to spell out conditions under which the client
can reliably expect to get the content=datalink when something is a
datalink document. Use case: Splat gets a link sent via SAMP and
retrieves it. This could be an SDM VOTable or a Datalink document.
How will it tell the two situations (requiring entirely different
handling) apart?
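For illustration, a client in that position could test the header along these lines. This is a sketch using Python's standard library header parser; the function name is hypothetical, and it only helps when the server actually sends the datalink media type.

```python
from email.message import Message


def is_datalink_content_type(header_value: str) -> bool:
    """True if a Content-Type value is application/x-votable+xml
    with the content parameter set to datalink."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_type() lower-cases the type/subtype and drops parameters.
    if msg.get_content_type() != "application/x-votable+xml":
        return False
    # get_param() strips optional quotes around the parameter value.
    content = msg.get_param("content", "")
    return isinstance(content, str) and content.lower() == "datalink"
```

Given a `Content-Type` of `application/x-votable+xml;content=datalink` the client would go down the datalink path; with plain `application/x-votable+xml` or `text/xml` it still cannot tell an SDM spectrum from a links table.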
In the end, we somehow need to reconcile the requirements from the
non-aware browsers (which need text/xml) and the aware clients (which
need a/x-v;content=datalink), which means that the server needs some
halfway reliable way to tell the two apart. And that's how I came up
with my two proposals:
(1) We either say "detect unaware clients by looking for text/html in
accept",
(2) or we say "detect aware clients by looking for
a/x-v;content=datalink in accept".
(1) would be less intrusive, except that clients may send text/html
without even knowing given the sometimes slightly sorry state of
content negotiation even in widely-used libraries; (2) is another
thing to think about when retrieving VO resources, which isn't nice.
Of course, if someone has a good idea on how to reconcile the
competing requirements without (ab-) using the dreaded HTTP content
negotiation, I'd be most interested to learn about them.
|
In practice, as author of a client that does have to figure out when something is a links table and when it's a catalogue or whatever, I generally do that by duck typing - if it's a VOTable with most or all of the columns required by DataLink sec 3.2 then I can treat it as a links document. I'm likely to carry on doing that in preference to looking at the content-type for practical reasons - content-types are not always present and correct, they can be fiddly to obtain and parse, and you might be acquiring one of these tables in some way other than HTTP.
When I would like some signal about whether something is or is not a links table is before I've acquired it, to give the user a hint about whether they will want to download it. In principle the HTTP content-type in conjunction with a HEAD request could help there, but really I want that information without needing any HTTP interaction.
So from my point of view some relaxation of constraints on the HTTP content-type header (like Pat's: any valid content-type is OK, or mine: use SHOULD not MUST) is not likely to present practical problems. Other resource consumers may have different views of course.
|
I could change my "any valid VOTable content-type" to "SHOULD use
application/x-votable+xml;content=datalink unless the client asks for
something else via the RESPONSEFORMAT param". That keeps the current
approach as the preferred one but lets the client control it, and the
"should" allows services to detect/help web browsers if they want to.
set hat=implementor;
If our site had a link to a links doc that people could click on, I'd
tell the page author to use RESPONSEFORMAT and I'd resist writing code to
detect browsers. Yeah, that doesn't help some other web site that links to
a {links} result at CADC, but it would be really hard to convince me to
write and maintain that browser detection code.
--
Patrick Dowler
Canadian Astronomy Data Centre
Victoria, BC, Canada
|
On Tue, Jan 10, 2023 at 03:08:34PM -0800, Patrick Dowler wrote:
> I could change my "any valid VOTable content-type" to "SHOULD use
> application/x-votable+xml;content=datalink unless the client asks for
> something else via the RESPONSEFORMAT param" so prefer the current approach
> but allow client to control it and then the "should" allows services to
> detect/help web browsers if they want to.
Hm... while the SHOULD of course fixes my immediate problem, it will
still keep SPLAT and friends from being able to rely on the
content=datalink (it's just a SHOULD, after all), and that makes the
whole media type rather pointless (in this situation; it's of course
still fine for the content_type column).
> set hat=implementor;
> If our site had a link to a links doc that people could click on, I'd
> tell the page author to use RESPONSEFORMAT and I'd resist writing code to
Ah, that's not my situation. My situation is that people get a
datalink URL *from somewhere* and paste it into the browser. And
blindly adding RESPONSEFORMAT=text/xml to all of my datalinks would
not only be ugly, it would again break the "make SPLAT recognise a
datalink up front" case.
> detect browsers. Yeah, that doesn't help some other web site that links to
> a {links} result at CADC, but it would be really hard to convince me to
> write and maintain that browser detection code.
Well, you don't need to if you don't want to serve XSLT (though,
personally, I think that's a shame). An implementation stubbornly
returning x-v;content=datalink still is correct whatever we do, and
in particular we don't require any sort of support for HTTP content
negotiation.
So, it seems to me the tableau of possible solutions now looks like
this:
(1) Open up ("SHOULD") the content type and rely on duck typing for
datalink detection as per Mark's suggestion (my take: possible, but
if we go there, I'd say we need to find a spot to recommend that
somewhere in the document, as it feels somewhat unorthodox as the
recommended way to do this kind of thing).
(1a) Open up the content type and require a standardID info for
Datalink 1.1.
(2) Keep the MUST unless there's an accept of text/html
(3) Keep the MUST if there's an accept of x-v;content=datalink
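For reference, the duck typing that option (1) relies on might look roughly like this. It is a sketch under stated assumptions: the column list is the set of {links} fields as defined in DataLink 1.0, the threshold for "most or all" is arbitrary, and the function name is hypothetical. It also looks at all FIELDs in the document rather than table by table.

```python
import xml.etree.ElementTree as ET

# {links} columns as defined in DataLink 1.0 (assumed here to be the
# relevant set; DataLink 1.1 adds further columns).
LINKS_COLUMNS = {
    "ID", "access_url", "service_def", "error_message",
    "description", "semantics", "content_type", "content_length",
}


def looks_like_links_table(votable_xml: str, min_matches: int = 6) -> bool:
    """Guess whether a VOTable is a links table from its FIELD names."""
    root = ET.fromstring(votable_xml)
    names = set()
    for element in root.iter():
        # Strip any XML namespace so all VOTable versions are handled alike.
        if element.tag.rsplit("}", 1)[-1] == "FIELD":
            name = element.get("name")
            if name:
                names.add(name)
    # "Most or all" of the columns: the threshold is an arbitrary choice.
    return len(names & LINKS_COLUMNS) >= min_matches
```

This works on a saved file or a table received by other means than HTTP, which is the practical advantage named in the thread; the price is that the document has to be fetched and parsed first.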
Unless we find something sucking less than any of these, I suppose
it's time for a pain level poll
(https://blog.g-vo.org/building-consensus.html#scale).
Here's my list: (1) - 6; (1a) - 4; (2) - 4; (3) - 5
|
Hi there
On 11/01/2023 at 09:45, msdemlei wrote:
(1a) Open up the content type and require a standardID info for
Datalink 1.1.
We already took a step in this direction in the 1.1 text, which reads:
The DALI specification states that a standardID INFO element with
name “standardID” and the actual standardID string as a value SHOULD
be provided. It is recommended to include such an element to help
users and
applications to identify VOTables as results of DataLink services this
way:
whereas DataLink 1.0 didn't say anything.
We can replace "It is recommended" by "It is required" if this solves the
problem.
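If the INFO element were made mandatory, a client-side check could be as small as the following sketch. The function name is hypothetical, and the standardID prefix used here (ivo://ivoa.net/std/DataLink#links, without a version suffix) is an assumption that should be checked against the DataLink text.

```python
import xml.etree.ElementTree as ET

# Assumed prefix of the {links} standardID; the version suffix is left open.
DATALINK_STANDARD_ID_PREFIX = "ivo://ivoa.net/std/DataLink#links"


def has_datalink_standard_id(votable_xml: str) -> bool:
    """True if the VOTable carries a standardID INFO naming DataLink."""
    root = ET.fromstring(votable_xml)
    prefix = DATALINK_STANDARD_ID_PREFIX.lower()
    for element in root.iter():
        # Ignore the XML namespace, which varies between VOTable versions.
        if element.tag.rsplit("}", 1)[-1] != "INFO":
            continue
        if element.get("name") != "standardID":
            continue
        # Compare case-insensitively, as IVOA identifiers are not case-sensitive.
        if (element.get("value") or "").lower().startswith(prefix):
            return True
    return False
```

Unlike the HTTP header, this marker survives when the document is saved to disk or passed around outside HTTP.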
|
The CASDA implementation is also non-compliant with Datalink 1.0 as it always has a content-type of However, §3.3 of Datalink 1.1 still has a MUST for the content-type header being |
I am preparing some editorial changes that include MUST -> SHOULD and allow any valid VOTable mime type. I am open to making the standardID INFO mandatory in 1.1 so that clients have a clear way to detect that a links table is in there... that's somewhat better than the HTTP header anyway since it gets saved in the XML file for later use. I will make that change as well and see how it looks at review. |