diff --git a/DataLink.tex b/DataLink.tex index 103aa81..e832363 100644 --- a/DataLink.tex +++ b/DataLink.tex @@ -25,6 +25,11 @@ \newcommand{\blinks}{\{links\}} \newcommand{\attval}[2]{#1={\allowbreak}{"}#2{"}} +\newcommand{\rfcmust}{\textbf{must}} +\newcommand{\rfcshould}{\textbf{should}} +\newcommand{\rfcmay}{\textbf{may}} +\newcommand{\rfcrecommended}{\textbf{recommended}} +\newcommand{\rfcoptional}{\textbf{optional}} \begin{document} @@ -52,8 +57,8 @@ \section*{Acknowledgments} \section*{Conformance-related definitions} -The words ``MUST'', ``SHALL'', ``SHOULD'', ``MAY'', ``RECOMMENDED'', and -``OPTIONAL'' (in upper or lower case) used in this document are to be +The words ``\rfcmust'', ``\rfcshould'', ``\rfcmay'', ``\rfcrecommended'', and +``\rfcoptional'' (in upper or lower case) used in this document are to be interpreted as described in IETF standard RFC2119 \citep{std:RFC2119}. The \emph{Virtual Observatory (VO)} is a @@ -268,7 +273,7 @@ \subsubsection{Datasets linked to an astronomical source} data include images from which sources have been extracted, or imaging the object in case of extended objects, as well as additional observations such as Spectra or Time Series of the source and even spectral cubes -and Time Series of images for extended or varying objects. The {\blinks} +and Time Series of images for extended or varying objects. The \blinks\ response obtained for the source id can allow to easily retrieve all these associated data in one shot. @@ -279,21 +284,21 @@ \subsubsection{Metadata and data related to provenance entities} ProvTAP \citep{iwd:provtap} or ProvSAP \citep{iwd:provsap} DAL services. The Entity instances represent the state of the data items between various steps of the data processing flow. ``Entities'' can be hooked -to the more complete data they represent using the {\blinks} endpoint. +to the more complete data they represent using the \blinks\ endpoint. 
-\section{The \blinks~endpoint} +\section{The \blinks\ endpoint} \label{sec:linksEndpoint} -Most commonly, DataLink link lists are retrieved from \blinks~endpoints. +Most commonly, DataLink link lists are retrieved from \blinks\ endpoints. These are DALI-sync endpoints with implementor-defined names. -As specified by DALI-sync, the parameters for a request may be submitted +As specified by DALI-sync, the parameters for a request are submitted using an HTTP GET (query string) or POST action. Any service may offer zero or more datalink endpoints. -\subsection{Parameters on \blinks~endpoints} +\subsection{Parameters on \blinks\ endpoints} -On \blinks~endpoints, the ID and RESPONSEFORMAT parameters as defined +On \blinks\ endpoints, the ID and RESPONSEFORMAT parameters as defined below are mandatory. @@ -303,20 +308,20 @@ \subsubsection{ID} The ID parameter is used by the client to specify one or more identifiers. The service will return at least one link for each of the specified values. The ID values are found in data discovery services -and may be readable URIs or opaque strings. Submitting ID values in batches +and \rfcmay\ be readable URIs or opaque strings. Submitting ID values in batches may be more efficient if the client is planning to submit many such values; clients can control the size of the output by limiting the number of ID values they submit in each request. -Services may place a limit on the number of ID values they will process in one +Services \rfcmay\ place a limit on the number of ID values they will process in one request. If the client submits more ID values than a service is prepared to -process, the service should process ID values up to the limit and -{\bf must} include an overflow indicator in the output to denote that +process, the service \rfcshould\ process ID values up to the limit and +\rfcmust\ include an overflow indicator in the output to denote that the result is truncated as described in DALI. 
-The service {\bf must not} truncate the output within the set of rows +The service \rfcmust\ \textbf{not} truncate the output within the set of rows (links) for a single ID value. -If the client submits no ID values, the service must respond with a +If the client submits no ID values, the service \rfcmust\ respond with a normal response (e.g.\ an empty results table for VOTable output). The service may include service descriptors (see \ref{sec:serviceDescriptors}) @@ -331,7 +336,7 @@ \subsubsection{RESPONSEFORMAT} support for RESPONSEFORMAT is mandatory. The only output format required by this specification is VOTable with -TABLEDATA serialization; services must support this format. Clients +TABLEDATA serialization; services \rfcmust\ support this format. Clients that want to get the standard (VOTable) output format should simply ignore this parameter. @@ -344,19 +349,19 @@ \subsubsection{RESPONSEFORMAT} \end{itemize} All of these result in the standard output format. -Service implementers may support additional output formats but must follow +Service implementers \rfcmay\ support additional output formats but \rfcmust\ follow the DALI specification if they chose any formats described there. -\subsection{Registering \blinks~endpoints} +\subsection{Registering \blinks\ endpoints} Since normal datalink operations do not involve the Registry, this -specification poses no requirements to register \blinks~endpoints. +specification poses no requirements to register \blinks\ endpoints. Datalink clients also generally have no reason to inspect VOSI capabilities endpoints, and hence there are no requirements on -mentioning \blinks~endpoints in any VOSI capability documents. +mentioning \blinks\ endpoints in any VOSI capability documents. 
-Operators still wishing to declare \blinks~endpoints can do this by +Operators still wishing to declare \blinks\ endpoints can do this by giving a capability with a standardID of \begin{verbatim} ivo://ivoa.net/std/DataLink#links-1.0 @@ -365,7 +370,7 @@ \subsection{Registering \blinks~endpoints} of the DataLink standard, to avoid backward compatibility problems. This specification does not constrain the capability type used in such -declarations. The access URL of the \blinks~endpoint must be given in a +declarations. The access URL of the \blinks\ endpoint \rfcmust\ be given in a \xmlel{vs:ParamHTTP}-typed interface element. Hence, a single datalink capability could be declared as follows within @@ -401,7 +406,7 @@ \subsection{Registering \blinks~endpoints} \subsection{VOSI} Since DataLink services are not usually registered, the VOSI-capabilities endpoint -is not required; the VOSI-availability endpoint is optional. +is not required; the VOSI-availability endpoint is \rfcoptional. \section{\blinks\ Response} @@ -423,21 +428,20 @@ \subsection{DataLink MIME Type} application/x-votable+xml;content=datalink \end{verbatim} to denote that the response from that URL is a DataLink response. -This is also the MIME type for the \blinks\ response +This is also the preferred MIME type for the \blinks\ response (see \ref{sec:successfulRequests}) unless the caller has explicitly requested a specific value via the RESPONSEFORMAT parameter (see \ref{sec:responseformat}). -Services may include other MIME type parameters in the response. \subsubsection{DataLink recognition outside the context of DAL discovery services responses} When providing a column with URLs, for example outside DAL service responses or when service descriptors are not defined, if all the -URLs are to a DataLink {\blinks} endpoint, then the preferred approach +URLs are to a DataLink \blinks\ endpoint, then the preferred approach is to add a LINK element with the content type defined above. 
-If some values are to a {\blinks} endpoint and others to +If some values are to a \blinks\ endpoint and others to different content types (e.g.\ single file download), then the VOTable would need a second column to convey the content type. ObsCore utypes for access\_url and access\_format SHOULD @@ -494,9 +498,9 @@ \subsection{List of Links} \label{fig:linkFields} \end{table} -Fields must be present and values provided +Fields \rfcmust\ be present and values provided (or null) as described in Table \ref{fig:linkFields}. Each row in the table -represents one link and must have exactly one of: +represents one link and \rfcmust\ have exactly one of: \begin{itemize} \item an access\_url \item a service\_def @@ -504,9 +508,9 @@ \subsection{List of Links} \end{itemize} To facilitate consumption of large datalink results in streaming mode, all links -for a single ID value MUST be served in consecutive rows in the output. +for a single ID value \rfcmust\ be served in consecutive rows in the output. -If an error occurs while processing an ID value, there should be at least +If an error occurs while processing an ID value, there \rfcshould\ be at least one row for that ID value and an error\_message. For example, if an input ID value is not recognised or found, one row with an error\_message to that effect is sufficient. @@ -515,7 +519,7 @@ \subsection{List of Links} then one could have one or more rows with the same ID and different error\_message(s). -Services may include additional columns; this can be used to include +Services \rfcmay\ include additional columns; this can be used to include values that can be referenced from service descriptor input parameters (see \ref{sec:serviceResources}). @@ -529,7 +533,7 @@ \subsubsection{ID} \subsubsection{access\_url} -The access\_url column may contain a URL to download a single resource. +The access\_url column contains a URL to download a single resource. 
This URL can be a link to a dynamic resource (e.g.\ preview generation). Beside dereferencable URLs, it is allowed to use URI-fragments to link @@ -544,7 +548,7 @@ \subsubsection{service\_def} The service\_def column contains a reference from the result row to a separate resource. This resource describes a service as specified -in section 4. +in section \ref{sec:serviceResources}. For example, if the response document includes this resource to describe a service: \begin{verbatim} @@ -561,15 +565,15 @@ \subsubsection{service\_def} \subsubsection{error\_message} -The error\_message column is used when no accessURL can be generated for +The error\_message column is used when no access\_url or service\_def can be generated for an input identifier. If an error\_message is included in the output, the -only other columns with values should be the ID column and the semantics -column; all others should be null. +only other columns with values \rfcshould\ be the ID column and the semantics +column; all others \rfcmust\ be null. \subsubsection{description} -The description column SHOULD contain a human-readable description of +The description column \rfcshould\ contain a human-readable description of the link; it is intended for display by interactive applications and very important to help user distinguish links with same semantics (see below). @@ -587,7 +591,7 @@ \subsubsection{semantics} \citep{std:RFC3986} are completed using the base URI of the core DataLink vocabulary, \url{http://www.ivoa.net/rdf/datalink/core}. Terms from this -vocabulary must always be written as relative URIs. This means that for +vocabulary \rfcmust\ always be written as relative URIs. This means that for concepts from the core vocabulary, the value in the semantics column always starts with a hash. @@ -602,11 +606,11 @@ \subsubsection{semantics} file(s) making up what ID references. 
Since NULL values are not permitted in the semantics column, when only -an error\_message is supplied its value should be the most appropriate +an error\_message is supplied its value \rfcshould\ be the most appropriate for the link the service was trying to generate. For concepts outside the core DataLink vocabulary, the full concept URI -must be given. It should resolve to a human-readable document +\rfcmust\ be given. It \rfcshould\ resolve to a human-readable document describing what the concept means and what clients are expected to do with links annotated with it. @@ -631,26 +635,26 @@ \subsubsection{content\_type} The content\_type column tells the client the general file format (mime-type) they will receive if they use the link (access\_url or invoking a service). -For recursive DataLink links, the content\_type value should +For recursive DataLink links, the content\_type value \rfcshould\ be as specified in section \ref{sec:mime}. -This field may be null (blank) if the value is unknown. +This field \rfcmay\ be null (blank) if the value is unknown. \subsubsection{content\_length} The content\_length column tells the client the size of the download -if they use the link, in bytes. For VOTable, the FIELD must be +if they use the link, in bytes. For VOTable, the FIELD \rfcmust\ be \attval{datatype}{long} with \attval{unit}{byte}. -The value may be null (blank) +The value \rfcmay\ be null (blank) if unknown and will typically be null for links to services. \subsubsection{content\_qualifier} -The content\_qualifier column is optional. If it is present, it tells +The content\_qualifier column is \rfcoptional. If it is present, it tells the client the nature of the thing or service they will receive or access if they use the link. If the access\_url references a data product, the content\_qualifier -field should define its product type. In that case, the considerations +field \rfcshould\ define its product type. 
In that case, the considerations for the semantics column (Sect.~\ref{sect:semantics}) apply, except that the basic vocabulary is \url{http://www.ivoa.net/rdf/product-type}, and the interpretation as an RDF triple would be $$( @@ -674,12 +678,12 @@ \subsubsection{link\_auth} \verb|true| : authentication is required -This field may be null (blank) if the value is unknown. +This field \rfcmay\ be null (blank) if the value is unknown. \subsubsection{link\_authorized} The link\_authorized column tells the client whether the currently authenticated -identity is authorized to use the link. For VOTable, the FIELD must be +identity is authorized to use the link. For VOTable, the FIELD \rfcmust\ be \attval{datatype}{boolean}. This is generally a prediction to save clients from trying to use a link and getting a permission denied response. Valid values are: @@ -689,29 +693,30 @@ \subsubsection{link\_authorized} \verb|true| : current user is authorized If the value is \verb|false| and the caller tries to use the link anyway, it may be -challenged for credentials (e.g.\ HTTP 401 response with WWW-Authenticate header) or +challenged for credentials (e.g.\ HTTP 401 response with WWW-Authenticate headers) or denied (e.g.\ HTTP 403 ``permission denied''). If the value is \verb|true|, the caller should proceed with the same authentication and should expect to succeed. -This field may be null (blank) if the value is unknown. +This field \rfcmay\ be null (blank) if the value is unknown. \subsection{Successful Requests} \label{sec:successfulRequests} -Successfully executed requests should result in a response with HTTP +Successfully executed requests \rfcshould\ result in a response with HTTP status code 200 (OK) and a response in the format requested by the client or in the default format for the service. The content of the response (for tabular formats) is described above, with some additional details below. 
Unless the incoming request included a RESPONSEFORMAT parameter requesting -a different format, the content-type header of the response MUST be -``application/x-votable+xml'' with the -``content'' parameter set to ``datalink'', +a different format, the content-type header of the response \rfcmust\ be one of the +values allowed by the VOTable specification, which at the time of this writing include +``application/x-votable+xml'' and ``text/xml''. The former value is preferred +and \rfcshould\ be augmented with the ``content'' parameter set to ``datalink'', with the canonical form given in \ref{sec:mime} -strongly recommended. Contrary to +strongly \rfcrecommended. Contrary to all other uses of the string given in \ref{sec:mime}, clients wishing to evaluate the content type of the response must, however, perform a full parse @@ -727,27 +732,30 @@ \subsection{Successful Requests} \subsubsection{VOTable output} -The table of links {\bf must} be returned in a RESOURCE with -\attval{type}{results}. The table {\bf must} be in TABLEDATA serialization +The table of links \rfcmust\ be returned in a RESOURCE with +\attval{type}{results}. The table \rfcmust\ be in TABLEDATA serialization unless another serialization is specifically requested (see \ref{sec:responseformat}) and supported by the implementation. The name and UCD attributes for FIELD elements in the VOTable (and the units in one case) are specified above (see \ref{sec:listOfLinks}). -The DALI specification states that a standardID INFO element with -name ``standardID'' and the actual standardID string as a value SHOULD -be provided. It is recommended to include such an element to help users -and applications to identify VOTables as results of DataLink services -this way: +The DALI specification states that VOTable output should include an +INFO element with \attval{name}{standardID} and the standardID string as a value. \begin{verbatim} ... ... + + ... +
+ ...
\end{verbatim} - +From version 1.1 of this standard, the \blinks\ response \rfcmust\ include this +INFO element so that a table of links is easily identified by users and applications +when initially received from the service and if saved for later use. \subsubsection{Other Output Formats} @@ -764,13 +772,13 @@ \subsection{Errors} error where no ID parameter is specified. Services should return the document format requested by the client (see \ref{sec:responseformat}). For the standard -output format (VOTable) the error document {\bf must} also be VOTable. +output format (VOTable) the error document \rfcmust\ also be VOTable. For errors that occur while generating individual links, each identifier may result in a link with only an error\_message as described above. In either case (error document or per-link error\_message), -the error message must start with one of the strings in +the error message \rfcmust\ start with one of the strings in Table \ref{tab:errors}, in order of specificity. \begin{table}[ht] \begin{center} @@ -790,7 +798,7 @@ \subsection{Errors} \label{tab:errors} \end{table} -In all cases, the service may append additional useful information to the +In all cases, the service \rfcmay\ append additional useful information to the error strings above. If there is additional text, it must be separated from the error string with a colon (:) character, for example: @@ -878,7 +886,7 @@ \subsection{Descriptive PARAMs} free or custom services could be registered in an IVOA registry and thus have a resourceIdentifier to enable lookup of the record. -For standard services, the value of the accessURL PARAM must be the +For standard services, the value of the accessURL PARAM \rfcmust\ be the accessURL for the capability specified by the standardID. The accessURL is not generally usable as-is; the client must include extra parameters as described below. 
If a standardID indicates a capability that supports @@ -893,22 +901,22 @@ \subsection{Descriptive PARAMs} of the service query to a web browser. This is appropriate for both HTML documents and web interactive interfaces. -A service descriptor may contain multiple exampleURL PARAMs. +A service descriptor \rfcmay\ contain multiple exampleURL PARAMs. In exampleURL PARAMs, operators can give valid service calls as GET-able URLs in the PARAMs' value attribute. They are intended as an aid for debugging, in particular to aid users and developers in making sure a -service is still operating as expected. The PARAM's description should +service is still operating as expected. The PARAM's description \rfcshould\ give an indication of what the call will result in. End-user clients might indicate exampleURLs to the user after unexpected service failures. \subsection{Input PARAMs} -A service descriptor must contain a GROUP element with \attval{name}{inputParams} +A service descriptor \rfcmust\ contain a GROUP element with \attval{name}{inputParams} to describe user-specified input parameters of the service. There are three types of input params: params with a fixed value, params where the values come from the ``results'', and params where the value is variable and chosen/specified by the user. -For params with a fixed value (e.g. \attval{fly}{true}), the client {\bf must} +For params with a fixed value (e.g. \attval{fly}{true}), the client \rfcmust\ treat it as a required parameter and include it in the service invocation; this allows a service implementor to include constant params explicitly (and describe them via a DESCRIPTION element) rather than just include them in the ``accessURL'' without the @@ -918,9 +926,9 @@ \subsection{Input PARAMs} attribute is empty (\attval{value}{}) and the PARAM includes a ref attribute to indicate the FIELD (column) that contains the values. 
For example, a TAP query result may contain identifiers that can be used to invoke the {links} service; the FIELD with the identifiers -must have an XML ID attribute (e.g. \attval{ID}{abc}) and the input PARAM would include -the attribute \attval{ref}{abc}). When this mechanism is used, the client {\bf must} -treat it as a required parameter and the parameter and value {\bf must} be included in +\rfcmust\ have an XML ID attribute (e.g. \attval{ID}{abc}) and the input PARAM would include +the attribute \attval{ref}{abc}. When this mechanism is used, the client \rfcmust\ +treat it as a required parameter and the parameter and value \rfcmust\ be included in the service invocation. For user-specified input PARAMs the value attribute is empty (\attval{value}{}) @@ -930,10 +938,10 @@ \subsection{Input PARAMs} interfaces, operators SHOULD indicate useful ranges of parameters in MIN and MAX children or, for enumerated parameters, indicate the valid values in OPTIONS. In general, services may have parameters of this type that are optional or required and this distinction is -not currently described; services should use a child DESCRIPTION element to document any +not currently described; services \rfcshould\ use a child DESCRIPTION element to document any requirements. Clients should assume that these user-specified parameters are optional, but that specifying some of them may be necessary to have the service do something useful. -Services should respond with an informative error message if the input is not adequate to +Services \rfcshould\ respond with an informative error message if the input is not adequate to perform the operations(s). \subsection{Example: Service Descriptor for the \blinks\ Capability} @@ -943,14 +951,14 @@ \subsection{Example: Service Descriptor for the \blinks\ Capability} (see \ref{sec:resourceId}). 
In order for the service resource to refer to this FIELD, the FIELD element describing this column of the table -{\bf must} include an XML ID attribute +\rfcmust\ include an XML ID attribute that uniquely identifies the FIELD (column). -For example, a response following the ObsCore-1.0 data model +For example, a response following the ObsCore-1.1 data model would use the following: \begin{verbatim} \end{verbatim} where the ID value {\em primaryID\/} is arbitrary. @@ -996,7 +1004,7 @@ \subsection{Example: Service Descriptor for the \blinks\ Capability} The exampleURL value in the example above provides an example of a URL that use of this service descriptor could produce; -it should resolve to produce an actual result. +it \rfcshould\ resolve to produce an actual result. Although this version of DataLink only has one parameter (ID), using a GROUP and providing the service parameter name allows this recipe to be @@ -1180,7 +1188,7 @@ \subsection{Example: SODA Spectral Cutout with Custom Parameters} The BAND parameter allows the user to specify a spectral interval to extract from the spectrum and follows SODA's regulations. Its VALUES -child declares the range of wavelengths in the dataset; services should +child declares the range of wavelengths in the dataset; services \rfcshould\ always try to give information on the sensible ranges of input parameters, and clients should strive to make them easily accessible to users, if possible in the users' preferred units. 
Given that at this @@ -1276,6 +1284,8 @@ \section{Changes} \subsection{DataLink-1.1} \begin{itemize} +\item relax content-type usage to allow any valid VOTable MIME type +\item INFO element with standardID mandatory in \blinks\ response \item added optional content\_qualifier to describe link target content with terms from the product-type vocabulary \item added optional link\_auth and link\_authorized to signal whether authentication diff --git a/Makefile b/Makefile index e721219..b20c54a 100644 --- a/Makefile +++ b/Makefile @@ -7,7 +7,7 @@ DOCNAME = DataLink DOCVERSION = 1.1 # Publication date, ISO format; update manually for "releases" -DOCDATE = 2021-11-15 +DOCDATE = 2023-02-03 # What is it you're writing: NOTE, WD, PR, REC, PEN, or EN DOCTYPE = WD diff --git a/ivoatex b/ivoatex index 5be2ce2..2927181 160000 --- a/ivoatex +++ b/ivoatex @@ -1 +1 @@ -Subproject commit 5be2ce2fc7fee0225a70db38c72d6996326ae653 +Subproject commit 292718147b8126f21c5b734523c4c76766c8bafc