Proposal OGC API - Processes - Part 4: Job Management #437

Open
wants to merge 30 commits into master

Conversation

@gfenoy gfenoy (Contributor) commented Sep 23, 2024

Following the discussions during today's GDC / OGC API - Processes SWG meeting, I created this PR.

It contains a proposal for an additional part to the OGC API - Processes family: "OGC API - Processes - Part 4: Job Management" extension.

This extension was initially discussed here:

The document identifier 24-051 was registered.

@gfenoy gfenoy added the Part 4 (Job Management) OGC API - Processes - Part 4: Job Management label Sep 23, 2024
Contributor

Remove .DS_Store and add it to .gitignore so we don't have to worry about it anymore.

@gfenoy gfenoy (Contributor, Author) Sep 24, 2024

Done in df465c5.

The repo is organized as follows:

* standard - the main standard document content
  - organized in multiple sections and directories (openapi, requirements, etc.)
Contributor

this is outdated since they are now at the root openapi/schemas/processes-job-management

Contributor Author

Should be fixed in df465c5

Comment on lines 12 to 16
1. Construct a path for each "rel=http://www.opengis.net/def/rel/ogc/1.0/job-list" link on the landing page as well as for the {root}/jobs path.
2. Issue an HTTP POST request for each path.
3. Validate that the response status code is not `405 Method Not Allowed`.
Contributor

I believe an OAP Core implementation could report job-list on its landing page while only supporting the "normal" /processes/{processId}/execution. This ATS should only validate that POST /jobs is supported, since the other endpoint does not "need" to support the additional capabilities of this extension, and POST /processes/{processId}/execution should already be handled by the ATS of Core.

Contributor Author

I think you're right, so I applied the modification and removed the reference to the job-list.

I think that the /jobs URL relation type is still job-list on the landing page. I thought the ATS would check for the job-list relation type to know where to send the POST request (and used {root}/jobs following other tests from Part 1 and Part 2).

@@ -0,0 +1,2 @@
OGC API - Processes - Part 2: Transactions recommendations.
Contributor

wrong title

Contributor Author

Should be fixed in df465c5

:received-date: yyyy-mm-dd
:issued-date: yyyy-mm-dd
:external-id: http://www.opengis.net/doc/IS/ogcapi-processes-4/1.0
:keywords: process, collection, instance, spatial, data, openapi, transactions, insert, update, delete, add, remove, deploy, undeploy, REST, PUT, POST, DELETE
Contributor

"transactions" does not seem relevant. Add "jobs" instead? not sure about "collection" either.

Comment on lines 1 to 3
type: object
additionalProperties:
  $ref: "input.yaml"
Contributor

I suggest using an alternate representation:

inputs:
  type: object
  additionalProperties:
    $ref: "input.yaml"
outputs:
  type: object
  additionalProperties:
    $ref: "output.yaml"

Reasons:

  1. Although "outputs" are shown, those represent the requested outputs (i.e.: transmission mode, format, or any other alternate content negotiation parameters) submitted during the job creation request, in order to eventually produce the desired results. Often, the requested outputs depend on whichever inputs were submitted. Therefore, viewing them separately on different endpoints is not really convenient or useful.

  2. The /jobs/{jobId}/outputs endpoint can easily be confused with /jobs/{jobId}/results. The "request outputs" in this case are "parameter descriptions of eventual outputs", which are provided upstream of the execution workflow. In a way, those are parametrization "inputs" of the processing pipeline.

  3. Because OGC API - Processes core defines specific response combinations and requirements for /jobs/{jobId}/results, the /jobs/{jobId}/outputs is a convenient and natural endpoint name that an API can use to provide alternate response handling and representations conflicting with OGC API - Processes definitions. CRIM's implementation does exactly that. I would appreciate keeping that option available.

  4. As a matter of fact, older OGC API - Processes implementations (start of ADES/EMS days) actually used /jobs/{jobId}/outputs instead of /jobs/{jobId}/results. Adding /jobs/{jobId}/outputs would break those implementations.

  5. Having inputs and outputs nested under those fields (rather than at the root) allows providing even further contents that could be relevant along the inputs/outputs. For example, additional links, metadata or definitions to describe those parameters.

Contributor

see other comment about inputs

Contributor

Should we use a $ref to openEO's definition, to avoid maintaining duplicate definitions?

Comment on lines +3 to +5
enum:
- process
- openeo
Contributor

Should this be made a simple string with examples?
Do we want to create the same situation as statusCode needing an override because of the new created status?

If the long-term goal is to have job management available for any OGC API, maybe it would be better to define a requirements class that says that, for openEO, type: openeo MUST be used, and process for OGC API - Processes. A "long" Coverage processing extension could then easily define its own requirements class with type: coverage, without causing invalid JSON Schema definitions.
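A minimal sketch of that alternative; the description text and example value below are illustrative, not taken from the PR:

type: string
description: Identifier of the job definition language used to create the job.
example: process
# Each requirements class would then mandate its own value, e.g. "process" for
# OGC API - Processes and "openeo" for openEO, and a (hypothetical) coverage
# extension could add "coverage" without invalidating this schema.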

Comment on lines 7 to 10
id:
  type: string
processID:
  type: string
Contributor

The requirements need to be revised. They used the alternate name jobID when referring to the GET /jobs responses.

Similarly, process was mentioned during the meeting.
I'm not sure if processID remains relevant however, because process would be the URI, not just the processID from GET /processes/{processID}.

====
[%metadata]
label:: /req/job-management/update-response-locked
part:: If a job is locked, meaning that it is currently being processed (status set to `accepted` or `running`), the server SHALL respond with HTTP status code `423 Locked`.

In openEO that's queued or running; we also include an error message in the JSON payload.

label:: /req/job-management/update-response-locked
part:: If a job is locked, meaning that it is currently being processed (status set to `accepted` or `running`), the server SHALL respond with HTTP status code `423 Locked`.
part:: The response body SHALL be based upon the OpenAPI 3.0 schema https://raw.githubusercontent.com/opengeospatial/ogcapi-processes/master/core/openapi/schemas/exception.yaml[exception.yaml].
part:: The `type` of the exception SHALL be “http://www.opengis.net/def/exceptions/ogcapi-processes-4/1.0/locked”.

What does this mean?

Contributor

Not sure about the question here?
The reported exception type (https://github.com/opengeospatial/ogcapi-processes/blob/master/openapi/schemas/common-core/exception.yaml#L7) must be the http://www.opengis.net/def/exceptions/ogcapi-processes-4/1.0/locked.
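For illustration, a 423 response body conforming to exception.yaml could then look like this (the title and detail values are made up):

{
  "type": "http://www.opengis.net/def/exceptions/ogcapi-processes-4/1.0/locked",
  "title": "Job locked",
  "status": 423,
  "detail": "The job is currently running and cannot be updated."
}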


Oh, I guess we have different error handling to consider, too.

[%metadata]
label:: /req/ogcapi-processes/create-body
part:: The body of the POST request SHALL be based upon the OpenAPI 3.0 schema https://github.com/opengeospatial/ogcapi-processes/blob/master/openapi/schemas/processes-workflows/execute-workflows.yaml[execute-workflows.yaml]
part:: The media type `application/json` SHALL be used to indicate that request body contains a processes description encoded as an <<rc_ogcapi-processes,OGC API - Processes>>.

openEO also uses application/json... How do we expect to resolve such conflicts?

Contributor

====
[%metadata]
label:: /req/ogcapi-processes/update-body
part:: The media type `application/ogcapi-processes+json` SHALL be used to indicate that request body contains a job encoded as an <<rc_openeo,OpenEO>>.

Is that mixed up with extensions/job_management/standard/requirements/ogcapi-processes/create/REQ_body.adoc?

Contributor

====
[%metadata]
label:: /req/openeo/update-body
part:: The media type `application/json` SHALL be used to indicate that request body contains a job encoded as an <<rc_openeo,OpenEO Process Graph>>.

Same as above. If this is for PATCH /jobs/:id, a user must submit a (partial) job definition.
If this is for something else, a user would probably submit a UDP, not a process graph (process graph is generally not used in the openEO API spec as a standalone entity; UDP is used instead, which has a different schema).

Contributor

Here (and for other job encodings, for that matter), we might need to be even looser on the requirement, depending on the referenced schemas.

For example, a PATCH /jobs/{jobId} could update only part of the definition (say, only modify the desired output format without providing the rest of the contents again). If the referenced schema defines some properties as required, a partial body omitting them could technically be invalid, even though combining the submitted body (the modified output format) with what was already defined in the job would form a valid document.
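For instance, a partial update touching only one output format might be as small as the following sketch (output name and media type are illustrative; the merge semantics are exactly what would need to be specified):

PATCH /jobs/{jobId} HTTP/1.1
Content-Type: application/json

{
  "outputs": {
    "result": {
      "format": { "mediaType": "application/geo+json" }
    }
  }
}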


* [[[Common_Workflow_Language,1]]], Peter Amstutz, Michael R. Crusoe, Nebojša Tijanić (editors), Brad Chapman, John Chilton, Michael Heuer, Andrey Kartashov, Dan Leehr, Hervé Ménager, Maya Nedeljkovich, Matt Scales, Stian Soiland-Reyes, Luka Stojanovic (2020): Common Workflow Language, v1.2. Specification, Common Workflow Language working group. https://w3id.org/cwl/v1.2/

* [[[OpenEO_Process_Graphs,2]]], OpenEO: OpenEO Developers API Reference / Process Graphs. https://openeo.org/documentation/1.0/developers/api/reference.html#section/Processes/Process-Graphs
@m-mohr m-mohr Oct 7, 2024

As mentioned before, I think this should link to either jobs or UDPs (or both).


. Be authenticated,
. Have "modification privileges" on the jobs offered through the API,
. Have access to one or more of the POST and/or PUT methods on the jobs /jobs/{jobId} endpoints.

PATCH instead of PUT?


The API definition, as defined in Clause 7.3 from <<OAProc-1>>, must reflect this in the resource paths and their available methods.

Examples in the Clauses specifying the requirements classes focus on the mechanics of the POST, PUT, and DELETE methods and exclude authentication. Since authentication will typically be required for all DRU requests, this section provides some examples/guidance:
@m-mohr m-mohr Oct 7, 2024

PATCH instead of PUT?

DRU only or should Creating (i.e. CRUD) also be restricted to authenticated users?

Contributor

This looks like a copy-paste typo from Part 2: DRU. Indeed, it should be for any CRUD. Whether restricted or not (partially or completely) is left up to the implementation.


Examples in the Clauses specifying the requirements classes focus on the mechanics of the POST, PUT, and DELETE methods and exclude authentication. Since authentication will typically be required for all DRU requests, this section provides some examples/guidance:

The OpenAPI definition exposed by the server will declare the authentication schemes that an implementation of Processes - Part 4 (Job Management) supports for each operation (or for all operations in the API implementation).

Only if it actually offers an OpenAPI definition ;-)

Contributor

I propose we remove the following examples entirely. It is not relevant to go into detail about security representations here, since there are many implementations out there, and it creates false expectations when reading this document.

There are ongoing discussions in the OGC API - Common / Security SWG about similar concerns, namely that OpenAPI should not be considered the de facto place to look for authentication schemes (it can provide them for convenience, but it is not the "reference").


The `OGC API - Processes - Workflow Execute Request` class defines that jobs can be created from an OGC API - Processes - Workflow Execute Request.

The `OpenEO Process Graph` class defines that jobs can be created from an OpenEO Process Graph.

I think later we need to go through the document and replace all occurrences of Process Graph with UDP...

Finally, the HTTP POST method on `/jobs/{jobId}/results` is used to start a job.

Creating or updating a job requires that a formal description of the new or
updated jobs be provided by the client. This Standar does not

Suggested change:
- updated jobs be provided by the client. This Standar does not
+ updated jobs be provided by the client. This Standard does not

|------------------------------------------------------------>|
| |
| HTTP/1.1 201 Created |
| Location: /jobs/{jobId} |

In openEO we also return an OpenEO-Identifier header, which just includes the ID.
We can't reuse this header here due to the "OpenEO-" prefix, but maybe a generic X-OGC-Identifier or so would work? Is this valuable information to have?

Contributor

What could be considered is a combination of Content-Location and Content-ID.

I don't see an issue with adding those, but I don't think they would be required, since the information can be obtained from the response contents (it is assumed here that job status+id are returned, as per other comments).
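As a sketch, the creation response could then carry both location headers (values illustrative; Content-ID is left out since its standing in plain HTTP is questioned below):

HTTP/1.1 201 Created
Location: /jobs/{jobId}
Content-Location: /jobs/{jobId}
Content-Type: application/json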


Content-Location is weird as there's already the standardized Location header.

Contributor

Location is to indicate where to redirect to. Content-Location describes where the contents of the resource can be found. In this case, both URIs are the same, but they mean different things.

@m-mohr m-mohr Oct 9, 2024

Interesting, I have never seen that before... openEO only uses Location right now. If Content-Location is optional, there is no issue though.


Is there an official RFC for Content-ID via HTTP Headers? I found one for Content-Location, but failed to do so for Content-ID (just found one that was related to emails).

| POST /jobs/{jobId}/results HTTP/1.1 |
|------------------------------------------------------------>|
| |
| HTTP/1.1 200 OK |

In openEO we return 202 due to the HTTP status code semantics.

Contributor

In this case, the alignment is with OGC API - Processes that defines sync execution by default, which returns the results contents inline, hence the 200. However, 202 could also be returned with a Job Status response if the job was previously configured to run asynchronously.

Things to consider for the resolution here:

  • job definition enforcing some execution strategy
  • process definition enforcing some execution strategy
  • Prefer header in POST /jobs/{jobId}/results to request an execution strategy

Another important item to consider here is that POST /jobs/{jobId}/results only makes sense if the POST /jobs was also negotiated with a status: created. This is because POST /jobs can also return the 200/201 directly, with the sync/async execution immediately triggered. In such a case, POST /jobs/{jobId}/results should respond with HTTP 423 Locked since the job would already be accepted/running/completed.

It wouldn't hurt to have more examples here to represent each combination.
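For example, the asynchronous variant of the exchange above might look like this (the payload follows the minimal statusInfo.yaml properties; the header usage is only one possible resolution):

POST /jobs/{jobId}/results HTTP/1.1
Prefer: respond-async

HTTP/1.1 202 Accepted
Content-Type: application/json

{
  "jobID": "...",
  "type": "process",
  "status": "running"
}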

@m-mohr m-mohr Oct 9, 2024

Can the backend decide what the default execution mode is? There is not really any negotiation to get "created" in openEO, because there might be operations that you may want to run before actually starting the computation (e.g. estimating the costs).

In openEO it's certainly async, and for OAP it seems to be sync (which is a bit counterintuitive for me, but that might be because we actually call jobs "batch jobs" in openEO, and batch jobs in synchronous mode feel wrong).

Contributor

IMO, yes, the backend should be free to decide.

I agree with you about OAP being counter-intuitive in defaulting to sync. It is this way solely to help implementations support sync-only, which does not require any "job" concept. However, given that Part 4 is all about "job management", I feel implementations should be free to pick their own default. Even in Core, implementations that do support "job" usually implement it because they need async by default...

There should probably be a way to advertise which is the default. In OAP v1.0, there was actually a poorly documented method, which was to order the jobControlOptions with the first value being the default execution mode. This allowed per-process defaults.

With Part 4, we must actually consider 3 execution modes when creating jobs:

  • sync (technically OAP's default)
    • starts execution right away
    • returns the results inline
  • async
    • queued immediately
    • starts whenever resources are available
    • returns a job status response
  • create
    • creates the job
    • not started until triggered* (POST /jobs/{jobId}/results)
    • returns a job status: created response

(*) triggering can be done as sync or async using Prefer, as if submitted on POST /processes/{processId}/execution

Therefore, openEO's default is actually create followed by async "on trigger".
I would like to distinguish this follow-up async trigger from the "typical" async in OAP. Otherwise, the specification is confusing by reusing the same terms.
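To make the create mode concrete, here is a sketch; how the client asks for it (header, body field, or server default) is exactly the open question, and the identifiers are illustrative:

POST /jobs HTTP/1.1
Content-Type: application/json

{ ...job definition... }

HTTP/1.1 201 Created
Location: /jobs/{jobId}

{
  "jobID": "...",
  "type": "process",
  "status": "created"
}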

@@ -3,6 +3,6 @@ nullable: false
enum:

This list is in conflict with openEO. Didn't we agree to use the openEO codes?

Contributor

I think it was decided that the status codes were to stay the way they currently are, to avoid breaking Core v2.0 that is about to be submitted for approval. Changing them would require another round of tests and validation, and delay yet again a release that is already overdue.

If I recall correctly, there was an idea that jobs submitted using a certain OAP/openEO/etc. pattern (e.g. using Content-Schema or whatever) would be allowed to use their own set of status codes. Basically, the client interacting with the server could negotiate a certain interface. I think it makes sense that if OAP-style inputs/outputs or openEO-style parameters/returns are employed when creating the job, the status codes should align with them. This might be the most portable approach, since even aligning status codes between OAP and openEO would end up breaking for any other interface that could be added later. For example, CWL has its own codes as well, and a client that knows CWL better might prefer its status code names.

Maybe a conformance class should be added, saying something along the lines of "if the job is created with openEO parameters/returns, the server MAY employ openEO-style status codes", plus the mapping shown in #420? With /conformance listing that class, a client would be able to identify which status codes to expect.
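A client could then look for that class in the conformance declaration, e.g. (both URIs below are made up for illustration):

GET /conformance

{
  "conformsTo": [
    "http://www.opengis.net/spec/ogcapi-processes-4/1.0/conf/job-management",
    "http://www.opengis.net/spec/ogcapi-processes-4/1.0/conf/openeo-status-codes"
  ]
}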

@m-mohr m-mohr Oct 9, 2024

I understood it completely differently, but I couldn't attend the last meeting. I thought that because you are going to 2.0 anyway, you'd switch. Hmm... not sure all this is worth it if we don't try to resolve the conflicts, or if we make it utterly difficult for clients that need to understand all those schemas...

Contributor

I suggest bringing it up in the next meeting to validate. This Part 4 document simply adds create. The rest are "whatever Core defines".

- process
jobID:
type: string
- wps

wps?

Contributor

Since this is not an enum but an example, it doesn't really matter.
Some servers (CRIM's, for example) do support WPS this way.

Comment on lines +1 to +21
type: object
required:
  - inputs
  - outputs
properties:
  inputs:
    type: object
    additionalProperties:
      $ref: "../processes-workflow/input-workflow.yaml"
  outputs:
    type: object
    additionalProperties:
      $ref: "../processes-workflow/output-workflow.yaml"
  links:
    type: array
    items:
      $ref: "../common-core/link.yaml"
  metadata:
    type: array
    items:
      $ref: "../processes-core/metadata.yaml"

Is this OAP specific?

Contributor

Mostly to provide additional provenance metadata.
Not OAP specific.

jobs:
  type: array
  items:
    $ref: "statusInfo.yaml"

statusInfo.yaml sounds a bit misleading. Shouldn't this be called jobDetails.yaml or similar?

Contributor

This uses the original OAP Core reference https://github.com/opengeospatial/ogcapi-processes/blob/master/openapi/schemas/processes-core/statusInfo.yaml
The same name is employed to make the mapping more obvious.

The GET /jobs endpoint is allowed to return very minimalistic info (e.g. only the job id, type, and status are sufficient). "Job Details" would make more sense for GET /jobs/{jobId}, which should return much more information, but that endpoint is also allowed to return only the same minimal set of 3 properties for backward compatibility.
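Under that reading, a minimal GET /jobs entry would be just (values illustrative):

{
  "jobID": "...",
  "type": "process",
  "status": "successful"
}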


openEO requires created in addition, but otherwise it's the same (except for type for now).

properties:
  id:
    type: string
  process:

process in openEO is an object, so this conflicts.

Contributor

This is a typo.
It should be processID, as per

@m-mohr m-mohr left a comment

Made my first review. Requested various changes and left some comments for discussion.

Thanks for compiling this document so far, must've been a big chunk of work!

Comment on lines +6 to +13
inputs:
  type: object
  additionalProperties:
    $ref: "../processes-workflow/input-workflow.yaml"
outputs:
  type: object
  additionalProperties:
    $ref: "../processes-workflow/output-workflow.yaml"
Contributor

@gfenoy
While testing/implementing more of the OAP v1 vs v2 execution modes, I found out that the following should probably be defined as well:

  • V1:
    • response=raw|document
    • outputs -> must be extended to include transmissionMode=value|reference
    • headers
      • Accept
  • V2:
    • headers, notably for:
      • Prefer: return=minimal|representation
      • Prefer: respond-async|wait=x
      • Accept

Basically, the inputs and outputs themselves are not always enough to replicate a job execution because of the various execution parameters and content negotiation that can drastically affect how the results are returned (encodings, media-type, value/link, etc.).
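A rough sketch of how those execution parameters could be folded into the job definition schema; every property name below is a placeholder, not a settled proposal:

# Illustrative additions to the job definition schema (names are placeholders)
response:
  type: string
  enum: [raw, document]          # OAP v1 response mode
outputs:
  type: object
  additionalProperties:
    allOf:
      - $ref: "../processes-workflow/output-workflow.yaml"
      - type: object
        properties:
          transmissionMode:
            type: string
            enum: [value, reference]
headers:
  type: object                   # e.g. Prefer, Accept captured at submission time
  additionalProperties:
    type: string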

@fmigneault (Contributor)

@gfenoy
I've been working on implementing job management, and while looking at the PR to apply a comment, I noticed there is no openapi/paths/pJobs file or similar defining the POST /jobs, or any of the other endpoints added under /jobs/{jobId}/....

Comment on lines +5 to +6
label:: /per/job-management/additional-status-codes
part:: Servers MAY support other HTTP protocol capabilities. Therefore, the server may return other status codes than those listed in <<status_codes>>.
Contributor

@m-mohr FYI, about the mentioned status code flexibility.

====
[%metadata]
label:: /req/job-management/start-response
part:: A successful execution of the operation SHALL be reported as a response with a HTTP status code '200'.
@fmigneault fmigneault (Contributor) Oct 11, 2024

Using POST /jobs/{jobId}/results, the server should probably still have the option to negotiate the execution mode. In other words, sync and 200 would be used by default, but a Prefer: respond-async would allow the server to trigger this job, although it would not respond with results right away. If async is selected this way, the response would instead be the usual Job Status with monitoring. Also, a 202 would have to be used, since no job is "created" in that case. Once completed, the results can be retrieved from GET /jobs/{jobId}/results.


Would you not expect the sync result already from POST /jobs?

Contributor

You could do either of the following:

  1. POST /jobs + Prefer: wait=X to respond results inline (no need for other request)

  2. POST /jobs + Prefer: respond-async responds with Job Status

    1. starts the job whenever resources are ready
    2. GET /jobs/{jobId}/results to retrieve results once status: succeeded
  3. POST /jobs + Prefer: wait=X + status: create

    1. places the job in created state, until triggered later on
    2. POST /jobs/{jobId}/results to trigger the job and return the results inline
  4. POST /jobs + Prefer: respond-async + status: create

    1. places the job in created state, until triggered later on
    2. POST /jobs/{jobId}/results to trigger the job and return with Job Status
    3. GET /jobs/{jobId}/results to retrieve results once status: succeeded

Here, the 202 would make sense for 4.ii, whereas 200 for 3.ii
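For instance, case 3.ii could look like this (output name and value are illustrative):

POST /jobs/{jobId}/results HTTP/1.1
Prefer: wait=60

HTTP/1.1 200 OK
Content-Type: application/json

{
  "result": { "value": 42 }
}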

@m-mohr m-mohr Oct 11, 2024

I feel like you have too many options in OAP. Honestly, this feels overengineered. You are implementing things server-side that usually a client would handle, and with a client handling them, the server (and client) implementations are much simpler. The openEO API is much simpler, but can still easily do everything you do here through its clients.

Contributor

Well, we've had users expecting all these combinations so far, so we must support them to accommodate everyone. If Part 4 only allows working with openEO style jobs, it's not much of an extension for OAP.

@m-mohr m-mohr Oct 12, 2024

Users often don't really know what they need, or that there are alternatives... If I added everything to openEO that users ask for, it would be a mess. Often it is possible to show them alternatives that work equally well.

Anyway, having multiple possible ways of doing the same thing should IMHO be avoided and I still think a client can simplify API and server development.

The way I understood our discussions initially, the OAP version of the openEO jobs would be slightly different/extended, but this is a completely different spec with a small subset being openEO. If that's what you are looking for, I give up. I don't think it is useful the way it is. Then it's better not to align at all, which makes it less confusing for users because things are clearly separated.

Contributor

So this clause says that there is no encoding for a specific job definition. Good!

It then indicates that this Standard includes two conformance classes ... one for "OGC API - Processes - Workflow Execute Request" and one for "OpenEO Process Graph/UDP". Also good.

However, what about an execute request from Part 1 that executes a single process? Shouldn't I be able to post an execute request to create a job that executes a single deployed process? ... without all the workflow dressing?

Contributor

@gfenoy please add me as a submitter:
Panagiotis (Peter) A. Vretanos, CubeWerx Inc.
Thanks.
