From 5fe8ad608c0532945073074df7820ea85bc03b67 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 23 Nov 2021 15:33:34 +0100 Subject: [PATCH 001/117] Add issue templates --- ...e-or-clarification-for-existing-process.md | 19 ++++++ .../ISSUE_TEMPLATE/new-process-parameter.md | 25 +++++++ .../ISSUE_TEMPLATE/new-process-proposal.md | 65 +++++++++++++++++++ 3 files changed, 109 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md create mode 100644 .github/ISSUE_TEMPLATE/new-process-parameter.md create mode 100644 .github/ISSUE_TEMPLATE/new-process-proposal.md diff --git a/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md b/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md new file mode 100644 index 00000000..1598968d --- /dev/null +++ b/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md @@ -0,0 +1,19 @@ +--- +name: Issue or Clarification for existing process +about: Report an issue or clarification for an existing process. +title: "[process id]: [issue summary]" +labels: bug, patch +assignees: '' + +--- + +**Process ID:** ... + +**Describe the issue:** +A clear and concise description of what needs to be fixed or clarified. + +**Proposed solution:** +A potential solution you'd propose. + +**Additional context:** +Add any other context about the problem here. diff --git a/.github/ISSUE_TEMPLATE/new-process-parameter.md b/.github/ISSUE_TEMPLATE/new-process-parameter.md new file mode 100644 index 00000000..17f4ef5b --- /dev/null +++ b/.github/ISSUE_TEMPLATE/new-process-parameter.md @@ -0,0 +1,25 @@ +--- +name: New process parameter +about: Propose a new process parameter +title: "[process id]: [parameter summary]" +labels: enhancement, minor +assignees: '' + +--- + +**Proposed Process ID:** ... (alphanumeric + underscore only) +**Proposed Parameter Name:** ... (alphanumeric + underscore only) +**Optional:** yes, default: ... + +## Context +Give some context and an introduction of what this parameter is needed for, e.g. use cases. + +## Description +... (Markdown allowed) + +## Data Type +boolean/number/string/array/object/null +... (include additional constraints, e.g. min/max values for numbers or a list of allowed values for strings) + +## Additional changes +If required describe any additional changes that are required for the parameter, e.g. an additional data type for the return value, an additional example, an additional link to explain the parameter, etc. diff --git a/.github/ISSUE_TEMPLATE/new-process-proposal.md b/.github/ISSUE_TEMPLATE/new-process-proposal.md new file mode 100644 index 00000000..69c867f1 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/new-process-proposal.md @@ -0,0 +1,65 @@ +--- +name: New process proposal +about: Propose a new process with detailed specifics (name, description, parameters, + return value etc.). To just pitch an idea without a lot of details, please fill + a normal issue. +title: '' +labels: new process +assignees: '' + +--- + +**Proposed Process ID:** ... (alphanumeric + underscore only) + +## Context +Give some context and an introduction of what this process is needed for, e.g. use cases. + +## Summary +... (max. 60 chars) + +## Description +... (min. 55 chars, Markdown allowed) + +## Parameters + +### `param1` (name of first parameter, alphanumeric + underscore only) + +**Optional:** no/yes, default: ... (default value only required for "yes") + +#### Description +... 
(Markdown allowed) + +#### Data Type +boolean/number/string/array/object/null +... (include additional constraints, e.g. min/max values for numbers or a list of allowed values for strings) + +### `param2` (name of second parameter, copy this template for additional parameters) + +**Optional:** no/yes, default: ... (default value only required for "yes") + +#### Description +... + +#### Data Type +... + +## Return Value +### Description +... (Markdown allowed) + +### Data Type +boolean/number/string/array/object/null +... (include additional constraints, e.g. min/max values for numbers or a list of allowed values for strings) + +## Categories (optional) +* ... +* ... + +## Links to additional resources (optional) +* https://... + +## Examples (optional) +* ... + +## Details about exceptions mentioned in the description (optional) +* ... From 1f1e29ad725304847d20db198c346ea18de93f33 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 14:18:08 +0100 Subject: [PATCH 002/117] Add info about CLI commands in tests folder --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index e819ce3c..28c6ef37 100644 --- a/README.md +++ b/README.md @@ -34,7 +34,10 @@ This repository contains a set of files formally describing the openEO Processes * [subtype-schemas.json](meta/subtype-schemas.json) in the `meta` folder defines common data types (`subtype`s) for JSON Schema used in openEO processes. * The [`examples`](examples/) folder contains some useful examples that the processes link to. All of these are non-binding additions. * The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. - + + If you switch to the `tests` folder in CLI and after installing NodeJS and run `npm install`, you can run a couple of commands: + * `npm test`: Check the processes for validity and lint them. Processes need to pass tests to be added to this repository. + * `npm run render`: Opens a browser with all processes rendered through the docgen. ## Process From 5ee94cb0567cfb482e2a9bef0dfbec7d245c589d Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 14:19:13 +0100 Subject: [PATCH 003/117] Add info about CLI commands in tests folder (cherry-picked from draft branch) --- README.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index e819ce3c..28c6ef37 100644 --- a/README.md +++ b/README.md @@ -34,7 +34,10 @@ This repository contains a set of files formally describing the openEO Processes * [subtype-schemas.json](meta/subtype-schemas.json) in the `meta` folder defines common data types (`subtype`s) for JSON Schema used in openEO processes. * The [`examples`](examples/) folder contains some useful examples that the processes link to. All of these are non-binding additions. * The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. - + + If you switch to the `tests` folder in CLI and after installing NodeJS and run `npm install`, you can run a couple of commands: + * `npm test`: Check the processes for validity and lint them. Processes need to pass tests to be added to this repository. + * `npm run render`: Opens a browser with all processes rendered through the docgen. 
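+
+  For example, a complete session in a terminal could look like this (assuming Node.js including npm is already installed):
+
+  ```sh
+  cd tests        # switch to the tests folder
+  npm install     # install the dependencies (only required once)
+  npm test        # validate and lint the processes
+  npm run render  # render the processes in a web browser
+  ```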
## Process From b6b881a151a31818678f85469da0d4767d98a58d Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 14:26:34 +0100 Subject: [PATCH 004/117] Fix/Upgrade CI --- .github/workflows/docs.yml | 5 ++--- .github/workflows/tests.yml | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index b0057d30..44bae476 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -9,13 +9,12 @@ on: jobs: deploy: runs-on: ubuntu-latest - strategy: - matrix: - node-version: [14.x] steps: - name: Inject env variables uses: rlespinasse/github-slug-action@v3.x - uses: actions/setup-node@v1 + with: + node-version: '16' - uses: actions/checkout@v2 - run: | npm install diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index a2efdf8c..dcb1bcbc 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -3,11 +3,10 @@ on: [push, pull_request] jobs: deploy: runs-on: ubuntu-latest - strategy: - matrix: - node-version: [14.x] steps: - uses: actions/setup-node@v1 + with: + node-version: '16' - uses: actions/checkout@v2 - name: Run tests run: | From da2d5218b050fa22b259fbee292de98e8f329a23 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 14:26:34 +0100 Subject: [PATCH 005/117] Fix/Upgrade CI --- .github/workflows/docs.yml | 5 ++--- .github/workflows/tests.yml | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index b0057d30..44bae476 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -9,13 +9,12 @@ on: jobs: deploy: runs-on: ubuntu-latest - strategy: - matrix: - node-version: [14.x] steps: - name: Inject env variables uses: rlespinasse/github-slug-action@v3.x - uses: actions/setup-node@v1 + with: + node-version: '16' - uses: actions/checkout@v2 - run: | npm install diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index a2efdf8c..dcb1bcbc 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -3,11 +3,10 @@ on: [push, pull_request] jobs: deploy: runs-on: ubuntu-latest - strategy: - matrix: - node-version: [14.x] steps: - uses: actions/setup-node@v1 + with: + node-version: '16' - uses: actions/checkout@v2 - name: Run tests run: | From 6890e34c74e8a1da707ddb1f4431e3c53e49c0b1 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 17:33:14 +0100 Subject: [PATCH 006/117] Fix CNAME handling --- .github/workflows/docs.yml | 1 + 1 file changed, 1 insertion(+) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 44bae476..260743f4 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -43,6 +43,7 @@ jobs: publish_dir: gh-pages user_name: 'openEO CI' user_email: openeo.ci@uni-muenster.de + cname: processes.openeo.org - name: deploy to ${{ env.GITHUB_REF_SLUG }} uses: peaceiris/actions-gh-pages@v3 if: ${{ env.GITHUB_REF_SLUG != 'master' }} From 70eed30f9baf5468423fea47cf0013177a532a19 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 18:05:24 +0100 Subject: [PATCH 007/117] Make "create blank issue" more prominent --- .github/ISSUE_TEMPLATE/config.yml | 1 + .github/ISSUE_TEMPLATE/other.md | 8 ++++++++ 2 files changed, 9 insertions(+) create mode 100644 .github/ISSUE_TEMPLATE/config.yml create mode 100644 .github/ISSUE_TEMPLATE/other.md diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 
00000000..ec4bb386 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1 @@ +blank_issues_enabled: false \ No newline at end of file diff --git a/.github/ISSUE_TEMPLATE/other.md b/.github/ISSUE_TEMPLATE/other.md new file mode 100644 index 00000000..1b435bbe --- /dev/null +++ b/.github/ISSUE_TEMPLATE/other.md @@ -0,0 +1,8 @@ +--- +name: Other issues or proposals +about: Use this if no other options suits your needs. +title: '' +labels: '' +assignees: '' + +--- From a7d6441d9e471e1be2c1e6aafcfb56eccb14859b Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 24 Nov 2021 18:05:24 +0100 Subject: [PATCH 008/117] Make "create blank issue" more prominent --- .../issue-or-clarification-for-existing-process.md | 2 +- .github/ISSUE_TEMPLATE/new-process-proposal.md | 2 +- .github/ISSUE_TEMPLATE/other.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md b/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md index 1598968d..b80fc2a1 100644 --- a/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md +++ b/.github/ISSUE_TEMPLATE/issue-or-clarification-for-existing-process.md @@ -1,5 +1,5 @@ --- -name: Issue or Clarification for existing process +name: Issue or clarification for existing process about: Report an issue or clarification for an existing process. title: "[process id]: [issue summary]" labels: bug, patch diff --git a/.github/ISSUE_TEMPLATE/new-process-proposal.md b/.github/ISSUE_TEMPLATE/new-process-proposal.md index 69c867f1..276f253e 100644 --- a/.github/ISSUE_TEMPLATE/new-process-proposal.md +++ b/.github/ISSUE_TEMPLATE/new-process-proposal.md @@ -2,7 +2,7 @@ name: New process proposal about: Propose a new process with detailed specifics (name, description, parameters, return value etc.). To just pitch an idea without a lot of details, please fill - a normal issue. + a "normal" issue. title: '' labels: new process assignees: '' diff --git a/.github/ISSUE_TEMPLATE/other.md b/.github/ISSUE_TEMPLATE/other.md index 1b435bbe..1bce9684 100644 --- a/.github/ISSUE_TEMPLATE/other.md +++ b/.github/ISSUE_TEMPLATE/other.md @@ -1,6 +1,6 @@ --- name: Other issues or proposals -about: Use this if no other options suits your needs. +about: Use this if no other option suits your needs and you want to fill a "normal" issue. title: '' labels: '' assignees: '' From 3faf89dc8e6792642ee8859ddd078ce66ee64607 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 29 Nov 2021 15:34:48 +0100 Subject: [PATCH 009/117] Add more subtype schema tests, improve test README, minor improvements --- README.md | 6 +--- meta/subtype-schemas.json | 14 +++++---- tests/README.md | 31 +++++++++++++++++-- tests/processes.test.js | 17 +++++------ tests/subtype-schemas.test.js | 22 -------------- tests/subtypes-file.test.js | 29 ++++++++++++++++++ tests/subtypes-schemas.test.js | 54 ++++++++++++++++++++++++++++++++++ tests/testHelpers.js | 7 ++++- 8 files changed, 135 insertions(+), 45 deletions(-) delete mode 100644 tests/subtype-schemas.test.js create mode 100644 tests/subtypes-file.test.js create mode 100644 tests/subtypes-schemas.test.js diff --git a/README.md b/README.md index 28c6ef37..45283419 100644 --- a/README.md +++ b/README.md @@ -33,11 +33,7 @@ This repository contains a set of files formally describing the openEO Processes * [implementation.md](meta/implementation.md) in the `meta` folder provide some additional implementation details for back-ends. 
For back-end implementors, it's highly recommended to read them. * [subtype-schemas.json](meta/subtype-schemas.json) in the `meta` folder defines common data types (`subtype`s) for JSON Schema used in openEO processes. * The [`examples`](examples/) folder contains some useful examples that the processes link to. All of these are non-binding additions. -* The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. - - If you switch to the `tests` folder in CLI and after installing NodeJS and run `npm install`, you can run a couple of commands: - * `npm test`: Check the processes for validity and lint them. Processes need to pass tests to be added to this repository. - * `npm run render`: Opens a browser with all processes rendered through the docgen. +* The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. Check the [tests documentation](tests/README.md) for details. ## Process diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index b2a349bf..2d6d7ae8 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -75,6 +75,7 @@ "type": "object", "subtype": "chunk-size", "title": "Chunk Size", + "description": "The chunk size per dimension given. This object maps the dimension names given as key to chunks given as either a physical measure or pixels. If not given or `null`, no chunking is applied.", "required": [ "dimension", "value" @@ -108,7 +109,7 @@ "type": "string", "subtype": "collection-id", "title": "Collection ID", - "description": "A collection id from the list of supported collections.", + "description": "A collection identifier from the list of supported collections.", "pattern": "^[\\w\\-\\.~/]+$" }, "date": { @@ -128,6 +129,7 @@ "duration": { "type": "string", "subtype": "duration", + "title": "Duration", "description": "[ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), e.g. `P1D` for one day.", "pattern": "^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$" }, @@ -172,7 +174,7 @@ "type": "string", "subtype": "input-format", "title": "Input File Format", - "description": "An input format supported by the back-end." + "description": "A file format that the back-end supports to import data from." }, "input-format-options": { "type": "object", @@ -191,7 +193,7 @@ "type": "array", "subtype": "kernel", "title": "Image Kernel", - "description": "Image kernel, a two-dimensional array of numbers.", + "description": "A two-dimensional array of numbers to be used as kernel for the image operation.", "items": { "type": "array", "items": { @@ -237,7 +239,7 @@ "type": "string", "subtype": "output-format", "title": "Output File Format", - "description": "An output format supported by the back-end." + "description": "A file format that the back-end supports to save and export data to." }, "output-format-options": { "type": "object", @@ -390,13 +392,13 @@ "type": "string", "subtype": "udf-runtime", "title": "UDF runtime", - "description": "The name of a UDF runtime." + "description": "The identifier of a UDF runtime you want to run the given UDF source code with." }, "udf-runtime-version": { "type": "string", "subtype": "udf-runtime-version", "title": "UDF Runtime version", - "description": "The version of a UDF runtime." 
+        "description": "The version of the UDF runtime you want to run the given UDF source code with."
     },
     "uri": {
         "type": "string",
diff --git a/tests/README.md b/tests/README.md
index e2868634..31c79785 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -4,5 +4,32 @@ To run the tests follow these steps:
 
 1. Install [node and npm](https://nodejs.org) - should run with any recent version
 2. Run `npm install` in this folder to install the dependencies
-3. Run the tests with `npm test`.
-4. To show the files nicely formatted in a web browser, run `npm run render`. It starts a server and opens the corresponding page in a web browser.
\ No newline at end of file
+3. Run the tests with `npm test`. This will also lint the files and verify that they follow best practices.
+4. To show the files nicely formatted in a web browser, run `npm run render`. It starts a server and opens the corresponding page in a web browser.
+
+## Development processes
+
+All new processes must be added to the `proposals` folder. Each process must be declared to be `experimental`.
+Processes must comply with best practices, which ensure a certain degree of consistency.
+`npm test` will validate and lint the processes and also ensure the best practices are applied.
+
+The linting checks that the files are named correctly and that the content is correctly formatted and indented (JSON and embedded CommonMark).
+The best practices ensure, for example, that fields are neither too short nor too long.
+
+A spell check also checks the texts. It may report names and rarely used technical words as errors.
+If you are sure that these are correct, you can add them to the `.words` file to exclude the words from being reported as errors.
+The file must contain one word per line.
+
+New processes should be added via GitHub Pull Requests.
+
+## Subtype schemas
+
+Sometimes it is useful to define a new "data type" on top of the JSON types (number, string, array, object, ...).
+For example, a client could make a select box with all collections available by adding a subtype `collection-id` to the JSON type `string`.
+If you think a new subtype should be added, you need to add it to the `meta/subtype-schemas.json` file.
+It must be a valid JSON Schema. The tests mentioned above will also verify to a certain degree that the subtypes are defined correctly.
+
+## Examples
+
+To get out of the proposal state, at least two examples must be provided.
+The examples are located in the `examples` folder and will also be validated to some extent in the tests. 
\ No newline at end of file diff --git a/tests/processes.test.js b/tests/processes.test.js index 089328f9..f7c73c0d 100644 --- a/tests/processes.test.js +++ b/tests/processes.test.js @@ -1,7 +1,7 @@ const glob = require('glob'); const fs = require('fs'); const path = require('path'); -const { normalizeString, checkDescription, checkSpelling, checkJsonSchema, getAjv, prepareSchema } = require('./testHelpers'); +const { normalizeString, checkDescription, checkSpelling, checkJsonSchema, getAjv, prepareSchema, isObject } = require('./testHelpers'); const anyOfRequired = [ "quantiles", @@ -66,7 +66,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => { // description expect(typeof p.description).toBe('string'); // lint: Description should be longer than a summary - expect(p.description.length).toBeGreaterThan(55); + expect(p.description.length).toBeGreaterThan(60); checkDescription(p.description, p); }); @@ -98,7 +98,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => { } test("Return Value", () => { - expect(typeof p.returns).toBe('object'); + expect(isObject(p.returns)).toBeTruthy(); expect(p.returns).not.toBeNull(); // return value description @@ -108,14 +108,14 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => { checkDescription(p.returns.description, p); // return value schema - expect(typeof p.returns.schema).toBe('object'); expect(p.returns.schema).not.toBeNull(); + expect(typeof p.returns.schema).toBe('object'); // lint: Description should not be empty checkJsonSchema(jsv, p.returns.schema); }); test("Exceptions", () => { - expect(typeof p.exceptions === 'undefined' || (typeof p.exceptions === 'object' && p.exceptions !== 'null')).toBeTruthy(); + expect(typeof p.exceptions === 'undefined' || isObject(p.exceptions)).toBeTruthy(); }); var exceptions = o2a(p.exceptions); @@ -153,7 +153,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => { } var paramKeys = Object.keys(parametersObj); - expect(typeof example).toBe('object'); + expect(isObject(example)).toBeTruthy(); expect(example).not.toBeNull(); // example title @@ -194,8 +194,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => { if (Array.isArray(p.links)) { test.each(p.links)("Links > %#", (link) => { - expect(typeof link).toBe('object'); - expect(link).not.toBeNull(); + expect(isObject(link)).toBeTruthy(); // link href expect(typeof link.href).toBe('string'); @@ -250,8 +249,8 @@ function checkParam(param, p, checkCbParams = true) { checkFlags(param); // Parameter schema - expect(typeof param.schema).toBe('object'); expect(param.schema).not.toBeNull(); + expect(typeof param.schema).toBe('object'); checkJsonSchema(jsv, param.schema); if (!checkCbParams) { diff --git a/tests/subtype-schemas.test.js b/tests/subtype-schemas.test.js deleted file mode 100644 index 49633fda..00000000 --- a/tests/subtype-schemas.test.js +++ /dev/null @@ -1,22 +0,0 @@ -const fs = require('fs'); -const $RefParser = require("@apidevtools/json-schema-ref-parser"); -const { checkJsonSchema, normalizeString, getAjv } = require('./testHelpers'); - -test("subtype-schemas.json", async () => { - let fileContent = fs.readFileSync('../meta/subtype-schemas.json'); - - let schema = JSON.parse(fileContent); - expect(schema).not.toBe(null); - expect(typeof schema).toBe('object'); - - // lint: Check whether the file is correctly JSON formatted - expect(normalizeString(JSON.stringify(schema, null, 4))).toEqual(normalizeString(fileContent.toString())); - - // Is JSON Schema valid? 
- checkJsonSchema(await getAjv(), schema); - - // is everything dereferencable? - let subtypes = await $RefParser.dereference(schema, { dereference: { circular: "ignore" } }); - expect(subtypes).not.toBe(null); - expect(typeof subtypes).toBe('object'); -}); \ No newline at end of file diff --git a/tests/subtypes-file.test.js b/tests/subtypes-file.test.js new file mode 100644 index 00000000..e70f7e8f --- /dev/null +++ b/tests/subtypes-file.test.js @@ -0,0 +1,29 @@ +const fs = require('fs'); +const $RefParser = require("@apidevtools/json-schema-ref-parser"); +const { checkJsonSchema, getAjv, isObject, normalizeString } = require('./testHelpers'); + +test("File subtype-schemas.json", async () => { + let schema; + let fileContent; + try { + fileContent = fs.readFileSync('../meta/subtype-schemas.json'); + schema = JSON.parse(fileContent); + } catch(err) { + console.error("The file for subtypes is invalid and can't be read:"); + console.error(err); + expect(err).toBeUndefined(); + } + + expect(isObject(schema)).toBeTruthy(); + expect(isObject(schema.definitions)).toBeTruthy(); + + // lint: Check whether the file is correctly JSON formatted + expect(normalizeString(JSON.stringify(schema, null, 4))).toEqual(normalizeString(fileContent.toString())); + + // Is JSON Schema valid? + checkJsonSchema(await getAjv(), schema); + + // is everything dereferencable? + let subtypes = await $RefParser.dereference(schema, { dereference: { circular: "ignore" } }); + expect(isObject(subtypes)).toBeTruthy(); +}); \ No newline at end of file diff --git a/tests/subtypes-schemas.test.js b/tests/subtypes-schemas.test.js new file mode 100644 index 00000000..ff1b72bd --- /dev/null +++ b/tests/subtypes-schemas.test.js @@ -0,0 +1,54 @@ +const $RefParser = require("@apidevtools/json-schema-ref-parser"); +const { checkDescription, checkSpelling, isObject } = require('./testHelpers'); + +// I'd like to run the tests for each subtype individually instead of in a loop, +// but jest doesn't support that, so you need to figure out yourself what is broken. 
+// The console.log in afterAll ensures we have a hint of which process was checked last
+
+// Load and dereference schemas
+let subtypes = {};
+let lastTest = null;
+let testsCompleted = 0;
+beforeAll(async () => {
+    subtypes = await $RefParser.dereference('../meta/subtype-schemas.json', { dereference: { circular: "ignore" } });
+    return subtypes;
+});
+
+afterAll(async () => {
+    if (testsCompleted != Object.keys(subtypes.definitions).length) {
+        console.log('The schema the test has likely failed for: ' + lastTest);
+    }
+});
+
+test("Schemas in subtype-schemas.json", () => {
+    // Each schema must contain at least a type, subtype, title and description
+    for(let name in subtypes.definitions) {
+        let schema = subtypes.definitions[name];
+        lastTest = name;
+
+        // Schema is an object
+        expect(isObject(schema)).toBeTruthy();
+
+        // Type is an array with at least one element or a string
+        expect((Array.isArray(schema.type) && schema.type.length > 0) || typeof schema.type === 'string').toBeTruthy();
+
+        // Subtype is a string
+        expect(typeof schema.subtype === 'string').toBeTruthy();
+
+        // Check title
+        expect(typeof schema.title === 'string').toBeTruthy();
+        // lint: Summary should be short
+        expect(schema.title.length).toBeLessThan(60);
+        // lint: Summary should not end with a dot
+        expect(/[^\.]$/.test(schema.title)).toBeTruthy();
+        checkSpelling(schema.title, schema);
+
+        // Check description
+        expect(typeof schema.description).toBe('string');
+        // lint: Description should be longer than a summary
+        expect(schema.description.length).toBeGreaterThan(60);
+        checkDescription(schema.description, schema);
+
+        testsCompleted++;
+    }
+});
\ No newline at end of file
diff --git a/tests/testHelpers.js b/tests/testHelpers.js
index 385d0449..3f998088 100644
--- a/tests/testHelpers.js
+++ b/tests/testHelpers.js
@@ -116,6 +116,10 @@ async function getAjv() {
     return jsv;
 }
 
+function isObject(obj) {
+    return (typeof obj === 'object' && obj === Object(obj) && !Array.isArray(obj));
+}
+
 function normalizeString(str) {
     return str.replace(/\r\n|\r|\n/g, "\n").trim();
 }
@@ -214,5 +218,6 @@ module.exports = {
     checkSpelling,
     checkJsonSchema,
     checkSchemaRecursive,
-    prepareSchema
+    prepareSchema,
+    isObject
 };
\ No newline at end of file

From 9ee49d54d7f79a0de48aa73659904cd6659f5f7b Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 1 Dec 2021 14:53:42 +0100
Subject: [PATCH 010/117] Quantiles: Clarify to use type 7 (#303)

* Use type 7 for quantiles #296

---
 CHANGELOG.md           |  4 +++-
 meta/implementation.md | 14 ++++++++++++++
 quantiles.json         |  8 +++++++-
 3 files changed, 24 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index aa9978c8..a0e2aa1c 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -29,7 +29,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `aggregate_temporal_period`: Clarified which dimension labels are present in the returned data cube. [#274](https://github.com/Open-EO/openeo-processes/issues/274)
 - `ard_surface_reflectance`: The process has been categorized as "optical" instead of "sar".
 - `save_result`: Clarify how the process works in the different contexts it is used in (e.g. synchronous processing, secondary web service). [#288](https://github.com/Open-EO/openeo-processes/issues/288)
-- `quantiles`: Clarified behavior. [#278](https://github.com/Open-EO/openeo-processes/issues/278)
+- `quantiles`:
+  - The default algorithm for sample quantiles has been clarified (type 7). [#296](https://github.com/Open-EO/openeo-processes/issues/296)
+  - Improved documentation in general. [#278](https://github.com/Open-EO/openeo-processes/issues/278)
 
 ## [1.1.0] - 2021-06-29
 
diff --git a/meta/implementation.md b/meta/implementation.md
index af8ac782..f69ec2be 100644
--- a/meta/implementation.md
+++ b/meta/implementation.md
@@ -141,3 +141,17 @@ To make `date_shift` easier to implement, we have found some libraries that foll
 - JavaScript: [Moment.js](https://momentjs.com/)
 - Python: [dateutil](https://dateutil.readthedocs.io/en/stable/index.html)
 - R: [lubridate](https://lubridate.tidyverse.org/) ([Cheatsheet](https://rawgit.com/rstudio/cheatsheets/master/lubridate.pdf))
+
+## Quantile algorithms
+
+The `quantiles` process could implement a number of different algorithms; the literature usually distinguishes [9 types](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample).
+Right now it's not possible to choose between them, but such an option may be added in the future.
+To improve the interoperability of openEO processes, version 1.2.0 added details about the algorithm that must be implemented.
+A survey has shown that most libraries implement type 7, and as such it was chosen to be the default.
+
+We have found some libraries that can be used for an implementation:
+- Java: [Apache Commons Math Percentile](http://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/stat/descriptive/rank/Percentile.html), choose the [estimation type `R_7`](http://commons.apache.org/proper/commons-math/javadocs/api-3.6/org/apache/commons/math3/stat/descriptive/rank/Percentile.EstimationType.html#R_7)
+- JavaScript: [d3](https://github.com/d3/d3-array/blob/v2.8.0/README.md#quantile), which only implements type 7.
+- Julia: [Statistics.quantile](https://docs.julialang.org/en/v1/stdlib/Statistics/#Statistics.quantile!), type 7 is the default.
+- Python: [numpy](https://numpy.org/doc/stable/reference/generated/numpy.quantile.html), [pandas](https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.quantile.html), [xarray](http://xarray.pydata.org/en/stable/generated/xarray.DataArray.quantile.html) - type 7 (called 'linear' for the interpolation parameter) is the default for all of them.
+- R: [quantile](https://stat.ethz.ch/R-manual/R-patched/library/stats/html/quantile.html) - type 7 is the default.
diff --git a/quantiles.json b/quantiles.json
index 8df00cbd..81f60c2b 100644
--- a/quantiles.json
+++ b/quantiles.json
@@ -1,7 +1,7 @@
 {
     "id": "quantiles",
     "summary": "Quantiles",
-    "description": "Calculates quantiles, which are cut points dividing the range of a sample distribution into either\n\n* intervals corresponding to the given `probabilities` or\n* equal-sized intervals (q-quantiles based on the parameter `q`).\n\nEither the parameter `probabilities` or `q` must be specified, otherwise the `QuantilesParameterMissing` exception is thrown. If both parameters are set the `QuantilesParameterConflict` exception is thrown.",
+    "description": "Calculates quantiles, which are cut points dividing the range of a sample distribution into either\n\n* intervals corresponding to the given `probabilities` or\n* equal-sized intervals (q-quantiles based on the parameter `q`).\n\nEither the parameter `probabilities` or `q` must be specified, otherwise the `QuantilesParameterMissing` exception is thrown. If both parameters are set the `QuantilesParameterConflict` exception is thrown.\n\nSample quantiles can be computed with several different algorithms. 
Hyndman and Fan (1996) have concluded on nine different types, which are commonly implemented in statistical software packages. This process is implementing type 7, which is implemented widely and often also the default type (e.g. in Excel, Julia, Python, R and S).", "categories": [ "math > statistics" ], @@ -173,6 +173,12 @@ "rel": "about", "href": "https://en.wikipedia.org/wiki/Quantile", "title": "Quantiles explained by Wikipedia" + }, + { + "rel": "about", + "href": "https://www.amherst.edu/media/view/129116/original/Sample+Quantiles.pdf", + "type": "application/pdf", + "title": "Hyndman and Fan (1996): Sample Quantiles in Statistical Packages" } ] } \ No newline at end of file From 5abad4b00c05ca703b0f12c7d6c0f54575329fb6 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 1 Dec 2021 15:18:59 +0100 Subject: [PATCH 011/117] load_result: Load by URL and filter by extents and bands (#292) * Improve load_result #220 and other minor alignments --- CHANGELOG.md | 3 + load_collection.json | 2 +- proposals/load_result.json | 185 +++++++++++++++++++++++++++++- proposals/run_udf_externally.json | 2 +- run_udf.json | 2 +- 5 files changed, 185 insertions(+), 9 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index a0e2aa1c..e0b56635 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `predict_curve` - `ard_normalized_radar_backscatter` and `sar_backscatter`: Added `options` parameter - `array_find`: Added parameter `reverse`. [#269](https://github.com/Open-EO/openeo-processes/issues/269) +- `load_result`: + - Added ability to load by (signed) URL (supported since openEO API v1.1.0). + - Added parameters `spatial_extent`, `temporal_extent` and `bands`. [#220](https://github.com/Open-EO/openeo-processes/issues/220) - `run_udf`: Exception `InvalidRuntime` added. [#273](https://github.com/Open-EO/openeo-processes/issues/273) - A new category "math > statistics" has been added [#277](https://github.com/Open-EO/openeo-processes/issues/277) diff --git a/load_collection.json b/load_collection.json index 83df8134..dfdb72ca 100644 --- a/load_collection.json +++ b/load_collection.json @@ -1,7 +1,7 @@ { "id": "load_collection", "summary": "Load a collection", - "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. The data that is added to the data cube can be restricted with the additional `spatial_extent`, `temporal_extent`, `bands` and `properties`.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", + "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. 
The data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent`, `bands` and `properties`.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" diff --git a/proposals/load_result.json b/proposals/load_result.json index ebc81718..d8b70c6a 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -1,7 +1,7 @@ { "id": "load_result", "summary": "Load batch job results", - "description": "Loads batch job results by job id from the server-side user workspace. The job must have been stored by the authenticated user on the back-end currently connected to.", + "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. 
This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" @@ -11,11 +11,184 @@ { "name": "id", "description": "The id of a batch job with results.", - "schema": { - "type": "string", - "subtype": "job-id", - "pattern": "^[\\w\\-\\.~]+$" - } + "schema": [ + { + "title": "ID", + "type": "string", + "subtype": "job-id", + "pattern": "^[\\w\\-\\.~]+$" + }, + { + "title": "URL", + "type": "string", + "format": "uri", + "subtype": "uri", + "pattern": "^https?://" + } + ] + }, + { + "name": "spatial_extent", + "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\nThe process puts a pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry,\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries, or\n* a `GeometryCollection` containing `Polygon` or `MultiPolygon` geometries. To maximize interoperability, `GeometryCollection` should be avoided in favour of one of the alternatives above.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "schema": [ + { + "title": "Bounding Box", + "type": "object", + "subtype": "bounding-box", + "required": [ + "west", + "south", + "east", + "north" + ], + "properties": { + "west": { + "description": "West (lower left corner, coordinate axis 1).", + "type": "number" + }, + "south": { + "description": "South (lower left corner, coordinate axis 2).", + "type": "number" + }, + "east": { + "description": "East (upper right corner, coordinate axis 1).", + "type": "number" + }, + "north": { + "description": "North (upper right corner, coordinate axis 2).", + "type": "number" + }, + "base": { + "description": "Base (optional, lower left corner, coordinate axis 3).", + "type": [ + "number", + "null" + ], + "default": null + }, + "height": { + "description": "Height (optional, upper right corner, coordinate axis 3).", + "type": [ + "number", + "null" + ], + "default": null + }, + "crs": { + "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) or [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.", + "anyOf": [ + { + "title": "EPSG Code", + "type": "integer", + "subtype": "epsg-code", + "minimum": 1000, + "examples": [ + 3857 + ] + }, + { + "title": "WKT2", + "type": "string", + "subtype": "wkt2-definition" + }, + { + "title": "PROJ definition", + "type": "string", + "subtype": "proj-definition", + "deprecated": true + } + ], + "default": 4326 + } + } + }, + { + "title": "GeoJSON", + "description": "Limits the data cube to the bounding box of the given geometry. 
All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "type": "object", + "subtype": "geojson" + }, + { + "title": "No filter", + "description": "Don't filter spatially. All data is included in the data cube.", + "type": "null" + } + ], + "default": null, + "optional": true + }, + { + "name": "temporal_extent", + "description": "Limits the data to load from the batch job result to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports open intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", + "schema": [ + { + "type": "array", + "subtype": "temporal-interval", + "minItems": 2, + "maxItems": 2, + "items": { + "anyOf": [ + { + "type": "string", + "format": "date-time", + "subtype": "date-time" + }, + { + "type": "string", + "format": "date", + "subtype": "date" + }, + { + "type": "string", + "subtype": "year", + "minLength": 4, + "maxLength": 4, + "pattern": "^\\d{4}$" + }, + { + "type": "null" + } + ] + }, + "examples": [ + [ + "2015-01-01T00:00:00Z", + "2016-01-01T00:00:00Z" + ], + [ + "2015-01-01", + "2016-01-01" + ] + ] + }, + { + "title": "No filter", + "description": "Don't filter temporally. All data is included in the data cube.", + "type": "null" + } + ], + "default": null, + "optional": true + }, + { + "name": "bands", + "description": "Only adds the specified bands into the data cube so that bands that don't match the list of band names are not available. Applies to all dimensions of type `bands`.\n\nEither the unique band name (metadata field `name` in bands) or one of the common band names (metadata field `common_name` in bands) can be specified. If the unique band name and the common name conflict, the unique band name has a higher priority.\n\nThe order of the specified array defines the order of the bands in the data cube. If multiple bands match a common name, all matched bands are included in the original order.\n\nIt is recommended to use this parameter instead of using ``filter_bands()`` directly after loading unbounded data.", + "schema": [ + { + "type": "array", + "items": { + "type": "string", + "subtype": "band-name" + } + }, + { + "title": "No filter", + "description": "Don't filter bands. 
All bands are included in the data cube.", + "type": "null" + } + ], + "default": null, + "optional": true } ], "returns": { diff --git a/proposals/run_udf_externally.json b/proposals/run_udf_externally.json index 3396270b..9672eb71 100644 --- a/proposals/run_udf_externally.json +++ b/proposals/run_udf_externally.json @@ -34,7 +34,7 @@ "type": "string", "format": "uri", "subtype": "uri", - "pattern": "^(http|https)://" + "pattern": "^https?://" } }, { diff --git a/run_udf.json b/run_udf.json index 5ca0ec1f..f65f850c 100644 --- a/run_udf.json +++ b/run_udf.json @@ -35,7 +35,7 @@ "type": "string", "format": "uri", "subtype": "uri", - "pattern": "^(http|https)://" + "pattern": "^https?://" }, { "description": "Path to a UDF uploaded to the server.", From d0aba01665b54b184feb477252dc0431e097f7da Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 1 Dec 2021 17:06:56 +0100 Subject: [PATCH 012/117] Fix spell check warning --- tests/.words | 1 + 1 file changed, 1 insertion(+) diff --git a/tests/.words b/tests/.words index b9fe6130..95a83c72 100644 --- a/tests/.words +++ b/tests/.words @@ -37,3 +37,4 @@ gdalwarp Lanczos sinc interpolants +Hyndman \ No newline at end of file From 7168f3f8fbf20e52e8c3a75e9ce12ef61dfd3938 Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Wed, 1 Dec 2021 17:58:25 +0100 Subject: [PATCH 013/117] Array modify: doc finetuning (#309) Co-authored-by: Matthias Mohr --- CHANGELOG.md | 1 + proposals/array_modify.json | 50 +++++++++++++++++-------------------- 2 files changed, 24 insertions(+), 27 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e0b56635..750253e6 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -31,6 +31,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `aggregate_temporal_period`: Clarified which dimension labels are present in the returned data cube. [#274](https://github.com/Open-EO/openeo-processes/issues/274) - `ard_surface_reflectance`: The process has been categorized as "optical" instead of "sar". +- `array_modify`: Clarified behavior. - `save_result`: Clarify how the process works in the different contexts it is used in (e.g. synchronous processing, secondary web service). [#288](https://github.com/Open-EO/openeo-processes/issues/288) - `quantiles`: - The default algorithm for sample quantiles has been clarified (type 7). [#296](https://github.com/Open-EO/openeo-processes/issues/296) diff --git a/proposals/array_modify.json b/proposals/array_modify.json index e6813402..6eaa6852 100644 --- a/proposals/array_modify.json +++ b/proposals/array_modify.json @@ -1,7 +1,7 @@ { "id": "array_modify", - "summary": "Change the content of an array (insert, remove, update)", - "description": "Allows to insert into, remove from or update an array.\n\nAll labels get discarded and the array indices are always a sequence of numbers with the step size of 1 and starting at 0.", + "summary": "Change the content of an array (remove, insert, update)", + "description": "Modify an array by removing, inserting or updating elements. 
Updating can be seen as removing elements followed by inserting new elements (not necessarily the same number).\n\nAll labels get discarded and the array indices are always a sequence of numbers with the step size of 1 and starting at 0.", "categories": [ "arrays" ], @@ -9,7 +9,7 @@ "parameters": [ { "name": "data", - "description": "An array.", + "description": "The array to modify.", "schema": { "type": "array", "items": { @@ -19,7 +19,7 @@ }, { "name": "values", - "description": "The values to fill the array with.", + "description": "The values to insert into the `data` array.", "schema": { "type": "array", "items": { @@ -29,7 +29,7 @@ }, { "name": "index", - "description": "The index of the element to insert the value(s) before. If the index is greater than the number of elements, the process throws an `ArrayElementNotAvailable` exception.\n\nTo insert after the last element, there are two options:\n\n1. Use the simpler processes ``array_append()`` to append a single value or ``array_concat()`` to append multiple values.\n2. Specify the number of elements in the array. You can retrieve the number of elements with the process ``count()``, having the parameter `condition` set to `true`.", + "description": "The index in the `data` array of the element to insert the value(s) before. If the index is greater than the number of elements in the `data` array, the process throws an `ArrayElementNotAvailable` exception.\n\nTo insert after the last element, there are two options:\n\n1. Use the simpler processes ``array_append()`` to append a single value or ``array_concat()`` to append multiple values.\n2. Specify the number of elements in the array. You can retrieve the number of elements with the process ``count()``, having the parameter `condition` set to `true`.", "schema": { "type": "integer", "minimum": 0 @@ -37,7 +37,7 @@ }, { "name": "length", - "description": "The number of elements to replace. This parameter has no effect in case the given `index` does not exist in the array given.", + "description": "The number of elements in the `data` array to remove (or replace) starting from the given index. If the array contains fewer elements, the process simply removes all elements up to the end.", "optional": true, "default": 1, "schema": { @@ -57,7 +57,7 @@ }, "exceptions": { "ArrayElementNotAvailable": { - "message": "The array has no element with the specified index." + "message": "The array can't be modified as the given index is larger than the number of elements in the array." 
} }, "examples": [ @@ -124,26 +124,6 @@ "c" ] }, - { - "description": "Add a value at a specific non-existing position after the array, fill missing elements with `null`.", - "arguments": { - "data": [ - "a", - "b" - ], - "values": [ - "e" - ], - "index": 4 - }, - "returns": [ - "a", - "b", - null, - null, - "e" - ] - }, { "description": "Remove a single value from the array.", "arguments": { @@ -181,6 +161,22 @@ "b", "c" ] + }, + { + "description": "Remove multiple values from the end of the array and ignore that the given length is exceeding the size of the array.", + "arguments": { + "data": [ + "a", + "b", + "c" + ], + "values": [], + "index": 1, + "length": 10 + }, + "returns": [ + "a" + ] } ] } From 1a954f6791462be1f696191f7509ee0d9b6f5b62 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 1 Dec 2021 18:03:05 +0100 Subject: [PATCH 014/117] Rename debug to inspect, add implementation guide, other improvements (#310) --- CHANGELOG.md | 4 ++ meta/implementation.md | 71 ++++++++++++++++++++++++++ proposals/{debug.json => inspect.json} | 14 ++--- tests/processes.test.js | 7 +++ 4 files changed, 89 insertions(+), 7 deletions(-) rename proposals/{debug.json => inspect.json} (57%) diff --git a/CHANGELOG.md b/CHANGELOG.md index 750253e6..27d7992d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -22,6 +22,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed - `array_labels`: Allow normal arrays to be passed for which the process returns the indices. [#243](https://github.com/Open-EO/openeo-processes/issues/243) +- `debug`: + - Renamed to `inspect`. + - The log level `error` does not need to stop execution. + - Added proposals for logging several data types to the implementation guide. ### Removed diff --git a/meta/implementation.md b/meta/implementation.md index f69ec2be..864e02d1 100644 --- a/meta/implementation.md +++ b/meta/implementation.md @@ -142,6 +142,77 @@ To make `date_shift` easier to implement, we have found some libraries that foll - Python: [dateutil](https://dateutil.readthedocs.io/en/stable/index.html) - R: [lubridate](https://lubridate.tidyverse.org/) ([Cheatsheet](https://rawgit.com/rstudio/cheatsheets/master/lubridate.pdf)) +## `inspect` process + +The `inspect` process (previously known as `debug`) is a process to allow users to debug their workflows. +Back-ends should not execute the processes for log levels that are not matching the mininum log level that can be specified through the API (>= v1.2.0) for each data processing request. + +### Data Types + +The process is only useful for users if a common behavior for data types passed into the `data` parameter has been agreed on across implementations. + +The following chapters include some proposals for common data (sub) types, but it is incomplete and will be extended in the future. +Also, for some data types a JSON encoding is missing, we'll add more details once agreed upon: + + +#### Scalars +For the data types boolean, numbers, strings and null it is recommended to log them as given. + +#### Arrays + +It is recommended to summarize arrays with as follows: +```js +{ + "data": [3,1,6,4,8], // Return a reasonable excerpt of the data, e.g. 
the first 5 or 10 elements
+    "length": 10, // Return the length of the array; this is important to determine whether the data above is complete or an excerpt
+    "min": 0, // optional: Return additional statistics if possible, ideally use the corresponding openEO process names as keys
+    "max": 10
+}
+```
+
+#### Data Cubes
+
+It is recommended to return data cubes summarized in a structure compliant with the [STAC data cube extension](https://github.com/stac-extensions/datacube).
+If reasonable, providing all dimension labels (e.g. individual timestamps for the temporal dimension) instead of value ranges gives users a valuable benefit.
+The top-level object and/or each dimension can be enhanced with additional statistics if possible; ideally use the corresponding openEO process names as keys.
+
+```js
+{
+    "cube:dimensions": {
+        "x": {
+            "type": "spatial",
+            "axis": "x",
+            "extent": [8.253, 12.975],
+            "reference_system": 4326
+        },
+        "y": {
+            "type": "spatial",
+            "axis": "y",
+            "extent": [51.877,55.988],
+            "reference_system": 4326
+        },
+        "t": {
+            "type": "temporal",
+            "values": [
+                "2015-06-21T12:56:55Z",
+                "2015-06-23T09:12:14Z",
+                "2015-06-25T23:44:44Z",
+                "2015-06-27T21:11:34Z",
+                "2015-06-30T17:33:12Z"
+            ],
+            "step": null
+        },
+        "bands": {
+            "type": "bands",
+            "values": ["NDVI"]
+        }
+    },
+    // optional: Return additional statistics for the data cube if possible, ideally use the corresponding openEO process names as keys
+    "min": -1,
+    "max": 1
+}
+```
+
 ## Quantile algorithms
 
 The `quantiles` process could implement a number of different algorithms; the literature usually distinguishes [9 types](https://en.wikipedia.org/wiki/Quantile#Estimating_quantiles_from_a_sample).
diff --git a/proposals/debug.json b/proposals/inspect.json
similarity index 57%
rename from proposals/debug.json
rename to proposals/inspect.json
index cf902efd..91d6deb6 100644
--- a/proposals/debug.json
+++ b/proposals/inspect.json
@@ -1,7 +1,7 @@
 {
-    "id": "debug",
-    "summary": "Publish debugging information",
-    "description": "Sends debugging information about the data to the log output. Passes the data through.",
+    "id": "inspect",
+    "summary": "Add information to the logs",
+    "description": "This process can be used to add runtime information to the logs, e.g. for debugging purposes. This process should be used with caution and it is recommended to remove the process in production workflows. For example, logging each pixel or array individually in a process such as ``apply()`` or ``reduce_dimension()`` could lead to a (too) large number of log entries. Several data structures (e.g. data cubes) are too large to log and will only return summaries of their contents.\n\nThe data provided in the parameter `data` is returned without changes.",
     "categories": [
         "development"
     ],
@@ -9,23 +9,23 @@
     "parameters": [
         {
             "name": "data",
-            "description": "Data to publish.",
+            "description": "Data to log.",
             "schema": {
                 "description": "Any data type is allowed."
             }
         },
         {
             "name": "code",
-            "description": "An identifier to help identify the log entry in a bunch of other log entries.",
+            "description": "A label to help identify one or more log entries originating from this process in the list of all log entries. It can help to group or filter log entries and is usually not unique.",
             "schema": {
                 "type": "string"
             },
-            "default": "",
+            "default": "User",
             "optional": true
         },
         {
             "name": "level",
-            "description": "The severity level of this message, defaults to `info`. 
Note that the level `error` forces the computation to be stopped!", + "description": "The severity level of this message, defaults to `info`.", "schema": { "type": "string", "enum": [ diff --git a/tests/processes.test.js b/tests/processes.test.js index f7c73c0d..1d0d004c 100644 --- a/tests/processes.test.js +++ b/tests/processes.test.js @@ -21,6 +21,7 @@ var loader = (file, proposal = false) => { // Prepare for tests processes.push([file, p, fileContent.toString(), proposal]); + processIds.push(p.id); } catch(err) { processes.push([file, {}, "", proposal]); console.error(err); @@ -29,6 +30,7 @@ var loader = (file, proposal = false) => { }; var processes = []; +var processIds = []; const files = glob.sync("../*.json", {realpath: true}); files.forEach(file => loader(file)); @@ -36,6 +38,11 @@ files.forEach(file => loader(file)); const proposals = glob.sync("../proposals/*.json", {realpath: true}); proposals.forEach(file => loader(file, true)); +test("Check for duplicate process ids", () => { + const duplicates = processIds.filter((id, index) => processIds.indexOf(id) !== index); + expect(duplicates).toEqual([]); +}); + describe.each(processes)("%s", (file, p, fileContent, proposal) => { test("File / JSON", () => { From 3220d7a28e6d0836cce7a2152a45ff7950b230e4 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 1 Dec 2021 18:07:53 +0100 Subject: [PATCH 015/117] Update version number to 1.2.0 --- CHANGELOG.md | 7 +++++-- README.md | 5 +++-- array_apply.json | 4 ++-- array_contains.json | 2 +- array_find.json | 2 +- meta/subtype-schemas.json | 2 +- rename_labels.json | 2 +- tests/docs.html | 2 +- 8 files changed, 15 insertions(+), 11 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 27d7992d..cd7d6cf4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft +## [1.2.0] - 2021-12-15 + ### Added - New processes in proposal state @@ -29,7 +31,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Removed -- Removed the explict schema for `raster-cube` in the `data` parameters and return values of `run_udf` and `run_udf_externally`. It's still possible to pass raster-cubes via the "any" data type, but it's discouraged due to scalability issues. [#285](https://github.com/Open-EO/openeo-processes/issues/285) +- Removed the explicit schema for `raster-cube` in the `data` parameters and return values of `run_udf` and `run_udf_externally`. It's still possible to pass raster-cubes via the "any" data type, but it's discouraged due to scalability issues. [#285](https://github.com/Open-EO/openeo-processes/issues/285) ### Fixed @@ -261,7 +263,8 @@ First version which is separated from the openEO API. Complete rework of all pro Older versions of the processes were released as part of the openEO API, see the corresponding changelog for more information. -[Unreleased]: +[Unreleased]: +[1.2.0]: [1.1.0]: [1.0.0]: [1.0.0-rc.1]: diff --git a/README.md b/README.md index 45283419..99ea0a02 100644 --- a/README.md +++ b/README.md @@ -8,12 +8,13 @@ openEO develops interoperable processes for big Earth observation cloud processi The [master branch](https://github.com/Open-EO/openeo-processes/tree/master) is the 'stable' version of the openEO processes specification. An exception is the [`proposals`](proposals/) folder, which provides experimental new processes currently under discussion. 
They may still change, but everyone is encouraged to implement them and give feedback. -The latest release is version **1.1.0**. The [draft branch](https://github.com/Open-EO/openeo-processes/tree/draft) is where active development takes place. PRs should be made against the draft branch. +The latest release is version **1.2.0**. The [draft branch](https://github.com/Open-EO/openeo-processes/tree/draft) is where active development takes place. PRs should be made against the draft branch. | Version / Branch | Status | openEO API versions | | ------------------------------------------------------------ | ------------------------- | ------------------- | | [unreleased / draft](https://processes.openeo.org/draft) | in development | 1.x.x | -| [**1.1.0** / master](https://processes.openeo.org/1.1.0/) | **latest stable version** | 1.x.x | +| [**1.2.0** / master](https://processes.openeo.org/1.2.0/) | **latest stable version** | 1.x.x | +| [1.1.0](https://processes.openeo.org/1.1.0/) | legacy version | 1.x.x | | [1.0.0](https://processes.openeo.org/1.0.0/) | legacy version | 1.x.x | | [1.0.0 RC1](https://processes.openeo.org/1.0.0-rc.1/) | legacy version | 1.x.x | | [0.4.2](https://processes.openeo.org/0.4.2/) | legacy version | 0.4.x | diff --git a/array_apply.json b/array_apply.json index bea8a744..15da28dc 100644 --- a/array_apply.json +++ b/array_apply.json @@ -96,13 +96,13 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.1.0/examples/array_find_nodata.json", + "href": "https://processes.openeo.org/1.2.0/examples/array_find_nodata.json", "title": "Find no-data values in arrays" }, { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.1.0/examples/array_contains_nodata.json", + "href": "https://processes.openeo.org/1.2.0/examples/array_contains_nodata.json", "title": "Check for no-data values in arrays" } ] diff --git a/array_contains.json b/array_contains.json index cabfcf23..745b62b3 100644 --- a/array_contains.json +++ b/array_contains.json @@ -133,7 +133,7 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.1.0/examples/array_contains_nodata.json", + "href": "https://processes.openeo.org/1.2.0/examples/array_contains_nodata.json", "title": "Check for no-data values in arrays" } ], diff --git a/array_find.json b/array_find.json index 908b5a76..c95f2628 100644 --- a/array_find.json +++ b/array_find.json @@ -164,7 +164,7 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.1.0/examples/array_find_nodata.json", + "href": "https://processes.openeo.org/1.2.0/examples/array_find_nodata.json", "title": "Find no-data values in arrays" } ] diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index 2d6d7ae8..429252ef 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "http://processes.openeo.org/1.1.0/meta/subtype-schemas.json", + "$id": "http://processes.openeo.org/1.2.0/meta/subtype-schemas.json", "title": "Subtype Schemas", "description": "This file defines the schemas for subtypes we define for openEO processes.", "definitions": { diff --git a/rename_labels.json b/rename_labels.json index 6ed32f6f..e1ec9138 100644 --- a/rename_labels.json +++ b/rename_labels.json @@ -97,7 +97,7 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.1.0/examples/rename-enumerated-labels.json", + "href": 
"https://processes.openeo.org/1.2.0/examples/rename-enumerated-labels.json", "title": "Rename enumerated labels" } ] diff --git a/tests/docs.html b/tests/docs.html index 73e01312..782d115b 100644 --- a/tests/docs.html +++ b/tests/docs.html @@ -114,7 +114,7 @@ document: 'processes.json', categorize: true, apiVersion: '1.1.0', - title: 'openEO processes (Draft)', + title: 'openEO processes (1.2.0)', notice: '**Note:** This is the list of all processes specified by the openEO project. Back-ends implement a varying set of processes. Thus, the processes you can use at a specific back-end may derive from the specification, may include non-standardized processes and may not implement all processes listed here. Please check each back-end individually for the processes they support. The client libraries usually have a function called `listProcesses` or `list_processes` for that.' } }) From d0ce91fcd347360b907ea2d9589d7564a2c1e1e3 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 13 Dec 2021 11:24:44 +0100 Subject: [PATCH 016/117] v1.2.0 --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index cd7d6cf4..ef5fd7f3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,7 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft -## [1.2.0] - 2021-12-15 +## [1.2.0] - 2021-12-13 ### Added From fd2b906605e891c78c647611261129c67158e191 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 13 Dec 2021 11:28:05 +0100 Subject: [PATCH 017/117] Update version number in docgen to draft --- tests/docs.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/tests/docs.html b/tests/docs.html index 782d115b..23ef98e4 100644 --- a/tests/docs.html +++ b/tests/docs.html @@ -114,7 +114,7 @@ document: 'processes.json', categorize: true, apiVersion: '1.1.0', - title: 'openEO processes (1.2.0)', + title: 'openEO processes (draft)', notice: '**Note:** This is the list of all processes specified by the openEO project. Back-ends implement a varying set of processes. Thus, the processes you can use at a specific back-end may derive from the specification, may include non-standardized processes and may not implement all processes listed here. Please check each back-end individually for the processes they support. The client libraries usually have a function called `listProcesses` or `list_processes` for that.' } }) From ffbb927bafe748d0d8ce9ab69b0760487a997d56 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 20 Dec 2021 15:05:29 +0100 Subject: [PATCH 018/117] Add .editorconfig --- .editorconfig | 9 +++++++++ 1 file changed, 9 insertions(+) create mode 100644 .editorconfig diff --git a/.editorconfig b/.editorconfig new file mode 100644 index 00000000..1ef045c7 --- /dev/null +++ b/.editorconfig @@ -0,0 +1,9 @@ +# EditorConfig is awesome: https://EditorConfig.org + +[*.json] +charset = utf-8 +end_of_line = crlf +indent_style = spaces +indent_size = 4 +insert_final_newline = true +trim_trailing_whitespace = true From afc30732a59047b07b3c1a619c07613e6b368ee7 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 12 Jan 2022 11:13:42 +0100 Subject: [PATCH 019/117] Moved the `text_` processes to proposals as they are lacking implementations. Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. 
--- CHANGELOG.md | 5 +++++ text_begins.json => proposals/text_begins.json | 3 ++- text_merge.json => proposals/text_concat.json | 5 +++-- text_contains.json => proposals/text_contains.json | 3 ++- text_ends.json => proposals/text_ends.json | 3 ++- 5 files changed, 14 insertions(+), 5 deletions(-) rename text_begins.json => proposals/text_begins.json (98%) rename text_merge.json => proposals/text_concat.json (98%) rename text_contains.json => proposals/text_contains.json (98%) rename text_ends.json => proposals/text_ends.json (98%) diff --git a/CHANGELOG.md b/CHANGELOG.md index ef5fd7f3..3120a329 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft +### Changed + +- Moved the `text_` processes to proposals as they are lacking implementations. +- Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. + ## [1.2.0] - 2021-12-13 ### Added diff --git a/text_begins.json b/proposals/text_begins.json similarity index 98% rename from text_begins.json rename to proposals/text_begins.json index 36dc8c8b..08851617 100644 --- a/text_begins.json +++ b/proposals/text_begins.json @@ -6,6 +6,7 @@ "texts", "comparison" ], + "experimental": true, "parameters": [ { "name": "data", @@ -89,4 +90,4 @@ "returns": null } ] -} \ No newline at end of file +} diff --git a/text_merge.json b/proposals/text_concat.json similarity index 98% rename from text_merge.json rename to proposals/text_concat.json index ff6e07c2..83bb206c 100644 --- a/text_merge.json +++ b/proposals/text_concat.json @@ -1,10 +1,11 @@ { - "id": "text_merge", + "id": "text_concat", "summary": "Concatenate elements to a single text", "description": "Merges text representations (also known as *string*) of a set of elements to a single text, having the separator between each element.", "categories": [ "texts" ], + "experimental": true, "parameters": [ { "name": "data", @@ -101,4 +102,4 @@ "returns": "" } ] -} \ No newline at end of file +} diff --git a/text_contains.json b/proposals/text_contains.json similarity index 98% rename from text_contains.json rename to proposals/text_contains.json index af08b231..ce723d38 100644 --- a/text_contains.json +++ b/proposals/text_contains.json @@ -6,6 +6,7 @@ "texts", "comparison" ], + "experimental": true, "parameters": [ { "name": "data", @@ -89,4 +90,4 @@ "returns": null } ] -} \ No newline at end of file +} diff --git a/text_ends.json b/proposals/text_ends.json similarity index 98% rename from text_ends.json rename to proposals/text_ends.json index 03d32a17..7651ba57 100644 --- a/text_ends.json +++ b/proposals/text_ends.json @@ -6,6 +6,7 @@ "texts", "comparison" ], + "experimental": true, "parameters": [ { "name": "data", @@ -89,4 +90,4 @@ "returns": null } ] -} \ No newline at end of file +} From 5aa2da1de6757c90aa4d51ed252b21dbae286346 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 12 Jan 2022 18:11:35 +0100 Subject: [PATCH 020/117] Keep GeoJSON (vector) properties #270 --- CHANGELOG.md | 4 ++++ aggregate_spatial.json | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 3120a329..813157b3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,6 +11,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Moved the `text_` processes to proposals as they are lacking implementations. - Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. 
+### Fixed + +- `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) + ## [1.2.0] - 2021-12-13 ### Added diff --git a/aggregate_spatial.json b/aggregate_spatial.json index 9b0c1bd8..9ba2ce9e 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -17,7 +17,7 @@ }, { "name": "geometries", - "description": "Geometries as GeoJSON on which the aggregation will be based.\n\nOne value will be computed per GeoJSON `Feature`, `Geometry` or `GeometryCollection`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", + "description": "Geometries as GeoJSON on which the aggregation will be based. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per GeoJSON `Feature`, `Geometry` or `GeometryCollection`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", "schema": { "type": "object", "subtype": "geojson" From d5be1a5b5d9fdd2532a05deff41c5bfa97039594 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 12 Jan 2022 18:42:37 +0100 Subject: [PATCH 021/117] Improve support for labeled arrays in experimental array processes (#317) * Better support for labeled arrays in array processes that previously discarded them. Should help with #233. * Update proposals/array_append.json * Throw an error when labels exist in both arrays. 
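To illustrate the new labeled-array semantics (a sketch only — labeled arrays have no plain JSON encoding, so the object notation below is merely a stand-in):

```js
// Both inputs labeled with disjoint labels: the labels are kept.
array_concat({"B1": 1, "B2": 2}, {"B3": 3})  // => {"B1": 1, "B2": 2, "B3": 3}
// Only one input labeled: all labels get discarded.
array_concat({"B1": 1, "B2": 2}, [3])        // => [1, 2, 3]
// The label "B1" exists in both inputs: fails with an ArrayLabelConflict exception.
array_concat({"B1": 1}, {"B1": 2})
```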
--- CHANGELOG.md | 4 ++++ proposals/array_append.json | 39 +++++++++++++++++++------------------ proposals/array_concat.json | 9 +++++++-- proposals/array_modify.json | 5 ++++- 4 files changed, 35 insertions(+), 22 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 813157b3..f3dd43d8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -8,6 +8,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed +- Added better support for labeled arrays. Labels are not discarded in all cases anymore. Affected processes: + - `array_append` + - `array_concat` + - `array_modify` - Moved the `text_` processes to proposals as they are lacking implementations. - Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. diff --git a/proposals/array_append.json b/proposals/array_append.json index 59d8b178..fc5e6272 100644 --- a/proposals/array_append.json +++ b/proposals/array_append.json @@ -1,7 +1,7 @@ { "id": "array_append", "summary": "Append a value to an array", - "description": "Appends a value to the end of the array. Array labels get discarded from the array.", + "description": "Appends a new value to the end of the array, which may also include a new label for labeled arrays.", "categories": [ "arrays" ], @@ -23,6 +23,18 @@ "schema": { "description": "Any data type is allowed." } + }, + { + "name": "label", + "description": "If the given array is a labeled array, a new label for the new value should be given. If not given or `null`, the array index as string is used as the label. If in any case the label exists, a `LabelExists` exception is thrown.", + "optional": true, + "default": null, + "schema": { + "type": [ + "string", + "null" + ] + } } ], "returns": { @@ -34,6 +46,11 @@ } } }, + "exceptions": { + "LabelExists": { + "message": "An array element with the specified label already exists." + } + }, "examples": [ { "arguments": { @@ -49,21 +66,5 @@ 3 ] } - ], - "process_graph": { - "append": { - "process_id": "array_concat", - "arguments": { - "array1": { - "from_parameter": "data" - }, - "array2": [ - { - "from_parameter": "value" - } - ] - }, - "result": true - } - } -} \ No newline at end of file + ] +} diff --git a/proposals/array_concat.json b/proposals/array_concat.json index d095973a..2728ebfe 100644 --- a/proposals/array_concat.json +++ b/proposals/array_concat.json @@ -1,7 +1,7 @@ { "id": "array_concat", "summary": "Merge two arrays", - "description": "Concatenates two arrays into a single array by appending the second array to the first array. Array labels get discarded from both arrays before merging.", + "description": "Concatenates two arrays into a single array by appending the second array to the first array.\n\nArray labels are kept only if both given arrays are labeled. Otherwise, the labels get discarded from both arrays. The process fails with an `ArrayLabelConflict` exception if a label is present in both arrays. Conflicts must be resolved beforehand.", "categories": [ "arrays" ], @@ -37,6 +37,11 @@ } } }, + "exceptions": { + "ArrayLabelConflict": { + "message": "At least one label exists in both arrays and the conflict must be resolved before." 
+ }
+ },
 "examples": [
 {
 "description": "Concatenates two arrays containing different data type.",
@@ -58,4 +63,4 @@
 ]
 }
 ]
-}
\ No newline at end of file
+}
diff --git a/proposals/array_modify.json b/proposals/array_modify.json
index 6eaa6852..2ee02e16 100644
--- a/proposals/array_modify.json
+++ b/proposals/array_modify.json
@@ -1,7 +1,7 @@
 {
 "id": "array_modify",
 "summary": "Change the content of an array (remove, insert, update)",
- "description": "Modify an array by removing, inserting or updating elements. Updating can be seen as removing elements followed by inserting new elements (not necessarily the same number).\n\nAll labels get discarded and the array indices are always a sequence of numbers with the step size of 1 and starting at 0.",
+ "description": "Modify an array by removing, inserting or updating elements. Updating can be seen as removing elements followed by inserting new elements (not necessarily the same number).\n\nArray labels are kept only if both arrays given in `data` and `values` are labeled or one of them is empty. Otherwise, all labels get discarded and the array indices are a sequence of numbers with the step size of 1 and starting at 0. The process fails with an `ArrayLabelConflict` exception if a label is present in both arrays.",
 "categories": [
 "arrays"
 ],
@@ -58,6 +58,9 @@
 "exceptions": {
 "ArrayElementNotAvailable": {
 "message": "The array can't be modified as the given index is larger than the number of elements in the array."
+ },
+ "ArrayLabelConflict": {
+ "message": "At least one label exists in both arrays and the conflict must be resolved before."
 }
 },
 "examples": [

From f12ffb2a0a14f1aa45f452d615af2aa39ce4eef2 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Mon, 14 Feb 2022 22:12:03 +0100
Subject: [PATCH 022/117] `rename_labels`: Clarified that the
 `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if
 `target` is empty. #321
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 CHANGELOG.md | 3 ++-
 rename_labels.json | 8 ++++----
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f3dd43d8..1e948031 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Fixed

 - `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
+- `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)

 ## [1.2.0] - 2021-12-13

@@ -85,7 +86,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Moved the experimental process `run_udf_externally` to the proposals.
 - Moved the rarely used and implemented processes `cummax`, `cummin`, `cumproduct`, `cumsum`, `debug`, `filter_labels`, `load_result`, `load_uploaded_files`, `resample_cube_temporal` to the proposals.
 - Exception messages have been aligned always use ` instead of '. Tooling could render it with CommonMark.
-- `load_collection` and `mask_polygon`: Also support multi polygons instead of just polygons. 
[#237](https://github.com/Open-EO/openeo-processes/issues/237) +- `load_collection` and `mask_polygon`: Also support multi polygons instead of just polygons. [#237](https://github.com/Open-EO/openeo-processes/issues/237) - `run_udf` and `run_udf_externally`: Specify specific (extensible) protocols for UDF URIs. - `resample_cube_spatial` and `resample_spatial`: Aligned with GDAL and added `rms` and `sum` options to methods. Also added better descriptions. - `resample_cube_temporal`: Process has been simplified and only offers the nearest neighbor method now. The `process` parameter has been removed, the `dimension` parameter was made less restrictive, the parameter `valid_within` was added. [#194](https://github.com/Open-EO/openeo-processes/issues/194) diff --git a/rename_labels.json b/rename_labels.json index e1ec9138..1a018fe5 100644 --- a/rename_labels.json +++ b/rename_labels.json @@ -1,7 +1,7 @@ { "id": "rename_labels", "summary": "Rename dimension labels", - "description": "Renames the labels of the specified dimension in the data cube from `source` to `target`.\n\nIf the array for the source labels is empty (the default), the dimension labels are expected to be enumerated with zero-based numbering (0,1,2,3,...) so that the dimension labels directly map to the indices of the array specified for the parameter `target`. If the dimension labels are not enumerated and the `target` parameter is not specified, the `LabelsNotEnumerated` exception is thrown. The number of the source and target labels must be equal. Otherwise, the exception `LabelMismatch` is thrown.\n\nThis process doesn't change the order of the labels and their corresponding data.", + "description": "Renames the labels of the specified dimension in the data cube from `source` to `target`.\n\nIf the array for the source labels is empty (the default), the dimension labels are expected to be enumerated with zero-based numbering (0,1,2,3,...) so that the dimension labels directly map to the indices of the array specified for the parameter `target`. The number of the source and target labels must be equal. Otherwise, the exception `LabelMismatch` is thrown.\n\nThis process doesn't change the order of the labels and their corresponding data.", "categories": [ "cubes" ], @@ -23,7 +23,7 @@ }, { "name": "target", - "description": "The new names for the labels. The dimension labels in the data cube are expected to be enumerated if the parameter `target` is not specified. If a target dimension label already exists in the data cube, a `LabelExists` exception is thrown.", + "description": "The new names for the labels. If a target dimension label already exists in the data cube, a `LabelExists` exception is thrown.", "schema": { "type": "array", "items": { @@ -36,7 +36,7 @@ }, { "name": "source", - "description": "The names of the labels as they are currently in the data cube. The array defines an unsorted and potentially incomplete list of labels that should be renamed to the names available in the corresponding array elements in the parameter `target`. If one of the source dimension labels doesn't exist, the `LabelNotAvailable` exception is thrown. By default, the array is empty so that the dimension labels in the data cube are expected to be enumerated.", + "description": "The names of the labels as they are currently in the data cube. The array defines an unsorted and potentially incomplete list of labels that should be renamed to the names available in the corresponding array elements in the parameter `target`. 
By default, the array is empty so that the dimension labels in the data cube are expected to be enumerated.\n\nIf the dimension labels are not enumerated and the given array is empty, the `LabelsNotEnumerated` exception is thrown. If one of the source dimension labels doesn't exist, the `LabelNotAvailable` exception is thrown.",
 "schema": {
 "type": "array",
 "items": {
@@ -101,4 +101,4 @@
 "title": "Rename enumerated labels"
 }
 ]
-}
\ No newline at end of file
+}

From 4dbbb2957a5b7d9ed4534a19e2f1a805a3712fcb Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 15 Feb 2022 12:08:34 +0100
Subject: [PATCH 023/117] Fix typo

---
 meta/implementation.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/meta/implementation.md b/meta/implementation.md
index 864e02d1..8774f9e4 100644
--- a/meta/implementation.md
+++ b/meta/implementation.md
@@ -160,7 +160,7 @@ For the data types boolean, numbers, strings and null it is recommended to log t

 #### Arrays

-It is recommended to summarize arrays with as follows:
+It is recommended to summarize arrays as follows:
 ```js
 {
 "data": [3,1,6,4,8], // Return a reasonable excerpt of the data, e.g. the first 5 or 10 elements

From a4960c838b0d0c2cbfa985901761c3ece42b404e Mon Sep 17 00:00:00 2001
From: Stefaan Lippens
Date: Fri, 18 Feb 2022 10:09:44 +0100
Subject: [PATCH 024/117] Update broken links to file format GDAL codes (#325)

---
 save_result.json | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/save_result.json b/save_result.json
index 6f58db4a..905fe4b4 100644
--- a/save_result.json
+++ b/save_result.json
@@ -54,12 +54,12 @@
 "links": [
 {
 "rel": "about",
- "href": "https://www.gdal.org/formats_list.html",
+ "href": "https://gdal.org/drivers/raster/index.html",
 "title": "GDAL Raster Formats"
 },
 {
 "rel": "about",
- "href": "https://www.gdal.org/ogr_formats.html",
+ "href": "https://gdal.org/drivers/vector/index.html",
 "title": "OGR Vector Formats"
 }
 ]

From f2218d515cf44596f40c9853acec489dd74fa895 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 22 Feb 2022 13:32:11 +0100
Subject: [PATCH 025/117] `apply_neighborhood`: Parameter `overlap` was
 optional but had no default value and no schema for the default value
 defined.

---
 CHANGELOG.md | 1 +
 apply_neighborhood.json | 94 ++++++++++++++++++++++------------------
 tests/processes.test.js | 14 +++---
 3 files changed, 58 insertions(+), 51 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 1e948031..e887b833 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -18,6 +18,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 - `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
+- `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)

 ## [1.2.0] - 2021-12-13

diff --git a/apply_neighborhood.json b/apply_neighborhood.json
index 05fb117a..2cb7d7b6 100644
--- a/apply_neighborhood.json
+++ b/apply_neighborhood.json
@@ -98,52 +98,60 @@
 },
 {
 "name": "overlap",
- "description": "Overlap of neighborhoods along each dimension to avoid border effects.\n\nFor instance a temporal dimension can add 1 month before and after a neighborhood. 
In the spatial dimensions, this is often a number of pixels. The overlap specified is added before and after, so an overlap of 8 pixels will add 8 pixels on both sides of the window, so 16 in total.\n\nBe aware that large overlaps increase the need for computational resources and modifying overlapping data in subsequent operations have no effect.", + "description": "Overlap of neighborhoods along each dimension to avoid border effects. By default no overlap is provided.\n\nFor instance a temporal dimension can add 1 month before and after a neighborhood. In the spatial dimensions, this is often a number of pixels. The overlap specified is added before and after, so an overlap of 8 pixels will add 8 pixels on both sides of the window, so 16 in total.\n\nBe aware that large overlaps increase the need for computational resources and modifying overlapping data in subsequent operations have no effect.", "optional": true, - "schema": { - "type": "array", - "items": { - "type": "object", - "subtype": "chunk-size", - "required": [ - "dimension", - "value" - ], - "properties": { - "dimension": { - "type": "string" - }, - "value": { - "default": null, - "anyOf": [ - { - "type": "null", - "title": "No values" - }, - { - "type": "number", - "minimum": 0, - "description": "See the `unit` parameter for more information." - }, - { - "type": "string", - "subtype": "duration", - "description": "[ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), e.g. `P1D` for one day.", - "pattern": "^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$" - } - ] - }, - "unit": { - "type": "string", - "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). If no unit is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.", - "enum": [ - "px", - "m" - ] + "default": null, + "schema": [ + { + "title": "Without overlap", + "type": "null" + }, + { + "title": "With overlap", + "type": "array", + "items": { + "type": "object", + "subtype": "chunk-size", + "required": [ + "dimension", + "value" + ], + "properties": { + "dimension": { + "type": "string" + }, + "value": { + "default": null, + "anyOf": [ + { + "type": "null", + "title": "No values" + }, + { + "type": "number", + "minimum": 0, + "description": "See the `unit` parameter for more information." + }, + { + "type": "string", + "subtype": "duration", + "description": "[ISO 8601 duration](https://en.wikipedia.org/wiki/ISO_8601#Durations), e.g. `P1D` for one day.", + "pattern": "^(-?)P(?=\\d|T\\d)(?:(\\d+)Y)?(?:(\\d+)M)?(?:(\\d+)([DW]))?(?:T(?:(\\d+)H)?(?:(\\d+)M)?(?:(\\d+(?:\\.\\d+)?)S)?)?$" + } + ] + }, + "unit": { + "type": "string", + "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). 
If no unit is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.",
+ "enum": [
+ "px",
+ "m"
+ ]
+ }
+ }
+ }
+ }
+ ]
 },
 {
 "name": "context",
@@ -233,4 +241,4 @@
 "title": "Apply explained in the openEO documentation"
 }
 ]
-}
\ No newline at end of file
+}
diff --git a/tests/processes.test.js b/tests/processes.test.js
index 1d0d004c..ce5292f5 100644
--- a/tests/processes.test.js
+++ b/tests/processes.test.js
@@ -248,10 +248,14 @@ function checkParam(param, p, checkCbParams = true) {

 // Parameter flags
 expect(typeof param.optional === 'undefined' || typeof param.optional === 'boolean').toBeTruthy();
- // lint: don't specify defaults
+ // lint: don't specify default value "false" for optional
 expect(typeof param.optional === 'undefined' || param.optional === true).toBeTruthy();
 // lint: make sure there's no old required flag
 expect(typeof param.required === 'undefined').toBeTruthy();
+ // lint: require a default value if the parameter is optional
+ if (param.optional === true && !anyOfRequired.includes(p.id)) {
+ expect(param.default).toBeDefined();
+ }

 // Check flags (recommended / experimental)
 checkFlags(param);
@@ -260,13 +264,7 @@ function checkParam(param, p, checkCbParams = true) {
 expect(typeof param.schema).toBe('object');
 checkJsonSchema(jsv, param.schema);

- if (!checkCbParams) {
- // Parameters that are not required should define a default value
- if(param.optional === true && !anyOfRequired.includes(p.id)) {
- expect(param.default).toBeDefined();
- }
- }
- else {
+ if (checkCbParams) {
 // Checking that callbacks (process-graphs) define their parameters
 if (typeof param.schema === 'object' && param.schema.subtype === 'process-graph') {
 // lint: A callback without parameters is not very useful

From d0ce91fcd347360b907ea2d9589d7564a2c1e1e3 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 23 Feb 2022 11:36:42 +0100
Subject: [PATCH 026/117] `round`: Clarify that the rounding for ties applies
 not only for integers. #326 (#327)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Co-authored-by: Jan Jezeršek
---
 CHANGELOG.md | 1 +
 round.json | 18 ++++++++++++++++--
 2 files changed, 17 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index e887b833..66b93dbc 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -20,6 +20,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)
+- `round`: Clarify that the rounding for ties applies not only for integers. 
[#326](https://github.com/Open-EO/openeo-processes/issues/326) ## [1.2.0] - 2021-12-13 diff --git a/round.json b/round.json index dbdd1323..cb1ff9e0 100644 --- a/round.json +++ b/round.json @@ -1,7 +1,7 @@ { "id": "round", "summary": "Round to a specified precision", - "description": "Rounds a real number `x` to specified precision `p`.\n\nIf the fractional part of `x` is halfway between two integers, one of which is even and the other odd, then the even number is returned.\nThis behavior follows [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229). This kind of rounding is also called \"round to nearest (even)\" or \"banker's rounding\". It minimizes rounding errors that result from consistently rounding a midpoint value in a single direction.\n\nThe no-data value `null` is passed through and therefore gets propagated.", + "description": "Rounds a real number `x` to specified precision `p`.\n\nIf `x` is halfway between closest numbers of precision `p`, it is rounded to the closest even number of precision `p`.\nThis behavior follows [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) and is often called \"round to nearest (even)\" or \"banker's rounding\". It minimizes rounding errors that result from consistently rounding a midpoint value in a single direction.\n\nThe no-data value `null` is passed through and therefore gets propagated.", "categories": [ "math > rounding" ], @@ -68,6 +68,20 @@ }, "returns": -4 }, + { + "arguments": { + "x": 0.25, + "p": 1 + }, + "returns": 0.2 + }, + { + "arguments": { + "x": 0.35, + "p": 1 + }, + "returns": 0.4 + }, { "arguments": { "x": 1234.5, @@ -88,4 +102,4 @@ "title": "IEEE Standard 754-2019 for Floating-Point Arithmetic" } ] -} \ No newline at end of file +} From 56b98ae3f2fd68cff3815c3885af3361e7c187fb Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 25 Feb 2022 12:03:32 +0100 Subject: [PATCH 027/117] `apply_neighborhood`: Allow `null` as default value for units. (#328) --- CHANGELOG.md | 1 + apply_neighborhood.json | 40 ++++++++++++++++++++++++++++++---------- 2 files changed, 31 insertions(+), 10 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 66b93dbc..44096fe8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_modify` - Moved the `text_` processes to proposals as they are lacking implementations. - Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. +- `apply_neighborhood`: Allow `null` as default value for units. ### Fixed diff --git a/apply_neighborhood.json b/apply_neighborhood.json index 2cb7d7b6..4966f28e 100644 --- a/apply_neighborhood.json +++ b/apply_neighborhood.json @@ -85,11 +85,21 @@ ] }, "unit": { - "type": "string", - "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). If no unit is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.", - "enum": [ - "px", - "m" + "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). 
If no unit or `null` is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.", + "default": null, + "anyOf": [ + { + "title": "Default unit", + "type": "null" + }, + { + "title": "Specific unit", + "type": "string", + "enum": [ + "px", + "m" + ] + } ] } } @@ -141,11 +151,21 @@ ] }, "unit": { - "type": "string", - "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). If no unit is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.", - "enum": [ - "px", - "m" + "description": "The unit the values are given in, either in meters (`m`) or pixels (`px`). If no unit or `null` is given, uses the unit specified for the dimension or otherwise the default unit of the reference system.", + "default": null, + "anyOf": [ + { + "title": "Default unit", + "type": "null" + }, + { + "title": "Specific unit", + "type": "string", + "enum": [ + "px", + "m" + ] + } ] } } From 26ab5b9129e4be7213ec5b371b0bad51733d8f63 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 7 Mar 2022 16:58:47 +0100 Subject: [PATCH 028/117] Reverted: Moved the `text_` processes to proposals as they are lacking implementations. #329 --- CHANGELOG.md | 3 +-- proposals/text_begins.json => text_begins.json | 1 - proposals/text_concat.json => text_concat.json | 1 - proposals/text_contains.json => text_contains.json | 1 - proposals/text_ends.json => text_ends.json | 1 - 5 files changed, 1 insertion(+), 6 deletions(-) rename proposals/text_begins.json => text_begins.json (99%) rename proposals/text_concat.json => text_concat.json (99%) rename proposals/text_contains.json => text_contains.json (99%) rename proposals/text_ends.json => text_ends.json (99%) diff --git a/CHANGELOG.md b/CHANGELOG.md index 44096fe8..9a54e34b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,8 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_append` - `array_concat` - `array_modify` -- Moved the `text_` processes to proposals as they are lacking implementations. -- Renamed `text_merge` to `text_concat` for better alignment with `array_concat`. +- Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: Allow `null` as default value for units. 
### Fixed

diff --git a/proposals/text_begins.json b/text_begins.json
similarity index 99%
rename from proposals/text_begins.json
rename to text_begins.json
index 08851617..766d5f0f 100644
--- a/proposals/text_begins.json
+++ b/text_begins.json
@@ -6,7 +6,6 @@
 "texts",
 "comparison"
 ],
- "experimental": true,
 "parameters": [
 {
 "name": "data",
diff --git a/proposals/text_concat.json b/text_concat.json
similarity index 99%
rename from proposals/text_concat.json
rename to text_concat.json
index 83bb206c..698c1cbc 100644
--- a/proposals/text_concat.json
+++ b/text_concat.json
@@ -5,7 +5,6 @@
 "categories": [
 "texts"
 ],
- "experimental": true,
 "parameters": [
 {
 "name": "data",
diff --git a/proposals/text_contains.json b/text_contains.json
similarity index 99%
rename from proposals/text_contains.json
rename to text_contains.json
index ce723d38..9b78318f 100644
--- a/proposals/text_contains.json
+++ b/text_contains.json
@@ -6,7 +6,6 @@
 "texts",
 "comparison"
 ],
- "experimental": true,
 "parameters": [
 {
 "name": "data",
diff --git a/proposals/text_ends.json b/text_ends.json
similarity index 99%
rename from proposals/text_ends.json
rename to text_ends.json
index 7651ba57..f31e8e12 100644
--- a/proposals/text_ends.json
+++ b/text_ends.json
@@ -6,7 +6,6 @@
 "texts",
 "comparison"
 ],
- "experimental": true,
 "parameters": [
 {
 "name": "data",

From a4704b2fe35ad21225e359e1a1b84868c21f6442 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 8 Mar 2022 17:28:45 +0100
Subject: [PATCH 029/117] `array_interpolate_linear`: Return value was
 incorrectly specified as `number` or `null`. It must return an array instead.
 #333

---
 CHANGELOG.md | 1 +
 proposals/array_interpolate_linear.json | 11 +++++++----
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 9a54e34b..6f30839e 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -19,6 +19,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 - `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
+- `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333)
 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)
 - `round`: Clarify that the rounding for ties applies not only for integers. [#326](https://github.com/Open-EO/openeo-processes/issues/326)

diff --git a/proposals/array_interpolate_linear.json b/proposals/array_interpolate_linear.json
index f5fe90ec..03021919 100644
--- a/proposals/array_interpolate_linear.json
+++ b/proposals/array_interpolate_linear.json
@@ -26,10 +26,13 @@
 "returns": {
 "description": "An array with no-data values being replaced with interpolated values. 
If not at least 2 numerical values are available in the array, the array stays the same.", "schema": { - "type": [ - "number", - "null" - ] + "type": "array", + "items": { + "type": [ + "number", + "null" + ] + } } }, "examples": [ From e35081abbd444dd76f3137480b50d126056f593a Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 9 Mar 2022 16:11:04 +0100 Subject: [PATCH 030/117] Combine dimensions and vice-versa #308 (#316) --- CHANGELOG.md | 6 ++++ proposals/flatten_dimensions.json | 58 ++++++++++++++++++++++++++++++ proposals/unflatten_dimension.json | 58 ++++++++++++++++++++++++++++++ 3 files changed, 122 insertions(+) create mode 100644 proposals/flatten_dimensions.json create mode 100644 proposals/unflatten_dimension.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 6f30839e..6f573158 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft +### Added + +- New processes in proposal state: + - `flatten_dimensions` + - `unflatten_dimension` + ### Changed - Added better support for labeled arrays. Labels are not discarded in all cases anymore. Affected processes: diff --git a/proposals/flatten_dimensions.json b/proposals/flatten_dimensions.json new file mode 100644 index 00000000..3947e6a8 --- /dev/null +++ b/proposals/flatten_dimensions.json @@ -0,0 +1,58 @@ +{ + "id": "flatten_dimensions", + "summary": "Combine multiple dimensions into a single dimension", + "description": "Combines multiple given dimensions into a single dimension by flattening the values and merging the dimension labels with the given `label_separator`. Non-string dimension labels will be converted to strings. This process is the opposite of the process ``unflatten_dimensions()`` but executing both processes subsequently doesn't necessarily create a data cube that is equal to the original data cube.\n\nExample: Executing the process with a data cube with two dimensions `A` (labels: `2020` and `2021`) and `B` (labels: `B1` and `B2`) and the data `[[1,2],[3,4]]` and the parameters `dimensions` = `[A,B]` and `target_dimension` = `X` will result in a data cube with one dimension `X` (labels: `2020~B1`, `2020~B2`, `2021~B1` and `2021~B2`) and the data `[1,2,3,4]`.", + "categories": [ + "cubes" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A data cube.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + { + "name": "dimensions", + "description": "The names of the dimension to combine. The order of the array defines the order in which the dimension labels and values are combined.\n\nFails with a `DimensionNotAvailable` exception if at least one of the specified dimensions does not exist.", + "schema": { + "type": "array", + "items": { + "type": "string" + } + } + }, + { + "name": "target_dimension", + "description": "The name of a target dimension with a single dimension label to replace. 
If a dimension with the given name doesn't exist yet, it is created with the specified name and the type `other` (see ``add_dimension()``).", + "schema": { + "type": "string" + } + }, + { + "name": "label_separator", + "description": "The string that will be used as a separator for the concatenated dimension labels.\n\nTo unambiguously revert the dimension labels with the process ``explode_dimensions()``, the given string must not be contained in any of the dimension labels.", + "optional": true, + "default": "~", + "schema": { + "type": "string", + "minLength": 1 + } + } + ], + "returns": { + "description": "A data cube with the new shape. The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + "exceptions": { + "DimensionNotAvailable": { + "message": "A dimension with the specified name does not exist." + } + } +} diff --git a/proposals/unflatten_dimension.json b/proposals/unflatten_dimension.json new file mode 100644 index 00000000..40479b44 --- /dev/null +++ b/proposals/unflatten_dimension.json @@ -0,0 +1,58 @@ +{ + "id": "unflatten_dimension", + "summary": "Split a single dimensions into multiple dimensions", + "description": "Splits a single dimension into multiple dimensions by systematically extracting values and splitting the dimension labels by the given `label_separator`. This process is the opposite of the process ``flatten_dimensions()`` but executing both processes subsequently doesn't necessarily create a data cube that is equal to the original data cube.\n\nExample: Executing the process with a data cube with one dimension `X` (labels: `2020~B1`, `2020~B2`, `2021~B1` and `2021~B2`) and the data `[1,2,3,4]` and the parameters `dimension` = `X` and `target_dimensions` = `[A,B]` will result in a data cube with two dimensions `A` (labels: `2020` and `2021`) and B (labels: `B1` and `B2`) and the data `[[1,2],[3,4]]`.", + "categories": [ + "cubes" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A data cube that is consistently structured so that operation can execute flawlessly (e.g. the dimension labels need to contain the `label_separator` exactly 1 time for two target dimensions, 2 times for three target dimensions etc.).", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + { + "name": "dimension", + "description": "The name of the dimension to split. The order of the array defines the order in which the dimension labels and values are split.\n\nFails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", + "schema": { + "type": "string" + } + }, + { + "name": "target_dimensions", + "description": "The names of the target dimensions, each with a single dimension label to replace. Non-existing dimensions will be created with the specified name and the type `other` (see ``add_dimension()``).", + "schema": { + "type": "array", + "items": { + "type": "string" + } + } + }, + { + "name": "label_separator", + "description": "The string that will be used as a separator to split the dimension labels. Each label will be split at the first occurrence of the given string only.", + "optional": true, + "default": "~", + "schema": { + "type": "string", + "minLength": 1 + } + } + ], + "returns": { + "description": "A data cube with the new shape. 
The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.",
+ "schema": {
+ "type": "object",
+ "subtype": "raster-cube"
+ }
+ },
+ "exceptions": {
+ "DimensionNotAvailable": {
+ "message": "A dimension with the specified name does not exist."
+ }
+ }
+}

From bdbd6f6a9aafb3a9c2cbdc3ac42ef67665a27171 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 9 Mar 2022 16:16:32 +0100
Subject: [PATCH 031/117] Load/Save ML model #300 (#304)

---
 CHANGELOG.md | 2 ++
 meta/subtype-schemas.json | 8 +++++-
 proposals/load_ml_model.json | 53 ++++++++++++++++++++++++++++++++++++
 proposals/save_ml_model.json | 44 ++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+), 1 deletion(-)
 create mode 100644 proposals/load_ml_model.json
 create mode 100644 proposals/save_ml_model.json

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6f573158..9a0c13f1 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -10,7 +10,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 - New processes in proposal state:
 - `flatten_dimensions`
+ - `load_ml_model`
 - `unflatten_dimension`
+ - `save_ml_model`

 ### Changed

diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json
index 429252ef..3dc15b36 100644
--- a/meta/subtype-schemas.json
+++ b/meta/subtype-schemas.json
@@ -235,6 +235,12 @@
 }
 }
 },
+ "ml-model": {
+ "type": "object",
+ "subtype": "ml-model",
+ "title": "Machine Learning Model",
+ "description": "A machine learning model, accompanied by STAC metadata that implements the STAC ml-model extension."
+ },
 "output-format": {
 "type": "string",
 "subtype": "output-format",
@@ -429,4 +435,4 @@
 "description": "Year representation, as defined for `date-fullyear` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6)."
 }
 }
-}
\ No newline at end of file
+}
diff --git a/proposals/load_ml_model.json b/proposals/load_ml_model.json
new file mode 100644
index 00000000..16fd8412
--- /dev/null
+++ b/proposals/load_ml_model.json
@@ -0,0 +1,53 @@
+{
+ "id": "load_ml_model",
+ "summary": "Load a ML model",
+ "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``fit_random_forest()`` and ``save_ml_model()``.",
+ "categories": [
+ "machine learning",
+ "import"
+ ],
+ "experimental": true,
+ "parameters": [
+ {
+ "name": "id",
+ "description": "The STAC Item to load the machine learning model from. The STAC Item must implement the `ml-model` extension.",
+ "schema": [
+ {
+ "title": "URL",
+ "type": "string",
+ "format": "uri",
+ "subtype": "uri",
+ "pattern": "^https?://"
+ },
+ {
+ "title": "Batch Job ID",
+ "description": "Loading a model by batch job ID is possible only if a single model has been saved by the job. 
Otherwise, you have to load a specific model from a batch job by URL.", + "type": "string", + "subtype": "job-id", + "pattern": "^[\\w\\-\\.~]+$" + }, + { + "title": "User-uploaded Files", + "type": "string", + "subtype": "file-path", + "pattern": "^[^\r\n\\:'\"]+$" + } + ] + } + ], + "returns": { + "description": "A machine learning model to be used with machine learning processes such as ``predict_random_forest()``.", + "schema": { + "type": "object", + "subtype": "ml-model" + } + }, + "links": [ + { + "href": "https://github.com/stac-extensions/ml-model", + "title": "STAC ml-model extension", + "type": "text/html", + "rel": "about" + } + ] +} \ No newline at end of file diff --git a/proposals/save_ml_model.json b/proposals/save_ml_model.json new file mode 100644 index 00000000..5e9ea8b0 --- /dev/null +++ b/proposals/save_ml_model.json @@ -0,0 +1,44 @@ +{ + "id": "save_ml_model", + "summary": "Save a ML model", + "description": "Saves a machine learning model as part of a batch job.\n\nThe model will be accompanied by a separate STAC Item that implements the [ml-model extension](https://github.com/stac-extensions/ml-model).", + "categories": [ + "machine learning", + "import" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "The data to store as a machine learning model.", + "schema": { + "type": "object", + "subtype": "ml-model" + } + }, + { + "name": "options", + "description": "Additional parameters to create the file(s).", + "schema": { + "type": "object", + "additionalParameters": false + }, + "default": {}, + "optional": true + } + ], + "returns": { + "description": "Returns `false` if the process failed to store the model, `true` otherwise.", + "schema": { + "type": "boolean" + } + }, + "links": [ + { + "href": "https://github.com/stac-extensions/ml-model", + "title": "STAC ml-model extension", + "type": "text/html", + "rel": "about" + } + ] +} \ No newline at end of file From 63e3e9df0608a9585ec0dfd8e276e1f343170fad Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 9 Mar 2022 16:41:51 +0100 Subject: [PATCH 032/117] Fix broken references to processes, also test for it --- CHANGELOG.md | 1 + apply.json | 4 ++-- array_apply.json | 4 ++-- proposals/flatten_dimensions.json | 4 ++-- proposals/load_ml_model.json | 6 +++--- tests/processes.test.js | 10 +++++----- tests/testHelpers.js | 10 +++++++++- 7 files changed, 24 insertions(+), 15 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9a0c13f1..5dc80abc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -26,6 +26,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed - `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) +- `apply` and `array_apply`: Fixed broken references to the `absolute` process - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schena for the default value defined. - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333) - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. 
[#321](https://github.com/Open-EO/openeo-processes/issues/321) diff --git a/apply.json b/apply.json index a39292e0..d0be1e1d 100644 --- a/apply.json +++ b/apply.json @@ -16,7 +16,7 @@ }, { "name": "process", - "description": "A process that accepts and returns a single value and is applied on each individual value in the data cube. The process may consist of multiple sub-processes and could, for example, consist of processes such as ``abs()`` or ``linear_scale_range()``.", + "description": "A process that accepts and returns a single value and is applied on each individual value in the data cube. The process may consist of multiple sub-processes and could, for example, consist of processes such as ``absolute()`` or ``linear_scale_range()``.", "schema": { "type": "object", "subtype": "process-graph", @@ -70,4 +70,4 @@ "title": "Apply explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/array_apply.json b/array_apply.json index 15da28dc..194d2f35 100644 --- a/array_apply.json +++ b/array_apply.json @@ -18,7 +18,7 @@ }, { "name": "process", - "description": "A process that accepts and returns a single value and is applied on each individual value in the array. The process may consist of multiple sub-processes and could, for example, consist of processes such as ``abs()`` or ``linear_scale_range()``.", + "description": "A process that accepts and returns a single value and is applied on each individual value in the array. The process may consist of multiple sub-processes and could, for example, consist of processes such as ``absolute()`` or ``linear_scale_range()``.", "schema": { "type": "object", "subtype": "process-graph", @@ -106,4 +106,4 @@ "title": "Check for no-data values in arrays" } ] -} \ No newline at end of file +} diff --git a/proposals/flatten_dimensions.json b/proposals/flatten_dimensions.json index 3947e6a8..e4b9ced6 100644 --- a/proposals/flatten_dimensions.json +++ b/proposals/flatten_dimensions.json @@ -1,7 +1,7 @@ { "id": "flatten_dimensions", "summary": "Combine multiple dimensions into a single dimension", - "description": "Combines multiple given dimensions into a single dimension by flattening the values and merging the dimension labels with the given `label_separator`. Non-string dimension labels will be converted to strings. This process is the opposite of the process ``unflatten_dimensions()`` but executing both processes subsequently doesn't necessarily create a data cube that is equal to the original data cube.\n\nExample: Executing the process with a data cube with two dimensions `A` (labels: `2020` and `2021`) and `B` (labels: `B1` and `B2`) and the data `[[1,2],[3,4]]` and the parameters `dimensions` = `[A,B]` and `target_dimension` = `X` will result in a data cube with one dimension `X` (labels: `2020~B1`, `2020~B2`, `2021~B1` and `2021~B2`) and the data `[1,2,3,4]`.", + "description": "Combines multiple given dimensions into a single dimension by flattening the values and merging the dimension labels with the given `label_separator`. Non-string dimension labels will be converted to strings. 
This process is the opposite of the process ``unflatten_dimension()`` but executing both processes subsequently doesn't necessarily create a data cube that is equal to the original data cube.\n\nExample: Executing the process with a data cube with two dimensions `A` (labels: `2020` and `2021`) and `B` (labels: `B1` and `B2`) and the data `[[1,2],[3,4]]` and the parameters `dimensions` = `[A,B]` and `target_dimension` = `X` will result in a data cube with one dimension `X` (labels: `2020~B1`, `2020~B2`, `2021~B1` and `2021~B2`) and the data `[1,2,3,4]`.",
     "categories": [
         "cubes"
     ],
@@ -34,7 +34,7 @@
     },
     {
         "name": "label_separator",
-        "description": "The string that will be used as a separator for the concatenated dimension labels.\n\nTo unambiguously revert the dimension labels with the process ``explode_dimensions()``, the given string must not be contained in any of the dimension labels.",
+        "description": "The string that will be used as a separator for the concatenated dimension labels.\n\nTo unambiguously revert the dimension labels with the process ``unflatten_dimension()``, the given string must not be contained in any of the dimension labels.",
         "optional": true,
         "default": "~",
         "schema": {
diff --git a/proposals/load_ml_model.json b/proposals/load_ml_model.json
index 16fd8412..364dc5ee 100644
--- a/proposals/load_ml_model.json
+++ b/proposals/load_ml_model.json
@@ -1,7 +1,7 @@
 {
     "id": "load_ml_model",
     "summary": "Load a ML model",
-    "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``fit_random_forest()`` and ``save_ml_model()``.",
+    "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``save_ml_model()``.",
     "categories": [
         "machine learning",
         "import"
@@ -36,7 +36,7 @@
         }
     ],
     "returns": {
-        "description": "A machine learning model to be used with machine learning processes such as ``predict_random_forest()``.",
+        "description": "A machine learning model to be used with machine learning processes.",
         "schema": {
             "type": "object",
             "subtype": "ml-model"
@@ -50,4 +50,4 @@
             "rel": "about"
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/tests/processes.test.js b/tests/processes.test.js
index ce5292f5..1fcf5f02 100644
--- a/tests/processes.test.js
+++ b/tests/processes.test.js
@@ -74,7 +74,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => {
     expect(typeof p.description).toBe('string');
     // lint: Description should be longer than a summary
     expect(p.description.length).toBeGreaterThan(60);
-    checkDescription(p.description, p);
+    checkDescription(p.description, p, processIds);
   });
 
   test("Categories", () => {
@@ -112,7 +112,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => {
     expect(typeof p.returns.description).toBe('string');
     // lint: Description should not be empty
     expect(p.returns.description.length).toBeGreaterThan(0);
-    checkDescription(p.returns.description, p);
+    checkDescription(p.returns.description, p, processIds);
 
     // return value schema
     expect(p.returns.schema).not.toBeNull();
@@ -136,7 +136,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => {
 
       // exception description
      expect(typeof e.description === 'undefined' || typeof e.description === 'string').toBeTruthy();
-      checkDescription(e.description, p);
+      checkDescription(e.description, p, processIds);
 
       // exception http code
      if (typeof e.http !== 
'undefined') {
@@ -169,7 +169,7 @@ describe.each(processes)("%s", (file, p, fileContent, proposal) => {
 
       // example description
      expect(typeof example.description === 'undefined' || typeof example.description === 'string').toBeTruthy();
-      checkDescription(example.description, p);
+      checkDescription(example.description, p, processIds);
 
       // example process graph
      expect(example.process_graph).toBeUndefined();
@@ -244,7 +244,7 @@ function checkParam(param, p, checkCbParams = true) {
   expect(typeof param.description).toBe('string');
   // lint: Description should not be empty
   expect(param.description.length).toBeGreaterThan(0);
-  checkDescription(param.description, p);
+  checkDescription(param.description, p, processIds);
 
   // Parameter flags
   expect(typeof param.optional === 'undefined' || typeof param.optional === 'boolean').toBeTruthy();
diff --git a/tests/testHelpers.js b/tests/testHelpers.js
index 3f998088..418fd830 100644
--- a/tests/testHelpers.js
+++ b/tests/testHelpers.js
@@ -124,7 +124,7 @@ function normalizeString(str) {
   return str.replace(/\r\n|\r|\n/g, "\n").trim();
 }
 
-function checkDescription(text, p = null, commonmark = true) {
+function checkDescription(text, p = null, processIds = [], commonmark = true) {
   if (!text) {
     return;
   }
@@ -148,6 +148,14 @@ function checkDescription(text, p = null, commonmark = true) {
 
   // Check spelling
   checkSpelling(text, p);
+
+  // Check whether process references are referencing valid processes
+  if (Array.isArray(processIds) && processIds.length > 0) {
+    let matches = text.matchAll(/(?:^|[^\w`])``(\w+)\(\)``(?![\w`])/g);
+    for (let match of matches) {
+      expect(processIds).toContain(match[1]);
+    }
+  }
 }
 
 function checkSpelling(text, p = null) {

From e9bbfa1dad0bfb676651b734e63ce7ff3b5dcfda Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 9 Mar 2022 18:35:01 +0100
Subject: [PATCH 033/117] Processes for Random Forest (#306)

Co-authored-by: clausmichele

---
 proposals/fit_class_random_forest.json | 88 ++++++++++++++++++++++++++
 proposals/fit_regr_random_forest.json  | 88 ++++++++++++++++++++++++++
 proposals/load_ml_model.json           |  4 +-
 proposals/predict_random_forest.json   | 42 ++++++++++++
 tests/.words                           |  3 +-
 5 files changed, 222 insertions(+), 3 deletions(-)
 create mode 100644 proposals/fit_class_random_forest.json
 create mode 100644 proposals/fit_regr_random_forest.json
 create mode 100644 proposals/predict_random_forest.json

diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json
new file mode 100644
index 00000000..4e3ce59d
--- /dev/null
+++ b/proposals/fit_class_random_forest.json
@@ -0,0 +1,88 @@
+{
+    "id": "fit_class_random_forest",
+    "summary": "Train a random forest classification model",
+    "description": "Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001).",
+    "categories": [
+        "machine learning"
+    ],
+    "experimental": true,
+    "parameters": [
+        {
+            "name": "predictors",
+            "description": "The predictors for the classification model as a vector data cube. Aggregated to the features (vectors) of the target input variable.",
+            "schema": {
+                "type": "object",
+                "subtype": "vector-cube"
+            }
+        },
+        {
+            "name": "target",
+            "description": "The training sites for the classification model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. 
fractional forest canopy cover).",
+            "schema": {
+                "type": "object",
+                "subtype": "vector-cube"
+            }
+        },
+        {
+            "name": "training",
+            "description": "The amount of training data to be used in the classification, given as a fraction. The sampling will be chosen randomly through the data object. The remaining data will be used as test data for the validation.",
+            "schema": {
+                "type": "number",
+                "exclusiveMinimum": 0,
+                "maximum": 1
+            }
+        },
+        {
+            "name": "num_trees",
+            "description": "The number of trees built within the Random Forest classification.",
+            "optional": true,
+            "default": 100,
+            "schema": {
+                "type": "integer",
+                "minimum": 1
+            }
+        },
+        {
+            "name": "mtry",
+            "description": "Specifies how many split variables will be used at a node. Default value is `null`, which corresponds to the number of predictors divided by 3.",
+            "optional": true,
+            "default": null,
+            "schema": [
+                {
+                    "type": "integer",
+                    "minimum": 1
+                },
+                {
+                    "type": "null"
+                }
+            ]
+        },
+        {
+            "name": "seed",
+            "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.",
+            "optional": true,
+            "default": null,
+            "schema": {
+                "type": [
+                    "integer",
+                    "null"
+                ]
+            }
+        }
+    ],
+    "returns": {
+        "description": "A model object that can be saved with ``save_ml_model()`` and restored with ``load_ml_model()``.",
+        "schema": {
+            "type": "object",
+            "subtype": "ml-model"
+        }
+    },
+    "links": [
+        {
+            "href": "https://doi.org/10.1023/A:1010933404324",
+            "title": "Breiman (2001): Random Forests",
+            "type": "text/html",
+            "rel": "about"
+        }
+    ]
+}
\ No newline at end of file
diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json
new file mode 100644
index 00000000..9a71b5b4
--- /dev/null
+++ b/proposals/fit_regr_random_forest.json
@@ -0,0 +1,88 @@
+{
+    "id": "fit_regr_random_forest",
+    "summary": "Train a random forest regression model",
+    "description": "Executes the fit of a random forest regression based on the user input of target and predictors. The Random Forest regression model is based on the approach by Breiman (2001).",
+    "categories": [
+        "machine learning"
+    ],
+    "experimental": true,
+    "parameters": [
+        {
+            "name": "predictors",
+            "description": "The predictors for the regression model as a vector data cube. Aggregated to the features (vectors) of the target input variable.",
+            "schema": {
+                "type": "object",
+                "subtype": "vector-cube"
+            }
+        },
+        {
+            "name": "target",
+            "description": "The training sites for the regression model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).",
+            "schema": {
+                "type": "object",
+                "subtype": "vector-cube"
+            }
+        },
+        {
+            "name": "training",
+            "description": "The amount of training data to be used in the regression, given as a fraction. The sampling will be chosen randomly through the data object. The remaining data will be used as test data for the validation.",
+            "schema": {
+                "type": "number",
+                "exclusiveMinimum": 0,
+                "maximum": 1
+            }
+        },
+        {
+            "name": "num_trees",
+            "description": "The number of trees built within the Random Forest regression.",
+            "optional": true,
+            "default": 100,
+            "schema": {
+                "type": "integer",
+                "minimum": 1
+            }
+        },
+        {
+            "name": "mtry",
+            "description": "Specifies how many split variables will be used at a node. 
Default value is `null`, which corresponds to the number of predictors divided by 3.",
+            "optional": true,
+            "default": null,
+            "schema": [
+                {
+                    "type": "integer",
+                    "minimum": 1
+                },
+                {
+                    "type": "null"
+                }
+            ]
+        },
+        {
+            "name": "seed",
+            "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.",
+            "optional": true,
+            "default": null,
+            "schema": {
+                "type": [
+                    "integer",
+                    "null"
+                ]
+            }
+        }
+    ],
+    "returns": {
+        "description": "A model object that can be saved with ``save_ml_model()`` and restored with ``load_ml_model()``.",
+        "schema": {
+            "type": "object",
+            "subtype": "ml-model"
+        }
+    },
+    "links": [
+        {
+            "href": "https://doi.org/10.1023/A:1010933404324",
+            "title": "Breiman (2001): Random Forests",
+            "type": "text/html",
+            "rel": "about"
+        }
+    ]
+}
\ No newline at end of file
diff --git a/proposals/load_ml_model.json b/proposals/load_ml_model.json
index 364dc5ee..445b5f2f 100644
--- a/proposals/load_ml_model.json
+++ b/proposals/load_ml_model.json
@@ -1,7 +1,7 @@
 {
     "id": "load_ml_model",
     "summary": "Load a ML model",
-    "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``save_ml_model()``.",
+    "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``fit_regr_random_forest()`` and ``save_ml_model()``.",
     "categories": [
         "machine learning",
         "import"
@@ -36,7 +36,7 @@
         }
     ],
     "returns": {
-        "description": "A machine learning model to be used with machine learning processes.",
+        "description": "A machine learning model to be used with machine learning processes such as ``predict_random_forest()``.",
         "schema": {
             "type": "object",
             "subtype": "ml-model"
diff --git a/proposals/predict_random_forest.json b/proposals/predict_random_forest.json
new file mode 100644
index 00000000..46103938
--- /dev/null
+++ b/proposals/predict_random_forest.json
@@ -0,0 +1,42 @@
+{
+    "id": "predict_random_forest",
+    "summary": "Predict values from a Random Forest model",
+    "description": "Applies a Random Forest machine learning model to an array and predicts a value for it.",
+    "categories": [
+        "machine learning",
+        "reducer"
+    ],
+    "experimental": true,
+    "parameters": [
+        {
+            "name": "data",
+            "description": "An array of numbers.",
+            "schema": {
+                "type": "array",
+                "items": {
+                    "type": [
+                        "number",
+                        "null"
+                    ]
+                }
+            }
+        },
+        {
+            "name": "model",
+            "description": "A model object that can be trained with the processes ``fit_regr_random_forest()`` (regression) and ``fit_class_random_forest()`` (classification).",
+            "schema": {
+                "type": "object",
+                "subtype": "ml-model"
+            }
+        }
+    ],
+    "returns": {
+        "description": "The predicted value. 
Returns `null` if any of the given values in the array is a no-data value.",
+        "schema": {
+            "type": [
+                "number",
+                "null"
+            ]
+        }
+    }
+}
\ No newline at end of file
diff --git a/tests/.words b/tests/.words
index 95a83c72..66152744 100644
--- a/tests/.words
+++ b/tests/.words
@@ -37,4 +37,5 @@ gdalwarp
 Lanczos
 sinc
 interpolants
-Hyndman
\ No newline at end of file
+Breiman
+Hyndman

From 24f3a1a223c5df417ef708bced6de65e5f1f91fc Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Thu, 10 Mar 2022 10:41:49 +0100
Subject: [PATCH 034/117] Improvements for (un)flatten_dimension(s) and aggregate_spatial #336 (#337)

---
 CHANGELOG.md                       |  4 +++-
 aggregate_spatial.json             |  5 ++++-
 proposals/flatten_dimensions.json  |  7 +++++--
 proposals/unflatten_dimension.json | 10 +++++++---
 4 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5dc80abc..960c9fdd 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -25,7 +25,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
-- `aggregate_spatial`: Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
+- `aggregate_spatial`:
+  - Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270)
+  - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists.
 - `apply` and `array_apply`: Fixed broken references to the `absolute` process
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
diff --git a/aggregate_spatial.json b/aggregate_spatial.json
index 9ba2ce9e..4020610c 100644
--- a/aggregate_spatial.json
+++ b/aggregate_spatial.json
@@ -60,7 +60,7 @@
     },
     {
         "name": "target_dimension",
-        "description": "The new dimension name to be used for storing the results. Defaults to `result`.",
+        "description": "The name of a new dimension that is used to store the results. A new dimension will be created with the given name and type `other` (see ``add_dimension()``). Defaults to the dimension name `result`. Fails with a `TargetDimensionExists` exception if a dimension with the specified name exists.",
         "schema": {
             "type": "string"
         },
@@ -87,6 +87,9 @@
     "exceptions": {
         "TooManyDimensions": {
             "message": "The number of dimensions must be reduced to three for `aggregate_spatial`."
+        },
+        "TargetDimensionExists": {
+            "message": "A dimension with the specified target dimension name already exists."
         }
     },
     "links": [
diff --git a/proposals/flatten_dimensions.json b/proposals/flatten_dimensions.json
index 3947e6a8..05e54212 100644
--- a/proposals/flatten_dimensions.json
+++ b/proposals/flatten_dimensions.json
@@ -17,7 +17,7 @@
     },
     {
         "name": "dimensions",
-        "description": "The names of the dimension to combine. The order of the array defines the order in which the dimension labels and values are combined.\n\nFails with a `DimensionNotAvailable` exception if at least one of the specified dimensions does not exist.",
+        "description": "The names of the dimensions to combine. The order of the array defines the order in which the dimension labels and values are combined (see the example in the process description). 
Fails with a `DimensionNotAvailable` exception if at least one of the specified dimensions does not exist.",
         "schema": {
             "type": "array",
             "items": {
@@ -27,7 +27,7 @@
     },
     {
         "name": "target_dimension",
-        "description": "The name of a target dimension with a single dimension label to replace. If a dimension with the given name doesn't exist yet, it is created with the specified name and the type `other` (see ``add_dimension()``).",
+        "description": "The name of the new target dimension. A new dimension will be created with the given name and type `other` (see ``add_dimension()``). Fails with a `TargetDimensionExists` exception if a dimension with the specified name exists.",
         "schema": {
             "type": "string"
         }
@@ -53,6 +53,9 @@
     "exceptions": {
         "DimensionNotAvailable": {
             "message": "A dimension with the specified name does not exist."
+        },
+        "TargetDimensionExists": {
+            "message": "A dimension with the specified target dimension name already exists."
         }
     }
 }
diff --git a/proposals/unflatten_dimension.json b/proposals/unflatten_dimension.json
index 40479b44..1cbf2d1d 100644
--- a/proposals/unflatten_dimension.json
+++ b/proposals/unflatten_dimension.json
@@ -17,16 +17,17 @@
     },
     {
         "name": "dimension",
-        "description": "The name of the dimension to split. The order of the array defines the order in which the dimension labels and values are split.\n\nFails with a `DimensionNotAvailable` exception if the specified dimension does not exist.",
+        "description": "The name of the dimension to split.",
         "schema": {
             "type": "string"
         }
     },
     {
         "name": "target_dimensions",
-        "description": "The names of the target dimensions, each with a single dimension label to replace. Non-existing dimensions will be created with the specified name and the type `other` (see ``add_dimension()``).",
+        "description": "The names of the new target dimensions. New dimensions will be created with the given names and type `other` (see ``add_dimension()``). Fails with a `TargetDimensionExists` exception if any of the dimensions exists.\n\nThe order of the array defines the order in which the dimensions and dimension labels are added to the data cube (see the example in the process description).",
         "schema": {
             "type": "array",
+            "minItems": 2,
             "items": {
                 "type": "string"
             }
@@ -34,7 +34,7 @@
     },
     {
         "name": "label_separator",
-        "description": "The string that will be used as a separator to split the dimension labels. Each label will be split at the first occurrence of the given string only.",
+        "description": "The string that will be used as a separator to split the dimension labels.",
         "optional": true,
         "default": "~",
         "schema": {
@@ -53,6 +54,9 @@
     "exceptions": {
         "DimensionNotAvailable": {
             "message": "A dimension with the specified name does not exist."
+        },
+        "TargetDimensionExists": {
+            "message": "A dimension with the specified target dimension name already exists." 
        }
     }
 }

From 1bb153c5a8690628c6134d968865cd800139af71 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 15 Mar 2022 11:59:52 +0100
Subject: [PATCH 035/117] Fine-tune rename_labels #335 (#340)

---
 rename_labels.json | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/rename_labels.json b/rename_labels.json
index 1a018fe5..1e3fda8d 100644
--- a/rename_labels.json
+++ b/rename_labels.json
@@ -1,7 +1,7 @@
 {
     "id": "rename_labels",
     "summary": "Rename dimension labels",
-    "description": "Renames the labels of the specified dimension in the data cube from `source` to `target`.\n\nIf the array for the source labels is empty (the default), the dimension labels are expected to be enumerated with zero-based numbering (0,1,2,3,...) so that the dimension labels directly map to the indices of the array specified for the parameter `target`. The number of the source and target labels must be equal. Otherwise, the exception `LabelMismatch` is thrown.\n\nThis process doesn't change the order of the labels and their corresponding data.",
+    "description": "Renames the labels of the specified dimension in the data cube from `source` to `target`.\n\nIf the array for the source labels is empty (the default), the dimension labels are expected to be enumerated with zero-based numbering (0,1,2,3,...) so that the dimension labels directly map to the indices of the array specified for the parameter `target`. Otherwise, the number of the source and target labels must be equal. If none of these requirements is fulfilled, the `LabelMismatch` exception is thrown.\n\nThis process doesn't change the order of the labels and their corresponding data.",
     "categories": [
         "cubes"
     ],
@@ -23,7 +23,7 @@
     },
     {
         "name": "target",
-        "description": "The new names for the labels. If a target dimension label already exists in the data cube, a `LabelExists` exception is thrown.",
+        "description": "The new names for the labels.\n\nIf a target dimension label already exists in the data cube, a `LabelExists` exception is thrown.",
         "schema": {
             "type": "array",
             "items": {
@@ -36,7 +36,7 @@
     },
     {
         "name": "source",
-        "description": "The names of the labels as they are currently in the data cube. The array defines an unsorted and potentially incomplete list of labels that should be renamed to the names available in the corresponding array elements in the parameter `target`. By default, the array is empty so that the dimension labels in the data cube are expected to be enumerated.\n\nIf the dimension labels are not enumerated and the given array is empty, the `LabelsNotEnumerated` exception is thrown. If one of the source dimension labels doesn't exist, the `LabelNotAvailable` exception is thrown.",
+        "description": "The original names of the labels to be renamed to corresponding array elements in the parameter `target`. It is allowed to only specify a subset of labels to rename, as long as the `target` and `source` parameters have the same length. The order of the labels doesn't need to match the order of the dimension labels in the data cube. By default, the array is empty so that the dimension labels in the data cube are expected to be enumerated.\n\nIf the dimension labels are not enumerated and the given array is empty, the `LabelsNotEnumerated` exception is thrown. 
If one of the source dimension labels doesn't exist, the `LabelNotAvailable` exception is thrown.", "schema": { "type": "array", "items": { From 536508c67ec090b0fe755cb59328402ca7733f4e Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 15 Mar 2022 15:04:03 +0100 Subject: [PATCH 036/117] Add vector_to_points #313 (#315) Co-authored-by: clausmichele --- CHANGELOG.md | 2 + proposals/vector_to_random_points.json | 94 +++++++++++++++++++++++++ proposals/vector_to_regular_points.json | 50 +++++++++++++ 3 files changed, 146 insertions(+) create mode 100644 proposals/vector_to_random_points.json create mode 100644 proposals/vector_to_regular_points.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 960c9fdd..86946f3f 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `load_ml_model` - `unflatten_dimension` - `save_ml_model` + - `vector_to_random_points` + - `vector_to_regular_points` ### Changed diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json new file mode 100644 index 00000000..e71c5c6e --- /dev/null +++ b/proposals/vector_to_random_points.json @@ -0,0 +1,94 @@ +{ + "id": "vector_to_random_points", + "summary": "Sample random points from geometries", + "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry.\n\nThrows a `CountsMissing` exception if `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`).", + "categories": [ + "cubes", + "vector" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "Input geometries for sample extraction.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", + "schema": [ + { + "type": "object", + "subtype": "geojson" + }, + { + "type": "object", + "subtype": "vector-cube" + } + ] + }, + { + "name": "geometry_count", + "description": "The maximum number of points to compute per geometry. Defaults to a maximum of one point per geometry.\n\nPoints in the input geometries can be selected only once by the sampling.", + "optional": true, + "default": 1, + "schema": [ + { + "type": "integer", + "minimum": 1 + }, + { + "title": "Unrestricted", + "type": "null" + } + ] + }, + { + "name": "total_count", + "description": "The maximum number of points to compute overall.\n\nThrows a `CountMismatch` exception if the specified value is less than the provided number of geometries.", + "optional": true, + "default": null, + "schema": [ + { + "type": "integer", + "minimum": 1 + }, + { + "title": "Unrestricted", + "type": "null" + } + ] + }, + { + "name": "group", + "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given. Vector properties are preserved.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry. Vector properties are *not* preserved.", + "optional": true, + "default": true, + "schema": { + "type": "boolean" + } + }, + { + "name": "seed", + "description": "A randomization seed to use for random sampling. 
If not given or `null`, no seed is used and results may differ on subsequent use.",
+            "optional": true,
+            "default": null,
+            "schema": {
+                "type": [
+                    "integer",
+                    "null"
+                ]
+            }
+        }
+    ],
+    "returns": {
+        "description": "Returns a vector data cube with the sampled points.",
+        "schema": {
+            "type": "object",
+            "subtype": "vector-cube"
+        }
+    },
+    "exceptions": {
+        "CountsMissing": {
+            "message": "No maximum count is set per geometry and in total."
+        },
+        "CountMismatch": {
+            "message": "The total number of points is lower than the number of geometries."
+        }
+    }
+}
diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json
new file mode 100644
index 00000000..e5000361
--- /dev/null
+++ b/proposals/vector_to_regular_points.json
@@ -0,0 +1,50 @@
+{
+    "id": "vector_to_regular_points",
+    "summary": "Sample regular points from geometries",
+    "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries.",
+    "categories": [
+        "cubes",
+        "vector"
+    ],
+    "experimental": true,
+    "parameters": [
+        {
+            "name": "data",
+            "description": "Input geometries for sample extraction.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).",
+            "schema": [
+                {
+                    "type": "object",
+                    "subtype": "geojson"
+                },
+                {
+                    "type": "object",
+                    "subtype": "vector-cube"
+                }
+            ]
+        },
+        {
+            "name": "distance",
+            "description": "Defines the minimum distance in the unit of the reference system that is required between two samples generated *inside* a single geometry.\n\n- For **polygons**, the distance defines the cell sizes of a regular grid that starts at the upper-left bound of each polygon. The centroid of each cell is then a sample point. If the centroid is not enclosed in the polygon, no point is sampled. If no point can be sampled for the geometry at all, the first coordinate of the geometry is returned as point.\n- For **lines** (line strings), the sampling starts with a point at the first coordinate of the line and then walks along the line and samples a new point each time the distance to the previous point has been reached again.\n- For **points**, the point is returned as given.",
+            "schema": {
+                "type": "number",
+                "exclusiveMinimum": 0
+            }
+        },
+        {
+            "name": "group",
+            "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given. Vector properties are preserved.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry. 
Vector properties are *not* preserved.",
+            "optional": true,
+            "default": true,
+            "schema": {
+                "type": "boolean"
+            }
+        }
+    ],
+    "returns": {
+        "description": "Returns a vector data cube with the sampled points.",
+        "schema": {
+            "type": "object",
+            "subtype": "vector-cube"
+        }
+    }
+}

From f4d5d03b04c62cf9fd0f8184916fee48d2a2657f Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 15 Mar 2022 15:09:02 +0100
Subject: [PATCH 037/117] Add vector_buffer (#314)

---
 CHANGELOG.md                 |  1 +
 proposals/vector_buffer.json | 42 ++++++++++++++++++++++++++++++++++++
 2 files changed, 43 insertions(+)
 create mode 100644 proposals/vector_buffer.json

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 86946f3f..77e80ca4 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - `load_ml_model`
   - `unflatten_dimension`
   - `save_ml_model`
+  - `vector_buffer`
   - `vector_to_random_points`
   - `vector_to_regular_points`
 
diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json
new file mode 100644
index 00000000..204a54b7
--- /dev/null
+++ b/proposals/vector_buffer.json
@@ -0,0 +1,42 @@
+{
+    "id": "vector_buffer",
+    "summary": "Buffer geometries by distance",
+    "description": "Buffers each input geometry by a given distance, which can either expand (dilate) or shrink (erode) the geometry. Buffers can be applied to points, lines and polygons, but the results are always polygons. Multi-part types (e.g. `MultiPoint`) are also allowed.",
+    "categories": [
+        "vector"
+    ],
+    "experimental": true,
+    "parameters": [
+        {
+            "name": "geometries",
+            "description": "Geometries to apply the buffer on. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).",
+            "schema": [
+                {
+                    "type": "object",
+                    "subtype": "geojson"
+                },
+                {
+                    "type": "object",
+                    "subtype": "vector-cube"
+                }
+            ]
+        },
+        {
+            "name": "distance",
+            "description": "The distance of the buffer in the unit of the spatial reference system. A positive distance expands the geometries and results in outward buffering (dilation) while a negative distance shrinks the geometries and results in inward buffering (erosion).",
+            "schema": {
+                "type": "number",
+                "not": {
+                    "const": 0
+                }
+            }
+        }
+    ],
+    "returns": {
+        "description": "Returns a vector data cube with the computed new geometries.",
+        "schema": {
+            "type": "object",
+            "subtype": "vector-cube"
+        }
+    }
+}

From e4df864846e833dda4669e190669a2db5b365c62 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 16 Mar 2022 13:12:50 +0100
Subject: [PATCH 038/117] `array_contains` and `array_find`: Clarify that giving `null` as `value` always returns `false` or `null` respectively, also fixed the incorrect examples. #348

---
 CHANGELOG.md        | 1 +
 array_contains.json | 6 +++---
 array_find.json     | 6 +++---
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 77e80ca4..c3ed5504 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -33,6 +33,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
   - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists. 
 - `apply` and `array_apply`: Fixed broken references to the `absolute` process
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
+- `array_contains` and `array_find`: Clarify that giving `null` as `value` always returns `false` or `null` respectively, also fixed the incorrect examples. [#348](https://github.com/Open-EO/openeo-processes/issues/348)
 - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333)
diff --git a/array_contains.json b/array_contains.json
index 745b62b3..a61622bc 100644
--- a/array_contains.json
+++ b/array_contains.json
@@ -20,7 +20,7 @@
     },
     {
         "name": "value",
-        "description": "Value to find in `data`.",
+        "description": "Value to find in `data`. If the value is `null`, this process always returns `false`.",
         "schema": {
             "description": "Any data type is allowed."
         }
@@ -75,7 +75,7 @@
                 ],
                 "value": null
             },
-            "returns": true
+            "returns": false
         },
         {
             "arguments": {
@@ -167,4 +167,4 @@
             "result": true
         }
     }
-}
\ No newline at end of file
+}
diff --git a/array_find.json b/array_find.json
index c95f2628..a579a428 100644
--- a/array_find.json
+++ b/array_find.json
@@ -19,7 +19,7 @@
     },
     {
         "name": "value",
-        "description": "Value to find in `data`.",
+        "description": "Value to find in `data`. If the value is `null`, this process always returns `null`.",
         "schema": {
             "description": "Any data type is allowed."
         }
@@ -106,7 +106,7 @@
                 ],
                 "value": null
             },
-            "returns": 1
+            "returns": null
         },
         {
             "arguments": {
@@ -168,4 +168,4 @@
             "title": "Find no-data values in arrays"
         }
     ]
-}
\ No newline at end of file
+}

From e231b39528784863c306be2caab89b257d5b7a8a Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Mon, 21 Mar 2022 10:54:28 +0100
Subject: [PATCH 039/117] default for train/validation split in fit_class_random_forest #350

---
 proposals/fit_class_random_forest.json | 4 +++-
 proposals/fit_regr_random_forest.json  | 4 +++-
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json
index 4e3ce59d..e038a57d 100644
--- a/proposals/fit_class_random_forest.json
+++ b/proposals/fit_class_random_forest.json
@@ -26,6 +26,8 @@
         {
             "name": "training",
             "description": "The amount of training data to be used in the classification, given as a fraction. The sampling will be chosen randomly through the data object. The remaining data will be used as test data for the validation.",
+            "optional": true,
+            "default": 0.8,
             "schema": {
                 "type": "number",
                 "exclusiveMinimum": 0,
                 "maximum": 1
             }
@@ -85,4 +87,4 @@
             "rel": "about"
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json
index 9a71b5b4..7e2ad58e 100644
--- a/proposals/fit_regr_random_forest.json
+++ b/proposals/fit_regr_random_forest.json
@@ -26,6 +26,8 @@
         {
             "name": "training",
             "description": "The amount of training data to be used in the regression, given as a fraction. The sampling will be chosen randomly through the data object. 
The remaining data will be used as test data for the validation.", + "optional": true, + "default": 0.8, "schema": { "type": "number", "exclusiveMinimum": 0, @@ -85,4 +87,4 @@ "rel": "about" } ] -} \ No newline at end of file +} From 5eea6522105d739d5d0b178354b0d806e5a094e3 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 21 Mar 2022 11:10:52 +0100 Subject: [PATCH 040/117] fit_class_random_forest: better name for mtry parameter #339 --- proposals/fit_regr_random_forest.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json index 7e2ad58e..d2f6d7bc 100644 --- a/proposals/fit_regr_random_forest.json +++ b/proposals/fit_regr_random_forest.json @@ -45,7 +45,7 @@ } }, { - "name": "mtry", + "name": "max_variables", "description": "Specifies how many split variables will be used at a node. Default value is `null`, which corresponds to the number of predictors divided by 3.", "optional": true, "default": null, From b5c30e74621edacf468d374c68da10aa296e20fb Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Mon, 21 Mar 2022 15:17:55 +0100 Subject: [PATCH 041/117] minor typo in is_valid() (#353) --- is_valid.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/is_valid.json b/is_valid.json index 51924de4..f7d68b63 100644 --- a/is_valid.json +++ b/is_valid.json @@ -1,7 +1,7 @@ { "id": "is_valid", "summary": "Value is valid data", - "description": "Checks whether the specified value `x` is valid. The following values are considered valid:\n\n* Any finite numerical value (integers and floating-point numbers). The definition of finite numbers follows the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) and excludes the special value `NaN` (not a number).\n* Any other value that is not a no-data value according to ``is_nodata()`. Thus all arrays, objects and strings are valid, regardless of their content.", + "description": "Checks whether the specified value `x` is valid. The following values are considered valid:\n\n* Any finite numerical value (integers and floating-point numbers). The definition of finite numbers follows the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) and excludes the special value `NaN` (not a number).\n* Any other value that is not a no-data value according to ``is_nodata()``. Thus all arrays, objects and strings are valid, regardless of their content.", "categories": [ "comparison" ], @@ -56,4 +56,4 @@ "title": "IEEE Standard 754-2008 for Floating-Point Arithmetic" } ] -} \ No newline at end of file +} From b81e96b454b808e32aa94128ba71daf5b06daa19 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 23 Mar 2022 13:45:48 +0100 Subject: [PATCH 042/117] Clarifications for vector_to_*_points (#352) * Fixes #344 and #347 --- proposals/vector_to_random_points.json | 11 ++++------- proposals/vector_to_regular_points.json | 4 ++-- 2 files changed, 6 insertions(+), 9 deletions(-) diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json index e71c5c6e..afe340ef 100644 --- a/proposals/vector_to_random_points.json +++ b/proposals/vector_to_random_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_random_points", "summary": "Sample random points from geometries", - "description": "Generate a vector data cube of points by sampling random points from input geometries. 
At least one point is sampled per input geometry.\n\nThrows a `CountsMissing` exception if `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`).", + "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry. Vector properties are preserved.\n\nIf `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`, which is the default), one sample per geometry is used.", "categories": [ "cubes", "vector" @@ -24,9 +24,9 @@ }, { "name": "geometry_count", - "description": "The maximum number of points to compute per geometry. Defaults to a maximum of one point per geometry.\n\nPoints in the input geometries can be selected only once by the sampling.", + "description": "The maximum number of points to compute per geometry.\n\nPoints in the input geometries can be selected only once by the sampling.", "optional": true, - "default": 1, + "default": null, "schema": [ { "type": "integer", @@ -56,7 +56,7 @@ }, { "name": "group", - "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given. Vector properties are preserved.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry. Vector properties are *not* preserved.", + "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given which keeps the original identifier if present.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry without identifier.", "optional": true, "default": true, "schema": { @@ -84,9 +84,6 @@ } }, "exceptions": { - "CountsMissing": { - "message": "No maximum count is set per geometry and in total." - }, "CountMismatch": { "message": "The total number of points is lower than the number of geometries." } diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index e5000361..3fd105f6 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_regular_points", "summary": "Sample regular points from geometries", - "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries.", + "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries. Vector properties are preserved.", "categories": [ "cubes", "vector" @@ -32,7 +32,7 @@ }, { "name": "group", - "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given. Vector properties are preserved.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry. 
Vector properties are *not* preserved.",
+            "description": "Specifies whether the sampled points should be grouped by input geometry (default) or be generated as independent points.\n\n* If the sampled points are grouped, the process generates a `MultiPoint` per geometry given which keeps the original identifier if present.\n* Otherwise, each sampled point is generated as a distinct `Point` geometry without identifier.",
             "optional": true,
             "default": true,
             "schema": {

From 7727689a17271c066f76f2aabebe89c0289e50b6 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 23 Mar 2022 15:40:36 +0100
Subject: [PATCH 043/117] drop training/validation split #354

---
 proposals/fit_class_random_forest.json | 13 +------------
 proposals/fit_regr_random_forest.json  | 13 +------------
 2 files changed, 2 insertions(+), 24 deletions(-)

diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json
index e038a57d..20eadd6b 100644
--- a/proposals/fit_class_random_forest.json
+++ b/proposals/fit_class_random_forest.json
@@ -1,7 +1,7 @@
 {
     "id": "fit_class_random_forest",
     "summary": "Train a random forest classification model",
-    "description": "Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001).",
+    "description": "Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001). Training/validation split must be done before.",
     "categories": [
         "machine learning"
     ],
@@ -23,17 +23,6 @@
                 "subtype": "vector-cube"
             }
         },
-        {
-            "name": "training",
-            "description": "The amount of training data to be used in the classification, given as a fraction. The sampling will be chosen randomly through the data object. The remaining data will be used as test data for the validation.",
-            "optional": true,
-            "default": 0.8,
-            "schema": {
-                "type": "number",
-                "exclusiveMinimum": 0,
-                "maximum": 1
-            }
-        },
         {
diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json
index 7e2ad58e..8eed6dd0 100644
--- a/proposals/fit_regr_random_forest.json
+++ b/proposals/fit_regr_random_forest.json
@@ -1,7 +1,7 @@
 {
     "id": "fit_regr_random_forest",
     "summary": "Train a random forest regression model",
-    "description": "Executes the fit of a random forest regression based on the user input of target and predictors. The Random Forest regression model is based on the approach by Breiman (2001).",
+    "description": "Executes the fit of a random forest regression based on the user input of target and predictors. The Random Forest regression model is based on the approach by Breiman (2001). Training/validation split must be done before.",
     "categories": [
         "machine learning"
     ],
@@ -23,17 +23,6 @@
                 "subtype": "vector-cube"
             }
         },
-        {
-            "name": "training",
-            "description": "The amount of training data to be used in the regression, given as a fraction. The sampling will be chosen randomly through the data object. 
The remaining data will be used as test data for the validation.",
-            "optional": true,
-            "default": 0.8,
-            "schema": {
-                "type": "number",
-                "exclusiveMinimum": 0,
-                "maximum": 1
-            }
-        },
         {

From f440a345dca1b734d4825c41593dedda26ebb763 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 23 Mar 2022 15:51:53 +0100
Subject: [PATCH 044/117] max_variables parameter could additionally be of type string #355

---
 proposals/fit_class_random_forest.json | 34 ++++++++++++++------------
 proposals/fit_regr_random_forest.json  | 32 +++++++++++++-----------
 2 files changed, 37 insertions(+), 29 deletions(-)

diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json
index 20eadd6b..b070d33d 100644
--- a/proposals/fit_class_random_forest.json
+++ b/proposals/fit_class_random_forest.json
@@ -24,30 +24,34 @@
             }
         },
         {
-            "name": "num_trees",
-            "description": "The number of trees built within the Random Forest classification.",
-            "optional": true,
-            "default": 100,
-            "schema": {
-                "type": "integer",
-                "minimum": 1
-            }
-        },
-        {
-            "name": "mtry",
-            "description": "Specifies how many split variables will be used at a node. Default value is `null`, which corresponds to the number of predictors divided by 3.",
-            "optional": true,
-            "default": null,
+            "name": "max_variables",
+            "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split\n- `all`: All variables are considered for each split\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split\n- `sqrt`: The square root of the number of variables are considered for each split",
             "schema": [
                 {
                     "type": "integer",
                     "minimum": 1
                 },
                 {
-                    "type": "null"
+                    "type": "string",
+                    "enum": [
+                        "all",
+                        "log2",
+                        "onethird",
+                        "sqrt"
+                    ]
                 }
             ]
         },
+        {
+            "name": "num_trees",
+            "description": "The number of trees built within the Random Forest classification.",
+            "optional": true,
+            "default": 100,
+            "schema": {
+                "type": "integer",
+                "minimum": 1
+            }
+        },
         {
             "name": "seed",
             "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.",
diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json
index 8eed6dd0..55351481 100644
--- a/proposals/fit_regr_random_forest.json
+++ b/proposals/fit_regr_random_forest.json
@@ -23,31 +23,35 @@
                 "subtype": "vector-cube"
             }
         },
-        {
-            "name": "num_trees",
-            "description": "The number of trees built within the Random Forest regression.",
-            "optional": true,
-            "default": 100,
-            "schema": {
-                "type": "integer",
-                "minimum": 1
-            }
-        },
         {
             "name": "max_variables",
-            "description": "Specifies how many split variables will be used at a node. 
Default value is `null`, which corresponds to the number of predictors divided by 3.",
-            "optional": true,
-            "default": null,
+            "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split\n- `all`: All variables are considered for each split\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split\n- `sqrt`: The square root of the number of variables are considered for each split",
             "schema": [
                 {
                     "type": "integer",
                     "minimum": 1
                 },
                 {
-                    "type": "null"
+                    "type": "string",
+                    "enum": [
+                        "all",
+                        "log2",
+                        "onethird",
+                        "sqrt"
+                    ]
                 }
             ]
         },
+        {
+            "name": "num_trees",
+            "description": "The number of trees built within the Random Forest regression.",
+            "optional": true,
+            "default": 100,
+            "schema": {
+                "type": "integer",
+                "minimum": 1
+            }
+        },
         {
             "name": "seed",
             "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.",

From d91d07e8cb232040651602bb7cd72488d824f453 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Wed, 23 Mar 2022 17:31:38 +0100
Subject: [PATCH 045/117] Applied suggestions from code review

Co-authored-by: Mattia Rossi

---
 proposals/fit_class_random_forest.json | 4 ++--
 proposals/fit_regr_random_forest.json  | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json
index b070d33d..a9a549d9 100644
--- a/proposals/fit_class_random_forest.json
+++ b/proposals/fit_class_random_forest.json
@@ -1,7 +1,7 @@
 {
     "id": "fit_class_random_forest",
     "summary": "Train a random forest classification model",
-    "description": "Executes the fit of a random forest classification based on the user input of target and predictors. The Random Forest classification model is based on the approach by Breiman (2001). Training/validation split must be done before.",
+    "description": "Executes the fit of a random forest classification based on training data. The process does not include a separate split of the data in test, validation and training data. The Random Forest classification model is based on the approach by Breiman (2001).",
     "categories": [
         "machine learning"
     ],
@@ -25,7 +25,7 @@
     },
     {
         "name": "max_variables",
-        "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split\n- `all`: All variables are considered for each split\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split\n- `sqrt`: The square root of the number of variables are considered for each split",
+        "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split.\n- `all`: All variables are considered for each split.\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split.\n- `onethird`: A third of the number of variables are considered for each split.\n- `sqrt`: The square root of the number of variables are considered for each split. 
This is often the default for classification.", "schema": [ { "type": "integer", diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json index 55351481..e75028b2 100644 --- a/proposals/fit_regr_random_forest.json +++ b/proposals/fit_regr_random_forest.json @@ -1,7 +1,7 @@ { "id": "fit_regr_random_forest", "summary": "Train a random forest regression model", - "description": "Executes the fit of a random forest regression based on the user input of target and predictors. The Random Forest regression model is based on the approach by Breiman (2001). Training/validation split must be done before.", + "description": "Executes the fit of a random forest regression based on training data. The process does not include a separate split of the data in test, validation and training data. The Random Forest regression model is based on the approach by Breiman (2001).", "categories": [ "machine learning" ], @@ -25,7 +25,7 @@ }, { "name": "max_variables", - "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split\n- `all`: All variables are considered for each split\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split\n- `sqrt`: The square root of the number of variables are considered for each split", + "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split.\n- `all`: All variables are considered for each split.\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split.\n- `onethird`: A third of the number of variables are considered for each split. This is often the default for regression.\n- `sqrt`: The square root of the number of variables are considered for each split.", "schema": [ { "type": "integer", From 2cfd022fdc4826af89166d88880a873cbea20e90 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 24 Mar 2022 13:27:27 +0100 Subject: [PATCH 046/117] Add missing ML functions to changelog --- CHANGELOG.md | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c3ed5504..27f984a8 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,10 +9,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - New processes in proposal state: + - `fit_class_random_forest` + - `fit_regr_random_forest` - `flatten_dimensions` - `load_ml_model` - - `unflatten_dimension` + - `predict_random_forest` - `save_ml_model` + - `unflatten_dimension` - `vector_buffer` - `vector_to_random_points` - `vector_to_regular_points` From ca9e31094b863233d88459b6cf2a37416bc90d4e Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 4 Apr 2022 15:07:38 +0200 Subject: [PATCH 047/117] mask: minor clarification of the dimensions of data and mask arguments #359 --- mask.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mask.json b/mask.json index d7b591e1..515c81cb 100644 --- a/mask.json +++ b/mask.json @@ -1,7 +1,7 @@ { "id": "mask", "summary": "Apply a raster mask", - "description": "Applies a mask to a raster data cube. 
To apply a vector mask use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible so that each dimension in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied for each label of the missing dimension in the data cube. The process fails if there's an incompatibility found between the raster data cube and the mask.",
+    "description": "Applies a mask to a raster data cube. To apply a vector mask use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible so that each dimension in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied to each label of the dimension in `data` that is missing in the data cube of the mask. The process fails if there's an incompatibility found between the raster data cube and the mask.",
     "categories": [
         "cubes",
         "masks"
@@ -45,4 +45,4 @@
             "subtype": "raster-cube"
         }
     }
-}
\ No newline at end of file
+}

From 17234ae4ff25e479c75021aa150bc4d2e58c01fd Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 19 Apr 2022 12:00:22 +0200
Subject: [PATCH 048/117] is_nodata: Clarify NaN #361 (#362)

* `is_nodata`: Clarified that `NaN` can be considered as no-data value only if it is explicitly specified. #361
---
 CHANGELOG.md   | 1 +
 is_nodata.json | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 27f984a8..4f954949 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -38,6 +38,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
 - `array_contains` and `array_find`: Clarify that giving `null` as `value` always returns `false` or `null` respectively, also fixed the incorrect examples. [#348](https://github.com/Open-EO/openeo-processes/issues/348)
 - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333)
+- `is_nodata`: Clarified that `NaN` can be considered as a no-data value only if it is explicitly specified as no-data value. [#361](https://github.com/Open-EO/openeo-processes/issues/361)
 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)
 - `round`: Clarify that the rounding for ties applies not only for integers. 
[#326](https://github.com/Open-EO/openeo-processes/issues/326) diff --git a/is_nodata.json b/is_nodata.json index f975bb7d..108b8519 100644 --- a/is_nodata.json +++ b/is_nodata.json @@ -1,7 +1,7 @@ { "id": "is_nodata", "summary": "Value is a no-data value", - "description": "Checks whether the specified data is missing data, i.e. equals to `null` or any of the no-data values specified in the metadata. The special numerical value `NaN` (not a number) as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) is not considered no-data and must return `false`.", + "description": "Checks whether the specified data is missing data, i.e. equals to `null` or any of the no-data values specified in the metadata.\n\nThe special numerical value `NaN` (not a number) as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) is only considered as no-data value if specified as no-data value in the metdata.", "categories": [ "comparison" ], From ab51285e2bd37e3e882472d0f792b8da1faf19a6 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 21 Apr 2022 10:08:24 +0200 Subject: [PATCH 049/117] Migrate examples to openEO community examples repo (#342) --- .github/workflows/docs.yml | 3 +- CHANGELOG.md | 4 ++ README.md | 4 +- array_apply.json | 16 +----- array_contains.json | 2 +- array_find.json | 2 +- examples/array_contains_nodata.json | 59 ---------------------- examples/array_find_nodata.json | 65 ------------------------ examples/rename-enumerated-labels.json | 51 ------------------- rename_labels.json | 8 --- tests/examples.test.js | 69 -------------------------- 11 files changed, 10 insertions(+), 273 deletions(-) delete mode 100644 examples/array_contains_nodata.json delete mode 100644 examples/array_find_nodata.json delete mode 100644 examples/rename-enumerated-labels.json delete mode 100644 tests/examples.test.js diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 260743f4..2386d523 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -25,7 +25,6 @@ jobs: run: | git clone --branch gh-pages https://github.com/Open-EO/openeo-processes.git gh-pages find gh-pages -maxdepth 1 -type f -delete - rm -rf gh-pages/examples/ rm -rf gh-pages/meta/ rm -rf gh-pages/proposals/ - name: create empty gh-pages folder @@ -34,7 +33,7 @@ jobs: - run: | cp tests/docs.html index.html cp tests/processes.json processes.json - rsync -vrm --include='*.json' --include='*.html' --include='examples/***' --include='meta/***' --include='proposals/***' --exclude='*' . gh-pages + rsync -vrm --include='*.json' --include='*.html' --include='meta/***' --include='proposals/***' --exclude='*' . gh-pages - name: deploy to root (master) uses: peaceiris/actions-gh-pages@v3 if: ${{ env.GITHUB_REF_SLUG == 'master' }} diff --git a/CHANGELOG.md b/CHANGELOG.md index 4f954949..0092cb41 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -29,6 +29,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: Allow `null` as default value for units. +### Removed + +- The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. 
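
For the `is_nodata` clarification above, a minimal sketch of matching entries in the style of the spec's `examples` arrays (values are illustrative; a literal `NaN` cannot be written in JSON, which is why the no-data status of `NaN` has to come from the metadata):

```json
"examples": [
    {
        "arguments": {
            "x": null
        },
        "returns": true
    },
    {
        "arguments": {
            "x": 1
        },
        "returns": false
    }
]
```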
+ ### Fixed - `aggregate_spatial`: diff --git a/README.md b/README.md index 99ea0a02..91d2096d 100644 --- a/README.md +++ b/README.md @@ -29,11 +29,11 @@ See also the [changelog](CHANGELOG.md) for the changes between versions and the This repository contains a set of files formally describing the openEO Processes: -* The `*.json` files provide stable process specifications as defined by openEO. Stable processes need at least two implementations and a use-case example added to the [`examples`](examples/) folder *or* consensus from the openEO PSC. +* The `*.json` files provide stable process specifications as defined by openEO. Stable processes need at least two implementations and a use-case example added to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples) repository *or* consensus from the openEO PSC. * The `*.json` files in the [`proposals`](proposals/) folder provide proposed new process specifications that are still experimental and subject to change, including breaking changes. Everyone is encouraged to base their work on the proposals and give feedback so that eventually the processes evolve into stable process specifications. * [implementation.md](meta/implementation.md) in the `meta` folder provide some additional implementation details for back-ends. For back-end implementors, it's highly recommended to read them. * [subtype-schemas.json](meta/subtype-schemas.json) in the `meta` folder defines common data types (`subtype`s) for JSON Schema used in openEO processes. -* The [`examples`](examples/) folder contains some useful examples that the processes link to. All of these are non-binding additions. +* Previously, an `examples` folder contained examples of user-defined processes. These have been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. * The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. Check the [tests documentation](tests/README.md) for details. ## Process diff --git a/array_apply.json b/array_apply.json index 194d2f35..80e53bef 100644 --- a/array_apply.json +++ b/array_apply.json @@ -91,19 +91,5 @@ "description": "Any data type is allowed." 
} } - }, - "links": [ - { - "rel": "example", - "type": "application/json", - "href": "https://processes.openeo.org/1.2.0/examples/array_find_nodata.json", - "title": "Find no-data values in arrays" - }, - { - "rel": "example", - "type": "application/json", - "href": "https://processes.openeo.org/1.2.0/examples/array_contains_nodata.json", - "title": "Check for no-data values in arrays" - } - ] + } } diff --git a/array_contains.json b/array_contains.json index a61622bc..803e14d0 100644 --- a/array_contains.json +++ b/array_contains.json @@ -133,7 +133,7 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.2.0/examples/array_contains_nodata.json", + "href": "https://raw.githubusercontent.com/Open-EO/openeo-community-examples/main/processes/array_contains_nodata.json", "title": "Check for no-data values in arrays" } ], diff --git a/array_find.json b/array_find.json index a579a428..7b0e6317 100644 --- a/array_find.json +++ b/array_find.json @@ -164,7 +164,7 @@ { "rel": "example", "type": "application/json", - "href": "https://processes.openeo.org/1.2.0/examples/array_find_nodata.json", + "href": "https://raw.githubusercontent.com/Open-EO/openeo-community-examples/main/processes/array_find_nodata.json", "title": "Find no-data values in arrays" } ] diff --git a/examples/array_contains_nodata.json b/examples/array_contains_nodata.json deleted file mode 100644 index 3b1fe6f0..00000000 --- a/examples/array_contains_nodata.json +++ /dev/null @@ -1,59 +0,0 @@ -{ - "id": "array_contains_nodata", - "summary": "Check for no-data values", - "description": "Check whether the array contains a no-data (`null`) value.", - "categories": [ - "arrays" - ], - "parameters": [ - { - "name": "data", - "description": "List to find the value in.", - "schema": { - "type": "array", - "items": { - "description": "Any data type is allowed." - } - } - } - ], - "returns": { - "description": "`true` if the list contains a no-data value, false` otherwise.", - "schema": { - "type": "boolean" - } - }, - "process_graph": { - "apply": { - "process_id": "array_apply", - "arguments": { - "data": { - "from_parameter": "data" - }, - "process": { - "process_graph": { - "is_null": { - "process_id": "is_nodata", - "arguments": { - "x": { - "from_parameter": "x" - } - }, - "result": true - } - } - } - } - }, - "find": { - "process_id": "array_contains", - "arguments": { - "data": { - "from_node": "apply" - }, - "value": true - }, - "result": true - } - } -} \ No newline at end of file diff --git a/examples/array_find_nodata.json b/examples/array_find_nodata.json deleted file mode 100644 index b57fc35b..00000000 --- a/examples/array_find_nodata.json +++ /dev/null @@ -1,65 +0,0 @@ -{ - "id": "array_find_nodata", - "summary": "Find no-data values", - "description": "Get the index of the first no-data (`null`) value in an array.", - "categories": [ - "arrays" - ], - "parameters": [ - { - "name": "data", - "description": "List to find the value in.", - "schema": { - "type": "array", - "items": { - "description": "Any data type is allowed." - } - } - } - ], - "returns": { - "description": "The index of the first element with a no-data value. 
If only data values are available, `null` is returned.", - "schema": [ - { - "type": "null" - }, - { - "type": "integer", - "minimum": 0 - } - ] - }, - "process_graph": { - "apply": { - "process_id": "array_apply", - "arguments": { - "data": { - "from_parameter": "data" - }, - "process": { - "process_graph": { - "is_null": { - "process_id": "is_nodata", - "arguments": { - "x": { - "from_parameter": "x" - } - }, - "result": true - } - } - } - } - }, - "find": { - "process_id": "array_find", - "arguments": { - "data": { - "from_node": "apply" - }, - "value": true - }, - "result": true - } - } -} \ No newline at end of file diff --git a/examples/rename-enumerated-labels.json b/examples/rename-enumerated-labels.json deleted file mode 100644 index a8fc3a2c..00000000 --- a/examples/rename-enumerated-labels.json +++ /dev/null @@ -1,51 +0,0 @@ -{ - "summary": "Rename enumerated labels", - "description": "The process replaces the temporal dimension with a new dimension `min_max` with enumerated labels. The first label refers to the minimum values, the second label refers to the maximum values. Afterwards, the dimension labels are renamed to `min` and `max` respectively.", - "process_graph": { - "loadco1": { - "process_id": "load_collection", - "arguments": { - "id": "S2-RGB", - "spatial_extent": null, - "temporal_extent": null - } - }, - "apply1": { - "process_id": "apply_dimension", - "arguments": { - "data": { - "from_node": "loadco1" - }, - "process": { - "process_graph": { - "extrem1": { - "process_id": "extrema", - "arguments": { - "data": { - "from_parameter": "data" - } - }, - "result": true - } - } - }, - "dimension": "t", - "target_dimension": "min_max" - } - }, - "rename1": { - "process_id": "rename_labels", - "arguments": { - "data": { - "from_node": "apply1" - }, - "dimension": "bands", - "target": [ - "min", - "max" - ] - }, - "result": true - } - } -} \ No newline at end of file diff --git a/rename_labels.json b/rename_labels.json index 1e3fda8d..41fe7d7d 100644 --- a/rename_labels.json +++ b/rename_labels.json @@ -92,13 +92,5 @@ ] } } - ], - "links": [ - { - "rel": "example", - "type": "application/json", - "href": "https://processes.openeo.org/1.2.0/examples/rename-enumerated-labels.json", - "title": "Rename enumerated labels" - } ] } diff --git a/tests/examples.test.js b/tests/examples.test.js deleted file mode 100644 index 5ada1501..00000000 --- a/tests/examples.test.js +++ /dev/null @@ -1,69 +0,0 @@ -const glob = require('glob'); -const fs = require('fs'); -const path = require('path'); -const { normalizeString, checkDescription } = require('./testHelpers'); -const { ProcessRegistry, ProcessGraph } = require('@openeo/js-processgraphs'); - -const registry = new ProcessRegistry(); -const processes = glob.sync("../*.json", {realpath: true}); -processes.forEach(file => { - try { - var process = require(file); - registry.add(process); - } catch(err) { - console.error(err); - } -}); - -const files = glob.sync("../examples/*.json", {realpath: true}); -var examples = []; -files.forEach(file => { - try { - var fileContent = fs.readFileSync(file); - // Check JSON structure for faults - var example = JSON.parse(fileContent); - - // Prepare for tests - examples.push([file, example, fileContent.toString()]); - } catch(err) { - examples.push([file, {}, ""]); - console.error(err); - expect(err).toBeUndefined(); - } -}); - -describe.each(examples)("%s", (file, e, fileContent) => { - - test("File / JSON", () => { - const ext = path.extname(file); - // Check that the process file has a lower-case 
json extension - expect(ext).toEqual(".json"); - // If id is set: Check that the process name is also the file name - if (typeof e.id !== 'undefined') { - expect(path.basename(file, ext)).toEqual(e.id); - } - // lint: Check whether the file is correctly JSON formatted - expect(normalizeString(JSON.stringify(e, null, 4))).toEqual(normalizeString(fileContent)); - }); - - let pg = new ProcessGraph(e, registry); - - test("Require descriptions", () => { - // description - expect(typeof e.description).toBe('string'); - // lint: Description should be longer than a summary - expect(e.description.length).toBeGreaterThan(55); - checkDescription(e.description, e); - }); - - test("Parse", () => { - expect(() => pg.parse()).not.toThrow(); - expect(pg.parsed).toBe(true); - }); - - test("Validation", async () => { - await expect(pg.validate()).resolves.not.toThrow(); - expect(pg.isValid()).toBeTruthy(); - }); - -}); From 589bfd29667f8bfa4dd40992ca4f5edcb2d753d2 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 21 Apr 2022 10:22:27 +0200 Subject: [PATCH 050/117] Minor improvement to wording --- proposals/predict_random_forest.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/proposals/predict_random_forest.json b/proposals/predict_random_forest.json index 46103938..62c54e9f 100644 --- a/proposals/predict_random_forest.json +++ b/proposals/predict_random_forest.json @@ -1,6 +1,6 @@ { "id": "predict_random_forest", - "summary": "Predict values from a Random Forest model", + "summary": "Predict values based on a Random Forest model", "description": "Applies a Random Forest machine learning model to an array and predict a value for it.", "categories": [ "machine learning", @@ -39,4 +39,4 @@ ] } } -} \ No newline at end of file +} From faf47828cebf19f758bac5d7c513fb2e569ea4d5 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 2 May 2022 13:47:33 +0200 Subject: [PATCH 051/117] load_ml_model: user uploaded files #367 --- proposals/load_ml_model.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/load_ml_model.json b/proposals/load_ml_model.json index 445b5f2f..151513c8 100644 --- a/proposals/load_ml_model.json +++ b/proposals/load_ml_model.json @@ -27,7 +27,7 @@ "pattern": "^[\\w\\-\\.~]+$" }, { - "title": "User-uploaded Files", + "title": "User-uploaded File", "type": "string", "subtype": "file-path", "pattern": "^[^\r\n\\:'\"]+$" From 98a93afb2f62da79be61b6bb871aef7223d48969 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 18 May 2022 14:37:05 +0200 Subject: [PATCH 052/117] Fix typos --- CHANGELOG.md | 2 +- is_nodata.json | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0092cb41..e00468f4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -167,7 +167,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed - `any` and `all`: Renamed parameter `values` to `data`. [#147](https://github.com/Open-EO/openeo-processes/issues/147) -- `load_collection`: Parameter `properties` has subtype `metdata-filter`. +- `load_collection`: Parameter `properties` has subtype `metadata-filter`. - Examples adapted to latest API version for `aggregate_temporal`, `array_contains`, `array_find`, `filter_labels`, `load_collection` and `rename_labels`. [#136](https://github.com/Open-EO/openeo-processes/issues/136), [API#285](https://github.com/Open-EO/openeo-api/issues/285) - Some processes were assigned to different categories. 
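
The two machine-learning proposals above are designed to be used together. A minimal sketch of the intended wiring, assuming a raster data cube in a node named `cube` and an illustrative model id (neither is part of the specs): the model loaded by `load_ml_model` is handed to the reducer via `context`, and `predict_random_forest` consumes the band values of each pixel as its `data` array:

```json
{
    "model": {
        "process_id": "load_ml_model",
        "arguments": {
            "id": "rf_model"
        }
    },
    "predict": {
        "process_id": "reduce_dimension",
        "arguments": {
            "data": {
                "from_node": "cube"
            },
            "dimension": "bands",
            "reducer": {
                "process_graph": {
                    "rf": {
                        "process_id": "predict_random_forest",
                        "arguments": {
                            "data": {
                                "from_parameter": "data"
                            },
                            "model": {
                                "from_parameter": "context"
                            }
                        },
                        "result": true
                    }
                }
            },
            "context": {
                "from_node": "model"
            }
        },
        "result": true
    }
}
```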
diff --git a/is_nodata.json b/is_nodata.json index 108b8519..a1b7c08b 100644 --- a/is_nodata.json +++ b/is_nodata.json @@ -1,7 +1,7 @@ { "id": "is_nodata", "summary": "Value is a no-data value", - "description": "Checks whether the specified data is missing data, i.e. equals to `null` or any of the no-data values specified in the metadata.\n\nThe special numerical value `NaN` (not a number) as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) is only considered as no-data value if specified as no-data value in the metdata.", + "description": "Checks whether the specified data is missing data, i.e. equals to `null` or any of the no-data values specified in the metadata.\n\nThe special numerical value `NaN` (not a number) as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935) is only considered as no-data value if specified as no-data value in the metadata.", "categories": [ "comparison" ], From 144414e8cf5e4e67d408d8a76daa4a5957a10108 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 19 Jul 2022 13:38:59 +0200 Subject: [PATCH 053/117] `run_udf`: Allow all data types instead of just objects in the `context` parameter. #376 --- CHANGELOG.md | 1 + run_udf.json | 4 ++-- 2 files changed, 3 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e00468f4..2668bbc2 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -28,6 +28,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_modify` - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: Allow `null` as default value for units. +- `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376) ### Removed diff --git a/run_udf.json b/run_udf.json index f65f850c..1122f4e5 100644 --- a/run_udf.json +++ b/run_udf.json @@ -79,7 +79,7 @@ "name": "context", "description": "Additional data such as configuration options to be passed to the UDF.", "schema": { - "type": "object" + "description": "Any data type." }, "default": {}, "optional": true @@ -100,4 +100,4 @@ "description": "Any data type." } } -} \ No newline at end of file +} From 31ce48d60fbd5fc6c0b3cb5cf1fdcdaed1a25394 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 19 Jul 2022 13:51:16 +0200 Subject: [PATCH 054/117] Update test dependencies --- tests/package.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tests/package.json b/tests/package.json index c2137589..be51806f 100644 --- a/tests/package.json +++ b/tests/package.json @@ -23,10 +23,10 @@ "ajv": "^6.12.4", "concat-json-files": "^1.1.0", "glob": "^7.1.6", - "http-server": "^0.12.3", + "http-server": "^14.1.1", "jest": "^26.4.2", "markdown-spellcheck": "^1.3.1", - "markdownlint": "^0.18.0" + "markdownlint": "^0.26.0" }, "scripts": { "test": "jest", From 77df6a050584dce46a3eb8610a90604fe5472b36 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 19 Jul 2022 13:59:44 +0200 Subject: [PATCH 055/117] `load_collection` and `load_result`: Require at least one band if not set to `null`. 
#372 --- CHANGELOG.md | 1 + load_collection.json | 3 ++- proposals/load_result.json | 3 ++- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 2668bbc2..3af63aba 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -29,6 +29,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: Allow `null` as default value for units. - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376) +- `load_collection` and `load_result`: Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) ### Removed diff --git a/load_collection.json b/load_collection.json index dfdb72ca..d6615f21 100644 --- a/load_collection.json +++ b/load_collection.json @@ -161,6 +161,7 @@ "schema": [ { "type": "array", + "minItems": 1, "items": { "type": "string", "subtype": "band-name" @@ -303,4 +304,4 @@ "title": "List of common band names as specified by the STAC specification" } ] -} \ No newline at end of file +} diff --git a/proposals/load_result.json b/proposals/load_result.json index d8b70c6a..fa056b48 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -176,6 +176,7 @@ "schema": [ { "type": "array", + "minItems": 1, "items": { "type": "string", "subtype": "band-name" @@ -198,4 +199,4 @@ "subtype": "raster-cube" } } -} \ No newline at end of file +} From 21468020d1ecfbcda9cf5554b3b2bdd186553716 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 19 Jul 2022 14:18:41 +0200 Subject: [PATCH 056/117] `count`: Explicitly mention that `false` is not allowed in `condition` #375 --- count.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/count.json b/count.json index c2de1451..19cacead 100644 --- a/count.json +++ b/count.json @@ -20,7 +20,7 @@ }, { "name": "condition", - "description": "A condition consists of one or more processes, which in the end return a boolean value. It is evaluated against each element in the array. An element is counted only if the condition returns `true`. Defaults to count valid elements in a list (see ``is_valid()``). Setting this parameter to boolean `true` counts all elements in the list.", + "description": "A condition consists of one or more processes, which in the end return a boolean value. It is evaluated against each element in the array. An element is counted only if the condition returns `true`. Defaults to count valid elements in a list (see ``is_valid()``). Setting this parameter to boolean `true` counts all elements in the list. 
`false` is not a valid value for this parameter.",
             "schema": [
                 {
                     "title": "Condition",
@@ -147,4 +147,4 @@
             "returns": 3
         }
     ]
-}
\ No newline at end of file
+}

From 14b33b16b7e39c7eec9f39024311c3e0e13f19dd Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 19 Jul 2022 15:01:47 +0200
Subject: [PATCH 057/117] `is_nan`: Fixed issue with return value and clarification #360 (#363)

---
 CHANGELOG.md | 1 +
 is_nan.json  | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 3af63aba..77fa176b 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -44,6 +44,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined.
 - `array_contains` and `array_find`: Clarify that giving `null` as `value` always returns `false` or `null` respectively, also fixed the incorrect examples. [#348](https://github.com/Open-EO/openeo-processes/issues/348)
 - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333)
+- `is_nan`: Fixed a wrong description of the return value and simplified/clarified the process descriptions overall. [#360](https://github.com/Open-EO/openeo-processes/issues/360)
 - `is_nodata`: Clarified that `NaN` can be considered as a no-data value only if it is explicitly specified as no-data value. [#361](https://github.com/Open-EO/openeo-processes/issues/361)
 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321)
 - `round`: Clarify that the rounding for ties applies not only for integers. [#326](https://github.com/Open-EO/openeo-processes/issues/326)
diff --git a/is_nan.json b/is_nan.json
index 91f4bcf9..1135c1e2 100644
--- a/is_nan.json
+++ b/is_nan.json
@@ -1,7 +1,7 @@
 {
     "id": "is_nan",
     "summary": "Value is not a number",
-    "description": "Checks whether the specified value `x` is not a number. Returns `true` for numeric values (integers and floating-point numbers), except for the special value `NaN` as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935). All non-numeric data types MUST also return `true`, including arrays that contain `NaN` values.",
+    "description": "Checks whether the specified value `x` is *not* a number. Numbers are all integers and floating-point numbers, except for the special value `NaN` as defined by the [IEEE Standard 754](https://ieeexplore.ieee.org/document/4610935).",
     "categories": [
         "comparison",
         "math > constants"
@@ -16,7 +16,7 @@
         }
     ],
     "returns": {
-        "description": "`true` if the data is not a number, otherwise `false`.",
+        "description": "Returns `true` for `NaN` and all non-numeric data types, otherwise returns `false`.",
         "schema": {
             "type": "boolean"
         }

From 39f8457a954294af76f4e1c5b087455149fdc71b Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 2 Aug 2022 16:26:54 +0200
Subject: [PATCH 058/117] `inspect`: The parameter `message` has been moved to be the second argument. 
#369
---
 CHANGELOG.md           | 1 +
 meta/implementation.md | 2 +-
 proposals/inspect.json | 20 ++++++++++----------
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 77fa176b..cd3c1431 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -30,6 +30,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `apply_neighborhood`: Allow `null` as default value for units.
 - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376)
 - `load_collection` and `load_result`: Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372)
+- `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369)
diff --git a/meta/implementation.md b/meta/implementation.md
index 8774f9e4..65a24430 100644
--- a/meta/implementation.md
+++ b/meta/implementation.md
@@ -207,7 +207,7 @@ The top-level object and/or each dimension can be enhanced with additional stats
             "values": ["NDVI"]
         }
     },
-    // optional: Return additional statstics for the data cube if possible, ideally use the corresponsing openEO process names as keys
+    // optional: Return additional data or statistics for the data cube if possible (see also the chapter for "Arrays" above).
     "min": -1,
    "max": 1
 }
diff --git a/proposals/inspect.json b/proposals/inspect.json
index 91d6deb6..b0a0335d 100644
--- a/proposals/inspect.json
+++ b/proposals/inspect.json
@@ -14,6 +14,15 @@
             "description": "Any data type is allowed."
         }
     },
+    {
+        "name": "message",
+        "description": "A message to send in addition to the data.",
+        "schema": {
+            "type": "string"
+        },
+        "default": "",
+        "optional": true
+    },
     {
         "name": "code",
         "description": "A label to help identify one or more log entries originating from this process in the list of all log entries. It can help to group or filter log entries and is usually not unique.",
@@ -37,15 +46,6 @@
         },
         "default": "info",
         "optional": true
-    },
-    {
-        "name": "message",
-        "description": "A message to send in addition to the data.",
-        "schema": {
-            "type": "string"
-        },
-        "default": "",
-        "optional": true
     }
 ],
 "returns": {
@@ -54,4 +54,4 @@
         "description": "Any data type is allowed."
     }
 }
-}
\ No newline at end of file
+}

From e0970f8db844640f47d5adeb0507ab27eb46b1d7 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 2 Aug 2022 16:39:31 +0200
Subject: [PATCH 059/117] Added exceptions for empty data cube handling (#370)

* Added exceptions for empty data cube handling
---
 CHANGELOG.md         | 2 ++
 load_collection.json | 7 ++++++-
 save_result.json     | 5 ++++-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index cd3c1431..c11339c8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -30,7 +30,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `apply_neighborhood`: Allow `null` as default value for units.
 - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376)
 - `load_collection` and `load_result`: Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372)
+- `load_collection`: Added a `NoDataAvailable` exception
 - `inspect`: The parameter `message` has been moved to be the second argument. 
[#369](https://github.com/Open-EO/openeo-processes/issues/369) +- `save_result`: Added a more concrete `DataCubeEmpty` exception. ### Removed diff --git a/load_collection.json b/load_collection.json index d6615f21..160888be 100644 --- a/load_collection.json +++ b/load_collection.json @@ -1,7 +1,7 @@ { "id": "load_collection", "summary": "Load a collection", - "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. The data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent`, `bands` and `properties`.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", + "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. The data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent`, `bands` and `properties`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" @@ -222,6 +222,11 @@ "subtype": "raster-cube" } }, + "exceptions": { + "NoDataAvailable": { + "message": "There is no data available for the given extents." + } + }, "examples": [ { "description": "Loading `Sentinel-2B` data from a `Sentinel-2` collection for 2018, but only with cloud cover between 0 and 50%.", diff --git a/save_result.json b/save_result.json index 905fe4b4..0ad0a582 100644 --- a/save_result.json +++ b/save_result.json @@ -23,7 +23,7 @@ }, { "name": "format", - "description": "The file format to use. It must be one of the values that the server reports as supported output file formats, which usually correspond to the short GDAL/OGR codes. If the format is not suitable for storing the underlying data structure, a `FormatUnsuitable` exception will be thrown. This parameter is *case insensitive*.", + "description": "The file format to use. It must be one of the values that the server reports as supported output file formats, which usually correspond to the short GDAL/OGR codes. 
This parameter is *case insensitive*.\n\n* If the data cube is empty and the file format can't store empty data cubes, a `DataCubeEmpty` exception is thrown.\n* If the file format is otherwise not suitable for storing the underlying data structure, a `FormatUnsuitable` exception is thrown.", "schema": { "type": "string", "subtype": "output-format" @@ -49,6 +49,9 @@ "exceptions": { "FormatUnsuitable": { "message": "Data can't be transformed into the requested output format." + }, + "DataCubeEmpty": { + "message": "The file format doesn't support storing empty data cubes." } }, "links": [ From fc05527df6ebe316cd266f58c41bebb93a2513ac Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 6 Sep 2022 14:28:16 +0200 Subject: [PATCH 060/117] Add missing cubes category --- anomaly.json | 3 ++- climatological_normal.json | 3 ++- ndvi.json | 3 ++- 3 files changed, 6 insertions(+), 3 deletions(-) diff --git a/anomaly.json b/anomaly.json index 8dee2b68..3f369087 100644 --- a/anomaly.json +++ b/anomaly.json @@ -4,6 +4,7 @@ "description": "Computes anomalies based on normals for temporal periods. It compares the data for each label in the temporal dimension with the corresponding data in the normals data cube by subtracting the normal from the data.", "categories": [ "climatology", + "cubes", "math" ], "parameters": [ @@ -52,4 +53,4 @@ "subtype": "raster-cube" } } -} \ No newline at end of file +} diff --git a/climatological_normal.json b/climatological_normal.json index efaedd74..33cd2d60 100644 --- a/climatological_normal.json +++ b/climatological_normal.json @@ -3,6 +3,7 @@ "summary": "Compute climatology normals", "description": "Climatological normal period is a usually 30 year average of a weather variable. Climatological normals are used as an average or baseline to evaluate climate events and provide context for yearly, monthly, daily or seasonal variability. The default climatology period is from 1981 until 2010 (both inclusive).", "categories": [ + "cubes", "climatology" ], "parameters": [ @@ -65,4 +66,4 @@ "title": "Background information on climatology normal by Wikipedia" } ] -} \ No newline at end of file +} diff --git a/ndvi.json b/ndvi.json index ba5e54fb..e86a27e6 100644 --- a/ndvi.json +++ b/ndvi.json @@ -3,6 +3,7 @@ "summary": "Normalized Difference Vegetation Index", "description": "Computes the Normalized Difference Vegetation Index (NDVI). The NDVI is computed as *`(nir - red) / (nir + red)`*.\n\nThe `data` parameter expects a raster data cube with a dimension of type `bands` or a `DimensionAmbiguous` exception is thrown otherwise. By default, the dimension must have at least two bands with the common names `red` and `nir` assigned. Otherwise, the user has to specify the parameters `nir` and `red`. If neither is the case, either the exception `NirBandAmbiguous` or `RedBandAmbiguous` is thrown. The common names for each band are specified in the collection's band metadata and are *not* equal to the band names.\n\nBy default, the dimension of type `bands` is dropped by this process. To keep the dimension specify a new band name in the parameter `target_band`. This adds a new dimension label with the specified name to the dimension, which can be used to access the computed values. 
If a band with the specified name exists, a `BandExists` is thrown.\n\nThis process is very similar to the process ``normalized_difference()``, but determines the bands automatically based on the common names (`red`/`nir`) specified in the metadata.", "categories": [ + "cubes", "math > indices", "vegetation indices" ], @@ -89,4 +90,4 @@ "title": "List of common band names as specified by the STAC specification" } ] -} \ No newline at end of file +} From be148835a725966fbf1fff65215dbf2bca82a4a7 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 9 Sep 2022 18:10:01 +0200 Subject: [PATCH 061/117] Fix wrong parameter reference in apply_neighborhood --- apply_neighborhood.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/apply_neighborhood.json b/apply_neighborhood.json index 4966f28e..71d47ef1 100644 --- a/apply_neighborhood.json +++ b/apply_neighborhood.json @@ -23,7 +23,7 @@ "parameters": [ { "name": "data", - "description": "A subset of the data cube as specified in `context` and `overlap`.", + "description": "A subset of the data cube as specified in `size` and `overlap`.", "schema": { "type": "object", "subtype": "raster-cube" From 2518953c74f341f1c6e0f378f6db9deeb7a377a6 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 6 Oct 2022 16:54:08 +0200 Subject: [PATCH 062/117] Add `NoDataAvailable` to load_result --- CHANGELOG.md | 5 +++-- proposals/load_result.json | 7 ++++++- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c11339c8..2a69e93e 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -29,8 +29,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: Allow `null` as default value for units. - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376) -- `load_collection` and `load_result`: Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) -- `load_collection`: Added a `NoDataAvailable` exception +- `load_collection` and `load_result`: + - Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) + - Added a `NoDataAvailable` exception - `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369) - `save_result`: Added a more concrete `DataCubeEmpty` exception. diff --git a/proposals/load_result.json b/proposals/load_result.json index fa056b48..9e7993c3 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -1,7 +1,7 @@ { "id": "load_result", "summary": "Load batch job results", - "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. 
This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", + "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" @@ -198,5 +198,10 @@ "type": "object", "subtype": "raster-cube" } + }, + "exceptions": { + "NoDataAvailable": { + "message": "There is no data available for the given extents." + } } } From 5ebe41b3872b45167cbf98db5d50a391fdfbf8aa Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 10 Nov 2022 15:36:42 +0100 Subject: [PATCH 063/117] Make climatological_normal future-proof --- climatological_normal.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/climatological_normal.json b/climatological_normal.json index 33cd2d60..b40ed1a9 100644 --- a/climatological_normal.json +++ b/climatological_normal.json @@ -1,7 +1,7 @@ { "id": "climatological_normal", "summary": "Compute climatology normals", - "description": "Climatological normal period is a usually 30 year average of a weather variable. Climatological normals are used as an average or baseline to evaluate climate events and provide context for yearly, monthly, daily or seasonal variability. 
The default climatology period is from 1981 until 2010 (both inclusive).",
+    "description": "Climatological normal period is a usually 30 year average of a weather variable. Climatological normals are used as an average or baseline to evaluate climate events and provide context for yearly, monthly, daily or seasonal variability.",
     "categories": [
         "cubes",
         "climatology"
@@ -31,7 +31,7 @@
         },
         {
             "name": "climatology_period",
-            "description": "The climatology period as a closed temporal interval. The first element of the array is the first year to be fully included in the temporal interval. The second element is the last year to be fully included in the temporal interval. The default period is from 1981 until 2010 (both inclusive).",
+            "description": "The climatology period as a closed temporal interval. The first element of the array is the first year to be fully included in the temporal interval. The second element is the last year to be fully included in the temporal interval.\n\nThe default climatology period is from 1981 until 2010 (both inclusive) right now, but this might be updated over time to what is commonly used in climatology. If you want to keep your research reproducible, please explicitly specify a period.",
             "schema": {
                 "type": "array",
                 "subtype": "temporal-interval",

From cc3ac847b5141c380d932377b13934816c8bb636 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Thu, 10 Nov 2022 16:30:09 +0100
Subject: [PATCH 064/117] Deploy a separate draft branch for 2.0

---
 .github/workflows/docs.yml | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
index 2386d523..4d5161d0 100644
--- a/.github/workflows/docs.yml
+++ b/.github/workflows/docs.yml
@@ -5,6 +5,7 @@ on:
   push:
     branches:
       - draft
+      - draft-2.0
       - master
 jobs:
   deploy:

From c9c0e9c2575df2cebeb80ee1513d9b06968f485d Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Mon, 23 Jan 2023 18:24:27 +0100
Subject: [PATCH 065/117] Move linter to @openeo/processes-lint

---
 tests/README.md                |   7 +-
 tests/package.json             |  18 +-
 tests/processes.test.js        | 298 ---------------------------------
 tests/subtypes-file.test.js    |  29 ----
 tests/subtypes-schemas.test.js |  54 ------
 tests/testConfig.json          |  13 ++
 tests/testHelpers.js           | 231 -------------------------
 7 files changed, 20 insertions(+), 630 deletions(-)
 delete mode 100644 tests/processes.test.js
 delete mode 100644 tests/subtypes-file.test.js
 delete mode 100644 tests/subtypes-schemas.test.js
 create mode 100644 tests/testConfig.json
 delete mode 100644 tests/testHelpers.js

diff --git a/tests/README.md b/tests/README.md
index 31c79785..fc2382fe 100644
--- a/tests/README.md
+++ b/tests/README.md
@@ -5,7 +5,7 @@ To run the tests follow these steps:
 1. Install [node and npm](https://nodejs.org) - should run with any recent version
 2. Run `npm install` in this folder to install the dependencies
 3. Run the tests with `npm test`. This will also lint the files and verify it follows best practices.
-4. To show the files nicely formatted in a web browser, run `npm run render`. It starts a server and opens the corresponding page in a web browser.
+4. To show the files nicely formatted in a web browser, run `npm start`. It starts a server and opens the corresponding page in a web browser.
 
 ## Development processes
 
@@ -28,8 +28,3 @@ Sometimes it is useful to define a new "data type" on top of the JSON types (num
 For example, a client could make a select box with all collections available by adding a subtype `collection-id` to the JSON type `string`.
 If you think a new subtype should be added, you need to add it to the `meta/subtype-schemas.json` file. It must be a valid JSON Schema. The tests mentioned above will also verify to a certain degree that the subtypes are defined correctly. 
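
The subtype workflow described in the README above boils down to adding one more entry to the `definitions` object in `meta/subtype-schemas.json`. A minimal sketch for the `collection-id` subtype mentioned as an example (title, description and pattern are illustrative, not the authoritative definition):

```json
{
    "definitions": {
        "collection-id": {
            "type": "string",
            "subtype": "collection-id",
            "title": "Collection ID",
            "description": "A unique identifier for a collection, as exposed by the back-end.",
            "pattern": "^[\\w\\-\\.~/]+$"
        }
    }
}
```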
If you think a new subype should be added, you need to add it to the `meta/subtype-schemas.json` file. It must be a valid JSON Schema. The tests mentioned above will also verify to a certain degree that the subtypes are defined correctly. - -## Examples - -To get out of proposal state, at least two examples must be provided. -The examples are located in the `examples` folder and will also be validated to some extent in the tests. \ No newline at end of file diff --git a/tests/package.json b/tests/package.json index be51806f..500d063b 100644 --- a/tests/package.json +++ b/tests/package.json @@ -1,6 +1,6 @@ { - "name": "@openeo/processes-validator", - "version": "0.2.0", + "name": "@openeo/processes", + "version": "1.2.0", "author": "openEO Consortium", "contributors": [ { @@ -18,19 +18,13 @@ "url": "git+https://github.com/Open-EO/openeo-processes.git" }, "devDependencies": { - "@apidevtools/json-schema-ref-parser": "^9.0.6", - "@openeo/js-processgraphs": "^1.0.0", - "ajv": "^6.12.4", + "@openeo/processes-lint": "^0.1.3", "concat-json-files": "^1.1.0", - "glob": "^7.1.6", - "http-server": "^14.1.1", - "jest": "^26.4.2", - "markdown-spellcheck": "^1.3.1", - "markdownlint": "^0.26.0" + "http-server": "^14.1.1" }, "scripts": { - "test": "jest", + "test": "openeo-processes-lint testConfig.json", "generate": "concat-json-files \"../{*,proposals/*}.json\" -t \"processes.json\"", - "render": "npm run generate && http-server -p 9876 -o docs.html -c-1" + "start": "npm run generate && http-server -p 9876 -o docs.html -c-1" } } diff --git a/tests/processes.test.js b/tests/processes.test.js deleted file mode 100644 index 1fcf5f02..00000000 --- a/tests/processes.test.js +++ /dev/null @@ -1,298 +0,0 @@ -const glob = require('glob'); -const fs = require('fs'); -const path = require('path'); -const { normalizeString, checkDescription, checkSpelling, checkJsonSchema, getAjv, prepareSchema, isObject } = require('./testHelpers'); - -const anyOfRequired = [ - "quantiles", - "array_element" -]; - -var jsv = null; -beforeAll(async () => { - jsv = await getAjv(); -}); - -var loader = (file, proposal = false) => { - try { - var fileContent = fs.readFileSync(file); - // Check JSON structure for faults - var p = JSON.parse(fileContent); - - // Prepare for tests - processes.push([file, p, fileContent.toString(), proposal]); - processIds.push(p.id); - } catch(err) { - processes.push([file, {}, "", proposal]); - console.error(err); - expect(err).toBeUndefined(); - } -}; - -var processes = []; -var processIds = []; - -const files = glob.sync("../*.json", {realpath: true}); -files.forEach(file => loader(file)); - -const proposals = glob.sync("../proposals/*.json", {realpath: true}); -proposals.forEach(file => loader(file, true)); - -test("Check for duplicate process ids", () => { - const duplicates = processIds.filter((id, index) => processIds.indexOf(id) !== index); - expect(duplicates).toEqual([]); -}); - -describe.each(processes)("%s", (file, p, fileContent, proposal) => { - - test("File / JSON", () => { - const ext = path.extname(file); - // Check that the process file has a lower-case json extension - expect(ext).toEqual(".json"); - // Check that the process name is also the file name - expect(path.basename(file, ext)).toEqual(p.id); - // lint: Check whether the file is correctly JSON formatted - expect(normalizeString(JSON.stringify(p, null, 4))).toEqual(normalizeString(fileContent)); - }); - - test("ID", () => { - expect(typeof p.id).toBe('string'); - expect(/^\w+$/.test(p.id)).toBeTruthy(); - }); - - 
test("Summary", () => { - expect(typeof p.summary === 'undefined' || typeof p.summary === 'string').toBeTruthy(); - // lint: Summary should be short - expect(p.summary.length).toBeLessThan(60); - // lint: Summary should not end with a dot - expect(/[^\.]$/.test(p.summary)).toBeTruthy(); - checkSpelling(p.summary, p); - }); - - test("Description", () => { - // description - expect(typeof p.description).toBe('string'); - // lint: Description should be longer than a summary - expect(p.description.length).toBeGreaterThan(60); - checkDescription(p.description, p, processIds); - }); - - test("Categories", () => { - // categories - expect(Array.isArray(p.categories)).toBeTruthy(); - // lint: There should be at least one category assigned - expect(p.categories.length).toBeGreaterThan(0); - if (Array.isArray(p.categories)) { - for(let i in p.categories) { - expect(typeof p.categories[i]).toBe('string'); - } - } - }); - - test("Flags", () => { - checkFlags(p, proposal); - }); - - test("Parameters", () => { - expect(Array.isArray(p.parameters)).toBeTruthy(); - }); - - var params = o2a(p.parameters); - if (params.length > 0) { - test.each(params)("Parameters > %s", (key, param) => { - checkParam(param, p); - }); - } - - test("Return Value", () => { - expect(isObject(p.returns)).toBeTruthy(); - expect(p.returns).not.toBeNull(); - - // return value description - expect(typeof p.returns.description).toBe('string'); - // lint: Description should not be empty - expect(p.returns.description.length).toBeGreaterThan(0); - checkDescription(p.returns.description, p, processIds); - - // return value schema - expect(p.returns.schema).not.toBeNull(); - expect(typeof p.returns.schema).toBe('object'); - // lint: Description should not be empty - checkJsonSchema(jsv, p.returns.schema); - }); - - test("Exceptions", () => { - expect(typeof p.exceptions === 'undefined' || isObject(p.exceptions)).toBeTruthy(); - }); - - var exceptions = o2a(p.exceptions); - if (exceptions.length > 0) { - test.each(exceptions)("Exceptions > %s", (key, e) => { - expect(/^\w+$/.test(key)).toBeTruthy(); - - // exception message - expect(typeof e.message).toBe('string'); - checkSpelling(e.message, p); - - // exception description - expect(typeof e.description === 'undefined' || typeof e.description === 'boolean').toBeTruthy(); - checkDescription(e.description, p, processIds); - - // exception http code - if (typeof e.http !== 'undefined') { - expect(e.http).toBeGreaterThanOrEqual(100); - expect(e.http).toBeLessThan(600); - } - }); - } - - test("Examples", () => { - expect(typeof p.examples === 'undefined' || Array.isArray(p.examples)).toBeTruthy(); - }); - - if (Array.isArray(p.examples) && p.examples.length > 0) { - - test.each(p.examples)("Examples > %#", (example) => { - // Make an object for easier access later - var parametersObj = {}; - for(var i in p.parameters) { - parametersObj[p.parameters[i].name] = p.parameters[i]; - } - var paramKeys = Object.keys(parametersObj); - - expect(isObject(example)).toBeTruthy(); - expect(example).not.toBeNull(); - - // example title - expect(typeof example.title === 'undefined' || typeof example.title === 'string').toBeTruthy(); - checkSpelling(example.title, p); - - // example description - expect(typeof example.description === 'undefined' || typeof example.description === 'string').toBeTruthy(); - checkDescription(example.description, p, processIds); - - // example process graph - expect(example.process_graph).toBeUndefined(); - - // example arguments - expect(typeof 
example.arguments).toBe('object'); - expect(example.arguments).not.toBeNull(); - // Check argument values - for(let argName in example.arguments) { - // Does parameter with this name exist? - - expect(paramKeys).toContain(argName); - checkJsonSchemaValue(parametersObj[argName].schema, example.arguments[argName]); - } - // Check whether all required parameters are set - for(let key in parametersObj) { - if (!parametersObj[key].optional) { - expect(example.arguments[key]).toBeDefined(); - } - } - - // example returns: Nothing to validate, everything is allowed - }); - } - - test("Links", () => { - expect(typeof p.links === 'undefined' || Array.isArray(p.links)).toBeTruthy(); - }); - - if (Array.isArray(p.links)) { - test.each(p.links)("Links > %#", (link) => { - expect(isObject(link)).toBeTruthy(); - - // link href - expect(typeof link.href).toBe('string'); - - // link rel - expect(typeof link.rel === 'undefined' || typeof link.rel === 'string').toBeTruthy(); - - // link title - expect(typeof link.title === 'undefined' || typeof link.title === 'string').toBeTruthy(); - checkSpelling(link.title, p); - - // link type - expect(typeof link.type === 'undefined' || typeof link.type === 'string').toBeTruthy(); - }); - } -}); - -function checkFlags(p, proposal = false) { - // deprecated - expect(typeof p.deprecated === 'undefined' || typeof p.deprecated === 'boolean').toBeTruthy(); - // lint: don't specify defaults - expect(typeof p.deprecated === 'undefined' || p.deprecated === true).toBeTruthy(); - if (proposal) { - // experimental must be true for proposals - expect(p.experimental).toBe(true); - } - else { - // experimental must not be false for stable - // lint: don't specify defaults, so false should not be set explicitly - expect(p.experimental).toBeUndefined(); - } -} - -function checkParam(param, p, checkCbParams = true) { - // parameter name - expect(typeof param.name).toBe('string'); - expect(/^\w+$/.test(param.name)).toBeTruthy(); - - // parameter description - expect(typeof param.description).toBe('string'); - // lint: Description should not be empty - expect(param.description.length).toBeGreaterThan(0); - checkDescription(param.description, p, processIds); - - // Parameter flags - expect(typeof param.optional === 'undefined' || typeof param.optional === 'boolean').toBeTruthy(); - // lint: don't specify default value "false" for optional - expect(typeof param.optional === 'undefined' || param.optional === true).toBeTruthy(); - // lint: make sure there's no old required flag - expect(typeof param.required === 'undefined').toBeTruthy(); - // lint: require a default value if the parameter is optional - if (param.optional === true && !anyOfRequired.includes(p.id)) { - expect(param.default).toBeDefined(); - } - // Check flags (recommended / experimental) - checkFlags(param); - - // Parameter schema - expect(param.schema).not.toBeNull(); - expect(typeof param.schema).toBe('object'); - checkJsonSchema(jsv, param.schema); - - if (checkCbParams) { - // Checking that callbacks (process-graphs) define their parameters - if (typeof param.schema === 'object' && param.schema.subtype === 'process-graph') { - // lint: A callback without parameters is not very useful - expect(Array.isArray(param.schema.parameters) && param.schema.parameters.length > 0).toBeTruthy(); - - // Check all callback params - for(var i in param.schema.parameters) { - checkParam(param.schema.parameters[i], p, false); - } - } - } -} - -function checkJsonSchemaValue(schema, value) { - jsv.validate(prepareSchema(schema), value); - 
expect(jsv.errors).toBeNull(); -} - -function o2a(o) { - if (!o) { - return []; - } - var a = []; - for(var k in o) { - a.push([ - o[k].name ? o[k].name : k, // name - o[k] // obj - ]); - } - return a; -} \ No newline at end of file diff --git a/tests/subtypes-file.test.js b/tests/subtypes-file.test.js deleted file mode 100644 index e70f7e8f..00000000 --- a/tests/subtypes-file.test.js +++ /dev/null @@ -1,29 +0,0 @@ -const fs = require('fs'); -const $RefParser = require("@apidevtools/json-schema-ref-parser"); -const { checkJsonSchema, getAjv, isObject, normalizeString } = require('./testHelpers'); - -test("File subtype-schemas.json", async () => { - let schema; - let fileContent; - try { - fileContent = fs.readFileSync('../meta/subtype-schemas.json'); - schema = JSON.parse(fileContent); - } catch(err) { - console.error("The file for subtypes is invalid and can't be read:"); - console.error(err); - expect(err).toBeUndefined(); - } - - expect(isObject(schema)).toBeTruthy(); - expect(isObject(schema.definitions)).toBeTruthy(); - - // lint: Check whether the file is correctly JSON formatted - expect(normalizeString(JSON.stringify(schema, null, 4))).toEqual(normalizeString(fileContent.toString())); - - // Is JSON Schema valid? - checkJsonSchema(await getAjv(), schema); - - // is everything dereferenceable? - let subtypes = await $RefParser.dereference(schema, { dereference: { circular: "ignore" } }); - expect(isObject(subtypes)).toBeTruthy(); -}); \ No newline at end of file
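An illustrative aside, not part of the diff: the dereferencing check in the deleted test above can be reproduced standalone with the same `@apidevtools/json-schema-ref-parser` call. The relative path assumes the repository layout shown in this patch series.

```js
const $RefParser = require("@apidevtools/json-schema-ref-parser");

// Resolve all $refs in the subtype schemas; circular references are kept
// as-is, mirroring the `dereference` options used in the tests above.
$RefParser
    .dereference('../meta/subtype-schemas.json', { dereference: { circular: "ignore" } })
    .then(schema => console.log(Object.keys(schema.definitions)))
    .catch(err => console.error(err));
```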
diff --git a/tests/subtypes-schemas.test.js b/tests/subtypes-schemas.test.js deleted file mode 100644 index ff1b72bd..00000000 --- a/tests/subtypes-schemas.test.js +++ /dev/null @@ -1,54 +0,0 @@ -const $RefParser = require("@apidevtools/json-schema-ref-parser"); -const { checkDescription, checkSpelling, isObject } = require('./testHelpers'); - -// I'd like to run the tests for each subtype individually instead of in a loop, -// but jest doesn't support that, so you need to figure out yourself what is broken. -// The console.log in afterAll ensures we have a hint of which process was checked last - -// Load and dereference schemas -let subtypes = {}; -let lastTest = null; -let testsCompleted = 0; -beforeAll(async () => { - subtypes = await $RefParser.dereference('../meta/subtype-schemas.json', { dereference: { circular: "ignore" } }); - return subtypes; -}); - -afterAll(async () => { - if (testsCompleted != Object.keys(subtypes.definitions).length) { - console.log('The schema the test has likely failed for: ' + lastTest); - } -}); - -test("Schemas in subtype-schemas.json", () => { - // Each schema must contain at least a type, subtype, title and description - for(let name in subtypes.definitions) { - let schema = subtypes.definitions[name]; - lastTest = name; - - // Schema is object - expect(isObject(schema)).toBeTruthy(); - - // Type is an array with at least one element, or a string - expect((Array.isArray(schema.type) && schema.type.length > 0) || typeof schema.type === 'string').toBeTruthy(); - - // Subtype is a string - expect(typeof schema.subtype === 'string').toBeTruthy(); - - // Check title - expect(typeof schema.title === 'string').toBeTruthy(); - // lint: Summary should be short - expect(schema.title.length).toBeLessThan(60); - // lint: Summary should not end with a dot - expect(/[^\.]$/.test(schema.title)).toBeTruthy(); - checkSpelling(schema.title, schema); - - // Check description - expect(typeof schema.description).toBe('string'); - // lint: Description should be longer than a summary - expect(schema.description.length).toBeGreaterThan(60); - checkDescription(schema.description, schema); - - testsCompleted++; - } -}); \ No newline at end of file diff --git a/tests/testConfig.json b/tests/testConfig.json new file mode 100644 index 00000000..60d8b893 --- /dev/null +++ b/tests/testConfig.json @@ -0,0 +1,13 @@ +{ + "folder": "../", + "proposalsFolder": "../proposals/", + "ignoredWords": ".words", + "anyOfRequired": [ + "array_element", + "quantiles" + ], + "subtypeSchemas": "../meta/subtype-schemas.json", + "checkSubtypeSchemas": true, + "forbidDeprecatedTypes": false, + "verbose": false +} diff --git a/tests/testHelpers.js b/tests/testHelpers.js deleted file mode 100644 index 418fd830..00000000 --- a/tests/testHelpers.js +++ /dev/null @@ -1,231 +0,0 @@ -const glob = require('glob'); -const fs = require('fs'); -const path = require('path'); -const ajv = require('ajv'); -const $RefParser = require("@apidevtools/json-schema-ref-parser"); -const markdownlint = require('markdownlint'); -const spellcheck = require('markdown-spellcheck').default; - -const ajvOptions = { - schemaId: 'auto', - format: 'full' -}; - -const spellcheckOptions = { - ignoreAcronyms: true, - ignoreNumbers: true, - suggestions: false, - relativeSpellingFiles: true, - dictionary: { - language: "en-us" - } -}; - -// Read custom dictionary for spell check -const words = fs.readFileSync('.words').toString().split(/\r\n|\n|\r/); -for(let i in words) { - spellcheck.spellcheck.addWord(words[i]); -} -// Add the process IDs to the word list -const files = glob.sync("../{*,proposals/*}.json", {realpath: true}); -for(let i in files) { - spellcheck.spellcheck.addWord(path.basename(files[i], path.extname(files[i]))); -} - - -async function getAjv() { - let subtypes = await $RefParser.dereference( - require('../meta/subtype-schemas.json'), - { - dereference: { circular: "ignore" } - } - ); - - let jsv = new ajv(ajvOptions); - jsv.addKeyword("parameters", { - dependencies: [ - "type", - "subtype" - ], - metaSchema: { - type: "array", - items: {
- type: "object", - required: [ - "name", - "description", - "schema" - ], - properties: { - name: { - type: "string", - pattern: "^[A-Za-z0-9_]+$" - }, - description: { - type: "string" - }, - optional: { - type: "boolean" - }, - deprecated: { - type: "boolean" - }, - experimental: { - type: "boolean" - }, - default: { - // Any type - }, - schema: { - oneOf: [ - { - type: "object", - // ToDo: Check Schema - }, - { - type: "array", - items: { - type: "object" - // ToDo: Check Schema - } - } - ] - } - } - } - }, - valid: true - }); - jsv.addKeyword("subtype", { - dependencies: [ - "type" - ], - metaSchema: { - type: "string", - enum: Object.keys(subtypes.definitions) - }, - compile: function (subtype, schema) { - if (schema.type != subtypes.definitions[subtype].type) { - throw "Subtype '"+subtype+"' not allowed for type '"+schema.type+"'." - } - return () => true; - }, - errors: false - }); - - return jsv; -} - -function isObject(obj) { - return (typeof obj === 'object' && obj === Object(obj) && !Array.isArray(obj)); -} - -function normalizeString(str) { - return str.replace(/\r\n|\r|\n/g, "\n").trim(); -} - -function checkDescription(text, p = null, processIds = [], commonmark = true) { - if (!text) { - return; - } - - // Check markdown - if (commonmark) { - const options = { - strings: { - description: text - }, - config: { - "line-length": false, // Nobody cares in JSON files anyway - "first-line-h1": false, // Usually no headings in descriptions - "fenced-code-language": false, // Usually no languages available anyway - "single-trailing-newline": false, // New lines at end of a JSON string doesn't make sense. We don't have files here. - } - }; - const result = markdownlint.sync(options); - expect(result).toEqual({description: []}); - } - - // Check spelling - checkSpelling(text, p); - - // Check whether process references are referencing valid processes - if (Array.isArray(processIds) && processIds.length > 0) { - let matches = text.matchAll(/(?:^|[^\w`])``(\w+)\(\)``(?![\w`])/g); - for(match of matches) { - expect(processIds).toContain(match[1]); - } - } -} - -function checkSpelling(text, p = null) { - if (!text) { - return; - } - - const errors = spellcheck.spell(text, spellcheckOptions); - if (errors.length > 0) { - let pre = "Misspelled word"; - if (p && p.id) { - pre += " in " + p.id; - } - console.warn(pre + ": " + JSON.stringify(errors)); - } -} - -function prepareSchema(schema) { - if (Array.isArray(schema)) { - schema = { - anyOf: schema - }; - } - if (typeof schema["$schema"] === 'undefined') { - // Set applicable JSON SChema draft version if not already set - schema["$schema"] = "http://json-schema.org/draft-07/schema#"; - } - return schema; -} - -function checkJsonSchema(jsv, schema, checkFormat = true) { - if (Array.isArray(schema)) { - // lint: For array schemas there should be more than one schema specified, otherwise use directly the schema object - expect(schema.length).toBeGreaterThan(1); - } - - let result = jsv.compile(prepareSchema(schema)); - expect(result.errors).toBeNull(); - - checkSchemaRecursive(schema, checkFormat); -} - -function checkSchemaRecursive(schema, checkFormat = true) { - for(var i in schema) { - var val = schema[i]; - if (typeof val === 'object' && val !== null) { - checkSchemaRecursive(val, checkFormat); - } - - switch(i) { - case 'title': - case 'description': - checkSpelling(val); - break; - case 'format': - if (checkFormat && schema.subtype !== val) { - throw "format '"+val+"' has no corresponding subtype."; - } - break; - } - } -} - 
-module.exports = { - getAjv, - normalizeString, - checkDescription, - checkSpelling, - checkJsonSchema, - checkSchemaRecursive, - prepareSchema, - isObject -}; \ No newline at end of file From a5764ed2fadbd345432809af00abc6b016e8011b Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 30 Jan 2023 16:30:45 +0100 Subject: [PATCH 066/117] Migrate to general data cube definition (#382) Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com> --- CHANGELOG.md | 11 ++- add_dimension.json | 9 +- aggregate_spatial.json | 58 ++++++++---- aggregate_temporal.json | 16 +++- aggregate_temporal_period.json | 16 +++- anomaly.json | 21 ++++- apply.json | 8 +- apply_dimension.json | 14 +-- apply_kernel.json | 26 +++++- apply_neighborhood.json | 48 ++++++++-- climatological_normal.json | 14 ++- ..._raster_cube.json => create_data_cube.json | 12 +-- dimension_labels.json | 4 +- drop_dimension.json | 6 +- filter_bands.json | 16 +++- filter_bbox.json | 64 ++++++++++--- filter_spatial.json | 53 ++++++++--- filter_temporal.json | 14 ++- load_collection.json | 19 +++- mask.json | 35 +++++++- mask_polygon.json | 50 +++++++++-- merge_cubes.json | 18 ++-- meta/subtype-schemas.json | 14 ++- ndvi.json | 25 +++++- proposals/aggregate_spatial_window.json | 26 +++++- .../ard_normalized_radar_backscatter.json | 34 +++++-- proposals/ard_surface_reflectance.json | 34 +++++-- proposals/atmospheric_correction.json | 34 +++++-- proposals/cloud_detection.json | 34 +++++-- proposals/filter_labels.json | 8 +- proposals/filter_vector.json | 89 +++++++++++++++++++ proposals/fit_class_random_forest.json | 37 ++++++-- proposals/fit_curve.json | 8 +- proposals/fit_regr_random_forest.json | 37 ++++++-- proposals/flatten_dimensions.json | 4 +- proposals/inspect.json | 2 +- proposals/load_result.json | 19 +++- proposals/load_uploaded_files.json | 2 +- proposals/predict_curve.json | 10 +-- proposals/reduce_spatial.json | 17 +++- proposals/resample_cube_temporal.json | 25 ++++-- proposals/run_udf_externally.json | 4 +- proposals/sar_backscatter.json | 32 ++++++- proposals/unflatten_dimension.json | 4 +- proposals/vector_buffer.json | 19 +++- proposals/vector_to_random_points.json | 25 ++++-- proposals/vector_to_regular_points.json | 25 ++++-- reduce_dimension.json | 8 +- rename_dimension.json | 6 +- rename_labels.json | 4 +- resample_cube_spatial.json | 39 ++++++-- resample_spatial.json | 24 ++++- save_result.json | 14 +-- tests/package.json | 2 +- tests/testHelpers.js | 70 ++++++++++++++- trim_cube.json | 10 +-- 56 files changed, 1038 insertions(+), 239 deletions(-) rename create_raster_cube.json => create_data_cube.json (53%) create mode 100644 proposals/filter_vector.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 2a69e93e..6f394d38 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - New processes in proposal state: + - `filter_vector` - `fit_class_random_forest` - `fit_regr_random_forest` - `flatten_dimensions` @@ -19,6 +20,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `vector_buffer` - `vector_to_random_points` - `vector_to_regular_points` +- `add_dimension`: Added new dimension type `geometries`. 
[#68](https://github.com/Open-EO/openeo-processes/issues/68) ### Changed @@ -34,15 +36,22 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Added a `NoDataAvailable` exception - `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369) - `save_result`: Added a more concrete `DataCubeEmpty` exception. +- New definition for `aggregate_spatial`: + - Allows more than 3 input dimensions [#126](https://github.com/Open-EO/openeo-processes/issues/126) + - Allow not exporting statistics by changing the parameter `target_dimension` [#366](https://github.com/Open-EO/openeo-processes/issues/366) + - Clarify what the resulting vector data cube looks like [#356](https://github.com/Open-EO/openeo-processes/issues/356) +- Renamed `create_raster_cube` to `create_data_cube`. [#68](https://github.com/Open-EO/openeo-processes/issues/68) +- Updated the processes based on the subtypes `raster-cube` or `vector-cube` to work with the subtype `datacube` instead. [#68](https://github.com/Open-EO/openeo-processes/issues/68) ### Removed - The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. +- Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) ### Fixed - `aggregate_spatial`: - - Clarified that vector properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) + - Clarified that feature properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists. - `apply` and `array_apply`: Fixed broken references to the `absolute` process - `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schema for the default value defined. diff --git a/add_dimension.json b/add_dimension.json index a3b07075..b156846b 100644 --- a/add_dimension.json +++ b/add_dimension.json @@ -11,7 +11,7 @@ "description": "A data cube to add the dimension to.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -39,9 +39,10 @@ "schema": { "type": "string", "enum": [ + "bands", + "geometries", "spatial", "temporal", - "bands", "other" ] }, @@ -53,7 +54,7 @@ "description": "The data cube with a newly added dimension. The new dimension has exactly one dimension label. All other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { "DimensionExists": { "message": "A dimension with the specified name already exists." } } -} \ No newline at end of file +} diff --git a/aggregate_spatial.json b/aggregate_spatial.json index 4020610c..380e34c0 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -1,7 +1,7 @@ { "id": "aggregate_spatial", "summary": "Zonal statistics for geometries", - "description": "Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions.
The number of total and valid pixels is returned together with the calculated values.\n\nAn 'unbounded' aggregation over the full extent of the horizontal spatial dimensions can be computed with the process ``reduce_spatial()``.\n\nThis process passes a list of values to the reducer. The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.", + "description": "Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions. The given data cube can have multiple additional dimensions and for all these dimensions results will be computed individually.\n\nAn 'unbounded' aggregation over the full extent of the horizontal spatial dimensions can be computed with the process ``reduce_spatial()``.\n\nThis process passes a list of values to the reducer. The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.", "categories": [ "cubes", "aggregate & resample" @@ -9,19 +9,40 @@ "parameters": [ { "name": "data", - "description": "A raster data cube.\n\nThe data cube must have been reduced to only contain two spatial dimensions and a third dimension the values are aggregated for, for example the temporal dimension to get a time series. Otherwise, this process fails with the `TooManyDimensions` exception.\n\nThe data cube implicitly gets restricted to the bounds of the geometries as if ``filter_spatial()`` would have been used with the same values for the corresponding parameters immediately before this process.", + "description": "A raster data cube with at least two spatial dimensions.\n\nThe data cube implicitly gets restricted to the bounds of the geometries as if ``filter_spatial()`` would have been used with the same values for the corresponding parameters immediately before this process.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { "name": "geometries", - "description": "Geometries as GeoJSON on which the aggregation will be based. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per GeoJSON `Feature`, `Geometry` or `GeometryCollection`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", - "schema": { - "type": "object", - "subtype": "geojson" - } + "description": "Geometries for which the aggregation will be computed. 
Feature properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per label in the dimension of type `geometries`, GeoJSON `Feature` or `Geometry`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations. No operation is applied to geometries that are outside of the bounds of the data.", "schema": [ { "type": "object", "subtype": "geojson", "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", "subtype": "datacube", "dimensions": [ { "type": "geometries" } ] } ] }, { "name": "reducer", @@ -60,11 +81,14 @@ }, { "name": "target_dimension", - "description": "The name of a new dimension that is used to store the results. A new dimension will be created with the given name and type `other` (see ``add_dimension()``). Defaults to the dimension name `result`. Fails with a `TargetDimensionExists` exception if a dimension with the specified name exists.", + "description": "By default (which is `null`), the process only computes the results and doesn't add a new dimension.\n\nIf this parameter contains a new dimension name, the computation also stores information about the total count of pixels (valid + invalid pixels) and the number of valid pixels (see ``is_valid()``) for each computed value. These values are added as a new dimension. The new dimension of type `other` has the dimension labels `value`, `total_count` and `valid_count`.\n\nFails with a `TargetDimensionExists` exception if a dimension with the specified name exists.", "schema": { - "type": "string" + "type": [ + "string", + "null" + ] }, - "default": "result", + "default": null, "optional": true }, { @@ -78,16 +102,18 @@ } ], "returns": { - "description": "A vector data cube with the computed results and restricted to the bounds of the geometries.\n\nThe computed value is used for the dimension with the name that was specified in the parameter `target_dimension`.\n\nThe computation also stores information about the total count of pixels (valid + invalid pixels) and the number of valid pixels (see ``is_valid()``) for each geometry. These values are added as a new dimension with a dimension name derived from `target_dimension` by adding the suffix `_meta`. The new dimension has the dimension labels `total_count` and `valid_count`.", + "description": "A vector data cube with the computed results and restricted to the bounds of the geometries. The spatial dimensions are replaced by a geometries dimension and if `target_dimension` is not `null`, a new dimension is added.", "schema": { "type": "object", "subtype": "datacube", "dimensions": [ { "type": "geometries" } ] } }, "exceptions": { - "TooManyDimensions": { - "message": "The number of dimensions must be reduced to three for `aggregate_spatial`."
- }, "TargetDimensionExists": { "message": "A dimension with the specified target dimension name already exists." } diff --git a/aggregate_temporal.json b/aggregate_temporal.json index b68b366c..d63099b7 100644 --- a/aggregate_temporal.json +++ b/aggregate_temporal.json @@ -12,7 +12,12 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -162,7 +167,12 @@ "description": "A new data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the given temporal dimension.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, "examples": [ @@ -234,4 +244,4 @@ "title": "Aggregation explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json index 832e72aa..ce6ec410 100644 --- a/aggregate_temporal_period.json +++ b/aggregate_temporal_period.json @@ -13,7 +13,12 @@ "description": "The source data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -97,7 +102,12 @@ "description": "A new data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the given temporal dimension. The specified temporal dimension has the following dimension labels (`YYYY` = four-digit year, `MM` = two-digit month, `DD` two-digit day of month):\n\n* `hour`: `YYYY-MM-DD-00` - `YYYY-MM-DD-23`\n* `day`: `YYYY-001` - `YYYY-365`\n* `week`: `YYYY-01` - `YYYY-52`\n* `dekad`: `YYYY-00` - `YYYY-36`\n* `month`: `YYYY-01` - `YYYY-12`\n* `season`: `YYYY-djf` (December - February), `YYYY-mam` (March - May), `YYYY-jja` (June - August), `YYYY-son` (September - November).\n* `tropical-season`: `YYYY-ndjfma` (November - April), `YYYY-mjjaso` (May - October).\n* `year`: `YYYY`\n* `decade`: `YYY0`\n* `decade-ad`: `YYY1`\n\nThe dimension labels in the new data cube are complete for the whole extent of the source data cube. For example, if `period` is set to `day` and the source data cube has two dimension labels at the beginning of the year (`2020-01-01`) and the end of a year (`2020-12-31`), the process returns a data cube with 365 dimension labels (`2020-001`, `2020-002`, ..., `2020-365`). 
In contrast, if `period` is set to `day` and the source data cube has just one dimension label `2020-01-05`, the process returns a data cube with just a single dimension label (`2020-005`).", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, "exceptions": { @@ -118,4 +128,4 @@ "title": "Aggregation explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/anomaly.json b/anomaly.json index 3f369087..7a3890b7 100644 --- a/anomaly.json +++ b/anomaly.json @@ -13,7 +13,12 @@ "description": "A data cube with exactly one temporal dimension and the following dimension labels for the given period (`YYYY` = four-digit year, `MM` = two-digit month, `DD` two-digit day of month):\n\n* `hour`: `YYYY-MM-DD-00` - `YYYY-MM-DD-23`\n* `day`: `YYYY-001` - `YYYY-365`\n* `week`: `YYYY-01` - `YYYY-52`\n* `dekad`: `YYYY-00` - `YYYY-36`\n* `month`: `YYYY-01` - `YYYY-12`\n* `season`: `YYYY-djf` (December - February), `YYYY-mam` (March - May), `YYYY-jja` (June - August), `YYYY-son` (September - November).\n* `tropical-season`: `YYYY-ndjfma` (November - April), `YYYY-mjjaso` (May - October).\n* `year`: `YYYY`\n* `decade`: `YYY0`\n* `decade-ad`: `YYY1`\n* `single-period` / `climatology-period`: Any\n\n``aggregate_temporal_period()`` can compute such a data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -21,7 +26,12 @@ "description": "A data cube with normals, e.g. daily, monthly or yearly values computed from a process such as ``climatological_normal()``. Must contain exactly one temporal dimension with the following dimension labels for the given period:\n\n* `hour`: `00` - `23`\n* `day`: `001` - `365`\n* `week`: `01` - `52`\n* `dekad`: `00` - `36`\n* `month`: `01` - `12`\n* `season`: `djf` (December - February), `mam` (March - May), `jja` (June - August), `son` (September - November)\n* `tropical-season`: `ndjfma` (November - April), `mjjaso` (May - October)\n* `year`: Four-digit year numbers\n* `decade`: Four-digit year numbers, the last digit being a `0`\n* `decade-ad`: Four-digit year numbers, the last digit being a `1`\n* `single-period` / `climatology-period`: A single dimension label with any name is expected.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -50,7 +60,12 @@ "description": "A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } } } diff --git a/apply.json b/apply.json index d0be1e1d..20995c88 100644 --- a/apply.json +++ b/apply.json @@ -1,7 +1,7 @@ { "id": "apply", - "summary": "Apply a process to each pixel", - "description": "Applies a process to each pixel value in the data cube (i.e. a local operation). In contrast, the process ``apply_dimension()`` applies a process to all pixel values along a particular dimension.", + "summary": "Apply a process to each value", + "description": "Applies a process to each value in the data cube (i.e. a local operation). 
In contrast, the process ``apply_dimension()`` applies a process to all values along a particular dimension.", "categories": [ "cubes" ], @@ -11,7 +11,7 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -60,7 +60,7 @@ "description": "A data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "links": [ diff --git a/apply_dimension.json b/apply_dimension.json index 796393f4..7f8a5616 100644 --- a/apply_dimension.json +++ b/apply_dimension.json @@ -1,7 +1,7 @@ { "id": "apply_dimension", - "summary": "Apply a process to pixels along a dimension", - "description": "Applies a process to all pixel values along a dimension of a raster data cube. For example, if the temporal dimension is specified the process will work on a time series of pixel values.\n\nThe process ``reduce_dimension()`` also applies a process to pixel values along a dimension, but drops the dimension afterwards. The process ``apply()`` applies a process to each pixel value in the data cube.\n\nThe target dimension is the source dimension if not specified otherwise in the `target_dimension` parameter. The pixel values in the target dimension get replaced by the computed pixel values. The name, type and reference system are preserved.\n\nThe dimension labels are preserved when the target dimension is the source dimension and the number of pixel values in the source dimension is equal to the number of values computed by the process. Otherwise, the dimension labels will be incrementing integers starting from zero, which can be changed using ``rename_labels()`` afterwards. The number of labels will equal to the number of values computed by the process.", + "summary": "Apply a process to all values along a dimension", + "description": "Applies a process to all values along a dimension of a data cube. For example, if the temporal dimension is specified the process will work on the values of a time series.\n\nThe process ``reduce_dimension()`` also applies a process to values along a dimension, but drops the dimension afterwards. The process ``apply()`` applies a process to each value in the data cube.\n\nThe target dimension is the source dimension if not specified otherwise in the `target_dimension` parameter. The values in the target dimension get replaced by the computed values. The name, type and reference system are preserved.\n\nThe dimension labels are preserved when the target dimension is the source dimension and the number of values in the source dimension is equal to the number of values computed by the process. Otherwise, the dimension labels will be incrementing integers starting from zero, which can be changed using ``rename_labels()`` afterwards. The number of labels will be equal to the number of values computed by the process.", "categories": [ "cubes" ], @@ -11,12 +11,12 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { "name": "process", - "description": "Process to be applied on all pixel values. The specified process needs to accept an array and must return an array with at least one element. A process may consist of multiple sub-processes.", + "description": "Process to be applied on all values along the given dimension. 
The specified process needs to accept an array and must return an array with at least one element. A process may consist of multiple sub-processes.", "schema": { "type": "object", "subtype": "process-graph", @@ -83,10 +83,10 @@ } ], "returns": { - "description": "A data cube with the newly computed values.\n\nAll dimensions stay the same, except for the dimensions specified in corresponding parameters. There are three cases how the dimensions can change:\n\n1. The source dimension is the target dimension:\n - The (number of) dimensions remain unchanged as the source dimension is the target dimension.\n - The source dimension properties name and type remain unchanged.\n - The dimension labels, the reference system and the resolution are preserved only if the number of pixel values in the source dimension is equal to the number of values computed by the process. Otherwise, all other dimension properties change as defined in the list below.\n2. The source dimension is not the target dimension and the latter exists:\n - The number of dimensions decreases by one as the source dimension is dropped.\n - The target dimension properties name and type remain unchanged. All other dimension properties change as defined in the list below.\n3. The source dimension is not the target dimension and the latter does not exist:\n - The number of dimensions remain unchanged, but the source dimension is replaced with the target dimension.\n - The target dimension has the specified name and the type other. All other dimension properties are set as defined in the list below.\n\nUnless otherwise stated above, for the given (target) dimension the following applies:\n\n- the number of dimension labels is equal to the number of values computed by the process,\n- the dimension labels are incrementing integers starting from zero,\n- the resolution changes, and\n- the reference system is undefined.", + "description": "A data cube with the newly computed values.\n\nAll dimensions stay the same, except for the dimensions specified in corresponding parameters. There are three cases how the dimensions can change:\n\n1. The source dimension is the target dimension:\n - The (number of) dimensions remain unchanged as the source dimension is the target dimension.\n - The source dimension properties name and type remain unchanged.\n - The dimension labels, the reference system and the resolution are preserved only if the number of values in the source dimension is equal to the number of values computed by the process. Otherwise, all other dimension properties change as defined in the list below.\n2. The source dimension is not the target dimension and the latter exists:\n - The number of dimensions decreases by one as the source dimension is dropped.\n - The target dimension properties name and type remain unchanged. All other dimension properties change as defined in the list below.\n3. The source dimension is not the target dimension and the latter does not exist:\n - The number of dimensions remain unchanged, but the source dimension is replaced with the target dimension.\n - The target dimension has the specified name and the type other. 
All other dimension properties are set as defined in the list below.\n\nUnless otherwise stated above, for the given (target) dimension the following applies:\n\n- the number of dimension labels is equal to the number of values computed by the process,\n- the dimension labels are incrementing integers starting from zero,\n- the resolution changes, and\n- the reference system is undefined.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -101,4 +101,4 @@ "title": "Apply explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/apply_kernel.json b/apply_kernel.json index 20d940c2..cf16dc78 100644 --- a/apply_kernel.json +++ b/apply_kernel.json @@ -1,7 +1,7 @@ { "id": "apply_kernel", "summary": "Apply a spatial convolution with a kernel", - "description": "Applies a 2D convolution (i.e. a focal operation with a weighted kernel) on the horizontal spatial dimensions (axes `x` and `y`) of the data cube.\n\nEach value in the kernel is multiplied with the corresponding pixel value and all products are summed up afterwards. The sum is then multiplied with the factor.\n\nThe process can't handle non-numerical or infinite numerical values in the data cube. Boolean values are converted to integers (`false` = 0, `true` = 1), but all other non-numerical or infinite values are replaced with zeroes by default (see parameter `replace_invalid`).\n\nFor cases requiring more generic focal operations or non-numerical values, see ``apply_neighborhood()``.", + "description": "Applies a 2D convolution (i.e. a focal operation with a weighted kernel) on the horizontal spatial dimensions (axes `x` and `y`) of a raster data cube.\n\nEach value in the kernel is multiplied with the corresponding pixel value and all products are summed up afterwards. The sum is then multiplied with the factor.\n\nThe process can't handle non-numerical or infinite numerical values in the data cube. Boolean values are converted to integers (`false` = 0, `true` = 1), but all other non-numerical or infinite values are replaced with zeroes by default (see parameter `replace_invalid`).\n\nFor cases requiring more generic focal operations or non-numerical values, see ``apply_neighborhood()``.", "categories": [ "cubes", "math > image filter" @@ -9,10 +9,19 @@ "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -73,7 +82,16 @@ "description": "A data cube with the newly computed values and the same dimensions. 
The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "exceptions": { diff --git a/apply_neighborhood.json b/apply_neighborhood.json index 71d47ef1..3b89adf4 100644 --- a/apply_neighborhood.json +++ b/apply_neighborhood.json @@ -8,10 +8,19 @@ "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -26,7 +35,16 @@ "description": "A subset of the data cube as specified in `size` and `overlap`.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -43,7 +61,16 @@ "description": "The data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) must remain unchanged, otherwise a `DataCubePropertiesImmutable` exception will be thrown.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } } } @@ -184,10 +211,19 @@ } ], "returns": { - "description": "A data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", + "description": "A raster data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "examples": [ diff --git a/climatological_normal.json b/climatological_normal.json index b40ed1a9..8d97ef5a 100644 --- a/climatological_normal.json +++ b/climatological_normal.json @@ -12,7 +12,12 @@ "description": "A data cube with exactly one temporal dimension. The data cube must span at least the temporal interval specified in the parameter `climatology-period`.\n\nSeasonal periods may span two consecutive years, e.g. temporal winter that includes months December, January and February. If the required months before the actual climate period are available, the season is taken into account. If not available, the first season is not taken into account and the seasonal mean is based on one year less than the other seasonal normals. The incomplete season at the end of the last year is never taken into account.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -56,7 +61,12 @@ "description": "A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the temporal dimension. 
The temporal dimension has the following dimension labels:\n\n* `day`: `001` - `365`\n* `month`: `01` - `12`\n* `climatology-period`: `climatology-period`\n* `season`: `djf` (December - February), `mam` (March - May), `jja` (June - August), `son` (September - November)\n* `tropical-season`: `ndjfma` (November - April), `mjjaso` (May - October)", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, "links": [ diff --git a/create_raster_cube.json b/create_data_cube.json similarity index 53% rename from create_raster_cube.json rename to create_data_cube.json index 576728ee..55f0aede 100644 --- a/create_raster_cube.json +++ b/create_data_cube.json @@ -1,16 +1,16 @@ { - "id": "create_raster_cube", - "summary": "Create an empty raster data cube", - "description": "Creates a new raster data cube without dimensions. Dimensions can be added with ``add_dimension()``.", + "id": "create_data_cube", + "summary": "Create an empty data cube", + "description": "Creates a new data cube without dimensions. Dimensions can be added with ``add_dimension()``.", "categories": [ "cubes" ], "parameters": [], "returns": { - "description": "An empty raster data cube with zero dimensions.", + "description": "An empty data cube with no dimensions.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "links": [ @@ -20,4 +20,4 @@ "title": "Data Cubes explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/dimension_labels.json b/dimension_labels.json index 37a5908d..15c5ba0f 100644 --- a/dimension_labels.json +++ b/dimension_labels.json @@ -11,7 +11,7 @@ "description": "The data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -39,4 +39,4 @@ "message": "A dimension with the specified name does not exist." } } -} \ No newline at end of file +} diff --git a/drop_dimension.json b/drop_dimension.json index 90212dd9..eaee1d4c 100644 --- a/drop_dimension.json +++ b/drop_dimension.json @@ -11,7 +11,7 @@ "description": "The data cube to drop a dimension from.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -26,7 +26,7 @@ "description": "A data cube without the specified dimension. The number of dimensions decreases by one, but the dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -37,4 +37,4 @@ "message": "A dimension with the specified name does not exist." } } -} \ No newline at end of file +} diff --git a/filter_bands.json b/filter_bands.json index ee9c9aae..24ccf023 100644 --- a/filter_bands.json +++ b/filter_bands.json @@ -12,7 +12,12 @@ "description": "A data cube with bands.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "bands" + } + ] } }, { @@ -62,7 +67,12 @@ "description": "A data cube limited to a subset of its original bands. 
The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the dimension of type `bands` has less (or the same) dimension labels.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "bands" + } + ] } }, "exceptions": { @@ -85,4 +95,4 @@ "title": "Filters explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/filter_bbox.json b/filter_bbox.json index 8cc2103a..818bcaaa 100644 --- a/filter_bbox.json +++ b/filter_bbox.json @@ -1,7 +1,7 @@ { "id": "filter_bbox", "summary": "Spatial filter using a bounding box", - "description": "Limits the data cube to the specified bounding box.\n\nThe filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC).", + "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_vector()`` can be used to filter by geometry.", "categories": [ "cubes", "filter" @@ -10,10 +10,32 @@ { "name": "data", "description": "A data cube.", - "schema": { - "type": "object", - "subtype": "raster-cube" - } + "schema": [ + { + "title": "Raster data cube", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] + }, + { + "title": "Vector data cube", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + ] }, { "name": "extent", @@ -92,10 +114,32 @@ ], "returns": { "description": "A data cube restricted to the bounding box. 
The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the spatial dimensions have less (or the same) dimension labels.", - "schema": { - "type": "object", - "subtype": "raster-cube" - } + "schema": [ + { + "title": "Raster data cube", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] + }, + { + "title": "Vector data cube", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + ] }, "links": [ { @@ -124,4 +168,4 @@ "title": "Simple Features standard by the OGC" } ] -} \ No newline at end of file +} diff --git a/filter_spatial.json b/filter_spatial.json index b807b8df..b6d7a7de 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -1,7 +1,7 @@ { "id": "filter_spatial", - "summary": "Spatial filter using geometries", - "description": "Limits the data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to `null` (no data).", + "summary": "Spatial filter raster data cubes using geometries", + "description": "Limits the raster data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to `null` (no data).\n\n Alternatively, use ``filter_bbox()`` to filter by bounding box.", "categories": [ "cubes", "filter" @@ -9,26 +9,55 @@ "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { "name": "geometries", - "description": "One or more geometries used for filtering, specified as GeoJSON.", - "schema": { - "type": "object", - "subtype": "geojson" - } + "description": "One or more geometries used for filtering, given as GeoJSON or vector data cube. If multiple geometries are provided, the union of them is used.\n\nLimits the data cube to the bounding box of the given geometries. No implicit masking gets applied. 
To mask the pixels of the data cube use ``mask_polygon()``.", + "schema": [ + { + "type": "object", + "subtype": "geojson" + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + ] } ], "returns": { - "description": "A data cube restricted to the specified geometries. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the spatial dimensions have less (or the same) dimension labels.", + "description": "A raster data cube restricted to the specified geometries. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the spatial dimensions have less (or the same) dimension labels.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "links": [ @@ -43,4 +72,4 @@ "title": "Simple Features standard by the OGC" } ] -} \ No newline at end of file +} diff --git a/filter_temporal.json b/filter_temporal.json index bd7ea0b3..0ba2274e 100644 --- a/filter_temporal.json +++ b/filter_temporal.json @@ -12,7 +12,12 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -76,7 +81,12 @@ "description": "A data cube restricted to the specified temporal extent. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the temporal dimensions (determined by `dimensions` parameter) may have less dimension labels.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, "exceptions": { diff --git a/load_collection.json b/load_collection.json index 160888be..1a5296e2 100644 --- a/load_collection.json +++ b/load_collection.json @@ -1,7 +1,7 @@ { "id": "load_collection", "summary": "Load a collection", - "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. The data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent`, `bands` and `properties`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", + "description": "Loads a collection from the current back-end by its id and returns it as a processable data cube. The data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent`, `bands` and `properties`. 
If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values in the data cube should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" @@ -18,7 +18,7 @@ }, { "name": "spatial_extent", - "description": "Limits the data to load from the collection to the specified bounding box or polygons.\n\nThe process puts a pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry,\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries, or\n* a `GeometryCollection` containing `Polygon` or `MultiPolygon` geometries. To maximize interoperability, `GeometryCollection` should be avoided in favour of one of the alternatives above.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "description": "Limits the data to load from the collection to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully *within* the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", "schema": [ { "title": "Bounding Box", @@ -93,10 +93,21 @@ }, { "title": "GeoJSON", - "description": "Limits the data cube to the bounding box of the given geometry. All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "description": "Limits the data cube to the bounding box of the given geometries. 
For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported.", "type": "object", "subtype": "geojson" }, + { + "title": "Vector data cube", + "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + }, { "title": "No filter", "description": "Don't filter spatially. All data is included in the data cube.", @@ -219,7 +230,7 @@ "description": "A data cube for further processing. The dimensions and dimension properties (name, type, labels, reference system and resolution) correspond to the collection's metadata, but the dimension labels are restricted as specified in the parameters.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/mask.json b/mask.json index 515c81cb..d5940b25 100644 --- a/mask.json +++ b/mask.json @@ -1,7 +1,7 @@ { "id": "mask", "summary": "Apply a raster mask", - "description": "Applies a mask to a raster data cube. To apply a vector mask use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible so that each dimension in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied to each label of the dimension in `data` that is missing in the data cube of the mask. The process fails if there's an incompatibility found between the raster data cube and the mask.", + "description": "Applies a mask to a raster data cube. To apply a polygon as a mask, use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible so that each dimension in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied to each label of the dimension in `data` that is missing in the data cube of the mask. The process fails if there's an incompatibility found between the raster data cube and the mask.", "categories": [ "cubes", "masks" @@ -12,7 +12,16 @@ "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -20,7 +29,16 @@ "description": "A mask as a raster data cube. 
Every pixel in `data` must have a corresponding element in `mask`.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -42,7 +60,16 @@ "description": "A masked raster data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } } } diff --git a/mask_polygon.json b/mask_polygon.json index c1f59d4e..f79db016 100644 --- a/mask_polygon.json +++ b/mask_polygon.json @@ -12,16 +12,41 @@ "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { "name": "mask", - "description": "A GeoJSON object containing at least one polygon. The provided feature types can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry,\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries, or\n* a `GeometryCollection` containing `Polygon` or `MultiPolygon` geometries. To maximize interoperability, `GeometryCollection` should be avoided in favour of one of the alternatives above.", - "schema": { - "type": "object", - "subtype": "geojson" - } + "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.", + "schema": [ + { + "type": "object", + "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries", + "geometry_type": [ + "Polygon", + "MultiPolygon" + ] + } + ] + } + ] }, { "name": "replacement", @@ -57,7 +82,16 @@ "description": "A masked raster data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "links": [ @@ -67,4 +101,4 @@ "title": "Simple Features standard by the OGC" } ] -} \ No newline at end of file +} diff --git a/merge_cubes.json b/merge_cubes.json index 28b4803b..e41d5f2e 100644 --- a/merge_cubes.json +++ b/merge_cubes.json @@ -1,7 +1,7 @@ { "id": "merge_cubes", "summary": "Merge two data cubes", - "description": "The data cubes have to be compatible. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes. The process performs the join on overlapping dimensions, with the same name and type.\n\nAn overlapping dimension has the same name, type, reference system and resolution in both dimensions, but can have different labels. One of the dimensions can have different labels, for all other dimensions the labels must be equal. If data overlaps, the parameter `overlap_resolver` must be specified to resolve the overlap.\n\n**Examples for merging two data cubes:**\n\n1. 
Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first cube and `B3` and `B4`. An overlap resolver is *not needed*. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has four dimension labels: `B1`, `B2`, `B3`, `B4`.\n2. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first data cube and `B2` and `B3` for the second. An overlap resolver is *required* to resolve overlap in band `B2`. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has three dimension labels: `B1`, `B2`, `B3`.\n3. Data cubes with the dimensions (`x`, `y`, `t`) have the same dimension labels in `x`, `y` and `t`. There are two options:\n 1. Keep the overlapping values separately in the merged data cube: An overlap resolver is *not needed*, but for each data cube you need to add a new dimension using ``add_dimension()``. The new dimensions must be equal, except that the labels for the new dimensions must differ by name. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with ``add_dimension()``, which has the two dimension labels after the merge.\n 2. Combine the overlapping values into a single value: An overlap resolver is *required* to resolve the overlap for all pixels. The merged data cube has the same dimensions and labels as the original data cubes, but all pixel values have been processed by the overlap resolver.\n4. A data cube with dimensions (`x`, `y`, `t` / `bands`) or (`x`, `y`, `t`, `bands`) and another data cube with dimensions (`x`, `y`) have the same dimension labels in `x` and `y`. Merging them will join dimensions `x` and `y`, so the lower dimension cube is merged with each time step and band available in the higher dimensional cube. This can for instance be used to apply a digital elevation model to a spatio-temporal data cube. An overlap resolver is *required* to resolve the overlap for all pixels.\n\nAfter the merge, the dimensions with a natural/inherent label order (with a reference system this is each spatial and temporal dimensions) still have all dimension labels sorted. For other dimensions where there is no inherent order, including bands, the dimension labels keep the order in which they are present in the original data cubes and the dimension labels of `cube2` are appended to the dimension labels of `cube1`.", + "description": "The process performs the join on overlapping dimensions. The data cubes have to be compatible. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes. As such it is not possible to merge a vector and a raster data cube. It is also not possible to merge vector data cubes that contain different base geometry types (points, lines/line strings, polygons). The base geometry types can be merged with their corresponding multi geometry types. In case of such a conflict, the `IncompatibleGeometryTypes` exception is thrown.\n\nOverlapping dimensions have the same name, type, reference system and resolution, but can have different labels. One of the dimensions can have different labels, for all other dimensions the labels must be equal. Equality for geometries follows the definition in the Simple Features standard by the OGC. 
If data overlaps, the parameter `overlap_resolver` must be specified to resolve the overlap.\n\n**Examples for merging two data cubes:**\n\n1. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first cube and `B3` and `B4` for the second. An overlap resolver is *not needed*. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has four dimension labels: `B1`, `B2`, `B3`, `B4`.\n2. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first data cube and `B2` and `B3` for the second. An overlap resolver is *required* to resolve overlap in band `B2`. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has three dimension labels: `B1`, `B2`, `B3`.\n3. Data cubes with the dimensions (`x`, `y`, `t`) have the same dimension labels in `x`, `y` and `t`. There are two options:\n 1. Keep the overlapping values separately in the merged data cube: An overlap resolver is *not needed*, but for each data cube you need to add a new dimension using ``add_dimension()``. The new dimensions must be equal, except that the labels for the new dimensions must differ by name. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with ``add_dimension()``, which has the two dimension labels after the merge.\n 2. Combine the overlapping values into a single value: An overlap resolver is *required* to resolve the overlap for all values. The merged data cube has the same dimensions and labels as the original data cubes, but all values have been processed by the overlap resolver.\n4. A data cube with dimensions (`x`, `y`, `t` / `bands`) or (`x`, `y`, `t`, `bands`) and another data cube with dimensions (`x`, `y`) have the same dimension labels in `x` and `y`. Merging them will join dimensions `x` and `y`, so the lower-dimensional cube is merged with each time step and band available in the higher-dimensional cube. This can, for instance, be used to apply a digital elevation model to a spatio-temporal data cube. An overlap resolver is *required* to resolve the overlap for all pixels.\n\nAfter the merge, the dimensions with a natural/inherent label order (with a reference system, these are the spatial and temporal dimensions) still have all dimension labels sorted. For other dimensions where there is no inherent order, including bands, the dimension labels keep the order in which they are present in the original data cubes and the dimension labels of `cube2` are appended to the dimension labels of `cube1`.", "categories": [ "cubes" ], @@ -11,7 +11,7 @@ "description": "The first data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -19,7 +19,7 @@ "description": "The second data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -77,12 +77,15 @@ "description": "The merged data cube. See the process description for details regarding the dimensions and dimension properties (name, type, labels, reference system and resolution).", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { "OverlapResolverMissing": { "message": "Overlapping data cubes, but no overlap resolver has been specified." 
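For illustration, a minimal sketch of a `merge_cubes` node in an openEO process graph, resolving overlap with a binary `mean` reducer (node names and `from_node` references are hypothetical placeholders):

```typescript
// Sketch only: a merge_cubes node whose overlap_resolver averages the two
// overlapping values. "loadA"/"loadB" are hypothetical upstream node names;
// "x" and "y" are the resolver's two input parameters.
const mergeNode = {
  process_id: "merge_cubes",
  arguments: {
    cube1: { from_node: "loadA" },
    cube2: { from_node: "loadB" },
    overlap_resolver: {
      process_graph: {
        resolve: {
          process_id: "mean",
          arguments: {
            data: [{ from_parameter: "x" }, { from_parameter: "y" }]
          },
          result: true
        }
      }
    }
  },
  result: true
};
```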
+ }, + "IncompatibleGeometryTypes": { + "message": "The geometry types are not compatible and can't be merged." } }, "links": [ @@ -90,6 +93,11 @@ "rel": "about", "href": "https://en.wikipedia.org/wiki/Reduction_Operator", "title": "Background information on reduction operators (binary reducers) by Wikipedia" + }, + { + "href": "http://www.opengeospatial.org/standards/sfa", + "rel": "about", + "title": "Simple Features standard by the OGC" } ] -} \ No newline at end of file +} diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index 3dc15b36..17cd2b72 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -1,6 +1,6 @@ { "$schema": "http://json-schema.org/draft-07/schema#", - "$id": "http://processes.openeo.org/1.2.0/meta/subtype-schemas.json", + "$id": "https://processes.openeo.org/1.2.0/meta/subtype-schemas.json", "title": "Subtype Schemas", "description": "This file defines the schemas for subtypes we define for openEO processes.", "definitions": { @@ -112,6 +112,12 @@ "description": "A collection identifier from the list of supported collections.", "pattern": "^[\\w\\-\\.~/]+$" }, + "datacube": { + "type": "object", + "subtype": "datacube", + "title": "Data Cube", + "description": "A data cube that consists of an arbitrary number of dimensions and doesn't require any dimension type specifically." + }, "date": { "type": "string", "subtype": "date", @@ -290,7 +296,8 @@ "type": "object", "subtype": "raster-cube", "title": "Raster data cube", - "description": "A raster data cube, an image collection stored at the back-end. Different back-ends have different internal representations for this data structure." + "description": "A raster data cube, which is a data cube with two dimensions of type spatial (x and y). This has been deprecated in favour of `datacube`.", + "deprecated": true }, "temporal-interval": { "type": "array", @@ -417,7 +424,8 @@ "type": "object", "subtype": "vector-cube", "title": "Vector data cube", - "description": "A vector data cube, a vector collection stored at the back-end. Different back-ends have different internal representations for this data structure" + "description": "A vector data cube, which is a data cube with a dimension of type `geometries`. This has been deprecated in favour of `datacube`.", + "deprecated": true }, "wkt2-definition": { "type": "string", diff --git a/ndvi.json b/ndvi.json index e86a27e6..5bb952d4 100644 --- a/ndvi.json +++ b/ndvi.json @@ -13,7 +13,19 @@ "description": "A raster data cube with two bands that have the common names `red` and `nir` assigned.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -56,7 +68,16 @@ "description": "A raster data cube containing the computed NDVI values. The structure of the data cube differs depending on the value passed to `target_band`:\n\n* `target_band` is `null`: The data cube does not contain the dimension of type `bands`, the number of dimensions decreases by one. The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.\n* `target_band` is a string: The data cube keeps the same dimensions. The dimension properties remain unchanged, but the number of dimension labels for the dimension of type `bands` increases by one. 
The additional label is named as specified in `target_band`.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "exceptions": { diff --git a/proposals/aggregate_spatial_window.json b/proposals/aggregate_spatial_window.json index 77230275..5bc3e03c 100644 --- a/proposals/aggregate_spatial_window.json +++ b/proposals/aggregate_spatial_window.json @@ -13,7 +13,16 @@ "description": "A raster data cube with exactly two horizontal spatial dimensions and an arbitrary number of additional dimensions. The process is applied to all additional dimensions individually.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -103,10 +112,19 @@ } ], "returns": { - "description": "A data cube with the newly computed values and the same dimensions.\n\nThe resolution will change depending on the chosen values for the `size` and `boundary` parameter. It usually decreases for the dimensions which have the corresponding parameter `size` set to values greater than 1.\n\nThe dimension labels will be set to the coordinate at the center of the window. The other dimension properties (name, type and reference system) remain unchanged.", + "description": "A raster data cube with the newly computed values and the same dimensions.\n\nThe resolution will change depending on the chosen values for the `size` and `boundary` parameter. It usually decreases for the dimensions which have the corresponding parameter `size` set to values greater than 1.\n\nThe dimension labels will be set to the coordinate at the center of the window. The other dimension properties (name, type and reference system) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "links": [ @@ -116,4 +134,4 @@ "title": "Aggregation explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/proposals/ard_normalized_radar_backscatter.json b/proposals/ard_normalized_radar_backscatter.json index e643845f..ec60de44 100644 --- a/proposals/ard_normalized_radar_backscatter.json +++ b/proposals/ard_normalized_radar_backscatter.json @@ -13,8 +13,20 @@ "name": "data", "description": "The source data cube containing SAR input.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -73,8 +85,20 @@ "returns": { "description": "Backscatter values expressed as gamma0 in linear scale.\n\nIn addition to the bands `contributing_area` and `ellipsoid_incidence_angle` that can optionally be added with corresponding parameters, the following bands are always added to the data cube:\n\n- `mask`: A data mask that indicates which values are valid (1), invalid (0) or contain no-data (null).\n- `local_incidence_angle`: A band with DEM-based local incidence angles in degrees.\n\nThe data returned is CARD4L compliant with corresponding metadata.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, "exceptions": { @@ -128,4 +152,4 @@ "result": true } } -} \ No 
newline at end of file +} diff --git a/proposals/ard_surface_reflectance.json b/proposals/ard_surface_reflectance.json index 38aa758b..01328f10 100644 --- a/proposals/ard_surface_reflectance.json +++ b/proposals/ard_surface_reflectance.json @@ -13,8 +13,20 @@ "description": "The source data cube containing multi-spectral optical top of the atmosphere (TOA) reflectances. There must be a single dimension of type `bands` available.", "name": "data", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -83,8 +95,20 @@ "returns": { "description": "Data cube containing bottom of atmosphere reflectances for each spectral band in the source data cube, with atmospheric disturbances like clouds and cloud shadows removed. No-data values (null) are directly set in the bands. Depending on the methods used, several additional bands will be added to the data cube:\n\n- `date` (optional): Specifies per-pixel acquisition timestamps.\n- `incomplete-testing` (required): Identifies pixels with a value of 1 for which the per-pixel tests (at least saturation, cloud and cloud shadows, see CARD4L specification for details) have not all been successfully completed. Otherwise, the value is 0.\n- `saturation` (required) / `saturation_{band}` (optional): Indicates where pixels in the input spectral bands are saturated (1) or not (0). If the saturation is given per band, the band names are `saturation_{band}` with `{band}` being the band name from the source data cube.\n- `cloud`, `shadow` (both required), `aerosol`, `haze`, `ozone`, `water_vapor` (all optional): Indicates the probability of pixels being an atmospheric disturbance such as clouds. All bands have values between 0 (clear) and 1, which describes the probability that it is an atmospheric disturbance.\n- `snow-ice` (optional): Indicates whether a pixel is assessed as being snow/ice (1) or not (0). All values describe the probability and must be between 0 and 1.\n- `land-water` (optional): Indicates whether a pixel is assessed as being land (1) or water (0). All values describe the probability and must be between 0 and 1.\n- `incidence-angle` (optional): Specifies per-pixel incidence angles in degrees.\n- `azimuth` (optional): Specifies per-pixel azimuth angles in degrees.\n- `sun-azimuth` (optional): Specifies per-pixel sun azimuth angles in degrees.\n- `sun-elevation` (optional): Specifies per-pixel sun elevation angles in degrees.\n- `terrain-shadow` (optional): Indicates with a value of 1 whether a pixel is not directly illuminated due to terrain shadowing. Otherwise, the value is 0.\n- `terrain-occlusion` (optional): Indicates with a value of 1 whether a pixel is not visible to the sensor due to terrain occlusion during off-nadir viewing. 
Otherwise, the value is 0.\n- `terrain-illumination` (optional): Contains the coefficients used for terrain illumination correction for each pixel.\n\nThe data returned is CARD4L compliant with corresponding metadata.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, "links": [ @@ -94,4 +118,4 @@ "title": "CEOS CARD4L specification" } ] -} \ No newline at end of file +} diff --git a/proposals/atmospheric_correction.json b/proposals/atmospheric_correction.json index 9b537322..d366f1ed 100644 --- a/proposals/atmospheric_correction.json +++ b/proposals/atmospheric_correction.json @@ -12,8 +12,20 @@ "description": "Data cube containing multi-spectral optical top of atmosphere reflectances to be corrected.", "name": "data", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -63,8 +75,20 @@ "returns": { "description": "Data cube containing bottom of atmosphere reflectances.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, "exceptions": { @@ -79,4 +103,4 @@ "title": "Atmospheric correction explained by EO4GEO body of knowledge." } ] -} \ No newline at end of file +} diff --git a/proposals/cloud_detection.json b/proposals/cloud_detection.json index f9025c5b..d695720e 100644 --- a/proposals/cloud_detection.json +++ b/proposals/cloud_detection.json @@ -12,8 +12,20 @@ "description": "The source data cube containing multi-spectral optical top of the atmosphere (TOA) reflectances on which to perform cloud detection.", "name": "data", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -49,8 +61,20 @@ "returns": { "description": "A data cube with bands for the atmospheric disturbances. Each of the masks contains values between 0 and 1. The data cube has the same spatial and temporal dimensions as the source data cube and a dimension that contains a dimension label for each of the supported/considered atmospheric disturbances.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, "links": [ @@ -60,4 +84,4 @@ "title": "Cloud mask explained by EO4GEO body of knowledge." } ] -} \ No newline at end of file +} diff --git a/proposals/filter_labels.json b/proposals/filter_labels.json index 01d77035..4b26fb1d 100644 --- a/proposals/filter_labels.json +++ b/proposals/filter_labels.json @@ -13,7 +13,7 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -25,7 +25,7 @@ "parameters": [ { "name": "value", - "description": "A single dimension label to compare against. The data type of the parameter depends on the dimension labels set for the dimension.", + "description": "A single dimension label to compare against. The data type of the parameter depends on the dimension labels set for the dimension. 
Please note that for some dimension types a representation is used, e.g.\n\n* dates and/or times are usually strings compliant to [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601),\n* geometries can be a WKT string or an identifier.", "schema": [ { "type": "number" @@ -74,7 +74,7 @@ "description": "A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the given dimension has less (or the same) dimension labels.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -115,4 +115,4 @@ "title": "Filters explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json new file mode 100644 index 00000000..46279fa7 --- /dev/null +++ b/proposals/filter_vector.json @@ -0,0 +1,89 @@ +{ + "id": "filter_vector", + "summary": "Spatial vector filter using geometries", + "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. Alternatively, use ``filter_bbox()`` to filter by bounding box.", + "categories": [ + "cubes", + "filter", + "vector" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A vector data cube with the candidate geometries.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + }, + { + "name": "geometries", + "description": "One or more base geometries used for filtering, given as GeoJSON or vector data cube. If multiple base geometries are provided, the union of them is used.", + "schema": [ + { + "type": "object", + "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + ] + }, + { + "name": "relation", + "description": "The spatial filter predicate for comparing the geometries provided through (a) `geometries` (base geometries) and (b) `data` (candidate geometries).", + "schema": { + "type": "string", + "enum": [ + "intersects", + "disjoint", + "equals", + "touches", + "crosses", + "overlaps", + "contains", + "within" + ] + }, + "optional": true, + "default": "intersects" + } + ], + "returns": { + "description": "A vector data cube restricted to the specified geometries. The dimensions and dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that the geometries dimension has less (or the same) dimension labels.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + } + }, + "links": [ + { + "href": "https://openeo.org/documentation/1.0/datacubes.html#filter", + "rel": "about", + "title": "Filters explained in the openEO documentation" + }, + { + "href": "http://www.opengeospatial.org/standards/sfa", + "rel": "about", + "title": "Simple Features standard by the OGC" + } + ] +} diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json index a9a549d9..1b6f299f 100644 --- a/proposals/fit_class_random_forest.json +++ b/proposals/fit_class_random_forest.json @@ -10,17 +10,44 @@ { "name": "predictors", "description": "The predictors for the classification model as a vector data cube. 
Aggregated to the features (vectors) of the target input variable.", - "schema": { - "type": "object", - "subtype": "vector-cube" - } + "schema": [ + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + }, + { + "type": "bands" + } + ] + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + }, + { + "type": "other" + } + ] + } + ] }, { "name": "target", "description": "The training sites for the classification model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).", "schema": { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } }, { diff --git a/proposals/fit_curve.json b/proposals/fit_curve.json index 3b5df7e1..9d97dfda 100644 --- a/proposals/fit_curve.json +++ b/proposals/fit_curve.json @@ -13,7 +13,7 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -30,7 +30,7 @@ { "title": "Data Cube with optimal values from a previous result of this process.", "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } ] }, @@ -80,7 +80,7 @@ "description": "A data cube with the optimal values for the parameters.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -91,4 +91,4 @@ "message": "A dimension with the specified name does not exist." } } -} \ No newline at end of file +} diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json index e75028b2..121af96d 100644 --- a/proposals/fit_regr_random_forest.json +++ b/proposals/fit_regr_random_forest.json @@ -10,17 +10,44 @@ { "name": "predictors", "description": "The predictors for the regression model as a vector data cube. Aggregated to the features (vectors) of the target input variable.", - "schema": { - "type": "object", - "subtype": "vector-cube" - } + "schema": [ + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + }, + { + "type": "bands" + } + ] + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + }, + { + "type": "other" + } + ] + } + ] }, { "name": "target", "description": "The training sites for the regression model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to be associated with a value to predict (e.g. fractional forest canopy cover).", "schema": { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } }, { diff --git a/proposals/flatten_dimensions.json b/proposals/flatten_dimensions.json index 05e54212..da3647ab 100644 --- a/proposals/flatten_dimensions.json +++ b/proposals/flatten_dimensions.json @@ -12,7 +12,7 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -47,7 +47,7 @@ "description": "A data cube with the new shape. 
The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/proposals/inspect.json b/proposals/inspect.json index b0a0335d..9d7f2190 100644 --- a/proposals/inspect.json +++ b/proposals/inspect.json @@ -1,7 +1,7 @@ { "id": "inspect", "summary": "Add information to the logs", - "description": "This process can be used to add runtime information to the logs, e.g. for debugging purposes. This process should be used with caution and it is recommended to remove the process in production workflows. For example, logging each pixel or array individually in a process such as ``apply()`` or ``reduce_dimension()`` could lead to a (too) large number of log entries. Several data structures (e.g. data cubes) are too large to log and will only return summaries of their contents.\n\nThe data provided in the parameter `data` is returned without changes.", + "description": "This process can be used to add runtime information to the logs, e.g. for debugging purposes. This process should be used with caution and it is recommended to remove the process in production workflows. For example, logging each value or array individually in a process such as ``apply()`` or ``reduce_dimension()`` could lead to a (too) large number of log entries. Several data structures (e.g. data cubes) are too large to log and will only return summaries of their contents.\n\nThe data provided in the parameter `data` is returned without changes.", "categories": [ "development" ], diff --git a/proposals/load_result.json b/proposals/load_result.json index 9e7993c3..6d67f4d8 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -1,7 +1,7 @@ { "id": "load_result", "summary": "Load batch job results", - "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the pixel values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", + "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. 
The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" ], @@ -29,7 +29,7 @@ }, { "name": "spatial_extent", - "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\nThe process puts a pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry,\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries, or\n* a `GeometryCollection` containing `Polygon` or `MultiPolygon` geometries. To maximize interoperability, `GeometryCollection` should be avoided in favour of one of the alternatives above.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! 
It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", "schema": [ { "title": "Bounding Box", @@ -104,10 +104,21 @@ }, { "title": "GeoJSON", - "description": "Limits the data cube to the bounding box of the given geometry. All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported.", "type": "object", "subtype": "geojson" }, + { + "title": "Vector data cube", + "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] + }, { "title": "No filter", "description": "Don't filter spatially. All data is included in the data cube.", @@ -196,7 +207,7 @@ "description": "A data cube for further processing.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/proposals/load_uploaded_files.json b/proposals/load_uploaded_files.json index bf811b4e..039994ff 100644 --- a/proposals/load_uploaded_files.json +++ b/proposals/load_uploaded_files.json @@ -44,7 +44,7 @@ "description": "A data cube for further processing.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/proposals/predict_curve.json b/proposals/predict_curve.json index 52adcc5e..9fb5d341 100644 --- a/proposals/predict_curve.json +++ b/proposals/predict_curve.json @@ -13,15 +13,15 @@ "description": "A data cube to predict values for.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { "name": "parameters", - "description": "A data cube with optimal values from a result of e.g. ``fit_curve()``.", + "description": "A data cube with optimal values, e.g. computed by the process ``fit_curve()``.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -101,7 +101,7 @@ "description": "A data cube with the predicted values.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -109,4 +109,4 @@ "message": "A dimension with the specified name does not exist." } } -} \ No newline at end of file +} diff --git a/proposals/reduce_spatial.json b/proposals/reduce_spatial.json index d9a2fb56..d27bd9cf 100644 --- a/proposals/reduce_spatial.json +++ b/proposals/reduce_spatial.json @@ -11,10 +11,19 @@ "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -66,7 +75,7 @@ "description": "A data cube with the newly computed values. It is missing the horizontal spatial dimensions, the number of dimensions decreases by two. 
The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "links": [ @@ -76,4 +85,4 @@ "title": "Reducers explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/proposals/resample_cube_temporal.json b/proposals/resample_cube_temporal.json index 2bd38dde..9c6aac09 100644 --- a/proposals/resample_cube_temporal.json +++ b/proposals/resample_cube_temporal.json @@ -13,7 +13,12 @@ "description": "A data cube with one or more temporal dimensions.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -21,7 +26,12 @@ "description": "A data cube that describes the temporal target resolution.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, { @@ -50,10 +60,15 @@ } ], "returns": { - "description": "A raster data cube with the same dimensions and the same dimension properties (name, type, labels, reference system and resolution) for all non-temporal dimensions. For the temporal dimension, the name and type remain unchanged, but the dimension labels, resolution and reference system may change.", + "description": "A data cube with the same dimensions and the same dimension properties (name, type, labels, reference system and resolution) for all non-temporal dimensions. For the temporal dimension, the name and type remain unchanged, but the dimension labels, resolution and reference system may change.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "temporal" + } + ] } }, "exceptions": { @@ -71,4 +86,4 @@ "title": "Resampling explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/proposals/run_udf_externally.json b/proposals/run_udf_externally.json index 9672eb71..d4713933 100644 --- a/proposals/run_udf_externally.json +++ b/proposals/run_udf_externally.json @@ -1,7 +1,7 @@ { "id": "run_udf_externally", "summary": "Run an externally hosted UDF container", - "description": "Runs a compatible UDF container that is either externally hosted by a service provider or running on a local machine of the user. The UDF container must follow the [openEO UDF specification](https://openeo.org/documentation/1.0/udfs.html).\n\nThe referenced UDF service can be executed in several processes such as ``aggregate_spatial()``, ``apply()``, ``apply_dimension()`` and ``reduce_dimension()``. In this case, an array is passed instead of a raster data cube. The user must ensure that the data is given in a way that the UDF code can make sense of it.", + "description": "Runs a compatible UDF container that is either externally hosted by a service provider or running on a local machine of the user. The UDF container must follow the [openEO UDF specification](https://openeo.org/documentation/1.0/udfs.html).\n\nThe referenced UDF service can be executed in several processes such as ``aggregate_spatial()``, ``apply()``, ``apply_dimension()`` and ``reduce_dimension()``. In this case, an array is passed instead of a data cube. 
The user must ensure that the data is given in a way that the UDF code can make sense of it.", "categories": [ "cubes", "import", @@ -66,4 +66,4 @@ "title": "openEO UDF repository" } ] -} \ No newline at end of file +} diff --git a/proposals/sar_backscatter.json b/proposals/sar_backscatter.json index 77fdf73e..03d13d29 100644 --- a/proposals/sar_backscatter.json +++ b/proposals/sar_backscatter.json @@ -12,8 +12,20 @@ "name": "data", "description": "The source data cube containing SAR input.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, { @@ -112,8 +124,20 @@ "returns": { "description": "Backscatter values corresponding to the chosen parametrization. The values are given in linear scale.", "schema": { - "subtype": "raster-cube", - "type": "object" + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + }, + { + "type": "bands" + } + ] } }, "exceptions": { diff --git a/proposals/unflatten_dimension.json b/proposals/unflatten_dimension.json index 1cbf2d1d..990e7469 100644 --- a/proposals/unflatten_dimension.json +++ b/proposals/unflatten_dimension.json @@ -12,7 +12,7 @@ "description": "A data cube that is consistently structured so that the operation can execute flawlessly (e.g. the dimension labels need to contain the `label_separator` exactly 1 time for two target dimensions, 2 times for three target dimensions etc.).", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -48,7 +48,7 @@ "description": "A data cube with the new shape. The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json index 204a54b7..ad30030b 100644 --- a/proposals/vector_buffer.json +++ b/proposals/vector_buffer.json @@ -9,15 +9,21 @@ "parameters": [ { "name": "geometries", - "description": "Geometries to apply the buffer on. Vector properties are preserved for vector data cubes and all GeoJSON Features.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", + "description": "Geometries to apply the buffer on. Feature properties are preserved for vector data cubes and all GeoJSON Features.", "schema": [ { "type": "object", "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." 
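As a sketch of the GeoJSON restriction above, a valid `geometries` argument could be a single `Feature` with a `Polygon` geometry, while a `GeometryCollection` would be rejected (coordinates and properties are made-up example values):

```typescript
// Sketch only: an acceptable GeoJSON input for `geometries`. The "id"
// property and the coordinates are invented example values; feature
// properties like "id" are preserved by the process.
const geometries = {
  type: "Feature",
  properties: { id: "field-1" },
  geometry: {
    type: "Polygon",
    coordinates: [
      [[16.1, 48.1], [16.6, 48.1], [16.6, 48.6], [16.1, 48.6], [16.1, 48.1]]
    ]
  }
};
```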
}, { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } ] }, @@ -36,7 +42,12 @@ "description": "Returns a vector data cube with the computed new geometries.", "schema": { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } } } diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json index afe340ef..9a018849 100644 --- a/proposals/vector_to_random_points.json +++ b/proposals/vector_to_random_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_random_points", "summary": "Sample random points from geometries", - "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry. Vector properties are preserved.\n\nIf `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`, which is the default), one sample per geometry is used.", + "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry. Feature properties are preserved.\n\nIf `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`, which is the default), one sample per geometry is used.", "categories": [ "cubes", "vector" @@ -10,15 +10,21 @@ "parameters": [ { "name": "data", - "description": "Input geometries for sample extraction.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. `MultiPolygon`).", + "description": "Input geometries for sample extraction.", "schema": [ { "type": "object", - "subtype": "geojson" + "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } ] }, @@ -80,7 +86,16 @@ "description": "Returns a vector data cube with the sampled points.", "schema": { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries", + "geometry_type": [ + "Point", + "MultiPoint" + ] + } + ] } }, "exceptions": { diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index 3fd105f6..d49a333d 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_regular_points", "summary": "Sample regular points from geometries", - "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries. Vector properties are preserved.", + "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries. Feature properties are preserved.", "categories": [ "cubes", "vector" @@ -10,15 +10,21 @@ "parameters": [ { "name": "data", - "description": "Input geometries for sample extraction.\n\nTo maximize interoperability, a nested `GeometryCollection` should be avoided. Furthermore, a `GeometryCollection` composed of a single type of geometries should be avoided in favour of the corresponding multi-part type (e.g. 
`MultiPolygon`).", + "description": "Input geometries for sample extraction.", "schema": [ { "type": "object", - "subtype": "geojson" + "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries" + } + ] } ] }, @@ -44,7 +50,16 @@ "description": "Returns a vector data cube with the sampled points.", "schema": { "type": "object", - "subtype": "vector-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "geometries", + "geometry_type": [ + "Point", + "MultiPoint" + ] + } + ] } } } diff --git a/reduce_dimension.json b/reduce_dimension.json index 27ed34de..7fb77ba5 100644 --- a/reduce_dimension.json +++ b/reduce_dimension.json @@ -1,7 +1,7 @@ { "id": "reduce_dimension", "summary": "Reduce dimensions", - "description": "Applies a reducer to a data cube dimension by collapsing all the pixel values along the specified dimension into an output value computed by the reducer.\n\nThe dimension is dropped. To avoid this, use ``apply_dimension()`` instead.", + "description": "Applies a reducer to a data cube dimension by collapsing all the values along the specified dimension into an output value computed by the reducer.\n\nThe dimension is dropped. To avoid this, use ``apply_dimension()`` instead.", "categories": [ "cubes", "reducer" @@ -12,7 +12,7 @@ "description": "A data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -72,7 +72,7 @@ "description": "A data cube with the newly computed values. It is missing the given dimension, the number of dimensions decreases by one. The dimension properties (name, type, labels, reference system and resolution) for all other dimensions remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -87,4 +87,4 @@ "title": "Reducers explained in the openEO documentation" } ] -} \ No newline at end of file +} diff --git a/rename_dimension.json b/rename_dimension.json index 15c46410..ecfd1983 100644 --- a/rename_dimension.json +++ b/rename_dimension.json @@ -11,7 +11,7 @@ "description": "The data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -33,7 +33,7 @@ "description": "A data cube with the same dimensions, but the name of one of the dimensions changes. The old name can not be referred to any longer. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { @@ -44,4 +44,4 @@ "message": "A dimension with the specified name already exists." } } -} \ No newline at end of file +} diff --git a/rename_labels.json b/rename_labels.json index 41fe7d7d..2042737d 100644 --- a/rename_labels.json +++ b/rename_labels.json @@ -11,7 +11,7 @@ "description": "The data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, { @@ -54,7 +54,7 @@ "description": "The data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except that for the given dimension the labels change. The old labels can not be referred to any longer. 
The number of labels remains the same.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } }, "exceptions": { diff --git a/resample_cube_spatial.json b/resample_cube_spatial.json index 54a5f801..3cbdfa49 100644 --- a/resample_cube_spatial.json +++ b/resample_cube_spatial.json @@ -9,18 +9,36 @@ "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { "name": "target", - "description": "A data cube that describes the spatial target resolution.", + "description": "A raster data cube that describes the spatial target resolution.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -50,10 +68,19 @@ } ], "returns": { - "description": "A data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the spatial dimensions.", + "description": "A raster data cube with the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged, except for the resolution and dimension labels of the spatial dimensions.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "links": [ diff --git a/resample_spatial.json b/resample_spatial.json index 91d6bc5f..d97865f2 100644 --- a/resample_spatial.json +++ b/resample_spatial.json @@ -12,7 +12,16 @@ "description": "A raster data cube.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, { @@ -115,7 +124,16 @@ "description": "A raster data cube with values warped onto the new projection. It has the same dimensions and the same dimension properties (name, type, labels, reference system and resolution) for all non-spatial or vertical spatial dimensions. 
For the horizontal spatial dimensions the name and type remain unchanged, but reference system, labels and resolution may change depending on the given parameters.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] } }, "links": [ @@ -145,4 +163,4 @@ "title": "gdalwarp resampling methods" } ] -} \ No newline at end of file +} diff --git a/save_result.json b/save_result.json index 0ad0a582..8fa67ebb 100644 --- a/save_result.json +++ b/save_result.json @@ -10,16 +10,10 @@ { "name": "data", "description": "The data to deliver in the given file format.", - "schema": [ - { - "type": "object", - "subtype": "raster-cube" - }, - { - "type": "object", - "subtype": "vector-cube" - } - ] + "schema": { + "type": "object", + "subtype": "datacube" + } }, { "name": "format", diff --git a/tests/package.json b/tests/package.json index be51806f..861bfe5f 100644 --- a/tests/package.json +++ b/tests/package.json @@ -1,6 +1,6 @@ { "name": "@openeo/processes-validator", - "version": "0.2.0", + "version": "0.3.0", "author": "openEO Consortium", "contributors": [ { diff --git a/tests/testHelpers.js b/tests/testHelpers.js index 418fd830..6305049b 100644 --- a/tests/testHelpers.js +++ b/tests/testHelpers.js @@ -106,7 +106,73 @@ async function getAjv() { }, compile: function (subtype, schema) { if (schema.type != subtypes.definitions[subtype].type) { - throw "Subtype '"+subtype+"' not allowed for type '"+schema.type+"'." + throw "Subtype '"+subtype+"' not allowed for type '"+schema.type+"'."; + } + if (subtypes.definitions[subtype].deprecated) { + throw "Deprecated subtypes not allowed."; + } + return () => true; + }, + errors: false + }); + jsv.addKeyword("dimensions", { + dependencies: [ + "type", + "subtype" + ], + metaSchema: { + type: "array", + minItems: 1, + items: { + type: "object", + required: ["type"], + oneOf: [ + { + properties: { + type: { + type: "string", + const: "spatial" + }, + axis: { + type: "array", + minItems: 1, + items: { + type: "string", + enum: ["x", "y", "z"] + } + } + } + }, + { + properties: { + type: { + type: "string", + const: "geometries" + }, + geometry_type: { + type: "array", + minItems: 1, + items: { + type: "string", + enum: ["Point", "LineString", "Polygon", "MultiPoint", "MultiLineString", "MultiPolygon"] + } + } + } + }, + { + properties: { + type: { + type: "string", + enum: ["bands", "temporal", "other"] + } + } + } + ] + } + }, + compile: function (_, schema) { + if (schema.subtype != 'datacube') { + throw "Dimensions only allowed for subtype 'datacube'." } return () => true; }, @@ -169,7 +235,7 @@ function checkSpelling(text, p = null) { if (p && p.id) { pre += " in " + p.id; } - console.warn(pre + ": " + JSON.stringify(errors)); + throw (pre + ": " + JSON.stringify(errors)); } } diff --git a/trim_cube.json b/trim_cube.json index 4329024e..c3c7891e 100644 --- a/trim_cube.json +++ b/trim_cube.json @@ -8,18 +8,18 @@ "parameters": [ { "name": "data", - "description": "A raster data cube to trim.", + "description": "A data cube to trim.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } } ], "returns": { - "description": "A trimmed raster data cube with the same dimensions. The dimension properties name, type, reference system and resolution remain unchanged. The number of dimension labels may decrease.", + "description": "A trimmed data cube with the same dimensions. 
The dimension properties name, type, reference system and resolution remain unchanged. The number of dimension labels may decrease.", "schema": { "type": "object", - "subtype": "raster-cube" + "subtype": "datacube" } } -} \ No newline at end of file +} From b574aa5cb2978a2584cc149d92ad275783c7f819 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 31 Jan 2023 14:32:19 +0100 Subject: [PATCH 067/117] apply_neighborhood improvements (#390) Co-authored-by: Lukas Weidenholzer <17790923+LukeWeidenwalker@users.noreply.github.com> Co-authored-by: Stefaan Lippens --- CHANGELOG.md | 13 +++++--- apply_neighborhood.json | 71 +++++++++++++++++++++++++---------------- 2 files changed, 52 insertions(+), 32 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 6f394d38..62111934 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -29,7 +29,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_concat` - `array_modify` - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. -- `apply_neighborhood`: Allow `null` as default value for units. +- `apply_neighborhood`: + - Allow `null` as default value for units. + - Input and Output for the `process` can either be data cubes or arrays (if one-dimensional). [#387](https://github.com/Open-EO/openeo-processes/issues/387) - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376) - `load_collection` and `load_result`: - Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) @@ -51,10 +53,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed - `aggregate_spatial`: - - Clarified that feature properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) - - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists. + - Clarified that feature properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) + - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists. - `apply` and `array_apply`: Fixed broken references to the `absolute` process -- `apply_neighborhood`: Parameter `overlap` was optional but had no default value and no schena for the default value defined. +- `apply_neighborhood`: + - Parameter `overlap` was optional but had no default value and no schema for the default value defined. + - Clarified that the overlap must be included in the returned data cube but value changes are ignored. [#386](https://github.com/Open-EO/openeo-processes/issues/386) + - Removed a conflicting statement that dimension labels can be changed. [#385](https://github.com/Open-EO/openeo-processes/issues/385) - `array_contains` and `array_find`: Clarify that giving `null` as `value` always returns `false` or `null` respectively, also fixed the incorrect examples. [#348](https://github.com/Open-EO/openeo-processes/issues/348) - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333) - `is_nan`: Fixed a wrong description of the return value and simplified/clarified the process descriptions overall. 
[#360](https://github.com/Open-EO/openeo-processes/issues/360) diff --git a/apply_neighborhood.json b/apply_neighborhood.json index 3b89adf4..87f2b1dd 100644 --- a/apply_neighborhood.json +++ b/apply_neighborhood.json @@ -1,7 +1,7 @@ { "id": "apply_neighborhood", "summary": "Apply a process to pixels in a n-dimensional neighborhood", - "description": "Applies a focal process to a data cube.\n\nA focal process is a process that works on a 'neighborhood' of pixels. The neighborhood can extend into multiple dimensions, this extent is specified by the `size` argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of `size`.\n\nAn overlap can be specified so that neighborhoods can have overlapping boundaries. This allows for continuity of the output. The values included in the data cube as overlap can't be modified by the given `process`. The missing overlap at the borders of the original data cube are made available as no-data (`null`) in the sub data cubes.\n\nThe neighborhood size should be kept small enough, to avoid running beyond computational resources, but a too small size will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically are in the range of 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.\n\nThe process must not add new dimensions, or remove entire dimensions, but the result can have different dimension labels.\n\nFor the special case of 2D convolution, it is recommended to use ``apply_kernel()``.", + "description": "Applies a focal process to a data cube.\n\nA focal process is a process that works on a 'neighborhood' of pixels. The neighborhood can extend into multiple dimensions, this extent is specified by the `size` argument. It is not only (part of) the size of the input window, but also the size of the output for a given position of the sliding window. The sliding window moves with multiples of `size`.\n\nAn overlap can be specified so that neighborhoods can have overlapping boundaries. This allows for continuity of the output. The overlap region must be included in the data cube or array returned by `process`, but any changed values will be ignored. The missing overlap at the borders of the original data cube is made available as no-data (`null`) in the sub-data cubes.\n\nThe neighborhood size should be kept small enough, to avoid running beyond computational resources, but a too-small size will result in a larger number of process invocations, which may slow down processing. Window sizes for spatial dimensions typically range from 64 to 512 pixels, while overlaps of 8 to 32 pixels are common.\n\nFor the special case of 2D convolution, it is recommended to use ``apply_kernel()``.", "categories": [ "cubes" ], @@ -32,8 +32,44 @@ "parameters": [ { "name": "data", - "description": "A subset of the data cube as specified in `size` and `overlap`.", + "description": "The input data, which is a subset of the data cube as specified in `size` and `overlap`. 
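As an illustration of the `size`/`overlap` semantics clarified above (a non-normative sketch; the node ids and the child process graph are placeholders, not part of this patch), a typical invocation stays within the recommended window and overlap ranges:

```js
// Sliding 128x128 pixel window with a 16 pixel overlap on each spatial axis.
// The overlap region is passed to the child process, but any changes the
// child makes to it are ignored, as this patch clarifies.
const applyNeighborhoodNode = {
    process_id: "apply_neighborhood",
    arguments: {
        data: { from_node: "load" }, // placeholder for an upstream node
        process: {
            process_graph: {
                // placeholder: the child process graph, e.g. one wrapping a UDF
            }
        },
        size: [
            { dimension: "x", value: 128, unit: "px" },
            { dimension: "y", value: 128, unit: "px" }
        ],
        overlap: [
            { dimension: "x", value: 16, unit: "px" },
            { dimension: "y", value: 16, unit: "px" }
        ]
    }
};
```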
If the given size and overlap result in a one-dimensional data cube it is converted to a labeled array.", + "schema": [ + { + "title": "Multi-dimensional data", + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "spatial", + "axis": [ + "x", + "y" + ] + } + ] + }, + { + "title": "One-dimensional data", + "type": "array", + "subtype": "labeled-array" + } + ] + }, + { + "name": "context", + "description": "Additional data passed by the user.", "schema": { + "description": "Any data type." + }, + "optional": true, + "default": null + } + ], + "returns": { + "description": "An array or data cube with the newly computed values. The data type and dimensionality must correspond to the input data.\n\n* Data cubes must have the same dimensions and the dimension properties (name, type, labels, reference system and resolution) must remain unchanged. Otherwise, a `DataCubePropertiesImmutable` exception will be thrown.\n* Arrays can be returned with or without labels.", + "schema": [ + { + "title": "Multi-dimensional data", "type": "object", "subtype": "datacube", "dimensions": [ @@ -45,33 +81,12 @@ ] } ] - } - }, - { - "name": "context", - "description": "Additional data passed by the user.", - "schema": { - "description": "Any data type." }, - "optional": true, - "default": null - } - ], - "returns": { - "description": "The data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) must remain unchanged, otherwise a `DataCubePropertiesImmutable` exception will be thrown.", - "schema": { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "spatial", - "axis": [ - "x", - "y" - ] - } - ] - } + { + "title": "One-dimensional data", + "type": "array" + } + ] } } }, From 727a24ad43d0d41732e727d70bbf65d969cac112 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 31 Jan 2023 17:02:47 +0100 Subject: [PATCH 068/117] Update lint config --- tests/package.json | 2 +- tests/testConfig.json | 1 + 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/tests/package.json b/tests/package.json index 500d063b..b2ca85a1 100644 --- a/tests/package.json +++ b/tests/package.json @@ -18,7 +18,7 @@ "url": "git+https://github.com/Open-EO/openeo-processes.git" }, "devDependencies": { - "@openeo/processes-lint": "^0.1.3", + "@openeo/processes-lint": "^0.1.5", "concat-json-files": "^1.1.0", "http-server": "^14.1.1" }, diff --git a/tests/testConfig.json b/tests/testConfig.json index 60d8b893..9b5fbcb2 100644 --- a/tests/testConfig.json +++ b/tests/testConfig.json @@ -9,5 +9,6 @@ "subtypeSchemas": "../meta/subtype-schemas.json", "checkSubtypeSchemas": true, "forbidDeprecatedTypes": false, + "checkProcessLinks": true, "verbose": false } From 1ef862f286c448c4705ec22c43cdc97f5c548d04 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 6 Feb 2023 14:42:15 +0100 Subject: [PATCH 069/117] Deprecated PROJ definitions for the CRS are not supported any longer. 
(#406)

---
 CHANGELOG.md               |  1 +
 filter_bbox.json           |  8 +-------
 load_collection.json       |  8 +-------
 meta/subtype-schemas.json  | 13 ++-----------
 proposals/load_result.json |  8 +-------
 resample_spatial.json      |  8 +-------
 6 files changed, 7 insertions(+), 39 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 62111934..1cf1164f 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -49,6 +49,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 - The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository.
 - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389)
+- Deprecated PROJ definitions for the CRS are not supported any longer.
 
 ### Fixed
 
diff --git a/filter_bbox.json b/filter_bbox.json
index 818bcaaa..6351955c 100644
--- a/filter_bbox.json
+++ b/filter_bbox.json
@@ -83,7 +83,7 @@
         "default": null
       },
       "crs": {
-        "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) or [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+        "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
         "anyOf": [
           {
             "title": "EPSG Code",
@@ -98,12 +98,6 @@
             "title": "WKT2",
             "type": "string",
             "subtype": "wkt2-definition"
-          },
-          {
-            "title": "PROJ definition",
-            "type": "string",
-            "subtype": "proj-definition",
-            "deprecated": true
           }
         ],
         "default": 4326
diff --git a/load_collection.json b/load_collection.json
index 1a5296e2..050d5a81 100644
--- a/load_collection.json
+++ b/load_collection.json
@@ -64,7 +64,7 @@
         "default": null
       },
       "crs": {
-        "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) or [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+        "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
         "anyOf": [
           {
             "title": "EPSG Code",
@@ -79,12 +79,6 @@
             "title": "WKT2",
             "type": "string",
             "subtype": "wkt2-definition"
-          },
-          {
-            "title": "PROJ definition",
-            "type": "string",
-            "subtype": "proj-definition",
-            "deprecated": true
           }
         ],
         "default": 4326
diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json
index 17cd2b72..941e6a48 100644
--- a/meta/subtype-schemas.json
+++ b/meta/subtype-schemas.json
@@ -14,7 +14,7 @@
       "type": "object",
       "subtype": "bounding-box",
       "title": "Bounding Box",
-      "description": "A bounding box with the required fields `west`, `south`, `east`, `north` and optionally `base`, `height`, `crs`. 
The `crs` is a EPSG code, a WKT2:2018 string or a PROJ definition (deprecated).",
+      "description": "A bounding box with the required fields `west`, `south`, `east`, `north` and optionally `base`, `height`, `crs`. The `crs` is an EPSG code or a WKT2:2018 string.",
       "required": [
         "west",
         "south",
         "east",
         "north"
@@ -55,16 +55,13 @@
           "default": null
         },
         "crs": {
-          "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) or [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+          "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
           "anyOf": [
             {
               "$ref": "#/definitions/epsg-code"
             },
             {
               "$ref": "#/definitions/wkt2-definition"
-            },
-            {
-              "$ref": "#/definitions/proj-definition"
             }
           ],
           "default": 4326
@@ -286,12 +283,6 @@
       }
     }
   },
-  "proj-definition": {
-    "type": "string",
-    "subtype": "proj-definition",
-    "title": "PROJ definition",
-    "description": "**DEPRECATED.** Specifies details about cartographic projections as [PROJ](https://proj.org/usage/quickstart.html) definition."
-  },
   "raster-cube": {
     "type": "object",
     "subtype": "raster-cube",
diff --git a/proposals/load_result.json b/proposals/load_result.json
index 6d67f4d8..26f6039d 100644
--- a/proposals/load_result.json
+++ b/proposals/load_result.json
@@ -75,7 +75,7 @@
         "default": null
       },
       "crs": {
-        "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) or [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+        "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
         "anyOf": [
           {
             "title": "EPSG Code",
@@ -90,12 +90,6 @@
             "title": "WKT2",
             "type": "string",
             "subtype": "wkt2-definition"
-          },
-          {
-            "title": "PROJ definition",
-            "type": "string",
-            "subtype": "proj-definition",
-            "deprecated": true
           }
         ],
         "default": 4326
diff --git a/resample_spatial.json b/resample_spatial.json
index d97865f2..6e13d459 100644
--- a/resample_spatial.json
+++ b/resample_spatial.json
@@ -49,7 +49,7 @@
     },
     {
       "name": "projection",
-      "description": "Warps the data cube to the target projection, specified as as [EPSG code](http://www.epsg-registry.org/), [WKT2 (ISO 19162) string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html), [PROJ definition (deprecated)](https://proj.org/usage/quickstart.html). By default (`null`), the projection is not changed.",
+      "description": "Warps the data cube to the target projection, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). 
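With PROJ strings removed by this patch, `crs` values are either an EPSG code (an integer) or a WKT2 string. A small, non-normative sketch of the accepted shapes (the validation helper and the truncated WKT2 string are invented for illustration only):

```js
// EPSG code variant (integer).
const extentEpsg = {
    west: 10.0,
    south: 46.0,
    east: 11.0,
    north: 47.0,
    crs: 32632
};

// WKT2 string variant (deliberately truncated here with "...").
const extentWkt2 = {
    west: 10.0,
    south: 46.0,
    east: 11.0,
    north: 47.0,
    crs: 'GEOGCRS["WGS 84", ...]'
};

// Naive shape check: PROJ strings such as "+proj=utm +zone=32" are no
// longer accepted by the updated schemas.
function isAcceptedCrs(crs) {
    if (Number.isInteger(crs) && crs > 0) return true; // EPSG code
    return typeof crs === "string" && !crs.trim().startsWith("+");
}
```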
By default (`null`), the projection is not changed.", "schema": [ { "title": "EPSG Code", @@ -65,12 +65,6 @@ "type": "string", "subtype": "wkt2-definition" }, - { - "title": "PROJ definition", - "type": "string", - "subtype": "proj-definition", - "deprecated": true - }, { "title": "Don't change projection", "type": "null" From d9e80d4b3ef7cb6fae03c30cf1749dca7f1d4124 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 28 Feb 2023 14:45:15 +0100 Subject: [PATCH 070/117] apply_dimension: Clarification of behavior. #357 (#400) --- CHANGELOG.md | 1 + apply_dimension.json | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 1cf1164f..f6606496 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -57,6 +57,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Clarified that feature properties are preserved for vector data cubes and all GeoJSON Features. [#270](https://github.com/Open-EO/openeo-processes/issues/270) - Clarified that a `TargetDimensionExists` exception is thrown if the target dimension exists. - `apply` and `array_apply`: Fixed broken references to the `absolute` process +- `apply_dimension`: Clarify the behavior for when a dimension gets 'dropped'. [#357](https://github.com/Open-EO/openeo-processes/issues/357) - `apply_neighborhood`: - Parameter `overlap` was optional but had no default value and no schema for the default value defined. - Clarified that the overlap must be included in the returned data cube but value changes are ignored. [#386](https://github.com/Open-EO/openeo-processes/issues/386) diff --git a/apply_dimension.json b/apply_dimension.json index 7f8a5616..8a11f361 100644 --- a/apply_dimension.json +++ b/apply_dimension.json @@ -83,7 +83,7 @@ } ], "returns": { - "description": "A data cube with the newly computed values.\n\nAll dimensions stay the same, except for the dimensions specified in corresponding parameters. There are three cases how the dimensions can change:\n\n1. The source dimension is the target dimension:\n - The (number of) dimensions remain unchanged as the source dimension is the target dimension.\n - The source dimension properties name and type remain unchanged.\n - The dimension labels, the reference system and the resolution are preserved only if the number of values in the source dimension is equal to the number of values computed by the process. Otherwise, all other dimension properties change as defined in the list below.\n2. The source dimension is not the target dimension and the latter exists:\n - The number of dimensions decreases by one as the source dimension is dropped.\n - The target dimension properties name and type remain unchanged. All other dimension properties change as defined in the list below.\n3. The source dimension is not the target dimension and the latter does not exist:\n - The number of dimensions remain unchanged, but the source dimension is replaced with the target dimension.\n - The target dimension has the specified name and the type other. 
All other dimension properties are set as defined in the list below.\n\nUnless otherwise stated above, for the given (target) dimension the following applies:\n\n- the number of dimension labels is equal to the number of values computed by the process,\n- the dimension labels are incrementing integers starting from zero,\n- the resolution changes, and\n- the reference system is undefined.", + "description": "A data cube with the newly computed values.\n\nAll dimensions stay the same, except for the dimensions specified in corresponding parameters. There are three cases how the dimensions can change:\n\n1. The source dimension is the target dimension:\n - The (number of) dimensions remain unchanged as the source dimension is the target dimension.\n - The source dimension properties name and type remain unchanged.\n - The dimension labels, the reference system and the resolution are preserved only if the number of values in the source dimension is equal to the number of values computed by the process. Otherwise, all other dimension properties change as defined in the list below.\n2. The source dimension is not the target dimension. The target dimension exists with a single label only:\n - The number of dimensions decreases by one as the source dimension is 'dropped' and the target dimension is filled with the processed data that originates from the source dimension.\n - The target dimension properties name and type remain unchanged. All other dimension properties change as defined in the list below.\n3. The source dimension is not the target dimension and the latter does not exist:\n - The number of dimensions remain unchanged, but the source dimension is replaced with the target dimension.\n - The target dimension has the specified name and the type other. All other dimension properties are set as defined in the list below.\n\nUnless otherwise stated above, for the given (target) dimension the following applies:\n\n- the number of dimension labels is equal to the number of values computed by the process,\n- the dimension labels are incrementing integers starting from zero,\n- the resolution changes, and\n- the reference system is undefined.", "schema": { "type": "object", "subtype": "datacube" From 96e9bc03b6719a2e97d59aabd959466a966a37b4 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 28 Feb 2023 14:49:45 +0100 Subject: [PATCH 071/117] Simplify comparison processes (and add date_difference) (#399) --- CHANGELOG.md | 3 ++ between.json | 89 +++--------------------------- eq.json | 31 ++++------- gt.json | 27 ++++------ gte.json | 29 ++++------ lt.json | 29 ++++------ lte.json | 31 +++++------ neq.json | 31 ++++------- proposals/date_difference.json | 99 ++++++++++++++++++++++++++++++++++ proposals/date_shift.json | 2 +- tests/.words | 2 + 11 files changed, 178 insertions(+), 195 deletions(-) create mode 100644 proposals/date_difference.json diff --git a/CHANGELOG.md b/CHANGELOG.md index f6606496..e2165c06 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - New processes in proposal state: + - `date_difference` - `filter_vector` - `fit_class_random_forest` - `fit_regr_random_forest` @@ -38,6 +39,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Added a `NoDataAvailable` exception - `inspect`: The parameter `message` has been moved to be the second argument. 
[#369](https://github.com/Open-EO/openeo-processes/issues/369) - `save_result`: Added a more concrete `DataCubeEmpty` exception. +- The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` don't support temporal comparison any longer. Instead explicitly use `date_difference`. - New definition for `aggregate_spatial`: - Allows more than 3 input dimensions [#126](https://github.com/Open-EO/openeo-processes/issues/126) - Allow to not export statistics by changing the parameter `target_dimension` [#366](https://github.com/Open-EO/openeo-processes/issues/366) @@ -48,6 +50,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Removed - The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. +- `between`: Support for temporal comparison. - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) - Deprecated PROJ definitions for the CRS are not supported any longer. diff --git a/between.json b/between.json index 7d8f14df..b2e59b92 100644 --- a/between.json +++ b/between.json @@ -16,50 +16,16 @@ { "name": "min", "description": "Lower boundary (inclusive) to check against.", - "schema": [ - { - "type": "number" - }, - { - "type": "string", - "format": "date-time", - "subtype": "date-time" - }, - { - "type": "string", - "format": "date", - "subtype": "date" - }, - { - "type": "string", - "format": "time", - "subtype": "time" - } - ] + "schema": { + "type": "number" + } }, { "name": "max", "description": "Upper boundary (inclusive) to check against.", - "schema": [ - { - "type": "number" - }, - { - "type": "string", - "format": "date-time", - "subtype": "date-time" - }, - { - "type": "string", - "format": "date", - "subtype": "date" - }, - { - "type": "string", - "format": "time", - "subtype": "time" - } - ] + "schema": { + "type": "number" + } }, { "name": "exclude_max", @@ -122,47 +88,6 @@ "max": 0 }, "returns": true - }, - { - "arguments": { - "x": "00:59:59Z", - "min": "01:00:00+01:00", - "max": "01:00:00Z" - }, - "returns": true - }, - { - "arguments": { - "x": "2018-07-23T17:22:45Z", - "min": "2018-01-01T00:00:00Z", - "max": "2018-12-31T23:59:59Z" - }, - "returns": true - }, - { - "arguments": { - "x": "2000-01-01", - "min": "2018-01-01", - "max": "2020-01-01" - }, - "returns": false - }, - { - "arguments": { - "x": "2018-12-31T17:22:45Z", - "min": "2018-01-01", - "max": "2018-12-31" - }, - "returns": true - }, - { - "arguments": { - "x": "2018-12-31T17:22:45Z", - "min": "2018-01-01", - "max": "2018-12-31", - "exclude_max": true - }, - "returns": false } ], "process_graph": { @@ -226,4 +151,4 @@ "result": true } } -} \ No newline at end of file +} diff --git a/eq.json b/eq.json index 6550098e..ce07da96 100644 --- a/eq.json +++ b/eq.json @@ -1,7 +1,7 @@ { "id": "eq", "summary": "Equal to comparison", - "description": "Compares whether `x` is strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`. 
Therefore, `eq(null, null)` returns `null` instead of `true`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings MUST be compared differently than other strings and MUST NOT be compared based on their string representation due to different possible representations. For example, the time zone representation `Z` (for UTC) has the same meaning as `+00:00`.", + "description": "Compares whether `x` is strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "texts", "comparison" @@ -136,27 +136,11 @@ "returns": true }, { - "arguments": { - "x": "00:00:00+00:00", - "y": "00:00:00Z" - }, - "returns": true - }, - { - "description": "`y` is not a valid date-time representation and therefore will be treated as a string so that the provided values are not equal.", - "arguments": { - "x": "2018-01-01T12:00:00Z", - "y": "2018-01-01T12:00:00" - }, - "returns": false - }, - { - "description": "01:00 in the time zone +1 is equal to 00:00 in UTC.", "arguments": { "x": "2018-01-01T00:00:00Z", - "y": "2018-01-01T01:00:00+01:00" + "y": "2018-01-01T00:00:00+00:00" }, - "returns": true + "returns": false }, { "arguments": { @@ -172,6 +156,13 @@ ] }, "returns": false + }, + { + "arguments": { + "x": null, + "y": null + }, + "returns": null } ] -} \ No newline at end of file +} diff --git a/gt.json b/gt.json index 4ae5ecd0..e290b3c2 100644 --- a/gt.json +++ b/gt.json @@ -1,7 +1,7 @@ { "id": "gt", "summary": "Greater than comparison", - "description": "Compares whether `x` is strictly greater than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If any operand is not a `number` or temporal string (`date`, `time` or `date-time`), the process returns `false`.\n* Temporal strings can *not* be compared based on their string representation due to the time zone / time-offset representations.", + "description": "Compares whether `x` is strictly greater than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. 
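The simplified semantics read, informally: `null` propagates, non-numbers compare to `false`, and temporal strings are ordinary strings. A non-normative sketch of ``gt()`` under these rules, mirroring the examples in this patch:

```js
// Sketch only; the normative definition is the JSON specification above.
function gt(x, y) {
    if (x === null || y === null) return null;          // null propagates
    if (typeof x !== "number" || typeof y !== "number") return false;
    return x > y;
}

console.log(gt(5, 3));                                           // true
console.log(gt("2018-01-02T00:00:00Z", "2018-01-01T00:00:00Z")); // false (plain strings, not dates)
console.log(gt(true, 0));                                        // false (not a number)
console.log(gt(null, null));                                     // null
```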
To compare temporal strings use ``date_difference()``.", "categories": [ "comparison" ], @@ -61,38 +61,31 @@ }, { "arguments": { - "x": "00:00:00Z", - "y": "00:00:00+01:00" - }, - "returns": true - }, - { - "arguments": { - "x": "1950-01-01T00:00:00Z", - "y": "2018-01-01T12:00:00Z" + "x": "2018-01-02T00:00:00Z", + "y": "2018-01-01T00:00:00Z" }, "returns": false }, { "arguments": { - "x": "2018-01-01T12:00:00+00:00", - "y": "2018-01-01T12:00:00Z" + "x": true, + "y": 0 }, "returns": false }, { "arguments": { "x": true, - "y": 0 + "y": false }, "returns": false }, { "arguments": { - "x": true, - "y": false + "x": null, + "y": null }, - "returns": false + "returns": null } ] -} \ No newline at end of file +} diff --git a/gte.json b/gte.json index ea54a346..1216861b 100644 --- a/gte.json +++ b/gte.json @@ -1,7 +1,7 @@ { "id": "gte", "summary": "Greater than or equal to comparison", - "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`. Therefore, `gte(null, null)` returns `null` instead of `true`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number` or temporal string (`date`, `time` or `date-time`), the process returns `false`.\n* Temporal strings can *not* be compared based on their string representation due to the time zone / time-offset representations.", + "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", "categories": [ "comparison" ], @@ -61,25 +61,11 @@ }, { "arguments": { - "x": "00:00:00Z", - "y": "00:00:00+01:00" - }, - "returns": true - }, - { - "arguments": { - "x": "1950-01-01T00:00:00Z", - "y": "2018-01-01T12:00:00Z" + "x": "2018-01-01T00:00:00Z", + "y": "2018-01-01T00:00:00+00:00" }, "returns": false }, - { - "arguments": { - "x": "2018-01-01T12:00:00+00:00", - "y": "2018-01-01T12:00:00Z" - }, - "returns": true - }, { "arguments": { "x": true, @@ -101,6 +87,13 @@ ] }, "returns": false + }, + { + "arguments": { + "x": null, + "y": null + }, + "returns": null } ], "process_graph": { @@ -139,4 +132,4 @@ "result": true } } -} \ No newline at end of file +} diff --git a/lt.json b/lt.json index 54e7b749..10f4c99d 100644 --- a/lt.json +++ b/lt.json @@ -1,7 +1,7 @@ { "id": "lt", "summary": "Less than comparison", - "description": "Compares whether `x` is strictly less than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If any operand is not a `number` or temporal string (`date`, `time` or `date-time`), the process returns `false`.\n* Temporal strings can *not* be compared based on their string representation due to the time zone / time-offset representations.", + "description": "Compares whether `x` is strictly less than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. 
To compare temporal strings use ``date_difference()``.", "categories": [ "comparison" ], @@ -61,22 +61,8 @@ }, { "arguments": { - "x": "00:00:00+01:00", - "y": "00:00:00Z" - }, - "returns": true - }, - { - "arguments": { - "x": "1950-01-01T00:00:00Z", - "y": "2018-01-01T12:00:00Z" - }, - "returns": true - }, - { - "arguments": { - "x": "2018-01-01T12:00:00+00:00", - "y": "2018-01-01T12:00:00Z" + "x": "2018-01-01T00:00:00Z", + "y": "2018-01-02T00:00:00Z" }, "returns": false }, @@ -93,6 +79,13 @@ "y": true }, "returns": false + }, + { + "arguments": { + "x": null, + "y": null + }, + "returns": null } ] -} \ No newline at end of file +} diff --git a/lte.json b/lte.json index 4debaaad..fc4c3639 100644 --- a/lte.json +++ b/lte.json @@ -1,7 +1,7 @@ { "id": "lte", "summary": "Less than or equal to comparison", - "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`. Therefore, `lte(null, null)` returns `null` instead of `true`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number` or temporal string (`date`, `time` or `date-time`), the process returns `false`.\n* Temporal strings can *not* be compared based on their string representation due to the time zone / time-offset representations.", + "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", "categories": [ "comparison" ], @@ -61,24 +61,10 @@ }, { "arguments": { - "x": "00:00:00+01:00", - "y": "00:00:00Z" + "x": "2018-01-01T00:00:00Z", + "y": "2018-01-01T00:00:00+00:00" }, - "returns": true - }, - { - "arguments": { - "x": "1950-01-01T00:00:00Z", - "y": "2018-01-01T12:00:00Z" - }, - "returns": true - }, - { - "arguments": { - "x": "2018-01-01T12:00:00+00:00", - "y": "2018-01-01T12:00:00Z" - }, - "returns": true + "returns": false }, { "arguments": { @@ -101,6 +87,13 @@ ] }, "returns": false + }, + { + "arguments": { + "x": null, + "y": null + }, + "returns": null } ], "process_graph": { @@ -139,4 +132,4 @@ "result": true } } -} \ No newline at end of file +} diff --git a/neq.json b/neq.json index ec3908e4..49aedec9 100644 --- a/neq.json +++ b/neq.json @@ -1,7 +1,7 @@ { "id": "neq", "summary": "Not equal to comparison", - "description": "Compares whether `x` is *not* strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`. Therefore, `neq(null, null)` returns `null` instead of `false`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings MUST be compared differently than other strings and MUST NOT be compared based on their string representation due to different possible representations. 
For example, the time zone representation `Z` (for UTC) has the same meaning as `+00:00`.",
+  "description": "Compares whether `x` is **not** strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.",
   "categories": [
     "texts",
     "comparison"
@@ -129,27 +129,11 @@
     "returns": false
   },
   {
-      "arguments": {
-        "x": "00:00:00+00:00",
-        "y": "00:00:00Z"
-      },
-      "returns": false
-    },
-    {
-      "description": "`y` is not a valid date-time representation and therefore will be treated as a string so that the provided values are not equal.",
-      "arguments": {
-        "x": "2018-01-01T12:00:00Z",
-        "y": "2018-01-01T12:00:00"
-      },
-      "returns": true
-    },
-    {
-      "description": "01:00 in the time zone +1 is equal to 00:00 in UTC.",
       "arguments": {
         "x": "2018-01-01T00:00:00Z",
-        "y": "2018-01-01T01:00:00+01:00"
+        "y": "2018-01-01T00:00:00+00:00"
       },
-      "returns": false
+      "returns": true
     },
     {
       "arguments": {
@@ -165,6 +149,13 @@
         ]
       },
       "returns": false
+    },
+    {
+      "arguments": {
+        "x": null,
+        "y": null
+      },
+      "returns": null
     }
   ],
   "process_graph": {
@@ -195,4 +186,4 @@
     "result": true
   }
 }
-}
\ No newline at end of file
+}
diff --git a/proposals/date_difference.json b/proposals/date_difference.json
new file mode 100644
index 00000000..650030fd
--- /dev/null
+++ b/proposals/date_difference.json
@@ -0,0 +1,99 @@
+{
+  "id": "date_difference",
+  "summary": "Computes the difference between two time instants",
+  "description": "Computes the difference between two instants in time and returns the difference between them in the unit given.\n\nThe process converts the given dates into numerical timestamps and returns the result of subtracting the other date from the base date. If a given date doesn't include the time, the process assumes that the time component is `00:00:00Z` (i.e. midnight, in UTC). The millisecond part of the times is optional and defaults to `0` if not given. The process doesn't take daylight saving time (DST) into account as only dates and times in UTC (with potential numerical time zone modifier) are supported.",
+  "categories": [
+    "comparison",
+    "date & time"
+  ],
+  "experimental": true,
+  "parameters": [
+    {
+      "name": "date1",
+      "description": "The base date, optionally with a time component.",
+      "schema": [
+        {
+          "type": "string",
+          "format": "date-time",
+          "subtype": "date-time"
+        },
+        {
+          "type": "string",
+          "format": "date",
+          "subtype": "date"
+        }
+      ]
+    },
+    {
+      "name": "date2",
+      "description": "The other date, optionally with a time component.",
+      "schema": [
+        {
+          "type": "string",
+          "format": "date-time",
+          "subtype": "date-time"
+        },
+        {
+          "type": "string",
+          "format": "date",
+          "subtype": "date"
+        }
+      ]
+    },
+    {
+      "name": "unit",
+      "description": "The unit for the returned value. 
The following units are available:\n\n- millisecond\n- second - leap seconds are ignored in computations.\n- minute\n- hour\n- day\n- month\n- year", + "optional": true, + "default": "second", + "schema": { + "type": "string", + "enum": [ + "millisecond", + "second", + "minute", + "hour", + "day", + "month", + "year" + ] + } + } + ], + "returns": { + "description": "Returns the difference between date1 and date2 in the given unit (seconds by default), including a fractional part if required.\n\nFor comparison purposes this means:\n\n- If `date1` < `date2`, the returned value is positive.\n- If `date1` = `date2`, the returned value is 0.\n- If `date1` > `date2`, the returned value is negative.", + "schema": { + "type": "number" + } + }, + "examples": [ + { + "arguments": { + "date1": "2020-01-01T00:00:00.0Z", + "date2": "2020-01-01T00:00:15.5Z" + }, + "returns": 15.5 + }, + { + "arguments": { + "date1": "2020-01-01T00:00:00Z", + "date2": "2020-01-01T01:00:00+01:00" + }, + "returns": 0 + }, + { + "arguments": { + "date1": "2020-01-02", + "date2": "2020-01-01" + }, + "returns": -86400 + }, + { + "arguments": { + "date1": "2020-01-02", + "date2": "2020-01-01", + "unit": "day" + }, + "returns": -1 + } + ] +} diff --git a/proposals/date_shift.json b/proposals/date_shift.json index e9b6226b..d4726f6d 100644 --- a/proposals/date_shift.json +++ b/proposals/date_shift.json @@ -1,7 +1,7 @@ { "id": "date_shift", "summary": "Manipulates dates and times by addition or subtraction", - "description": "Based on a given date (and optionally time), calculates a new date (and time if given) by adding or subtracting a given temporal period.\n\nSome specifics about dates and times need to be taken into account:\n\n* This process doesn't have any effect on the time zone.\n* It doesn't take daylight saving time (DST) into account as only dates and time in UTC (with potential numerical time zone modifier) are supported.\n* Leap years are implemented in a way that computations handle them gracefully (see parameter `unit` for details).\n* Leap seconds are mostly ignored in manipulations as they don't follow a regular pattern. Leap seconds can be passed to the process, but will never be returned.", + "description": "Based on a given date (and optionally time), calculates a new date (and time if given) by adding or subtracting a given temporal period.\n\nSome specifics about dates and times need to be taken into account:\n\n* This process doesn't have any effect on the time zone.\n* It doesn't take daylight saving time (DST) into account as only dates and times in UTC (with potential numerical time zone modifier) are supported.\n* Leap years are implemented in a way that computations handle them gracefully (see parameter `unit` for details).\n* Leap seconds are mostly ignored in manipulations as they don't follow a regular pattern. 
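A rough sketch of the `date_difference` behaviour specified above (non-normative; month and year units need calendar arithmetic and are omitted here), reproducing the examples from the process definition:

```js
// Date-only inputs are treated as midnight UTC; the result is date2 minus
// date1 in the requested unit, fractional where needed.
function dateDifference(date1, date2, unit = "second") {
    const toMillis = (d) => Date.parse(d.includes("T") ? d : d + "T00:00:00Z");
    const diff = toMillis(date2) - toMillis(date1);
    const factors = {
        millisecond: 1,
        second: 1000,
        minute: 60 * 1000,
        hour: 60 * 60 * 1000,
        day: 24 * 60 * 60 * 1000
    };
    if (!(unit in factors)) {
        throw new Error("unit not supported in this sketch");
    }
    return diff / factors[unit];
}

console.log(dateDifference("2020-01-01T00:00:00.0Z", "2020-01-01T00:00:15.5Z")); // 15.5
console.log(dateDifference("2020-01-02", "2020-01-01"));                          // -86400
console.log(dateDifference("2020-01-02", "2020-01-01", "day"));                   // -1
```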
Leap seconds can be passed to the process, but will never be returned.", "categories": [ "date & time" ], diff --git a/tests/.words b/tests/.words index 66152744..112b2347 100644 --- a/tests/.words +++ b/tests/.words @@ -39,3 +39,5 @@ sinc interpolants Breiman Hyndman +date1 +date2 From dff04082a61ca4a22ff347aac358b31e37cbaff1 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 28 Feb 2023 15:06:08 +0100 Subject: [PATCH 072/117] Align descriptions in comparison processes #399 --- gt.json | 2 +- gte.json | 2 +- lt.json | 2 +- lte.json | 2 +- neq.json | 2 +- 5 files changed, 5 insertions(+), 5 deletions(-) diff --git a/gt.json b/gt.json index e290b3c2..f80b6e11 100644 --- a/gt.json +++ b/gt.json @@ -1,7 +1,7 @@ { "id": "gt", "summary": "Greater than comparison", - "description": "Compares whether `x` is strictly greater than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", + "description": "Compares whether `x` is strictly greater than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], diff --git a/gte.json b/gte.json index 1216861b..378d03e6 100644 --- a/gte.json +++ b/gte.json @@ -1,7 +1,7 @@ { "id": "gte", "summary": "Greater than or equal to comparison", - "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", + "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], diff --git a/lt.json b/lt.json index 10f4c99d..da6fe80d 100644 --- a/lt.json +++ b/lt.json @@ -1,7 +1,7 @@ { "id": "lt", "summary": "Less than comparison", - "description": "Compares whether `x` is strictly less than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", + "description": "Compares whether `x` is strictly less than `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. 
To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], diff --git a/lte.json b/lte.json index fc4c3639..bf007154 100644 --- a/lte.json +++ b/lte.json @@ -1,7 +1,7 @@ { "id": "lte", "summary": "Less than or equal to comparison", - "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", + "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], diff --git a/neq.json b/neq.json index 49aedec9..4446bb41 100644 --- a/neq.json +++ b/neq.json @@ -1,7 +1,7 @@ { "id": "neq", "summary": "Not equal to comparison", - "description": "Compares whether `x` is **not** strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings use ``date_difference()``.", + "description": "Compares whether `x` is **not** strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. 
To compare temporal strings as dates/times, use ``date_difference()``.",
   "categories": [
     "texts",
     "comparison"

From db242a887c313e7ba9b28440da98d390e6f986a9 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Tue, 28 Feb 2023 15:06:44 +0100
Subject: [PATCH 073/117] Improve array_concat examples

---
 proposals/array_concat.json | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/proposals/array_concat.json b/proposals/array_concat.json
index 2728ebfe..0f4dddac 100644
--- a/proposals/array_concat.json
+++ b/proposals/array_concat.json
@@ -44,7 +44,24 @@
   },
   "examples": [
     {
-      "description": "Concatenates two arrays containing different data type.",
+      "description": "Concatenates two numerical arrays.",
+      "arguments": {
+        "array1": [
+          1.5,
+          2.5
+        ],
+        "array2": [
+          5
+        ]
+      },
+      "returns": [
+        1.5,
+        2.5,
+        5
+      ]
+    },
+    {
+      "description": "Concatenates two arrays containing different data types, which may not always be supported.",
       "arguments": {
         "array1": [
           "a",

From 4c7ae851d5cad3b06e28c1e40f574fa1b19213bd Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Mon, 13 Mar 2023 17:25:40 +0100
Subject: [PATCH 074/117] geometries -> geometry

---
 add_dimension.json                      | 2 +-
 aggregate_spatial.json                  | 4 ++--
 filter_bbox.json                        | 4 ++--
 filter_spatial.json                     | 2 +-
 load_collection.json                    | 2 +-
 mask_polygon.json                       | 2 +-
 proposals/filter_vector.json            | 6 +++---
 proposals/fit_class_random_forest.json  | 6 +++---
 proposals/fit_regr_random_forest.json   | 6 +++---
 proposals/load_result.json              | 2 +-
 proposals/vector_buffer.json            | 4 ++--
 proposals/vector_to_random_points.json  | 4 ++--
 proposals/vector_to_regular_points.json | 4 ++--
 13 files changed, 24 insertions(+), 24 deletions(-)

diff --git a/add_dimension.json b/add_dimension.json
index b156846b..6166b2a4 100644
--- a/add_dimension.json
+++ b/add_dimension.json
@@ -40,7 +40,7 @@
         "type": "string",
         "enum": [
           "bands",
-          "geometries",
+          "geometry",
           "spatial",
           "temporal",
           "other"
diff --git a/aggregate_spatial.json b/aggregate_spatial.json
index 380e34c0..a51e0dbd 100644
--- a/aggregate_spatial.json
+++ b/aggregate_spatial.json
@@ -38,7 +38,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
           }
         ]
       }
@@ -108,7 +108,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
           }
         ]
       }
diff --git a/filter_bbox.json b/filter_bbox.json
index 6351955c..28797193 100644
--- a/filter_bbox.json
+++ b/filter_bbox.json
@@ -31,7 +31,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
           }
         ]
       }
@@ -129,7 +129,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
           }
         ]
       }
diff --git a/filter_spatial.json b/filter_spatial.json
index b6d7a7de..28dda1ad 100644
--- a/filter_spatial.json
+++ b/filter_spatial.json
@@ -37,7 +37,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
           }
         ]
       }
diff --git a/load_collection.json b/load_collection.json
index 050d5a81..27e538bf 100644
--- a/load_collection.json
+++ b/load_collection.json
@@ -98,7 +98,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries"
+            "type": "geometry"
          }
         ]
       },
diff --git a/mask_polygon.json b/mask_polygon.json
index f79db016..902d83fb 100644
--- a/mask_polygon.json
+++ b/mask_polygon.json
@@ -38,7 +38,7 @@
         "subtype": "datacube",
         "dimensions": [
           {
-            "type": "geometries",
+            "type": "geometry",
             "geometry_type": [
               "Polygon",
               "MultiPolygon"
diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json
index 46279fa7..fd18b07b 
100644 --- a/proposals/filter_vector.json +++ b/proposals/filter_vector.json @@ -17,7 +17,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } @@ -36,7 +36,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } @@ -69,7 +69,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json index 1b6f299f..6eb874bf 100644 --- a/proposals/fit_class_random_forest.json +++ b/proposals/fit_class_random_forest.json @@ -16,7 +16,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" }, { "type": "bands" @@ -28,7 +28,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" }, { "type": "other" @@ -45,7 +45,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json index 121af96d..51191fa5 100644 --- a/proposals/fit_regr_random_forest.json +++ b/proposals/fit_regr_random_forest.json @@ -16,7 +16,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" }, { "type": "bands" @@ -28,7 +28,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" }, { "type": "other" @@ -45,7 +45,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } diff --git a/proposals/load_result.json b/proposals/load_result.json index 26f6039d..2ac7de7e 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -109,7 +109,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] }, diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json index ad30030b..d78eef9c 100644 --- a/proposals/vector_buffer.json +++ b/proposals/vector_buffer.json @@ -21,7 +21,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } @@ -45,7 +45,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json index 9a018849..5a6ad9ac 100644 --- a/proposals/vector_to_random_points.json +++ b/proposals/vector_to_random_points.json @@ -22,7 +22,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } @@ -89,7 +89,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries", + "type": "geometry", "geometry_type": [ "Point", "MultiPoint" diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index d49a333d..efb03417 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -22,7 +22,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries" + "type": "geometry" } ] } @@ -53,7 +53,7 @@ "subtype": "datacube", "dimensions": [ { - "type": "geometries", + "type": "geometry", "geometry_type": [ "Point", "MultiPoint" From f86913846deb665ce892af6bff35fe54ae677f2a Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 13 Mar 2023 17:35:21 +0100 Subject: [PATCH 075/117] Update CI --- .github/workflows/docs.yml | 6 +++--- .github/workflows/tests.yml | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 4d5161d0..4e8f4f28 100644 
--- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -13,10 +13,10 @@ jobs: steps: - name: Inject env variables uses: rlespinasse/github-slug-action@v3.x - - uses: actions/setup-node@v1 + - uses: actions/setup-node@v3 with: - node-version: '16' - - uses: actions/checkout@v2 + node-version: 'lts/*' + - uses: actions/checkout@v3 - run: | npm install npm run generate diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index dcb1bcbc..b108eb18 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -4,10 +4,10 @@ jobs: deploy: runs-on: ubuntu-latest steps: - - uses: actions/setup-node@v1 + - uses: actions/setup-node@v3 with: - node-version: '16' - - uses: actions/checkout@v2 + node-version: 'lts/*' + - uses: actions/checkout@v3 - name: Run tests run: | npm install From 81a45940e975caa460188f77d9e3c2166829a2ad Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 13:20:15 +0100 Subject: [PATCH 076/117] Remove ML processes from 2.0.0 (#417) * Remove ML processes for 2.0.0 #416 --- CHANGELOG.md | 5 -- meta/subtype-schemas.json | 6 -- proposals/fit_class_random_forest.json | 110 ------------------------- proposals/fit_regr_random_forest.json | 110 ------------------------- proposals/load_ml_model.json | 53 ------------ proposals/predict_random_forest.json | 42 ---------- proposals/save_ml_model.json | 44 ---------- 7 files changed, 370 deletions(-) delete mode 100644 proposals/fit_class_random_forest.json delete mode 100644 proposals/fit_regr_random_forest.json delete mode 100644 proposals/load_ml_model.json delete mode 100644 proposals/predict_random_forest.json delete mode 100644 proposals/save_ml_model.json diff --git a/CHANGELOG.md b/CHANGELOG.md index e2165c06..fec780e3 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -11,12 +11,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - New processes in proposal state: - `date_difference` - `filter_vector` - - `fit_class_random_forest` - - `fit_regr_random_forest` - `flatten_dimensions` - - `load_ml_model` - - `predict_random_forest` - - `save_ml_model` - `unflatten_dimension` - `vector_buffer` - `vector_to_random_points` diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index 941e6a48..498adf60 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -238,12 +238,6 @@ } } }, - "ml-model": { - "type": "object", - "subtype": "ml-model", - "title": "Machine Learning Model", - "description": "A machine learning model, accompanied with STAC metadata that implements the the STAC ml-model extension." - }, "output-format": { "type": "string", "subtype": "output-format", diff --git a/proposals/fit_class_random_forest.json b/proposals/fit_class_random_forest.json deleted file mode 100644 index 6eb874bf..00000000 --- a/proposals/fit_class_random_forest.json +++ /dev/null @@ -1,110 +0,0 @@ -{ - "id": "fit_class_random_forest", - "summary": "Train a random forest classification model", - "description": "Executes the fit of a random forest classification based on training data. The process does not include a separate split of the data in test, validation and training data. The Random Forest classification model is based on the approach by Breiman (2001).", - "categories": [ - "machine learning" - ], - "experimental": true, - "parameters": [ - { - "name": "predictors", - "description": "The predictors for the classification model as a vector data cube. 
Aggregated to the features (vectors) of the target input variable.", - "schema": [ - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - }, - { - "type": "bands" - } - ] - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - }, - { - "type": "other" - } - ] - } - ] - }, - { - "name": "target", - "description": "The training sites for the classification model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to associated with a value to predict (e.g. fractional forest canopy cover).", - "schema": { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - } - ] - } - }, - { - "name": "max_variables", - "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split.\n- `all`: All variables are considered for each split.\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split.\n- `onethird`: A third of the number of variables are considered for each split.\n- `sqrt`: The square root of the number of variables are considered for each split. This is often the default for classification.", - "schema": [ - { - "type": "integer", - "minimum": 1 - }, - { - "type": "string", - "enum": [ - "all", - "log2", - "onethird", - "sqrt" - ] - } - ] - }, - { - "name": "num_trees", - "description": "The number of trees build within the Random Forest classification.", - "optional": true, - "default": 100, - "schema": { - "type": "integer", - "minimum": 1 - } - }, - { - "name": "seed", - "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.", - "optional": true, - "default": null, - "schema": { - "type": [ - "integer", - "null" - ] - } - } - ], - "returns": { - "description": "A model object that can be saved with ``save_ml_model()`` and restored with ``load_ml_model()``.", - "schema": { - "type": "object", - "subtype": "ml-model" - } - }, - "links": [ - { - "href": "https://doi.org/10.1023/A:1010933404324", - "title": "Breiman (2001): Random Forests", - "type": "text/html", - "rel": "about" - } - ] -} diff --git a/proposals/fit_regr_random_forest.json b/proposals/fit_regr_random_forest.json deleted file mode 100644 index 51191fa5..00000000 --- a/proposals/fit_regr_random_forest.json +++ /dev/null @@ -1,110 +0,0 @@ -{ - "id": "fit_regr_random_forest", - "summary": "Train a random forest regression model", - "description": "Executes the fit of a random forest regression based on training data. The process does not include a separate split of the data in test, validation and training data. The Random Forest regression model is based on the approach by Breiman (2001).", - "categories": [ - "machine learning" - ], - "experimental": true, - "parameters": [ - { - "name": "predictors", - "description": "The predictors for the regression model as a vector data cube. 
Aggregated to the features (vectors) of the target input variable.", - "schema": [ - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - }, - { - "type": "bands" - } - ] - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - }, - { - "type": "other" - } - ] - } - ] - }, - { - "name": "target", - "description": "The training sites for the regression model as a vector data cube. This is associated with the target variable for the Random Forest model. The geometry has to associated with a value to predict (e.g. fractional forest canopy cover).", - "schema": { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - } - ] - } - }, - { - "name": "max_variables", - "description": "Specifies how many split variables will be used at a node.\n\nThe following options are available:\n\n- *integer*: The given number of variables are considered for each split.\n- `all`: All variables are considered for each split.\n- `log2`: The logarithm with base 2 of the number of variables are considered for each split.\n- `onethird`: A third of the number of variables are considered for each split. This is often the default for regression.\n- `sqrt`: The square root of the number of variables are considered for each split.", - "schema": [ - { - "type": "integer", - "minimum": 1 - }, - { - "type": "string", - "enum": [ - "all", - "log2", - "onethird", - "sqrt" - ] - } - ] - }, - { - "name": "num_trees", - "description": "The number of trees build within the Random Forest regression.", - "optional": true, - "default": 100, - "schema": { - "type": "integer", - "minimum": 1 - } - }, - { - "name": "seed", - "description": "A randomization seed to use for the random sampling in training. If not given or `null`, no seed is used and results may differ on subsequent use.", - "optional": true, - "default": null, - "schema": { - "type": [ - "integer", - "null" - ] - } - } - ], - "returns": { - "description": "A model object that can be saved with ``save_ml_model()`` and restored with ``load_ml_model()``.", - "schema": { - "type": "object", - "subtype": "ml-model" - } - }, - "links": [ - { - "href": "https://doi.org/10.1023/A:1010933404324", - "title": "Breiman (2001): Random Forests", - "type": "text/html", - "rel": "about" - } - ] -} diff --git a/proposals/load_ml_model.json b/proposals/load_ml_model.json deleted file mode 100644 index 151513c8..00000000 --- a/proposals/load_ml_model.json +++ /dev/null @@ -1,53 +0,0 @@ -{ - "id": "load_ml_model", - "summary": "Load a ML model", - "description": "Loads a machine learning model from a STAC Item.\n\nSuch a model could be trained and saved as part of a previous batch job with processes such as ``fit_regr_random_forest()`` and ``save_ml_model()``.", - "categories": [ - "machine learning", - "import" - ], - "experimental": true, - "parameters": [ - { - "name": "id", - "description": "The STAC Item to load the machine learning model from. The STAC Item must implement the `ml-model` extension.", - "schema": [ - { - "title": "URL", - "type": "string", - "format": "uri", - "subtype": "uri", - "pattern": "^https?://" - }, - { - "title": "Batch Job ID", - "description": "Loading a model by batch job ID is possible only if a single model has been saved by the job. 
Otherwise, you have to load a specific model from a batch job by URL.", - "type": "string", - "subtype": "job-id", - "pattern": "^[\\w\\-\\.~]+$" - }, - { - "title": "User-uploaded File", - "type": "string", - "subtype": "file-path", - "pattern": "^[^\r\n\\:'\"]+$" - } - ] - } - ], - "returns": { - "description": "A machine learning model to be used with machine learning processes such as ``predict_random_forest()``.", - "schema": { - "type": "object", - "subtype": "ml-model" - } - }, - "links": [ - { - "href": "https://github.com/stac-extensions/ml-model", - "title": "STAC ml-model extension", - "type": "text/html", - "rel": "about" - } - ] -} diff --git a/proposals/predict_random_forest.json b/proposals/predict_random_forest.json deleted file mode 100644 index 62c54e9f..00000000 --- a/proposals/predict_random_forest.json +++ /dev/null @@ -1,42 +0,0 @@ -{ - "id": "predict_random_forest", - "summary": "Predict values based on a Random Forest model", - "description": "Applies a Random Forest machine learning model to an array and predict a value for it.", - "categories": [ - "machine learning", - "reducer" - ], - "experimental": true, - "parameters": [ - { - "name": "data", - "description": "An array of numbers.", - "schema": { - "type": "array", - "items": { - "type": [ - "number", - "null" - ] - } - } - }, - { - "name": "model", - "description": "A model object that can be trained with the processes ``fit_regr_random_forest()`` (regression) and ``fit_class_random_forest()`` (classification).", - "schema": { - "type": "object", - "subtype": "ml-model" - } - } - ], - "returns": { - "description": "The predicted value. Returns `null` if any of the given values in the array is a no-data value.", - "schema": { - "type": [ - "number", - "null" - ] - } - } -} diff --git a/proposals/save_ml_model.json b/proposals/save_ml_model.json deleted file mode 100644 index 5e9ea8b0..00000000 --- a/proposals/save_ml_model.json +++ /dev/null @@ -1,44 +0,0 @@ -{ - "id": "save_ml_model", - "summary": "Save a ML model", - "description": "Saves a machine learning model as part of a batch job.\n\nThe model will be accompanied by a separate STAC Item that implements the [ml-model extension](https://github.com/stac-extensions/ml-model).", - "categories": [ - "machine learning", - "import" - ], - "experimental": true, - "parameters": [ - { - "name": "data", - "description": "The data to store as a machine learning model.", - "schema": { - "type": "object", - "subtype": "ml-model" - } - }, - { - "name": "options", - "description": "Additional parameters to create the file(s).", - "schema": { - "type": "object", - "additionalParameters": false - }, - "default": {}, - "optional": true - } - ], - "returns": { - "description": "Returns `false` if the process failed to store the model, `true` otherwise.", - "schema": { - "type": "boolean" - } - }, - "links": [ - { - "href": "https://github.com/stac-extensions/ml-model", - "title": "STAC ml-model extension", - "type": "text/html", - "rel": "about" - } - ] -} \ No newline at end of file From 08cb18d9fcea4b2987632788901235cb5e506e42 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 13:51:11 +0100 Subject: [PATCH 077/117] Implicit resampling for spatial dimensions in mask and merge_cubes + clarifications (#405) * `mask` and `merge_cubes`: The spatial dimensions `x` and `y` can now be resampled implicitly instead of throwing an error. 
#402 * Clarify descriptions #379 * Improve wording as suggested by @soxofaan * Update merge_cubes.json * Slim down description * Default parameters of resample_cube_spatial apply --- CHANGELOG.md | 2 ++ mask.json | 2 +- merge_cubes.json | 10 +++++----- 3 files changed, 8 insertions(+), 6 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index fec780e3..3d74a5d4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -33,6 +33,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) - Added a `NoDataAvailable` exception - `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369) +- `mask` and `merge_cubes`: The spatial dimensions `x` and `y` can now be resampled implicitly instead of throwing an error. [#402](https://github.com/Open-EO/openeo-processes/issues/402) - `save_result`: Added a more concrete `DataCubeEmpty` exception. - The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` don't support temporal comparison any longer. Instead explicitly use `date_difference`. - New definition for `aggregate_spatial`: @@ -64,6 +65,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_interpolate_linear`: Return value was incorrectly specified as `number` or `null`. It must return an array instead. [#333](https://github.com/Open-EO/openeo-processes/issues/333) - `is_nan`: Fixed a wrong description of the return value and simplified/clarified the process descriptions overall. [#360](https://github.com/Open-EO/openeo-processes/issues/360) - `is_nodata`: Clarified that `NaN` can be considered as a no-data value only if it is explicitly specified as no-data value. [#361](https://github.com/Open-EO/openeo-processes/issues/361) +- `merge_cubes`: Clarified descriptions to better describe when a merge is possible. [#379](https://github.com/Open-EO/openeo-processes/issues/379) - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321) - `round`: Clarify that the rounding for ties applies not only for integers. [#326](https://github.com/Open-EO/openeo-processes/issues/326) diff --git a/mask.json b/mask.json index d5940b25..0381d220 100644 --- a/mask.json +++ b/mask.json @@ -1,7 +1,7 @@ { "id": "mask", "summary": "Apply a raster mask", - "description": "Applies a mask to a raster data cube. To apply a polygon as a mask, use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible so that each dimension in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied to each label of the dimension in `data` that is missing in the data cube of the mask. The process fails if there's an incompatibility found between the raster data cube and the mask.", + "description": "Applies a mask to a raster data cube. 
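A minimal sketch of the replacement rule described below, with NumPy arrays standing in for two aligned data cubes (an illustration, not the reference implementation):

```python
# Illustrative sketch only: replace values in `data` wherever `mask` is
# non-zero (numbers) or true (booleans); `replacement` defaults to no data.
import numpy as np

def apply_mask(data, mask, replacement=np.nan):
    return np.where(mask != 0, replacement, data)

apply_mask(np.array([1.0, 2.0, 3.0]), np.array([0, 1, 0]))
# -> array([ 1., nan,  3.])
```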
To apply a polygon as a mask, use ``mask_polygon()``.\n\nA mask is a raster data cube for which corresponding pixels among `data` and `mask` are compared and those pixels in `data` are replaced whose pixels in `mask` are non-zero (for numbers) or `true` (for boolean values). The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data).\n\nThe data cubes have to be compatible except that the horizontal spatial dimensions (axes `x` and `y`) will be aligned implicitly by ``resample_cube_spatial()``. `data` is the target data cube for resampling and the default parameters of ``resample_cube_spatial()`` apply. All other dimensions in the mask must also be available in the raster data cube with the same name, type, reference system, resolution and labels. Dimensions can be missing in the mask with the result that the mask is applied to each label of the dimension in `data` that is missing in the data cube of the mask. The process fails if there's an incompatibility found between the raster data cube and the mask.", "categories": [ "cubes", "masks" diff --git a/merge_cubes.json b/merge_cubes.json index e41d5f2e..c22421c2 100644 --- a/merge_cubes.json +++ b/merge_cubes.json @@ -1,14 +1,14 @@ { "id": "merge_cubes", "summary": "Merge two data cubes", - "description": "The process performs the join on overlapping dimensions. The data cubes have to be compatible. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes. As such it is not possible to merge a vector and a raster data cube. It is also not possible to merge vector data cubes that contain different base geometry types (points, lines/line strings, polygons). The base geometry types can be merged with their corresponding multi geometry types. In case of such a conflict, the `IncompatibleGeometryTypes` exception is thrown.\n\nOverlapping dimensions have the same name, type, reference system and resolution, but can have different labels. One of the dimensions can have different labels, for all other dimensions the labels must be equal. Equality for geometries follows the definition in the Simple Features standard by the OGC. If data overlaps, the parameter `overlap_resolver` must be specified to resolve the overlap.\n\n**Examples for merging two data cubes:**\n\n1. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first cube and `B3` and `B4`. An overlap resolver is *not needed*. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has four dimension labels: `B1`, `B2`, `B3`, `B4`.\n2. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the first data cube and `B2` and `B3` for the second. An overlap resolver is *required* to resolve overlap in band `B2`. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has three dimension labels: `B1`, `B2`, `B3`.\n3. Data cubes with the dimensions (`x`, `y`, `t`) have the same dimension labels in `x`, `y` and `t`. There are two options:\n 1. Keep the overlapping values separately in the merged data cube: An overlap resolver is *not needed*, but for each data cube you need to add a new dimension using ``add_dimension()``. 
The new dimensions must be equal, except that the labels for the new dimensions must differ by name. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with ``add_dimension()``, which has the two dimension labels after the merge.\n 2. Combine the overlapping values into a single value: An overlap resolver is *required* to resolve the overlap for all values. The merged data cube has the same dimensions and labels as the original data cubes, but all values have been processed by the overlap resolver.\n4. A data cube with dimensions (`x`, `y`, `t` / `bands`) or (`x`, `y`, `t`, `bands`) and another data cube with dimensions (`x`, `y`) have the same dimension labels in `x` and `y`. Merging them will join dimensions `x` and `y`, so the lower dimension cube is merged with each time step and band available in the higher dimensional cube. This can for instance be used to apply a digital elevation model to a spatio-temporal data cube. An overlap resolver is *required* to resolve the overlap for all pixels.\n\nAfter the merge, the dimensions with a natural/inherent label order (with a reference system this is each spatial and temporal dimensions) still have all dimension labels sorted. For other dimensions where there is no inherent order, including bands, the dimension labels keep the order in which they are present in the original data cubes and the dimension labels of `cube2` are appended to the dimension labels of `cube1`.", + "description": "The process merges two 'compatible' data cubes.\n\nThe data cubes have to be compatible, which means that they must share a common subset of equal dimensions. To conveniently get to such a subset of equal dimensions, the process tries to align the horizontal spatial dimensions (axes `x` and `y`) implicitly with ``resample_cube_spatial()`` if required. `cube1` is the target data cube for resampling and the default parameters of ``resample_cube_spatial()`` apply. The equality for geometries follows the definition in the Simple Features standard by the OGC.\n\nAll dimensions share the same properties, such as name, type, reference system, and resolution. Dimensions can have disjoint or overlapping labels. If there is any overlap between the dimension labels, the parameter `overlap_resolver` must be specified to combine the two values for these overlapping labels. A merge operation without overlap should be reversible with (a set of) filter operations for each of the two cubes, if no implicit resampling was applied.\n\nIt is not possible to merge a vector and a raster data cube. Merging vector data cubes with different base geometry types (points, lines/line strings, polygons) is not possible and throws the `IncompatibleGeometryTypes` exception. The base geometry types can be merged with their corresponding multi geometry types.\n\nAfter the merge, the dimensions with a natural/inherent label order (with a reference system this is each spatial and temporal dimensions) still have all dimension labels sorted. For other dimensions without inherent order, including bands, the dimension labels keep the order in which they are present in the original data cubes, and the dimension labels of `cube2` get appended to the dimension labels of `cube1`.\n\n**Examples for merging two data cubes:**\n\n1. 
Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the base data cube and `B3` and `B4` for the other. An overlap resolver is *not needed*. The merged data cube has the dimensions `x`, `y`, `t`, `bands`, and the dimension `bands` has four dimension labels: `B1`, `B2`, `B3`, `B4`.\n2. Data cubes with the dimensions (`x`, `y`, `t`, `bands`) have the same dimension labels in `x`, `y` and `t`, but the labels for the dimension `bands` are `B1` and `B2` for the base data cube and `B2` and `B3` for the other. An overlap resolver is *required* to resolve overlap in band `B2`. The merged data cube has the dimensions `x`, `y`, `t` and `bands` and the dimension `bands` has three dimension labels: `B1`, `B2`, `B3`.\n3. Data cubes with the dimensions (`x`, `y`, `t`) have the same dimension labels in `x`, `y` and `t`. There are two options:\n 1. Keep the overlapping values separately in the merged data cube: An overlap resolver is *not needed*, but for each data cube you need to add a new dimension using ``add_dimension()``. The new dimensions must be equal, except that the labels for the new dimensions must differ. The merged data cube has the same dimensions and labels as the original data cubes, plus the dimension added with ``add_dimension()``, which has the two dimension labels after the merge.\n 2. Combine the overlapping values into a single value: An overlap resolver is *required* to resolve the overlap for all values. The merged data cube has the same dimensions and labels as the original data cubes, but all values have been processed by the overlap resolver.\n4. A data cube with dimensions (`x`, `y`, `t` / `bands`) or (`x`, `y`, `t`, `bands`) and another data cube with dimensions (`x`, `y`) have the same dimension labels in `x` and `y`. Merging them will join dimensions `x` and `y`, so the lower dimension cube is merged with each time step and band available in the higher dimensional cube. A use case for this is applying a digital elevation model to a spatio-temporal data cube. An overlap resolver is *required* to resolve the overlap for all pixels.", "categories": [ "cubes" ], "parameters": [ { "name": "cube1", - "description": "The first data cube.", + "description": "The base data cube.", "schema": { "type": "object", "subtype": "datacube" @@ -16,7 +16,7 @@ }, { "name": "cube2", - "description": "The second data cube.", + "description": "The other data cube to be merged with the base data cube.", "schema": { "type": "object", "subtype": "datacube" @@ -31,14 +31,14 @@ "parameters": [ { "name": "x", - "description": "The overlapping value from the first data cube `cube1`.", + "description": "The overlapping value from the base data cube `cube1`.", "schema": { "description": "Any data type." } }, { "name": "y", - "description": "The overlapping value from the second data cube `cube2`.", + "description": "The overlapping value from the other data cube `cube2`.", "schema": { "description": "Any data type." 
} From dcd3b8a34f68a9539600dc533158cef0c86f4452 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 15:54:55 +0100 Subject: [PATCH 078/117] Change ordering of ties #409 --- CHANGELOG.md | 1 + order.json | 4 ++-- sort.json | 4 ++-- 3 files changed, 5 insertions(+), 4 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index e2165c06..917c192b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -46,6 +46,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Clarify how the resulting vector data cube looks like [#356](https://github.com/Open-EO/openeo-processes/issues/356) - Renamed `create_raster_cube` to `create_data_cube`. [#68](https://github.com/Open-EO/openeo-processes/issues/68) - Updated the processes based on the subtypes `raster-cube` or `vector-cube` to work with the subtype `datacube` instead. [#68](https://github.com/Open-EO/openeo-processes/issues/68) +- `sort` and `order`: The ordering of ties is not defined anymore. [#409](https://github.com/Open-EO/openeo-processes/issues/409) ### Removed diff --git a/order.json b/order.json index 0002d467..871c4fea 100644 --- a/order.json +++ b/order.json @@ -1,7 +1,7 @@ { "id": "order", "summary": "Create a permutation", - "description": "Computes a permutation which allows rearranging the data into ascending or descending order. In other words, this process computes the ranked (sorted) element positions in the original list.\n\n**Remarks:**\n\n* The positions in the result are zero-based.\n* Ties will be left in their original ordering.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", + "description": "Computes a permutation which allows rearranging the data into ascending or descending order. 
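A minimal Python sketch of the permutation this computes, assuming plain lists (the flag name is illustrative, not the process signature):

```python
# Illustrative sketch: zero-based positions that would sort `data`.
def order(data, ascending=True):
    return sorted(range(len(data)), key=lambda i: data[i], reverse=not ascending)

order([6, -1, 2, 2])  # -> [1, 2, 3, 0]; how the tie (2, 2) is broken may vary
```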
In other words, this process computes the ranked (sorted) element positions in the original list.\n\n**Remarks:**\n\n* The positions in the result are zero-based.\n* The ordering of ties is implementation-dependent.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", "categories": [ "arrays", "sorting" @@ -203,4 +203,4 @@ "title": "Permutation explained by Wolfram MathWorld" } ] -} \ No newline at end of file +} diff --git a/sort.json b/sort.json index 14e56e9d..27586250 100644 --- a/sort.json +++ b/sort.json @@ -1,7 +1,7 @@ { "id": "sort", "summary": "Sort data", - "description": "Sorts an array into ascending (default) or descending order.\n\n**Remarks:**\n\n* Ties will be left in their original ordering.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", + "description": "Sorts an array into ascending (default) or descending order.\n\n**Remarks:**\n\n* The ordering of ties is implementation-dependent.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", "categories": [ "arrays", "sorting" @@ -182,4 +182,4 @@ "result": true } } -} \ No newline at end of file +} From c0fb6a5d5e6779da49c77ac4220156304d5e87e3 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 16:08:33 +0100 Subject: [PATCH 079/117] Add vector_reproject (#403) * Add resmple_vector * Renamed * Add dimension parameter * Add dimension parameter * geometries -> geometry * Update proposals/vector_reproject.json * Apply suggestions from code review * Update proposals/vector_reproject.json typo fix * Remove PROJ --------- Co-authored-by: Miha Kadunc --- CHANGELOG.md | 1 + aggregate_spatial.json | 2 +- aggregate_temporal.json | 2 +- aggregate_temporal_period.json | 2 +- filter_temporal.json | 2 +- proposals/aggregate_spatial_window.json | 2 +- proposals/reduce_spatial.json | 2 +- proposals/resample_cube_temporal.json | 2 +- proposals/vector_reproject.json | 97 +++++++++++++++++++++++++ resample_cube_spatial.json | 2 +- resample_spatial.json | 2 +- tests/.words | 2 + 12 files changed, 109 insertions(+), 9 deletions(-) create mode 100644 proposals/vector_reproject.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 73c67f94..04867829 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -14,6 +14,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `flatten_dimensions` - `unflatten_dimension` - `vector_buffer` + - `vector_reproject` - `vector_to_random_points` - `vector_to_regular_points` - `add_dimension`: Added new dimension type `geometries`. [#68](https://github.com/Open-EO/openeo-processes/issues/68) diff --git a/aggregate_spatial.json b/aggregate_spatial.json index a51e0dbd..503a0eaf 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -4,7 +4,7 @@ "description": "Aggregates statistics for one or more geometries (e.g. zonal statistics for polygons) over the spatial dimensions. The given data cube can have multiple additional dimensions and for all these dimensions results will be computed individually.\n\nAn 'unbounded' aggregation over the full extent of the horizontal spatial dimensions can be computed with the process ``reduce_spatial()``.\n\nThis process passes a list of values to the reducer. 
The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.", "categories": [ "cubes", - "aggregate & resample" + "aggregate" ], "parameters": [ { diff --git a/aggregate_temporal.json b/aggregate_temporal.json index d63099b7..1a9e4b09 100644 --- a/aggregate_temporal.json +++ b/aggregate_temporal.json @@ -4,7 +4,7 @@ "description": "Computes a temporal aggregation based on an array of temporal intervals.\n\nFor common regular calendar hierarchies such as year, month, week or seasons ``aggregate_temporal_period()`` can be used. Other calendar hierarchies must be transformed into specific intervals by the clients.\n\nFor each interval, all data along the dimension will be passed through the reducer.\n\nThe computed values will be projected to the labels. If no labels are specified, the start of the temporal interval will be used as label for the corresponding values. In case of a conflict (i.e. the user-specified values for the start times of the temporal intervals are not distinct), the user-defined labels must be specified in the parameter `labels` as otherwise a `DistinctDimensionLabelsRequired` exception would be thrown. The number of user-defined labels and the number of intervals need to be equal.\n\nIf the dimension is not set or is set to `null`, the data cube is expected to only have one temporal dimension.", "categories": [ "cubes", - "aggregate & resample" + "aggregate" ], "parameters": [ { diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json index ce6ec410..0fc94f11 100644 --- a/aggregate_temporal_period.json +++ b/aggregate_temporal_period.json @@ -3,7 +3,7 @@ "summary": "Temporal aggregations based on calendar hierarchies", "description": "Computes a temporal aggregation based on calendar hierarchies such as years, months or seasons. For other calendar hierarchies ``aggregate_temporal()`` can be used.\n\nFor each interval, all data along the dimension will be passed through the reducer.\n\nIf the dimension is not set or is set to `null`, the data cube is expected to only have one temporal dimension.", "categories": [ - "aggregate & resample", + "aggregate", "climatology", "cubes" ], diff --git a/filter_temporal.json b/filter_temporal.json index 0ba2274e..c873645b 100644 --- a/filter_temporal.json +++ b/filter_temporal.json @@ -66,7 +66,7 @@ }, { "name": "dimension", - "description": "The name of the temporal dimension to filter on. If no specific dimension is specified or it is set to `null`, the filter applies to all temporal dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", + "description": "The name of the temporal dimension to filter on. If no specific dimension is specified, the filter applies to all temporal dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", "schema": { "type": [ "string", diff --git a/proposals/aggregate_spatial_window.json b/proposals/aggregate_spatial_window.json index 5bc3e03c..747db151 100644 --- a/proposals/aggregate_spatial_window.json +++ b/proposals/aggregate_spatial_window.json @@ -4,7 +4,7 @@ "description": "Aggregates statistics over the horizontal spatial dimensions (axes `x` and `y`) of the data cube.\n\nThe pixel grid for the axes `x` and `y` is divided into non-overlapping windows with the size specified in the parameter `size`. 
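For a single axis, the non-overlapping windowing can be sketched in plain Python (the incomplete remainder window is kept here purely for illustration):

```python
# Illustrative sketch: start/end index pairs of non-overlapping windows.
def windows(n_values, size):
    return [(i, min(i + size, n_values)) for i in range(0, n_values, size)]

windows(10, 4)  # -> [(0, 4), (4, 8), (8, 10)]; the last window is incomplete
```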
If the number of values for the axes `x` and `y` is not a multiple of the corresponding window size, the behavior specified in the parameters `boundary` and `align` is applied.\nFor each of these windows, the reducer process computes the result.", "categories": [ "cubes", - "aggregate & resample" + "aggregate" ], "experimental": true, "parameters": [ diff --git a/proposals/reduce_spatial.json b/proposals/reduce_spatial.json index d27bd9cf..d9233cf7 100644 --- a/proposals/reduce_spatial.json +++ b/proposals/reduce_spatial.json @@ -3,7 +3,7 @@ "summary": "Reduce spatial dimensions 'x' and 'y'", "description": "Applies a reducer to a data cube by collapsing all the pixel values along the horizontal spatial dimensions (i.e. axes `x` and `y`) into an output value computed by the reducer. The horizontal spatial dimensions are dropped.\n\nAn aggregation over certain spatial areas can be computed with the process ``aggregate_spatial()``.\n\nThis process passes a list of values to the reducer. The list of values has an undefined order, therefore processes such as ``last()`` and ``first()`` that depend on the order of the values will lead to unpredictable results.", "categories": [ - "aggregate & resample", + "aggregate", "cubes", "reducer" ], diff --git a/proposals/resample_cube_temporal.json b/proposals/resample_cube_temporal.json index 9c6aac09..340381a0 100644 --- a/proposals/resample_cube_temporal.json +++ b/proposals/resample_cube_temporal.json @@ -4,7 +4,7 @@ "description": "Resamples one or more given temporal dimensions from a source data cube to align with the corresponding dimensions of the given target data cube using the nearest neighbor method. Returns a new data cube with the resampled dimensions.\n\nBy default, this process simply takes the nearest neighbor independent of the value (including values such as no-data / `null`). Depending on the data cubes this may lead to values being assigned to two target timestamps. To only consider valid values in a specific range around the target timestamps, use the parameter `valid_within`.\n\nThe rare case of ties is resolved by choosing the earlier timestamps.", "categories": [ "cubes", - "aggregate & resample" + "reproject" ], "experimental": true, "parameters": [ diff --git a/proposals/vector_reproject.json b/proposals/vector_reproject.json new file mode 100644 index 00000000..d5a099c9 --- /dev/null +++ b/proposals/vector_reproject.json @@ -0,0 +1,97 @@ +{ + "id": "vector_reproject", + "summary": "Reprojects the geometry dimension", + "description": "Converts the geometries stored in a geometry dimension to a different coordinate reference system.", + "categories": [ + "cubes", + "reproject", + "vector" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A vector data cube.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } + }, + { + "name": "projection", + "description": "Coordinate reference system to reproject to. Specified as an [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html).", + "schema": [ + { + "title": "EPSG Code", + "type": "integer", + "subtype": "epsg-code", + "minimum": 1000, + "examples": [ + 3857 + ] + }, + { + "title": "WKT2", + "type": "string", + "subtype": "wkt2-definition" + } + ] + }, + { + "name": "dimension", + "description": "The name of the geometry dimension to reproject. 
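For a single geometry, the conversion between reference systems can be sketched with pyproj and shapely (both assumed available; the process applies this to every geometry along the dimension):

```python
# Illustrative sketch: reproject one point from EPSG:4326 to EPSG:3857.
from pyproj import Transformer
from shapely.geometry import Point
from shapely.ops import transform

to_target = Transformer.from_crs(4326, 3857, always_xy=True).transform
transform(to_target, Point(7.6, 51.96))  # WGS84 lon/lat -> Web Mercator metres
```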
If no specific dimension is specified, the filter applies to all geometry dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", + "schema": { + "type": [ + "string", + "null" + ] + }, + "default": null, + "optional": true + } + ], + "returns": { + "description": "A vector data cube with geometries projected to the new coordinate reference system. The reference system of the geometry dimension changes, all other dimensions and properties remain unchanged.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } + }, + "exceptions": { + "DimensionNotAvailable": { + "message": "A dimension with the specified name does not exist." + } + }, + "links": [ + { + "href": "https://openeo.org/documentation/1.0/datacubes.html#resample", + "rel": "about", + "title": "Resampling explained in the openEO documentation" + }, + { + "rel": "about", + "href": "https://proj.org/usage/projections.html", + "title": "PROJ parameters for cartographic projections" + }, + { + "rel": "about", + "href": "http://www.epsg-registry.org", + "title": "Official EPSG code registry" + }, + { + "rel": "about", + "href": "http://www.epsg.io", + "title": "Unofficial EPSG code database" + } + ] +} diff --git a/resample_cube_spatial.json b/resample_cube_spatial.json index 3cbdfa49..4da4e595 100644 --- a/resample_cube_spatial.json +++ b/resample_cube_spatial.json @@ -4,7 +4,7 @@ "description": "Resamples the spatial dimensions (x,y) from a source data cube to align with the corresponding dimensions of the given target data cube. Returns a new data cube with the resampled dimensions.\n\nTo resample a data cube to a specific resolution or projection regardless of an existing target data cube, refer to ``resample_spatial()``.", "categories": [ "cubes", - "aggregate & resample" + "reproject" ], "parameters": [ { diff --git a/resample_spatial.json b/resample_spatial.json index 6e13d459..fb34f194 100644 --- a/resample_spatial.json +++ b/resample_spatial.json @@ -4,7 +4,7 @@ "description": "Resamples the spatial dimensions (x,y) of the data cube to a specified resolution and/or warps the data cube to the target projection. At least `resolution` or `projection` must be specified.\n\nRelated processes:\n\n* Use ``filter_bbox()`` to set the target spatial extent.\n* To spatially align two data cubes with each other (e.g. for merging), better use the process ``resample_cube_spatial()``.", "categories": [ "cubes", - "aggregate & resample" + "reproject" ], "parameters": [ { diff --git a/tests/.words b/tests/.words index 112b2347..b4cd56fe 100644 --- a/tests/.words +++ b/tests/.words @@ -21,6 +21,8 @@ orthorectification orthorectified radiometrically reflectances +reproject +Reprojects resample resampled resamples From 6fadb688432ba6e27264b872235242c722e8dc54 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 18:15:15 +0100 Subject: [PATCH 080/117] Clarify save_result return value (#401) --- CHANGELOG.md | 1 + save_result.json | 5 +++-- 2 files changed, 4 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 04867829..5decdd66 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -70,6 +70,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `merge_cubes`: Clarified descriptions to better describe when a merge is possible. 
[#379](https://github.com/Open-EO/openeo-processes/issues/379) - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321) - `round`: Clarify that the rounding for ties applies not only for integers. [#326](https://github.com/Open-EO/openeo-processes/issues/326) +- `save_result`: Clarified that the process always returns `true` (and otherwise throws). [#334](https://github.com/Open-EO/openeo-processes/issues/334) ## [1.2.0] - 2021-12-13 diff --git a/save_result.json b/save_result.json index 8fa67ebb..7b952ead 100644 --- a/save_result.json +++ b/save_result.json @@ -35,9 +35,10 @@ } ], "returns": { - "description": "Returns `false` if the process failed to make the data available, `true` otherwise.", + "description": "Always returns `true` as in case of an error an exception is thrown which aborts the execution of the process.", "schema": { - "type": "boolean" + "type": "boolean", + "const": true } }, "exceptions": { From 13bee5fa613357a2c8f62112cdd9676a963941cb Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 15 Mar 2023 10:29:08 +0100 Subject: [PATCH 081/117] `array_modify`: Change the default value for `length` from `1` to `0`. #312 (#421) --- CHANGELOG.md | 1 + proposals/array_modify.json | 8 ++++---- 2 files changed, 5 insertions(+), 4 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 5decdd66..7bdba935 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -25,6 +25,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `array_append` - `array_concat` - `array_modify` +- `array_modify`: Change the default value for `length` from `1` to `0`. [#312](https://github.com/Open-EO/openeo-processes/issues/312) - Renamed `text_merge` to `text_concat` for better alignment with `array_concat` and existing implementations. - `apply_neighborhood`: - Allow `null` as default value for units. diff --git a/proposals/array_modify.json b/proposals/array_modify.json index 2ee02e16..1a590bea 100644 --- a/proposals/array_modify.json +++ b/proposals/array_modify.json @@ -39,7 +39,7 @@ "name": "length", "description": "The number of elements in the `data` array to remove (or replace) starting from the given index. 
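The insert-versus-replace behaviour enabled by the new default can be sketched in plain Python (an illustration of the described semantics, not normative):

```python
# Illustrative sketch: drop `length` elements at `index`, then insert `values`.
def array_modify(data, values, index, length=0):
    return data[:index] + values + data[index + length:]

array_modify(["a", "c"], ["b"], index=1)                 # insert  -> ["a", "b", "c"]
array_modify(["a", "x", "c"], ["b"], index=1, length=1)  # replace -> ["a", "b", "c"]
```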
If the array contains fewer elements, the process simply removes all elements up to the end.", "optional": true, - "default": 1, + "default": 0, "schema": { "type": "integer", "minimum": 0 @@ -75,7 +75,8 @@ "values": [ "b" ], - "index": 1 + "index": 1, + "length": 1 }, "returns": [ "a", @@ -118,8 +119,7 @@ "values": [ "b" ], - "index": 1, - "length": 0 + "index": 1 }, "returns": [ "a", From f8f6731e5bec1fc147bac7d8075e81d70fe0b0e8 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 15 Mar 2023 10:35:52 +0100 Subject: [PATCH 082/117] Add apply_polygon #287 (#298) * Add chunk_polygon #287 * Add mask_value parameter to chunk_polygon * Rename: chunk_polgon -> apply_polygon * Updates to terminology and definition * Add exception * geometries -> geometry --- CHANGELOG.md | 1 + mask_polygon.json | 2 +- proposals/apply_polygon.json | 128 +++++++++++++++++++++++++++++++++++ 3 files changed, 130 insertions(+), 1 deletion(-) create mode 100644 proposals/apply_polygon.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 7bdba935..782f7865 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -78,6 +78,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - New processes in proposal state + - `apply_polygon` - `fit_curve` - `predict_curve` - `ard_normalized_radar_backscatter` and `sar_backscatter`: Added `options` parameter diff --git a/mask_polygon.json b/mask_polygon.json index 902d83fb..20993acd 100644 --- a/mask_polygon.json +++ b/mask_polygon.json @@ -1,7 +1,7 @@ { "id": "mask_polygon", "summary": "Apply a polygon mask", - "description": "Applies a (multi) polygon mask to a raster data cube. To apply a raster mask use ``mask()``.\n\nAll pixels for which the point at the pixel center **does not** intersect with any polygon (as defined in the Simple Features standard by the OGC) are replaced. This behavior can be inverted by setting the parameter `inside` to `true`.\n\nThe pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data). No data values in `data` will be left untouched by the masking operation.", + "description": "Applies a (multi) polygon mask to a raster data cube. To apply a raster mask use ``mask()``.\n\nAll pixels for which the point at the pixel center **does not** intersect with any polygon (as defined in the Simple Features standard by the OGC) are replaced. This behavior can be inverted by setting the parameter `inside` to `true`. The pixel values are replaced with the value specified for `replacement`, which defaults to `null` (no data). No data values in `data` will be left untouched by the masking operation.", "categories": [ "cubes", "masks" diff --git a/proposals/apply_polygon.json b/proposals/apply_polygon.json new file mode 100644 index 00000000..b52d4ace --- /dev/null +++ b/proposals/apply_polygon.json @@ -0,0 +1,128 @@ +{ + "id": "apply_polygon", + "summary": "Apply a process to segments of the data cube", + "description": "Applies a process to segments of the data cube that are defined by the given polygons. For each polygon provided, all pixels for which the point at the pixel center intersects with the polygon (as defined in the Simple Features standard by the OGC) are collected into sub data cubes. If a pixel is part of multiple of the provided polygons (e.g., when the polygons overlap), the `GeometriesOverlap` exception is thrown. 
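The overlap rule can be sketched as follows, with shapely assumed and pixel centers as illustrative stand-ins for the cube's pixel grid:

```python
# Illustrative sketch: assign each pixel center to at most one polygon.
from shapely.geometry import Point, Polygon

def assign(centers, polygons):
    assignment = {}
    for i, center in enumerate(centers):
        for j, polygon in enumerate(polygons):
            if polygon.intersects(center):
                if i in assignment:
                    raise ValueError("GeometriesOverlap")
                assignment[i] = j  # pixel i belongs to polygon j
    return assignment

assign([Point(0.5, 0.5)], [Polygon([(0, 0), (1, 0), (1, 1), (0, 1)])])  # {0: 0}
```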
Each sub data cube is passed individually to the given process.", + "categories": [ + "cubes" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A data cube.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + { + "name": "polygons", + "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.", + "schema": [ + { + "type": "object", + "subtype": "geojson", + "description": "The GeoJSON type `GeometryCollection` is not supported." + }, + { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry", + "geometry_type": [ + "Polygon", + "MultiPolygon" + ] + } + ] + } + ] + }, + { + "name": "process", + "description": "A process that accepts and returns a single data cube and is applied on each individual sub data cube. The process may consist of multiple sub-processes.", + "schema": { + "type": "object", + "subtype": "process-graph", + "parameters": [ + { + "name": "data", + "description": "A sub data cube of the original data cube. The sub data cubes provided cover the smallest possible grid-aligned extent of the corresponding polygon and all pixels outside of the polygon are replaced with the value given in `mask_value`.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + { + "name": "context", + "description": "Additional data passed by the user.", + "schema": { + "description": "Any data type." + }, + "optional": true, + "default": null + } + ], + "returns": { + "description": "The updated sub data cube with the newly computed values and the same dimensions. The dimension properties (name, type, reference system and resolution) must remain unchanged. The labels can change, but the number of labels must remain unchanged.", + "schema": { + "description": "A data cube.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + } + } + } + }, + { + "name": "mask_value", + "description": "All pixels for which the point at the pixel center **does not** intersect with the polygon are replaced with the given value, which defaults to `null` (no data).\n\nIt can provide a distinction between no data values within the polygon and masked pixels outside of it.", + "schema": [ + { + "type": "number" + }, + { + "type": "boolean" + }, + { + "type": "string" + }, + { + "type": "null" + } + ], + "default": null, + "optional": true + }, + { + "name": "context", + "description": "Additional data to be passed to the process.", + "schema": { + "description": "Any data type." + }, + "optional": true, + "default": null + } + ], + "returns": { + "description": "A data cube with the newly computed values and the same dimensions. The dimension properties (name, type, labels, reference system and resolution) remain unchanged.", + "schema": { + "type": "object", + "subtype": "raster-cube" + } + }, + "exceptions": { + "GeometriesOverlap": { + "message": "Geometries are not allowed to overlap to avoid that pixel values are processed multiple times." 
+ } + }, + "links": [ + { + "href": "http://www.opengeospatial.org/standards/sfa", + "rel": "about", + "title": "Simple Features standard by the OGC" + } + ] +} From 883bd80eeaac4d417ea426ad500897d43b9b6fa7 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 28 Mar 2023 21:49:04 +0100 Subject: [PATCH 083/117] Support for arrays and objects removed from comparisons. #208 (#422) --- CHANGELOG.md | 4 +++- array_contains.json | 44 +++++++------------------------------------- eq.json | 31 +++++++++++++------------------ gt.json | 14 ++++++++++++-- gte.json | 31 +++++++++++++------------------ lt.json | 14 ++++++++++++-- lte.json | 31 +++++++++++++------------------ neq.json | 31 +++++++++++++------------------ 8 files changed, 86 insertions(+), 114 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 782f7865..9eb2f31d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -37,7 +37,6 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369) - `mask` and `merge_cubes`: The spatial dimensions `x` and `y` can now be resampled implicitly instead of throwing an error. [#402](https://github.com/Open-EO/openeo-processes/issues/402) - `save_result`: Added a more concrete `DataCubeEmpty` exception. -- The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` don't support temporal comparison any longer. Instead explicitly use `date_difference`. - New definition for `aggregate_spatial`: - Allows more than 3 input dimensions [#126](https://github.com/Open-EO/openeo-processes/issues/126) - Allow to not export statistics by changing the parameter `target_dimension` [#366](https://github.com/Open-EO/openeo-processes/issues/366) @@ -52,6 +51,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `between`: Support for temporal comparison. - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) - Deprecated PROJ definitions for the CRS are not supported any longer. +- The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` and `array_contains`: + - Removed support for temporal comparison. Instead explicitly use `date_difference`. + - Removed support for the input data types array and object. [#208](https://github.com/Open-EO/openeo-processes/issues/208) ### Fixed diff --git a/array_contains.json b/array_contains.json index 803e14d0..37ced980 100644 --- a/array_contains.json +++ b/array_contains.json @@ -1,7 +1,7 @@ { "id": "array_contains", "summary": "Check whether the array contains a given value", - "description": "Checks whether the array specified for `data` contains the value specified in `value`. Returns `true` if there's a match, otherwise `false`.\n\n**Remarks:**\n\n* To get the index or the label of the value found, use ``array_find()``.\n* All definitions for the process ``eq()`` regarding the comparison of values apply here as well. A `null` return value from ``eq()`` is handled exactly as `false` (no match).\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*.\n* An integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`. 
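These remarks translate into a small Python sketch of the strict comparison (illustrative; the explicit `bool` check is needed because Python itself treats `True == 1`):

```python
# Illustrative sketch of strict, type-aware equality with null propagation.
def strict_eq(x, y):
    if x is None or y is None:
        return None  # null propagates
    if isinstance(x, bool) or isinstance(y, bool):
        return isinstance(x, bool) and isinstance(y, bool) and x == y
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        return x == y  # integer 1 equals floating-point 1.0
    return type(x) == type(y) and x == y

strict_eq("1", 1)  # -> False: a string is never equal to a number
strict_eq(1, 1.0)  # -> True:  integer is a sub-type of number
```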
Still, this process may return unexpectedly `false` when comparing floating-point numbers due to floating-point inaccuracy in machine-based computation.\n* Temporal strings are treated as normal strings and MUST NOT be interpreted.\n* If the specified value is an array, object or null, the process always returns `false`. See the examples for one to check for `null` values.", + "description": "Checks whether the array specified for `data` contains the value specified in `value`. Returns `true` if there's a match, otherwise `false`.\n\n**Remarks:**\n\n* To get the index or the label of the value found, use ``array_find()``.\n* All definitions for the process ``eq()`` regarding the comparison of values apply here as well. A `null` return value from ``eq()`` is handled exactly as `false` (no match).\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*.\n* An integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`. Still, this process may return unexpectedly `false` when comparing floating-point numbers due to floating-point inaccuracy in machine-based computation.\n* Temporal strings are treated as normal strings and MUST NOT be interpreted.", "categories": [ "arrays", "comparison", @@ -22,7 +22,12 @@ "name": "value", "description": "Value to find in `data`. If the value is `null`, this process returns always `false`.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } } ], @@ -77,25 +82,6 @@ }, "returns": false }, - { - "arguments": { - "data": [ - [ - 1, - 2 - ], - [ - 3, - 4 - ] - ], - "value": [ - 1, - 2 - ] - }, - "returns": false - }, { "arguments": { "data": [ @@ -111,22 +97,6 @@ "value": 2 }, "returns": false - }, - { - "arguments": { - "data": [ - { - "a": "b" - }, - { - "c": "d" - } - ], - "value": { - "a": "b" - } - }, - "returns": false } ], "links": [ diff --git a/eq.json b/eq.json index ce07da96..e7712399 100644 --- a/eq.json +++ b/eq.json @@ -1,7 +1,7 @@ { "id": "eq", "summary": "Equal to comparison", - "description": "Compares whether `x` is strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", + "description": "Compares whether `x` is strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "texts", "comparison" @@ -11,14 +11,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." 
+ "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { @@ -142,21 +152,6 @@ }, "returns": false }, - { - "arguments": { - "x": [ - 1, - 2, - 3 - ], - "y": [ - 1, - 2, - 3 - ] - }, - "returns": false - }, { "arguments": { "x": null, diff --git a/gt.json b/gt.json index f80b6e11..ae2cf151 100644 --- a/gt.json +++ b/gt.json @@ -10,14 +10,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } } ], diff --git a/gte.json b/gte.json index 378d03e6..a32816c4 100644 --- a/gte.json +++ b/gte.json @@ -1,7 +1,7 @@ { "id": "gte", "summary": "Greater than or equal to comparison", - "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", + "description": "Compares whether `x` is greater than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], @@ -10,14 +10,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } } ], @@ -73,21 +83,6 @@ }, "returns": false }, - { - "arguments": { - "x": [ - 1, - 2, - 3 - ], - "y": [ - 1, - 2, - 3 - ] - }, - "returns": false - }, { "arguments": { "x": null, diff --git a/lt.json b/lt.json index da6fe80d..0cc45f87 100644 --- a/lt.json +++ b/lt.json @@ -10,14 +10,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } } ], diff --git a/lte.json b/lte.json index bf007154..9f936915 100644 --- a/lte.json +++ b/lte.json @@ -1,7 +1,7 @@ { "id": "lte", "summary": "Less than or equal to comparison", - "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. 
To compare temporal strings as dates/times, use ``date_difference()``.", + "description": "Compares whether `x` is less than or equal to `y`.\n\n**Remarks:**\n\n* If any operand is `null`, the return value is `null`.\n* If the operands are not equal (see process ``eq()``) and any of them is not a `number`, the process returns `false`.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "comparison" ], @@ -10,14 +10,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } } ], @@ -73,21 +83,6 @@ }, "returns": false }, - { - "arguments": { - "x": [ - 1, - 2, - 3 - ], - "y": [ - 1, - 2, - 3 - ] - }, - "returns": false - }, { "arguments": { "x": null, diff --git a/neq.json b/neq.json index 4446bb41..ff6bc9fd 100644 --- a/neq.json +++ b/neq.json @@ -1,7 +1,7 @@ { "id": "neq", "summary": "Not equal to comparison", - "description": "Compares whether `x` is **not** strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* If any operand is an array or object, the return value is `false`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", + "description": "Compares whether `x` is **not** strictly equal to `y`.\n\n**Remarks:**\n\n* Data types MUST be checked strictly. For example, a string with the content *1* is not equal to the number *1*. Nevertheless, an integer *1* is equal to a floating-point number *1.0* as `integer` is a sub-type of `number`.\n* If any operand is `null`, the return value is `null`.\n* Strings are expected to be encoded in UTF-8 by default.\n* Temporal strings are normal strings. To compare temporal strings as dates/times, use ``date_difference()``.", "categories": [ "texts", "comparison" @@ -11,14 +11,24 @@ "name": "x", "description": "First operand.", "schema": { - "description": "Any data type is allowed." + "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { "name": "y", "description": "Second operand.", "schema": { - "description": "Any data type is allowed." 
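The ordering processes (`lt`, `lte`, `gt`, `gte`) follow the same pattern, with the extra rule stated in the remarks above: operands that are not equal and not both numbers yield `false`. A minimal, non-normative Python sketch under that reading:

```python
def is_number(v):
    # bool is a subclass of int in Python, so rule it out explicitly
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def lte(x, y):
    # null propagates; numbers are ordered; non-numeric operands
    # only compare as true when they are strictly equal.
    if x is None or y is None:
        return None
    if is_number(x) and is_number(y):
        return x <= y
    return type(x) is type(y) and x == y

assert lte(1, 1.0) is True
assert lte("10", 20) is False  # mixed types are never ordered
assert lte(None, 1) is None
```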
+ "type": [ + "number", + "boolean", + "string", + "null" + ] } }, { @@ -135,21 +145,6 @@ }, "returns": true }, - { - "arguments": { - "x": [ - 1, - 2, - 3 - ], - "y": [ - 1, - 2, - 3 - ] - }, - "returns": false - }, { "arguments": { "x": null, From 14c7076a219481b6e0b324b91efc90b927e684dc Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 28 Mar 2023 23:31:32 +0100 Subject: [PATCH 084/117] Change proposal status of some processes #301 (#419) --- CHANGELOG.md | 6 ++++++ proposals/array_append.json => array_append.json | 1 - proposals/array_concat.json => array_concat.json | 1 - proposals/array_create.json => array_create.json | 3 +-- ...interpolate_linear.json => array_interpolate_linear.json | 1 - ...sample_cube_temporal.json => resample_cube_temporal.json | 1 - 6 files changed, 7 insertions(+), 6 deletions(-) rename proposals/array_append.json => array_append.json (98%) rename proposals/array_concat.json => array_concat.json (98%) rename proposals/array_create.json => array_create.json (98%) rename proposals/array_interpolate_linear.json => array_interpolate_linear.json (98%) rename proposals/resample_cube_temporal.json => resample_cube_temporal.json (99%) diff --git a/CHANGELOG.md b/CHANGELOG.md index 9eb2f31d..80524b15 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -21,6 +21,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Changed +- Moved from proposals to stable processes: + - `array_append` + - `array_concat` + - `array_create` + - `array_interpolate_linear` + - `resample_cube_temporal` - Added better support for labeled arrays. Labels are not discarded in all cases anymore. Affected processes: - `array_append` - `array_concat` diff --git a/proposals/array_append.json b/array_append.json similarity index 98% rename from proposals/array_append.json rename to array_append.json index fc5e6272..80b48d12 100644 --- a/proposals/array_append.json +++ b/array_append.json @@ -5,7 +5,6 @@ "categories": [ "arrays" ], - "experimental": true, "parameters": [ { "name": "data", diff --git a/proposals/array_concat.json b/array_concat.json similarity index 98% rename from proposals/array_concat.json rename to array_concat.json index 0f4dddac..87c04938 100644 --- a/proposals/array_concat.json +++ b/array_concat.json @@ -5,7 +5,6 @@ "categories": [ "arrays" ], - "experimental": true, "parameters": [ { "name": "array1", diff --git a/proposals/array_create.json b/array_create.json similarity index 98% rename from proposals/array_create.json rename to array_create.json index 71d7003d..8e531db0 100644 --- a/proposals/array_create.json +++ b/array_create.json @@ -5,7 +5,6 @@ "categories": [ "arrays" ], - "experimental": true, "parameters": [ { "name": "data", @@ -92,4 +91,4 @@ ] } ] -} \ No newline at end of file +} diff --git a/proposals/array_interpolate_linear.json b/array_interpolate_linear.json similarity index 98% rename from proposals/array_interpolate_linear.json rename to array_interpolate_linear.json index 03021919..021522b0 100644 --- a/proposals/array_interpolate_linear.json +++ b/array_interpolate_linear.json @@ -7,7 +7,6 @@ "math", "math > interpolation" ], - "experimental": true, "parameters": [ { "name": "data", diff --git a/proposals/resample_cube_temporal.json b/resample_cube_temporal.json similarity index 99% rename from proposals/resample_cube_temporal.json rename to resample_cube_temporal.json index 340381a0..260954d0 100644 --- a/proposals/resample_cube_temporal.json +++ b/resample_cube_temporal.json @@ -6,7 +6,6 @@ "cubes", "reproject" 
], - "experimental": true, "parameters": [ { "name": "data", From ff00eb7be850471072d1f2cc2764783fb2a18f18 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 30 Mar 2023 11:08:16 +0200 Subject: [PATCH 085/117] quantiles: Deprecate q in favor of probabilities #293 (#297) * quantiles: Deprecate q in favor of probabilities #293 * `quantiles`: Parameter `probabilities` provided as array must be in ascending order. --- CHANGELOG.md | 6 ++++++ quantiles.json | 41 +++++++++++++++++++++++++++-------------- 2 files changed, 33 insertions(+), 14 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 80524b15..d03a479c 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -50,6 +50,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Renamed `create_raster_cube` to `create_data_cube`. [#68](https://github.com/Open-EO/openeo-processes/issues/68) - Updated the processes based on the subtypes `raster-cube` or `vector-cube` to work with the subtype `datacube` instead. [#68](https://github.com/Open-EO/openeo-processes/issues/68) - `sort` and `order`: The ordering of ties is not defined anymore. [#409](https://github.com/Open-EO/openeo-processes/issues/409) +- `quantiles`: Parameter `probabilities` provided as array must be in ascending order. [#297](https://github.com/Open-EO/openeo-processes/pull/297) ### Removed @@ -104,6 +105,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Renamed to `inspect`. - The log level `error` does not need to stop execution. - Added proposals for logging several data types to the implementation guide. +- `quantiles`: The parameter `probabilities` also accepts an integer value to compute q-quantiles. [#293](https://github.com/Open-EO/openeo-processes/issues/293) + +### Deprecated + +- `quantiles`: The parameter `q` has been deprecated in favor of the extended parameter `probabilities`. [#293](https://github.com/Open-EO/openeo-processes/issues/293) ### Removed diff --git a/quantiles.json b/quantiles.json index 81f60c2b..033a4d89 100644 --- a/quantiles.json +++ b/quantiles.json @@ -1,7 +1,7 @@ { "id": "quantiles", "summary": "Quantiles", - "description": "Calculates quantiles, which are cut points dividing the range of a sample distribution into either\n\n* intervals corresponding to the given `probabilities` or\n* equal-sized intervals (q-quantiles based on the parameter `q`).\n\nEither the parameter `probabilities` or `q` must be specified, otherwise the `QuantilesParameterMissing` exception is thrown. If both parameters are set the `QuantilesParameterConflict` exception is thrown.\n\nSample quantiles can be computed with several different algorithms. Hyndman and Fan (1996) have concluded on nine different types, which are commonly implemented in statistical software packages. This process is implementing type 7, which is implemented widely and often also the default type (e.g. in Excel, Julia, Python, R and S).", + "description": "Calculates quantiles, which are cut points dividing the range of a sample distribution into either\n\n1. intervals corresponding to the given probabilities *or*\n2. equal-sized intervals (q-quantiles).\n\nEither the parameter `probabilities` or `q` must be specified, otherwise the `QuantilesParameterMissing` exception is thrown. If both parameters are set the `QuantilesParameterConflict` exception is thrown.\n\nSample quantiles can be computed with several different algorithms. 
Hyndman and Fan (1996) describe nine different types, which are commonly implemented in statistical software packages. This process implements type 7, which is widely implemented and often also the default type (e.g. in Excel, Julia, Python, R and S).", "categories": [ "math > statistics" ], @@ -21,20 +21,30 @@ }, { "name": "probabilities", - "description": "A list of probabilities to calculate quantiles for. The probabilities must be between 0 and 1 (inclusive).", - "schema": { - "type": "array", - "items": { - "type": "number", - "minimum": 0, - "maximum": 1 + "description": "Quantiles to calculate. Either a list of probabilities or the number of intervals:\n\n* Provide an array with a sorted list of probabilities in ascending order to calculate quantiles for. The probabilities must be between 0 and 1 (inclusive). If not sorted in ascending order, an `AscendingProbabilitiesRequired` exception is thrown.\n* Provide an integer to specify the number of intervals to calculate quantiles for. Calculates q-quantiles with equal-sized intervals.", + "schema": [ + { + "title": "List of probabilities", + "type": "array", + "uniqueItems": true, + "items": { + "type": "number", + "minimum": 0, + "maximum": 1 + } + }, + { + "title": "Number of intervals (q-quantiles)", + "type": "integer", + "minimum": 2 } - }, + ], "optional": true }, { "name": "q", - "description": "Number of intervals to calculate quantiles for. Calculates q-quantiles with equal-sized intervals.", + "description": "Number of intervals to calculate quantiles for. Calculates q-quantiles with equal-sized intervals.\n\nThis parameter has been **deprecated**. Please use the parameter `probabilities` instead.", + "deprecated": true, "schema": { "type": "integer", "minimum": 2 @@ -69,6 +79,9 @@ }, "QuantilesParameterConflict": { "message": "The process `quantiles` only allows that either the `probabilities` or the `q` parameter is set." + }, + "AscendingProbabilitiesRequired": { + "message": "The values passed for parameter `probabilities` must be sorted in ascending order." } }, "examples": [ @@ -114,7 +127,7 @@ 7, 9 ], - "q": 4 + "probabilities": 4 }, "returns": [ 4, @@ -130,7 +143,7 @@ null, 1 ], - "q": 2 + "probabilities": 2 }, "returns": [ -0.5 @@ -144,7 +157,7 @@ null, 1 ], - "q": 4, + "probabilities": 4, "ignore_nodata": false }, "returns": [ @@ -181,4 +194,4 @@ "title": "Hyndman and Fan (1996): Sample Quantiles in Statistical Packages" } ] -} \ No newline at end of file +} From 8cad0b5005bcf177d5289f66b13b4e929b58ede7 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 14 Mar 2023 12:53:16 +0100 Subject: [PATCH 086/117] Fix description of fit_curve --- proposals/fit_curve.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/fit_curve.json b/proposals/fit_curve.json index 9d97dfda..5d33b652 100644 --- a/proposals/fit_curve.json +++ b/proposals/fit_curve.json @@ -1,7 +1,7 @@ { "id": "fit_curve", "summary": "Curve fitting", - "description": "Use non-linear least squares to fit a model function `y = f(x, parameters)` to data.\n\nThe process throws an `InvalidValues` exception if invalid values are encountered. Invalid values are finite numbers (see also ``is_valid()``).", + "description": "Use non-linear least squares to fit a model function `y = f(x, parameters)` to data.\n\nThe process throws an `InvalidValues` exception if invalid values are encountered.
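For reference, the type-7 scheme named in the `quantiles` description above has a compact closed form: for a probability p over n sorted values, it interpolates linearly at the fractional rank h = (n - 1) * p. A rough Python sketch (my own helper, not part of the spec), which also accepts an integer the way the extended `probabilities` parameter does:

```python
def quantiles(data, probabilities):
    # Type-7 sample quantiles. `probabilities` is either a sorted list
    # of probabilities in [0, 1] or an integer q, in which case the
    # q-quantile probabilities 1/q, ..., (q-1)/q are used.
    if isinstance(probabilities, int):
        q = probabilities
        probabilities = [i / q for i in range(1, q)]
    values = sorted(v for v in data if v is not None)  # ignore_nodata=True
    n = len(values)
    result = []
    for p in probabilities:
        h = (n - 1) * p              # fractional rank, type 7
        lo = int(h)
        hi = min(lo + 1, n - 1)
        result.append(values[lo] + (h - lo) * (values[hi] - values[lo]))
    return result

# Matches the updated example above: 4-quantiles of [2, 4, 4, 4, 5, 5, 7, 9]
assert quantiles([2, 4, 4, 4, 5, 5, 7, 9], 4) == [4, 4.5, 5.5]
```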
Valid values are finite numbers (see also ``is_valid()``).", "categories": [ "cubes", "math" From 33b6ca4ee450103fb67596ed96c97ffce2c79c73 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 30 Mar 2023 16:00:32 +0200 Subject: [PATCH 087/117] Document handling of empty geometries. #404 (#423) * Document handling of empty geometries. #404 * Update aggregate_spatial.json Co-authored-by: Stefaan Lippens * Wordsmithing --------- Co-authored-by: Stefaan Lippens --- CHANGELOG.md | 1 + aggregate_spatial.json | 4 ++-- filter_bbox.json | 2 +- filter_spatial.json | 2 +- load_collection.json | 6 +++--- mask_polygon.json | 2 +- proposals/apply_polygon.json | 2 +- proposals/filter_vector.json | 4 ++-- proposals/load_result.json | 6 +++--- proposals/vector_buffer.json | 4 ++-- proposals/vector_to_random_points.json | 2 +- proposals/vector_to_regular_points.json | 2 +- 12 files changed, 19 insertions(+), 18 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index d03a479c..488fbd44 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -81,6 +81,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `rename_labels`: Clarified that the `LabelsNotEnumerated` exception is thrown if `source` is empty instead of if `target` is empty. [#321](https://github.com/Open-EO/openeo-processes/issues/321) - `round`: Clarify that the rounding for ties applies not only for integers. [#326](https://github.com/Open-EO/openeo-processes/issues/326) - `save_result`: Clarified that the process always returns `true` (and otherwise throws). [#334](https://github.com/Open-EO/openeo-processes/issues/334) +- Handling of empty geometries is clarified throughout the processes. [#404](https://github.com/Open-EO/openeo-processes/issues/404) ## [1.2.0] - 2021-12-13 diff --git a/aggregate_spatial.json b/aggregate_spatial.json index 503a0eaf..880ad9a3 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -26,7 +26,7 @@ }, { "name": "geometries", - "description": "Geometries for which the aggregation will be computed. Feature properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per label in the dimension of type `geometries`, GeoJSON `Feature` or `Geometry`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations. No operation is applied to geometries that are outside of the bounds of the data.", + "description": "Geometries for which the aggregation will be computed. Feature properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per label in the dimension of type `geometries`, GeoJSON `Feature` or `Geometry`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. No values will be computed for empty geometries. 
For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations. No operation is applied to geometries that are outside of the bounds of the data.", "schema": [ { "type": "object", @@ -102,7 +102,7 @@ } ], "returns": { - "description": "A vector data cube with the computed results and restricted to the bounds of the geometries. The spatial dimensions is replaced by a geometries dimension and if `target_dimension` is not `null`, a new dimension is added.", + "description": "A vector data cube with the computed results. Empty geometries still exist but without any aggregated values (i.e. no-data). The spatial dimensions are replaced by a dimension of type 'geometries' and if `target_dimension` is not `null`, a new dimension is added.", "schema": { "type": "object", "subtype": "datacube", diff --git a/filter_bbox.json b/filter_bbox.json index 28797193..5eea34ed 100644 --- a/filter_bbox.json +++ b/filter_bbox.json @@ -1,7 +1,7 @@ { "id": "filter_bbox", "summary": "Spatial filter using a bounding box", - "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_vector()`` can be used to filter by geometry.", + "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). All geometries that are empty or become empty will be removed from the data cube. Alternatively, ``filter_vector()`` can be used to filter by geometry.", "categories": [ "cubes", "filter" ], diff --git a/filter_spatial.json b/filter_spatial.json index 28dda1ad..5a7a6c51 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -26,7 +26,7 @@ }, { "name": "geometries", - "description": "One or more geometries used for filtering, given as GeoJSON or vector data cube. If multiple geometries are provided, the union of them is used.\n\nLimits the data cube to the bounding box of the given geometries. No implicit masking gets applied. To mask the pixels of the data cube use ``mask_polygon()``.", + "description": "One or more geometries used for filtering, given as GeoJSON or vector data cube. If multiple geometries are provided, the union of them is used.
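The pixel-selection rule documented for `aggregate_spatial` and `filter_bbox` above (a pixel belongs to a polygon iff the point at its center intersects the polygon, per Simple Features) is easy to picture with a tiny sketch; shapely is assumed here purely for illustration:

```python
from shapely.geometry import Point, Polygon

# A 4x4 grid of 1x1 pixels with origin (0, 0): a pixel takes part in
# the aggregation iff its *center* intersects the polygon.
polygon = Polygon([(0, 0), (2.6, 0), (2.6, 2.6), (0, 2.6)])
selected = [
    (col, row)
    for row in range(4)
    for col in range(4)
    if polygon.intersects(Point(col + 0.5, row + 0.5))  # pixel center
]
print(selected)  # 9 pixels: the 3x3 block whose centers fall inside
```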
Empty geometries are ignored.\n\nLimits the data cube to the bounding box of the given geometries. No implicit masking gets applied. To mask the pixels of the data cube use ``mask_polygon()``.", "schema": [ { "type": "object", diff --git a/load_collection.json b/load_collection.json index 27e538bf..f5a8c8cf 100644 --- a/load_collection.json +++ b/load_collection.json @@ -18,7 +18,7 @@ }, { "name": "spatial_extent", - "description": "Limits the data to load from the collection to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully *within* the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "description": "Limits the data to load from the collection to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully *within* the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nEmpty geometries are ignored.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", "schema": [ { "title": "Bounding Box", @@ -87,13 +87,13 @@ }, { "title": "GeoJSON", - "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported.", + "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported. Empty geometries are ignored.", "type": "object", "subtype": "geojson" }, { "title": "Vector data cube", - "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube.
For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`). Empty geometries are ignored.", "type": "object", "subtype": "datacube", "dimensions": [ diff --git a/mask_polygon.json b/mask_polygon.json index 20993acd..d2f85a22 100644 --- a/mask_polygon.json +++ b/mask_polygon.json @@ -26,7 +26,7 @@ }, { "name": "mask", - "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.", + "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nEmpty geometries are ignored.", "schema": [ { "type": "object", "subtype": "geojson", "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", "subtype": "datacube", "dimensions": [ { "type": "geometry", "geometry_type": [ "Polygon", "MultiPolygon" ] } ] } ] }, diff --git a/proposals/apply_polygon.json b/proposals/apply_polygon.json index b52d4ace..d373189f 100644 --- a/proposals/apply_polygon.json +++ b/proposals/apply_polygon.json @@ -17,7 +17,7 @@ }, { "name": "polygons", - "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.", + "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nEmpty geometries are ignored.", "schema": [ { "type": "object", "subtype": "geojson", "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", "subtype": "datacube", "dimensions": [ { "type": "geometry", "geometry_type": [ "Polygon", "MultiPolygon" ] } ] } ] }, { "name": "process", diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json index fd18b07b..90b2df20 100644 --- a/proposals/filter_vector.json +++ b/proposals/filter_vector.json @@ -1,7 +1,7 @@ { "id": "filter_vector", "summary": "Spatial vector filter using geometries", - "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. Alternatively, use ``filter_bbox()`` to filter by bounding box.", + "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. All geometries that are empty or become empty will be removed from the data cube. Alternatively, use ``filter_bbox()`` to filter by bounding box.", "categories": [ "cubes", "filter", "vector" ], @@ -24,7 +24,7 @@ }, { "name": "geometries", - "description": "One or more base geometries used for filtering, given as GeoJSON or vector data cube.
If multiple base geometries are provided, the union of them is used.", + "description": "One or more base geometries used for filtering, given as GeoJSON or vector data cube. If multiple base geometries are provided, the union of them is used. Empty geometries are ignored.", "schema": [ { "type": "object", "subtype": "geojson", "description": "The GeoJSON type `GeometryCollection` is not supported." }, { "type": "object", "subtype": "datacube", "dimensions": [ { "type": "geometry" } ] } ] }, { "name": "relation", diff --git a/proposals/load_result.json b/proposals/load_result.json index 2ac7de7e..1aca00b9 100644 --- a/proposals/load_result.json +++ b/proposals/load_result.json @@ -29,7 +29,7 @@ }, { "name": "spatial_extent", - "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube of the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", "schema": [ { "title": "Bounding Box", @@ -98,13 +98,13 @@ }, { "title": "GeoJSON", - "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported.", + "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported.
Empty geometries are ignored.", "type": "object", "subtype": "geojson" }, { "title": "Vector data cube", - "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).", + "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. All pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`). Empty geometries are ignored.", "type": "object", "subtype": "datacube", "dimensions": [ diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json index d78eef9c..aa709fbb 100644 --- a/proposals/vector_buffer.json +++ b/proposals/vector_buffer.json @@ -1,7 +1,7 @@ { "id": "vector_buffer", "summary": "Buffer geometries by distance", - "description": "Buffers each input geometry by a given distance, which can either expand (dilate) or a shrink (erode) the geometry. Buffers can be applied to points, lines and polygons, but the results are always polygons. Multi-part types (e.g. `MultiPoint`) are also allowed.", + "description": "Buffers each input geometry by a given distance, which can either expand (dilate) or shrink (erode) the geometry. Buffers can be applied to points, lines and polygons, but the results are always polygons. Empty geometries are passed through and negative buffers may result in empty geometries. Multi-part types (e.g. `MultiPoint`) are also allowed.", "categories": [ "vector" ], @@ -39,7 +39,7 @@ } ], "returns": { - "description": "Returns a vector data cube with the computed new geometries.", + "description": "Returns a vector data cube with the computed new geometries of which some may be empty.", "schema": { "type": "object", "subtype": "datacube", diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json index 5a6ad9ac..2ebb400f 100644 --- a/proposals/vector_to_random_points.json +++ b/proposals/vector_to_random_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_random_points", "summary": "Sample random points from geometries", - "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry. Feature properties are preserved.\n\nIf `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`, which is the default), one sample per geometry is used.", + "description": "Generate a vector data cube of points by sampling random points from input geometries. At least one point is sampled per input geometry. Empty geometries are passed through without any points assigned. Feature properties are preserved.\n\nIf `geometry_count` and `total_count` are both unrestricted (i.e. set to `null`, which is the default), one sample per geometry is used.", "categories": [ "cubes", "vector" ], diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index efb03417..c2b01585 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -1,7 +1,7 @@ { "id": "vector_to_regular_points", "summary": "Sample regular points from geometries", - "description": "Generate a vector data cube of points by sampling regularly-spaced points from input geometries.
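The buffer behaviour documented for `vector_buffer` above (negative distances erode and may empty a geometry, empty inputs pass through) mirrors what common geometry libraries do. A short illustration, assuming shapely is available:

```python
from shapely.geometry import Polygon

square = Polygon([(0, 0), (4, 0), (4, 4), (0, 4)])
print(square.buffer(1).is_empty)   # False: dilated outwards
print(square.buffer(-1).is_empty)  # False: eroded to a 2x2 square
print(square.buffer(-3).is_empty)  # True: erosion consumed the geometry
```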
Empty geometries are passed through without any points assigned. Feature properties are preserved.", "categories": [ "cubes", "vector" From a1f5e735e328fa492a6edc6f86621cc4db308ec8 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 31 Mar 2023 09:50:24 +0200 Subject: [PATCH 088/117] load_stac and remove job-id subtype #384 (#413) * Remove job-id subtype #384 * load_result -> load_stac --- CHANGELOG.md | 1 + meta/subtype-schemas.json | 7 - .../{load_result.json => load_stac.json} | 130 +++++++++++++++--- tests/.words | 2 + 4 files changed, 111 insertions(+), 29 deletions(-) rename proposals/{load_result.json => load_stac.json} (50%) diff --git a/CHANGELOG.md b/CHANGELOG.md index 488fbd44..c1f37b14 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -58,6 +58,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `between`: Support for temporal comparison. - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) - Deprecated PROJ definitions for the CRS are not supported any longer. +- `load_result`: Renamed to `load_stac` and the subtype `job-id` was removed in favor of providing a URL. [#322](https://github.com/Open-EO/openeo-processes/issues/322), [#377](https://github.com/Open-EO/openeo-processes/issues/377), [#384](https://github.com/Open-EO/openeo-processes/issues/384) - The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` and `array_contains`: - Removed support for temporal comparison. Instead explicitly use `date_difference`. - Removed support for the input data types array and object. [#208](https://github.com/Open-EO/openeo-processes/issues/208) diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index 498adf60..ba37301c 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -185,13 +185,6 @@ "title": "Options for Input File Formats", "description": "Key-value-pairs with arguments for the input format options supported by the back-end." }, - "job-id": { - "type": "string", - "subtype": "job-id", - "title": "Batch Job ID", - "description": "A batch job id, either one of the jobs a user has stored or a publicly available job.", - "pattern": "^[\\w\\-\\.~]+$" - }, "kernel": { "type": "array", "subtype": "kernel", diff --git a/proposals/load_result.json b/proposals/load_stac.json similarity index 50% rename from proposals/load_result.json rename to proposals/load_stac.json index 1aca00b9..f37160a1 100644 --- a/proposals/load_result.json +++ b/proposals/load_stac.json @@ -1,7 +1,7 @@ { - "id": "load_result", - "summary": "Load batch job results", - "description": "Loads batch job results and returns them as a processable data cube. A batch job result can be loaded by ID or URL:\n\n* **ID**: The identifier for a finished batch job. The job must have been submitted by the authenticated user on the back-end currently connected to.\n* **URL**: The URL to the STAC metadata for a batch job result. This is usually a signed URL that is provided by some back-ends since openEO API version 1.1.0 through the `canonical` link relation in the batch job result metadata.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`. 
If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", +  "id": "load_stac", +  "summary": "Loads data from STAC", +  "description": "Loads data from a static STAC catalog or a STAC API Collection and returns the data as a processable data cube. A batch job result can be loaded by providing a reference to it.\n\nIf supported by the underlying metadata and file format, the data that is added to the data cube can be restricted with the parameters `spatial_extent`, `temporal_extent` and `bands`. If no data is available for the given extents, a `NoDataAvailable` exception is thrown.\n\n**Remarks:**\n\n* The bands (and all dimensions that specify nominal dimension labels) are expected to be ordered as specified in the metadata if the `bands` parameter is set to `null`.\n* If no additional parameter is specified this would imply that the whole data set is expected to be loaded. Due to the large size of many data sets, this is not recommended and may be optimized by back-ends to only load the data that is actually required after evaluating subsequent processes such as filters. This means that the values should be processed only after the data has been limited to the required extent and as a consequence also to a manageable size.", "categories": [ "cubes", "import" ], "experimental": true, "parameters": [ { - "name": "id", - "description": "The id of a batch job with results.", - "schema": [ - { - "title": "ID", - "type": "string", - "subtype": "job-id", - "pattern": "^[\\w\\-\\.~]+$" - }, - { - "title": "URL", - "type": "string", - "format": "uri", - "subtype": "uri", - "pattern": "^https?://" - } - ] + "name": "url", + "description": "The URL to a static STAC catalog (STAC Item, STAC Collection, or STAC Catalog) or a specific STAC API Collection that allows filtering items and downloading assets. This includes batch job results, which are themselves compliant with STAC. For external URLs, authentication details such as API keys or tokens may need to be included in the URL.\n\nBatch job results can be specified in two ways:\n\n- For batch job results at the same back-end, a URL pointing to the corresponding batch job results endpoint should be provided. The URL usually ends with `/jobs/{id}/results` and `{id}` is the corresponding batch job ID.\n- For external results, a signed URL must be provided.
Not all back-ends support signed URLs, which are provided as a link with the link relation `canonical` in the batch job result metadata.", "schema": { "title": "URL", "type": "string", "format": "uri", "subtype": "uri", "pattern": "^https?://" } }, { "name": "spatial_extent", - "description": "Limits the data to load from the batch job result to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_bbox()`` or ``filter_spatial()`` directly after loading unbounded data.", + "description": "Limits the data to load to the specified bounding box or polygons.\n\n* For raster data, the process loads the pixel into the data cube if the point at the pixel center intersects with the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC).\n* For vector data, the process loads the geometry into the data cube if the geometry is fully within the bounding box or any of the polygons (as defined in the Simple Features standard by the OGC). Empty geometries may only be in the data cube if no spatial extent has been provided.\n\nThe GeoJSON can be one of the following feature types:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nSet this parameter to `null` to set no limit for the spatial extent. Be careful with this when loading large datasets!
It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", + "description": "Limits the data to load to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports open intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", "schema": [ { "type": "array", @@ -195,6 +187,44 @@ ], "default": null, "optional": true + }, + { + "name": "properties", + "description": "Limits the data by metadata properties to include only data in the data cube which all given conditions return `true` for (AND operation).\n\nSpecify key-value-pairs with the key being the name of the metadata property, which can be retrieved with the openEO Data Discovery for Collections. The value must be a condition (user-defined process) to be evaluated against a STAC API. This parameter is not supported for static STAC.", + "schema": [ + { + "type": "object", + "subtype": "metadata-filter", + "title": "Filters", + "description": "A list of filters to check against. Specify key-value-pairs with the key being the name of the metadata property name and the value being a process evaluated against the metadata values.", + "additionalProperties": { + "type": "object", + "subtype": "process-graph", + "parameters": [ + { + "name": "value", + "description": "The property value to be checked against.", + "schema": { + "description": "Any data type." + } + } + ], + "returns": { + "description": "`true` if the data should be loaded into the data cube, otherwise `false`.", + "schema": { + "type": "boolean" + } + } + } + }, + { + "title": "No filter", + "description": "Don't filter by metadata properties.", + "type": "null" + } + ], + "default": null, + "optional": true } ], "returns": { @@ -204,6 +234,62 @@ "subtype": "datacube" } }, + "examples": [ + { + "title": "Load from a static STAC / batch job result", + "arguments": { + "url": "https://example.com/api/v1.0/jobs/123/results" + } + }, + { + "title": "Load from a STAC API", + "arguments": { + "url": "https://example.com/collections/SENTINEL2", + "spatial_extent": { + "west": 16.1, + "east": 16.6, + "north": 48.6, + "south": 47.2 + }, + "temporal_extent": [ + "2018-01-01", + "2019-01-01" + ], + "properties": { + "eo:cloud_cover": { + "process_graph": { + "cc": { + "process_id": "between", + "arguments": { + "x": { + "from_parameter": "value" + }, + "min": 0, + "max": 50 + }, + "result": true + } + } + }, + "platform": { + "process_graph": { + "pf": { + "process_id": "eq", + "arguments": { + "x": { + "from_parameter": "value" + }, + "y": "Sentinel-2B", + "case_sensitive": false + }, + "result": true + } + } + } + } + } + } + ], "exceptions": { "NoDataAvailable": { "message": "There is no data available for the given extents." 
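For orientation, this is roughly how the new process could be invoked from the openEO Python client, mirroring the STAC API example above. The `load_stac` method name and its keyword arguments are assumed here from the process definition and may differ between client versions; the back-end URL is a placeholder:

```python
import openeo

# Connect to a (hypothetical) back-end and load data via load_stac
connection = openeo.connect("https://backend.example").authenticate_oidc()
cube = connection.load_stac(
    url="https://example.com/collections/SENTINEL2",
    spatial_extent={"west": 16.1, "south": 47.2, "east": 16.6, "north": 48.6},
    temporal_extent=["2018-01-01", "2019-01-01"],
)
```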
diff --git a/tests/.words b/tests/.words index b4cd56fe..a8a1d78a 100644 --- a/tests/.words +++ b/tests/.words @@ -33,6 +33,8 @@ Sentinel-2A Sentinel-2B signum STAC +catalog +Catalog summand UDFs gdalwarp From e251d05b534666621ed034d2b6a2d8c8bee40386 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 31 Mar 2023 10:28:46 +0200 Subject: [PATCH 089/117] Streamline GeoJSON import #346 (#412) * Streamline geojson import #346 * Fix typos * Deprecation * Remove load_geojson --- CHANGELOG.md | 9 ++++++++- aggregate_spatial.json | 13 +++++++----- filter_spatial.json | 12 +++++++---- load_collection.json | 5 +++-- mask_polygon.json | 13 +++++++----- meta/subtype-schemas.json | 3 ++- proposals/filter_vector.json | 27 +++++++++---------------- proposals/vector_buffer.json | 27 +++++++++---------------- proposals/vector_to_random_points.json | 25 +++++++++-------------- proposals/vector_to_regular_points.json | 25 +++++++++-------------- tests/.words | 1 + 11 files changed, 76 insertions(+), 84 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c1f37b14..1a7ecbbb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -52,13 +52,20 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `sort` and `order`: The ordering of ties is not defined anymore. [#409](https://github.com/Open-EO/openeo-processes/issues/409) - `quantiles`: Parameter `probabilities` provided as array must be in ascending order. [#297](https://github.com/Open-EO/openeo-processes/pull/297) +### Deprecated + +- `aggregate_spatial`, `filter_spatial`, `load_collection`, `mask_polygon`: GeoJSON input is deprecated. [#346](https://github.com/Open-EO/openeo-processes/issues/346) + ### Removed - The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. - `between`: Support for temporal comparison. - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) - Deprecated PROJ definitions for the CRS are not supported any longer. -- `load_result`: Renamed to `load_stac` and the subtype `job-id` was removed in favor of providing a URL. [#322](https://github.com/Open-EO/openeo-processes/issues/322), [#377](https://github.com/Open-EO/openeo-processes/issues/377), [#384](https://github.com/Open-EO/openeo-processes/issues/384) +- `load_result`: + - Renamed to `load_stac` + - The subtype `job-id` was removed in favor of providing a URL. [#322](https://github.com/Open-EO/openeo-processes/issues/322), [#377](https://github.com/Open-EO/openeo-processes/issues/377), [#384](https://github.com/Open-EO/openeo-processes/issues/384) + - GeoJSON input is not supported any longer. [#346](https://github.com/Open-EO/openeo-processes/issues/346) - The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` and `array_contains`: - Removed support for temporal comparison. Instead explicitly use `date_difference`. - Removed support for the input data types array and object. [#208](https://github.com/Open-EO/openeo-processes/issues/208) diff --git a/aggregate_spatial.json b/aggregate_spatial.json index 880ad9a3..eb913949 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -29,11 +29,7 @@ "description": "Geometries for which the aggregation will be computed. 
Feature properties are preserved for vector data cubes and all GeoJSON Features.\n\nOne value will be computed per label in the dimension of type `geometries`, GeoJSON `Feature` or `Geometry`. For a `FeatureCollection` multiple values will be computed, one value per contained `Feature`. No values will be computed for empty geometries. For example, a single value will be computed for a `MultiPolygon`, but two values will be computed for a `FeatureCollection` containing two polygons.\n\n- For **polygons**, the process considers all pixels for which the point at the pixel center intersects with the corresponding polygon (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nThus, pixels may be part of multiple geometries and be part of multiple aggregations. No operation is applied to geometries that are outside of the bounds of the data.", "schema": [ { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { + "title": "Vector Data Cube", "type": "object", "subtype": "datacube", "dimensions": [ @@ -41,6 +37,13 @@ "type": "geometry" } ] + }, + { + "title": "GeoJSON", + "type": "object", + "subtype": "geojson", + "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "deprecated": true } ] }, diff --git a/filter_spatial.json b/filter_spatial.json index 5a7a6c51..8f8120ce 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -29,10 +29,7 @@ "description": "One or more geometries used for filtering, given as GeoJSON or vector data cube. If multiple geometries are provided, the union of them is used. Empty geometries are ignored.\n\nLimits the data cube to the bounding box of the given geometries. No implicit masking gets applied. To mask the pixels of the data cube use ``mask_polygon()``.", "schema": [ { - "type": "object", - "subtype": "geojson" - }, - { + "title": "Vector Data Cube", "type": "object", "subtype": "datacube", "dimensions": [ @@ -40,6 +37,13 @@ "type": "geometry" } ] + }, + { + "title": "GeoJSON", + "type": "object", + "subtype": "geojson", + "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "deprecated": true } ] } diff --git a/load_collection.json b/load_collection.json index f5a8c8cf..06d4ac94 100644 --- a/load_collection.json +++ b/load_collection.json @@ -87,9 +87,10 @@ }, { "title": "GeoJSON", - "description": "Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported. Empty geometries are ignored.", + "description": "Deprecated. Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported. 
Empty geometries are ignored.", "type": "object", "subtype": "geojson", "deprecated": true }, { "title": "Vector data cube", - "description": "Limits the data cube to the bounding box of the given geometries in the vector data cube. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`). Empty geometries are ignored.", "type": "object", "subtype": "datacube", "dimensions": [ diff --git a/mask_polygon.json b/mask_polygon.json index d2f85a22..8d8e3cad 100644 --- a/mask_polygon.json +++ b/mask_polygon.json @@ -29,11 +29,7 @@ "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n\nEmpty geometries are ignored.", "schema": [ { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { + "title": "Vector Data Cube", "type": "object", "subtype": "datacube", "dimensions": [ @@ -45,6 +41,13 @@ ] } ] + }, + { + "title": "GeoJSON", + "type": "object", + "subtype": "geojson", + "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "deprecated": true } ] }, diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index ba37301c..dc527e08 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -166,7 +166,8 @@ "type": "object", "subtype": "geojson", "title": "GeoJSON", - "description": "GeoJSON as defined by [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946.html).", + "description": "Deprecated. GeoJSON as defined by [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946.html). The GeoJSON type `GeometryCollection` is not supported.", + "deprecated": true, "allOf": [ { "$ref": "https://geojson.org/schema/GeoJSON.json" } ] }, diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json index 90b2df20..4560848e 100644 --- a/proposals/filter_vector.json +++ b/proposals/filter_vector.json @@ -24,23 +24,16 @@ }, { "name": "geometries", - "description": "One or more base geometries used for filtering, given as GeoJSON or vector data cube. If multiple base geometries are provided, the union of them is used. Empty geometries are ignored.", - "schema": [ - { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - } - ] - } - ] + "description": "One or more base geometries used for filtering, given as vector data cube. If multiple base geometries are provided, the union of them is used.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } }, { "name": "relation", diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json index aa709fbb..9c4f86ae 100644 --- a/proposals/vector_buffer.json +++ b/proposals/vector_buffer.json @@ -9,23 +9,16 @@ "parameters": [ { "name": "geometries", - "description": "Geometries to apply the buffer on. Feature properties are preserved for vector data cubes and all GeoJSON Features.", + "description": "Geometries to apply the buffer on.
Feature properties are preserved.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } }, { "name": "distance", diff --git a/proposals/vector_to_random_points.json b/proposals/vector_to_random_points.json index 2ebb400f..b568b54c 100644 --- a/proposals/vector_to_random_points.json +++ b/proposals/vector_to_random_points.json @@ -11,22 +11,15 @@ { "name": "data", "description": "Input geometries for sample extraction.", - "schema": [ - { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - } - ] - } - ] + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } }, { "name": "geometry_count", diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index c2b01585..2f353bdb 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -11,22 +11,15 @@ { "name": "data", "description": "Input geometries for sample extraction.", - "schema": [ - { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry" - } - ] - } - ] + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } }, { "name": "distance", diff --git a/tests/.words b/tests/.words index a8a1d78a..c7f2a702 100644 --- a/tests/.words +++ b/tests/.words @@ -10,6 +10,7 @@ DEM-based Domini gamma0 GeoJSON +FeatureCollections labeled MathWorld n-ary From 19d6f029cebced10affaf94029e016265bf2a81c Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 31 Mar 2023 10:36:19 +0200 Subject: [PATCH 090/117] Remove geojson type from experimental process apply_polygon --- proposals/apply_polygon.json | 35 ++++++++++++++--------------------- 1 file changed, 14 insertions(+), 21 deletions(-) diff --git a/proposals/apply_polygon.json b/proposals/apply_polygon.json index d373189f..735226a1 100644 --- a/proposals/apply_polygon.json +++ b/proposals/apply_polygon.json @@ -17,27 +17,20 @@ }, { "name": "polygons", - "description": "A GeoJSON object or a vector data cube containing at least one polygon. The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n* Empty geometries are ignored.", - "schema": [ - { - "type": "object", - "subtype": "geojson", - "description": "The GeoJSON type `GeometryCollection` is not supported." - }, - { - "type": "object", - "subtype": "datacube", - "dimensions": [ - { - "type": "geometry", - "geometry_type": [ - "Polygon", - "MultiPolygon" - ] - } - ] - } - ] + "description": "A vector data cube containing at least one polygon. 
The provided vector data can be one of the following:\n\n* A `Polygon` or `MultiPolygon` geometry,\n* a `Feature` with a `Polygon` or `MultiPolygon` geometry, or\n* a `FeatureCollection` containing at least one `Feature` with `Polygon` or `MultiPolygon` geometries.\n* Empty geometries are ignored.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry", + "geometry_type": [ + "Polygon", + "MultiPolygon" + ] + } + ] + } }, { "name": "process", From 4dff343ef937039beec481705da41d23fc426471 Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Fri, 31 Mar 2023 18:00:53 +0200 Subject: [PATCH 091/117] Improve summary of order process (#424) * Improve summary of order process * Link between processes, other wordsmithing --------- Co-authored-by: Matthias Mohr --- order.json | 4 ++-- rearrange.json | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/order.json b/order.json index 871c4fea..cb8e1681 100644 --- a/order.json +++ b/order.json @@ -1,7 +1,7 @@ { "id": "order", - "summary": "Create a permutation", - "description": "Computes a permutation which allows rearranging the data into ascending or descending order. In other words, this process computes the ranked (sorted) element positions in the original list.\n\n**Remarks:**\n\n* The positions in the result are zero-based.\n* The ordering of ties is implementation-dependent.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", + "summary": "Get the order of array elements", + "description": "Computes the ranked (sorted) element positions in the original list (i.e., a permutation), either in ascending or descending order. The process ``rearrange()`` allows sorting the data based on the computed permutation.\n\n**Remarks:**\n\n* The positions in the result are zero-based.\n* The ordering of ties is implementation-dependent.\n* Temporal strings can *not* be compared based on their string representation due to the time zone/time-offset representations.", "categories": [ "arrays", "sorting" diff --git a/rearrange.json b/rearrange.json index 92bdf6db..011136c1 100644 --- a/rearrange.json +++ b/rearrange.json @@ -1,7 +1,7 @@ { "id": "rearrange", - "summary": "Rearrange an array based on a permutation", - "description": "Rearranges an array based on a permutation, i.e. a ranked list of element positions in the original list. The positions must be zero-based.", + "summary": "Sort an array based on a permutation", + "description": "Rearranges an array based on a ranked list of element positions in the original list (i.e., a permutation). The positions must be zero-based. 
The process ``order()`` can compute such a permutation.", "categories": [ "arrays", "sorting" @@ -109,4 +109,4 @@ "title": "Permutation explained by Wolfram MathWorld" } ] -} \ No newline at end of file +} From 01fa2e39dce4e5daac9f65178d02620a653c7965 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 3 Apr 2023 11:54:30 +0200 Subject: [PATCH 092/117] Add load_url (#428) * Add load_http #415 * load_http -> load_url * Apply suggestions from code review Co-authored-by: Stefaan Lippens --- CHANGELOG.md | 1 + proposals/load_url.json | 53 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 54 insertions(+) create mode 100644 proposals/load_url.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 1a7ecbbb..69dbd96b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `date_difference` - `filter_vector` - `flatten_dimensions` + - `load_url` - `unflatten_dimension` - `vector_buffer` - `vector_reproject` diff --git a/proposals/load_url.json b/proposals/load_url.json new file mode 100644 index 00000000..1b7a1f19 --- /dev/null +++ b/proposals/load_url.json @@ -0,0 +1,53 @@ +{ + "id": "load_url", + "summary": "Load data from a URL", + "description": "Loads a file from a URL (supported protocols: HTTP and HTTPS).", + "categories": [ + "cubes", + "import" + ], + "experimental": true, + "parameters": [ + { + "name": "url", + "description": "The URL to read from. Authentication details such as API keys or tokens may need to be included in the URL.", + "schema": { + "title": "URL", + "type": "string", + "format": "uri", + "subtype": "uri", + "pattern": "^https?://" + } + }, + { + "name": "format", + "description": "The file format to use when loading the data. It must be one of the values that the server reports as supported input file formats, which usually correspond to the short GDAL/OGR codes. If the format is not suitable for loading the data, a `FormatUnsuitable` exception will be thrown. This parameter is *case insensitive*.", + "schema": { + "type": "string", + "subtype": "input-format" + } + }, + { + "name": "options", + "description": "The file format parameters to use when reading the data. Must correspond to the parameters that the server reports as supported parameters for the chosen `format`. The parameter names and valid values usually correspond to the GDAL/OGR format options.", + "schema": { + "type": "object", + "subtype": "input-format-options" + }, + "default": {}, + "optional": true + } + ], + "returns": { + "description": "A data cube for further processing.", + "schema": { + "type": "object", + "subtype": "datacube" + } + }, + "exceptions": { + "FormatUnsuitable": { + "message": "Data can't be loaded with the requested input format." 
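As an aside on the `order()`/`rearrange()` pair reworded in the patch above: the two processes deliberately split sorting into computing a zero-based permutation and applying it. A minimal Python sketch of that contract — plain lists instead of openEO arrays, with `null` and tie handling omitted, and the function names only standing in for the openEO processes:

```python
def order(data, asc=True):
    # Compute the ranked (sorted) element positions of the original list (zero-based).
    positions = sorted(range(len(data)), key=lambda i: data[i])
    return positions if asc else positions[::-1]

def rearrange(data, permutation):
    # Sort the data based on a previously computed permutation.
    return [data[i] for i in permutation]

values = [6, -1, 2, 0]
perm = order(values)            # [1, 3, 2, 0]
print(rearrange(values, perm))  # [-1, 0, 2, 6]
```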
+ } + } +} From 8d63d0a4eca9968cfa082ad9f38582e182d8ec13 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 3 Apr 2023 13:11:46 +0200 Subject: [PATCH 093/117] Add load_geojson (#427) * Add load_geojson #346 #415 --- CHANGELOG.md | 5 ++-- aggregate_spatial.json | 2 +- filter_spatial.json | 2 +- load_collection.json | 2 +- mask_polygon.json | 2 +- meta/subtype-schemas.json | 2 +- proposals/load_geojson.json | 53 +++++++++++++++++++++++++++++++++++++ tests/.words | 1 + 8 files changed, 62 insertions(+), 7 deletions(-) create mode 100644 proposals/load_geojson.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 69dbd96b..7d268dd9 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `date_difference` - `filter_vector` - `flatten_dimensions` + - `load_geojson` - `load_url` - `unflatten_dimension` - `vector_buffer` @@ -55,7 +56,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Deprecated -- `aggregate_spatial`, `filter_spatial`, `load_collection`, `mask_polygon`: GeoJSON input is deprecated. [#346](https://github.com/Open-EO/openeo-processes/issues/346) +- `aggregate_spatial`, `filter_spatial`, `load_collection`, `mask_polygon`: GeoJSON input is deprecated in favor of `load_geojson`. [#346](https://github.com/Open-EO/openeo-processes/issues/346) ### Removed @@ -66,7 +67,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - `load_result`: - Renamed to `load_stac` - The subtype `job-id` was removed in favor of providing a URL. [#322](https://github.com/Open-EO/openeo-processes/issues/322), [#377](https://github.com/Open-EO/openeo-processes/issues/377), [#384](https://github.com/Open-EO/openeo-processes/issues/384) - - GeoJSON input is not supported any longer. [#346](https://github.com/Open-EO/openeo-processes/issues/346) + - GeoJSON input is not supported any longer. Use `load_geojson` instead. [#346](https://github.com/Open-EO/openeo-processes/issues/346) - The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` and `array_contains`: - Removed support for temporal comparison. Instead explicitly use `date_difference`. - Removed support for the input data types array and object. [#208](https://github.com/Open-EO/openeo-processes/issues/208) diff --git a/aggregate_spatial.json b/aggregate_spatial.json index eb913949..06e07918 100644 --- a/aggregate_spatial.json +++ b/aggregate_spatial.json @@ -42,7 +42,7 @@ "title": "GeoJSON", "type": "object", "subtype": "geojson", - "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "description": "Deprecated in favor of ``load_geojson()``. The GeoJSON type `GeometryCollection` is not supported.", "deprecated": true } ] diff --git a/filter_spatial.json b/filter_spatial.json index 8f8120ce..c0c116cd 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -42,7 +42,7 @@ "title": "GeoJSON", "type": "object", "subtype": "geojson", - "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "description": "Deprecated in favor of ``load_geojson()``. The GeoJSON type `GeometryCollection` is not supported.", "deprecated": true } ] diff --git a/load_collection.json b/load_collection.json index 06d4ac94..49df9650 100644 --- a/load_collection.json +++ b/load_collection.json @@ -87,7 +87,7 @@ }, { "title": "GeoJSON", - "description": "Deprecated. Limits the data cube to the bounding box of the given geometries. 
For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported. Empty geometries are ignored.", + "description": "Deprecated in favor of ``load_geojson()``. Limits the data cube to the bounding box of the given geometries. For raster data, all pixels inside the bounding box that do not intersect with any of the polygons will be set to no data (`null`).\n\nThe GeoJSON type `GeometryCollection` is not supported. Empty geometries are ignored.", "type": "object", "subtype": "geojson", "deprecated": true diff --git a/mask_polygon.json b/mask_polygon.json index 8d8e3cad..f04d3750 100644 --- a/mask_polygon.json +++ b/mask_polygon.json @@ -46,7 +46,7 @@ "title": "GeoJSON", "type": "object", "subtype": "geojson", - "description": "Deprecated. The GeoJSON type `GeometryCollection` is not supported.", + "description": "Deprecated in favor of ``load_geojson()``. The GeoJSON type `GeometryCollection` is not supported.", "deprecated": true } ] diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index dc527e08..b44cb8dc 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -166,7 +166,7 @@ "type": "object", "subtype": "geojson", "title": "GeoJSON", - "description": "Deprecated. GeoJSON as defined by [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946.html). The GeoJSON type `GeometryCollection` is not supported.", + "description": "GeoJSON as defined by [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946.html). The GeoJSON type `GeometryCollection` is not supported.", "deprecated": true, "allOf": [ { diff --git a/proposals/load_geojson.json b/proposals/load_geojson.json new file mode 100644 index 00000000..70566a56 --- /dev/null +++ b/proposals/load_geojson.json @@ -0,0 +1,53 @@ +{ + "id": "load_geojson", + "summary": "Converts GeoJSON into a vector data cube", + "description": "Converts GeoJSON data as defined by [RFC 7946](https://www.rfc-editor.org/rfc/rfc7946.html) into a vector data cube. Feature properties are preserved.", + "categories": [ + "import", + "vector" + ], + "experimental": true, + "parameters": [ + { + "name": "data", + "description": "A GeoJSON object to convert into a vector data cube. The GeoJSON type `GeometryCollection` is not supported. Each geometry in the GeoJSON data results in a dimension label in the `geometries` dimension.", + "schema": { + "type": "object", + "subtype": "geojson" + } + }, + { + "name": "properties", + "description": "A list of properties from the GeoJSON file to construct an additional dimension from. A new dimension with the name `properties` and type `other` is created if at least one property is provided. Only applies for GeoJSON Features and FeatureCollections. Missing values are generally set to no-data (`null`).\n\nDepending on the number of properties provided, the process creates the dimension differently:\n\n- Single property with scalar values: A single dimension label with the name of the property and a single value per geometry.\n- Single property of type array: The dimension labels correspond to the array indices. There are as many values and labels per geometry as there are for the largest array.\n- Multiple properties with scalar values: The dimension labels correspond to the property names. 
There are as many values and labels per geometry as there are properties provided here.", + "schema": { + "type": "array", + "uniqueItems": true, + "items": { + "type": "string" + } + }, + "default": [], + "optional": true + } + ], + "returns": { + "description": "A vector data cube containing the geometries, either one or two dimensional.", + "schema": { + "type": "object", + "subtype": "datacube", + "dimensions": [ + { + "type": "geometry" + } + ] + } + }, + "links": [ + { + "href": "https://www.rfc-editor.org/rfc/rfc7946.html", + "title": "RFC 7946: The GeoJSON Format", + "type": "text/html", + "rel": "about" + } + ] +} diff --git a/tests/.words b/tests/.words index c7f2a702..a50285ba 100644 --- a/tests/.words +++ b/tests/.words @@ -46,3 +46,4 @@ Breiman Hyndman date1 date2 +favor From 21234cb1d419e2bd73247c24bf4308e40e2560f4 Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Wed, 5 Apr 2023 15:40:39 +0200 Subject: [PATCH 094/117] Finetune aggregate_temporal_period description (#431) * Issue #430 Finetune aggregate_temporal_period description * Update aggregate_temporal_period.json --------- Co-authored-by: Matthias Mohr --- aggregate_temporal_period.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json index 0fc94f11..bdaea43e 100644 --- a/aggregate_temporal_period.json +++ b/aggregate_temporal_period.json @@ -23,7 +23,7 @@ }, { "name": "period", - "description": "The time intervals to aggregate. The following pre-defined values are available:\n\n* `hour`: Hour of the day\n* `day`: Day of the year\n* `week`: Week of the year\n* `dekad`: Ten day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. For example, the fourth dekad is Feb, 1 - Feb, 10 each year.\n* `month`: Month of the year\n* `season`: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).\n* `tropical-season`: Six month periods of the tropical seasons (November - April, May - October).\n* `year`: Proleptic years\n* `decade`: Ten year periods ([0-to-9 decade](https://en.wikipedia.org/wiki/Decade#0-to-9_decade)), from a year ending in a 0 to the next year ending in a 9.\n* `decade-ad`: Ten year periods ([1-to-0 decade](https://en.wikipedia.org/wiki/Decade#1-to-0_decade)) better aligned with the anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.", + "description": "The time intervals to aggregate. The following pre-defined values are available:\n\n* `hour`: Hour of the day\n* `day`: Day of the year\n* `week`: Week of the year\n* `dekad`: Ten day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. 
For example, the third dekad of a year spans from January 21 till January 31 (11 days), the fourth dekad spans from February 1 till February 10 (10 days) and the sixth dekad spans from February 21 till February 28 or February 29 in a leap year (8 or 9 days respectively).\n* `month`: Month of the year\n* `season`: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).\n* `tropical-season`: Six month periods of the tropical seasons (November - April, May - October).\n* `year`: Proleptic years\n* `decade`: Ten year periods ([0-to-9 decade](https://en.wikipedia.org/wiki/Decade#0-to-9_decade)), from a year ending in a 0 to the next year ending in a 9.\n* `decade-ad`: Ten year periods ([1-to-0 decade](https://en.wikipedia.org/wiki/Decade#1-to-0_decade)) better aligned with the anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.", "schema": { "type": "string", "enum": [ From 39cb6bacbbf3b28813c28aada5a71885c2f13728 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Sat, 29 Apr 2023 19:03:22 +0200 Subject: [PATCH 095/117] Clarify temporal intervals #331 (#394) * The temporal intervals must always be non-empty, i.e. the second instance in time must be after the first instance in time. #331 * Add uniqueItems, remove mention of 24 as the hour, remove temporal-interval subtype from climatological_normal (as it has inclusive upper boundaries) * Improved terminology * Update load_stac * Updates according to recent discussions #331 * Apply suggestions from code review Co-authored-by: Stefaan Lippens * Remove timezone from times --------- Co-authored-by: Stefaan Lippens --- CHANGELOG.md | 10 +++- aggregate_temporal.json | 37 ++++++------ climatological_normal.json | 13 ++--- filter_temporal.json | 23 ++++---- load_collection.json | 24 ++++---- meta/subtype-schemas.json | 28 +++++---- order.json | 5 -- proposals/date_between.json | 112 ++++++++++++++++++++++++++++++++++++ proposals/load_stac.json | 58 +++++++++++++++---- sort.json | 10 ---- 10 files changed, 233 insertions(+), 87 deletions(-) create mode 100644 proposals/date_between.json diff --git a/CHANGELOG.md b/CHANGELOG.md index 7d268dd9..acb2bbe4 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -9,6 +9,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Added - New processes in proposal state: + - `date_between` - `date_difference` - `filter_vector` - `flatten_dimensions` @@ -39,9 +40,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Allow `null` as default value for units. - Input and Output for the `process` can either be data cubes or arrays (if one-dimensional). [#387](https://github.com/Open-EO/openeo-processes/issues/387) - `run_udf`: Allow all data types instead of just objects in the `context` parameter. [#376](https://github.com/Open-EO/openeo-processes/issues/376) -- `load_collection` and `load_result`: +- `load_collection` and `load_result`/`load_stac`: - Require at least one band if not set to `null`. [#372](https://github.com/Open-EO/openeo-processes/issues/372) - Added a `NoDataAvailable` exception +- `aggregate_temporal`, `filter_temporal`, `load_collection` and `load_result`/`load_stac`: + - The temporal intervals must always be non-empty, i.e. the second instance in time must be after the first instance in time. [#331](https://github.com/Open-EO/openeo-processes/issues/331) + - `24` as the hour is not allowed anymore. 
[#331](https://github.com/Open-EO/openeo-processes/issues/331) - `inspect`: The parameter `message` has been moved to be the second argument. [#369](https://github.com/Open-EO/openeo-processes/issues/369) - `mask` and `merge_cubes`: The spatial dimensions `x` and `y` can now be resampled implicitly instead of throwing an error. [#402](https://github.com/Open-EO/openeo-processes/issues/402) - `save_result`: Added a more concrete `DataCubeEmpty` exception. @@ -53,6 +57,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Updated the processes based on the subtypes `raster-cube` or `vector-cube` to work with the subtype `datacube` instead. [#68](https://github.com/Open-EO/openeo-processes/issues/68) - `sort` and `order`: The ordering of ties is not defined anymore. [#409](https://github.com/Open-EO/openeo-processes/issues/409) - `quantiles`: Parameter `probabilities` provided as array must be in ascending order. [#297](https://github.com/Open-EO/openeo-processes/pull/297) +- `climatological_normal`: The `climatology_period` parameter accepts an array of integers instead of strings. [#331](https://github.com/Open-EO/openeo-processes/issues/331) ### Deprecated @@ -61,7 +66,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Removed - The `examples` folder has been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. -- `between`: Support for temporal comparison. +- `between`: Support for temporal comparison. Use `date_between` instead. [#331](https://github.com/Open-EO/openeo-processes/issues/331) - Deprecated `GeometryCollections` are not supported any longer. [#389](https://github.com/Open-EO/openeo-processes/issues/389) - Deprecated PROJ definitions for the CRS are not supported any longer. - `load_result`: @@ -71,6 +76,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - The comparison processes `eq`, `neq`, `lt`, `lte`, `gt`, `gte` and `array_contains`: - Removed support for temporal comparison. Instead explicitly use `date_difference`. - Removed support for the input data types array and object. [#208](https://github.com/Open-EO/openeo-processes/issues/208) +- `sort` and `order`: Removed support for time-only values. [#331](https://github.com/Open-EO/openeo-processes/issues/331) ### Fixed diff --git a/aggregate_temporal.json b/aggregate_temporal.json index 1a9e4b09..ad1d560b 100644 --- a/aggregate_temporal.json +++ b/aggregate_temporal.json @@ -22,7 +22,7 @@ }, { "name": "intervals", - "description": "Left-closed temporal intervals, which are allowed to overlap. Each temporal interval in the array has exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Although [RFC 3339 prohibits the hour to be '24'](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.7), **this process allows the value '24' for the hour** of an end time in order to make it possible that left-closed time intervals can fully cover the day.", + "description": "Left-closed temporal intervals, which are allowed to overlap. Each temporal interval in the array has exactly two elements:\n\n1. 
The first element is the start of the temporal interval. The specified time instant is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified time instant is **excluded** from the interval.\n\nThe second element must always be greater/later than the first element, except when using time without date. Otherwise, a `TemporalExtentEmpty` exception is thrown.", "schema": { "type": "array", "subtype": "temporal-intervals", @@ -30,6 +30,7 @@ "items": { "type": "array", "subtype": "temporal-interval", + "uniqueItems": true, "minItems": 2, "maxItems": 2, "items": { @@ -37,24 +38,20 @@ { "type": "string", "format": "date-time", - "subtype": "date-time" + "subtype": "date-time", + "description": "Date and time with a time zone." }, { "type": "string", "format": "date", - "subtype": "date" + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." }, { "type": "string", - "format": "time", - "subtype": "time" - }, - { - "type": "string", - "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$" + "subtype": "time", + "pattern": "^\\d{2}:\\d{2}:\\d{2}$", + "description": "Time only, formatted as `HH:MM:SS`. The time zone is UTC." }, { "type": "null" @@ -79,12 +76,12 @@ ], [ [ - "00:00:00Z", - "12:00:00Z" + "06:00:00", + "18:00:00" ], [ - "12:00:00Z", - "24:00:00Z" + "18:00:00", + "06:00:00" ] ] ] @@ -235,6 +232,9 @@ }, "DistinctDimensionLabelsRequired": { "message": "The dimension labels have duplicate values. Distinct labels must be specified." + }, + "TemporalExtentEmpty": { + "message": "At least one of the intervals is empty. The second instant in time must always be greater/later than the first instant." } }, "links": [ @@ -242,6 +242,11 @@ "href": "https://openeo.org/documentation/1.0/datacubes.html#aggregate", "rel": "about", "title": "Aggregation explained in the openEO documentation" + }, + { + "href": "https://www.rfc-editor.org/rfc/rfc3339.html", + "rel": "about", + "title": "RFC3339: Details about formatting temporal strings" } ] } diff --git a/climatological_normal.json b/climatological_normal.json index 8d97ef5a..fa3441b6 100644 --- a/climatological_normal.json +++ b/climatological_normal.json @@ -39,20 +39,17 @@ "description": "The climatology period as a closed temporal interval. The first element of the array is the first year to be fully included in the temporal interval. The second element is the last year to be fully included in the temporal interval.\n\nThe default climatology period is from 1981 until 2010 (both inclusive) right now, but this might be updated over time to what is commonly used in climatology. If you don't want to keep your research to be reproducible, please explicitly specify a period.", "schema": { "type": "array", - "subtype": "temporal-interval", + "uniqueItems": true, "minItems": 2, "maxItems": 2, "items": { - "type": "string", - "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$" + "type": "integer", + "subtype": "year" } }, "default": [ - "1981", - "2010" + 1981, + 2010 ], "optional": true } diff --git a/filter_temporal.json b/filter_temporal.json index c873645b..4a463674 100644 --- a/filter_temporal.json +++ b/filter_temporal.json @@ -22,7 +22,7 @@ }, { "name": "extent", - "description": "Left-closed temporal interval, i.e. an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. 
The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports open intervals by setting one of the boundaries to `null`, but never both.", + "description": "Left-closed temporal interval, i.e. an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified time instant is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified time instant is **excluded** from the interval.\n\nThe second element must always be greater/later than the first element. Otherwise, a `TemporalExtentEmpty` exception is thrown.\n\nAlso supports unbounded intervals by setting one of the boundaries to `null`, but never both.", "schema": { "type": "array", "subtype": "temporal-interval", @@ -33,19 +33,14 @@ { "type": "string", "format": "date-time", - "subtype": "date-time" + "subtype": "date-time", + "description": "Date and time with a time zone." }, { "type": "string", "format": "date", - "subtype": "date" - }, - { - "type": "string", - "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$" + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." }, { "type": "null" @@ -92,6 +87,9 @@ "exceptions": { "DimensionNotAvailable": { "message": "A dimension with the specified name does not exist." + }, + "TemporalExtentEmpty": { + "message": "The temporal extent is empty. The second instant in time must always be greater/later than the first instant in time." } }, "links": [ @@ -99,6 +97,11 @@ "href": "https://openeo.org/documentation/1.0/datacubes.html#filter", "rel": "about", "title": "Filters explained in the openEO documentation" + }, + { + "href": "https://www.rfc-editor.org/rfc/rfc3339.html", + "rel": "about", + "title": "RFC3339: Details about formatting temporal strings" } ] } diff --git a/load_collection.json b/load_collection.json index 49df9650..b93c879c 100644 --- a/load_collection.json +++ b/load_collection.json @@ -112,11 +112,12 @@ }, { "name": "temporal_extent", - "description": "Limits the data to load from the collection to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports open intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", + "description": "Limits the data to load from the collection to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. 
The specified time instant is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified time instant is **excluded** from the interval.\n\nThe second element must always be greater/later than the first element. Otherwise, a `TemporalExtentEmpty` exception is thrown.\n\nAlso supports unbounded intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", "schema": [ { "type": "array", "subtype": "temporal-interval", + "uniqueItems": true, "minItems": 2, "maxItems": 2, "items": { @@ -124,19 +125,14 @@ { "type": "string", "format": "date-time", - "subtype": "date-time" + "subtype": "date-time", + "description": "Date and time with a time zone." }, { "type": "string", "format": "date", - "subtype": "date" - }, - { - "type": "string", - "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$" + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." }, { "type": "null" @@ -231,6 +227,9 @@ "exceptions": { "NoDataAvailable": { "message": "There is no data available for the given extents." + }, + "TemporalExtentEmpty": { + "message": "The temporal extent is empty. The second instant in time must always be greater/later than the first instant in time." } }, "examples": [ @@ -313,6 +312,11 @@ "rel": "about", "href": "https://github.com/radiantearth/stac-spec/tree/master/extensions/eo#common-band-names", "title": "List of common band names as specified by the STAC specification" + }, + { + "href": "https://www.rfc-editor.org/rfc/rfc3339.html", + "rel": "about", + "title": "RFC3339: Details about formatting temporal strings" } ] } diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json index b44cb8dc..1a49bf64 100644 --- a/meta/subtype-schemas.json +++ b/meta/subtype-schemas.json @@ -120,7 +120,7 @@ "subtype": "date", "format": "date", "title": "Date only", - "description": "Date only representation, as defined for `full-date` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6). The time zone is UTC." + "description": "Date only representation, as defined for `full-date` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6). The time zone is UTC. Missing time components are all 0." }, "date-time": { "type": "string", @@ -282,7 +282,8 @@ "type": "array", "subtype": "temporal-interval", "title": "Single temporal interval", - "description": "Left-closed temporal interval, represented as two-element array with the following elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Although [RFC 3339 prohibits the hour to be '24'](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.7), **this process allows the value '24' for the hour** of an end time in order to make it possible that left-closed time intervals can fully cover the day. 
`null` can be used to specify open intervals.", + "description": "Left-closed temporal interval, represented as two-element array with the following elements:\n\n1. The first element is the start of the temporal interval. The specified time instant is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified time instant is **excluded** from the interval.\n\nThe second element must always be greater/later than the first element. Otherwise, an exception is thrown.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports unbounded intervals by setting one of the boundaries to `null`, but never both.", + "uniqueItems": true, "minItems": 2, "maxItems": 2, "items": { @@ -315,8 +316,8 @@ "2016-01-01" ], [ - "00:00:00Z", - "12:00:00Z" + "00:00:00", + "12:00:00" ], [ "2015-01-01", @@ -350,12 +351,12 @@ ], [ [ - "00:00:00Z", - "12:00:00Z" + "00:00:00", + "12:00:00" ], [ - "12:00:00Z", - "24:00:00Z" + "12:00:00", + null ] ], [ @@ -369,9 +370,9 @@ "time": { "type": "string", "subtype": "time", - "format": "time", + "pattern": "^\\d{2}:\\d{2}:\\d{2}$", "title": "Time only", - "description": "Time only representation, as defined for `full-time` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6). Although [RFC 3339 prohibits the hour to be '24'](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.7), this definition allows the value '24' for the hour as end time in an interval in order to make it possible that left-closed time intervals can fully cover the day." + "description": "Time only representation, as defined for `partial-time` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6). The time zone is UTC." }, "udf-code": { "type": "string", @@ -413,13 +414,10 @@ "description": "Specifies details about cartographic projections as WKT2 string. Refers to the latest WKT2 version (currently [WKT2:2018](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html) / ISO 19162:2018) unless otherwise stated by the process." }, "year": { - "type": "string", + "type": "integer", "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$", "title": "Year only", - "description": "Year representation, as defined for `date-fullyear` by [RFC 3339 in section 5.6](https://www.rfc-editor.org/rfc/rfc3339.html#section-5.6)." + "description": "Year as integer, can be any number of digits and can be negative." } } } diff --git a/order.json b/order.json index cb8e1681..9b67f52d 100644 --- a/order.json +++ b/order.json @@ -29,11 +29,6 @@ "type": "string", "format": "date", "subtype": "date" - }, - { - "type": "string", - "format": "time", - "subtype": "time" } ] } diff --git a/proposals/date_between.json b/proposals/date_between.json new file mode 100644 index 00000000..e361acf4 --- /dev/null +++ b/proposals/date_between.json @@ -0,0 +1,112 @@ +{ + "id": "date_between", + "summary": "Between comparison for dates and times", + "description": "By default, this process checks whether `x` is later than or equal to `min` and before or equal to `max`.\n\nIf `exclude_max` is set to `true` the upper bound is excluded so that the process checks whether `x` is later than or equal to `min` and before `max`.\n\nLower and upper bounds are not allowed to be swapped. 
So `min` MUST be before or equal to `max` or otherwise the process always returns `false`.", + "categories": [ + "comparison", + "date & time" + ], + "experimental": true, + "parameters": [ + { + "name": "x", + "description": "The value to check.", + "schema": [ + { + "type": "string", + "format": "date-time", + "subtype": "date-time", + "description": "Date and time with a time zone." + }, + { + "type": "string", + "format": "date", + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." + }, + { + "type": "string", + "subtype": "time", + "pattern": "^\\d{2}:\\d{2}:\\d{2}$", + "description": "Time only, formatted as HH:MM:SS. The time zone is UTC." + } + ] + }, + { + "name": "min", + "description": "Lower boundary (inclusive) to check against.", + "schema": [ + { + "type": "string", + "format": "date-time", + "subtype": "date-time", + "description": "Date and time with a time zone." + }, + { + "type": "string", + "format": "date", + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." + }, + { + "type": "string", + "subtype": "time", + "pattern": "^\\d{2}:\\d{2}:\\d{2}$", + "description": "Time only, formatted as HH:MM:SS. The time zone is UTC." + } + ] + }, + { + "name": "max", + "description": "Upper boundary (inclusive) to check against.", + "schema": [ + { + "type": "string", + "format": "date-time", + "subtype": "date-time", + "description": "Date and time with a time zone." + }, + { + "type": "string", + "format": "date", + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." + }, + { + "type": "string", + "subtype": "time", + "pattern": "^\\d{2}:\\d{2}:\\d{2}$", + "description": "Time only, formatted as HH:MM:SS. The time zone is UTC." + } + ] + }, + { + "name": "exclude_max", + "description": "Exclude the upper boundary `max` if set to `true`. Defaults to `false`.", + "schema": { + "type": "boolean" + }, + "default": false, + "optional": true + } + ], + "returns": { + "description": "`true` if `x` is between the specified bounds, otherwise `false`.", + "schema": { + "type": [ + "boolean", + "null" + ] + } + }, + "examples": [ + { + "arguments": { + "x": "2020-01-01", + "min": "2021-01-01", + "max": "2022-01-01" + }, + "returns": false + } + ] +} diff --git a/proposals/load_stac.json b/proposals/load_stac.json index f37160a1..c71d3a80 100644 --- a/proposals/load_stac.json +++ b/proposals/load_stac.json @@ -116,11 +116,12 @@ }, { "name": "temporal_extent", - "description": "Limits the data to load to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe specified temporal strings follow [RFC 3339](https://www.rfc-editor.org/rfc/rfc3339.html). Also supports open intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! 
It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", + "description": "Limits the data to load to the specified left-closed temporal interval. Applies to all temporal dimensions. The interval has to be specified as an array with exactly two elements:\n\n1. The first element is the start of the temporal interval. The specified instance in time is **included** in the interval.\n2. The second element is the end of the temporal interval. The specified instance in time is **excluded** from the interval.\n\nThe second element must always be greater/later than the first element. Otherwise, a `TemporalExtentEmpty` exception is thrown.\n\nAlso supports open intervals by setting one of the boundaries to `null`, but never both.\n\nSet this parameter to `null` to set no limit for the temporal extent. Be careful with this when loading large datasets! It is recommended to use this parameter instead of using ``filter_temporal()`` directly after loading unbounded data.", "schema": [ { "type": "array", "subtype": "temporal-interval", + "uniqueItems": true, "minItems": 2, "maxItems": 2, "items": { @@ -128,19 +129,14 @@ { "type": "string", "format": "date-time", - "subtype": "date-time" + "subtype": "date-time", + "description": "Date and time with a time zone." }, { "type": "string", "format": "date", - "subtype": "date" - }, - { - "type": "string", - "subtype": "year", - "minLength": 4, - "maxLength": 4, - "pattern": "^\\d{4}$" + "subtype": "date", + "description": "Date only, formatted as `YYYY-MM-DD`. The time zone is UTC. Missing time components are all 0." }, { "type": "null" @@ -293,6 +289,46 @@ "exceptions": { "NoDataAvailable": { "message": "There is no data available for the given extents." + }, + "TemporalExtentEmpty": { + "message": "The temporal extent is empty. The second instant in time must always be greater/later than the first instant in time." 
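All temporal parameters touched by this patch share the same left-closed convention: the start instant is included, the end instant is excluded, and an interval whose end is not after its start triggers `TemporalExtentEmpty` (time-only intervals in `aggregate_temporal` may wrap around midnight and are exempt). A rough Python sketch of that membership test for date/date-time intervals, with parsing and time zone handling simplified:

```python
from datetime import datetime
from typing import Optional

def in_temporal_extent(t: datetime, start: Optional[datetime], end: Optional[datetime]) -> bool:
    # Left-closed interval check: start is included, end is excluded.
    if start is not None and end is not None and end <= start:
        raise ValueError("TemporalExtentEmpty")  # empty interval
    if start is not None and t < start:
        return False
    if end is not None and t >= end:
        return False
    return True  # also covers intervals unbounded (None) on one side

# The start instant matches, the end instant does not:
print(in_temporal_extent(datetime(2020, 1, 1), datetime(2020, 1, 1), datetime(2021, 1, 1)))  # True
print(in_temporal_extent(datetime(2021, 1, 1), datetime(2020, 1, 1), datetime(2021, 1, 1)))  # False
```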
+ } + }, + "links": [ + { + "href": "https://openeo.org/documentation/1.0/datacubes.html", + "rel": "about", + "title": "Data Cubes explained in the openEO documentation" + }, + { + "rel": "about", + "href": "https://proj.org/usage/projections.html", + "title": "PROJ parameters for cartographic projections" + }, + { + "rel": "about", + "href": "http://www.epsg-registry.org", + "title": "Official EPSG code registry" + }, + { + "rel": "about", + "href": "http://www.epsg.io", + "title": "Unofficial EPSG code database" + }, + { + "href": "http://www.opengeospatial.org/standards/sfa", + "rel": "about", + "title": "Simple Features standard by the OGC" + }, + { + "rel": "about", + "href": "https://github.com/radiantearth/stac-spec/tree/master/extensions/eo#common-band-names", + "title": "List of common band names as specified by the STAC specification" + }, + { + "href": "https://www.rfc-editor.org/rfc/rfc3339.html", + "rel": "about", + "title": "RFC3339: Details about formatting temporal strings" } - } + ] } diff --git a/sort.json b/sort.json index 27586250..422f8d53 100644 --- a/sort.json +++ b/sort.json @@ -29,11 +29,6 @@ "type": "string", "format": "date", "subtype": "date" - }, - { - "type": "string", - "format": "time", - "subtype": "time" } ] } @@ -82,11 +77,6 @@ "type": "string", "format": "date", "subtype": "date" - }, - { - "type": "string", - "format": "time", - "subtype": "time" } ] } From bab65c2ac02a2b1391c90c39d4691c40cf1adf87 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Sat, 29 Apr 2023 19:09:09 +0200 Subject: [PATCH 096/117] New definitions for fit_curve and predict_curve (#420) * Fix description of fit_curve * Fine-tune description * New version for fit_curve and predict_curve * fit_curve: Use a labeled-array for data --- CHANGELOG.md | 1 + proposals/fit_curve.json | 64 +++++++++++++++++------------------- proposals/predict_curve.json | 19 ++--------- 3 files changed, 34 insertions(+), 50 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index acb2bbe4..0eb78e8a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -57,6 +57,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 - Updated the processes based on the subtypes `raster-cube` or `vector-cube` to work with the subtype `datacube` instead. [#68](https://github.com/Open-EO/openeo-processes/issues/68) - `sort` and `order`: The ordering of ties is not defined anymore. [#409](https://github.com/Open-EO/openeo-processes/issues/409) - `quantiles`: Parameter `probabilities` provided as array must be in ascending order. [#297](https://github.com/Open-EO/openeo-processes/pull/297) +- `fit_curve` and `predict_curve`: Heavily modified specifications. `fit_curve` works on arrays instead of data cubes, `predict_curve` doesn't support gap filling anymore, clarify no-data handling, ... [#425](https://github.com/Open-EO/openeo-processes/issues/425) - `climatological_normal`: The `climatology_period` parameter accepts an array of integers instead of strings. [#331](https://github.com/Open-EO/openeo-processes/issues/331) ### Deprecated diff --git a/proposals/fit_curve.json b/proposals/fit_curve.json index 5d33b652..aafb917e 100644 --- a/proposals/fit_curve.json +++ b/proposals/fit_curve.json @@ -1,38 +1,34 @@ { "id": "fit_curve", "summary": "Curve fitting", - "description": "Use non-linear least squares to fit a model function `y = f(x, parameters)` to data.\n\nThe process throws an `InvalidValues` exception if invalid values are encountered. 
Valid values are finite numbers (see also ``is_valid()``).", + "description": "Use non-linear least squares to fit a model function `y = f(x, parameters)` to data.", "categories": [ - "cubes", + "arrays", "math" ], "experimental": true, "parameters": [ { "name": "data", - "description": "A data cube.", + "description": "A labeled array, the labels correspond to the variable `y` and the values correspond to the variable `x`.", "schema": { - "type": "object", - "subtype": "datacube" + "type": "array", + "subtype": "labeled-array", + "items": { + "type": "number" + } } }, { "name": "parameters", "description": "Defined the number of parameters for the model function and provides an initial guess for them. At least one parameter is required.", - "schema": [ - { - "type": "array", - "minItems": 1, - "items": { - "type": "number" - } - }, - { - "title": "Data Cube with optimal values from a previous result of this process.", - "type": "object", - "subtype": "datacube" + "schema": { + "type": "array", + "minItems": 1, + "items": { + "type": "number" } - ] + } }, { "name": "function", @@ -45,7 +41,10 @@ "name": "x", "description": "The value for the independent variable `x`.", "schema": { - "type": "number" + "type": [ + "number", + "null" + ] } }, { @@ -69,26 +68,23 @@ } }, { - "name": "dimension", - "description": "The name of the dimension for curve fitting. Must be a dimension with labels that have a order (i.e. numerical labels or a temporal dimension). Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", + "name": "ignore_nodata", + "description": "Indicates whether no-data values are ignored or not. Ignores them by default. Setting this flag to `false` considers no-data values so that `null` is passed to the model function.", "schema": { - "type": "string" - } + "type": "boolean" + }, + "default": true, + "optional": true } ], "returns": { - "description": "A data cube with the optimal values for the parameters.", + "description": "An array with the optimal values for the parameters.", "schema": { - "type": "object", - "subtype": "datacube" - } - }, - "exceptions": { - "InvalidValues": { - "message": "At least one of the values is not a finite number." - }, - "DimensionNotAvailable": { - "message": "A dimension with the specified name does not exist." + "type": "array", + "minItems": 1, + "items": { + "type": "number" + } } } } diff --git a/proposals/predict_curve.json b/proposals/predict_curve.json index 9fb5d341..479b7fec 100644 --- a/proposals/predict_curve.json +++ b/proposals/predict_curve.json @@ -1,21 +1,13 @@ { "id": "predict_curve", "summary": "Predict values", - "description": "Predict values using a model function and pre-computed parameters. The process is primarily intended to compute values for new labels, but it can also fill gaps where existing labels contain no-data (`null`) values.", + "description": "Predict values using a model function and pre-computed parameters. The process is intended to compute values for new labels.", "categories": [ "cubes", "math" ], "experimental": true, "parameters": [ - { - "name": "data", - "description": "A data cube to predict values for.", - "schema": { - "type": "object", - "subtype": "datacube" - } - }, { "name": "parameters", "description": "A data cube with optimal values, e.g. computed by the process ``fit_curve()``.", @@ -60,7 +52,7 @@ }, { "name": "dimension", - "description": "The name of the dimension for predictions. 
Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.", + "description": "The name of the dimension for predictions.", "schema": { "type": "string" } @@ -98,15 +90,10 @@ } ], "returns": { - "description": "A data cube with the predicted values.", + "description": "A data cube with the predicted values with the provided dimension `dimension` having as many labels as provided through `labels`.", "schema": { "type": "object", "subtype": "datacube" } - }, - "exceptions": { - "DimensionNotAvailable": { - "message": "A dimension with the specified name does not exist." - } } } From 32ec399dddf8102395277c86a0b5b7b576582980 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 4 May 2023 17:51:45 +0200 Subject: [PATCH 097/117] Handle units in vector processes #330 (#436) * Handle units in vector processes #330 Co-authored-by: Daniel Thiex <60705209+dthiex@users.noreply.github.com> --- proposals/vector_buffer.json | 7 ++++++- proposals/vector_to_regular_points.json | 7 ++++++- 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json index 9c4f86ae..05be9a58 100644 --- a/proposals/vector_buffer.json +++ b/proposals/vector_buffer.json @@ -22,7 +22,7 @@ }, { "name": "distance", - "description": "The distance of the buffer in the unit of the spatial reference system. A positive distance expands the geometries and results in outward buffering (dilation) while a negative distance shrinks the geometries and results in inward buffering (erosion).", + "description": "The distance of the buffer in meters. If the unit of the spatial reference system is not meters, a `UnitMismatch` error is thrown. Use ``vector_reproject()`` to convert the geometries to a suitable spatial reference system.\n\nA positive distance expands the geometries and results in outward buffering (dilation) while a negative distance shrinks the geometries and results in inward buffering (erosion).", "schema": { "type": "number", "not": { @@ -42,5 +42,10 @@ } ] } + }, + "exceptions": { + "UnitMismatch": { + "message": "The unit of the spatial reference system is not meters, but the given distance is in meters." + } } } diff --git a/proposals/vector_to_regular_points.json b/proposals/vector_to_regular_points.json index 2f353bdb..c7f7089c 100644 --- a/proposals/vector_to_regular_points.json +++ b/proposals/vector_to_regular_points.json @@ -23,7 +23,7 @@ }, { "name": "distance", - "description": "Defines the minimum distance in the unit of the reference system that is required between two samples generated *inside* a single geometry.\n\n- For **polygons**, the distance defines the cell sizes of a regular grid that starts at the upper-left bound of each polygon. The centroid of each cell is then a sample point. If the centroid is not enclosed in the polygon, no point is sampled. If no point can be sampled for the geometry at all, the first coordinate of the geometry is returned as point.\n- For **lines** (line strings), the sampling starts with a point at the first coordinate of the line and then walks along the line and samples a new point each time the distance to the previous point has been reached again.\n- For **points**, the point is returned as given.", + "description": "Defines the minimum distance in meters that is required between two samples generated *inside* a single geometry. If the unit of the spatial reference system is not meters, a `UnitMismatch` error is thrown. 
Use ``vector_reproject()`` to convert the geometries to a suitable spatial reference system.\n\n- For **polygons**, the distance defines the cell sizes of a regular grid that starts at the upper-left bound of each polygon. The centroid of each cell is then a sample point. If the centroid is not enclosed in the polygon, no point is sampled. If no point can be sampled for the geometry at all, the first coordinate of the geometry is returned as point.\n- For **lines** (line strings), the sampling starts with a point at the first coordinate of the line and then walks along the line and samples a new point each time the distance to the previous point has been reached again.\n- For **points**, the point is returned as given.", "schema": { "type": "number", "minimumExclusive": 0 @@ -54,5 +54,10 @@ } ] } + }, + "exceptions": { + "UnitMismatch": { + "message": "The unit of the spatial reference system is not meters, but the given distance is in meters." + } } } From 598f4ca04c2b97dcb6da960d3fda8f4bdf5f2f0c Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 4 May 2023 17:52:14 +0200 Subject: [PATCH 098/117] Update version numbers to v2.0.0-rc.1 (#438) --- CHANGELOG.md | 5 ++++- README.md | 5 +++-- meta/subtype-schemas.json | 2 +- 3 files changed, 8 insertions(+), 4 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 0eb78e8a..77d2916a 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft +## [2.0.0-rc.1] - 2023-05-31 + ### Added - New processes in proposal state: @@ -363,7 +365,8 @@ First version which is separated from the openEO API. Complete rework of all pro Older versions of the processes were released as part of the openEO API, see the corresponding changelog for more information. -[Unreleased]: +[Unreleased]: +[2.0.0-rc.1]: [1.2.0]: [1.1.0]: [1.0.0]: diff --git a/README.md b/README.md index 91d2096d..24c28899 100644 --- a/README.md +++ b/README.md @@ -8,12 +8,13 @@ openEO develops interoperable processes for big Earth observation cloud processi The [master branch](https://github.com/Open-EO/openeo-processes/tree/master) is the 'stable' version of the openEO processes specification. An exception is the [`proposals`](proposals/) folder, which provides experimental new processes currently under discussion. They may still change, but everyone is encouraged to implement them and give feedback. -The latest release is version **1.2.0**. The [draft branch](https://github.com/Open-EO/openeo-processes/tree/draft) is where active development takes place. PRs should be made against the draft branch. +The latest release is version **2.0.0-rc.1**. The [draft branch](https://github.com/Open-EO/openeo-processes/tree/draft) is where active development takes place. PRs should be made against the draft branch. 
| Version / Branch | Status | openEO API versions |
| ------------------------------------------------------------ | ------------------------- | ------------------- |
| [unreleased / draft](https://processes.openeo.org/draft) | in development | 1.x.x |
-| [**1.2.0** / master](https://processes.openeo.org/1.2.0/) | **latest stable version** | 1.x.x |
+| [**2.0.0 RC1** / master](https://processes.openeo.org/2.0.0-rc.1/) | **upcoming version (RC)** | 1.x.x |
+| [1.2.0](https://processes.openeo.org/1.2.0/) | **latest stable version** | 1.x.x |
 | [1.1.0](https://processes.openeo.org/1.1.0/) | legacy version | 1.x.x |
 | [1.0.0](https://processes.openeo.org/1.0.0/) | legacy version | 1.x.x |
 | [1.0.0 RC1](https://processes.openeo.org/1.0.0-rc.1/) | legacy version | 1.x.x |
diff --git a/meta/subtype-schemas.json b/meta/subtype-schemas.json
index 1a49bf64..347df234 100644
--- a/meta/subtype-schemas.json
+++ b/meta/subtype-schemas.json
@@ -1,6 +1,6 @@
 {
     "$schema": "http://json-schema.org/draft-07/schema#",
-    "$id": "https://processes.openeo.org/1.2.0/meta/subtype-schemas.json",
+    "$id": "https://processes.openeo.org/2.0.0-rc.1/meta/subtype-schemas.json",
     "title": "Subtype Schemas",
     "description": "This file defines the schemas for subtypes we define for openEO processes.",
     "definitions": {

From da815328222860d64facc9da74c1054e267d5b59 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Thu, 4 May 2023 18:46:04 +0200
Subject: [PATCH 099/117] Update changelog

---
 CHANGELOG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 77d2916a..db092021 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -22,7 +22,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `vector_reproject`
 - `vector_to_random_points`
 - `vector_to_regular_points`
-- `add_dimension`: Added new dimension type `geometries`. [#68](https://github.com/Open-EO/openeo-processes/issues/68)
+- `add_dimension`: Added new dimension type `geometry`. [#68](https://github.com/Open-EO/openeo-processes/issues/68)

 ### Changed

From fa2da6ca23987ddf455d058be7cd5a5fbaefe2f3 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 5 May 2023 13:40:02 +0200
Subject: [PATCH 100/117] Small wording improvements from @neteler

---
 proposals/vector_buffer.json | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/proposals/vector_buffer.json b/proposals/vector_buffer.json
index 05be9a58..e81e9a76 100644
--- a/proposals/vector_buffer.json
+++ b/proposals/vector_buffer.json
@@ -22,7 +22,7 @@
     },
     {
       "name": "distance",
-      "description": "The distance of the buffer in meters. If the unit of the spatial reference system is not meters, a `UnitMismatch` exception is thrown. Use ``vector_reproject()`` to convert the geometries to a suitable spatial reference system.\n\nA positive distance expands the geometries and results in outward buffering (dilation) while a negative distance shrinks the geometries and results in inward buffering (erosion).",
+      "description": "The distance of the buffer in meters. A positive distance expands the geometries, resulting in outward buffering (dilation), while a negative distance shrinks the geometries, resulting in inward buffering (erosion).\n\nIf the unit of the spatial reference system is not meters, a `UnitMismatch` exception is thrown. 
Use ``vector_reproject()`` to convert the geometries to a suitable spatial reference system.", "schema": { "type": "number", "not": { From def8aac03b125d3af1cef376c76257c617420128 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 8 May 2023 18:13:59 +0200 Subject: [PATCH 101/117] Improve wording --- filter_bbox.json | 2 +- proposals/filter_vector.json | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/filter_bbox.json b/filter_bbox.json index 5eea34ed..3e2a7485 100644 --- a/filter_bbox.json +++ b/filter_bbox.json @@ -1,7 +1,7 @@ { "id": "filter_bbox", "summary": "Spatial filter using a bounding box", - "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). All geometries that were empty or get empty will be removed from the data cube. Alternatively, ``filter_vector()`` can be used to filter by geometry.", + "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). All geometries that were empty or not contained fully within the bounding box will be removed from the data cube.\n\nAlternatively, ``filter_vector()`` can be used to filter by geometry.", "categories": [ "cubes", "filter" diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json index 4560848e..349f8d0f 100644 --- a/proposals/filter_vector.json +++ b/proposals/filter_vector.json @@ -1,7 +1,7 @@ { "id": "filter_vector", "summary": "Spatial vector filter using geometries", - "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. All geometries that were empty or get empty will be removed from the data cube. Alternatively, use ``filter_bbox()`` to filter by bounding box.", + "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. All geometries that were empty or become empty will be removed from the data cube. 
Alternatively, use ``filter_bbox()`` to filter by bounding box.", "categories": [ "cubes", "filter", From 093723a0323d4a6037b81bc65d1e13874e1ea499 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 25 May 2023 13:18:20 +0200 Subject: [PATCH 102/117] Update release date --- CHANGELOG.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index db092021..407447dc 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,7 +6,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft -## [2.0.0-rc.1] - 2023-05-31 +## [2.0.0-rc.1] - 2023-05-25 ### Added From 7b558aca5f2f13855c35fbbddaa8205fbdeb7d4d Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 25 May 2023 13:30:59 +0200 Subject: [PATCH 103/117] clean-up --- .github/workflows/docs.yml | 1 - tests/docs.html | 4 ++-- tests/package.json | 2 +- 3 files changed, 3 insertions(+), 4 deletions(-) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 4e8f4f28..29baf6db 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -5,7 +5,6 @@ on: push: branches: - draft - - draft-2.0 - master jobs: deploy: diff --git a/tests/docs.html b/tests/docs.html index 23ef98e4..04b1c192 100644 --- a/tests/docs.html +++ b/tests/docs.html @@ -113,8 +113,8 @@ props: { document: 'processes.json', categorize: true, - apiVersion: '1.1.0', - title: 'openEO processes (draft)', + apiVersion: '1.2.0', + title: 'openEO processes (2.0.0-rc.1)', notice: '**Note:** This is the list of all processes specified by the openEO project. Back-ends implement a varying set of processes. Thus, the processes you can use at a specific back-end may derive from the specification, may include non-standardized processes and may not implement all processes listed here. Please check each back-end individually for the processes they support. The client libraries usually have a function called `listProcesses` or `list_processes` for that.' } }) diff --git a/tests/package.json b/tests/package.json index 15f6d2e8..1da8693f 100644 --- a/tests/package.json +++ b/tests/package.json @@ -1,6 +1,6 @@ { "name": "@openeo/processes", - "version": "2.0.0", + "version": "2.0.0-rc.1", "author": "openEO Consortium", "contributors": [ { From 7a293786b4c464d2c7452d7b5f13104098a25232 Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Mon, 31 Jul 2023 13:52:05 +0200 Subject: [PATCH 104/117] aggregate_spatial_window typo fix (#446) --- proposals/aggregate_spatial_window.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/proposals/aggregate_spatial_window.json b/proposals/aggregate_spatial_window.json index 747db151..9e5dce4a 100644 --- a/proposals/aggregate_spatial_window.json +++ b/proposals/aggregate_spatial_window.json @@ -75,7 +75,7 @@ }, { "name": "boundary", - "description": "Behavior to apply if the number of values for the axes `x` and `y` is not a multiple of the corresponding value in the `size` parameter. Options are:\n\n- `pad` (default): pad the data cube with the no-data value `null` to fit the required window size.\n\n- `trim`: trim the data cube to fit the required window size.\n\nSet the parameter `align` to specifies to which corner the data is aligned to.", + "description": "Behavior to apply if the number of values for the axes `x` and `y` is not a multiple of the corresponding value in the `size` parameter. 
Options are:\n\n- `pad` (default): pad the data cube with the no-data value `null` to fit the required window size.\n\n- `trim`: trim the data cube to fit the required window size.\n\nUse the parameter `align` to align the data to the desired corner.", "schema": { "type": "string", "enum": [ From 0833d4ebab71b16a09658c21e7278e40688b3bf1 Mon Sep 17 00:00:00 2001 From: Stefaan Lippens Date: Sat, 30 Sep 2023 09:21:44 +0200 Subject: [PATCH 105/117] Issue #460 doc crossreferences between filter_bbox/filter_spatial/filter_vector (#462) --- filter_bbox.json | 2 +- filter_spatial.json | 2 +- proposals/filter_vector.json | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/filter_bbox.json b/filter_bbox.json index 3e2a7485..e95d5fa2 100644 --- a/filter_bbox.json +++ b/filter_bbox.json @@ -1,7 +1,7 @@ { "id": "filter_bbox", "summary": "Spatial filter using a bounding box", - "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). All geometries that were empty or not contained fully within the bounding box will be removed from the data cube.\n\nAlternatively, ``filter_vector()`` can be used to filter by geometry.", + "description": "Limits the data cube to the specified bounding box.\n\n* For raster data cubes, the filter retains a pixel in the data cube if the point at the pixel center intersects with the bounding box (as defined in the Simple Features standard by the OGC). Alternatively, ``filter_spatial()`` can be used to filter by geometry.\n* For vector data cubes, the filter retains the geometry in the data cube if the geometry is fully within the bounding box (as defined in the Simple Features standard by the OGC). All geometries that were empty or not contained fully within the bounding box will be removed from the data cube.\n\nAlternatively, filter spatially with geometries using ``filter_spatial()`` (on a raster data cube) or ``filter_vector()`` (on a vector data cube).", "categories": [ "cubes", "filter" diff --git a/filter_spatial.json b/filter_spatial.json index c0c116cd..f2648b05 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -1,7 +1,7 @@ { "id": "filter_spatial", "summary": "Spatial filter raster data cubes using geometries", - "description": "Limits the raster data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. 
All pixels inside the bounding box that are not retained will be set to `null` (no data).\n\n Alternatively, use ``filter_bbox()`` to filter by bounding box.", + "description": "Limits the raster data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to `null` (no data).\n\n Alternatively, use ``filter_bbox()`` to filter with a bounding box or ``filter_vector()`` to filter a vector data cube based on geometries.", "categories": [ "cubes", "filter" diff --git a/proposals/filter_vector.json b/proposals/filter_vector.json index 349f8d0f..1bb33c86 100644 --- a/proposals/filter_vector.json +++ b/proposals/filter_vector.json @@ -1,7 +1,7 @@ { "id": "filter_vector", "summary": "Spatial vector filter using geometries", - "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. All geometries that were empty or become empty will be removed from the data cube. Alternatively, use ``filter_bbox()`` to filter by bounding box.", + "description": "Limits the vector data cube to the specified geometries. The process works on geometries as defined in the Simple Features standard by the OGC. All geometries that were empty or become empty will be removed from the data cube. Alternatively, use ``filter_bbox()`` to filter with a bounding box or ``filter_spatial()`` to filter a raster data cube based on geometries.", "categories": [ "cubes", "filter", From c130dd732c912624dea89c8eb15134c1378bd9e7 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Thu, 12 Oct 2023 11:24:39 +0200 Subject: [PATCH 106/117] Move tests to dev --- .github/workflows/docs.yml | 6 +++--- .github/workflows/tests.yml | 6 +++--- README.md | 2 +- {tests => dev}/.gitignore | 0 {tests => dev}/.words | 0 {tests => dev}/README.md | 0 {tests => dev}/docs.html | 0 {tests => dev}/package.json | 0 {tests => dev}/testConfig.json | 0 9 files changed, 7 insertions(+), 7 deletions(-) rename {tests => dev}/.gitignore (100%) rename {tests => dev}/.words (100%) rename {tests => dev}/README.md (100%) rename {tests => dev}/docs.html (100%) rename {tests => dev}/package.json (100%) rename {tests => dev}/testConfig.json (100%) diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml index 29baf6db..b09f6e3d 100644 --- a/.github/workflows/docs.yml +++ b/.github/workflows/docs.yml @@ -19,7 +19,7 @@ jobs: - run: | npm install npm run generate - working-directory: tests + working-directory: dev - name: clone gh-pages and clean-up if: ${{ env.GITHUB_REF_SLUG == 'master' }} run: | @@ -31,8 +31,8 @@ jobs: if: ${{ env.GITHUB_REF_SLUG != 'master' }} run: mkdir gh-pages - run: | - cp tests/docs.html index.html - cp tests/processes.json processes.json + cp dev/docs.html index.html + cp dev/processes.json processes.json rsync -vrm --include='*.json' --include='*.html' --include='meta/***' --include='proposals/***' --exclude='*' . 
gh-pages - name: deploy to root (master) uses: peaceiris/actions-gh-pages@v3 diff --git a/.github/workflows/tests.yml b/.github/workflows/tests.yml index b108eb18..25659365 100644 --- a/.github/workflows/tests.yml +++ b/.github/workflows/tests.yml @@ -8,8 +8,8 @@ jobs: with: node-version: 'lts/*' - uses: actions/checkout@v3 - - name: Run tests + - name: Run linter run: | npm install - npm run test - working-directory: tests \ No newline at end of file + npm test + working-directory: dev \ No newline at end of file diff --git a/README.md b/README.md index 24c28899..621276b6 100644 --- a/README.md +++ b/README.md @@ -35,7 +35,7 @@ This repository contains a set of files formally describing the openEO Processes * [implementation.md](meta/implementation.md) in the `meta` folder provide some additional implementation details for back-ends. For back-end implementors, it's highly recommended to read them. * [subtype-schemas.json](meta/subtype-schemas.json) in the `meta` folder defines common data types (`subtype`s) for JSON Schema used in openEO processes. * Previously, an `examples` folder contained examples of user-defined processes. These have been migrated to the [openEO Community Examples](https://github.com/Open-EO/openeo-community-examples/tree/main/processes) repository. -* The [`tests`](tests/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. Check the [tests documentation](tests/README.md) for details. +* The [`dev`](dev/) folder can be used to test the process specification for validity and consistent "style". It also allows rendering the processes in a web browser. Check the [development documentation](dev/README.md) for details. ## Process diff --git a/tests/.gitignore b/dev/.gitignore similarity index 100% rename from tests/.gitignore rename to dev/.gitignore diff --git a/tests/.words b/dev/.words similarity index 100% rename from tests/.words rename to dev/.words diff --git a/tests/README.md b/dev/README.md similarity index 100% rename from tests/README.md rename to dev/README.md diff --git a/tests/docs.html b/dev/docs.html similarity index 100% rename from tests/docs.html rename to dev/docs.html diff --git a/tests/package.json b/dev/package.json similarity index 100% rename from tests/package.json rename to dev/package.json diff --git a/tests/testConfig.json b/dev/testConfig.json similarity index 100% rename from tests/testConfig.json rename to dev/testConfig.json From c2d77e2e5fea0cafc6576d10a97d472aa6616ffb Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Wed, 25 Oct 2023 14:51:17 +0200 Subject: [PATCH 107/117] Use x \ y instead of a \ b --- and.json | 4 ++-- or.json | 4 ++-- xor.json | 4 ++-- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/and.json b/and.json index c24ce95b..3b28c8ef 100644 --- a/and.json +++ b/and.json @@ -1,7 +1,7 @@ { "id": "and", "summary": "Logical AND", - "description": "Checks if **both** values are true.\n\nEvaluates parameter `x` before `y` and stops once the outcome is unambiguous. If any argument is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\na \\ b || null | false | true\n----- || ----- | ----- | -----\nnull || null | false | null\nfalse || false | false | false\ntrue || null | false | true\n```", + "description": "Checks if **both** values are true.\n\nEvaluates parameter `x` before `y` and stops once the outcome is unambiguous. 
If any argument is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\nx \\ y || null | false | true\n----- || ----- | ----- | -----\nnull || null | false | null\nfalse || false | false | false\ntrue || null | false | true\n```", "categories": [ "logic" ], @@ -90,4 +90,4 @@ "result": true } } -} \ No newline at end of file +} diff --git a/or.json b/or.json index 5964a341..4a83a63e 100644 --- a/or.json +++ b/or.json @@ -1,7 +1,7 @@ { "id": "or", "summary": "Logical OR", - "description": "Checks if **at least one** of the values is true. Evaluates parameter `x` before `y` and stops once the outcome is unambiguous. If a component is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\na \\ b || null | false | true\n----- || ---- | ----- | ----\nnull || null | null | true\nfalse || null | false | true\ntrue || true | true | true\n```", + "description": "Checks if **at least one** of the values is true. Evaluates parameter `x` before `y` and stops once the outcome is unambiguous. If a component is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\nx \\ y || null | false | true\n----- || ---- | ----- | ----\nnull || null | null | true\nfalse || null | false | true\ntrue || true | true | true\n```", "categories": [ "logic" ], @@ -90,4 +90,4 @@ "result": true } } -} \ No newline at end of file +} diff --git a/xor.json b/xor.json index d8dbde50..6af7ae5e 100644 --- a/xor.json +++ b/xor.json @@ -1,7 +1,7 @@ { "id": "xor", "summary": "Logical XOR (exclusive or)", - "description": "Checks if **exactly one** of the values is true. If a component is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\na \\ b || null | false | true\n----- || ---- | ----- | -----\nnull || null | null | null\nfalse || null | false | true\ntrue || null | true | false\n```", + "description": "Checks if **exactly one** of the values is true. If a component is `null`, the result will be `null` if the outcome is ambiguous.\n\n**Truth table:**\n\n```\nx \\ y || null | false | true\n----- || ---- | ----- | -----\nnull || null | null | null\nfalse || null | false | true\ntrue || null | true | false\n```", "categories": [ "logic" ], @@ -125,4 +125,4 @@ "result": true } } -} \ No newline at end of file +} From 13c3f85696d6f142f68694ba4726337b1fecca28 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 27 Oct 2023 11:07:14 +0200 Subject: [PATCH 108/117] `sqrt`: Clarified that NaN is returned for negative numbers #474 (#475) --- CHANGELOG.md | 4 ++++ sqrt.json | 9 +++++++-- 2 files changed, 11 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 407447dc..78b16e73 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -6,6 +6,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## Unreleased / Draft +### Fixed + +- `sqrt`: Clarified that NaN is returned for negative numbers. + ## [2.0.0-rc.1] - 2023-05-25 ### Added diff --git a/sqrt.json b/sqrt.json index bc1aeb6c..b85caf94 100644 --- a/sqrt.json +++ b/sqrt.json @@ -1,7 +1,7 @@ { "id": "sqrt", "summary": "Square root", - "description": "Computes the square root of a real number `x`, which is equal to calculating `x` to the power of *0.5*.\n\nA square root of x is a number a such that *`a² = x`*. 
Therefore, the square root is the inverse function of a to the power of 2, but only for *a >= 0*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the square root of a real number `x`, which is equal to calculating `x` to the power of *0.5*. For negative `x`, the process returns `NaN`.\n\nA square root of x is a number a such that *`a² = x`*. Therefore, the square root is the inverse function of a to the power of 2, but only for *a >= 0*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
   "categories": [
     "math",
     "math > exponential & logarithmic"
@@ -58,6 +58,11 @@
       "rel": "about",
       "href": "http://mathworld.wolfram.com/SquareRoot.html",
       "title": "Square root explained by Wolfram MathWorld"
+    },
+    {
+      "rel": "about",
+      "href": "https://ieeexplore.ieee.org/document/8766229",
+      "title": "IEEE Standard 754-2019 for Floating-Point Arithmetic"
     }
   ],
   "process_graph": {
@@ -72,4 +77,4 @@
     "result": true
   }
 }
-}
\ No newline at end of file
+}

From 4fd92b217ec84bfb27273c2d1acdb8418574b7b7 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 27 Oct 2023 11:08:00 +0200
Subject: [PATCH 109/117] `clip`: Throw an exception if min > max #472 (#477)

---
 CHANGELOG.md |  4 ++++
 clip.json    | 40 ++++++++--------------------------------
 2 files changed, 12 insertions(+), 32 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 78b16e73..281ab2d8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## Unreleased / Draft

+### Changed
+
+- `clip`: Throw an exception if min > max [#472](https://github.com/Open-EO/openeo-processes/issues/472)
+
 ### Fixed

 - `sqrt`: Clarified that NaN is returned for negative numbers.

diff --git a/clip.json b/clip.json
index adbf7eaa..de2a4d1a 100644
--- a/clip.json
+++ b/clip.json
@@ -1,7 +1,7 @@
 {
   "id": "clip",
   "summary": "Clip a value between a minimum and a maximum",
-  "description": "Clips a number between specified minimum and maximum values. A value larger than the maximum value is set to the maximum value, a value lower than the minimum value is set to the minimum value.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Clips a number between specified minimum and maximum values. A value larger than the maximum value is set to the maximum value, a value lower than the minimum value is set to the minimum value. If the maximum value is smaller than the minimum value, the process throws a `MinMaxSwapped` exception.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
   "categories": [
     "math"
   ],
@@ -40,6 +40,11 @@
       ]
     }
   },
+  "exceptions": {
+    "MinMaxSwapped": {
+      "message": "The minimum value should be lower than or equal to the maximum value."
+    }
+  },
   "examples": [
     {
       "arguments": {
@@ -73,34 +78,5 @@
     },
     "returns": null
   }
-  ],
-  "process_graph": {
-    "min": {
-      "process_id": "min",
-      "arguments": {
-        "data": [
-          {
-            "from_parameter": "max"
-          },
-          {
-            "from_parameter": "x"
-          }
-        ]
-      }
-    },
-    "max": {
-      "process_id": "max",
-      "arguments": {
-        "data": [
-          {
-            "from_parameter": "min"
-          },
-          {
-            "from_node": "min"
-          }
-        ]
-      },
-      "result": true
-    }
-  }
-}
\ No newline at end of file
+  ]
+}
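The `clip` contract introduced above is easy to get subtly wrong, so a minimal sketch may help. This is illustrative Python only; the function shape and the use of a plain exception are assumptions, not part of the specification:

```python
def clip(x, min_value, max_value):
    # The revised spec requires an exception if the bounds are swapped.
    if max_value < min_value:
        raise ValueError("MinMaxSwapped: The minimum value should be "
                         "lower than or equal to the maximum value.")
    # The no-data value null (None here) is passed through.
    if x is None:
        return None
    return min_value if x < min_value else (max_value if x > max_value else x)
```

For example, `clip(-5, -1, 1)` returns `-1`, `clip(None, 0, 1)` returns `None`, and `clip(0, 1, -1)` raises.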
From ab4a62eb3fc3eac0c66a51f5670aa131e52c0745 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 27 Oct 2023 12:24:32 +0200
Subject: [PATCH 110/117] `array_append`: Added `number` type for labels to be consistent with other processes. Default to numerical index instead of string. (#478)

---
 CHANGELOG.md      |  4 ++++
 array_append.json | 22 +++++++++++++++-------
 2 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 281ab2d8..0319d15d 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -16,6 +16,10 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ## [2.0.0-rc.1] - 2023-05-25

+### Fixed
+
+- `array_append`: Added `number` type for labels to be consistent with other processes. Default to numerical index instead of string. Clarify that the `label` parameter only applies to labeled arrays.
+
 ### Added

diff --git a/array_append.json b/array_append.json
index 80b48d12..f09145d2 100644
--- a/array_append.json
+++ b/array_append.json
@@ -25,15 +25,20 @@
     },
     {
       "name": "label",
-      "description": "If the given array is a labeled array, a new label for the new value should be given. If not given or `null`, the array index as string is used as the label. If in any case the label exists, a `LabelExists` exception is thrown.",
+      "description": "Provides a label for the new value. If not given or `null`, the natural next array index as number is used as the label. If in any case the label exists, a `LabelExists` exception is thrown.\n\nThis parameter only applies if the given array is a labeled array. If a non-null value is provided and the array is not labeled, an `ArrayNotLabeled` exception is thrown.",
       "optional": true,
       "default": null,
-      "schema": {
-        "type": [
-          "string",
-          "null"
-        ]
-      }
+      "schema": [
+        {
+          "type": "number"
+        },
+        {
+          "type": "string"
+        },
+        {
+          "type": "null"
+        }
+      ]
     }
   ],
   "returns": {
@@ -48,6 +53,9 @@
   "exceptions": {
     "LabelExists": {
       "message": "An array element with the specified label already exists."
+    },
+    "ArrayNotLabeled": {
+      "message": "A label can't be provided as the given array is not labeled."
     }
   },
   "examples": [

From f303adfce42cba91cd436934766cfa82dddd7b53 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 27 Oct 2023 12:29:23 +0200
Subject: [PATCH 111/117] `filter_spatial`: Clarify masking (#470)

* `filter_spatial`: Clarified that masking gets applied for the given geometries. #469

* `filter_bbox`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required.

---------

Co-authored-by: Stefaan Lippens
---
 CHANGELOG.md             | 2 ++
 dev/.words               | 1 +
 filter_bbox.json         | 2 +-
 filter_spatial.json      | 4 ++--
 load_collection.json     | 2 +-
 proposals/load_stac.json | 2 +-
 6 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 0319d15d..f3cf3bc8 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Fixed

+- `filter_bbox`, `load_collection`, `load_stac`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required.
+- `filter_spatial`: Clarified that masking is applied using the given geometries. [#469](https://github.com/Open-EO/openeo-processes/issues/469)
 - `sqrt`: Clarified that NaN is returned for negative numbers.
## [2.0.0-rc.1] - 2023-05-25 diff --git a/dev/.words b/dev/.words index a50285ba..846e729a 100644 --- a/dev/.words +++ b/dev/.words @@ -23,6 +23,7 @@ orthorectified radiometrically reflectances reproject +reprojected Reprojects resample resampled diff --git a/filter_bbox.json b/filter_bbox.json index e95d5fa2..b7335847 100644 --- a/filter_bbox.json +++ b/filter_bbox.json @@ -39,7 +39,7 @@ }, { "name": "extent", - "description": "A bounding box, which may include a vertical axis (see `base` and `height`).", + "description": "A bounding box, which may include a vertical axis (see `base` and `height`).\n\nIf the bounding box is not provided in the coordinate reference system (CRS) of the data cube, the bounding box is reprojected to the CRS of the spatial data cube dimensions.", "schema": { "type": "object", "subtype": "bounding-box", diff --git a/filter_spatial.json b/filter_spatial.json index f2648b05..ed4f7c3f 100644 --- a/filter_spatial.json +++ b/filter_spatial.json @@ -1,7 +1,7 @@ { "id": "filter_spatial", "summary": "Spatial filter raster data cubes using geometries", - "description": "Limits the raster data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometry will not be available after filtering. All pixels inside the bounding box that are not retained will be set to `null` (no data).\n\n Alternatively, use ``filter_bbox()`` to filter with a bounding box or ``filter_vector()`` to filter a vector data cube based on geometries.", + "description": "Limits the raster data cube over the spatial dimensions to the specified geometries.\n\n- For **polygons**, the filter retains a pixel in the data cube if the point at the pixel center intersects with at least one of the polygons (as defined in the Simple Features standard by the OGC).\n- For **points**, the process considers the closest pixel center.\n- For **lines** (line strings), the process considers all the pixels whose centers are closest to at least one point on the line.\n\nMore specifically, pixels outside of the bounding box of the given geometries will not be available after filtering. All pixels inside the bounding box that are not retained will be set to `null` (no data).\n\n Alternatively, use ``filter_bbox()`` to filter with a bounding box or ``filter_vector()`` to filter a vector data cube based on geometries. Use ``mask_polygon()`` to mask without changing the spatial extent of your data cube.", "categories": [ "cubes", "filter" @@ -26,7 +26,7 @@ }, { "name": "geometries", - "description": "One or more geometries used for filtering, given as GeoJSON or vector data cube. If multiple geometries are provided, the union of them is used. Empty geometries are ignored.\n\nLimits the data cube to the bounding box of the given geometries. No implicit masking gets applied. 
To mask the pixels of the data cube use ``mask_polygon()``.",
+      "description": "One or more geometries used for spatial filtering and masking, given as GeoJSON or vector data cube.",
       "schema": [
         {
           "title": "Vector Data Cube",
diff --git a/load_collection.json b/load_collection.json
index b93c879c..a6701cc3 100644
--- a/load_collection.json
+++ b/load_collection.json
@@ -64,7 +64,7 @@
           "default": null
         },
         "crs": {
-          "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+          "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system. If the bounding box is not provided in the coordinate reference system (CRS) of the data cube, the bounding box is reprojected to the CRS of the spatial data cube dimensions.",
           "anyOf": [
             {
               "title": "EPSG Code",
diff --git a/proposals/load_stac.json b/proposals/load_stac.json
index c71d3a80..262745fc 100644
--- a/proposals/load_stac.json
+++ b/proposals/load_stac.json
@@ -67,7 +67,7 @@
           "default": null
         },
         "crs": {
-          "description": "Coordinate reference system of the extent, specified as as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system.",
+          "description": "Coordinate reference system of the extent, specified as [EPSG code](http://www.epsg-registry.org/) or [WKT2 CRS string](http://docs.opengeospatial.org/is/18-010r7/18-010r7.html). Defaults to `4326` (EPSG code 4326) unless the client explicitly requests a different coordinate reference system. If the bounding box is not provided in the coordinate reference system (CRS) of the data cube, the bounding box is reprojected to the CRS of the spatial data cube dimensions.",
           "anyOf": [
             {
               "title": "EPSG Code",

From d8cf96a68f421a02f2e704ae6080a87841f206e1 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Mon, 30 Oct 2023 17:02:00 +0100
Subject: [PATCH 112/117] `between`: Clarify that `null` is passed through

---
 CHANGELOG.md | 1 +
 between.json | 4 ++--
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f3cf3bc8..87dbf987 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -12,6 +12,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

 ### Fixed

+- `between`: Clarify that `null` is passed through.
 - `filter_bbox`, `load_collection`, `load_stac`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required.
 - `filter_spatial`: Clarified that masking is applied using the given geometries. [#469](https://github.com/Open-EO/openeo-processes/issues/469)
 - `sqrt`: Clarified that NaN is returned for negative numbers.
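Before the `between` diff that follows, a short sketch of the clarified semantics. Python is used purely for illustration; the parameter names mirror the spec (`min`, `max`, `exclude_max`) but are renamed here to avoid shadowing Python built-ins:

```python
def between(x, min_value, max_value, exclude_max=False):
    # The clarified behavior: the no-data value null propagates
    # instead of silently turning into false.
    if x is None:
        return None
    if exclude_max:
        return min_value <= x < max_value
    return min_value <= x <= max_value
```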
diff --git a/between.json b/between.json index b2e59b92..12e37693 100644 --- a/between.json +++ b/between.json @@ -8,7 +8,7 @@ "parameters": [ { "name": "x", - "description": "The value to check.", + "description": "The value to check.\n\nThe no-data value `null` is passed through and therefore gets propagated.", "schema": { "description": "Any data type is allowed." } @@ -38,7 +38,7 @@ } ], "returns": { - "description": "`true` if `x` is between the specified bounds, otherwise `false`.", + "description": "`true` if `x` is between the specified bounds, `null` if `x` is a no-data value, `false` otherwise.", "schema": { "type": [ "boolean", From 899b8249911fc6f978661bfc63442696bfaeb6f8 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Mon, 30 Oct 2023 18:26:09 +0100 Subject: [PATCH 113/117] `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter. --- CHANGELOG.md | 1 + eq.json | 3 ++- neq.json | 3 ++- 3 files changed, 5 insertions(+), 2 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index 87dbf987..c2a4f1d7 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -13,6 +13,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed - `between`: Clarify that `null` is passed through. +- `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter. - `filter_bbox`, `load_collection`, `load_stac`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required. - `filter_spatial`: Clarified that masking is applied using the given geometries. [#469](https://github.com/Open-EO/openeo-processes/issues/469) - `sqrt`: Clarified that NaN is returned for negative numbers. diff --git a/eq.json b/eq.json index e7712399..0c62b42c 100644 --- a/eq.json +++ b/eq.json @@ -38,7 +38,8 @@ "type": [ "number", "null" - ] + ], + "minimumExclusive": 0 }, "default": null, "optional": true diff --git a/neq.json b/neq.json index ff6bc9fd..0e22b347 100644 --- a/neq.json +++ b/neq.json @@ -38,7 +38,8 @@ "type": [ "number", "null" - ] + ], + "minimumExclusive": 0 }, "default": null, "optional": true From ab2e6c233a073a1ac23abc004216794127b32c2c Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Tue, 31 Oct 2023 16:10:18 +0100 Subject: [PATCH 114/117] Clarify linear_scale_range --- linear_scale_range.json | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/linear_scale_range.json b/linear_scale_range.json index 172027c9..01f09857 100644 --- a/linear_scale_range.json +++ b/linear_scale_range.json @@ -1,7 +1,7 @@ { "id": "linear_scale_range", "summary": "Linear transformation between two ranges", - "description": "Performs a linear transformation between the input and output range.\n\nThe given number in `x` is clipped to the bounds specified in `inputMin` and `inputMax` so that the underlying formula *`((x - inputMin) / (inputMax - inputMin)) * (outputMax - outputMin) + outputMin`* never returns any value lower than `outputMin` or greater than `outputMax`.\n\nPotential use case include\n\n* scaling values to the 8-bit range (0 - 255) often used for numeric representation of values in one of the channels of the [RGB colour model](https://en.wikipedia.org/wiki/RGB_color_model#Numeric_representations) or\n* calculating percentages (0 - 100).\n\nThe no-data value `null` is passed through and therefore gets propagated.", + "description": "Performs a linear transformation between the input and output range.\n\nThe given number in `x` is clipped to the bounds specified in `inputMin` and `inputMax` so 
that the underlying formula *`((x - inputMin) / (inputMax - inputMin)) * (outputMax - outputMin) + outputMin`* never returns a value outside of the range defined by `outputMin` and `outputMax`.\n\nPotential use case include\n\n* scaling values to the 8-bit range (0 - 255) often used for numeric representation of values in one of the channels of the [RGB colour model](https://en.wikipedia.org/wiki/RGB_color_model#Numeric_representations) or\n* calculating percentages (0 - 100).\n\nThe no-data value `null` is passed through and therefore gets propagated.", "categories": [ "math" ], @@ -166,4 +166,4 @@ "result": true } } -} \ No newline at end of file +} From ad8a2f3551fe9d69c1689e310bf5307e20d86113 Mon Sep 17 00:00:00 2001 From: Matthias Mohr Date: Fri, 8 Dec 2023 11:40:09 +0100 Subject: [PATCH 115/117] Math functions: Clarified value ranges and NaN (#476) * divide, ln, log, mod: Clarified behavior for 0 input / infinity results * Trigonometric functions: Clarified that NaN is returned outside of their defined ranges and the output value range for some processes * Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible. * Remove NaN --- CHANGELOG.md | 3 +++ absolute.json | 4 ++-- arccos.json | 15 +++++++++------ arcosh.json | 14 ++++++++------ arcsin.json | 12 +++++++----- arctan.json | 4 ++-- arsinh.json | 6 +++--- artanh.json | 17 ++++++++++++----- cos.json | 8 +++++--- cosh.json | 11 ++++++----- divide.json | 4 ++-- exp.json | 7 ++++--- ln.json | 9 +++++---- log.json | 9 +++++---- mod.json | 4 ++-- sgn.json | 8 +++++++- sin.json | 8 +++++--- sinh.json | 8 ++++---- tan.json | 11 ++++++++--- tanh.json | 12 +++++++----- 20 files changed, 106 insertions(+), 68 deletions(-) diff --git a/CHANGELOG.md b/CHANGELOG.md index c2a4f1d7..f70afe7b 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -12,10 +12,13 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ### Fixed +- Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible. +- `divide`: Clarified behavior for division by 0 - `between`: Clarify that `null` is passed through. - `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter. - `filter_bbox`, `load_collection`, `load_stac`: Clarified that the bounding box is reprojected to the CRS of the spatial data cube dimensions if required. - `filter_spatial`: Clarified that masking is applied using the given geometries. [#469](https://github.com/Open-EO/openeo-processes/issues/469) +- `mod`: Clarified behavior for y = 0 - `sqrt`: Clarified that NaN is returned for negative numbers. 
 ## [2.0.0-rc.1] - 2023-05-25

diff --git a/absolute.json b/absolute.json
index 3b6e91dc..c6a3713d 100644
--- a/absolute.json
+++ b/absolute.json
@@ -1,7 +1,7 @@
 {
   "id": "absolute",
   "summary": "Absolute value",
-  "description": "Computes the absolute value of a real number `x`, which is the \"unsigned\" portion of x and often denoted as *|x|*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the absolute value of a real number `x`, which is the \"unsigned\" portion of `x` and often denoted as *|x|*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
   "categories": [
     "math"
   ],
@@ -95,4 +95,4 @@
     "result": true
   }
 }
-}
\ No newline at end of file
+}
diff --git a/arccos.json b/arccos.json
index 5ffbce35..4cd498a7 100644
--- a/arccos.json
+++ b/arccos.json
@@ -1,29 +1,32 @@
 {
   "id": "arccos",
   "summary": "Inverse cosine",
-  "description": "Computes the arc cosine of `x`. The arc cosine is the inverse function of the cosine so that *`arccos(cos(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the arc cosine of `x`. The arc cosine is the inverse function of the cosine so that *`arccos(cos(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values outside of the allowed range.",
   "categories": [
     "math > trigonometric"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "A number.",
+      "description": "A number in the range *[-1, 1]*.",
       "schema": {
         "type": [
           "number",
           "null"
-        ]
+        ],
+        "minimum": -1,
+        "maximum": 1
       }
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed angle in radians in the range *[0, π]*.",
     "schema": {
       "type": [
         "number",
         "null"
-      ]
+      ],
+      "minimum": 0
     }
   },
   "examples": [
@@ -41,4 +44,4 @@
     "title": "Inverse cosine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/arcosh.json b/arcosh.json
index 6ed581fe..820b8cd4 100644
--- a/arcosh.json
+++ b/arcosh.json
@@ -1,29 +1,31 @@
 {
   "id": "arcosh",
   "summary": "Inverse hyperbolic cosine",
-  "description": "Computes the inverse hyperbolic cosine of `x`. It is the inverse function of the hyperbolic cosine so that *`arcosh(cosh(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the inverse hyperbolic cosine of `x`. It is the inverse function of the hyperbolic cosine so that *`arcosh(cosh(x)) = x`*.\n\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values outside of the allowed range.",
   "categories": [
     "math > trigonometric"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "A number.",
+      "description": "A number in the range *[1, +∞)*.",
       "schema": {
         "type": [
           "number",
           "null"
-        ]
+        ],
+        "minimum": 1
       }
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed hyperbolic angle in radians in the range *[0, +∞)*.",
     "schema": {
       "type": [
         "number",
         "null"
-      ]
+      ],
+      "minimum": 0
     }
   },
   "examples": [
@@ -41,4 +43,4 @@
     "title": "Inverse hyperbolic cosine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
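For implementers, the `arcosh` wording above translates roughly to the following sketch (illustrative Python; `math.acosh` raises `ValueError` for out-of-range input, hence the explicit guard):

```python
import math

def arcosh(x):
    if x is None:
        return None              # the no-data value null passes through
    if x < 1:
        return float("nan")      # outside the defined input range [1, +inf)
    return math.acosh(x)
```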
diff --git a/arcsin.json b/arcsin.json
index e37eb2d3..2c772a00 100644
--- a/arcsin.json
+++ b/arcsin.json
@@ -1,24 +1,26 @@
 {
   "id": "arcsin",
   "summary": "Inverse sine",
-  "description": "Computes the arc sine of `x`. The arc sine is the inverse function of the sine so that *`arcsin(sin(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the arc sine of `x`. The arc sine is the inverse function of the sine so that *`arcsin(sin(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values < -1 and > 1.",
   "categories": [
     "math > trigonometric"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "A number.",
+      "description": "A number in the range *[-1, 1]*.",
       "schema": {
         "type": [
           "number",
           "null"
-        ]
+        ],
+        "minimum": -1,
+        "maximum": 1
       }
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed angle in radians in the range *[-π/2, π/2]*.",
     "schema": {
       "type": [
         "number",
         "null"
@@ -41,4 +43,4 @@
     "title": "Inverse sine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/arctan.json b/arctan.json
index dc8d5a68..9461eba3 100644
--- a/arctan.json
+++ b/arctan.json
@@ -18,7 +18,7 @@
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed angle in radians in the range *(−π/2, π/2)*.",
     "schema": {
       "type": [
         "number",
@@ -41,4 +41,4 @@
     "title": "Inverse tangent explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/arsinh.json b/arsinh.json
index 37384dcd..2b7942dd 100644
--- a/arsinh.json
+++ b/arsinh.json
@@ -1,7 +1,7 @@
 {
   "id": "arsinh",
   "summary": "Inverse hyperbolic sine",
-  "description": "Computes the inverse hyperbolic sine of `x`. It is the inverse function of the hyperbolic sine so that *`arsinh(sinh(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the inverse hyperbolic sine of `x`. It is the inverse function of the hyperbolic sine so that *`arsinh(sinh(x)) = x`*.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
   "categories": [
     "math > trigonometric"
   ],
@@ -18,7 +18,7 @@
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed hyperbolic angle in radians.",
     "schema": {
       "type": [
         "number",
@@ -41,4 +41,4 @@
     "title": "Inverse hyperbolic sine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/artanh.json b/artanh.json
index 926b48ea..6308290d 100644
--- a/artanh.json
+++ b/artanh.json
@@ -1,24 +1,26 @@
 {
   "id": "artanh",
   "summary": "Inverse hyperbolic tangent",
-  "description": "Computes the inverse hyperbolic tangent of `x`. It is the inverse function of the hyperbolic tangent so that *`artanh(tanh(x)) = x`*.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the inverse hyperbolic tangent of `x`. It is the inverse function of the hyperbolic tangent so that *`artanh(tanh(x)) = x`*.\n\nThe no-data value `null` is passed through and therefore gets propagated. `NaN` is returned for values outside of the allowed range. The computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, `x` = 1 results in +infinity and `x` = -1 results in -infinity. Otherwise, an exception is thrown.",
   "categories": [
     "math > trigonometric"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "A number.",
+      "description": "A number in the range *(-1, 1)*.",
       "schema": {
         "type": [
           "number",
           "null"
-        ]
+        ],
+        "minimumExclusive": -1,
+        "maximumExclusive": 1
       }
     }
   ],
   "returns": {
-    "description": "The computed angle in radians.",
+    "description": "The computed hyperbolic angle in radians.",
     "schema": {
       "type": [
         "number",
@@ -39,6 +41,11 @@
       "rel": "about",
       "href": "http://mathworld.wolfram.com/InverseHyperbolicTangent.html",
       "title": "Inverse hyperbolic tangent explained by Wolfram MathWorld"
+    },
+    {
+      "rel": "about",
+      "href": "https://ieeexplore.ieee.org/document/4610935",
+      "title": "IEEE Standard 754-2008 for Floating-Point Arithmetic"
     }
   ]
-}
\ No newline at end of file
+}
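The `artanh` boundary behavior spelled out above can be summarized as follows. This is an illustrative Python sketch; whether an implementation returns infinities or throws an exception depends on its floating-point support, as the description says:

```python
import math

def artanh(x):
    if x is None:
        return None
    if x == 1:
        return float("inf")      # boundary of the open interval (-1, 1)
    if x == -1:
        return float("-inf")
    if abs(x) > 1:
        return float("nan")      # outside the defined input range
    return math.atanh(x)
```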
diff --git a/cos.json b/cos.json
index 6e6e4143..0d6229a8 100644
--- a/cos.json
+++ b/cos.json
@@ -18,12 +18,14 @@
     }
   ],
   "returns": {
-    "description": "The computed cosine of `x`.",
+    "description": "The computed cosine in the range *[-1, 1]*.",
     "schema": {
       "type": [
         "number",
         "null"
-      ]
+      ],
+      "minimum": -1,
+      "maximum": 1
     }
   },
   "examples": [
@@ -41,4 +43,4 @@
     "title": "Cosine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/cosh.json b/cosh.json
index 975958a4..8b56a222 100644
--- a/cosh.json
+++ b/cosh.json
@@ -1,14 +1,14 @@
 {
   "id": "cosh",
   "summary": "Hyperbolic cosine",
-  "description": "Computes the hyperbolic cosine of `x`.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+  "description": "Computes the hyperbolic cosine of `x`.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
   "categories": [
     "math > trigonometric"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "An angle in radians.",
+      "description": "A hyperbolic angle in radians.",
       "schema": {
         "type": [
           "number",
           "null"
@@ -18,12 +18,13 @@
     }
   ],
   "returns": {
-    "description": "The computed hyperbolic cosine of `x`.",
+    "description": "The computed hyperbolic cosine in the range *[1, +∞)*.",
     "schema": {
       "type": [
         "number",
         "null"
-      ]
+      ],
+      "minimum": 1
     }
   },
   "examples": [
@@ -41,4 +42,4 @@
     "title": "Hyperbolic cosine explained by Wolfram MathWorld"
   }
 ]
-}
\ No newline at end of file
+}
diff --git a/divide.json b/divide.json
index 5dd664f1..0c6c254a 100644
--- a/divide.json
+++ b/divide.json
@@ -1,7 +1,7 @@
 {
   "id": "divide",
   "summary": "Division of two numbers",
-  "description": "Divides argument `x` by the argument `y` (*`x / y`*) and returns the computed result.\n\nNo-data values are taken into account so that `null` is returned if any element is such a value.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, a division by zero results in ±infinity if the processing environment supports it. Otherwise, a `DivisionByZero` exception must the thrown.",
+  "description": "Divides argument `x` by the argument `y` (*`x / y`*) and returns the computed result.\n\nNo-data values are taken into account so that `null` is returned if any element is such a value.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. A division by zero results in:\n\n- +infinity for `x` > 0,\n- -infinity for `x` < 0,\n- `NaN` for `x` = 0,\n- or otherwise, throws a `DivisionByZero` exception if the other options are not supported by the processing environment.",
   "categories": [
     "math"
   ],
@@ -76,4 +76,4 @@
     "title": "IEEE Standard 754-2019 for Floating-Point Arithmetic"
   }
 ]
-}
\ No newline at end of file
+}
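The division-by-zero rules above, as a sketch (illustrative Python; the `DivisionByZero` exception is only the fallback for environments without IEEE 754 special values):

```python
def divide(x, y):
    if x is None or y is None:
        return None              # no-data propagates
    if y == 0:
        if x > 0:
            return float("inf")
        if x < 0:
            return float("-inf")
        return float("nan")      # 0 / 0
    return x / y
```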
diff --git a/exp.json b/exp.json
index 5a5e3283..2d551390 100644
--- a/exp.json
+++ b/exp.json
@@ -18,12 +18,13 @@
     }
   ],
   "returns": {
-    "description": "The computed value for *e* raised to the power of `p`.",
+    "description": "The computed value for *e* raised to the power of `p`. The value is in the range *(0, +∞)*.",
     "schema": {
       "type": [
         "number",
         "null"
-      ]
+      ],
+      "minimumExclusive": 0
     }
   },
   "examples": [
@@ -65,4 +66,4 @@
     "result": true
   }
 }
-}
\ No newline at end of file
+}
diff --git a/ln.json b/ln.json
index e073c7a2..1663771b 100644
--- a/ln.json
+++ b/ln.json
@@ -1,19 +1,20 @@
 {
   "id": "ln",
   "summary": "Natural logarithm",
-  "description": "The natural logarithm is the logarithm to the base *e* of the number `x`, which equals to using the *log* process with the base set to *e*. The natural logarithm is the inverse function of taking *e* to the power x.\n\nThe no-data value `null` is passed through.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, *`ln(0)`* results in ±infinity if the processing environment supports it or otherwise an exception is thrown.",
+  "description": "The natural logarithm is the logarithm to the base *e* of the number `x`, which equals to using the *log* process with the base set to *e*. The natural logarithm is the inverse function of taking *e* to the power x.\n\nThe no-data value `null` is passed through.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, *`ln(0)`* results in -infinity if the processing environment supports it or otherwise an exception is thrown. `NaN` is returned for values outside of the allowed range.",
   "categories": [
     "math > exponential & logarithmic"
   ],
   "parameters": [
     {
       "name": "x",
-      "description": "A number to compute the natural logarithm for.",
+      "description": "A number to compute the natural logarithm for in the range *[0, +∞)*.",
       "schema": {
         "type": [
           "number",
           "null"
-        ]
+        ],
+        "minimum": 0
       }
     }
   ],
@@ -64,4 +65,4 @@
     "result": true
   }
 }
-}
\ No newline at end of file
+}
diff --git a/log.json b/log.json
index 89500837..67a19ba2 100644
--- a/log.json
+++ b/log.json
@@ -1,19 +1,20 @@
 {
   "id": "log",
   "summary": "Logarithm to a base",
-  "description": "Logarithm to the base `base` of the number `x` is defined to be the inverse function of taking b to the power of x.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. 
Therefore, `log(0, 2)` results in ±infinity if the processing environment supports it or otherwise an exception is thrown.", + "description": "Logarithm to the base `base` of the number `x` is defined to be the inverse function of taking b to the power of x.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, having `x` set to `0` with any base results in -infinity if the processing environment supports it or otherwise an exception is thrown. `NaN` is returned for values outside of the allowed range.", "categories": [ "math > exponential & logarithmic" ], "parameters": [ { "name": "x", - "description": "A number to compute the logarithm for.", + "description": "A number to compute the logarithm for in the range *[0, +∞)*.", "schema": { "type": [ "number", "null" - ] + ], + "minimum": 0 } }, { @@ -78,4 +79,4 @@ "title": "IEEE Standard 754-2019 for Floating-Point Arithmetic" } ] -} \ No newline at end of file +} diff --git a/mod.json b/mod.json index ca709386..0c8a6ea9 100644 --- a/mod.json +++ b/mod.json @@ -1,7 +1,7 @@ { "id": "mod", "summary": "Modulo", - "description": "Remainder after a division of `x` by `y` for both integers and floating-point numbers.\n\nThe result of a modulo operation has the sign of the divisor. The handling regarding the sign of the result [differs between programming languages](https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages) and needs careful consideration to avoid unexpected results.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`. A modulo by zero results in ±infinity if the processing environment supports it. Otherwise, a `DivisionByZero` exception must the thrown.", + "description": "Remainder after a division of `x` by `y` for both integers and floating-point numbers.\n\nThe result of a modulo operation has the sign of the divisor. The handling regarding the sign of the result [differs between programming languages](https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages) and needs careful consideration to avoid unexpected results.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`. 
diff --git a/mod.json b/mod.json
index ca709386..0c8a6ea9 100644
--- a/mod.json
+++ b/mod.json
@@ -1,7 +1,7 @@
 {
     "id": "mod",
     "summary": "Modulo",
-    "description": "Remainder after a division of `x` by `y` for both integers and floating-point numbers.\n\nThe result of a modulo operation has the sign of the divisor. The handling regarding the sign of the result [differs between programming languages](https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages) and needs careful consideration to avoid unexpected results.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`. A modulo by zero results in ±infinity if the processing environment supports it. Otherwise, a `DivisionByZero` exception must the thrown.",
+    "description": "Remainder after a division of `x` by `y` for both integers and floating-point numbers.\n\nThe result of a modulo operation has the sign of the divisor. The handling regarding the sign of the result [differs between programming languages](https://en.wikipedia.org/wiki/Modulo_operation#In_programming_languages) and needs careful consideration to avoid unexpected results.\n\nThe no-data value `null` is passed through and therefore gets propagated if any of the arguments is `null`. If `y` is set to 0, this results in:\n\n- +infinity for `x` > 0,\n- -infinity for `x` < 0,\n- `NaN` for `x` = 0,\n- or otherwise, throws a `DivisionByZero` exception if the other options are not supported by the processing environment.",
     "categories": [
         "math"
     ],
@@ -92,4 +92,4 @@
             "title": "Modulo explained by Wikipedia"
         }
     ]
-}
\ No newline at end of file
+}
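The sign-of-the-divisor convention that `mod` pins down above is exactly where languages disagree. For illustration, Python's `%` happens to follow the divisor's sign, while C-style `fmod` follows the dividend's:

```python
import math

print(5 % -3)            # -1   -> Python's % takes the sign of the divisor, like `mod`
print(-5 % 3)            # 1
print(math.fmod(-5, 3))  # -2.0 -> C-style fmod takes the sign of the dividend instead
```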
diff --git a/sgn.json b/sgn.json
index f59afdc4..ecdbd9d1 100644
--- a/sgn.json
+++ b/sgn.json
@@ -23,6 +23,12 @@
             "type": [
                 "number",
                 "null"
+            ],
+            "enum": [
+                -1,
+                0,
+                1,
+                null
             ]
         }
     },
@@ -104,4 +110,4 @@
             "result": true
         }
     }
-}
\ No newline at end of file
+}
diff --git a/sin.json b/sin.json
index 06c45cc4..15285979 100644
--- a/sin.json
+++ b/sin.json
@@ -18,12 +18,14 @@
         }
     ],
     "returns": {
-        "description": "The computed sine of `x`.",
+        "description": "The computed sine in the range *[-1, 1]*.",
         "schema": {
             "type": [
                 "number",
                 "null"
-            ]
+            ],
+            "minimum": -1,
+            "maximum": 1
         }
     },
     "examples": [
@@ -41,4 +43,4 @@
             "title": "Sine explained by Wolfram MathWorld"
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/sinh.json b/sinh.json
index c505b3a3..6eced19c 100644
--- a/sinh.json
+++ b/sinh.json
@@ -1,14 +1,14 @@
 {
     "id": "sinh",
     "summary": "Hyperbolic sine",
-    "description": "Computes the hyperbolic sine of `x`.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+    "description": "Computes the hyperbolic sine of `x`.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
     "categories": [
         "math > trigonometric"
     ],
     "parameters": [
         {
             "name": "x",
-            "description": "An angle in radians.",
+            "description": "A hyperbolic angle in radians.",
             "schema": {
                 "type": [
                     "number",
@@ -18,7 +18,7 @@
         }
     ],
     "returns": {
-        "description": "The computed hyperbolic sine of `x`.",
+        "description": "The computed hyperbolic sine.",
         "schema": {
             "type": [
                 "number",
@@ -41,4 +41,4 @@
             "title": "Hyperbolic sine explained by Wolfram MathWorld"
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/tan.json b/tan.json
index c3952efa..6e927ffc 100644
--- a/tan.json
+++ b/tan.json
@@ -1,7 +1,7 @@
 {
     "id": "tan",
     "summary": "Tangent",
-    "description": "Computes the tangent of `x`. The tangent is defined to be the sine of x divided by the cosine of x.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+    "description": "Computes the tangent of `x`. The tangent is defined to be the sine of x divided by the cosine of x.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.\n\nThe computations follow [IEEE Standard 754](https://ieeexplore.ieee.org/document/8766229) whenever the processing environment supports it. Therefore, *`tan(pi()/2 + multiply(pi(), n))`* with `n` being any integer results in ±infinity: -infinity for negative values passed to `tan`, +infinity otherwise. If the processing environment does not support it, an exception is thrown.",
     "categories": [
         "math > trigonometric"
     ],
@@ -18,7 +18,7 @@
         }
     ],
     "returns": {
-        "description": "The computed tangent of `x`.",
+        "description": "The computed tangent.",
         "schema": {
             "type": [
                 "number",
@@ -39,6 +39,11 @@
             "rel": "about",
             "href": "http://mathworld.wolfram.com/Tangent.html",
             "title": "Tangent explained by Wolfram MathWorld"
+        },
+        {
+            "rel": "about",
+            "href": "https://ieeexplore.ieee.org/document/8766229",
+            "title": "IEEE Standard 754-2019 for Floating-Point Arithmetic"
         }
     ]
-}
\ No newline at end of file
+}
diff --git a/tanh.json b/tanh.json
index 203f581e..a38462c9 100644
--- a/tanh.json
+++ b/tanh.json
@@ -1,14 +1,14 @@
 {
     "id": "tanh",
     "summary": "Hyperbolic tangent",
-    "description": "Computes the hyperbolic tangent of `x`. The tangent is defined to be the hyperbolic sine of x divided by the hyperbolic cosine of x.\n\nWorks on radians only.\nThe no-data value `null` is passed through and therefore gets propagated.",
+    "description": "Computes the hyperbolic tangent of `x`. The hyperbolic tangent is defined to be the hyperbolic sine of x divided by the hyperbolic cosine of x.\n\nThe no-data value `null` is passed through and therefore gets propagated.",
     "categories": [
         "math > trigonometric"
     ],
     "parameters": [
         {
             "name": "x",
-            "description": "An angle in radians.",
+            "description": "A hyperbolic angle in radians.",
             "schema": {
                 "type": [
                     "number",
@@ -18,12 +18,14 @@
         }
     ],
     "returns": {
-        "description": "The computed hyperbolic tangent of `x`.",
+        "description": "The computed hyperbolic tangent in the range *(-1, 1)*.",
         "schema": {
             "type": [
                 "number",
                 "null"
-            ]
+            ],
+            "exclusiveMinimum": -1,
+            "exclusiveMaximum": 1
         }
     },
     "examples": [
@@ -41,4 +43,4 @@
             "title": "Hyperbolic tangent explained by Wolfram MathWorld"
        }
     ]
-}
\ No newline at end of file
+}
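One numerical caveat for the exclusive bounds just added to `tanh`: mathematically the range is the open interval *(-1, 1)*, but IEEE 754 doubles round onto the bound for large arguments, as this small illustrative Python check shows:

```python
import math

print(math.sin(1.5))  # 0.997..., inside the closed range [-1, 1] documented for sin
print(math.tanh(5))   # 0.99990..., strictly inside (-1, 1)
print(math.tanh(20))  # 1.0 exactly: doubles round onto the bound for large inputs
```

Backends that validate results against the exclusive bounds may therefore want to account for this saturation.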
From 427421595bc4b052c541de90dfdd995c44f54b30 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 22 Dec 2023 17:07:27 +0100
Subject: [PATCH 116/117] Added uniqueness constraints and clarified DimensionNotAvailable exception in temporal aggregations

---
 CHANGELOG.md                        | 2 ++
 aggregate_temporal.json             | 5 +++--
 aggregate_temporal_period.json      | 4 ++--
 proposals/array_create_labeled.json | 3 ++-
 proposals/flatten_dimensions.json   | 1 +
 proposals/predict_curve.json        | 1 +
 proposals/unflatten_dimension.json  | 1 +
 rename_labels.json                  | 2 ++
 8 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index f70afe7b..60a5aa22 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,10 +9,12 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Changed
 
 - `clip`: Throw an exception if min > max [#472](https://github.com/Open-EO/openeo-processes/issues/472)
+- Added a uniqueness constraint to various array-typed parameters (e.g. lists of dimension names or labels)
 
 ### Fixed
 
 - Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible.
+- `aggregate_temporal` and `aggregate_temporal_period`: Clarified that the process throws a `DimensionNotAvailable` exception when no temporal dimension exists.
 - `divide`: Clarified behavior for division by 0
 - `between`: Clarify that `null` is passed through.
 - `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter.
diff --git a/aggregate_temporal.json b/aggregate_temporal.json
index ad1d560b..2c66e0f2 100644
--- a/aggregate_temporal.json
+++ b/aggregate_temporal.json
@@ -128,6 +128,7 @@
             "description": "Distinct labels for the intervals, which can contain dates and/or times. Is only required to be specified if the values for the start of the temporal intervals are not distinct and thus the default labels would not be unique. The number of labels and the number of groups need to be equal.",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "type": [
                         "number",
@@ -140,7 +141,7 @@
         },
         {
             "name": "dimension",
-            "description": "The name of the temporal dimension for aggregation. All data along the dimension is passed through the specified reducer. If the dimension is not set or set to `null`, the data cube is expected to only have one temporal dimension. Fails with a `TooManyDimensions` exception if it has more dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.",
+            "description": "The name of the temporal dimension for aggregation. All data along the dimension is passed through the specified reducer. If the dimension is not set or set to `null`, the data cube is expected to only have one temporal dimension. Fails with a `TooManyDimensions` exception if it has more dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist or no temporal dimension is available.",
             "schema": {
                 "type": [
                     "string",
@@ -228,7 +229,7 @@
             "message": "The data cube contains multiple temporal dimensions. The parameter `dimension` must be specified."
         },
         "DimensionNotAvailable": {
-            "message": "A dimension with the specified name does not exist."
+            "message": "A dimension with the specified name does not exist or no temporal dimension is available."
         },
         "DistinctDimensionLabelsRequired": {
             "message": "The dimension labels have duplicate values. Distinct labels must be specified."
diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json
index bdaea43e..ce6d4be5 100644
--- a/aggregate_temporal_period.json
+++ b/aggregate_temporal_period.json
@@ -78,7 +78,7 @@
         },
         {
             "name": "dimension",
-            "description": "The name of the temporal dimension for aggregation. All data along the dimension is passed through the specified reducer. If the dimension is not set or set to `null`, the source data cube is expected to only have one temporal dimension. Fails with a `TooManyDimensions` exception if it has more dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist.",
+            "description": "The name of the temporal dimension for aggregation. All data along the dimension is passed through the specified reducer. If the dimension is not set or set to `null`, the source data cube is expected to only have one temporal dimension. Fails with a `TooManyDimensions` exception if it has more dimensions. Fails with a `DimensionNotAvailable` exception if the specified dimension does not exist or no temporal dimension is available.",
             "schema": {
                 "type": [
                     "string",
@@ -115,7 +115,7 @@
             "message": "The data cube contains multiple temporal dimensions. The parameter `dimension` must be specified."
         },
         "DimensionNotAvailable": {
-            "message": "A dimension with the specified name does not exist."
+            "message": "A dimension with the specified name does not exist or no temporal dimension is available."
         },
         "DistinctDimensionLabelsRequired": {
             "message": "The dimension labels have duplicate values. Distinct labels must be specified."
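A compact, hypothetical sketch of the dimension resolution that both temporal aggregation processes now describe; this helper is illustrative only and not part of the openEO API (`dims` maps dimension names to dimension types):

```python
def resolve_temporal_dimension(dims, dimension=None):
    """Illustrative only; dims maps dimension names to types, e.g. {"t": "temporal"}."""
    temporal = [name for name, kind in dims.items() if kind == "temporal"]
    if not temporal:
        raise ValueError("DimensionNotAvailable")  # no temporal dimension available
    if dimension is None:
        if len(temporal) > 1:
            raise ValueError("TooManyDimensions")  # `dimension` must be specified
        return temporal[0]
    if dimension not in dims:
        raise ValueError("DimensionNotAvailable")  # specified dimension does not exist
    return dimension

print(resolve_temporal_dimension({"t": "temporal", "bands": "bands"}))  # t
```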
diff --git a/proposals/array_create_labeled.json b/proposals/array_create_labeled.json
index 8b5d2034..f83d6559 100644
--- a/proposals/array_create_labeled.json
+++ b/proposals/array_create_labeled.json
@@ -19,6 +19,7 @@
             "description": "An array of labels to be used.",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "type": [
                         "number",
@@ -43,4 +44,4 @@
             "message": "The number of values in the parameters `data` and `labels` don't match."
         }
     }
-}
\ No newline at end of file
+}
diff --git a/proposals/flatten_dimensions.json b/proposals/flatten_dimensions.json
index da3647ab..1c02bf39 100644
--- a/proposals/flatten_dimensions.json
+++ b/proposals/flatten_dimensions.json
@@ -20,6 +20,7 @@
             "description": "The names of the dimension to combine. The order of the array defines the order in which the dimension labels and values are combined (see the example in the process description). Fails with a `DimensionNotAvailable` exception if at least one of the specified dimensions does not exist.",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "type": "string"
                 }
diff --git a/proposals/predict_curve.json b/proposals/predict_curve.json
index 479b7fec..ef3e9596 100644
--- a/proposals/predict_curve.json
+++ b/proposals/predict_curve.json
@@ -68,6 +68,7 @@
             },
             {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "anyOf": [
                         {
diff --git a/proposals/unflatten_dimension.json b/proposals/unflatten_dimension.json
index 990e7469..a5a4bb6d 100644
--- a/proposals/unflatten_dimension.json
+++ b/proposals/unflatten_dimension.json
@@ -27,6 +27,7 @@
             "description": "The names of the new target dimensions. New dimensions will be created with the given names and type `other` (see ``add_dimension()``). Fails with a `TargetDimensionExists` exception if any of the dimensions exists.\n\nThe order of the array defines the order in which the dimensions and dimension labels are added to the data cube (see the example in the process description).",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "minItems": 2,
                 "items": {
                     "type": "string"
diff --git a/rename_labels.json b/rename_labels.json
index 2042737d..de4f07da 100644
--- a/rename_labels.json
+++ b/rename_labels.json
@@ -26,6 +26,7 @@
             "description": "The new names for the labels.\n\nIf a target dimension label already exists in the data cube, a `LabelExists` exception is thrown.",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "type": [
                         "number",
@@ -39,6 +40,7 @@
             "description": "The original names of the labels to be renamed to corresponding array elements in the parameter `target`. It is allowed to only specify a subset of labels to rename, as long as the `target` and `source` parameter have the same length. The order of the labels doesn't need to match the order of the dimension labels in the data cube. By default, the array is empty so that the dimension labels in the data cube are expected to be enumerated.\n\nIf the dimension labels are not enumerated and the given array is empty, the `LabelsNotEnumerated` exception is thrown. If one of the source dimension labels doesn't exist, the `LabelNotAvailable` exception is thrown.",
             "schema": {
                 "type": "array",
+                "uniqueItems": true,
                 "items": {
                     "type": [
                         "number",
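The practical effect of the `uniqueItems` additions can be reproduced with any JSON Schema validator, for example the Python `jsonschema` package (assumed to be installed):

```python
from jsonschema import ValidationError, validate

schema = {"type": "array", "uniqueItems": True, "items": {"type": "string"}}

validate(["t", "x"], schema)       # distinct labels pass
try:
    validate(["t", "t"], schema)   # duplicate labels are now rejected
except ValidationError as err:
    print(err.message)
```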
From d5d0a1896ed64573529a0cc3225e3c3e682c4d46 Mon Sep 17 00:00:00 2001
From: Matthias Mohr
Date: Fri, 22 Dec 2023 19:26:29 +0100
Subject: [PATCH 117/117] Removed unused exception from aggregate_temporal_period, clarified week definition

---
 CHANGELOG.md                   | 2 ++
 aggregate_temporal_period.json | 5 +----
 2 files changed, 3 insertions(+), 4 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 60a5aa22..9a301787 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,6 +15,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ### Fixed
 
 - Clarified for various mathematical functions the defined input and output ranges. Mention that `NaN` is returned outside of the defined input range where possible.
 - `aggregate_temporal` and `aggregate_temporal_period`: Clarified that the process throws a `DimensionNotAvailable` exception when no temporal dimension exists.
+- `aggregate_temporal_period`: Removed unused exception `DistinctDimensionLabelsRequired`
+- `aggregate_temporal_period`: Clarified that the definition of weeks follows ISO 8601
 - `divide`: Clarified behavior for division by 0
 - `between`: Clarify that `null` is passed through.
 - `eq` and `neq`: Explicitly set the minimum value for the `delta` parameter.
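The ISO 8601 week numbering referenced above differs from naive numbering at year boundaries, and the dekad rule reduces to simple integer arithmetic. Both can be illustrated with Python's standard library; the `dekad_of_month` helper below is hypothetical and not part of the specification:

```python
from datetime import date

# ISO 8601 weeks: January 1, 2016 still belongs to week 53 of ISO year 2015.
print(date(2016, 1, 1).isocalendar())  # ISO year 2015, week 53, weekday 5 (Friday)

def dekad_of_month(d):
    """Hypothetical helper: dekad within the month per the rules above (1-10, 11-20, 21-end)."""
    return min((d.day - 1) // 10 + 1, 3)

print(dekad_of_month(date(2024, 2, 29)))  # 3 -> the short third dekad of February
```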
diff --git a/aggregate_temporal_period.json b/aggregate_temporal_period.json
index ce6d4be5..446afa8d 100644
--- a/aggregate_temporal_period.json
+++ b/aggregate_temporal_period.json
@@ -23,7 +23,7 @@
         },
         {
             "name": "period",
-            "description": "The time intervals to aggregate. The following pre-defined values are available:\n\n* `hour`: Hour of the day\n* `day`: Day of the year\n* `week`: Week of the year\n* `dekad`: Ten day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. For example, the third dekad of a year spans from January 21 till January 31 (11 days), the fourth dekad spans from February 1 till February 10 (10 days) and the sixth dekad spans from February 21 till February 28 or February 29 in a leap year (8 or 9 days respectively).\n* `month`: Month of the year\n* `season`: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).\n* `tropical-season`: Six month periods of the tropical seasons (November - April, May - October).\n* `year`: Proleptic years\n* `decade`: Ten year periods ([0-to-9 decade](https://en.wikipedia.org/wiki/Decade#0-to-9_decade)), from a year ending in a 0 to the next year ending in a 9.\n* `decade-ad`: Ten year periods ([1-to-0 decade](https://en.wikipedia.org/wiki/Decade#1-to-0_decade)) better aligned with the anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.",
+            "description": "The time intervals to aggregate. The following pre-defined values are available:\n\n* `hour`: Hour of the day\n* `day`: Day of the year\n* `week`: Week of the year as defined in [ISO 8601](https://en.wikipedia.org/wiki/ISO_8601#Week_dates)\n* `dekad`: Ten day periods, counted per year with three periods per month (day 1 - 10, 11 - 20 and 21 - end of month). The third dekad of the month can range from 8 to 11 days. For example, the third dekad of a year spans from January 21 till January 31 (11 days), the fourth dekad spans from February 1 till February 10 (10 days) and the sixth dekad spans from February 21 till February 28 or February 29 in a leap year (8 or 9 days respectively).\n* `month`: Month of the year\n* `season`: Three month periods of the calendar seasons (December - February, March - May, June - August, September - November).\n* `tropical-season`: Six month periods of the tropical seasons (November - April, May - October).\n* `year`: Proleptic years\n* `decade`: Ten year periods ([0-to-9 decade](https://en.wikipedia.org/wiki/Decade#0-to-9_decade)), from a year ending in a 0 to the next year ending in a 9.\n* `decade-ad`: Ten year periods ([1-to-0 decade](https://en.wikipedia.org/wiki/Decade#1-to-0_decade)) better aligned with the anno Domini (AD) calendar era, from a year ending in a 1 to the next year ending in a 0.",
             "schema": {
                 "type": "string",
                 "enum": [
@@ -116,9 +116,6 @@
         },
         "DimensionNotAvailable": {
             "message": "A dimension with the specified name does not exist or no temporal dimension is available."
-        },
-        "DistinctDimensionLabelsRequired": {
-            "message": "The dimension labels have duplicate values. Distinct labels must be specified."
         }
     },
     "links": [