Skip to content

Commit

Permalink
Tooling: add and run basic pre-commit hooks (#457)
Browse files Browse the repository at this point in the history
* Tooling: add and run basic pre-commit hooks to keep codebase clean and consistent
  • Loading branch information
sergei-encord authored Nov 9, 2023
1 parent f7424ec commit 80d5658
Show file tree
Hide file tree
Showing 20 changed files with 52 additions and 48 deletions.
2 changes: 1 addition & 1 deletion .git-blame-ignore-revs
Original file line number Diff line number Diff line change
Expand Up @@ -3,4 +3,4 @@
# to be what you are interested in when blaming.

# Apply isort
8de122a3f511350ac47c8fc0aee1fb4a4f6d5c39
8de122a3f511350ac47c8fc0aee1fb4a4f6d5c39
4 changes: 2 additions & 2 deletions .github/pull_request_template.md
Original file line number Diff line number Diff line change
Expand Up @@ -16,11 +16,11 @@ _There should be enough internal documentation for a product owner to write cust

# Tests

_Make a quick statement and post any relevant links of CI / test results. If the testing infrastructure isn’t yet in-place, note that instead._
_Make a quick statement and post any relevant links of CI / test results. If the testing infrastructure isn’t yet in-place, note that instead._

- _What are the critical unit tests?_
- _Explain the Integration Tests such that it’s clear Correctness is satisfied. Link to test results if possible._

# Known issues

_If there are any known issues with the solution, make a statement about what they are and why they are Ok to leave unsolved for now. Make tickets for the known issues linked to the original ticket linked above_
_If there are any known issues with the solution, make a statement about what they are and why they are Ok to leave unsolved for now. Make tickets for the known issues linked to the original ticket linked above_
2 changes: 1 addition & 1 deletion .github/workflows/publish_sdk.yml
Original file line number Diff line number Diff line change
Expand Up @@ -65,4 +65,4 @@ jobs:
success-channel: ${{ secrets.SLACK_DEPLOYMENT_PROD_CHANNEL_ID }}
failure-channel: ${{ secrets.SLACK_FAILURE_CHANNEL_ID }}
success-message: Deployed to https://pypi.org/project/cord-client-python/
failure-message: This pipeline has failed!
failure-message: This pipeline has failed!
6 changes: 3 additions & 3 deletions .github/workflows/test-sdk.yml
Original file line number Diff line number Diff line change
Expand Up @@ -99,8 +99,8 @@ jobs:
name: Publish test reports
runs-on: ubuntu-20.04
needs: [
unit-tests-python-3-8,
unit-tests-python-3-9,
unit-tests-python-3-8,
unit-tests-python-3-9,
unit-tests-python-3-10,
unit-tests-python-3-11,
integration-tests-python-3-11,
Expand Down Expand Up @@ -187,4 +187,4 @@ jobs:
with:
success-parameter: ${{ env.WORKFLOW_CONCLUSION }}
failure-channel: ${{ secrets.SLACK_FAILURE_CHANNEL_ID }}
failure-message: This pipeline has failed!
failure-message: This pipeline has failed!
8 changes: 8 additions & 0 deletions .pre-commit-config.yaml
Original file line number Diff line number Diff line change
@@ -1,4 +1,12 @@
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: c4a0b883114b00d8d76b479c820ce7950211c99b # 4.5.0
hooks:
- id: check-yaml
- id: check-json
- id: check-toml
- id: end-of-file-fixer
- id: trailing-whitespace
- repo: local
hooks:
- id: autoflake
Expand Down
2 changes: 1 addition & 1 deletion LICENSE
Original file line number Diff line number Diff line change
Expand Up @@ -198,4 +198,4 @@
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
limitations under the License.
40 changes: 20 additions & 20 deletions docs/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,15 +2,15 @@

To build the documentation, follow these steps:

**Prerequisites:**
**Prerequisites:**

- Install poetry using commands given [here](https://python-poetry.org/docs/).
- Install poetry using commands given [here](https://python-poetry.org/docs/).

- If you encounter "zsh: command not found: python" follow [this link](https://dev.to/smpnjn/how-to-fix-zsh-command-not-found-python-22j8) to fix it.
- If you encounter "zsh: command not found: python" follow [this link](https://dev.to/smpnjn/how-to-fix-zsh-command-not-found-python-22j8) to fix it.

---

1. Navigate to the root of the repo and run
1. Navigate to the root of the repo and run
```shell
> poetry install
```
Expand All @@ -31,7 +31,7 @@ To build the documentation, follow these steps:

- View the doc changes by running:
```shell
> open _build/html/index.html
> open _build/html/index.html
```

---
Expand All @@ -49,23 +49,23 @@ If you are adding a new module which should be documented, follow the examples w
Whenever you add a new public class or function, this documentation should be rebuild and uploaded to the server.

## Common Doc-writing-patterns
1. To keep terminology consistent, we have defined multiple substitutes for common entities in `source/substitutes.rst`.
1. To keep terminology consistent, we have defined multiple substitutes for common entities in `source/substitutes.rst`.
You should familiarise your self with them.
For example, instead of writing "Encord SDK" repeatedly, we have defined the substitute `|sdk|` which will be substituted for "Encord SDK" at build time.
2. To avoid having hardcoded links all over the place, we keep links in the `source/links.json` file. To insert a link, use ``:xref:`link_key` ``.
2. To avoid having hardcoded links all over the place, we keep links in the `source/links.json` file. To insert a link, use ``:xref:`link_key` ``.
For example, in the `links.json` file, the following entry exists:
```json
"create_project_on_platform": {
"user_text": "Create Project on Platform",
"url": "https://docs.encord.com/docs/annotate-annotation-projects#creating-annotation-projects"
}
```
Using ``:xref:`create_project_on_platform` `` will be transformed into [Create Project on Platform](https://docs.encord.com/docs/annotate-annotation-projects#creating-annotation-projects) at built time.
Using ``:xref:`create_project_on_platform` `` will be transformed into [Create Project on Platform](https://docs.encord.com/docs/annotate-annotation-projects#creating-annotation-projects) at built time.
__NB:__ There is a script that allows you to easily add a link to the json form the
commandline. Simply run `./add_link` and it will query the necessary information and
(naively) check for duplicates.
3. We have included the `sphinx_tabs` [module](https://github.com/executablebooks/sphinx-tabs), which, e.g., allows you to have both a code tab and an output tab. See, e.g., `source/quickstart.rst`.
4. As demonstrated in the `source/quickstart.rst`, you can include code from files.
4. As demonstrated in the `source/quickstart.rst`, you can include code from files.
With python files (in `source/code_examples`), we try to keep them "blacked" with an enforced 60 column linewidth (to avoid horizontally scrolling code examples).
To black all code examples, you can run `> black source/code_examples/**/*.py --line-length 60`.
5. We use the [sphinx-codeautolink](https://sphinx-codeautolink.readthedocs.io/en/latest/index.html)
Expand All @@ -80,7 +80,7 @@ Whenever you add a new public class or function, this documentation should be re
> sphinx-build -b text source _text
```
This will build the docs in a separate `_text` directory from which you can copy the content of text files to, e.g., Grammarly.


## Hosting
Bake this delicious recipe and share the docs publicly with your friends with these easy steps.
Expand All @@ -93,36 +93,36 @@ Bake this delicious recipe and share the docs publicly with your friends with th
1) Pre-heat the oven and build your project with `make html` from the `docs` directory
2) Run `firebase deploy --only hosting:python-docs` to publish them online.
3) Visit the [firebase hosting site](https://console.firebase.google.com/u/0/project/cord-docs/hosting/sites/python-docs), check out the public docs domain and double check if everything is to your taste.
4) Share the docs with your friends who are most hungry for knowledge!
4) Share the docs with your friends who are most hungry for knowledge!

# TODOs / potential improvements
- [ ] For "Pardon our dust"-sections, we could add links to (to-be-removed) docs from the web-app documentation until they are updated.
- [ ] Datasets, Projects, Label rows, etc. are not written consistently (caps or not) across the documentation. Consider using substitutes for this.
- [ ] There may be some remaining `resource_id` examples left, which are confusing because in most contexts, the `resource_id` can only be one of, e.g., a `project_hash` or `dataset_hash`. We should be specific when possible.
- [ ] As we use Black which generally uses `line-length` 88 (88 columns code), the code in the auto-docs sometimes looks ugly.
- [ ] As we use Black which generally uses `line-length` 88 (88 columns code), the code in the auto-docs sometimes looks ugly.
See, e.g., [the docs for `get_datasets`](python.docs.encord.com/user_client.html#EncordUserClient.get_datasets).
It seems like there is a [css fix](https://github.com/sphinx-doc/sphinx/issues/3092#issuecomment-258922773) to this issue.
A fix is commented out in the `_static/css/custom.css` file because it seems that black currently still allows longer lines than, e.g., 88 columns.
- [ ] We should have one complete code example for each tutorial subsection where all the stuff possible is done in one example.
That way, developers can just copy this file instead of stitching all the tiny examples.
- [ ] We should have one complete code example for each tutorial subsection where all the stuff possible is done in one example.
That way, developers can just copy this file instead of stitching all the tiny examples.

# Inconsistencies/potential confusions in our code
I (FHV) have tried to take notes of inconsistencies in the code base that may be sources of confusion:

1. `EncordUserClient.get_or_create_dataset_api_key` returns an `encord.orm.dataset.DatasetAPIKey` but `EncordUserClient.get_or_create_project_api_key` just returns the key as a string.
1. `EncordUserClient.get_or_create_dataset_api_key` returns an `encord.orm.dataset.DatasetAPIKey` but `EncordUserClient.get_or_create_project_api_key` just returns the key as a string.
2. Access scopes for API keys are defined in different places for projects and datasets.
The project api keys have access scopes defined in `encord.utilities.client_utilities.APIKeyScopes` but dataset api keys have scopes defined in `encord.orm.dataset.DatasetScope` this makes no sense.
The project api keys have access scopes defined in `encord.utilities.client_utilities.APIKeyScopes` but dataset api keys have scopes defined in `encord.orm.dataset.DatasetScope` this makes no sense.
3. The `encord.orm.dataset.DatasetInfo` has the attribute `type` which is an int, but one uses the `encord.orm.dataset.StorageLocation` when creating a dataset. I see no reason why the DatasetInfo wouldn't present the type as a `StorageLocation` such that users don't need to guess what, e.g., type `0` means.
4. FHV: It seems weird to me that the logic for converting ontology classifications and ontology objects to dictionaries is located in the `encord.project_ontology.ontology.py` file and not in the `encord.project_ontology.ontology_classification.py` class.
4. FHV: It seems weird to me that the logic for converting ontology classifications and ontology objects to dictionaries is located in the `encord.project_ontology.ontology.py` file and not in the `encord.project_ontology.ontology_classification.py` class.
Also, there is a lot of custom logic for doing so, when the `dataclasses.asdict` with a custom factory could do the same thing with much less code.
Note, I have a local `fhv/ontology_parsing` branch which does it with `asdict` but probably not perfectly structured either.
5. FHV: Why is the `ObjectShape` definitions in the file `encord.project_ontology.object_type`? It makes no sense that both are not called either `object_type` or `object_shape`.
For classifications, they are both called `ClassificationType`.
6. FHV: When fetching projects through the `EncordUserClient`, the return type is a list of dictionaries with key `project` and value `encord.orm.project.Project` value.
The `Project` orm has properties apart from properties like `title`, `description`, and `project_hash`, which work fine for the particular call, the `Project` definition also has properties `label_rows` and `editor_ontology`, which are not populated in the mentioned call.
Even worse, when calling `project.label_rows`, a `KeyError` is thrown.
7. We are super inconsistent with the use of `uid`, `project_hash`, etc.
I think that we should be as specific as possible.
7. We are super inconsistent with the use of `uid`, `project_hash`, etc.
I think that we should be as specific as possible.
There is no point in abstracting away the distinction. We just make everything more confusing that way.
For example, `encord.orm.dataset.DataRow` uses `data_hash` under the name `uid` which makes no sense.
8. Why does `project_client.get_label_row('<label_hash>')` not have a `data_hash` attribute in the response, similar to `project_client.get_project().label_rows[0]`?
Expand All @@ -131,4 +131,4 @@ I (FHV) have tried to take notes of inconsistencies in the code base that may be


# Known issues
1. The scrollspy (right-side toc) highlights the previous item when clicking a link. This is apparently a [bootstrap issue](https://github.com/twbs/bootstrap/issues/32496).
1. The scrollspy (right-side toc) highlights the previous item when clicking a link. This is apparently a [bootstrap issue](https://github.com/twbs/bootstrap/issues/32496).
2 changes: 1 addition & 1 deletion docs/_static/css/custom.css
Original file line number Diff line number Diff line change
Expand Up @@ -61,4 +61,4 @@ div.document {
margin-right: 0px;
margin-left: 0px;
}
*/
*/
1 change: 0 additions & 1 deletion docs/source/code_examples/tutorials/end-to-end/README.txt
Original file line number Diff line number Diff line change
Expand Up @@ -5,4 +5,3 @@ End-to-end Code Examples

Here, you can browse examples of typical use-cases involving the |sdk|.
If you haven't logged in to the platform and added a :xref:`public-private_key_pair`, please do so before using the |sdk|.

2 changes: 1 addition & 1 deletion docs/source/faq.rst
Original file line number Diff line number Diff line change
Expand Up @@ -17,7 +17,7 @@ If you see this method timing out, try reducing the number of labels requested a
Why do I see an `AuthenticationError` when working with label hashes?
=======================================================================

An `AuthenticationError` can occur when no label hash is present in a label row. To fix this the :meth:`~encord.objects.LabelRowV2.initialise_labels`
An `AuthenticationError` can occur when no label hash is present in a label row. To fix this the :meth:`~encord.objects.LabelRowV2.initialise_labels`
method needs to be called before any operation using the label hash can be performed, such as in the example below.

.. tabs::
Expand Down
2 changes: 1 addition & 1 deletion docs/source/images/end-to-end-thumbs/Artboard 10.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
2 changes: 1 addition & 1 deletion docs/source/images/end-to-end-thumbs/product-data.svg
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading

0 comments on commit 80d5658

Please sign in to comment.