[flake8-use-pathlib] Recommend Path.iterdir() over os.listdir() (PTH208) #14509

Open
wants to merge 3 commits into
base: main

InSyncWithFoo
Contributor

Summary

Resolves #14490.

Test Plan

`cargo nextest run` and `cargo insta test`.

Contributor

github-actions bot commented Nov 21, 2024

ruff-ecosystem results

Linter (stable)

✅ ecosystem check detected no linter changes.

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+52 -0 violations, +0 -0 fixes in 3 projects; 51 projects unchanged)

apache/airflow (+14 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:1134:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:2181:29: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/commands/sbom_commands.py:299:33: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/breeze/src/airflow_breeze/utils/cdxgen.py:142:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:204:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:217:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ dev/check_files.py:230:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ docs/build_docs.py:456:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/src/airflow/providers/amazon/aws/hooks/sagemaker.py:176:63: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/tests/openlineage/plugins/test_execution.py:60:81: PTH208 Use `pathlib.Path.iterdir()` instead.
+ providers/tests/sftp/hooks/test_sftp.py:177:39: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/ci/pre_commit/version_heads_map.py:47:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests_common/test_utils/system_tests_class.py:103:17: PTH208 Use `pathlib.Path.iterdir()` instead.
... 1 additional changes omitted for project

bokeh/bokeh (+11 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ examples/server/app/simple_hdf5/main.py:19:28: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/command/subcommands/__init__.py:54:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_gallery.py:134:18: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_gallery.py:160:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ src/bokeh/sphinxext/bokeh_releases.py:76:47: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/support/util/examples.py:186:33: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test___init___subcommands.py:48:13: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:111:54: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:123:50: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tests/unit/bokeh/command/subcommands/test_json__subcommands.py:135:50: PTH208 Use `pathlib.Path.iterdir()` instead.
... 1 additional changes omitted for project

zulip/zulip (+27 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

+ corporate/tests/test_stripe.py:137:18: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/run_hooks.py:63:38: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/setup_venv.py:165:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:143:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:308:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:351:27: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:370:64: PTH208 Use `pathlib.Path.iterdir()` instead.
+ scripts/lib/zulip_tools.py:657:26: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/documentation_crawler/documentation_crawler/spiders/check_help_documentation.py:34:29: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/lib/test_script.py:102:34: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/setup/generate_landing_page_images.py:29:21: PTH208 Use `pathlib.Path.iterdir()` instead.
+ tools/setup/generate_zulip_bots_static_files.py:48:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/mattermost.py:879:8: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/slack.py:1431:8: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/data_import/slack.py:818:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/lib/sounds.py:10:22: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/compilemessages.py:98:23: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/convert_mattermost_data.py:61:12: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/convert_rocketchat_data.py:39:12: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/export.py:138:20: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/management/commands/export_single_user.py:41:47: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_delete_unclaimed_attachments.py:75:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_delete_unclaimed_attachments.py:94:17: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_import_export.py:268:30: PTH208 Use `pathlib.Path.iterdir()` instead.
+ zerver/tests/test_urls.py:37:20: PTH208 Use `pathlib.Path.iterdir()` instead.
... 2 additional changes omitted for project

Changes by rule (1 rules affected)

| code | total | + violation | - violation | + fix | - fix |
| --- | --- | --- | --- | --- | --- |
| PTH208 | 52 | 52 | 0 | 0 | 0 |

@MichaReiser added the `rule` (Implementing or modifying a lint rule) and `preview` (Related to preview mode features) labels on Nov 22, 2024
@MichaReiser
Member

I noticed two common patterns when reviewing the ecosystem checks:

if os.listdir("dir"): ...

if "file" in os.listdir("dir"): ...

The first requires using len, list, or any, because the iterator returned by Path.iterdir() is always truthy, even when the directory is empty.

The second mainly becomes more verbose. I'm interested in more opinions on whether we should exclude them. Wdyt, @sbrugman?
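As a rough illustration of both patterns, here is a minimal sketch that runs against a temporary directory. The directory and the file name `file` are placeholders, not taken from the linked projects.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)

    # os.listdir returns a list, so an empty directory is falsy.
    listdir_truthy = bool(os.listdir(directory))

    # Path.iterdir returns an iterator object, which is truthy
    # regardless of whether the directory has any entries.
    iterdir_truthy = bool(directory.iterdir())

    # any() pulls entries until it finds one and gives the intended answer.
    has_entries = any(directory.iterdir())

    (directory / "file").touch()

    # Membership test: os.listdir yields names, iterdir yields Path objects,
    # so the pathlib version has to compare against entry.name.
    in_listdir = "file" in os.listdir(directory)
    in_iterdir = any(entry.name == "file" for entry in directory.iterdir())
```

A naive replacement of `os.listdir(d)` with `Path(d).iterdir()` inside an `if` would therefore silently change behaviour for empty directories.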

https://github.com/apache/airflow/blob/440c224af5592f9007eef43d1dbe9025aa34e177/docs/build_docs.py#L456

https://github.com/bokeh/bokeh/blob/829b2a75c402d0d0abd7e37ff201fbdfd949d857/examples/server/app/simple_hdf5/main.py#L19
https://github.com/zulip/zulip/blob/65f05794ee59d638ad054ae6602d8ebc980fb637/scripts/lib/zulip_tools.py#L657
https://github.com/zulip/zulip/blob/65f05794ee59d638ad054ae6602d8ebc980fb637/zerver/data_import/mattermost.py#L879

@InSyncWithFoo
Contributor Author

The first requires using len, list, or any

To be pedantic, len() can't be used on an iterator, so only the other two should be suggested in that case.
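A small sketch of that point, using a throwaway temporary directory (the file name `a.txt` is only for illustration): `len()` fails on the iterator, while `list()` and `any()` both work.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    directory = Path(tmp)
    (directory / "a.txt").touch()

    # len() needs a sized container; the iterator from iterdir() is not one.
    try:
        len(directory.iterdir())
        len_error = None
    except TypeError as exc:
        len_error = exc

    # Materialising the iterator first makes len() usable.
    entry_count = len(list(directory.iterdir()))

    # any() answers "is there at least one entry?" without building a list.
    is_non_empty = any(directory.iterdir())
```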

@sbrugman
Contributor

sbrugman commented Nov 25, 2024

The pathlib rules should flag all os.path cases. When these rules are active I assume users made the decision to favour pathlib over os.path, and partially excluding some examples will be unexpected.

It could be good to already include these cases in the tests to make sure they are covered when autofix is implemented later. The complexity here is in the fix, not in the detection of the violation.

Going over the ecosystem results I realise that os.scandir should also be flagged (and is closer to Path.iterdir as it returns a lazy iterator rather than a list). @InSyncWithFoo it's probably worth adding this as PTH209. The non-trivial fixes stem from unidiomatic use of os.path in the first place imo.

Using listdir for checking that a file does not exist is even a candidate for its own rule, as this first lists all files in a directory and then only checks one:

'demo_data.hdf5' not in os.listdir(app_dir)

Idiomatic os.path solution:

not os.path.exists(os.path.join(app_dir, 'demo_data.hdf5'))

Pathlib equivalent:

not (app_dir / "demo_data.hdf5").exists()
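To compare the three spellings side by side, here is a sketch that assumes `app_dir` is a `pathlib.Path` (here a temporary directory, purely for illustration) and evaluates each check before and after the file is created.

```python
import os
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    app_dir = Path(tmp)  # stand-in for the real application directory
    target = app_dir / "demo_data.hdf5"

    def missing_checks():
        """Return the 'file is missing' result from each of the three spellings."""
        return (
            "demo_data.hdf5" not in os.listdir(app_dir),  # lists everything first
            not os.path.exists(os.path.join(app_dir, "demo_data.hdf5")),
            not target.exists(),
        )

    before = missing_checks()
    target.touch()
    after = missing_checks()
```

One caveat for any future autofix: the spellings are not strictly equivalent. `exists()` follows symlinks, so a broken symlink named `demo_data.hdf5` would appear in `os.listdir` but would be reported as missing by both `exists` variants.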

Checking if a directory is empty with os.listdir is also wasteful:

not os.listdir(api_dir)

Users should probably write something like:

next(os.scandir(api_dir), None) is None

Pathlib equivalent:

next(api_dir.iterdir(), None) is None
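The three emptiness checks can be wrapped as small helpers for comparison. This is only a sketch: the helper names and the `index.rst` file are hypothetical, and the `os.scandir` variant uses a `with` block so the iterator is closed promptly.

```python
import os
import tempfile
from pathlib import Path


def is_empty_listdir(directory):
    # Builds the full list of names just to test whether it is empty.
    return not os.listdir(directory)


def is_empty_scandir(directory):
    # Asks for at most one entry; the context manager closes the iterator.
    with os.scandir(directory) as entries:
        return next(entries, None) is None


def is_empty_iterdir(directory):
    # Same idea expressed with pathlib.
    return next(Path(directory).iterdir(), None) is None


checks = (is_empty_listdir, is_empty_scandir, is_empty_iterdir)

with tempfile.TemporaryDirectory() as tmp:
    api_dir = Path(tmp)
    empty_results = [check(api_dir) for check in checks]

    (api_dir / "index.rst").touch()
    non_empty_results = [check(api_dir) for check in checks]
```

How much work the `iterdir` variant actually saves may depend on the Python implementation and version, since `Path.iterdir` is not guaranteed to read the directory lazily; the `os.scandir` form is the safer choice when the directory can be very large.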

Labels
preview Related to preview mode features rule Implementing or modifying a lint rule
Development

Successfully merging this pull request may close these issues.

New flake8-pathlib rule: os.listdir (PTH208)
3 participants