[perflint] fix invalid hoist in perf401 #14369

Open
wants to merge 7 commits into
base: main

w0nder1ng
Contributor

@w0nder1ng w0nder1ng commented Nov 15, 2024

This should fix #14362. However, the new fix currently deletes the whole line containing the list initialization, including any other statement on that line, like this:

- tmp = 1; result = []
- for i in range(10):
-   result.append(i+1)
+ result = [i+1 for i in range(10)]

Is there a convenient way to get every statement within a given TextRange to detect when this is happening?
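
To make the question concrete, here is a minimal sketch of the idea using Python's standard `ast` module. It is not Ruff's Rust AST or its `TextRange` API; the helper name `statements_sharing_line` is hypothetical and only illustrates how one might notice that a binding shares its line with another statement.

```python
import ast


def statements_sharing_line(source: str, lineno: int) -> list[str]:
    """Return the source text of every statement that starts on `lineno`.

    Hypothetical helper for illustration only; it mirrors the idea of
    "every statement within a range" with Python's own `ast` module.
    """
    tree = ast.parse(source)
    found = []
    for node in ast.walk(tree):
        if isinstance(node, ast.stmt) and node.lineno == lineno:
            segment = ast.get_source_segment(source, node)
            if segment is not None:
                found.append(segment)
    return found


SOURCE = """\
tmp = 1; result = []
for i in range(10):
    result.append(i + 1)
"""

# Two statements start on line 1, so deleting the whole line
# would also drop `tmp = 1`.
shared = statements_sharing_line(SOURCE, 1)
```

If more than one statement comes back for the binding's line, a fix would presumably need to delete only the binding's own range rather than the full line.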

Contributor

github-actions bot commented Nov 15, 2024

ruff-ecosystem results

Linter (stable)

ℹ️ ecosystem check detected linter changes. (+17 -29 violations, +0 -0 fixes in 6 projects; 48 projects unchanged)

apache/airflow (+6 -11 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3070:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3070:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use `list.extend` to create a transformed list
+ docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:111:9: PERF401 Use `list.extend` to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:119:9: PERF401 Use `list.extend` to create a transformed list
- providers/src/airflow/providers/amazon/aws/auth_manager/aws_auth_manager.py:400:25: PERF401 Use a list comprehension to create a transformed list
- providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use `list.extend` with an async comprehension to create a transformed list
+ providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use an async list comprehension to create a transformed list
- scripts/in_container/run_provider_yaml_files_check.py:178:17: PERF401 Use `list.extend` to create a transformed list
- scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use `list.extend` to create a transformed list
+ scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use a list comprehension to create a transformed list
- tests/jobs/test_scheduler_job.py:759:13: PERF401 Use a list comprehension to create a transformed list

apache/superset (+4 -10 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- scripts/benchmark_migration.py:128:21: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/base.py:107:9: PERF401 Use `list.extend` to create a transformed list
+ superset/db_engine_specs/base.py:107:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:245:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:245:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:251:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:251:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:261:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:281:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:293:9: PERF401 Use a list comprehension to create a transformed list
- superset/tasks/cache.py:208:13: PERF401 Use a list comprehension to create a transformed list
- tests/integration_tests/annotation_layers/fixtures.py:85:9: PERF401 Use a list comprehension to create a transformed list
+ tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use `list.extend` to create a transformed list
- tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use a list comprehension to create a transformed list

bokeh/bokeh (+1 -3 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --no-preview --select ALL

- src/bokeh/plotting/_figure.py:479:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/plotting/_figure.py:485:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/server/contexts.py:310:17: PERF401 Use `list.extend` to create a transformed list
+ src/bokeh/server/contexts.py:310:17: PERF401 Use a list comprehension to create a transformed list

latchbio/latch (+3 -5 violations, +0 -0 fixes)

- src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use `list.extend` to create a transformed list
+ src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use a list comprehension to create a transformed list
- src/latch/registry/utils.py:70:13: PERF401 Use `list.extend` to create a transformed list
+ src/latch/registry/utils.py:70:13: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/services/cp/utils.py:54:9: PERF401 Use `list.extend` to create a transformed list
+ src/latch_cli/services/cp/utils.py:54:9: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/snakemake/workflow.py:158:21: PERF401 Use `list.extend` to create a transformed list
- src/latch_cli/snakemake/workflow.py:978:25: PERF401 Use a list comprehension to create a transformed list

pandas-dev/pandas (+0 -0 violations, +0 -0 fixes)


indico/indico (+3 -0 violations, +0 -0 fixes)

+ indico/modules/events/registration/lists.py:116:93: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)
+ indico/modules/rb/models/rooms.py:513:91: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)
+ indico/modules/rb/operations/rooms.py:60:87: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)

Changes by rule (2 rules affected)

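| code    | total | + violation | - violation | + fix | - fix |
| ------- | ----- | ----------- | ----------- | ----- | ----- |
| PERF401 | 43    | 14          | 29          | 0     | 0     |
| RUF100  | 3     | 3           | 0           | 0     | 0     |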

Linter (preview)

ℹ️ ecosystem check detected linter changes. (+17 -29 violations, +0 -0 fixes in 5 projects; 49 projects unchanged)

apache/airflow (+6 -11 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3070:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/commands/release_management_commands.py:3070:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/exclude_from_matrix.py:32:9: PERF401 Use a list comprehension to create a transformed list
- dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use `list.extend` to create a transformed list
+ dev/breeze/src/airflow_breeze/utils/packages.py:327:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use `list.extend` to create a transformed list
+ docs/exts/docs_build/fetch_inventories.py:103:9: PERF401 Use a list comprehension to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:111:9: PERF401 Use `list.extend` to create a transformed list
- docs/exts/docs_build/fetch_inventories.py:119:9: PERF401 Use `list.extend` to create a transformed list
- providers/src/airflow/providers/amazon/aws/auth_manager/aws_auth_manager.py:400:25: PERF401 Use a list comprehension to create a transformed list
- providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use `list.extend` with an async comprehension to create a transformed list
+ providers/src/airflow/providers/microsoft/azure/hooks/wasb.py:721:13: PERF401 Use an async list comprehension to create a transformed list
- scripts/in_container/run_provider_yaml_files_check.py:178:17: PERF401 Use `list.extend` to create a transformed list
- scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use `list.extend` to create a transformed list
+ scripts/in_container/update_quarantined_test_status.py:81:13: PERF401 Use a list comprehension to create a transformed list
- tests/jobs/test_scheduler_job.py:759:13: PERF401 Use a list comprehension to create a transformed list

apache/superset (+4 -10 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- scripts/benchmark_migration.py:128:21: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/base.py:107:9: PERF401 Use `list.extend` to create a transformed list
+ superset/db_engine_specs/base.py:107:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:245:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:245:9: PERF401 Use a list comprehension to create a transformed list
+ superset/db_engine_specs/lib.py:251:9: PERF401 Use `list.extend` to create a transformed list
- superset/db_engine_specs/lib.py:251:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:261:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:281:9: PERF401 Use a list comprehension to create a transformed list
- superset/db_engine_specs/lib.py:293:9: PERF401 Use a list comprehension to create a transformed list
- superset/tasks/cache.py:208:13: PERF401 Use a list comprehension to create a transformed list
- tests/integration_tests/annotation_layers/fixtures.py:85:9: PERF401 Use a list comprehension to create a transformed list
+ tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use `list.extend` to create a transformed list
- tests/integration_tests/security/migrate_roles_tests.py:50:13: PERF401 Use a list comprehension to create a transformed list

bokeh/bokeh (+1 -3 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview --select ALL

- src/bokeh/plotting/_figure.py:479:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/plotting/_figure.py:485:17: PERF401 Use `list.extend` to create a transformed list
- src/bokeh/server/contexts.py:310:17: PERF401 Use `list.extend` to create a transformed list
+ src/bokeh/server/contexts.py:310:17: PERF401 Use a list comprehension to create a transformed list

latchbio/latch (+3 -5 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

- src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use `list.extend` to create a transformed list
+ src/latch/ldata/_transfer/upload.py:163:25: PERF401 Use a list comprehension to create a transformed list
- src/latch/registry/utils.py:70:13: PERF401 Use `list.extend` to create a transformed list
+ src/latch/registry/utils.py:70:13: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/services/cp/utils.py:54:9: PERF401 Use `list.extend` to create a transformed list
+ src/latch_cli/services/cp/utils.py:54:9: PERF401 Use a list comprehension to create a transformed list
- src/latch_cli/snakemake/workflow.py:158:21: PERF401 Use `list.extend` to create a transformed list
- src/latch_cli/snakemake/workflow.py:978:25: PERF401 Use a list comprehension to create a transformed list

indico/indico (+3 -0 violations, +0 -0 fixes)

ruff check --no-cache --exit-zero --ignore RUF9 --output-format concise --preview

+ indico/modules/events/registration/lists.py:116:93: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)
+ indico/modules/rb/models/rooms.py:513:91: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)
+ indico/modules/rb/operations/rooms.py:60:87: RUF100 [*] Unused `noqa` directive (unused: `PERF401`)

Changes by rule (2 rules affected)

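| code    | total | + violation | - violation | + fix | - fix |
| ------- | ----- | ----------- | ----------- | ----- | ----- |
| PERF401 | 43    | 14          | 29          | 0     | 0     |
| RUF100  | 3     | 3           | 0           | 0     | 0     |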

@charliermarsh added the labels "bug" (Something isn't working) and "fixes" (Related to suggested fixes for violations) on Nov 16, 2024
@Skylion007
Contributor

Skylion007 commented Nov 17, 2024

@w0nder1ng Another fun edge case that PR #14369 doesn't currently fix. Scoping rules are different between list comprehensions and for loops:

another example from pytorch

     if kwargs is None:
         kwargs = {}
-    impl_args = []
-    for a in args:
-        impl_args.append(_helper(a, map_fn))
+    impl_args = [_helper(a, map_fn) for a in args]
     impl_kwargs = {}
     for k in kwargs.keys():
         impl_kwargs[k] = _helper(a, map_fn)

The a on the line impl_kwargs[k] = _helper(a, map_fn) is now undefined, since the loop was converted to a list comprehension. Ruff immediately flags this with an F821 error, and I tested and confirmed that the temporary variable a does not leave the list comprehension's scope, while it does leave the for loop's.

@w0nder1ng If you want a good test bed, just run the autofix on PyTorch (or another large codebase) and see after applying the fixes if any other ruff rule violations are immediately detected.
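
The scoping difference described above can be reproduced in a few lines. This is a standalone sketch, not code from PyTorch; the function names are made up for illustration.

```python
def loop_version(args):
    impl_args = []
    for a in args:
        impl_args.append(a * 2)
    # The for-loop target is still bound here (to the last element).
    return impl_args, a


def comprehension_version(args):
    impl_args = [a * 2 for a in args]
    # The comprehension has its own scope, so `a` was never bound in
    # this function and looking it up raises NameError.
    try:
        leaked = a  # noqa: F821
    except NameError:
        leaked = None
    return impl_args, leaked
```

Both functions build the same list, but only the loop version can still read `a` afterwards, which is why code that relies on the leaked loop variable breaks after the rewrite.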

@Skylion007
Contributor

Skylion007 commented Nov 17, 2024

Another really minor nit: the fix can duplicate comments.

-        for dtype in ["f16", "bf16"]:
-            kernels.append(
-                cls(
-                    aligned=True,
-                    dtype=dtype,
-                    sm_range=(80, SM[SM.index(80) + 1]),
-                    apply_dropout=False,
-                    preload_mmas=True,
-                    block_i=128,
-                    block_j=64,
-                    max_k=96,
-                    # Sm80 has a faster kernel for this case
-                    dispatch_cond="cc == 86 || cc == 89",
-                )
+        # Sm80 has a faster kernel for this case
+        kernels.extend(
+            cls(
+                aligned=True,
+                dtype=dtype,
+                sm_range=(80, SM[SM.index(80) + 1]),
+                apply_dropout=False,
+                preload_mmas=True,
+                block_i=128,
+                block_j=64,
+                max_k=96,
+                # Sm80 has a faster kernel for this case
+                dispatch_cond="cc == 86 || cc == 89",
             )
+            for dtype in ["f16", "bf16"]
+        )

See here.
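
One rough way to catch this kind of regression when testing a fix is to compare the comments in the source before and after the rewrite. The sketch below uses Python's standard `tokenize` module; `comment_texts` is a hypothetical helper, and `make` is a stand-in for the `cls(...)` call in the diff above.

```python
import io
import tokenize


def comment_texts(source: str) -> list[str]:
    """Collect the text of every comment token in `source`."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]


BEFORE = """\
for dtype in ["f16", "bf16"]:
    kernels.append(
        make(
            # Sm80 has a faster kernel for this case
            dispatch_cond="cc == 86 || cc == 89",
        )
    )
"""

AFTER = """\
# Sm80 has a faster kernel for this case
kernels.extend(
    make(
        # Sm80 has a faster kernel for this case
        dispatch_cond="cc == 86 || cc == 89",
    )
    for dtype in ["f16", "bf16"]
)
"""

# A fix that preserves comments should leave this multiset unchanged.
duplicated = sorted(comment_texts(BEFORE)) != sorted(comment_texts(AFTER))
```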

@w0nder1ng
Contributor Author

Another fun edge case that PR #14369 doesn't currently fix

I think in this case, the lint shouldn't apply at all. If applying the fix breaks the code, then making the same change by hand would break it too. The rule should probably check that all references to the loop variable are inside the for loop before reporting the lint.

I'll also try to fix the duplicate comments. I'm starting to see why this didn't have a fix before :)

@w0nder1ng
Contributor Author

The comment duplication happened when the comment was inside the append body. It should be fixed now.

// ```
let for_loop_target = checker
.semantic()
.lookup_symbol(id.as_str())
Contributor Author

For some reason, resolve_name returns None for the for-loop target. Also, references to it outside of the for-loop are not included in its list of references; e.g.

def f():
    result = []
    for val in range(5):
        result.append(val * 2)
    print(val) # this reference is not included in the list of references

Member

@MichaReiser MichaReiser Nov 18, 2024

I think the problem here is that we build the semantic model lazily as we traverse the AST for checking, and the traversal hasn't reached the code after the for loop yet.

Member

@charliermarsh any suggestions on how to best find all usages of target?

@MichaReiser
Member

Could you take a look at why https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L128 is no longer reported? I might have missed something obvious, but it isn't clear to me why the diagnostic isn't reported anymore.

@w0nder1ng
Contributor Author

w0nder1ng commented Nov 18, 2024

I suspect the regressions occur because the code currently calls lookup_symbol where it should use resolve_name. In the example you gave, I think it's finding a different foreign_key binding on line 113 and concluding that the symbol is used outside of the for loop. Once the binding has the right scope, the only regressions should be ones where the lint shouldn't have applied.

@Skylion007
Contributor

Could you take a look at why https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L128 is no longer reported? I might have missed something obvious, but it isn't clear to me why the diagnostic isn't reported anymore.

Because the lambda arg here https://github.com/apache/superset/blob/e528cb48c44543c14c1ac9a93528b147bcaecfde/scripts/benchmark_migration.py#L131C28-L131C33 is shadowing the variable "model" in the for loop. In this specific case, it's probably okay since it's a lambda arg, but in general the fix shouldn't be applied there. I suppose only reads of the variable (rvalues) outside of the for loop are problematic, not writes (lvalues), since all a write does is immediately overwrite the value.
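
A simplified stand-in for that situation (not the actual superset code; the function names are invented) shows why a lambda parameter with the same name is harmless: it is a fresh binding that shadows the loop variable rather than reading it.

```python
def names_with_loop(models):
    names = []
    for model in models:
        names.append(model.upper())
    # `model` here is the lambda's own parameter; it shadows the loop
    # variable instead of reading it.
    return sorted(names, key=lambda model: len(model))


def names_with_comprehension(models):
    names = [model.upper() for model in models]
    # Same lambda as above: it never depended on the loop target.
    return sorted(names, key=lambda model: len(model))
```

Both versions behave the same, so a check that only looks for "another binding with the same name" would reject a rewrite that is actually safe here.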

@Skylion007
Contributor

Okay hopefully last nit:

     reduction_axes: List[int] = []
-    for i in range(input_rank):
-        if i != axis:
-            reduction_axes.append(i)
+    reduction_axes.extend(i for i in range(input_rank) if i != axis)

The fix doesn't seem to hoist into a list comprehension if there is a type hint on the original [] instantiation. Is this intentional?

@w0nder1ng
Contributor Author

I'll take a look

@w0nder1ng
Contributor Author

w0nder1ng commented Nov 18, 2024

Annotated assigns have a different statement type than normal assigns, and I wasn't handling it. The fix should work on type-annotated lists now.
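
Python's own `ast` module makes the same distinction, which gives a quick way to see it. The sketch below is only an analogy to what the rule has to handle in Ruff's Rust AST; `is_empty_list_binding` is a hypothetical helper.

```python
import ast

plain = ast.parse("result = []").body[0]
annotated = ast.parse("result: list[int] = []").body[0]


def is_empty_list_binding(stmt: ast.stmt) -> bool:
    """Return True for `x = []` and `x: T = []`, handling both node types."""
    if isinstance(stmt, (ast.Assign, ast.AnnAssign)):
        # For AnnAssign, `value` is None when there is only an annotation.
        value = stmt.value
        return isinstance(value, ast.List) and not value.elts
    return False


kinds = (type(plain).__name__, type(annotated).__name__)
```

A check that only matches the plain assignment node would silently skip every annotated list, which matches the behavior reported above.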

@Skylion007
Contributor

@w0nder1ng Another fun edge case that PR #14369 doesn't currently fix. Scoping rules are different between list comprehensions and for loops:

another example from pytorch

     if kwargs is None:
         kwargs = {}
-    impl_args = []
-    for a in args:
-        impl_args.append(_helper(a, map_fn))
+    impl_args = [_helper(a, map_fn) for a in args]
     impl_kwargs = {}
     for k in kwargs.keys():
         impl_kwargs[k] = _helper(a, map_fn)

The a on the line impl_kwargs[k] = _helper(a, map_fn) is now undefined, since the loop was converted to a list comprehension. Ruff immediately flags this with an F821 error, and I tested and confirmed that the temporary variable a does not leave the list comprehension's scope, while it does leave the for loop's.

@w0nder1ng If you want a good test bed, just run the autofix on PyTorch (or another large codebase) and see after applying the fixes if any other ruff rule violations are immediately detected.

This one is still occurring sadly. Maybe because it references another loop lol?

@Skylion007
Contributor

Skylion007 commented Nov 19, 2024

Despite this minor false positive, looks like all the other fixes worked well on the PyTorch codebase (for the torch/torchgen folders) pytorch/pytorch#140980. 👍

@w0nder1ng
Contributor Author

I'm still stuck on two things.

  1. How can I determine whether the binding statement has another statement on the line?

    result = []; test = [] # deleting this whole line is wrong, but only deleting the binding statement also is wrong
    # since it would leave the comment if there were no other statement
    for ...

  2. How can I get the semantic model to find references after the for loop?

    result = []
    for i in range(10):
      result.append(i+1)
    print(i) # "fixing" this for loop is invalid because something relies on the loop variable, but the semantic model doesn't see this reference yet
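
For the second question, the check being asked for can be sketched with Python's `ast` module, which parses the whole file up front. This is deliberately naive and is not how Ruff's lazily built semantic model works; `loop_target_read_after_loop` is a hypothetical helper that ignores shadowing, nested scopes, and rebinding.

```python
import ast


def loop_target_read_after_loop(source: str) -> bool:
    """Rough check: is the first for-loop's target name read on a later line?"""
    tree = ast.parse(source)
    loop = next((n for n in ast.walk(tree) if isinstance(n, ast.For)), None)
    if loop is None or not isinstance(loop.target, ast.Name):
        return False
    name = loop.target.id
    return any(
        isinstance(n, ast.Name)
        and n.id == name
        and isinstance(n.ctx, ast.Load)
        and n.lineno > loop.end_lineno
        for n in ast.walk(tree)
    )


UNSAFE = """\
result = []
for i in range(10):
    result.append(i + 1)
print(i)
"""

SAFE = """\
result = []
for i in range(10):
    result.append(i + 1)
print(result)
"""
```

Here `loop_target_read_after_loop(UNSAFE)` is true and `loop_target_read_after_loop(SAFE)` is false. The point is only that the answer needs information from after the loop, which a single forward pass has not collected yet when it visits the loop.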

Labels
bug Something isn't working fixes Related to suggested fixes for violations
Development

Successfully merging this pull request may close these issues.

PERF401 new preview fixes invalidly hoists extend to list compre
4 participants