Improve performance in environments with long search paths #17948

hauntsaninja · 2024-10-15T05:53:22Z

In my work environment, we editably install most Python packages. This leads to long search paths, e.g. 200 entries is common. I think it should be possible to significantly improve mypy's performance in this case.

My benchmark workload is mypy -c "import torch" on a mypyc-compiled mypy with compile level 3.

I'll run it in the following environments:

clean

rm -rf clean
python -m venv clean
uv pip install torch --python clean/bin/python

long

rm -rf long
python -m venv long
uv pip install torch --python long/bin/python
for i in $(seq 1 200); do
    dir=$(pwd)/repo/$i
    mkdir -p $dir
    echo $dir >> $(long/bin/python -c "import site; print(site.getsitepackages()[0])")/repo.pth
done

openai
This is my main dev environment. I'll see if I can make an artificial environment that matches the performance characteristics of this more closely (this is pretty easy, just need to install a bunch of third party libraries).

bd9200b is my baseline commit

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
  Time (mean ± σ):     19.372 s ±  0.179 s    [User: 17.018 s, System: 2.285 s]
  Range (min … max):   19.223 s … 19.570 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental' 
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     34.571 s ±  0.085 s    [User: 31.770 s, System: 2.762 s]
  Range (min … max):   34.499 s … 34.664 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_bd9200bda/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     51.342 s ±  0.472 s    [User: 46.853 s, System: 4.423 s]
  Range (min … max):   50.840 s … 51.776 s    3 runs

#17920 has already provided a big win here

88ae62b was the commit I measured

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=clean/bin/python --no-incremental
  Time (mean ± σ):     19.094 s ±  0.195 s    [User: 16.782 s, System: 2.243 s]
  Range (min … max):   18.935 s … 19.312 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental' 
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental
  Time (mean ± σ):     24.838 s ±  0.237 s    [User: 22.038 s, System: 2.750 s]
  Range (min … max):   24.598 s … 25.073 s    3 runs

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     34.161 s ±  0.163 s    [User: 29.818 s, System: 4.289 s]
  Range (min … max):   34.013 s … 34.336 s    3 runs

You can see that mypy in my environment is still 1.8x slower than it could be (and 1.3x slower in the reproducible toy environment).

Some ideas for things to experiment with:

We could make fscache cleverer, seeing if we can scandir on parents to get cheaper is_dir and is_file. Especially when querying the existence of entries that are on search paths.
We could avoid some of the case sensitive handling if we know our file system is case sensitive
We could vendor some of os.path into mypy, so that mypyc can compile these functions
- Starting with Let mypyc optimise os.path.join #17949
- Make is_sub_path faster #17962 (although even just not using pathlib would probably be great)
Use the fast path in modulefinder in more places
- Use fast path in modulefinder more often #17950
Misc
- Speed up stubs suggestions #17965

The text was updated successfully, but these errors were encountered:

See python#17948 There's one call site which has varargs that I leave as os.path.join, it doesn't show up on my profile. I do see the `endswith` on the profile, we could try `path[-1] == '/'` instead In my work environment, this is about a 10% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 30.842 s ± 0.119 s [User: 26.383 s, System: 4.396 s] Range (min … max): 30.706 s … 30.927 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 34.161 s ± 0.163 s [User: 29.818 s, System: 4.289 s] Range (min … max): 34.013 s … 34.336 s 3 runs ``` In the toy "long" environment mentioned in the issue, this is about a 7% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python Time (mean ± σ): 23.177 s ± 0.317 s [User: 20.265 s, System: 2.873 s] Range (min … max): 22.815 s … 23.407 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental Time (mean ± σ): 24.838 s ± 0.237 s [User: 22.038 s, System: 2.750 s] Range (min … max): 24.598 s … 25.073 s 3 runs ```

See #17948 There's one call site which has varargs that I leave as os.path.join, it doesn't show up on my profile. I do see the `endswith` on the profile, we could try `path[-1] == '/'` instead (could save a few dozen milliseconds) In my work environment, this is about a 10% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 30.842 s ± 0.119 s [User: 26.383 s, System: 4.396 s] Range (min … max): 30.706 s … 30.927 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 34.161 s ± 0.163 s [User: 29.818 s, System: 4.289 s] Range (min … max): 34.013 s … 34.336 s 3 runs ``` In the toy "long" environment mentioned in the issue, this is about a 7% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python Time (mean ± σ): 23.177 s ± 0.317 s [User: 20.265 s, System: 2.873 s] Range (min … max): 22.815 s … 23.407 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental Time (mean ± σ): 24.838 s ± 0.237 s [User: 22.038 s, System: 2.750 s] Range (min … max): 24.598 s … 25.073 s 3 runs ``` In the "clean" environment, this is a 1% speedup, but below the noise floor.

JukkaL · 2024-10-15T09:47:45Z

What about filtering the module search path based on the first component(s) of the target module name? We could create a dict that maps a module name prefix <prefix> to the search path filtered based on the existence of a <prefix> directory, <prefix>.py or <prefix>.pyi in the search path entry.

For example, if torch is only present in a single search path entry, the search path for the torch prefix would only contain this single item. If we are resolving, say, torch.foo, we'd first look up the filtered search path based on the torch prefix. This would usually contain only a single item, so performance should be similar to the easy/clean case, even if there are hundreds of search path entries.

If many search path entries have the same directory/namespace package (e.g. common/), we could also filter by a length-two prefix. So we'd have module search path for common.a mapping to search path entries that contain common/a/, common/a.py or common/a.pyi. Creating this lookup table could be slightly expensive, so we'd probably want to only build the second-level mapping when there are more than N matching search path entries for some top-level package, and only build it for these packages.

To determine the effective search path for module, we'd look up prefixes of length 2 and 1 (e.g. pkg.a and pkg for module pkg.a.b) to find a filtered search path. Building the top-level lookup table should be pretty quick, so we can probably always use it. We'd use the second-level lookup table when it exists, and otherwise fallback to the first-level table.

See #17948 This is about 1.06x faster on `mypy -c 'import torch'` (in both the clean and openai environments) - 19.094 -> 17.896 - 34.161 -> 32.214 ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable clean/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable clean/bin/python Time (mean ± σ): 17.896 s ± 0.130 s [User: 16.472 s, System: 1.408 s] Range (min … max): 17.757 s … 18.014 s 3 runs λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 32.214 s ± 0.106 s [User: 29.468 s, System: 2.722 s] Range (min … max): 32.098 s … 32.305 s 3 runs ```

hauntsaninja · 2024-10-15T19:02:25Z

Recording new baseline numbers here for eb816b0 (after a few of ^ PRs have been merged):

λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     18.240 s ±  0.046 s    [User: 16.671 s, System: 1.552 s]
  Range (min … max):   18.201 s … 18.291 s    3 runs
 
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python' 
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     21.581 s ±  0.115 s    [User: 19.600 s, System: 1.965 s]
  Range (min … max):   21.496 s … 21.712 s    3 runs
 
λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_eb816b05c/venv/bin/mypy  -c "import torch" --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     28.439 s ±  0.270 s    [User: 25.591 s, System: 2.829 s]
  Range (min … max):   28.197 s … 28.731 s    3 runs

Compared to bd9200b we are:

1.06x faster on clean
1.6x faster on long
1.8x faster on openai
1.6x faster on openai incremental (9.376 -> 5.847)

See python#17948 Haven't run the benchmark yet, but profile indicates that this could save 0.5s on both incremental and non-incremental builds in environments with long search path

See #17948 This is starting to show up on profiles - 1.01x faster on clean (below noise) - 1.02x faster on long - 1.02x faster on openai - 1.01x faster on openai incremental I had a dumb bug that was preventing the optimisation for a while, I'll see if I can make it even faster. Currently it's a small improvement We could also get rid of the legacy stuff in mypy 2.0

See #17948 - 1.01x faster on clean - 1.06x faster on long - 1.04x faster on openai - 1.26x faster on openai incremental

hauntsaninja · 2024-10-17T02:35:46Z

New numbers for c201a18 (with orjson installed):

hyperfine -w 1 -M 3 /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
Benchmark 1: /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     17.205 s ±  0.057 s    [User: 15.689 s, System: 1.500 s]
  Range (min … max):   17.153 s … 17.265 s    3 runs
 
hyperfine -w 1 -M 3 /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable long/bin/python
Benchmark 1: /tmp/mypy_primer/timer_mypy_c201a187b/venv/bin/mypy -c 'import torch' --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     19.361 s ±  0.373 s    [User: 17.489 s, System: 1.857 s]
  Range (min … max):   19.102 s … 19.789 s    3 runs

The openai environment I was using previously got mutated, so not posting raw numbers for that. In the following, I re-ran the bd9200b baseline in a similar environment to get fair openai comparisons. Compared to bd9200b we are:

1.13x faster on clean
1.18x faster on clean incremental (1.06x faster without orjson)
1.79x faster on long
1.92 faster on similar openai
2.19x faster on similar openai incremental (2.05x faster without orjson)

JukkaL · 2024-10-17T16:19:38Z

@hauntsaninja Are you interested in looking into filtering the search path (see my comment above)? If not, I might have a look at it at some point.

hauntsaninja · 2024-10-17T18:23:24Z

Yup, I'm interested in looking into it.
Worth noting that the difference between "clean" and "long" is down to 1.13x (from 1.8x), so I'm prioritising things that will help "clean" and "openai", rather than specifically "long". The difference between "openai" and "long" seems to just be many more entries in site-packages (but equal number in sys.path). This is still a little mysterious to me, maybe torch has some hidden dependencies or something.

See #17948 There's one call site which has varargs that I leave as os.path.join, it doesn't show up on my profile. I do see the `endswith` on the profile, we could try `path[-1] == '/'` instead (could save a few dozen milliseconds) In my work environment, this is about a 10% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 30.842 s ± 0.119 s [User: 26.383 s, System: 4.396 s] Range (min … max): 30.706 s … 30.927 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 34.161 s ± 0.163 s [User: 29.818 s, System: 4.289 s] Range (min … max): 34.013 s … 34.336 s 3 runs ``` In the toy "long" environment mentioned in the issue, this is about a 7% speedup: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_6eddd3ab1/venv/bin/mypy -c "import torch" --no-incremental --python-executable long/bin/python Time (mean ± σ): 23.177 s ± 0.317 s [User: 20.265 s, System: 2.873 s] Range (min … max): 22.815 s … 23.407 s 3 runs ``` Compared to: ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental' Benchmark 1: /tmp/mypy_primer/timer_mypy_88ae62b4a/venv/bin/mypy -c "import torch" --python-executable=long/bin/python --no-incremental Time (mean ± σ): 24.838 s ± 0.237 s [User: 22.038 s, System: 2.750 s] Range (min … max): 24.598 s … 25.073 s 3 runs ``` In the "clean" environment, this is a 1% speedup, but below the noise floor.

See #17948 This is about 1.06x faster on `mypy -c 'import torch'` (in both the clean and openai environments) - 19.094 -> 17.896 - 34.161 -> 32.214 ``` λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable clean/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable clean/bin/python Time (mean ± σ): 17.896 s ± 0.130 s [User: 16.472 s, System: 1.408 s] Range (min … max): 17.757 s … 18.014 s 3 runs λ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python' Benchmark 1: /tmp/mypy_primer/timer_mypy_36738b392/venv/bin/mypy -c "import torch" --no-incremental --python-executable /opt/oai/bin/python Time (mean ± σ): 32.214 s ± 0.106 s [User: 29.468 s, System: 2.722 s] Range (min … max): 32.098 s … 32.305 s 3 runs ```

See #17948 This is starting to show up on profiles - 1.01x faster on clean (below noise) - 1.02x faster on long - 1.02x faster on openai - 1.01x faster on openai incremental I had a dumb bug that was preventing the optimisation for a while, I'll see if I can make it even faster. Currently it's a small improvement We could also get rid of the legacy stuff in mypy 2.0

See #17948 - 1.01x faster on clean - 1.06x faster on long - 1.04x faster on openai - 1.26x faster on openai incremental

hauntsaninja · 2024-10-25T07:09:26Z

Posting more numbers:

+ /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy --version
mypy 1.14.0+dev.3420ef1554c40b433a638e31cb2109e591e85008 (compiled: yes)
+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable clean/bin/python
  Time (mean ± σ):     19.671 s ±  0.155 s    [User: 18.219 s, System: 1.439 s]
  Range (min … max):   19.551 s … 19.845 s    3 runs
 
+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable long/bin/python
  Time (mean ± σ):     21.881 s ±  0.089 s    [User: 20.061 s, System: 1.807 s]
  Range (min … max):   21.784 s … 21.957 s    3 runs
 
+ hyperfine -w 1 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --no-incremental --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --no-incremental --python-executable /opt/oai/bin/python
  Time (mean ± σ):     28.509 s ±  0.212 s    [User: 26.081 s, System: 2.409 s]
  Range (min … max):   28.364 s … 28.752 s    3 runs
 
+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable clean/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable clean/bin/python
  Time (mean ± σ):      2.085 s ±  0.005 s    [User: 1.718 s, System: 0.366 s]
  Range (min … max):    2.081 s …  2.091 s    3 runs
 
+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable long/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable long/bin/python
  Time (mean ± σ):      3.104 s ±  0.006 s    [User: 2.288 s, System: 0.816 s]
  Range (min … max):    3.098 s …  3.110 s    3 runs
 
+ hyperfine -w 2 -M 3 '/tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c '\''import torch'\'' --python-executable /opt/oai/bin/python'
Benchmark 1: /tmp/mypy_primer/timer_mypy_3420ef155/venv/bin/python -m mypy -c 'import torch' --python-executable /opt/oai/bin/python
  Time (mean ± σ):      3.928 s ±  0.006 s    [User: 3.067 s, System: 0.861 s]
  Range (min … max):    3.922 s …  3.932 s    3 runs

set -x
export MYPY_CACHE_DIR=mypycache/$COMMIT
mkdir -p "$MYPY_CACHE_DIR"
mkdir benchjson
export PYTHON="/tmp/mypy_primer/timer_mypy_$COMMIT/venv/bin/python"
$PYTHON -m pip install orjson
$PYTHON -m mypy --version
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_clean.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable clean/bin/python"
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_long.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable long/bin/python"
hyperfine -w 1 -M 5 --export-json "benchjson/${COMMIT}_oai.json" "$PYTHON -m mypy -c 'import torch' --no-incremental --python-executable /opt/oai/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_clean_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable clean/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_long_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable long/bin/python"
hyperfine -w 2 -M 5 --export-json "benchjson/${COMMIT}_oai_inc.json" "$PYTHON -m mypy -c 'import torch' --python-executable /opt/oai/bin/python"

hauntsaninja · 2024-10-26T02:15:05Z

Okay, with #18038 and the follow up #18045, we're down to within noise between the clean and long environments

See timings here: #18045 (comment)

So I think we can call this complete! :-)

hauntsaninja added the feature label Oct 15, 2024

This was referenced Oct 15, 2024

Let mypyc optimise os.path.join #17949

Merged

mypy is slow when type checking torch #17919

Open

hauntsaninja added the performance label Oct 15, 2024

hauntsaninja mentioned this issue Oct 15, 2024

Use fast path in modulefinder more often #17950

Merged

This was referenced Oct 15, 2024

Make is_sub_path faster #17962

Merged

Speed up stubs suggestions #17965

Merged

hauntsaninja added a commit that referenced this issue Oct 17, 2024

Make is_sub_path faster (#17962)

c201a18

See #17948 - 1.01x faster on clean - 1.06x faster on long - 1.04x faster on openai - 1.26x faster on openai incremental

hauntsaninja added a commit that referenced this issue Oct 20, 2024

Make is_sub_path faster (#17962)

854ad18

See #17948 - 1.01x faster on clean - 1.06x faster on long - 1.04x faster on openai - 1.26x faster on openai incremental

hauntsaninja mentioned this issue Oct 25, 2024

Filter to possible package paths before trying to resolve a module. #18038

Merged

hauntsaninja closed this as completed Oct 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Improve performance in environments with long search paths #17948

Improve performance in environments with long search paths #17948

hauntsaninja commented Oct 15, 2024 •

edited

Loading

JukkaL commented Oct 15, 2024 •

edited

Loading

hauntsaninja commented Oct 15, 2024 •

edited

Loading

hauntsaninja commented Oct 17, 2024 •

edited

Loading

JukkaL commented Oct 17, 2024

hauntsaninja commented Oct 17, 2024 •

edited

Loading

hauntsaninja commented Oct 25, 2024 •

edited

Loading

hauntsaninja commented Oct 26, 2024

Improve performance in environments with long search paths #17948

Improve performance in environments with long search paths #17948

Comments

hauntsaninja commented Oct 15, 2024 • edited Loading

JukkaL commented Oct 15, 2024 • edited Loading

hauntsaninja commented Oct 15, 2024 • edited Loading

hauntsaninja commented Oct 17, 2024 • edited Loading

JukkaL commented Oct 17, 2024

hauntsaninja commented Oct 17, 2024 • edited Loading

hauntsaninja commented Oct 25, 2024 • edited Loading

hauntsaninja commented Oct 26, 2024

hauntsaninja commented Oct 15, 2024 •

edited

Loading

JukkaL commented Oct 15, 2024 •

edited

Loading

hauntsaninja commented Oct 15, 2024 •

edited

Loading

hauntsaninja commented Oct 17, 2024 •

edited

Loading

hauntsaninja commented Oct 17, 2024 •

edited

Loading

hauntsaninja commented Oct 25, 2024 •

edited

Loading