Improve CLI speed with lazy imports #1319

Merged (79 commits) on Nov 16, 2024

@jgbradley1 (Collaborator) commented Oct 24, 2024

Description

This is an optional PR to consider. The CLI is slow to start because many imports are executed in the __init__.py files throughout the library. The same imports also slow down the first import of graphrag.

Proposed Changes

Following some tips from this article, I profiled the startup time for the entire library with a focus on improving the CLI startup time using the following commands:

python -X importtime -c 'from graphrag.cli.main import *' 2> graphrag-imports.log
tuna graphrag-imports.log
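The log written by the command above can also be summarized without tuna. The following is a minimal sketch, assuming the usual `import time: self [us] | cumulative | imported package` line format that `-X importtime` writes to stderr. The names `ImportTiming`, `parse_importtime`, `slowest`, and the sample log values are hypothetical and are not part of this PR.

```python
from typing import NamedTuple


class ImportTiming(NamedTuple):
    name: str
    self_us: int        # time spent in the module itself, in microseconds
    cumulative_us: int  # time including nested imports, in microseconds


PREFIX = "import time:"


def parse_importtime(log_text: str) -> list[ImportTiming]:
    """Parse the stderr output of `python -X importtime` into records."""
    entries = []
    for line in log_text.splitlines():
        if not line.startswith(PREFIX):
            continue
        fields = line[len(PREFIX):].split("|")
        if len(fields) != 3:
            continue
        try:
            self_us = int(fields[0])
            cumulative_us = int(fields[1])
        except ValueError:
            # The header line ("self [us] | cumulative | ...") lands here.
            continue
        entries.append(ImportTiming(fields[2].strip(), self_us, cumulative_us))
    return entries


def slowest(entries: list[ImportTiming], n: int = 10) -> list[ImportTiming]:
    """Return the n entries with the largest cumulative import time."""
    return sorted(entries, key=lambda e: e.cumulative_us, reverse=True)[:n]


# Made-up sample in the same shape as a real log (values are illustrative).
SAMPLE_LOG = """\
import time: self [us] | cumulative | imported package
import time:       150 |        150 |   helper_module
import time:      1200 |       1350 | heavy_package
"""

entries = parse_importtime(SAMPLE_LOG)
top = slowest(entries, n=1)
```

On a real run, the text of graphrag-imports.log would be passed to `parse_importtime` in place of `SAMPLE_LOG`.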
Changes made:
  • Import statements that pull in very large or slow-to-load libraries were moved inside the functions that rely on them.
  • Nearly all __init__.py files across the package are now empty.
  • Relative imports were converted to absolute imports to avoid circular import problems.
  • Comments were added wherever an import statement appears in an unusual place, explaining why it was moved there.
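The first item in the list above, moving an import inside the function that needs it, can be sketched as follows. This is an illustration only, not code from graphrag: `export_table` is a hypothetical command helper, and the standard-library `csv` and `io` modules stand in for a genuinely heavy dependency.

```python
def export_table(rows: list[list[object]]) -> str:
    """Render rows as CSV text.

    The imports live inside the function on purpose: they are executed the
    first time this function is called, not when the enclosing module is
    imported, so commands that never call it do not pay the import cost.
    """
    # Deferred imports (stand-ins for a large third-party library).
    import csv
    import io

    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


csv_text = export_table([["name", "count"], ["entity", 3]])
```

After the first call, Python finds the module in `sys.modules`, so repeated calls only pay for a dictionary lookup. The trade-off is that an import error in the deferred dependency appears at call time rather than at startup.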

The following screenshot shows the CLI startup time when the package is first installed.
[Screenshot: graphrag-initial-startup-time]

After the library has been loaded once (Python caches the compiled bytecode to speed up later imports), the load time drops further, to 0.3 seconds.

[Screenshot: graphrag-cached-startup-time]
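The bytecode caching behind the faster second load can be observed directly. The sketch below uses the standard-library `py_compile` module to compile a throwaway module and reports where the cached `.pyc` file was written. The helper `compile_to_cache` and the file name `demo_module.py` are hypothetical and unrelated to graphrag.

```python
import py_compile
import tempfile
from pathlib import Path


def compile_to_cache(source: str, workdir: Path) -> Path:
    """Write a small module to workdir and byte-compile it.

    Returns the path of the cached bytecode file, which by default is
    placed in a __pycache__ directory next to the source file.
    """
    module = workdir / "demo_module.py"
    module.write_text(source)
    return Path(py_compile.compile(str(module), doraise=True))


with tempfile.TemporaryDirectory() as tmp:
    pyc_path = compile_to_cache("VALUE = 1\n", Path(tmp))
    cache_created = pyc_path.exists()
    cache_suffix = pyc_path.suffix
```

A normal `import` performs the same compilation implicitly the first time a module is loaded, which is why the first run after installation is slower than later ones.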

Tab completion with the CLI is also now very responsive.

Checklist

  • I have tested these changes locally.
  • I have reviewed the code changes.
  • I have updated the documentation (if necessary).
  • I have added appropriate unit tests (if applicable).

Additional Notes

This PR closes #1299 as well.

A small patch for Typer path completion was also added to improve the CLI experience.

@jgbradley1 jgbradley1 merged commit 22a57d1 into main Nov 16, 2024
15 checks passed
@jgbradley1 jgbradley1 deleted the joshbradley/import-speedup branch November 16, 2024 00:41
3 participants