-
-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Static Type Annotations for SymPy #4
base: main
Are you sure you want to change the base?
Conversation
Timely. I think this can be adopted (and enforced with mypy) gradually. I would like to get LPython to the point that SymPy could use it, but we are not there yet. Here is what works today: https://github.com/lcompilers/lpython/blob/615e7e7d6baa032ab29b8b5f3999dba0e7e6640b/integration_tests/symbolics_05.py, this gets compiled to almost maximum performance code, using SymEngine as the backend. For now we settled on using just "S" as the type, since it's short, and it will be used a lot, for every variable, etc. If you have ideas or want to collaborate on this, let me know. |
Is this the entirety of the proposal or it is a draft? I ask because it does not seem to address any of the issues that have been brought up about adding typing to SymPy nor does it outline any technical considerations about how this would be done or any policies on exactly what typing will and will not be in SymPy. The proposal even seems to imply removing run time type checks, which will also need much more explanation. There are no references to the many past discussions on the topic. My feedback at this point, is that the proposal needs much more content and specifics. |
Just want to chip in that the use of .pyi-files should be considered as an alternative to |
They should be considered. Perhaps the SymPEP should spell out what that would look like and clarify whether there would be any "policy" about it. I prefer the type annotations to be inline when the types are good but some SymPy functions do not exactly have types that can be expressed succinctly. I think it is worth noting that there are two different things that type hints can be used for:
My impression is that actually autocompletion is the most frequent use of type hints. Basically if SymPy exposes type hints somehow (inline or pyi files) then editors like vscode etc will use that to help users write code that uses SymPy. I estimate that probably the majority of SymPy users probably do use editors that would do this. See also |
SymPEP-XXXX.md
Outdated
|
||
It's important to note that tools like mypy and pyright are capable of inferring types, facilitating the incorporation of static typing without a steep learning curve. By adding a few annotations, code can be made cleaner and the transition to static typing across the entire codebase can be eased. | ||
|
||
For functions, classes, and modules internally used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted. This transition can help eliminate these unnecessary checks and subsequently enhance the overall performance of the SymPy codebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Who deems these checks unnecessary? If we want a function/method to be duck-typed, I believe we should be allowed to do so. Duck-typing has been the design paradigm preached by the Python community since I can remember and it is a feature, not a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we should not drop dynamic type checks because of adding static typing. When not using an editor supporting this (you have no idea how many students edit their code in whatever the system offers), there should still be sensible error messages, not the code crashing later with a seemingly unrelated error message.
If nothing else, this will probably reduce the number of issues opened that boils down to the user doing something wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The checks are necessary in user-facing functions but should not be necessary in much of the internal code if a static checker can verify correctness.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note also that if the type hints are used consistently on user facing functions then an editor like vscode will already warn the user about passing the wrong type into a function before the code even runs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python allows to inspect type annotations in runtime
def square(x: int):
return x * x
square.__annotations__
# {'x': int}
such that we can implement something like @validate
such that
# Function for users that raises on isinstance(x, int)
@validate
def square(x: int):
return x * x
# Function for internal use
def square(x: int):
return x * x
So even if we want to do runtime validation,
It should be better done like above, than repeating isinstance
checks manually,
which often makes the code more complicated with validation code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I totally agree. Just wanted to point out that not all people use editors with type hinting (and/or do not understand how to benefit from them).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not all people use editors with type hinting
Agreed. I make very heavy use of ipython for example which does not support this.
Note though that the sentence in the PEP that we are all commenting on here explicitly says (emphasis added):
For functions, classes, and modules INTERNALLY used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted.
The example here is a good one:
https://github.com/sympy/sympy/blob/b0dcb5af49e7289680d9789d292197675e40490d/sympy/polys/polyclasses.py#L167-L171
In gh-25651 I found bugs by enabling that check. The check is too expensive to be used at runtime though so it was commented out before merge. Almost everything that it checks is something that could be verified by a static type checker but doing that check at runtime means recursively calling isinstance
down through a potentially large data structure. The class in question is purely internal and not something that any ordinary SymPy user would ever see directly.
Note also that just attempting to verify that the codebase does indeed only use those types in this particular function is something that consumed development time. The whole PR gh-25651 would be completely unnecessary if the code in question just used type hints but as it is if I want to verify what the types are than I have to add runtime checks and ensure that the entire CI test suite completes successfully (and then fix the bugs that show up).
SymPEP-XXXX.md
Outdated
|
||
For functions, classes, and modules internally used by SymPy that currently feature unnecessary dynamic type checks, a shift towards static typing should be promoted. This transition can help eliminate these unnecessary checks and subsequently enhance the overall performance of the SymPy codebase. | ||
|
||
Similarly, functions, classes, and modules within SymPy that intentionally avoided runtime type checks for performance reasons should consider embracing static typing. Static typing undoubtedly mitigates performance overhead while providing better bug detection related to type errors. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a potentially compelling statement. If there were some real world metrics that show how much a large codebase like SymPy could be sped up by removing all runtime checks and moving to a linting-based type checking procedure, then that would help convince us of any need to do such a radical change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand though, the speed bottlenecks in SymPy are not due to large numbers of runtime checks but due to inefficient symbolic algorithms.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There can be some discussions about how checking list[list[int]]
in runtime could be avoided in polynomial systems, and can be replaced by static typing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As @sylee957 notes it is not just about speed bottlenecks because in many cases the checking is not done to avoid the slowdowns that it would cause. Then the cost of not checking is bugs, debugging etc rather then any immediate measurable slowness. Static type checking can reduce certain types of bugs without causing runtime slowdown.
@moorepants |
When you come to a first complete draft, I'll review again. Thanks for championing this PEP. |
@sylee957 do you think you could please put in the proposal the discussion of using "S" instead of "Expr" for the type of a symbolic expression? Also, to mention the possibility of using Python compilers that these annotations might allow. However, we need to work together on settling on a type system style, this is some research work to be done, and I am not asking that we do it now. All I am asking is that the door is not closed today to this effort in the future. See my comment above. |
Okay, I try to add that part. |
@sylee957 very good, thanks! I think that summarizes it well. |
No description provided.