Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor codebase to use AST #59

Open
grayfox88 opened this issue Sep 12, 2023 · 3 comments
Open

Refactor codebase to use AST #59

grayfox88 opened this issue Sep 12, 2023 · 3 comments

Comments

@grayfox88
Copy link

py2puml uses internally a mix of code inspection and Abstract Syntax Tree parsing. The whole code based should be refactored to use AST parsing only as it is more robust and straightforward.

@lucsorel
Copy link
Owner

I agree, code inspection hits several walls:

  1. in some code bases, I saw some type annotations referring to types that are not imported "at the top of the files", but imported in try-except blocks that are within constructors. When inspecting, py2puml does not pass through these "dynamic " imports and fails to solve the fully-qualified type name (the type prefixed by the module path)
  2. during code inspection, py2puml fails to distinguish the fields of the current class for the fields of the parent class if any. When resolving the fully-qualified type name of a field, its corresponding import would then be in the module of the parent class and not in the current module. Heuristics can be implemented to compare the fields from a class with the fields of the parent classes to avoid duplicate declarations and parent-import-resolution but it also make the code more complicated
  3. inspecting a module makes the python interpreter execute the file. And if side-effects are not wrapped by a classic if __name__ == '__main__': clause, they will be executed

The last point is the one I fear the most because users of py2puml may be executing some code when generating the documentation, causing unexpected consequences.

@lucsorel
Copy link
Owner

lucsorel commented Sep 14, 2023

Some downsides of relying only on AST parsing:

  1. implementing py2puml features will be harder because AST parsing relies on the visitor pattern (see https://docs.python.org/3/library/ast.html#ast.NodeVisitor, https://www.mattlayman.com/blog/2018/decipher-python-ast/), which makes the "visiting" code harder to write, understand and test
  2. fields and methods that are added dynamically to a class (using decorators, for example) will be missing in the generated documentation

@grayfox88
Copy link
Author

Good summary @lucsorel , concerning the downside related to AST being harder to understand, I kind of agree but believe it is a matter of time before one gets used to it. On this matter I would recommend the excellent hands-on documentation Green Tree Snake.

I have been working during the past months on a version of py2puml using solely AST which I will send a pull request for as soon as it gives decent results. It is still on a Proof Of Concept stage and needs some bug fixing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants