-
Notifications
You must be signed in to change notification settings - Fork 36
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for SQLAlchemy models #56
Comments
would you have a snippet of Python code describing a sample domain model involving SQLAlchemy, please? Something like this Pydantic example: https://docs.pydantic.dev/latest/usage/models/#nested-models. That would help contributing to this issue. Second question: would you agree on reviewing the corresponding pull request when it is done? |
@lucsorel Yes, i'll create some tomorrow. Naturally, I will help with the PR and review when done. Are you also willing to help out? I would appreciate it immensely, but please only do if you have enough free time. |
Thank you for your help, I appreciate it 🙏 For now, I am working with other people on #43. When it is done, I will be able to work on this one. I need someone to bootstrap me on SQLAlchemy (the sample domain model), then I'll be able see what to do. I maintain Moreover, there have been some older requests that I should handle too. So no ETA for any feature, I am sorry 🤷 |
@lucsorel Apologies it took a bit longer. You know how these things go. Below, i've created a little snippet with 3 classes. It tries to have both a foreign key relationship in a one-to-many fashion, as well as use inheritance for an abstract base class. from sqlalchemy import Boolean, Column, ForeignKey
from sqlalchemy.ext.declarative import AbstractConcreteBase
from sqlalchemy.orm import declarative_base, relationship
from sqlalchemy.types import Date, DateTime, Float, Integer, String
example_base = declarative_base()
class User(AbstractConcreteBase, example_base):
user_id = Column(Integer, autoincrement=True, primary_key=True)
username = Column(String(100))
salted_pw = Column(String(100))
created_at = Column(DateTime)
last_login = Column(DateTime)
age = Column(Integer)
height = Column(Float)
class Login(example_base):
__tablename__ = "logins"
_id = Column(Integer, primary_key=True, autoincrement=True)
login_timestamp = Column(DateTime)
user_id = Column(Integer, ForeignKey('special_users.user_id'))
class specialUser(User):
__tablename__ = 'special_users'
__mapper_args__ = {"polymorphic_identity": "special_users", "concrete": True}
special_id = Column(Integer, primary_key=True, autoincrement=True)
admin_access = Column(Boolean)
clearance_number = Column(Integer)
logins = relationship('Login', backref='user') |
🙏 thank you for the snippet. I you are familiar with PlantUML code, can you draw a class diagram in the online editor and share the link (make sure that the url to the diagram has been refreshed), please? Honestly, I apprehend writing the parser that will handle such code :/
This is not a simple feature you are asking for 😟 |
I've stumbled upon this project today (looking great, btw), and have the same problem as @Baukebrenninkmeijer. Here are my thoughts, hope they help. SQLAlchemy 1.4 vs 2.0AFAIK, this is the old (1.4) style of declaring models. This is still supported in version 2.0, but examples in the documentation seem to prefer the new style declaration. I'd think 1.4 is still more often seen in the wild, but it might make sense to skip 1.4 style and immediately support the new declarative syntax. The new declarative API works via hinting native types, and is therefore much more amenable to inspection, which should make implementing this a lot easier. That should at least take care of caveats 1 and 2. And RE: 3 my personal opinion is that it would be nice, but it's much better to have a way to automatically generate a diagram without that information, than to have no way at all. For comparison, I've ported the sample to the newer 2.0 syntax. from datetime import datetime
from typing import Optional
from sqlalchemy import ForeignKey, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column, relationship
from sqlalchemy.types import String
class ExampleBase(DeclarativeBase):
pass
class User(ExampleBase):
__abstract__ = True
user_id: Mapped[int] = mapped_column(primary_key=True)
username: Mapped[Optional[int]] = mapped_column(String(100))
salted_pw: Mapped[Optional[int]] = mapped_column(String(100))
created_at: Mapped[Optional[datetime]] = mapped_column()
last_login: Mapped[Optional[datetime]] = mapped_column()
age: Mapped[Optional[int]] = mapped_column()
height: Mapped[Optional[float]] = mapped_column()
class Login(ExampleBase):
__tablename__ = "logins"
_id: Mapped[int] = mapped_column(primary_key=True)
login_timestamp: Mapped[Optional[datetime]] = mapped_column()
user_id: Mapped[Optional[int]] = mapped_column(ForeignKey("special_users.user_id"))
class SpecialUser(User):
__tablename__ = "special_users"
__mapper_args__ = {"concrete": True}
special_id: Mapped[int] = mapped_column(primary_key=True)
admin_access: Mapped[Optional[bool]] = mapped_column()
clearance_number: Mapped[Optional[int]] = mapped_column()
logins = relationship("Login", backref="user")
if __name__ == "__main__":
# check generated SQL with this
engine = create_engine(f"sqlite://", echo=True)
ExampleBase.metadata.create_all(engine) Compared to the old style example, this produces identical SQL code as far as I can see. Class diagramsCaveat emptor: I don't know how py2puml handles imports, as I have never used it before, so I'm using a very naive approach for this. Adjust accordingly. Also, I'm generally somewhat mediocre when it comes to UML, so please excuse if this is not ideal. For example, if there's a need to fully model the imports, I have absolutely no idea how to deal with metaclasses in UML. The very naive, but probably good enough approachAnd the PUML code
@startuml
This doesn't take care of properly typing relationships at all, and doesn't look for the And the PUML code
@startuml
This correctly identifies the relationship, and that I guess implementing the relationship typing would be doable. Getting the association right would be quite hard though. You'd have to remember the Follow imports optionAnother potentially easier solution for this Issue would be to add an option to not follow imports, like mypy, but I don't know how feasible that is, given py2puml's design. Contributing some timeI'd be willing to spend some time to help work on this, but honestly, I don't think I'd be much help implementing anything on the py2puml side of things So if there's some groundwork I could do, or assist in another way to make this easier for you, I'd be happy to help. |
@FynnFreyer Thanks! Yes, this is exactly the elaboration needed, and is extremely useful! In terms of skipping all <2.0 version, I also considered that but given that most codebases will not have migrated and will use an older version, my preference would then be to see if we can support both. With the new 2.0 mapper system, things get easier using typehinting but extracting a type from an SQLAlchemy object was fairly easy before as well, since it had mapping to python types (i.e. The diagram I would have created would be the same as you created above, @FynnFreyer. BTW I copied the dunder tags from what we are using at work - i'm not entirely sure about whether they are required, such as the mapper args. |
I fear you're right. Version two is quite new, and SQLAlchemy quite old, so only supporting 2.0 might not cut it in a lot of cases. I guess the biggest conceptional issue left would be point 5, programatically identifying that this is an SQLAlchemy domain model. A quick and dirty way would be to check whether |
One more thing I just noticed: you'd have to check |
Thank you @Baukebrenninkmeijer and @FynnFreyer for the thorough description of your need and hints to a solution 🙏 However, most of the hints you gave rely on code inspection (actually loading the module - meaning that the Code inspection is a great and powerful feature of Python and it is what I used when I started this tool because it was so convenient. But it is also painful and risky (see #59 (comment): the 3rd bullet mentions that the Some downsides of AST parsing is that it is harder to implement heuristics and It seems to me that extracting the domain model from an SQLAlchemy model should be feasible with AST parsing, but with different heuristics relying on type annotations (SQLAlchemy 2+) or on the expression assigned to the class fields (SQLAlchemy 1.4); what do you think? |
@lucsorel These are 100% fair considerations and I think it should be doable with just the AST as well. Given that for the 1.4 version, it's mentioned in column and for 2.0 it's mentioned as a quasi-normal python type, do you think it will be significantly more complex than parsing other types of code? Compared with pydantic for example, they also have the type as a typehint similarly to sqlalchemy 2.0, so my expectation is that similar approaches can be leverages. In terms of sequentiality and effort, I'd start with the 2.0 implementation, given my additional expectation of that being easier since it uses typehints. We will in the (hopeful) near future also migrate to 2.0 so maybe that should just be the way to go. For additional future proofing, naturally support for 1.4 would be nice, but I think this is, given the circumstances, the better approach forward. |
It should not be very different; but for now, such classes (where an abstract class or a decorator - like We'll see 😃 |
Issue to add support for SQLAlchemy models. This will include considerations about 1-1, 1-* and - relationships. Additionally, it should also consider inheritance.
The text was updated successfully, but these errors were encountered: