[Proposal] - OGM distinct from driver #97

Open
ryanpstauffer opened this issue Jan 17, 2018 · 7 comments

Comments

@ryanpstauffer

Does it make sense to keep the OGM abstraction distinct and separate from the driver execution?
From a design perspective this would be beneficial in a couple of ways:

  1. Decouples OGM functionality and usability from TP / Gremlin Server versioning and required driver changes
  2. A user can create their OGM model and then worry about interaction with the graph, with the option of adding and retrieving their created Elements either synchronously (through an iPython console for example), or async.

I understand this could take a bit of work, so happy to break this into pieces that follow your overall roadmap and submit PRs.

@davebshow
Owner

davebshow commented Jan 17, 2018

Yes, it does make sense. Originally Goblin was a complete toolkit for TP3 with driver, async GLV, and OGM. I made a step towards decoupling everything when I extracted the driver and GLV to aiogremlin. The logical continuation of this would be to make Goblin work with different driver/GLV implementations.

Making this happen though isn't really trivial. I would have to put some thought into the best way of doing this. I have a few questions for you: what is your motivation for this? Do you really want to use a different driver or GLV? If so, which one? Official Gremlin-Python? The DSE driver? Or do you just want to be able to use Goblin in a synchronous fashion to avoid the boilerplate code with the event loop?

If what you really want is just to run Goblin in a synchronous fashion, it may be easier to simply add a sync mode, although the best solution would be to improve Goblin's composability with other drivers/GLVs.
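
A minimal sketch of what such a sync mode could look like, assuming the simplest approach: a blocking facade that owns a private event loop and drives each coroutine to completion. `FakeAsyncSession` and `SyncSession` are hypothetical names used for illustration only; they are not part of Goblin or aiogremlin.

```python
import asyncio


class FakeAsyncSession:
    """Hypothetical stand-in for an async OGM session (not Goblin's real class)."""

    def __init__(self):
        self.saved = []

    async def save(self, element):
        await asyncio.sleep(0)  # pretend to talk to the server
        self.saved.append(element)
        return element


class SyncSession:
    """Hypothetical blocking facade that hides the event loop from the caller."""

    def __init__(self, async_session):
        self._session = async_session
        self._loop = asyncio.new_event_loop()

    def save(self, element):
        # Run the coroutine to completion on the private loop and return its result.
        return self._loop.run_until_complete(self._session.save(element))

    def close(self):
        self._loop.close()


session = SyncSession(FakeAsyncSession())
result = session.save({"label": "person", "name": "Leif"})
session.close()
```

One caveat with this approach: `run_until_complete` cannot be called from a thread whose event loop is already running, so a facade like this would not work unchanged inside an environment that runs its own loop.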

@ryanpstauffer
Author

My specific motivation for this is to be able to use Goblin OGM to avoid boilerplate while quickly iterating through designs of our graph and application. Agreed the simplest way to accomplish that is to just add a sync mode.

My broader motivation is that I'm ramping up a couple of projects in parallel. Both of these would add at least one additional layer of abstraction above the OGM. The existing Goblin/aiogremlin codebase accomplishes a lot, so we don't want to reinvent the wheel, and the development of both projects would be sped up if they could use the existing Goblin package. However, having Goblin as a core dependency probably only makes sense if:

  1. There's a sync mode
  2. OGM is decoupled from driver (at least one of these projects may soon run on DSE or Neptune, so driver and Gremlin Server version compatibility will be a concern)

Definitely understand that making any of the above happen isn't trivial. How can I help?

@davebshow
Owner

Give me a day or so to think about this and I will get back to you

@davebshow
Owner

The more I think about this, the more I like it. I guess the first step would be to get it running with the official Gremlin-Python library. I'm still not sure how I want to deal with the glaring issue of integrating the async and sync code. Any bright ideas?

@ryanpstauffer
Author

ryanpstauffer commented Jan 19, 2018

I think a lot of the basic interaction would take place in a similar way. Here are my thoughts on the initial desired functionality:

>>> from goblin import Goblin, element, properties

# Model a Person
>>> class Person(element.Vertex):
...      name = properties.Property(properties.String)
...      age = properties.Property(properties.Integer)

# Create Leif, a Person
>>> leif = Person()
>>> leif.name = 'Leif'
>>> leif.age = 28

# Add Leif to our graph via context manager
>>> with Goblin.sync_session(remote='ws://localhost:8182/gremlin') as session:
...      session.add(leif) # Add leif to the pending queue
...      # Then we could either manually flush our pending queue
...      session.flush()
...      # Or have the pending queue flush upon context manager __exit__()

# We could also keep the session open and add objects iteratively
>>> session = Goblin.sync_session(remote='ws://localhost:8182/gremlin')
>>> session.add(leif)
>>> session.add(jon, works_with) # Add some more objects to the queue
>>> session.flush() # or perhaps session.commit()

# Traversals return defined objects, when possible
# Maybe this isn't the cleanest interface, but it's a starting point
>>> g = Goblin.sync_traversal().with_remote(remote='ws://localhost:8182/gremlin')
>>> new_person = g.V(Person).next()
# And our returned object is an instance of our Person class
>>> type(new_person) == Person
True
>>> new_person is leif
True

This mirrors the async functionality today, and generally matches up with other Python ORM (SQLAlchemy) syntax. I'm still going through the existing codebase of goblin, aiogremlin, and gremlin_python to figure out the best way to implement the above.
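
The "pending queue" behaviour described above (queue on `add`, send on `flush`, flush automatically on `__exit__`) can be sketched roughly as follows. This is an illustration under assumptions, not Goblin's implementation: `PendingQueueSession` and its `send` callable are hypothetical, and the choice to discard queued elements when the block raises is one possible design, not something the thread settles.

```python
class PendingQueueSession:
    """Hypothetical session: add() queues elements, flush() sends them."""

    def __init__(self, send):
        self._send = send    # callable that writes one element to the graph
        self._pending = []

    def add(self, *elements):
        self._pending.extend(elements)

    def flush(self):
        # Swap the queue out first so a repeated flush() sends nothing twice.
        pending, self._pending = self._pending, []
        for element in pending:
            self._send(element)

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # Flush only on a clean exit; drop queued elements if the block raised.
        if exc_type is None:
            self.flush()
        else:
            self._pending.clear()
        return False  # never swallow the exception


written = []
with PendingQueueSession(written.append) as session:
    session.add("leif", "jon")
    sent_before_exit = list(written)  # still empty: nothing is sent until flush
```

Here `written.append` stands in for the driver call that would actually submit an element to the server.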

One issue I see right now is the difference between goblin's pinned dependency of gremlin_python==3.2.6 and the current 3.3.1. If we were to add the above functionality, I would think it makes the most sense to build it with 3.3 in mind. This would support current vendor offerings (for example, AWS Neptune's managed graph solution is compatible with TinkerPop 3.3.0).

@davebshow
Owner

Yeah the idea would be that the API is the same regardless of whether you want a sync or async session. A couple things:

  1. Regarding the gremlinpython version. In the next version of Goblin, the user will choose a GLV (gremlinpython/aiogremlin) and version (3.2.7, 3.3.1, etc.), and pass a factory function to the app constructor. If None, try to use aiogremlin. This allows people using Janus 0.2 with TP3.2.6 to use the same Goblin codebase as someone using Neptune with TP3.3.0.

  2. Regarding the implementation. I guess the best thing to do will be to basically replicate the code in session.py, but without the async/await API. We can probably use the same app class and just return futures. I'll have to look; I haven't got there yet.
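
The factory idea in point 1 could be sketched along these lines. Everything here is hypothetical and only illustrates the injection pattern: `App`, `FakeTraversalSource`, and `default_factory` are made-up names, and the default does not actually import aiogremlin.

```python
from typing import Callable, Optional


class FakeTraversalSource:
    """Hypothetical stand-in for the traversal source a GLV would provide."""

    def __init__(self, glv_name: str):
        self.glv_name = glv_name


def default_factory() -> FakeTraversalSource:
    # Placeholder for "if None, try to use aiogremlin"; no real import happens here.
    return FakeTraversalSource("aiogremlin")


class App:
    """Hypothetical app class that is agnostic about which GLV it runs on."""

    def __init__(self, glv_factory: Optional[Callable[[], FakeTraversalSource]] = None):
        # Fall back to the default when the caller does not choose a GLV.
        self._factory = glv_factory or default_factory

    def traversal(self) -> FakeTraversalSource:
        return self._factory()


default_app = App()
pinned_app = App(lambda: FakeTraversalSource("gremlinpython-3.3.1"))
```

With this shape, the OGM layer only depends on whatever the factory returns, so the GLV and its version become the caller's choice.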

Make sense?

I may try to do this this weekend.

@ryanpstauffer
Author

ryanpstauffer commented Jan 19, 2018 via email


2 participants