Querying and constructing multiple graphs #241

Open. Wants to merge 83 commits into base: master.

@boggle boggle commented Jul 2, 2017

This is a proposal for making Cypher work with multiple graphs.

It is part of the Cypher redesign, targeting Cypher 10, that adds support for working with multiple graphs.

View latest version of CIP from associated branch

boggle added 2 commits June 27, 2017 19:36
This covers a lot of ground:

* Data model
* Language execution model
* Working with named graphs
* Declarative Graph Construction
* Graph composition
* New Patterns: Optional Copy Patterns
* New Patterns: Merge Patterns
* Create, update, modify persistent graphs
@boggle boggle changed the title from "CIP2017-06-18 Multiple Graphs" to "CIP2017-06-18: Multiple Graphs" Jul 2, 2017
@boggle boggle force-pushed the CIP2017-06-18-multiple-graphs branch 3 times, most recently from 2498907 to 7332c02 Compare July 2, 2017 23:21
@boggle boggle force-pushed the CIP2017-06-18-multiple-graphs branch from 7332c02 to 8459014 Compare July 3, 2017 07:42
@boggle boggle force-pushed the CIP2017-06-18-multiple-graphs branch from 8459014 to 4714ca6 Compare July 3, 2017 08:12
=== (Property) Graph

_Definition_ A *property graph* is a set of labeled nodes and typed relationships both together with their properties (a property is a tuple of a named key and a value).
Graphs may be updatable, i.e. the set of contained nodes and relationships may change during the lifetime of the graph.
Member

This section should probably link to the PGM spec in our repo.

It is an error to attempt to update a read-only graph.

The same node or relationship may be part of many graphs.
A relationship may only be part of a graph if it's start node and it's end node are both also part of the same graph.
Member

it's -> its

or rephrased:

if its source and target nodes are both also ...

Author

That always trips me up :)

Member

Yeah, used to trip me up too, but then I learned that it's == it is, so in case you're unsure, just spell it out and it'll become apparent :)


Therefore removing a node from a graph may require removing some of it's relationships from the graph, too.
Member

it's -> its

It not only may, it will require removing all of them. Or rephrased:

Thus, removing a node from a graph will require removing all of its relationships from that graph, too.
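
The invariant discussed here can be illustrated with a minimal Python sketch. It assumes a graph is modelled as a set of node ids plus a set of relationships; the `Graph` and `Rel` names are hypothetical and not part of the CIP.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rel:
    """A typed relationship between two node ids (hypothetical model)."""
    rel_id: int
    rel_type: str
    start: int
    end: int


@dataclass
class Graph:
    """A graph as a set of node ids and relationships (illustrative only)."""
    nodes: set = field(default_factory=set)
    rels: set = field(default_factory=set)

    def add_rel(self, rel: Rel) -> None:
        # A relationship may only be part of a graph if both of its
        # endpoints are part of that same graph.
        if rel.start not in self.nodes or rel.end not in self.nodes:
            raise ValueError("both endpoints must be in the graph")
        self.rels.add(rel)

    def remove_node(self, node_id: int) -> None:
        # Removing a node also removes every relationship attached to it,
        # so the endpoint invariant keeps holding afterwards.
        self.rels = {r for r in self.rels if node_id not in (r.start, r.end)}
        self.nodes.discard(node_id)


g = Graph(nodes={1, 2, 3})
g.add_rel(Rel(10, "KNOWS", 1, 2))
g.add_rel(Rel(11, "KNOWS", 2, 3))
g.remove_node(2)  # both KNOWS relationships touch node 2 and are dropped
```

Because nodes are referenced by id, the same node can appear in several `Graph` instances, and removing it from one leaves the others untouched.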


Graphs do not expose an identity like nodes or relationships do.

Graphs may be made addressable through other means by a conforming implementation (e.g. through exposing the graph under a _graph URL_ for referencing and loading it).
Member

I suggest unwrapping the example from parentheses.


With this terminology in place, execution of a parameterized Cypher query in the single graph execution model can be described as executing within (and operating on) a given execution context and an initial query context and finally returning the query context produced as output for the top-most `RETURN` clause.

Note: This formulation is introduced to describe a high-level model for the execution of queries; A real world implementation is free to choose any other internal representation (e.g. based on an algebra) as long as it does not violate the specified semantics.
Member

A -> a (not capitalised)

* `<graph-specifier-list>`: A comma separated list of `<graph-specifier>` that are to be passed on
* `*`: All named graphs are to be passed on
* `*, <graph-specifier-list>`: All named graphs are to be passed on together with any additional named graphs that are newly bound in `<graph-specifier-list>`
* `-`: No named graphs are to be passed on
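
One possible reading of the four forms above, sketched in Python. The function name `graphs_passed_on` and the pre-parsed `spec` representation are assumptions made for illustration; the CIP does not define them.

```python
def graphs_passed_on(spec, available):
    """Return the names of the named graphs that are passed on.

    `available` is the list of graph names currently in scope.
    `spec` is a hypothetical pre-parsed GRAPHS specification:
      "-"              no named graphs
      "*"              all named graphs
      ("*", [names])   all named graphs plus newly bound ones
      [names]          exactly the listed graphs
    """
    if spec == "-":
        return []
    if spec == "*":
        return list(available)
    if isinstance(spec, tuple) and spec[0] == "*":
        # Keep everything in scope and append graphs that are newly bound.
        extra = [name for name in spec[1] if name not in available]
        return list(available) + extra
    return list(spec)


in_scope = ["foo", "bar"]
all_graphs = graphs_passed_on("*", in_scope)
all_plus_new = graphs_passed_on(("*", ["baz"]), in_scope)
only_bar = graphs_passed_on(["bar"], in_scope)
none = graphs_passed_on("-", in_scope)
```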
Member

I'm interpreting that `GRAPHS` is optional (which I support). What is the point of `GRAPHS -` if we can just leave it out?

This in essence mirrors the semantics for tabular data returned by Cypher.

Both `WITH ... GRAPHS ...` and `RETURN ... GRAPHS ...` will pass on (or return respectively) exactly the set of described named graphs.
To simplify passing on available graphs it is proposed by this CIP that regular `WITH <return-items>` is taken to be syntactic sugar for `WITH <return-items> GRAPHS -` and that regular `RETURN <return-items>` is taken to be syntactic sugar for `RETURN <return-items> GRAPHS -`.
Member

What is the point of having the long form `GRAPHS -` as the normal form and calling its omission syntactic sugar? Why not say that leaving it out is the normal form, and that the other forms modify that?

Author

@boggle boggle Jul 3, 2017

The procedures CIP (?), I think, added `-` for procedures not returning any columns. Similarly, researchers have suggested that it is an omission on the part of SQL not to be able to return no columns (more so for Cypher, where the single-row, field-less table plays a special role in starting off queries). In light of this, I added it for no reason other than consistency with those decisions.

Member

But for procedures there was actually a need for `YIELD -` in order to not cause implicit conflicts with variables that were in scope, as I recall. My personal preference would be to use the empty string to denote the intention in this proposal.


To even further simplify, it is additionally proposed that `WITH|RETURN <return-items> INPUT GRAPHS <graph-return-items>` is to be syntactic sugar for `WITH|RETURN <return-items> GRAPHS <graph-return-items>, SOURCE GRAPH, TARGET GRAPH`.
Member

I'm not convinced of the usefulness of this syntactic sugar -- I find that it is hard to know what kind of queries will be prominent in this new model. In general, I think that it would be useful to have a little less focus on the syntactic sugar bits, and more on the core model. Syntactic sugar additions could always follow later.

Author

Let's revisit how default graphs are handled as a group first - this may very well remove the need for this. In short I added this as a simple way for a query to say: "I'm ok to run on any incoming graphs and am happy to pass those on, just give 'em some names for me". Without this sugar, expressing this becomes rather verbose.

Member

I'm not against the sugar per se, I just find it difficult to assess whether a particular piece of sugar is valuable this early in the process of defining these very new concepts, and so I'm leaning towards skepticism in general. I find it is peripheral to the contents of the CIP anyway.


=== Discarding available tabular data

It is additionally proposed that both `WITH GRAPHS <graph-return-items>` and `RETURN GRAPHS <graph-return-items>` are syntactic sugar for `WITH - GRAPHS <graph-return-items>` (and `RETURN - GRAPHS <graph-return-items>` respectively).
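
The sugar rules quoted in this thread can be summarised as a small rewrite function. This is only a sketch of the draft text as quoted (the `-` forms are still under discussion in the review); `desugar` and its parameters are hypothetical names, and clause fragments are handled as plain strings rather than parsed Cypher.

```python
def desugar(keyword, items=None, graphs=None, input_graphs=None):
    """Expand a short WITH/RETURN form into its long form.

    items         tabular return items, or None if omitted
    graphs        graph return items after GRAPHS, or None if omitted
    input_graphs  graph return items after INPUT GRAPHS, or None
    """
    if keyword not in ("WITH", "RETURN"):
        raise ValueError("keyword must be WITH or RETURN")
    if graphs is not None and input_graphs is not None:
        raise ValueError("use either GRAPHS or INPUT GRAPHS, not both")
    if items is None and graphs is None and input_graphs is None:
        raise ValueError("nothing to pass on or return")
    if input_graphs is not None:
        # INPUT GRAPHS <g> stands for GRAPHS <g>, SOURCE GRAPH, TARGET GRAPH
        graphs = f"{input_graphs}, SOURCE GRAPH, TARGET GRAPH"
    if items is None:
        items = "-"   # no tabular data is passed on
    if graphs is None:
        graphs = "-"  # no named graphs are passed on
    return f"{keyword} {items} GRAPHS {graphs}"


plain = desugar("WITH", items="a, b")
graphs_only = desugar("RETURN", graphs="foo")
with_input = desugar("WITH", items="a", input_graphs="foo")
```

Here `plain` expands to `WITH a, b GRAPHS -`, `graphs_only` to `RETURN - GRAPHS foo`, and `with_input` to `WITH a GRAPHS foo, SOURCE GRAPH, TARGET GRAPH`.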
Member

I feel the same about this as about `GRAPHS -`; I prefer the absence of `-` to its presence in this context.


However, the change has been carefully designed to not change the semantics of existing queries.

== Alternatives
Member

I think this and subsequent sections are superfluous since the introduction of CIRs. We should modify our template.

@Mats-SX
Member

Mats-SX commented Jul 3, 2017

Great work putting these concepts into spec!

boggle and others added 8 commits August 3, 2017 14:44
- Homogenized graph specifier syntax
- Added DEFAULT GRAPH
- WITH, RETURN can also return comma separated list of graphs without
  leading `GRAPHS` if bound graphs are prefixed with `GRAPH`,
  i.e. RETURN a, b, c COPY OF GRAPH foo is possible
- COPY .. TO ..
- Allow FROM <name> AS <new-name> (wo leading GRAPH)
- Allow INTO <name> AS <new-name> (wo leading GRAPH)
@petraselmer petraselmer force-pushed the CIP2017-06-18-multiple-graphs branch 2 times, most recently from 54286fa to b402f1d Compare August 4, 2017 21:23
- The jpg files ought to be moved elsewhere at a later stage
@petraselmer petraselmer force-pushed the CIP2017-06-18-multiple-graphs branch 2 times, most recently from 335a474 to 3258b3b Compare August 5, 2017 08:36
@boggle boggle force-pushed the CIP2017-06-18-multiple-graphs branch from 7f258be to c5b8e42 Compare May 8, 2018 07:56
@boggle boggle added the oCIG and cypher10 (This work targets Cypher 10) labels and removed the NOT READY FOR REVIEW label May 8, 2018
@boggle boggle changed the title from "CIP2017-06-18: Multiple Graphs" to "Querying and constructing multiple graphs" May 8, 2018
@linsimiao

Hi all, I have read your documentation and found it easy to confuse CONSTRUCT with UPDATE. Do the following two Cypher queries mean the same thing?

FROM xxx match (a:Person) UPDATE GRAPH merge (b:Student{name:a.name})

and

FROM xxx match (a:Person) CONSTRUCT merge (b:Student{name:a.name})

Suppose the working graph is yyy. Will both queries create nodes with the label Student in graph yyy?

Thanks for your reply.
