-
Notifications
You must be signed in to change notification settings - Fork 66
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FILTER keyword for latest anchor queries #146
Conversation
d09f413
to
2206197
Compare
2206197
to
6578e67
Compare
storage/memory/memory.go
Outdated
defer func() { | ||
lo.FilterOptions = (*storage.FilteringOptions)(nil) | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is cleaning lo.FilterOptions, right?
could you add a comment?
the comment could state what is being done and why it is necessary
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just added the comments, but this is only being necessary here because we are not deprecating the bool LatestAnchor
from lookupOptions
yet.
This way, just to keep making the tests related to this bool passing in memory_test.go
(only place where it is used), I decided the best approach for now was to keep support for it in the volatile OSS driver, but using a FILTER latest(?p)
behind the drapes as the behavior is the same.
About the cleaning of lo.FilterOptions
, it should not be necessary as I already do this cleaning on the planner level, with resetFilterOptions
there. But since for LatestAnchor
I am artificially creating a new lo.FilterOptions
in the driver level, it is safer to also guarantee that it will be cleaned to nil
in the driver level too, in the case someone think about adding a test in memory_test.go
that concatenates a LatestAnchor
with a normal call to the driver or a FILTER
for something else there (without calling the planner function resetFilterOptions
then). This does not happen at the moment (so if I do not add this defer
there no tests would break) and I doubt someone will do it in the future, but I thought it was safer to add it.
Given that, it brings to light the discussion on when do we plan to deprecate this LatestAnchor
bool from lookupOptions
. Can we do this in a follow up PR or should we wait a bit more? This could allow us to simplify multiple parts of the code in memory.go
, deleting all the boilerplate there related to LatestAnchor
, and also delete a number of tests in memory_test.go
that at the moment are just retesting FILTER
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have talked with Thiago about and it seems that it will be necessary some refactoring in the internal side to deprecate this LatestAnchor
bool from lookupOptions
. Then, for now, I will be keeping the support for it in the OSS side as well.
699b5f3
to
374eac0
Compare
c169600
to
30a5072
Compare
…ter test FILTER) Two new triples were added so we could have, for both predicate and object positions, examples on which two triples would have the same latest anchor (expecting 2 rows for "FILTER latest" in this case, in the place of only 1 as it was in the usual case tested so far). For that, the triples added were the one with predicate ""bought"@[2016-04-01T00:00:00-08:00]" and the other with object ""turned"@[2016-04-01T00:00:00-08:00]". The third triple was added so that both "/u<peter>" and "/u<paul>" would have in common two temporal predicates - this way we can test if only one "FILTER latest(?p)" is working as expected for multiple graph clauses inside of WHERE (if they share that same binding "?p"). For this, the triple added was the one with predicate ""bought"@[2016-01-01T00:00:00-08:00]".
30a5072
to
c447e3e
Compare
c447e3e
to
d6698a8
Compare
d6698a8
to
3102dcb
Compare
3102dcb
to
e241717
Compare
… anchor only for now)
…ake FILTER for latest return multiple triples if they share the same predicate and same latest anchor
…n be applied to, improving error checking, and also move verification for supported filter functions to the planner level
…dition of multiple filter functions in the future)
…-prone when implementing the driver)
…ne when implementing the driver)
e241717
to
3c5c9ff
Compare
This PR comes as a part of #129, setting up a
FILTER
keyword in BadWolf. The firstFILTER
function chosen to be implemented is thelatest
one, solving what was requested by #86.It would be very useful to have in BQL a
FILTER
keyword that could allow us to filter out part of the results of a query in a level closer to the storage (closer to the driver), improving performance. This is exactly what this PR comes to introduce, with the full implementation of how this keyword could work on the driver/storage side as well (exemplified with the changes added to the volatile open-source driver inmemory.go
below).Then, now the user can specify, inside of
WHERE
, which bindings they want to apply aFILTER
to, proceeding with a more fine-grained lookup on storage, avoiding unnecessary retrieval of data and optimizing query performance.To illustrate, queries such as the one below are now possible:
That would return all the temporal triples of the
?test
graph that have the latest timestamp of the time series they are part of (a recorrent use case in BadWolf), skipping immutable triples found along the way. This FILTER function also works for objects?o
in the case of reification/blank nodes: in this case the returned triples would be the ones on which the object is necessarily a temporal predicate with latest timestamp among the predicates with that same predicate ID (in the "object" position of the clause), analogously to what happened with?p
.This
FILTER
forlatest
anchor above also works for alias bindings obtained with theAS
keyword for predicates and objects too.At the moment only one
FILTER
is supported for each graph clause inside ofWHERE
, and only oneFILTER
is supported for each given binding as well.Regarding their position inside
WHERE
,FILTER
clauses must come after all the graph pattern clauses, just by the end of theWHERE
(closer to its closing bracket). Regarding trailing dots, aFILTER
clause is understood just like any other graph clause inside ofWHERE
: the trailing dot is mandatory at the end of each clause (FILTER
or graph ones indistinguishably), with the exception of the last one (for which the dot is optional).In the future, to add a new
FILTER
function, the steps to follow are:filter.go
;SupportedOperations
map infilter.go
to map the lowercase string of theFILTER
function being added to its correspondentfilter.Operation
element;String
method ofOperation
infilter.go
;compatibleBindingsInClauseForFilterOperation
inplanner.go
to specify for which fields and bindings of a clause the newly addedfilter.Operation
can be applied to;