FILTER keyword for latest anchor queries (planner/memory) #150
Merged
thiagovas
merged 26 commits into
google:master
from
rogerlucena:filter-latest-planner-memory
Oct 16, 2020
thiagovas
approved these changes
Oct 7, 2020
58f6a07
to
338df90
Compare
rbkloss
reviewed
Oct 8, 2020
…ter test FILTER) Two new triples were added so that, for both the predicate and object positions, there are examples in which two triples share the same latest anchor (expecting 2 rows for "FILTER latest" in this case, instead of only 1 as in the usual case tested so far). For that, the triples added were the one with predicate ""bought"@[2016-04-01T00:00:00-08:00]" and the one with object ""turned"@[2016-04-01T00:00:00-08:00]". The third triple was added so that both "/u<peter>" and "/u<paul>" have two temporal predicates in common - this way we can test whether a single "FILTER latest(?p)" works as expected for multiple graph clauses inside WHERE (if they share the same binding "?p"). For this, the triple added was the one with predicate ""bought"@[2016-01-01T00:00:00-08:00]".
… anchor only for now)
…ake FILTER for latest return multiple triples if they share the same predicate and same latest anchor
…n be applied to, improving error checking, and also move verification for supported filter functions to the planner level
…dition of multiple filter functions in the future)
…-prone when implementing the driver)
…ne when implementing the driver)
9dc5c4a
to
f8c694c
Compare
thiagovas
approved these changes
Oct 14, 2020
…into auxiliar functions)
…peration" anymore
72a44d4
to
5fd5a95
Compare
jlsotomayorm
approved these changes
Oct 15, 2020
…ide "compatibleBindingsInClauseForFilterOperation"
This was referenced Nov 18, 2020
This PR comes as part of #129, finishing the setup of a `FILTER` keyword in BadWolf at the planner and memory levels. The first `FILTER` function chosen to be implemented is `latest`, which solves what was requested in #86.

It would be very useful to have in BQL a `FILTER` keyword that lets us filter out part of a query's results at a level closer to the storage (closer to the driver), improving performance. That is exactly what this PR finishes introducing, completing what started with PR #149. This PR also showcases a full implementation of how this keyword can work on the driver/storage side (exemplified by the changes added to the volatile open-source driver in `memory.go` below).

The user can now specify, inside `WHERE`, which bindings they want to apply a `FILTER` to. This allows a more fine-grained lookup on storage, avoids unnecessary retrieval of data and improves query performance.

To illustrate, a query can now apply `FILTER latest(?p)` to a graph clause inside `WHERE`.
Such a query would return all the temporal triples of the `?test` graph that have the latest timestamp of the time series they are part of (a recurrent use case in BadWolf), skipping immutable triples found along the way. This `FILTER` function also works for objects (`?o`) in the case of reification/blank nodes: the returned triples are then the ones whose object is necessarily a temporal predicate with the latest timestamp among the predicates with that same predicate ID (in the "object" position of the clause), analogously to what happens with `?p`.

This `FILTER` for the `latest` anchor also works for alias bindings obtained with the `AS` keyword, for predicates and objects alike.

At the moment only one `FILTER` is supported for each graph clause inside `WHERE`, and only one `FILTER` is supported for each given binding.

In the future, to add a new `FILTER` function, the steps to follow are:

- … `filter.go`;
- … `SupportedOperations` map in `filter.go` to map the lowercase string of the `FILTER` function being added to its corresponding `filter.Operation` element;
- … `String` method of `Operation` in `filter.go`;
- … `compatibleBindingsInClauseForFilterOperation` in `planner.go` to specify the fields and bindings of a clause to which the newly added `filter.Operation` can be applied;