FILTER keyword for latest anchor queries (planner/memory) #150

Merged
merged 26 commits into from
Oct 16, 2020


rogerlucena
Contributor

This PR is part of #129 and completes the setup of the FILTER keyword in BadWolf at the planner and memory levels. The first FILTER function implemented is latest, which addresses the request in #86.

A FILTER keyword in BQL lets us filter out part of a query's results at a level closer to storage (closer to the driver), improving performance. This PR finishes introducing that keyword, completing the work started in PR #149. It also showcases a full implementation of how the keyword can work on the driver/storage side (exemplified by the changes to the volatile open-source driver in memory.go below).

Users can now specify, inside WHERE, which bindings a FILTER applies to. This allows a more fine-grained lookup on storage that avoids retrieving unnecessary data and improves query performance.

To illustrate, queries such as the one below are now possible:

SELECT ?s, ?p, ?o
FROM ?test
WHERE {
    ?s ?p ?o .
    FILTER latest(?p)
};

This returns all temporal triples of the ?test graph that carry the latest timestamp of the time series they belong to (a recurrent use case in BadWolf), skipping any immutable triples found along the way. The FILTER function also works for objects ?o in the case of reification/blank nodes: the returned triples are then those whose object is a temporal predicate with the latest timestamp among the predicates with the same predicate ID (in the "object" position of the clause), analogous to the ?p case.
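To make the semantics above concrete, here is a minimal Go sketch of a latest-anchor filter over predicates. It is not the code in memory.go: the `triple` struct, `latestByPredicateID`, `mustParse`, and `sampleTriples` are hypothetical names, the object values are made-up sample data, and grouping time series by predicate ID alone is an assumption based on the description in this PR.

```go
package main

import (
	"fmt"
	"time"
)

// triple is a simplified, hypothetical stand-in for a BadWolf triple.
// anchor is nil for immutable predicates.
type triple struct {
	subject     string
	predicateID string
	anchor      *time.Time
	object      string
}

// latestByPredicateID keeps only the temporal triples whose anchor equals
// the latest anchor seen for their predicate ID. Immutable triples are
// skipped, and ties on the latest anchor are all returned.
func latestByPredicateID(ts []triple) []triple {
	latest := make(map[string]time.Time)
	for _, t := range ts {
		if t.anchor == nil {
			continue // immutable predicate, not part of any time series
		}
		if cur, ok := latest[t.predicateID]; !ok || t.anchor.After(cur) {
			latest[t.predicateID] = *t.anchor
		}
	}
	var out []triple
	for _, t := range ts {
		if t.anchor != nil && t.anchor.Equal(latest[t.predicateID]) {
			out = append(out, t)
		}
	}
	return out
}

// mustParse parses an RFC 3339 timestamp and panics on malformed input.
func mustParse(s string) *time.Time {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		panic(err)
	}
	return &t
}

// sampleTriples returns illustrative data: one immutable triple and three
// temporal ones, two of which share the same latest anchor.
func sampleTriples() []triple {
	return []triple{
		{"/u<peter>", "parent_of", nil, "/u<mary>"},
		{"/u<peter>", "bought", mustParse("2016-01-01T00:00:00-08:00"), "/c<mini>"},
		{"/u<peter>", "bought", mustParse("2016-04-01T00:00:00-08:00"), "/c<model s>"},
		{"/u<paul>", "bought", mustParse("2016-04-01T00:00:00-08:00"), "/c<model x>"},
	}
}

func main() {
	for _, t := range latestByPredicateID(sampleTriples()) {
		fmt.Printf("%s %q@[%s] %s\n", t.subject, t.predicateID, t.anchor.Format(time.RFC3339), t.object)
	}
}
```

In this sketch the two "bought" triples anchored at 2016-04-01 are both kept, while the older "bought" triple and the immutable "parent_of" triple are dropped. A real driver would apply the filter during lookup rather than over an in-memory slice.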

The latest-anchor FILTER above also works for alias bindings obtained with the AS keyword, for both predicates and objects.

At the moment only one FILTER is supported for each graph clause inside of WHERE, and only one FILTER is supported for each given binding as well.
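One possible reading of these two restrictions, sketched in Go. This is not the planner's actual validation code: `filterCall`, `clause`, and `validateFilters` are hypothetical names, and treating a clause as "filtered" whenever it contains the FILTER's binding is an assumption.

```go
package main

import "fmt"

// filterCall is a hypothetical record of one FILTER in a WHERE block,
// for example FILTER latest(?p).
type filterCall struct {
	operation string
	binding   string
}

// clause is a hypothetical graph clause, reduced to the bindings it uses.
type clause struct {
	bindings []string
}

// validateFilters rejects a second FILTER on the same binding, and rejects
// any clause that would be targeted by more than one FILTER.
func validateFilters(clauses []clause, filters []filterCall) error {
	filtered := make(map[string]bool)
	for _, f := range filters {
		if filtered[f.binding] {
			return fmt.Errorf("multiple FILTERs for binding %s are not supported", f.binding)
		}
		filtered[f.binding] = true
	}
	for i, c := range clauses {
		counted := make(map[string]bool) // avoid counting a repeated binding twice
		for _, b := range c.bindings {
			if filtered[b] {
				counted[b] = true
			}
		}
		if len(counted) > 1 {
			return fmt.Errorf("clause %d is targeted by %d FILTERs; only one is supported", i, len(counted))
		}
	}
	return nil
}

func main() {
	clauses := []clause{{bindings: []string{"?s", "?p", "?o"}}}
	ok := []filterCall{{operation: "latest", binding: "?p"}}
	bad := []filterCall{
		{operation: "latest", binding: "?p"},
		{operation: "latest", binding: "?o"},
	}
	fmt.Println(validateFilters(clauses, ok))
	fmt.Println(validateFilters(clauses, bad))
}
```

Note that a single FILTER may still apply to several clauses when they share its binding; the sketch only rejects several FILTERs landing on one clause or on one binding.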

In the future, to add a new FILTER function, the steps to follow are:

  1. Add a new enum item in the list of supported filter Operations in filter.go;
  2. Add a new entry in the SupportedOperations map in filter.go to map the lowercase string of the FILTER function being added to its corresponding filter.Operation element;
  3. Update the String method of Operation in filter.go;
  4. Add a new switch case inside compatibleBindingsInClauseForFilterOperation in planner.go to specify which fields and bindings of a clause the newly added filter.Operation can be applied to;
  5. Implement the appropriate behavior on the driver side.
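Steps 1 to 3 above can be sketched roughly as follows. Only the names Operation, SupportedOperations, and String come from this PR description; the constants `Latest` and `IsTemporal`, the `lookup` helper, and the exact strings are hypothetical and may not match filter.go.

```go
package main

import (
	"fmt"
	"strings"
)

// Operation enumerates the supported filter functions.
type Operation int

const (
	// Latest stands for the latest-anchor filter (constant name assumed).
	Latest Operation = iota + 1
	// IsTemporal is a hypothetical newly added operation (step 1).
	IsTemporal
)

// SupportedOperations maps the lowercase name of each FILTER function to
// its Operation (step 2).
var SupportedOperations = map[string]Operation{
	"latest":     Latest,
	"istemporal": IsTemporal,
}

// String returns a readable name for the operation (step 3).
func (op Operation) String() string {
	switch op {
	case Latest:
		return "latest"
	case IsTemporal:
		return "isTemporal"
	default:
		return fmt.Sprintf("undefined filter operation %d", int(op))
	}
}

// lookup is a hypothetical helper that resolves a FILTER function name
// case-insensitively against SupportedOperations.
func lookup(name string) (Operation, bool) {
	op, ok := SupportedOperations[strings.ToLower(name)]
	return op, ok
}

func main() {
	for _, name := range []string{"latest", "isTemporal", "unknown"} {
		if op, ok := lookup(name); ok {
			fmt.Printf("%s is supported as %v\n", name, op)
		} else {
			fmt.Printf("%s is not supported\n", name)
		}
	}
}
```

Steps 4 and 5 depend on the planner and on each driver, so they are left out of this sketch.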

@thiagovas thiagovas linked an issue Oct 6, 2020 that may be closed by this pull request
@thiagovas thiagovas requested a review from rbkloss October 7, 2020 13:47
@rogerlucena rogerlucena force-pushed the filter-latest-planner-memory branch from 58f6a07 to 338df90 Compare October 7, 2020 18:15
…ter test FILTER)

Two new triples were added so that, for both the predicate and object positions, there are examples in which two triples share the same latest anchor (expecting 2 rows for "FILTER latest" in this case, instead of only 1 as in the usual case tested so far).
The triples added for this were the one with predicate ""bought"@[2016-04-01T00:00:00-08:00]" and the one with object ""turned"@[2016-04-01T00:00:00-08:00]".

The third triple was added so that both "/u<peter>" and "/u<paul>" would have two temporal predicates in common; this way we can test whether a single "FILTER latest(?p)" works as expected across multiple graph clauses inside of WHERE (when they share the same binding "?p").
The triple added for this was the one with predicate ""bought"@[2016-01-01T00:00:00-08:00]".
…ake FILTER for latest return multiple triples if they share the same predicate and same latest anchor
…n be applied to, improving error checking, and also move verification for supported filter functions to the planner level
…dition of multiple filter functions in the future)
@rogerlucena rogerlucena force-pushed the filter-latest-planner-memory branch from 9dc5c4a to f8c694c Compare October 13, 2020 15:35
@rogerlucena rogerlucena marked this pull request as ready for review October 13, 2020 15:39
@rogerlucena rogerlucena force-pushed the filter-latest-planner-memory branch from 72a44d4 to 5fd5a95 Compare October 15, 2020 00:53
…ide "compatibleBindingsInClauseForFilterOperation"
@thiagovas thiagovas merged commit 85e2195 into google:master Oct 16, 2020
@rogerlucena rogerlucena deleted the filter-latest-planner-memory branch October 16, 2020 14:20
Successfully merging this pull request may close these issues.

FILTER keyword
4 participants