Add documentation for spatial query features #1581

ullingerc · 2024-10-24T11:58:01Z

No description provided.

codecov · 2024-10-24T12:32:16Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.07%. Comparing base (f856919) to head (126bef7).
Report is 78 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1581      +/-   ##
==========================================
+ Coverage   89.00%   89.07%   +0.07%     
==========================================
  Files         368      371       +3     
  Lines       33888    34437     +549     
  Branches     3828     3899      +71     
==========================================
+ Hits        30161    30675     +514     
- Misses       2473     2484      +11     
- Partials     1254     1278      +24


hannahbast · 2024-10-26T14:08:56Z

@ullingerc Thanks a lot, this is great. I have a comment and a question:

1. I think it's better to move this kind of documentation to https://github.com/ad-freiburg/qlever/wiki and link to it (with a short explanation) from the main README.md of https://github.com/ad-freiburg/qlever. The reason is that documentation is updated (or simply corrected) more frequently than code, and we don't want a commit for every such update. For the same reason, the QLever CLI is in a separate repository, https://github.com/ad-freiburg/qlever-control.
2. You mention that QLever does not yet provide ad-hoc computation of the spatial predicates contains, intersects, and so on. I wonder: since the S2 library is already integrated now, wouldn't it be fairly straightforward to use it for this?

ullingerc · 2024-10-26T15:18:56Z

@hannahbast Thank you very much for your feedback. A few thoughts on this:

1. Okay, I see. I agree that we should stick to the project's conventions. Personally, I would prefer a self-contained git repository that can, for example, be easily moved to a different platform, but we should be consistent, so I will change this as you suggested.
2. Thanks for bringing this up; I had already thought about it as well. S2 provides some functions for this, for example one to calculate the centroid ad hoc. However, it would involve quite some infrastructure code (parsing the different types of WKT strings, converting each of them to the appropriate data structure in S2, wrapper code for each of the functions, etc.). My suggestion is therefore to postpone this for a few weeks, since I still have lots of other work to do. If I can find the time, I will be glad to implement it. I think it would be very useful to have the ad-hoc methods as a sort of fallback for the precomputed ones. For example, when importing a small dataset without OpenStreetMap, this would be a practical shortcut.
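
Illustration only, not part of the PR: to give a feel for the "infrastructure code" mentioned in the comment above, here is a minimal Python sketch of its first step, dispatching on the WKT geometry type. The names `parse_wkt` and `Coord` are hypothetical, only three geometry types are handled, and nothing here reflects QLever's actual C++ code or the S2 API. A real implementation would additionally convert each result into the matching S2 data structure.

```python
import re
from typing import List, Tuple

# (longitude, latitude), in the order used by WKT literals such as "POINT(7.83 48.01)".
Coord = Tuple[float, float]

# Geometry keyword followed by one parenthesised body; other types are rejected.
_WKT_RE = re.compile(r"^\s*(POINT|LINESTRING|POLYGON)\s*\((.*)\)\s*$",
                     re.IGNORECASE | re.DOTALL)


def _parse_coords(text: str) -> List[Coord]:
    """Parse a comma-separated list of "x y" pairs."""
    coords = []
    for pair in text.split(","):
        x, y = pair.split()
        coords.append((float(x), float(y)))
    return coords


def parse_wkt(wkt: str):
    """Return (type, data); the shape of `data` differs per geometry type."""
    match = _WKT_RE.match(wkt)
    if match is None:
        raise ValueError(f"unsupported or malformed WKT: {wkt!r}")
    kind, body = match.group(1).upper(), match.group(2).strip()
    if kind == "POINT":
        return kind, _parse_coords(body)[0]
    if kind == "LINESTRING":
        return kind, _parse_coords(body)
    # POLYGON: one parenthesised ring per boundary, the first being the outer ring.
    rings = [_parse_coords(ring) for ring in re.findall(r"\(([^()]*)\)", body)]
    return kind, rings
```

Even this toy version needs one branch per geometry type; every further type (MULTIPOLYGON, GEOMETRYCOLLECTION, ...) and every conversion to an index structure adds another, which is the effort the comment refers to.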

hannahbast · 2024-10-26T18:21:23Z

@ullingerc Thanks for the explanation! I have a follow-up question regarding the spatial join to understand this better:

Let's consider any one of the spatial functions, say geof:sfContains, and the special case where we use it in a FILTER and both arguments are variables that are bound by the rest of the query. Then, instead of FILTER geof:sfContains(?x, ?y), we could equivalently write ?x <spatial-join:contains> ?y, where the latter is a magic predicate with the appropriate semantics.

That looks very similar to the magic predicates <nearest-neighbors:k>, <nearest-neighbors:k:m>, and <max-distance-in-meters:m>, which you have already implemented.

My question is: Would the implementation of such a magic <spatial-join:contains> predicate be analogous, or is there an additional complication?

ullingerc · 2024-10-28T07:16:00Z

@hannahbast Thanks a lot for your thoughts. There are a few things that come to my mind on this:

1. In principle, the problems are indeed similar, but the devil is in the details. Currently the spatial join implements two algorithms, both of which are suitable for all of the magic predicates: (1) a nested-loop baseline algorithm and (2) a fast algorithm based on the S2PointIndex class. Both implementations work only with GeoPoints. For contains, intersects, etc., a solution that handles only points is obviously not very useful. Modifying both algorithms and the infrastructure code to work with any kind of geometry and to answer contains, intersects, etc. would significantly increase the complexity of the implementation and would involve refactoring much of the code. An alternative would be to introduce a second version of the spatial join class for the contains, intersects, ... questions, to avoid trouble with the existing functionality. This comes with the risk of duplicate code. Another possibility is a straightforward implementation as a normal SPARQL function geof:something(?a, ?b). This comes with the heavy runtime penalty of having to compare everything pairwise.
2. While an implementation via magic predicates is without a doubt a good step in the right direction, it would be even better to follow the GeoSPARQL standard's syntax. This is a question concerning the parser and/or query planner: how can we detect certain patterns in the query (like FILTER(geof:sfContains(?a, ?b)) or FILTER(geof:distance(?a, ?b) <= SomeConstant)) and invoke a spatial join implicitly, so that we both comply with the standard's syntax and use an efficient implementation?
3. In any case, even the parsing of more complex WKT literals and the construction and querying of appropriate S2 index structures for each of the geometric relations is a larger implementation task. It will also require further familiarization with the details of the S2 library.
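
Illustration only, not part of the PR: the trade-off between a pairwise comparison and an index-based join on points can be sketched in a few lines of Python. This is a sketch under loud assumptions: a uniform grid stands in for the S2PointIndex-based algorithm (it is not the S2 API), distances are planar Euclidean rather than meters on the sphere, and the function names `nested_loop_join` and `grid_join` are hypothetical.

```python
import math
from collections import defaultdict
from typing import List, Tuple

Point = Tuple[float, float]


def nested_loop_join(left: List[Point], right: List[Point],
                     max_dist: float) -> List[Tuple[int, int]]:
    """Baseline: compare every pair, O(len(left) * len(right)) distance checks."""
    return [(i, j)
            for i, p in enumerate(left)
            for j, q in enumerate(right)
            if math.dist(p, q) <= max_dist]


def grid_join(left: List[Point], right: List[Point],
              max_dist: float) -> List[Tuple[int, int]]:
    """Index-based: bucket `right` into square cells of side max_dist (> 0).

    Two points within max_dist of each other differ by at most one cell per
    axis, so only the 3x3 neighbourhood of each left point has to be probed.
    """
    def cell(p: Point) -> Tuple[int, int]:
        return (math.floor(p[0] / max_dist), math.floor(p[1] / max_dist))

    index = defaultdict(list)
    for j, q in enumerate(right):
        index[cell(q)].append(j)

    result = []
    for i, p in enumerate(left):
        cx, cy = cell(p)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in index.get((cx + dx, cy + dy), ()):
                    if math.dist(p, right[j]) <= max_dist:
                        result.append((i, j))
    return result
```

Both functions return the same pairs (possibly in a different order). The grid trick relies on every geometry being a single point; for polygons and relations such as contains or intersects, a geometry can span many cells, which is one way to see why the point-only algorithms do not carry over directly.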

I suggest that we leave it open for this documentation PR and discuss this in further detail in a meeting, possibly also with @joka921.
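
Illustration only, not part of the PR: the pattern-detection question raised in the comment above can be made concrete with a toy Python sketch that scans a query string for the two FILTER shapes mentioned. All names (`SpatialJoinCandidate`, `detect_spatial_joins`) are hypothetical, only the parenthesised `FILTER(...)` form with a numeric constant is recognised, and a real query planner would inspect the parsed query tree rather than apply regular expressions to the text.

```python
import re
from typing import List, NamedTuple, Optional


class SpatialJoinCandidate(NamedTuple):
    kind: str                      # "contains" or "max-distance"
    left: str                      # first variable, e.g. "?a"
    right: str                     # second variable, e.g. "?b"
    max_distance: Optional[float]  # only set for "max-distance"


_VAR = r"(\?\w+)"
# FILTER is matched case-insensitively; the prefixed function names are not.
_CONTAINS = re.compile(
    rf"(?i:FILTER)\s*\(\s*geof:sfContains\s*\(\s*{_VAR}\s*,\s*{_VAR}\s*\)\s*\)")
_DISTANCE = re.compile(
    rf"(?i:FILTER)\s*\(\s*geof:distance\s*\(\s*{_VAR}\s*,\s*{_VAR}\s*\)"
    r"\s*<=\s*(\d+(?:\.\d+)?)\s*\)")


def detect_spatial_joins(query: str) -> List[SpatialJoinCandidate]:
    """Find FILTERs whose two arguments are variables and could become a join."""
    found = [SpatialJoinCandidate("contains", m[1], m[2], None)
             for m in _CONTAINS.finditer(query)]
    found += [SpatialJoinCandidate("max-distance", m[1], m[2], float(m[3]))
              for m in _DISTANCE.finditer(query)]
    return found
```

A FILTER whose second argument is a constant WKT literal is deliberately not reported, since it does not describe a join between two bound variables.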

ullingerc · 2024-10-28T07:58:48Z

https://github.com/ad-freiburg/qlever/wiki/Spatial-Queries-in-QLever

sparql-conformance · 2024-10-28T10:00:03Z

Conformance check passed ✅

No test result changes.

Details: https://qlever.cs.uni-freiburg.de/sparql-conformance-ui?cur=126bef7972f7c7953b2af9cf12a65c735d81e0e0&prev=39e63b44de44f5bddd600cb3910cbc63648f565e

sonarqubecloud · 2024-10-28T13:19:53Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

ullingerc added 2 commits October 24, 2024 13:56

Add documentation for spatial query features

9702f75

typo

a5231f3

hannahbast mentioned this pull request Oct 26, 2024

Add internal function ql:isGeoPoint #1565

Merged

Apply Feedback from Hannah Bast

126bef7

ullingerc closed this Jan 8, 2025

ullingerc deleted the spatial-docs branch January 8, 2025 15:42
