Provide framework for generic lazily evaluated operation results #1350

Draft. Wants to merge 71 commits into base: master.

RobinTF (Collaborator) commented May 18, 2024

Still WIP. Currently missing:

  • Discussion about remaining TODOs
  • Lots of unit tests
  • Most likely, some functions also need to be broken up into smaller pieces once we have found everything else to be working "correctly".
  • Documentation of all newly introduced functions once they become somewhat "final"
  • Cold Fusion & World domination?

result._resultPointer->resultTable()->idTable().numColumns();
LOG(DEBUG) << "Computed result of size " << resultNumRows << " x "
<< resultNumCols << std::endl;
}
Collaborator Author

Does this debug message provide enough real benefit to justify carrying it over, in some form, to lazily evaluated operations?
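One way the message could survive for lazy results is to accumulate the size while the consumer iterates and to log it once the result is exhausted, since a lazy result does not know its total size up front. The following is only a minimal sketch under stated assumptions: `Chunk`, `LazyResult`, `SizeStats`, `withSizeLogging`, and `fromVector` are hypothetical names made up for illustration and are not part of this PR or of QLever, and `std::clog` stands in for the project's `LOG(DEBUG)` macro.

```cpp
#include <cstddef>
#include <functional>
#include <iostream>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for one block of a lazily produced result.
struct Chunk {
  std::size_t numRows;
  std::size_t numCols;
};

// Hypothetical pull-style lazy result: each call yields the next chunk,
// or std::nullopt once the result is exhausted.
using LazyResult = std::function<std::optional<Chunk>()>;

// Running totals, owned by the caller so they can be inspected afterwards.
struct SizeStats {
  std::size_t numRows = 0;
  std::size_t numCols = 0;
  bool finished = false;
};

// Wrap `inner` so that row counts are accumulated as chunks are consumed.
// The size is logged exactly once, when the end of the result is reached.
// `stats` must outlive the returned LazyResult.
LazyResult withSizeLogging(LazyResult inner, SizeStats& stats) {
  return [inner = std::move(inner), &stats]() mutable -> std::optional<Chunk> {
    std::optional<Chunk> chunk = inner();
    if (chunk.has_value()) {
      stats.numRows += chunk->numRows;
      stats.numCols = chunk->numCols;
    } else if (!stats.finished) {
      stats.finished = true;
      std::clog << "Computed result of size " << stats.numRows << " x "
                << stats.numCols << std::endl;
    }
    return chunk;
  };
}

// Helper for the example: a lazy result that yields the given chunks in order.
LazyResult fromVector(std::vector<Chunk> chunks) {
  return [chunks = std::move(chunks),
          i = std::size_t{0}]() mutable -> std::optional<Chunk> {
    if (i < chunks.size()) return chunks[i++];
    return std::nullopt;
  };
}
```

A consumer would wrap its input once and then pull chunks as usual. The trade-off is that the message is only printed if the result is consumed to the end, so a query that stops early (for example, because of a `LIMIT`) would never log a size.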

codecov bot commented May 18, 2024

Codecov Report

Attention: Patch coverage is 73.56322% with 184 lines in your changes missing coverage. Please review.

Project coverage is 88.57%. Comparing base (797f325) to head (e5ceacc).

| Files | Patch % | Lines |
|---|---|---|
| src/engine/Result.cpp | 57.76% | 98 Missing and 8 partials ⚠️ |
| src/util/CacheableGenerator.h | 85.80% | 1 Missing and 22 partials ⚠️ |
| src/engine/Operation.cpp | 74.11% | 21 Missing and 1 partial ⚠️ |
| src/engine/IndexScan.cpp | 5.88% | 15 Missing and 1 partial ⚠️ |
| src/util/Cache.h | 82.08% | 0 Missing and 12 partials ⚠️ |
| src/engine/ExportQueryExecutionTrees.cpp | 93.33% | 0 Missing and 3 partials ⚠️ |
| src/engine/QueryExecutionTree.cpp | 83.33% | 0 Missing and 1 partial ⚠️ |
| src/engine/QueryPlanner.cpp | 50.00% | 0 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1350      +/-   ##
==========================================
- Coverage   89.00%   88.57%   -0.43%     
==========================================
  Files         329      331       +2     
  Lines       29155    29773     +618     
  Branches     3236     3327      +91     
==========================================
+ Hits        25948    26370     +422     
- Misses       2055     2204     +149     
- Partials     1152     1199      +47     


joka921 pushed a commit that referenced this pull request May 23, 2024
This PR contains all the changes from the infrastructure for lazy operation evaluation (#1350) that are simple and repetitive, but touch many files. In particular:

* Rename the `ResultTable` class to `Result` (a TODO suggested by @hannahbast some time ago).
* Add a new parameter `bool requestLaziness` to `Operation::computeResult`. This parameter is currently unused.
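The second bullet can be pictured roughly as follows. This is a heavily simplified sketch, not the actual QLever declarations: only the names `Result`, `Operation::computeResult`, and `requestLaziness` come from the commit message, while the members of `Result`, the return type, the access level, and the `ExampleScan` subclass are assumptions made up for illustration.

```cpp
#include <cstddef>
#include <vector>

// Simplified stand-in for the class renamed from `ResultTable` to `Result`.
// The members shown here are assumptions, not the real layout.
struct Result {
  std::vector<int> rows;
  bool isFullyMaterialized = true;
};

class Operation {
 public:
  virtual ~Operation() = default;
  // `requestLaziness` signals that the caller could consume a lazy result.
  // Per the commit message, the parameter is currently unused.
  virtual Result computeResult(bool requestLaziness) = 0;
};

// Hypothetical operation: it accepts the new parameter but ignores it and
// always returns a fully materialized result.
class ExampleScan : public Operation {
 public:
  Result computeResult([[maybe_unused]] bool requestLaziness) override {
    return Result{{1, 2, 3}, true};
  }
};
```

Adding the parameter everywhere first, without acting on it, keeps the repetitive signature changes separate from the later behavioral changes.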
hannahbast pushed a commit that referenced this pull request Jun 13, 2024
This makes the code much simpler and makes no difference for almost all queries. The expensive part (reading from disk and decompressing) is still done in parallel; only the writing to the `IdTable` is now serialized, and there is one additional copy compared to before. An example of a query that is now slower because of this change: materialize a large index scan (for example, for the predicate `rdf:type`) and group by subject (there is a shortcut for grouping by object when there are few objects). But such queries will become lazy soon anyway (see #1350), and then this will be irrelevant.
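The pattern described in this commit message (expensive work in parallel, writing to the table serialized with one extra copy) can be sketched as below. This is not the code from the referenced commit: `Block`, `readAndDecompress`, and `materializeScan` are hypothetical names, a plain `std::vector<int>` stands in for the `IdTable`, and `std::async` is just one possible way to run the blocks in parallel. Depending on the toolchain, linking may require `-pthread`.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical decompressed block of a scan.
using Block = std::vector<int>;

// Placeholder for the expensive step (reading from disk and decompressing).
// Here it only fabricates consecutive values for the given block.
Block readAndDecompress(std::size_t blockIndex, std::size_t rowsPerBlock) {
  Block block(rowsPerBlock);
  for (std::size_t i = 0; i < rowsPerBlock; ++i) {
    block[i] = static_cast<int>(blockIndex * rowsPerBlock + i);
  }
  return block;
}

// Materialize a scan: blocks are produced in parallel, but appended to the
// table one after another, in block order, by a single thread.
std::vector<int> materializeScan(std::size_t numBlocks,
                                 std::size_t rowsPerBlock) {
  std::vector<std::future<Block>> pending;
  pending.reserve(numBlocks);
  for (std::size_t b = 0; b < numBlocks; ++b) {
    // Parallel part: each block is read and decompressed in its own task.
    pending.push_back(
        std::async(std::launch::async, readAndDecompress, b, rowsPerBlock));
  }

  std::vector<int> table;  // stand-in for the `IdTable`
  table.reserve(numBlocks * rowsPerBlock);
  for (auto& future : pending) {
    // Serialized part: wait for the next block, then copy it into the table.
    // This is the additional copy compared to writing in place.
    Block block = future.get();
    table.insert(table.end(), block.begin(), block.end());
  }
  return table;
}
```

Because the futures are consumed in the order they were created, the rows end up in block order without any locking around the table.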
realHannes pushed a commit to realHannes/qlever that referenced this pull request Jun 15, 2024
…eiburg#1323)
sonarcloud bot commented Jun 30, 2024
