Fuseki: report number of changes made in SPARQL UPDATE responses #2765

Closed
Ostrzyciel opened this issue Oct 10, 2024 · 5 comments
Labels
enhancement Incrementally add new feature

Comments

@Ostrzyciel
Contributor

Version

5.1.0

Feature

I think it would be useful if the SPARQL UPDATE endpoint in Fuseki returned the number of changes made during the update (inserts and deletes). This would give the user immediate feedback on whether the query worked as intended or not. Quite a few DBs have a feature like that, and it was always very useful for me. Currently my workflow to get around this issue is to first issue a SELECT query to see what rows would be returned, and then uncomment the INSERT/DELETE lines.

This should be fine with the SPARQL spec, which states (source):

The response body of a successful update request is implementation defined. Implementations may use HTTP content negotiation to provide both human-readable and machine-processable information about the completed update request.

Currently Fuseki can return the response in plain text, equivalent HTML, and JSON. For JSON, it looks like this:

{
    "statusCode": 200,
    "message": "Update succeeded"
}

My suggestion would be to both extend the text message to include the number of updates, and add two fields for that in the JSON, so that it's machine-readable. Something like this:

{
    "statusCode": 200,
    "message": "Update succeeded. Inserted triples: 1234. Deleted triples: 875.",
    "inserted": 1234,
    "deleted": 875
}

This is the simplest approach I can think of, as it would show up fine in YASGUI and other clients without any modifications.
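As a rough illustration of the proposed response body, here is a minimal sketch that formats the extended message and the two numeric fields. The class `UpdateResponseSketch` and its `toJson` method are hypothetical names made up for this example; they are not part of Fuseki, and the sketch assumes the counts are already known.

```java
import java.util.Locale;

// Hypothetical helper, not part of Fuseki: formats the response body proposed above.
final class UpdateResponseSketch {

    // Builds the JSON text with the human-readable message plus machine-readable counts.
    static String toJson(int statusCode, long inserted, long deleted) {
        String message = String.format(Locale.ROOT,
                "Update succeeded. Inserted triples: %d. Deleted triples: %d.",
                inserted, deleted);
        return String.format(Locale.ROOT,
                "{\"statusCode\": %d, \"message\": \"%s\", \"inserted\": %d, \"deleted\": %d}",
                statusCode, message, inserted, deleted);
    }
}
```

Clients that only read `message` would keep working, while scripts could read `inserted` and `deleted` directly.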

The piece of code that would need to be modified is here:

public static void successPage(HttpAction action, String message) {

I'm just not sure how to get the number of updates programmatically. I'd have to dig deeper in the code.

As with #2764, this is nothing urgent, more of a quality-of-life improvement. :) I may look into resolving this sometime in the future.

Are you interested in contributing a solution yourself?

Perhaps?

@Ostrzyciel Ostrzyciel added the enhancement Incrementally add new feature label Oct 10, 2024
@afs
Member

afs commented Oct 10, 2024

There is a reason!

What can be done is to record the number of adds and deletes attempted, both for DELETE-INSERT-WHERE and for the *DATA operations, i.e. without checking whether a triple is actually added (the triple may already be in the data) or deleted (the triple may not be in the data).

Some operations like DROP GRAPH don't have a low cost way to estimate the count in all cases.

And one request can be several operations:

INSERT DATA { :s :p :o } ;
DELETE DATA { :s :p :o }
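To make "attempted" counts concrete, the following sketch tallies adds and deletes per operation of a request without consulting the stored data. It is an assumption-laden illustration: `AttemptedCountsSketch` and `Op` are hypothetical types invented here, triples are plain strings, and nothing in it is Jena or Fuseki code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Jena or Fuseki code.
final class AttemptedCountsSketch {

    // One operation of an update request: INSERT DATA (insert == true)
    // or DELETE DATA (insert == false), with the triples it lists.
    static final class Op {
        final boolean insert;
        final List<String> triples;

        Op(boolean insert, List<String> triples) {
            this.insert = insert;
            this.triples = triples;
        }
    }

    // Returns one {insertsAttempted, deletesAttempted} pair per operation,
    // in request order. Nothing is checked against the stored data.
    static List<long[]> countAttempts(List<Op> request) {
        List<long[]> counts = new ArrayList<>();
        for (Op op : request) {
            long n = op.triples.size();
            counts.add(op.insert ? new long[] {n, 0} : new long[] {0, n});
        }
        return counts;
    }
}
```

For the two-operation request above, this yields one attempted insert followed by one attempted delete, regardless of whether `:s :p :o` was already in the data.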

@Ostrzyciel
Contributor Author

@afs Hm, good point! So, this would be a bit more complicated...

Maybe store implementations would have the option to report the number of changes in an update? Then, if it can be easily calculated, we could report that. If not, then 🤷 oh well.

For multiple queries it is indeed more complex. Maybe the resulting JSON/text message could refer to a series of queries?

The main question is if this can be feasibly implemented for simple DELETE-INSERT-WHERE queries, because this would be the main use case, I imagine.

@arne-bdt
Contributor

@Ostrzyciel:
In Jena, the core point where operations are executed is usually either the Graph or the DatasetGraph interface.
Unfortunately, none of the relevant methods (Graph#add, Graph#delete, Graph#remove, DatasetGraph#add, DatasetGraph#delete, DatasetGraph#deleteAny) has a return value to indicate success or the number of affected triples.

Counting the number of triples before and after execution could be a way.
That could be cheap for most in-memory implementations. But I am not sure about graph stores, where this could be expensive.

There is also Graph.getCapabilities().sizeAccurate(), which could undermine this approach when it returns false.
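A small sketch of the before/after size idea, under the assumption that the graph can be modelled as an in-memory set of strings with an accurate size. `SizeDiffSketch` is a hypothetical name and does not use Jena's Graph. It also shows a limitation: the size difference is only the net change, so an update that deletes one triple and inserts another reports zero.

```java
import java.util.Set;

// Hypothetical sketch: the "graph" is a plain in-memory set, not a Jena Graph.
final class SizeDiffSketch {

    // Applies the deletes, then the inserts, and returns size-after minus size-before.
    // This is the net change only; it cannot separate inserts from deletes.
    static long netChange(Set<String> graph, Set<String> toDelete, Set<String> toInsert) {
        long before = graph.size();
        graph.removeAll(toDelete);
        graph.addAll(toInsert);
        return graph.size() - before;
    }
}
```

Getting separate insert and delete numbers this way would need a size check between the delete phase and the insert phase, which is only meaningful when the store can report its size accurately and cheaply.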

@afs
Member

afs commented Oct 15, 2024

If you want to write a wrapper dataset that manages the counting, go for it. It isn't for all databases.

At scale, bulk operations like deleteAny or DROP/CLEAR in SPARQL may work by nulling out a slot in a datastructure and letting the garbage collector free up space later.

It isn't always so easy to determine whether adding a triple is, in fact, a change. An update is a sequence of operations - not one - and a later operation may reverse a change. Retaining the state across operations would be a significant memory cost.
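The counting-wrapper idea and the reversal problem can be sketched together. This is only an illustration under simplifying assumptions: `CountingGraphSketch` is a hypothetical class backed by a `java.util.HashSet`, whose `add` and `remove` happen to report whether the set changed. As noted earlier in the thread, Jena's own add and delete methods give no such signal, so a real wrapper would likely need an extra lookup per triple.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical counting wrapper over an in-memory set; not a Jena DatasetGraph.
final class CountingGraphSketch {
    private final Set<String> triples = new HashSet<>();
    private long inserted;
    private long deleted;

    // Counts an insert only if the triple was not already present.
    void add(String triple) {
        if (triples.add(triple)) {
            inserted++;
        }
    }

    // Counts a delete only if the triple was actually present.
    void delete(String triple) {
        if (triples.remove(triple)) {
            deleted++;
        }
    }

    long inserted() { return inserted; }
    long deleted() { return deleted; }
    int size() { return triples.size(); }
}
```

Adding a triple and then deleting it in a later operation leaves the data unchanged, yet the wrapper reports one insert and one delete. Reporting the true net change would mean remembering which triples were touched across operations, which is the memory cost mentioned above.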

@Ostrzyciel
Contributor Author

Hm, okay, I think I get it. This would be a ton of work for a feature that's not that important. I think I will close this then, but if anyone wants to pursue this, feel free to reopen.
