Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix inconsistent search results for date-related queries fixes#12296 #12724

Open
wants to merge 15 commits into
base: main
Choose a base branch
from

Conversation

AbdoMostfa2
Copy link
Contributor

@AbdoMostfa2 AbdoMostfa2 commented Mar 13, 2025

Fix inconsistent search results for date-related queries

This update ensures that searching for different date fields (e.g., year=2012, date=2012, Date=2012) returns consistent results. Previously, these queries produced different results due to variations in field names.

Closes #12296

Mandatory checks

  • I own the copyright of the code submitted and I licence it under the MIT license
  • Change in CHANGELOG.md described in a way that is understandable for the average user (if change is visible to the user)
  • Tests created for changes (if applicable)
  • Manually tested changed features in running JabRef (always required)
  • Screenshots added in PR description (for UI changes)
  • Checked developer's documentation: Is the information available and up to date? If not, I outlined it in this pull request.
  • Checked documentation: Is the information available and up to date? If not, I created an issue at https://github.com/JabRef/user-documentation/issues or, even better, I submitted a pull request to the documentation repository.

@@ -261,6 +269,75 @@ private SqlQueryNode getFieldQueryNode(String field, String term, EnumSet<Search
}
}

private SqlQueryNode buildTwoFieldsUnionQuery(String field1, String field2, String operator, String prefixSuffix, String term) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Method lacks JavaDoc documentation despite being a complex method handling SQL query construction with multiple parameters and specific logic for date/year fields.

return new SqlQueryNode("cte" + cteCounter++);
}

private SqlQueryNode buildTwoFieldsNegationQuery(String field1, String field2, String operator, String prefixSuffix, String term) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Complex method handling SQL negation queries lacks JavaDoc documentation. The method's purpose and parameter usage should be clearly documented for maintainability.

@Siedlerchr
Copy link
Member

Jabref treats date and year as alias fields. And some others as well.
There's a method getFieldOrAlias
You would need to compare the search with the one in JabRef 5.15 to see how alias fields are treated there. My guess would be if I search for date, if the field exist return that value else the alias

@koppor
Copy link
Member

koppor commented Mar 13, 2025

  1. Download JabREf 5.15 as portable edition at https://github.com/JabRef/jabref/releases/tag/v5.15
  2. Disable single instance
    image
  3. Start development version of JabRef and 5.15 in the same time --> you can then check in parallel.

@@ -228,6 +228,14 @@ private SqlQueryNode getFieldQueryNode(String field, String term, EnumSet<Search
term = escapeTermForSql(term);
}

if (field.equalsIgnoreCase(StandardField.DATE.getName()) || field.equalsIgnoreCase(StandardField.YEAR.getName())) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think, this is the wrong approach. org.jabref.model.entry.EntryConverter#FIELD_ALIASES needs to be used - maybe all fields "unzipped" into postgres

Maybe, also a "view" in postgres needs to be generated - a derived_date covering the date and then the query on the month/day/... go to this view.. The view could offer all fields values pre-calculated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial thought was that the issue was only related to the date and year fields, so my first approach focused on that. After your clarification, I realized the need to handle all date-related fields properly.
I’ve now updated the solution accordingly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@koppor do you think there’s still a need to create a view for this?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A view can be a next step. A view typically helps to gain performance and reduce memory consumption.

@AbdoMostfa2 AbdoMostfa2 force-pushed the fix-for-issue-12296 branch from 1f8ead4 to e4f9283 Compare March 17, 2025 01:53
@AbdoMostfa2 AbdoMostfa2 changed the title Fix inconsistent search results for date-related queries Fix inconsistent search results for date-related queries fixes#12296 Mar 17, 2025
@@ -47,6 +47,7 @@ public class BibFieldsIndexer {
private final String splitValuesTable;
private final String schemaSplitValuesTableReference;
private final Character keywordSeparator;
private final Field[] dateFields = new Field[] {StandardField.DATE, StandardField.YEAR, StandardField.MONTH, StandardField.DAY};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use modern Java data structures: Set<Field> seems to pbe approivate?

addBatch(preparedStatement, entryId, field, value, resolvedFieldLatexFree.orElse(""));

// Skip indexing of date-related fields separately to ensure proper handling later in the process.
if (field != StandardField.DATE && field != StandardField.YEAR && field != StandardField.MONTH && field != StandardField.DAY) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not using dateFields.contains(field)?

Comment on lines 248 to 249
String dateValue = resolvedDateValue.orElse("");
addBatch(preparedStatement, entryId, dateField, dateValue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know, Optionals are hard to learn, please:

Use the methods of java.util.Optional. ifPresent.

NOT

Optional<String> resolved = bibEntry.getResolvedFieldOrAlias(...);
String value = resolved.orElse("");
doSomething(value)

BUT

Optional<String> resolved = bibEntry.getResolvedFieldOrAlias(...);
resolved.ifPresent(value -> doSomething(value));

FIELD_VALUE_TRANSFORMED, FIELD_VALUE_TRANSFORMED);
String entryId = entry.getId();
// If the updated field is date-related, re-index all date fields to overwrite the previous value.
if (field == StandardField.DATE || field == StandardField.YEAR || field == StandardField.MONTH || field == StandardField.DAY) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use dateFields.contains

Comment on lines 332 to 333
String dateValue = resolvedDateValue.orElse("");
addBatch(preparedStatement, entryId, dateField, dateValue);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above

String insertDateFieldQuery = """
INSERT INTO %s ("%s", "%s", "%s", "%s")
VALUES (?, ?, ?, ?)
ON CONFLICT ("%s", "%s")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ON CONFLICT is not clear to me. Add Java coment

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This handles situations where the user inserts conflicting data, such as date=2025-03-17 and year=2026. In this case, it will overwrite the previously calculated year. Should I ignore this conflict instead?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add Java comment

}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why empty line removed?

@@ -325,7 +359,6 @@ private void insertField(BibEntry entry, Field field) {
FIELD_VALUE_TRANSFORMED);

try (PreparedStatement preparedStatement = connection.prepareStatement(insertIntoSplitTable)) {
String entryId = entry.getId();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why this part of the code is touched. I totally forgot about the use of the split fields. I asked @LoayGhreeb for support at documentaiotn

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the Split Table handles multi-valued fields like keywords or groups. I did not remove it, but the entryId is calculated above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a Java comment to org.jabref.model.search.PostgreConstants#getSplitTableSchemaReference please?

@AbdoMostfa2 AbdoMostfa2 force-pushed the fix-for-issue-12296 branch from e62842d to 034b28f Compare March 17, 2025 11:58
@AbdoMostfa2 AbdoMostfa2 requested a review from koppor March 17, 2025 12:20
@AbdoMostfa2 AbdoMostfa2 marked this pull request as ready for review March 17, 2025 18:51
Comment on lines +221 to +225
if (!DATE_FIELDS.contains(field)) {
Optional<String> resolvedFieldLatexFree = bibEntry.getResolvedFieldOrAliasLatexFree(field, this.databaseContext.getDatabase());
assert resolvedFieldLatexFree.isPresent();
addBatch(preparedStatement, entryId, field, value, resolvedFieldLatexFree.orElse(""));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code uses assert and orElse("") for Optional handling instead of proper Optional methods. Should use ifPresent() for better Optional handling.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can use resolvedFieldLatexFree.ifPresent(content -> ...)

Copy link
Contributor Author

@AbdoMostfa2 AbdoMostfa2 Mar 21, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the point about using ifPresent() for better Optional handling. However, I deliberately retained the assertion because the comment emphasizes detecting potential flaws.
image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do both.

Consistency with other code.

We cannot change current tooling easily.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also use get() and ignore the bot comment. Seems to be best (seeing these two lines only)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

meaningful search
3 participants