feat(9586): implement freetext search in cht datasource #9625
base: master
Conversation
Excellent work here! This was a huge lift, but I think this will mean that cht-datasource can support the majority of READ functionality we need! Got a bunch of nitpicks/suggestions, but also a few cases where I think we need to get aligned with the existing search code.
Also, we can either do it in this PR, or in a follow-up one (since we already have a ton of changes here), but ultimately we need to refactor shared-libs/search to call cht-datasource instead of directly using the freetext indexes.
@jkuester the e2e tests related to cht-datasource are failing right now. I remember seeing something about changes in how we deal with auth and its impact on cht-datasource. Do we have any workaround at the moment?
Oh wow. I did not realize that the end result of the changes was to just delete the existing remote cht-datasource tests with no alternative. This is very disappointing. We can do better here, though. The challenge is that in a browser context, the session cookie is automatically included when making the remote …

1. Update cht-datasource to accept auth information when creating a remote DataContext

This would essentially be addressing #9701. We could pass username/password in the … As I noted on the ticket, though, I am reluctant to make changes to the implementation code just for testing purposes.

2. Re-introduce the MITM for …
@jkuester I did not go with the hook implementation because it would mean checking filenames to decide when to run, since this auth setup only needs to run for a few test suites. That would have resulted in a not-so-useful hook file full of checks.
Alright! We are getting very close here.
  freetext: Nullable<string> = null,
  type: Nullable<string> = null
) => ctx.bind(Contact.v1.getIds)(Contact.v1.createQualifier(freetext, type)),
getUuids: (
issue: I guess this should be getUuidsByTypeFreetext. Then we also need:
getUuidsPageByType
getUuidsByType
getUuidsPageByFreetext
getUuidsByFreetext
Right? I am still open to other ideas here since this full permutation approach is not very scalable. 😞 But, I am not sure what else to do except for accepting Qualifier parameters in this API.
The only other alternative I could think of is to make every parameter optional and process according to the non-empty parameters.
FTR, after further discussion on the squad call, we decided to go forward with having specific functions for each permutation of search parameters that we want to support. This will result in a lot of functions, but the amount of duplicated code will be limited since they will all just call through to the same core API functions. Additionally, being able to have verbose names for each function will improve the readability/clarity of this API. This seems like an opportunity to find simplicity by expanding horizontally instead of vertically (by nesting a bunch of optional params into the same qualifier object).
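For illustration only, a rough sketch of what that horizontal expansion could look like; the qualifier helpers and the shared core function below are hypothetical names for the example, not the actual cht-datasource API:

```ts
// Hypothetical sketch: one verbose function per supported permutation, all
// delegating to the same core function with different qualifier combinations.
type Qualifier = Record<string, unknown>;

const byContactType = (contactType: string): Qualifier => ({ contactType });
const byFreetext = (freetext: string): Qualifier => ({ freetext });

// Stand-in for the shared core function that actually runs the query.
const getUuidsCore = async (qualifier: Qualifier): Promise<string[]> => {
  console.log('querying with', qualifier);
  return [];
};

export const getUuidsByType = (type: string) => getUuidsCore(byContactType(type));
export const getUuidsByFreetext = (freetext: string) => getUuidsCore(byFreetext(freetext));
export const getUuidsByTypeFreetext = (type: string, freetext: string) =>
  getUuidsCore({ ...byContactType(type), ...byFreetext(freetext) });
```

The duplicated code per function stays tiny (build a qualifier, delegate), while each name documents exactly which search parameters it supports.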
@@ -228,12 +227,11 @@ export const getDatasource = (ctx: DataContext) => { | |||
* @throws InvalidArgumentError if the provided `limit` value is `<=0` | |||
* @throws InvalidArgumentError if the provided cursor is not a valid page token or `null` | |||
*/ | |||
getIdsPage: ( | |||
getUuidsPage: ( |
issue: Same here. I think this should be getUuidsPageByFreetext. Then the next function should be getUuidsByFreetext.
const allContacts = person ? [person, ...fetchedContacts] : fetchedContacts;
const contactsWithHydratedPrimaryContact = contacts.map(
  hydratePrimaryContact(allContacts)
).filter(item => item ?? false);
issue: why was this check included? I don't think we want to filter the null values here. That is part of the horrible jankiness with the lineage, where we might not actually find docs for all the levels of a contact's hierarchy. But, if a level is missing, we want to show that with null and not just drop the entry so it looks like there was no level there at all...
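To make the concern concrete, here is a small self-contained illustration (the names and data are made up) of how filtering the nulls collapses the hierarchy instead of showing the missing level:

```ts
// Hypothetical example: a lineage with a missing middle level (the health center).
type LineageDoc = { _id: string; name: string };

const lineageIds = ['clinic-1', 'missing-health-center', 'district-1'];
const docsById = new Map<string, LineageDoc>([
  ['clinic-1', { _id: 'clinic-1', name: 'Clinic' }],
  ['district-1', { _id: 'district-1', name: 'District' }],
]);

// Keeping the nulls preserves each level's position in the hierarchy:
const lineage = lineageIds.map(id => docsById.get(id) ?? null);
// => [clinic doc, null, district doc]

// Filtering the nulls silently drops the missing level:
const collapsed = lineage.filter(item => item ?? false);
// => [clinic doc, district doc] — looks like there was never a level in between
```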
};

/** @internal */
export const getContactLineage = (medicDb: PouchDB.Database<Doc>) => {
nitpick: I guess now the getPrimaryContactIds, hydratePrimaryContact, and hydrateLineage functions do not need to be exported since they are only used in this file.
-  data: pagedDocs.data.map((doc) => doc._id),
-  cursor: pagedDocs.cursor
-};
+return await getPaginatedDocs(getDocsFn, limit, skip);
issue: okay bad news 😞 I found an edge case that breaks this logic... We should never get null id values back from these freetext queries, but it is possible to get duplicate id values.
To test this, I created a contact with:
"name": "Alberto O'Kon Rivera",
"short_name": "River",
Then I did a search with freetext=river. The list of ids returned contained two instances of the same id value for this contact. This is because the getByStartsWithFreetext query will match the emissions for both River and Rivera (as intended). To make things even worse, even if we dupe-checked the ids returned for a given page, I don't think there is any reason that Couch could not give us more dupes of the same id later on different pages (because the view will return things ordered by key, not by id)... 😬
@sugat009 definitely interested to hear your thoughts on the best way to proceed here. Pragmatically, my current inclination is to guarantee that each page will be free from duplicates, but then note in our documentation that different pages could contain the same ids. (I cannot come up with any feasible way to guarantee no dupes across pages).
The good news is that if we decide to just dupe-check on a page-by-page basis, then I think we can easily just reuse the fetch and filter logic here like this:
const uuidSet = new Set<string>();
const filterFn = (uuid: Nullable<string>): boolean => {
  if (!uuid) {
    return false;
  }
  // Only keep the uuid if adding it actually grew the set (i.e. it was not a duplicate on this page).
  const { size } = uuidSet;
  uuidSet.add(uuid);
  return uuidSet.size !== size;
};
return await fetchAndFilter(
  getDocsFn,
  filterFn,
  limit
)(limit, skip);
(We just need to update the fetchAndFilter signature and change <T extends Doc> to just be <T>. I don't think we actually need the extends Doc for the current functionality.)
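Assuming that change, the loosened helper signature might look roughly like this (a sketch only; the body below is a placeholder, and the real fetchAndFilter in cht-datasource does the recursive fetching):

```ts
// Hypothetical sketch: relaxing <T extends Doc> to <T> so the helper can page
// over plain string uuids as well as full docs.
type Nullable<T> = T | null;
type Page<T> = Readonly<{ data: T[]; cursor: Nullable<string> }>;

const fetchAndFilter = <T>(
  getDocsFn: (limit: number, skip: number) => Promise<Nullable<T>[]>,
  filterFn: (item: Nullable<T>) => boolean,
  limit: number
) => async (currentLimit: number, currentSkip: number): Promise<Page<T>> => {
  // Placeholder body: fetch one batch, drop anything the filter rejects, cap at `limit`.
  const fetched = await getDocsFn(currentLimit, currentSkip);
  const data = fetched.filter(filterFn).slice(0, limit) as T[];
  return { data, cursor: null };
};
```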
The same thing is going to apply for the contact logic too.
@@ -50,7 +50,8 @@ export const getResource = (context: RemoteDataContext, path: string) => async <
   if (response.status === 404) {
     return null;
   } else if (response.status === 400) {
-    throw new InvalidArgumentError(response.statusText);
+    const errorMessage = await response.text();
question: what was the difference here between statusText and text()?
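For context on the question (general fetch behaviour, not specific to this codebase): statusText is only the HTTP reason phrase for the status code (and can even be empty over HTTP/2), while text() reads the response body, which is where the server's actual error message ends up. A quick illustration with a hypothetical URL:

```ts
// Generic illustration; the URL and the error body shown are hypothetical.
const response = await fetch('https://example.com/api/v1/contact/uuid?freetext=a');
if (response.status === 400) {
  console.log(response.statusText);   // e.g. "Bad Request" — just the reason phrase
  console.log(await response.text()); // e.g. a body explaining which argument was invalid
}
```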
praise: Nice! Super clean! 👍
Can we add a comment here just to indicate why we had to do this? Something like:
Currently the remote context for cht-datasource does not handle any authentication for its fetch calls because it inherits the session cookie when running in the browser. NodeJS does not support automatically applying cookies to fetch calls. So, for the integration tests we need to wrap the fetch calls to set the basic auth headers on each request.
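As a sketch of that test-only wrapper (the helper name and the way it gets wired into the tests are assumptions for illustration, not the exact setup in this PR):

```ts
// Hypothetical test helper: wrap fetch so every request carries basic auth,
// since NodeJS fetch does not attach the session cookie the browser would send.
const buildAuthFetch = (username: string, password: string): typeof fetch => {
  const authHeader = `Basic ${Buffer.from(`${username}:${password}`).toString('base64')}`;
  return (input, init = {}) => {
    const headers = new Headers(init.headers);
    headers.set('Authorization', authHeader);
    return fetch(input, { ...init, headers });
  };
};

// The integration tests could then patch the global fetch (or hand this wrapper
// to whatever builds the remote data context) before exercising the API.
globalThis.fetch = buildAuthFetch('admin', 'password');
```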
Description
Closes: #9586
Code review checklist
Compose URLs
If Build CI hasn't passed, these may 404:
License
The software is provided under AGPL-3.0. Contributions to this project are accepted under the same license.