Skip to content
This repository has been archived by the owner on Sep 12, 2018. It is now read-only.

[query] Support SQLite's LIKE for non-fulltext text searching #747

Open
ncalexan opened this issue Jun 18, 2018 · 3 comments
Open

[query] Support SQLite's LIKE for non-fulltext text searching #747

ncalexan opened this issue Jun 18, 2018 · 3 comments
Labels
A-query Issues or requests for query capabilities. enhancement help wanted

Comments

@ncalexan
Copy link
Member

While implementing a rough clone of Firefox for iOS's logins handling, I noticed that we don't support SQLite's LIKE operator for non-fulltext text searching. That's what we use in Firefox for iOS, and I don't want to make some of these fields :db/fulltext true, so we should grow support for it. I'm thinking that it'll be a special filtering function, like:

[:find ?e :where
 [?e :credential/name ?t]
 [(string-contains ?t "pattern")]]

although there are a few subtleties. First, "pattern" can be a binding (which is well supported by SQLite). Second, the pattern can contain _ and %, which have special meaning to SQLite. It's easy to escape a constant pattern, but not so easy to escape a binding (coming from elsewhere in the query engine, i.e., another column). We could make escaping the responsibility of the consumer, but that's likely to lead to surprises.

As a first cut, we could only accept literal patterns, which we can escape (or not) and transform concretely.

@ncalexan ncalexan added enhancement help wanted A-query Issues or requests for query capabilities. labels Jun 18, 2018
@rnewman
Copy link
Collaborator

rnewman commented Jun 19, 2018

I concur re literal patterns, but perhaps consider string-starts-with and string-ends-with (which would omit the implied %string% stuff in SQLite).

I encourage you to require that any such ?t has a bound attribute, because otherwise you’re doing an unbounded full search of any string in the store, including vocabularies that calling code might not be aware of. You can use or successfully here.

@ncalexan
Copy link
Member Author

I encourage you to require that any such ?t has a bound attribute, because otherwise you’re doing an unbounded full search of any string in the store, including vocabularies that calling code might not be aware of. You can use or successfully here.

It's not clear to me how to trace provenance of bindings so concretely in the code as currently expressed. This has been on my mind a bit as I idly ponder what our performance profile would be if we stored the indices as separate tables and then used the "best possible" table at query time. (That's the strategy Datascript and presumably Datomic take.)

But yes, bad things can happen with unbounded table walks. Datomic prevents them, which is irritating when you're trying to inspect the store :)

@rnewman
Copy link
Collaborator

rnewman commented Jun 21, 2018

We could hack in this kind of placement analysis into the initial pattern walk — we know at this point which variables are both string matches and the objects of patterns, and what the attributes of those patterns are — but in general this is constraint algebrizing, and it’s hard.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-query Issues or requests for query capabilities. enhancement help wanted
Projects
None yet
Development

No branches or pull requests

2 participants