flag/syntax to return exact matches only #5
Comments
Hi, thanks for the feedback! Hmm, would something like this help? Or what's your exact use case?
|
for example, if I wanted to search for domains which resolve to
however, this currently also returns "similar" records, such as: [...]
{
  "domain": {
    "raw": "sayari.ch"
  },
  "aaaa_record": {
    "raw": [
      "2a02:168:be04::42"
    ]
  },
  "_meta": {
    "id": "sayari.ch",
    "engine": "domains-prod",
    "score": 5.4933805
  },
  "id": {
    "raw": "sayari.ch"
  }
},
{
  "domain": {
    "raw": "alainwolf.ch"
  },
  "aaaa_record": {
    "raw": [
      "2a02:168:f405::42"
    ]
  },
  "_meta": {
    "id": "alainwolf.ch",
    "engine": "domains-prod",
    "score": 5.4933805
  },
  "id": {
    "raw": "alainwolf.ch"
  }
}
i.e. the aaaa record does not contain
similarly, if I search for "picantepizza", I get tons of results which contain the word "pizza" but not necessarily "picantepizza", such as:
so, what I was hoping for is an option in the GUI/API to only return results which contain the full search string, and not perform any similarity searches. |
Alright, let me take a look at it on the weekend or in the evening. I guess it has to do with how Elasticsearch is indexing this field... |
I've checked it, and it seems to be a problem with how the data gets indexed by Elasticsearch. I have asked the Elasticsearch team how to solve it with App Search, which I'm using under the hood. I'll update here if I get a solution from their side... |
Sorry for the long delay. I'm quite busy with school and work. Sadly there was no progress on Elastic's side: https://discuss.elastic.co/t/precise-regex-search/266141/4 I'll try to fix and reindex the data on the weekend... |
no worries, thanks for the update! |
OK, it's a product limitation of App Search (it may be added in a future version). Anyway, I had planned to create a REST API that queries the Elasticsearch backend directly. With that implemented, exact matching will be possible. For example:
Which currently results in 8 matches, possible? 🤔 My semester ends soon, so hopefully I'll find some time to continue with the project. |
So, for testing purposes you can use this endpoint. The syntax is that of the Elasticsearch search API: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html Currently it isn't documented on my side, and I'm not sure if I'll leave it like this (security, ...), but if you need help with the syntax and fields, let me know.
Resulting in:
|
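For readers who want to try the same kind of request, here is a minimal sketch of an exact-match search body for the Elasticsearch search API. It rests on several assumptions that the thread does not confirm: that the raw index exposes a field called `aaaa_record` (the name is taken from the App Search output quoted above), that this field is mapped as `keyword` or `ip` rather than analyzed text, and that the helper `exact_match_body` is a hypothetical name used only for illustration. The endpoint URL is not shown in the thread, so the sketch only builds and prints the body.

```python
import json


def exact_match_body(field: str, value: str, size: int = 100) -> dict:
    """Build a search body that only matches documents whose `field`
    contains exactly `value` (hypothetical helper, not part of any API)."""
    return {
        "size": size,
        "query": {
            "bool": {
                # Filter context: documents either match or they do not,
                # there is no relevance scoring and no "similar" hits.
                # Assumption: `field` is mapped as keyword or ip; on an
                # analyzed text field a term query compares against tokens.
                "filter": [{"term": {field: value}}]
            }
        },
    }


# Field name and address are taken from the result sample quoted above.
body = exact_match_body("aaaa_record", "2a02:168:be04::42")
print(json.dumps(body, indent=2))
```

The printed JSON could then be sent as the request body of a `_search` call against whatever endpoint is in use. If the field turns out to be analyzed text, a `term` filter may return nothing, and the mapping would need a keyword sub-field first.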
Works very well, thanks! Apart from the "passive dns" use case this enables other interesting searches like "give me all domains with null MX" 👍
No worries about the stable API - if you have to make changes/disable for security reasons that's obviously understandable. |
nothing easier than this ;)
Keep in mind that Elasticsearch returns at most 10000 results per query; check the scroll API (https://www.elastic.co/guide/en/elasticsearch/reference/current/scroll-api.html) if you need more results! For each record type I have a [type]_record field and a [type]_valid field (true = it exists). My Elasticsearch mapping got a little messed up with the last upgrade, so I'll have to review it later... So currently I have these records:
|
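As a rough illustration of paging past the 10000-result limit with the scroll API, the following sketch walks through all hits of a query. It is not the project's own client: `scroll_all` and the `send` callable are hypothetical, the index name `domains` is a placeholder, and `mx_valid` is only inferred from the `[type]_valid` naming described above. `send(path, body)` is assumed to POST the JSON body to the given path and return the decoded JSON response.

```python
from typing import Callable, Iterator

# Hypothetical transport: POSTs `body` as JSON to `path`, returns the parsed reply.
Send = Callable[[str, dict], dict]


def scroll_all(send: Send, index: str, query: dict,
               page_size: int = 1000, keep_alive: str = "1m") -> Iterator[dict]:
    """Yield every hit for `query`, one page at a time, via the scroll API."""
    # The first request opens a scroll context that stays alive for `keep_alive`.
    resp = send(f"/{index}/_search?scroll={keep_alive}",
                {"size": page_size, "query": query})
    # An empty page of hits signals that the result set is exhausted.
    while resp["hits"]["hits"]:
        yield from resp["hits"]["hits"]
        # Each follow-up request passes the most recent scroll id back.
        resp = send("/_search/scroll",
                    {"scroll": keep_alive, "scroll_id": resp["_scroll_id"]})
```

A call such as `scroll_all(send, "domains", {"term": {"mx_valid": False}})` would, under these assumptions, iterate over all domains without an MX record. Passing the transport in as a function keeps the sketch independent of any particular HTTP library or endpoint.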
Ohh, I may have misunderstood you - https://datatracker.ietf.org/doc/html/rfc7505 😁 but I still hope my comment above helps |
Thanks for searchzone.ch, it is a useful tool
Is it possible to somehow disable similarity search and only return results which contain the search string exactly?
For example I tried to perform "passive dns" like searches to see which ch-domains are hosted in certain ip ranges, but the results contain many unrelated results which just start with similar octets.