Skip to content
This repository has been archived by the owner on Dec 29, 2022. It is now read-only.

Should GeoHashQuery include endValue or not? #155

Open
algrid opened this issue Jul 5, 2019 · 6 comments
Open

Should GeoHashQuery include endValue or not? #155

algrid opened this issue Jul 5, 2019 · 6 comments

Comments

@algrid
Copy link

algrid commented Jul 5, 2019

Hello! Thanks for nice library.

I have a question about the following methods from GeoHashQuery.java:

    public static GeoHashQuery queryForGeoHash(GeoHash geohash, int bits) {
        String hash = geohash.getGeoHashString();
        int precision = (int)Math.ceil((double)bits/Base32Utils.BITS_PER_BASE32_CHAR);
        if (hash.length() < precision) {
            return new GeoHashQuery(hash, hash+"~");
        }
        hash = hash.substring(0, precision);
        String base = hash.substring(0, hash.length() - 1);
        int lastValue = Base32Utils.base32CharToValue(hash.charAt(hash.length() - 1));
        int significantBits = bits - (base.length() * Base32Utils.BITS_PER_BASE32_CHAR);
        int unusedBits = Base32Utils.BITS_PER_BASE32_CHAR - significantBits;
        // delete unused bits
        int startValue = (lastValue >> unusedBits) << unusedBits;
        int endValue = startValue + (1 << unusedBits);
        String startHash = base + Base32Utils.valueToBase32Char(startValue);
        String endHash;
        if (endValue > 31) {
            endHash = base + "~";
        } else {
            endHash = base + Base32Utils.valueToBase32Char(endValue);
        }
        return new GeoHashQuery(startHash, endHash);
    }

    public static Set<GeoHashQuery> queriesAtLocation(GeoLocation location, double radius) {
    ...
    }

I don't fully understand this code, but I have an impression that endValue from queryForGeoHash is not supposed to be included into the query. I mean that the query should be interpreted as startValue <= geohash < endValue but in fact when the query is executed it's inclusive: startValue <= geohash <= endValue.

To fix that we could change the line:

        int endValue = startValue + (1 << unusedBits);

into:

        int endValue = startValue + (1 << unusedBits) - 1;

To illustrate this, look at my results with center in Caracas, Venezuela and radius=350km (the selected area is what's included into the query):

  1. Original version:

Screen Shot 2019-07-05 at 11 40 44 PM

  1. My fixed version:

Screen Shot 2019-07-05 at 11 37 59 PM

Note: actually I'm using a kotlin version of the code from here: https://github.com/imperiumlabs/GeoFirestore-Android/blob/34b935c534c583520a706c3388c57962122846bf/geofirestore/src/main/java/org/imperiumlabs/geofirestore/core/GeoHashQuery.kt

but the logic is exactly the same, so I think that the code in this repo is (or is closer to) the original.

@smohankrishna

This comment has been minimized.

@samtstern

This comment has been minimized.

@samtstern
Copy link
Contributor

@algrid thanks for filing this with such detail! This looks like a sensible improvement to me but I don't know enough about geohashes to say for sure ... I am hoping that @puf can comment.

@puf
Copy link

puf commented Jul 22, 2019

I noticed the same outliers in my own testing with geohashes/geoqueries based on GeoFire, although never as bad as the db hash in your first screenshot. But I also had some queries where I didn't see outliers, so I assumed that it rounds up somewhere for edge cases. I wish I'd kept those queries around. :-/

@algrid
Copy link
Author

algrid commented Jul 23, 2019

@puf thanks for the info.

If we hit the endHash = base + "~"; case, then there should be no outliers, but otherwise they will exist. That's of course if my theory in the original post is valid.

But if there are cases with no outliers where the resulting endHash doesn't end with a ~, then the reality is more complex :)

@joselpomares
Copy link

joselpomares commented Nov 19, 2022

Hello. I thinks that the increment in the endValue must be calculate in other form. imo: is correct the bitwise operations for startValue, leaving 0s in unusedBits, but for the endValue we need filter values until (inclusive) that start with significantBits, and no more (the current bitwise addition change the values of significantBits). For that we need put 1s in the unusedBits places, without change the significantBits values (endValue = startValue + (31 >> significantBits);). Finally, to get a "suffix inclusive" query, add "~" to end of endValue-geohash. I hope that the attached image help to explain my idea. Sorry by my Inglish.
Sin título

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants