Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question - how to use this to search for common strings based on prefix #70

Open
avnerbarr opened this issue Jan 4, 2022 · 2 comments

Comments

@avnerbarr
Copy link

avnerbarr commented Jan 4, 2022

Hi,

I'm a bit confused as to how this is supposed to be used.

My example is as follows:

I have many entries (sentences/paragraphs etc.)

given an arbitrary prefix, I would like to retrieve all relevant matches from the trie.

For example:

Trie construction by these phrases

1. Hello I am happy
2. Hello I am sad
3. Hello I am afraid
4. Hello you are sad
5. Hello you are happy

If I understand correctly the trie would compress along these lines

                                                                   │ compressed�
         ┌─────────────────────────►   Hello                       │
         │                              │   │                      │    │
         │                              │   │                      │    │
         │                              │   │          ◄───────────┘    │
         │ ┌─────────────────────► I ◄──┘   └──► You                    │
compressed │                        │              │          ◄─────────┘
                                    │              └──► are
           │                        ▼                     ▼  │
           └───────────────────►  a�m�│                  sad └──────►  happy
                                     ┌┤
                                 │   ▼│
                        happy◄───┘ sad└─────►afraid

Now given a partial phrase such as:
Hello I

I would like to be able to retrieve the results for 1,2,3 (since they share the prefix Hello I)

Can you point me in how to achieve this?

Thanks!

@Drvanon
Copy link

Drvanon commented Apr 6, 2022

I have a similar issue. I would like to run an algorithm based on the common prefix, but as far as I have been able to read the documentation, the common prefix is not able to be accessed, only the nibble vec that represents it.

@vemonet
Copy link

vemonet commented Dec 21, 2023

@avnerbarr @Drvanon I also faced the same issue, in case that might be useful to the next person searching: I reused and improved an existing implementation for a generic trie without any dependencies. I added functions to easily find prefixes or postfixes in the trie and packaged it there: https://github.com/vemonet/ptrie

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants