Recommendation (unsupervised/low-resource text alignment) #21

Open
pltrdy opened this issue Dec 20, 2018 · 1 comment

pltrdy commented Dec 20, 2018

Hey guys,

Good work both on MatchZoo and this list!
I'd be interested in some quick advice or pointers on something related: I'd like to match related parts of two texts.

More formally, I have a document made of several sections (each with multiple sentences), and I'd like to map it to a similar text (a transcription, in fact) that is a bit longer and somewhat noisy, but covers the same content (lots of similarities) in the same order. I wrote a dynamic programming algorithm that maximizes the cosine similarity between sentence embeddings. The results aren't too bad, but I'd like to experiment with other approaches.
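For concreteness, here is a minimal sketch of this kind of order-preserving alignment. It is not the actual implementation, only an illustration under assumptions: `embed` is a hypothetical bag-of-words stand-in for a real sentence embedding, each document sentence is matched to at most one transcript sentence, and skipping a sentence on either side costs nothing.

```python
import math
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector (assumption: stand-in for a real sentence embedding)."""
    return Counter(sentence.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def align(doc, transcript):
    """Return order-preserving (doc_idx, transcript_idx) pairs that
    maximize the summed cosine similarity of the matched sentences."""
    n, m = len(doc), len(transcript)
    doc_vecs = [embed(s) for s in doc]
    tr_vecs = [embed(s) for s in transcript]
    sim = [[cosine(d, t) for t in tr_vecs] for d in doc_vecs]

    # best[i][j]: best total similarity using doc[:i] and transcript[:j]
    best = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best[i][j] = max(
                best[i - 1][j],                          # skip a document sentence
                best[i][j - 1],                          # skip a transcript sentence
                best[i - 1][j - 1] + sim[i - 1][j - 1],  # match the two sentences
            )

    # Walk back through the table to recover the matched pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        s = sim[i - 1][j - 1]
        if s > 0 and best[i][j] == best[i - 1][j - 1] + s:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif best[i][j] == best[i - 1][j]:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs


doc = ["the cat sat on the mat", "dogs chase cats", "birds fly south in winter"]
transcript = [
    "um so the cat sat on a mat",
    "okay moving on",
    "dogs often chase cats",
    "birds fly south in the winter",
]
print(align(doc, transcript))  # [(0, 0), (1, 2), (2, 3)]
```

The noisy transcript line ("okay moving on") is skipped because matching it would not increase the total score. Swapping `embed` and `cosine` for real sentence embeddings leaves the dynamic program unchanged.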

Any ideas?

Thanks a lot for any clues or references that seem relevant. We could also discuss this on gitter.im.

Paul


I don't have much gold data (i.e. suitable segments to use as training pairs), which is why I mention unsupervised/low-resource approaches.

bwanglzu added the question label Dec 20, 2018
bwanglzu (Member) commented

@faneshion can you answer?
