Annotation tool with another languages #49

Open
KirillErokhin opened this issue Apr 24, 2023 · 2 comments

Comments

@KirillErokhin

Hello @yangheng95 and team!

I'm trying to annotate my own data for fine-tuning with ABSADatasetPrepareTool.html v.2.0 or https://pyabsa.pages.dev/, but I've run into some issues:
I cannot annotate text in another language (Russian): the words just stick together.

Here is an example screenshot from the tool:
image
The tool shows безопасноститоже. when it should be безопасности тоже.

And from https://pyabsa.pages.dev/:
image
This one is better, but I cannot annotate individual words, only the full text.

What should I do?

P.S.
Thanks a lot for your great work on PyABSA!

@yangheng95
Owner

#14 (comment)

@KirillErokhin
Author

#14 (comment)

Tried it. It doesn't work.
And saved files have exactly the same problem.

image

Now:
image

And the saved file looks like this:
image

Many words are stuck together.

OK, I found a solution:

Replace this code on line 143:

<span class="wtoken">
    <span :id="item.tkid"
            :class="TokenClass(item)"
            v-for="(item, index) in PageArrays.rvarr"
            v-on:click="_.size(item.tk.trim())==0 ? void(0) : TokenHighlight(event.target, PageArrays.rvarr, item, index)"
            v-if="item.tk.trim()!==''"
            :key="item.tkid">{{index==0 ? item.tk : !item.tk.match(/\p{P}/gu) ? ' ' + item.tk : item.tk}}</span>
</span>

With this:

<span class="wtoken">
    <span :id="item.tkid"
            :class="TokenClass(item)"
            v-for="(item, index) in PageArrays.rvarr"
            v-if="item.tk.trim() !== ''"
            v-on:click="_.size(item.tk.trim()) === 0 ? false : TokenHighlight(event.target, PageArrays.rvarr, item, index)"
            :key="item.tkid">
        {{ index === 0 ? item.tk : (!item.tk.match(/[^\w\s]/g) ? ' ' : '') + item.tk }}
    </span>
</span>

And now:
image

Then, to save the file without words sticking together, replace the following lines:

  • line 568
sentence = (v.map((v, i) => i == 0 || v.tk == "" ? v.tk : !v.tk.match(/\p{P}/gu) ? ' ' + v.tk : v.tk)).join("");

with this:

sentence = (v.map((v, i) => i == 0 || v.tk == "" ? v.tk : !v.tk.match(/[^\w\s]/g) ? ' ' : ' ' + v.tk)).join(""); 
  • line 584
sentence = (temp.map((v, i) => i == 0 || v.tkFinal == "" ? v.tkFinal : !v.tkFinal.match(/\p{P}/gu) ? ' ' + v.tkFinal : v.tkFinal)).join("");

with this:

sentence = (temp.map((v, i) => i == 0 || v.tkFinal == "" ? v.tkFinal : !v.tkFinal.match(/[^\w\s]/g) ? ' ' : ' ' + v.tkFinal)).join("");
  • line 626
return i == 0 || v.tk == "" ? v.tk : !v.tk.match(/\p{P}/gu) ? ' ' + v.tk : v.tk;

with this:

return i == 0 || v.tk == "" ? v.tk : !v.tk.match(/[^\w\s]/g) ? ' ' : ' ' + v.tk;

And now saving works perfectly:
image
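
A caveat on the replacement lines above: in JavaScript, `\w` matches only ASCII letters, digits, and underscore, so `/[^\w\s]/g` matches every Cyrillic word. With that pattern the ternary takes the `' ' + v.tk` branch for Russian tokens, which is why the output looks right here, but an ASCII-only token such as an English word takes the other branch and is replaced by a single space. Below is a minimal sketch of a join that treats both scripts the same way. `joinTokens` is a hypothetical helper, not part of ABSADatasetPrepareTool.html, and it assumes the tokens are plain strings (the `tk` / `tkFinal` values).

```javascript
// Hypothetical helper, not part of ABSADatasetPrepareTool.html.
// Joins tokens into a sentence, putting a space before every token
// except the first one and tokens that consist only of punctuation.
function joinTokens(tokens) {
  const onlyPunctuation = /^\p{P}+$/u; // \p{P} requires the "u" flag
  return tokens
    .filter((tk) => tk !== "") // empty tokens contribute nothing
    .map((tk, i) => (i === 0 || onlyPunctuation.test(tk) ? tk : " " + tk))
    .join("");
}

// \w is ASCII-only, so the negated class matches a Cyrillic word
// but not an English one.
console.log(/[^\w\s]/.test("тоже")); // true
console.log(/[^\w\s]/.test("too")); // false

console.log(joinTokens(["безопасности", "тоже", "."])); // безопасности тоже.
console.log(joinTokens(["safety", "too", "."])); // safety too.
```

This sketch also leaves out the space before opening brackets and quotes, because they count as punctuation too, so it may need adjusting for text that uses them.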
