README: Update readme (#20)
* README: Update readme

Signed-off-by: Marta Plantykow <[email protected]>

* remove print

* add images and HamNoSys description into readme

* add citation to readme

Co-authored-by: majsylw <[email protected]>
mplantykow and majsylw authored Apr 15, 2022
1 parent f4bb62d commit e991c98
Showing 5 changed files with 109 additions and 153 deletions.
261 changes: 109 additions & 152 deletions README.md
@@ -1,18 +1,77 @@
# parse-hamnosys
Sign language HamNoSys notation parsing tool.
![CI workflow badge](https://github.com/hearai/parse-hamnosys/workflows/CI-pipeline/badge.svg) ![Visits Badge](https://badges.pufler.dev/visits/hearai/parse-hamnosys)

<p align="center">
<a href="https://www.hearai.pl"><img src="https://i.imgur.com/wKCpSOh.png" height="auto" width="200"></a>
</p>

# HamNoSys Parser

![parser schema](./imgs/schemat.JPG)

The main goal of the parser is to translate a label representing a sign language (SL) gloss,
written in HamNoSys format, into a form that can be used for deep learning (DL) based classification.
We used a decision tree to decompose the notation into numerical multilabels
for the defined classes.
The parser analyzes a series of symbols and either matches each symbol with the class
that describes it (assigning it the appropriate number) or removes it.
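
The matching step described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the class names mirror this README, but the symbol tables (`TOY_CLASSES`), the ASCII placeholder symbols, the default label 0, and the `decompose` helper are all hypothetical and are not taken from `parse-hamnosys.py`, which works on real HamNoSys glyphs.

```python
# Hypothetical symbol tables: each class maps a symbol to its numeric label.
# Real HamNoSys glyphs are replaced by ASCII placeholders for readability.
TOY_CLASSES = {
    "Symmetry operator": {"S": 1, "T": 2},
    "Handshape - Baseform": {"a": 0, "b": 1, "c": 2},
    "Handshape - Thumb position": {"x": 1, "y": 2},
}


def decompose(notation: str) -> dict:
    """Assign each recognised symbol to its class; drop unknown symbols."""
    # Assumption: a class with no matching symbol keeps the default label 0.
    labels = {name: 0 for name in TOY_CLASSES}
    seen = set()
    for symbol in notation:
        for name, table in TOY_CLASSES.items():
            if symbol in table and name not in seen:
                labels[name] = table[symbol]
                seen.add(name)
                break
        # A symbol that matches no class is removed simply by being skipped.
    return labels


print(decompose("Sc?y"))
```

In this toy run the `?` placeholder belongs to no class and is dropped, while the other three symbols each fill one class.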

## Hamburg Sign Language Notation System

The Hamburg Sign Language Notation System (HamNoSys) is a _phonetic_
transcription system that has been in widespread use for more than 20 years.
HamNoSys does not refer to different national finger alphabets and
can therefore be used internationally.
It can be divided into six basic blocks, as presented
in the figure below (upper panel).
The first two of the six blocks - symmetry operator and
non-manual features - are optional. The remaining four components - handshape,
hand position, hand location, and movement - are mandatory.
A more detailed structure is also presented in the figure (lower panel).

![HamNoSys structure](./imgs/HamNoSys_structure_detailed.JPG)

### Font

The HamNoSys font must be installed to display the provided examples properly.
It can be downloaded directly from the
[DGS Corpus](https://www.sign-lang.uni-hamburg.de/dgs-korpus/index.php/hamnosys-97.html)
website.

## Parser classes

In our implementation, four blocks (symmetry operator, location left/right, location top/bottom, distance from the body) refer to the overall human posture, while five blocks (handshape base form, handshape thumb position, handshape bending, hand position extended finger direction, and hand position palm orientation) relate to a single hand.

List of blocks and their ranges:
* __Symmetry operator__ class consists of 9 symbols (0 to 8).
* __Handshape - Baseform__ class consists of 12 symbols (0 to 11).
* __Handshape - Thumb position__ class consists of 4 symbols (0 to 3).
* __Handshape - bending__ class consists of 6 symbols (0 to 5).
* __Handposition - Extended finger direction__ class consists of 18 symbols (0 to 17).
* __Handposition - Palm orientation__ class consists of 8 symbols (0 to 7).
* __Handposition - Left/Right__ class consists of 5 symbols (0 to 4).
* __Handposition - Top/Bottom__ class consists of 37 symbols (0 to 36).
* __Handposition - Distance__ class consists of 6 symbols (0 to 5).

The figure below shows the numerical values and the HamNoSys symbols assigned to them.
![multilabel classes](./imgs/Classes_all_some_frames.JPG)
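
The ranges listed above can be written down as a small lookup table, which is handy for sanity-checking parsed labels. The sketch below is an illustration only: `CLASS_SIZES` and `out_of_range` are hypothetical names that do not come from `parse-hamnosys.py`, and only the class names and sizes are taken from the list above.

```python
# Number of symbols per class, as listed in this README.
# Valid labels for a class run from 0 to size - 1.
CLASS_SIZES = {
    "Symmetry operator": 9,
    "Handshape - Baseform": 12,
    "Handshape - Thumb position": 4,
    "Handshape - bending": 6,
    "Handposition - Extended finger direction": 18,
    "Handposition - Palm orientation": 8,
    "Handposition - Left/Right": 5,
    "Handposition - Top/Bottom": 37,
    "Handposition - Distance": 6,
}


def out_of_range(labels: dict) -> list:
    """Return the names of classes that are unknown or whose label is invalid."""
    return [
        name
        for name, value in labels.items()
        if name not in CLASS_SIZES or not 0 <= value < CLASS_SIZES[name]
    ]


# Label 6 is outside the 0 to 5 range of the Distance class.
print(out_of_range({"Symmetry operator": 8, "Handposition - Distance": 6}))
```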

## Usage

In its most basic form, parsing a new annotation file boils down to:

```
$ python parse-hamnosys.py -sf <source_file> -df <destination_file>
$ python parse-hamnosys.py -sf <source_file> -df <destination_file> -ef <error_file>
```
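
If the parser needs to be called from another Python script, the same command line can be assembled programmatically. This is a minimal sketch that assumes the script is run from the repository root; `build_command` is a hypothetical helper, and only the `-sf`, `-df`, and `-ef` flags are taken from the commands above.

```python
import sys
from typing import List, Optional


def build_command(source: str, destination: str,
                  error: Optional[str] = None) -> List[str]:
    """Assemble the argument list for one parse-hamnosys.py run."""
    cmd = [sys.executable, "parse-hamnosys.py", "-sf", source, "-df", destination]
    if error is not None:
        # The error file is optional, mirroring the second command above.
        cmd += ["-ef", error]
    return cmd


# To actually run the parser, pass the list to the standard library, e.g.:
#   import subprocess
#   subprocess.run(build_command("in.txt", "out.txt"), check=True)
print(build_command("in.txt", "out.txt", "err.txt")[1:])
```

Building the arguments as a list rather than a single string avoids shell quoting problems with file names.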

## Usage example
The repository includes some example files, which were produced by running:

```
$ python parse-hamnosys.py -sf hamnosys_example.txt -df hamnosys_parsed.txt
$ python parse-hamnosys.py -sf hamnosys_example.txt -df hamnosys_parsed.txt -ef error.txt
```

### Source file format

The default input file format is defined by the [HearAI](https://github.com/hearai/hearai) project requirements. As in [hamnosys_example.txt](hamnosys_example.txt), the parser requires a file with 6 columns separated by a single space (" "). Because of this, any space inside a column must be removed or replaced (for example, with "_") before the file is passed to the parser. The parser operates only on the HamNoSys notation, which is stored in the last (6th) column. The notation must start with a HamNoSys symbol (no quotation mark or apostrophe is allowed).
Input file columns and description:
* Name - name of the video file that the given notation refers to
* Start - sign start time (in the video)
@@ -21,162 +80,60 @@ Input file columns and description:
* Word - translation to a spoken language
* Hamnosys - Notation
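
The column rules above can be checked before a file is handed to the parser. The following sketch is an assumption-laden illustration, not part of `parse-hamnosys.py`: the helper names are hypothetical, header handling is not covered, and the string `ABC` merely stands in for a real HamNoSys notation.

```python
from typing import List

EXPECTED_COLUMNS = 6  # the source file has 6 space-separated columns


def sanitize_field(value: str) -> str:
    """Replace spaces inside a single field so it survives space-splitting."""
    return value.strip().replace(" ", "_")


def split_source_line(line: str) -> List[str]:
    """Split one source-file line and check that it has exactly 6 columns."""
    fields = line.rstrip("\n").split(" ")
    if len(fields) != EXPECTED_COLUMNS:
        raise ValueError(
            f"expected {EXPECTED_COLUMNS} columns, got {len(fields)}"
        )
    return fields


def notation_of(line: str) -> str:
    """Return the last (6th) column, which holds the HamNoSys notation."""
    return split_source_line(line)[-1]


print(sanitize_field("good morning"))
print(notation_of("video.mp4 0 10 x word ABC\n"))
```

A field such as `good morning` would otherwise be read as two columns, which is why it has to be sanitized before the line is written.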

### Destination file format

The default destination file consists of the following columns, separated by a space (" "):
* Name - name of the video file that the given notation refers to, copied directly from the source file
* Start - sign start time (in the video), copied directly from the source file
* End - sign end time (in the video), copied directly from the source file
* Symmetry operator - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* NonDom first - Used when notation starts with  sign
* Dominant - Handshape - Baseform - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handshape - Thumb position - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handshape - bending - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handposition - extended finger direction - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handposition - palm orientation - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handposition - LR - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handposition - TB - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handposition - Distance - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Dominant - Handshape - Baseform2
* Dominant - Handshape - Thumb position2
* Dominant - Handshape - Bending2
* Dominant - Handposition - Extended finger direction2
* Dominant - Handposition - Palm orientation2
* NONDominant - Handshape - Baseform
* NONDominant - Handshape - Thumb position
* NONDominant - Handshape - Bending
* NONDominant - Handposition - Extended finger direction
* NONDominant - Handposition - Palm orientation
* NONDominant - Handshape - Baseform2
* NONDominant - Handshape - Thumb position2
* NONDominant - Handshape - Bending2
* NONDominant - Handposition - Extended finger direction2
* NONDominant - Handposition - Palm orientation2
* Handposition - LR - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Handposition - TB - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation
* Handposition - Distance - Number that represents one of the classes (please refer to hamnosys_dicts.txt), parsed from notation

## hamnosys_example.txt
File created using [Korpusowy słownik polskiego języka migowego](https://www.slownikpjm.uw.edu.pl/)
Joanna Łacheta, Małgorzata Czajkowska-Kisil, Jadwiga Linde-Usiekniewicz, Paweł Rutkowski (eds.), 2016, Korpusowy słownik polskiego języka migowego, Warszawa: Wydział Polonistyki Uniwersytetu Warszawskiego, ISBN: 978-83-64111-49-5 (online publication).
## Citation

If you find this code useful in your research, please consider [citing](https://arxiv.org/abs/2204.06924):

## Font
The HamNoSys font must be installed to properly understand the subsection [Classes](#classes).
It can be downloaded directly from the [DGS Corpus](https://www.sign-lang.uni-hamburg.de/dgs-korpus/index.php/hamnosys-97.html) website.

## Classes
### Symmetry operator
Symmetry operator class consists of 9 symbols (0 to 8). Following list represents class number to symbol mapping:
0. None
1. 
2. 
3. 
4. 
5. 
6. 
7.
8.

### Handshape - Baseform
Handshape - baseform class consists of 12 symbols (0 to 11). Following list represents class number to symbol mapping:
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.

### Handshape - Thumb position
Handshape - thumb position class consists of 4 symbols (0 to 3). Following list represents class number to symbol mapping ([Handshape - base form sign](#handshape---baseform) sign is used only as a reference):
0. None
1. 
2. 
3. 

### Handshape - Bending
Handshape - bending class consists of 6 symbols (0 to 5). Following list represents class number to symbol mapping ([Handshape - base form sign](#handshape---baseform) sign is used only as a reference):
0.  (none)
1. 
2. 
3. 
4. 
5. 

### Handposition - Extended finger direction
Handposition - extended finger direction class consists of 18 symbols (0 to 17). Following list represents class number to symbol mapping:
0.
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
### Handposition - Palm orientation
Handshape - palm orientation class consists of 8 symbols (0 to 7). Following list represents class number to symbol mapping:
0.
1.
2.
3.
4.
5.
6.
7.

### Handposition - Left/Right
Handposition - left/right class consists of 5 symbols (0 to 4). Following list represents class number to symbol mapping ([Handposition - Top/Bottom sign](#handposition---topbottom) is used only as a reference):
0.  - center
1.  - left to the left
2.  - left
3.  - right
4.  - right to the right

### Handposition - Top/Bottom
Handposition - top/bottom class consists of 37 symbols (0 to 36). Following list represents class number to symbol mapping:
0. None
1.
2.
3.
4.
5.
6.
7.
8.
9.
10.
11.
12.
13.
14.
15.
16.
17.
18.
19.
20.
21.
22.
23.
24.
25.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.

### Handposition - Distance
Handposition - top/bottom class consists of 37 symbols (0 to 36). Following list represents class number to symbol mapping:
0. None
1.
2.
3.
4.
5.
6.
```
@misc{majchrowska2022hamnosys,
doi = {10.48550/ARXIV.2204.06924},
url = {https://arxiv.org/abs/2204.06924},
author = {Majchrowska, Sylwia and Plantykow, Marta and Olech, Milena},
title = {Open Source HamNoSys Parser for Multilingual Sign Language Encoding},
publisher = {arXiv},
year = {2022},
copyright = {arXiv.org perpetual, non-exclusive license}
}
```

## Acknowledgement

File `hamnosys_example.txt` was created using [Korpusowy słownik polskiego języka migowego](https://www.slownikpjm.uw.edu.pl/)

Joanna Łacheta, Małgorzata Czajkowska-Kisil, Jadwiga Linde-Usiekniewicz, Paweł Rutkowski (eds.), 2016, Korpusowy słownik polskiego języka migowego, Warszawa: Wydział Polonistyki Uniwersytetu Warszawskiego, ISBN: 978-83-64111-49-5 (online publication).

## Read more
* [Introduction to HamNoSys](https://www.hearai.pl/post/4-hamnosys/)
* [Introduction to HamNoSys Part 2](https://www.hearai.pl/post/5-hamnosys2/)
Binary file added imgs/Classes_all_some_frames.JPG
Binary file added imgs/HamNoSys_structure_detailed.JPG
Binary file added imgs/schemat.JPG
1 change: 0 additions & 1 deletion parse-hamnosys.py
@@ -149,7 +149,6 @@ def main(args):
    data["Handposition - Distance"] = 0

    for index, _ in data.iterrows():
        print(index)
        # Search for symmetry operators that consists of from 3 to 1 symbol,
        # remove if found
        for i in range(1, 10):