Onsei: Japanese pitch accent practice tool

This project aims to create tools that automatically assess the pitch accent accuracy of Japanese language learners and help them practice pitch accent at the sentence level.

PLEASE NOTE THAT THIS IS AN EXPERIMENTAL WORK IN PROGRESS!
Feedback and suggestions are welcome via Gitter chat or GitHub issues.

How to play with it

Click here to deploy the web interface.

Note that this can take a few minutes to load!

Did you like it? Please consider donating to help support future development. Thank you!

What is it for?

Japanese is a pitch-accent language, so learners whose mother tongue has neither pitch accent nor tones will likely struggle to identify and reproduce the correct pitch patterns.

If you are completely new to pitch accent, I suggest you first start with an introductory course such as this one.

Practicing with sentences rather than individual words is useful because the theoretical accent patterns of a sentence often differ from how native speakers actually say it, for many reasons (emphasis on certain words, emotions, slurred speech...).

Setup

The following instructions have been tested on Ubuntu 20.20.

Since there are many dependencies to compile from source, the easiest way is to build using Docker:

docker build -t onsei .

Then run the following command:

docker run -p 8866:8866 -v "$PWD":/home/jovyan/work --entrypoint=voila onsei:latest

Open the interface in your web browser: http://localhost:8866/voila/render/work/notebook.ipynb

For development, run JupyterLab instead:

docker run -p 8888:8888 -e JUPYTER_ENABLE_LAB=yes -v "$PWD":/home/jovyan/work onsei:latest

Open the notebook in your browser: http://127.0.0.1:8888/lab/tree/notebook.ipynb

Alternatively, it should build with jupyter-repo2docker:

pip3 install jupyter-repo2docker
jupyter-repo2docker -E .

API

An API version has been developed to create an Anki add-on!

To set it up:

# First build onsei base image
docker build -t onsei .
# Then build the onsei-api image on top of it
docker build -f Dockerfile.api -t onsei-api .
docker run --network=host onsei-api
# Open http://127.0.0.1:8000/ in your web browser

Or, if you already have everything installed locally, you can simply run it with:

uvicorn onsei.api:app

Using the CLI

Note: you probably want to use the Jupyter notebook first; see the instructions above.

For more advanced usage, a CLI is available.

Visualize a recording

python3 -m onsei.cli view \
    "data/ps/ps1_boku_no_chijin-teacher2.wav" \
    --sentence "僕の知人の経営者に"

Comparing teacher and student recordings

The following command compares teacher and student recordings of the same sentence, shows several graphs to visualize the differences, and computes a distance, i.e., how close the student's pronunciation is to the teacher's.

Here is an example with the sentence 僕の知人の経営者に (boku no chijin no keieisha ni). The sample recordings are:

data/ps/ps1_boku_no_chijin-student1.wav: student mispronouncing words
data/ps/ps1_boku_no_chijin-teacher2.wav: teacher repeating with correct pronunciation
data/ps/ps1_boku_no_chijin-student3.wav: student trying again and fixing the mistakes

First, compare the mispronounced sentence with the teacher's:

python3 -m onsei.cli compare \
    data/ps/ps1_boku_no_chijin-teacher2.wav \
    data/ps/ps1_boku_no_chijin-student1.wav \
    --sentence "僕の知人の経営者に"
# Mean distance: 1.21 (smaller means student speech is closer to teacher)

Then compare the corrected sentence with the teacher's:

python3 -m onsei.cli compare \
    data/ps/ps1_boku_no_chijin-teacher2.wav \
    data/ps/ps1_boku_no_chijin-student3.wav \
    --sentence "僕の知人の経営者に"
# Mean distance: 0.57 (smaller means student speech is closer to teacher)

(Note that the natural pitch offset is removed when the pitches are normalized to compute the distance.)

As the student fixes the mistakes, the computed distance decreases.
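To compare several student takes in a row, the commands above can be wrapped in a small helper. The following is only a sketch, not part of the project: the function names are hypothetical, and it assumes the CLI prints a line of the form "Mean distance: 1.21" to standard output, as suggested by the comments above. Since the command also shows graphs, it may wait until the graph windows are closed.

```python
import re
import subprocess
import sys

# Assumed output format, taken from the "# Mean distance: 1.21" comments above.
_DISTANCE_RE = re.compile(r"Mean distance:\s*([0-9]+(?:\.[0-9]+)?)")


def build_compare_cmd(teacher_wav, student_wav, sentence):
    """Build the argument list for `python3 -m onsei.cli compare` (hypothetical helper)."""
    return [
        sys.executable, "-m", "onsei.cli", "compare",
        teacher_wav, student_wav,
        "--sentence", sentence,
    ]


def parse_mean_distance(output):
    """Return the first 'Mean distance' value found in the text, or None if absent."""
    match = _DISTANCE_RE.search(output)
    return float(match.group(1)) if match else None


def compare(teacher_wav, student_wav, sentence):
    """Run the comparison and return the parsed distance (None if it was not printed)."""
    result = subprocess.run(
        build_compare_cmd(teacher_wav, student_wav, sentence),
        capture_output=True, text=True, check=True,
    )
    return parse_mean_distance(result.stdout)
```

For example, calling `compare()` once with `ps1_boku_no_chijin-student1.wav` and once with `ps1_boku_no_chijin-student3.wav` against the same teacher recording would give two numbers to compare directly.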

Other commands

To list the other available commands, use the CLI help:

# List of the commands
python3 -m onsei.cli --help

# Details on a specific command
python3 -m onsei.cli <command> --help

Methodology

If you are interested in how the comparison works, here is an overview:

1. Crop both recordings to remove the noise before and after the sentence
2. Segment both recordings to find where each phoneme starts and ends
3. Align the student recording with the teacher's, using Dynamic Time Warping (DTW) based on detected phonemes (by default) or on speech intensity
4. Apply the same alignment to the pitch signals and normalize them
5. Compute a mean distance based on the aligned and normalized pitch signals
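The alignment, normalization and distance steps can be illustrated with a minimal, standard-library sketch. This is not the project's implementation: it aligns on a toy intensity sequence rather than on phonemes, and the choice of z-score normalization and mean absolute difference is an assumption made for illustration. All function names are hypothetical, and the inputs are assumed to be non-empty.

```python
import math


def dtw_path(a, b):
    """Classic DTW on two 1-D sequences; returns the aligned (i, j) index pairs."""
    n, m = len(a), len(b)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end of both sequences to recover the warping path.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps, key=lambda s: s[0])
    path.reverse()
    return path


def normalize(pitch):
    """Z-score normalization (an assumed choice): removes offset and scale."""
    mean = sum(pitch) / len(pitch)
    std = math.sqrt(sum((p - mean) ** 2 for p in pitch) / len(pitch)) or 1.0
    return [(p - mean) / std for p in pitch]


def mean_pitch_distance(teacher_feat, student_feat, teacher_pitch, student_pitch):
    """Align on a feature (e.g. intensity), reuse the alignment on pitch, then average."""
    path = dtw_path(teacher_feat, student_feat)
    t = normalize([teacher_pitch[i] for i, _ in path])
    s = normalize([student_pitch[j] for _, j in path])
    return sum(abs(x - y) for x, y in zip(t, s)) / len(path)


# Toy data: the student speaks more slowly (one extra frame) and 50 Hz higher.
teacher_int = [0.1, 0.9, 0.8, 0.2]
student_int = [0.1, 0.9, 0.9, 0.8, 0.2]
teacher_f0 = [120, 180, 170, 110]
same_contour = [170, 230, 230, 220, 160]   # same shape, shifted by 50 Hz
wrong_contour = [230, 170, 170, 220, 160]  # rises and falls in the wrong places

d_same = mean_pitch_distance(teacher_int, student_int, teacher_f0, same_contour)
d_wrong = mean_pitch_distance(teacher_int, student_int, teacher_f0, wrong_contour)
```

In this toy case `d_same` is essentially zero because the constant pitch offset disappears after normalization, while `d_wrong` is clearly larger.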
