This is the official repository for the paper "Evaluating Multimodal Large Language Models on Vertically Written Japanese Text".
We evaluate the reading capability of existing MLLMs on vertically written Japanese text.
JSSODa (Japanese Simple Synthetic OCR Dataset) is constructed by rendering Japanese text generated by an LLM into images. The images contain text written both vertically and horizontally, arranged in one to four columns.
- train, val: llm-jp/JSSODa
- test: llm-jp/JSSODa-test
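As a minimal sketch of how these splits might be loaded, assuming the datasets are hosted on the Hugging Face Hub under the repository names above and that the `datasets` package is installed (the helper names here are hypothetical, not part of this repository):

```python
def jssoda_repo(split: str) -> str:
    """Map a split name to its Hugging Face repository.

    train/val live in llm-jp/JSSODa; the test split is
    released separately as llm-jp/JSSODa-test.
    """
    return "llm-jp/JSSODa-test" if split == "test" else "llm-jp/JSSODa"

def load_jssoda(split: str = "train"):
    # Lazy import: requires `pip install datasets` (assumption)
    from datasets import load_dataset
    return load_dataset(jssoda_repo(split), split=split)
```

For example, `load_jssoda("val")` would fetch the validation split from `llm-jp/JSSODa`, while `load_jssoda("test")` would fetch from the separate test repository.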
VJRODa (Vertical Japanese Real-world OCR Dataset) consists of images containing vertically written Japanese text sourced from real-world PDF pages.
Install uv, then run the following commands:

```shell
uv venv --python 3.10.18 --seed
uv sync
```

Please refer to this README.
The code is released under the Apache License, Version 2.0.
```bibtex
@misc{sasagawa2025evaluatingmultimodallargelanguage,
      title={Evaluating Multimodal Large Language Models on Vertically Written Japanese Text},
      author={Keito Sasagawa and Shuhei Kurita and Daisuke Kawahara},
      year={2025},
      eprint={2511.15059},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2511.15059},
}
```