A simple Python SRT parser.
Soustitle
[FrEnglish]: sous [FR] meaning sub and title [EN], a combination of the two forms soustitle (subtitle).
Reads and parses a subtitle (.srt) file and returns a list of dictionaries, one per subtitle entry, holding the subtitle details (e.g. start/end time and subtitle text). The result can also be converted to CSV or JSON format for further use.
This script is ported from my Scala version of the library. You can check the project version here.
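To make the description above concrete, here is a minimal standalone sketch of this kind of parsing and conversion using only the standard library. It is not the soustitle implementation: the helper names `parse_srt_text`, `to_json_text` and `to_csv_text` are hypothetical, and the sketch assumes cues are separated by blank lines and that timestamps use the `HH:MM:SS,mmm` form. The dictionary keys and the colon-separated timestamp format mirror the output shown in this README.

```python
import csv
import io
import json
import re

# Matches an SRT timing line such as "00:00:12,815 --> 00:00:14,509".
TIMING = re.compile(
    r"(\d{2}:\d{2}:\d{2}),(\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2}),(\d{3})"
)

FIELDS = ["start_time", "end_time", "subtitle_text"]


def parse_srt_text(text):
    """Hypothetical helper: turn SRT text into a list of dictionaries."""
    entries = []
    # Assumption: cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = [line.strip() for line in block.splitlines() if line.strip()]
        # Locate the timing line; the text lines follow it.
        for index, line in enumerate(lines):
            match = TIMING.match(line)
            if match:
                break
        else:
            continue  # no timing line in this block, so skip it
        start_hms, start_ms, end_hms, end_ms = match.groups()
        entries.append({
            "start_time": f"{start_hms}:{start_ms}",
            "end_time": f"{end_hms}:{end_ms}",
            # Multi-line subtitle text is joined into a single string.
            "subtitle_text": " ".join(lines[index + 1:]),
        })
    return entries


def to_json_text(entries):
    """Hypothetical helper: serialize the parsed entries as JSON text."""
    return json.dumps(entries, indent=2)


def to_csv_text(entries):
    """Hypothetical helper: serialize the parsed entries as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()


if __name__ == "__main__":
    demo = "1\n00:00:12,815 --> 00:00:14,509\nLorem ipsum dolor sit amet\n"
    parsed = parse_srt_text(demo)
    print(to_json_text(parsed))
    print(to_csv_text(parsed))
```

The sketch returns text rather than writing files, which keeps it easy to test; the library's own `to_csv` and `to_json` methods take an output path instead.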
$ python3
>>> from soustitle import Subtitle
>>> srt = Subtitle('resources/sample.srt')
>>> res = srt.open()
>>> print(res)

Alternatively, the file path could be passed directly to the open method.
>>> srt = Subtitle()
>>> res = srt.open('resources/sample.srt')
>>> print(res)

An SRT string can also be parsed directly.
>>> sample = """1
00:00:12,815 --> 00:00:14,509
Lorem ipsum dolor sit amet
consectetur adipiscing elit.

2
00:00:14,815 --> 00:00:16,498
Lorem ipsum dolor sit amet.

3
00:00:16,934 --> 00:00:17,814
Lorem ipsum dolor sit amet."""
>>> parse_srt = Subtitle(srt_string=sample)
>>> result = parse_srt.parse()
>>> print(result)
[
  {
    "start_time": "00:00:12:815",
    "end_time": "00:00:14:509",
    "subtitle_text": "Lorem ipsum dolor sit amet consectetur adipiscing elit."
  }, {
    "start_time": "00:00:14:815",
    "end_time": "00:00:16:498",
    "subtitle_text": "Lorem ipsum dolor sit amet."
  }, {
    "start_time": "00:00:16:934",
    "end_time": "00:00:17:814",
    "subtitle_text": "Lorem ipsum dolor sit amet."
  }
]

- Convert to .csv format
>>> csv_out = parse_srt.to_csv(result, 'resources/output.csv')
>>> print(csv_out)

- Convert to .json format
>>> json_out = parse_srt.to_json(result, 'resources/output.json')
>>> print(json_out)

git clone https://github.com/mdauthentic/soustitle-py.git
cd soustitle-py
python3 -m venv my-env
source my-env/bin/activate
pip install -r requirements.txt

Run the tests:
pytest tests.py

Optionally, you can run the tests with Docker:
docker build -t image-name .
docker run image-name