This repository contains a collection of Persian poetry scraped from the Ganjoor website for NLP tasks.
- Poet Name Folder: Each poet has a folder named after them.
- Poetic Work Folder: Within the poet’s folder, there are subfolders for different poetic works.
- Poem File: Each poem is saved as a separate .txt file.
- File Content: Each line in a poem file represents a verse, with the hemistichs separated by the `|` character.
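
The file format described above can be read with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `parse_poem` is hypothetical, and it assumes the files are UTF-8 encoded and that any whitespace around the `|` separator can be trimmed.

```python
from pathlib import Path


def parse_poem(text: str) -> list[list[str]]:
    """Split poem text into verses, each verse a list of hemistichs."""
    verses = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # Skip blank lines (assumption: they carry no verse).
            continue
        # Each line is one verse; hemistichs are separated by "|".
        verses.append([part.strip() for part in line.split("|")])
    return verses


def read_poem(path: str) -> list[list[str]]:
    """Read a poem file from disk (assumes UTF-8 encoding)."""
    return parse_poem(Path(path).read_text(encoding="utf-8"))
```

For example, `read_poem("hafez/ghazal/354.txt")` would return one inner list per verse, which is a convenient shape for tokenization or for building verse-level datasets.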
Let’s say you want the poem at the link ganjoor.net/hafez/ghazal/sh354. In this repository, that poem is saved at the following path: `hafez/ghazal/354.txt`
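
The mapping from a Ganjoor link to a repository path can be sketched in Python as follows. The helper name `ganjoor_url_to_path` is hypothetical, and the rule it applies (keep the poet and work segments, drop the letter prefix such as `sh` from the last segment, append `.txt`) is inferred from the single example above, so it may not hold for every poet or work.

```python
import re
from urllib.parse import urlparse


def ganjoor_url_to_path(url: str) -> str:
    """Map a Ganjoor poem link to the assumed path inside this repository."""
    # urlparse needs a scheme to separate the host from the path.
    if "://" not in url:
        url = "https://" + url
    segments = [s for s in urlparse(url).path.split("/") if s]
    if not segments:
        raise ValueError(f"no path segments in link: {url}")
    *folders, last = segments
    # Assumption: the last segment is letters followed by the poem
    # number, e.g. "sh354"; only the number is kept in the file name.
    match = re.fullmatch(r"[A-Za-z]*(\d+)", last)
    if match is None:
        raise ValueError(f"unexpected poem segment: {last}")
    return "/".join([*folders, match.group(1) + ".txt"])


print(ganjoor_url_to_path("ganjoor.net/hafez/ghazal/sh354"))
```

Links whose last segment does not end in a number raise `ValueError` rather than guessing a path.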