This Python script converts a Microsoft Word document (.docx) into an MP3 audio file using the Azure Cognitive Services Text-to-Speech API. It now supports long documents: to work around the 10-minute limit of the Azure TTS API, it splits the text into chunks and combines the resulting audio files.
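As a rough illustration of the splitting step, the sketch below breaks text into chunks at sentence boundaries. It is not taken from the script: the function name `split_text` and the `max_chars` limit are assumptions, and a character count is only a stand-in for the audio-length limit.

```python
import re


def split_text(text, max_chars=3000):
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: chunks end at sentence boundaries where possible.
    max_chars is an assumed proxy for the service's audio-length limit.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut into fixed-size pieces.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # The sentence does not fit: close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be synthesized separately, and the resulting audio files joined in order.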
- Python 3.6 or higher
- An Azure subscription key for the Text-to-Speech service. Follow the instructions at BobTranslate to obtain an API key.
- The region for the Azure Text-to-Speech service.
- A voice shortname for the Text-to-Speech service. A list of available voices can be found at Language and voice support for the Speech service.
- The ffmpeg software package. This is required for splitting long documents into smaller chunks and combining the resulting audio files.
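To show why ffmpeg is needed, here is a minimal sketch of joining several MP3 parts with ffmpeg's concat demuxer. Whether the script combines files this way is not stated here; the function names and file names are hypothetical, and the sketch assumes file paths contain no single quotes.

```python
import subprocess
from pathlib import Path


def build_concat_command(part_files, list_file, output_file):
    """Write an ffmpeg concat list file and return the command that joins the parts."""
    # The concat demuxer reads one "file '<path>'" entry per line.
    lines = [f"file '{Path(p).resolve().as_posix()}'" for p in part_files]
    Path(list_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return [
        "ffmpeg", "-y",
        "-f", "concat", "-safe", "0",
        "-i", str(list_file),
        "-c", "copy",  # copy the audio streams without re-encoding
        str(output_file),
    ]


def combine_parts(part_files, output_file, list_file="parts.txt"):
    """Run ffmpeg to join the parts; requires ffmpeg on the system PATH."""
    cmd = build_concat_command(part_files, list_file, output_file)
    subprocess.run(cmd, check=True)
```

For example, `combine_parts(["part-000.mp3", "part-001.mp3"], "combined.mp3")` would join two parts in the given order.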
- Clone the repository or download the source code. You can clone the repository by using the command:

      git clone https://github.com/lancer1911/msword-Azure-tts.git

  If you don't have Git installed, you can download the source code directly. Go to the repository's main page on GitHub, click on the "Code" button, and then click "Download ZIP". Once the ZIP file is downloaded, extract it to access the source code.
- Install the required dependencies:

      pip install -r requirements.txt

- Install ffmpeg:

  macOS:

      brew install ffmpeg

  Linux (Ubuntu/Debian):

      sudo apt-get update
      sudo apt-get install ffmpeg

  Windows: Download a static build from the official site. Unzip the downloaded file and add the bin directory from the unzipped file to your system PATH.

- Open the settings.cfg file and add your Azure subscription key, region, voice shortname, and speech recognition language:

      [Azure]
      subscription_key = your_subscription_key
      region = your_region
      voice_shortname = voice_shortname # e.g. en-US-EricNeural
      speech_recognition_language = your_recognition_language # e.g. en-US

  Replace your_subscription_key, your_region, voice_shortname, and your_recognition_language with the appropriate values.
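To check that settings.cfg is filled in correctly, something like the following sketch can be used. It assumes the file is read with Python's standard `configparser`, which may not be how the script does it; `load_azure_settings` is a hypothetical helper. Note that `configparser` does not strip inline `#` comments unless `inline_comment_prefixes` is set, so the sketch enables it.

```python
import configparser

REQUIRED_KEYS = (
    "subscription_key",
    "region",
    "voice_shortname",
    "speech_recognition_language",
)


def load_azure_settings(path="settings.cfg"):
    """Return the [Azure] settings as a dict, failing loudly on missing values."""
    # inline_comment_prefixes lets "value # e.g. ..." parse as just "value".
    parser = configparser.ConfigParser(inline_comment_prefixes=("#",))
    if not parser.read(path, encoding="utf-8"):
        raise FileNotFoundError(path)
    section = parser["Azure"]
    missing = [key for key in REQUIRED_KEYS if not section.get(key)]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return {key: section[key] for key in REQUIRED_KEYS}
```

Calling `load_azure_settings()` from the repository directory returns the four values, or raises an error naming whichever setting is missing.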
You can run the script by providing a .docx file path as a command-line argument or by selecting the file using a file dialog.
    python msword-Azure-tts.py sample.docx

Replace sample.docx with the path to your Word document.

Run the script without any command-line arguments:

    python msword-Azure-tts.py

A file dialog will appear, allowing you to select the Word document to convert.
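The two ways of choosing the input file can be sketched as follows. This is an illustration only: the dialog toolkit is assumed to be `tkinter`, and `choose_docx` with its `ask_user` parameter is a hypothetical helper, not a function from the script.

```python
import sys


def choose_docx(argv, ask_user=None):
    """Return the .docx path from argv, or ask the user if none was given.

    ask_user is an optional callable returning a path (assumed hook, useful
    for testing); when omitted, a tkinter file dialog is shown instead.
    """
    if len(argv) > 1:
        return argv[1]
    if ask_user is None:
        # Imported lazily so the command-line path works without a display.
        from tkinter import Tk, filedialog

        root = Tk()
        root.withdraw()  # hide the empty main window behind the dialog
        path = filedialog.askopenfilename(
            title="Select a Word document",
            filetypes=[("Word documents", "*.docx")],
        )
        root.destroy()
        return path or None
    return ask_user() or None
```

A script would typically call `choose_docx(sys.argv)` and exit if the result is `None`, which happens when the dialog is cancelled.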
The script will save the generated MP3 audio file in the same directory as the input file with the same name and an -Azure-tts.mp3 suffix. For example, if the input file is named sample.docx, the output file will be named sample-Azure-tts.mp3.
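The naming rule above can be expressed in a few lines. This is a sketch of the rule as described, with a hypothetical function name, not the script's own code.

```python
from pathlib import Path


def output_path_for(docx_path):
    """Return the MP3 path next to the input file, with the -Azure-tts.mp3 suffix."""
    source = Path(docx_path)
    # Keep the directory, drop the .docx extension, append the suffix.
    return source.with_name(source.stem + "-Azure-tts.mp3")


print(output_path_for("sample.docx"))  # prints sample-Azure-tts.mp3
```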