Real-time Translation using Azure Speech, Translation and PubSub Services

This application is a real-time translation and speech-to-text demo. It uses Azure Web PubSub, Translator, Speech, and OpenAI services to translate spoken messages, transfer them to a browser, and summarize them. The server can be accessed through a browser or through its API endpoints, and the output is displayed on the index.html page. Microphone audio can be captured either through the speaker.html web page or by running the local app, main.py. A test mode is also available for recording and summarizing English text.

Prerequisites

Python
Create an Azure Web PubSub resource
Create an Azure Translator resource
Create an Azure Speech services resource
Create an Azure OpenAI service resource
(Optional) Install ffmpeg if you plan to use the speaker.html web page for audio input

Setup

# Create the virtual environment
python -m venv env
# Activate the virtual environment
source env/bin/activate
# Install the dependencies
pip install -r requirements.txt

Install ffmpeg on the server machine if you intend to use the speaker.html web page for audio input. ffmpeg is required to convert the .webm audio recorded by the browser into the .wav format required by Azure Speech services.
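
The conversion step described above can be sketched as follows. This is a minimal illustration, not the code used by server.py: the function names are hypothetical, and the 16 kHz, mono, 16-bit PCM settings are an assumption based on a WAV format commonly used with Azure Speech.

```python
import shutil
import subprocess


def build_ffmpeg_command(src_path: str, dst_path: str) -> list[str]:
    """Return the ffmpeg argument list for a .webm to .wav conversion.

    The audio settings below are assumptions, not values taken from this project.
    """
    return [
        "ffmpeg",
        "-y",                  # overwrite the output file if it already exists
        "-i", src_path,        # input recording from the browser (.webm)
        "-ac", "1",            # mono (assumed)
        "-ar", "16000",        # 16 kHz sample rate (assumed)
        "-c:a", "pcm_s16le",   # 16-bit PCM samples (assumed)
        dst_path,
    ]


def convert_webm_to_wav(src_path: str, dst_path: str) -> None:
    """Run ffmpeg, failing early with a clear message if it is not installed."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    subprocess.run(build_ffmpeg_command(src_path, dst_path), check=True)
```

Calling convert_webm_to_wav("input.webm", "output.wav") would then produce a .wav file next to the recording, provided ffmpeg is installed.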

Environment Setup

API keys and connection details, such as connection strings and regions, are set through environment variables. They can be set either as system environment variables or in a .env file. An example is provided in the .env_example file. To create the .env file, copy .env_example and edit the values.

Variables required:

AZURE_REGION - Azure region, e.g. 'eastus'
TRANSLATOR_KEY - Azure Translator API key from the Keys and Endpoint tab of the Azure Translator service
PUBSUB_ENDPOINT - connection string from the Keys tab of the created Azure Web PubSub service
PUBSUB_HUBNAME - identifier for the pub/sub topic, e.g. "sample_stream"
SPEECH_KEY - Azure Speech API key from the Keys and Endpoint tab of the Azure Speech service
AZURE_OPENAI_ENDPOINT - endpoint URL from the Keys and Endpoint tab of the Azure OpenAI service
CHAT_COMPLETIONS_DEPLOYMENT_NAME - deployment name of the resource from the Deployments tab of Azure OpenAI Studio
OPENAI_API_KEY - Azure OpenAI API key from the Keys and Endpoint tab of the Azure OpenAI service
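
Before starting the server, it can help to confirm that all of the variables listed above are set. The following is a small sketch of such a check. The helper name missing_variables is hypothetical, and the script only inspects the process environment; it does not read the .env file itself.

```python
import os

# Variable names taken from the list above.
REQUIRED_VARS = [
    "AZURE_REGION",
    "TRANSLATOR_KEY",
    "PUBSUB_ENDPOINT",
    "PUBSUB_HUBNAME",
    "SPEECH_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "CHAT_COMPLETIONS_DEPLOYMENT_NAME",
    "OPENAI_API_KEY",
]


def missing_variables(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required environment variables are set.")
```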

Start the server

python server.py

Once the server is running, open the home page http://localhost:5000 in a browser. Enter any name for a site and click the 'Go' button. You will be taken to the page http://localhost:5000/<sitename>. You can use any name as the sitename. If you press F12 and open the Network tab of the browser developer tools, you can see that the WebSocket connection is established.

The server can also be accessed through the API endpoints documented in api-definitions.md.

Audio input can be provided through a local Python app, main.py, or through the web page http://localhost:5000/<sitename>/speaker
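
The display page and the speaker page for a site share the same sitename. As a rough sketch, the two URLs can be derived as shown below. The helper names are hypothetical, and percent-encoding the sitename is an assumption; the server may restrict or handle site names differently.

```python
from urllib.parse import quote

BASE_URL = "http://localhost:5000"  # default local address used in this README


def display_url(sitename: str) -> str:
    """URL of the page that shows the translated output for a site."""
    return f"{BASE_URL}/{quote(sitename, safe='')}"


def speaker_url(sitename: str) -> str:
    """URL of the page that captures microphone audio for the same site."""
    return f"{display_url(sitename)}/speaker"
```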

To start the local app, main.py

Run:

source ./env/bin/activate
python main.py <sitename>

The sitename should be the same as the one used when accessing the translation display through the browser. If no sitename is provided, it defaults to test_site. The app should use your system's default microphone. Start speaking, and you can see the messages being translated and transferred to the browser.
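
The default-site behaviour described above could be implemented along these lines. This is only a sketch of one way to do it; the function name resolve_sitename is hypothetical and main.py may handle its arguments differently.

```python
import sys

DEFAULT_SITE = "test_site"  # default site name stated in this README


def resolve_sitename(argv):
    """Return the first command-line argument, or the default site if none is given."""
    if len(argv) > 1 and argv[1].strip():
        return argv[1]
    return DEFAULT_SITE


if __name__ == "__main__":
    print(f"Using site: {resolve_sitename(sys.argv)}")
```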

Test Mode

There is a test mode available which creates a test_site and displays the English text recorded via the microphone. It also prefills the English text box with an example recording, so that summarization can be tested without speaking the complete text into the microphone. To go to the test site, open http://localhost:5000/test in a browser.

Docker Build

The entire application can be built within a Docker container. The Docker build supports a command-line build argument that copies a .env file containing the environment variables into the container for local testing. For production deployment, the container should be built without the .env file (the default build), with the environment variables provided by the host.

To build for production:

docker build -t my-translator-app .

To build and use a .env file for testing:

docker build --build-arg COPY_ENV_FILE=true -t my-translator-app .
