Speech Synthesis Markup Language (SSML)

A markup language for describing how text should be read aloud by text-to-speech systems such as the Web Speech API or the Google Cloud Text-to-Speech API.

Some of the elements are shown in the examples below. The rest are well documented in the Google Cloud docs.

To run the two scripts in the /google-cloud-text-to-speech directory, first set up a Google Cloud account and then authenticate to Text-to-Speech.
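
For orientation, here is a minimal Python sketch of the JSON body that the Cloud Text-to-Speech REST method `text:synthesize` accepts for SSML input. It is not one of the scripts from this repository: `build_synthesize_request` is a hypothetical helper, and the voice and audio settings are assumed defaults. Nothing is sent over the network, so no credentials are needed to run it.

```python
import json

# Public REST endpoint of Cloud Text-to-Speech (v1). Calling it needs credentials.
SYNTHESIZE_URL = "https://texttospeech.googleapis.com/v1/text:synthesize"


def build_synthesize_request(ssml, language_code="en-US"):
    """Return the JSON body for a text:synthesize call with SSML input.

    Hypothetical helper: language code and MP3 encoding are assumed defaults.
    """
    return {
        "input": {"ssml": ssml},
        "voice": {"languageCode": language_code},
        "audioConfig": {"audioEncoding": "MP3"},
    }


body = build_synthesize_request('<speak>Hello <break time="1s"/> world.</speak>')
print(json.dumps(body, indent=2))
```

Posting this body to `SYNTHESIZE_URL` with a valid access token should return the audio as a base64 string in the `audioContent` field.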

`say-as`

<speak>
  <p><say-as interpret-as="time" format="hms24">12:00</say-as></p>
  <p><say-as interpret-as="unit">10 foot</say-as></p>
  <p><say-as interpret-as="ordinal">1</say-as> grade</p>
  <p><say-as interpret-as="duration" format="m:s">12:30</say-as></p>
</speak>

Interpreting text in a specific way (as currency, a unit, a duration, etc.).
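
When the text comes from variables, it is safer to generate the element than to concatenate strings by hand. The sketch below is one possible way to do it; `say_as` is a hypothetical helper that relies only on Python's standard XML escaping functions.

```python
from xml.sax.saxutils import escape, quoteattr


def say_as(text, interpret_as, fmt=None):
    """Build a say-as element with escaped text and quoted attributes."""
    attrs = f"interpret-as={quoteattr(interpret_as)}"
    if fmt is not None:
        attrs += f" format={quoteattr(fmt)}"
    return f"<say-as {attrs}>{escape(text)}</say-as>"


print(say_as("1", "ordinal"))
# <say-as interpret-as="ordinal">1</say-as>
print(say_as("12:30", "duration", "m:s"))
# <say-as interpret-as="duration" format="m:s">12:30</say-as>
```

Escaping matters as soon as the text contains `&` or `<`, which would otherwise make the SSML invalid.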

`par`

<speak>
  <par>
    <media begin="1s">
      <audio src="https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3">
        <desc>T-rex roar</desc>
        ROOARRRRR
      </audio>
    </media>
    <media>
      <speak>You can hear the sound of the t-rex in the background.</speak>
    </media>
  </par>
</speak>

Playing several media elements in parallel, for example speech over a background sound.

`seq`

<speak>
  <seq>
    <media begin="1s">
      <speak>Sentence after 1 second.</speak>
    </media>
    <media begin="3s">
      <speak>And another 3 seconds after the previous one.</speak>
    </media>
  </seq>
</speak>

Like par, but the media elements play in sequence instead of in parallel.
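
To check what a par or seq block actually declares before sending it to an engine, the markup can be parsed like any other XML. The sketch below lists each media element with its container and `begin` value. `media_offsets` is a hypothetical helper, and showing "0" for a missing `begin` is an assumption made for display only.

```python
import xml.etree.ElementTree as ET

SSML = """
<speak>
  <seq>
    <media begin="1s">
      <speak>Sentence after 1 second.</speak>
    </media>
    <media begin="3s">
      <speak>And another 3 seconds after the previous one.</speak>
    </media>
  </seq>
</speak>
"""


def media_offsets(ssml):
    """Return (container, begin, text) for every media element in par or seq."""
    root = ET.fromstring(ssml.strip())
    rows = []
    for container in root.iter():
        if container.tag not in ("par", "seq"):
            continue
        for media in container.findall("media"):
            # Collapse the indentation whitespace around the spoken text.
            text = " ".join("".join(media.itertext()).split())
            # Assumption: a missing begin attribute is shown as "0".
            rows.append((container.tag, media.get("begin", "0"), text))
    return rows


for row in media_offsets(SSML):
    print(row)
```

This only reports what is written in the markup. How the offsets are timed is up to the engine.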

`media`

A media layer inside a par or seq element. It wraps a speak or audio element.

`phoneme`

<speak>
  <phoneme alphabet="ipa" ph="ˌmænɪˈtoʊbə">manitoba</phoneme>
  <phoneme alphabet="x-sampa" ph='m@"hA:g@%ni:'>mahogany</phoneme>
</speak>

Phonetic transcription.

See: Supported phonemes and levels of stress.
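
Phonetic strings are awkward to quote by hand: the X-SAMPA example above contains a double quote, which is why its `ph` attribute uses single quotes. A small sketch of how this can be automated, assuming a hypothetical `phoneme` helper built on the standard library's `quoteattr`, which picks the quote character for you:

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme(text, ph, alphabet="ipa"):
    """Build a phoneme element; quoteattr chooses safe quotes for ph."""
    return (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(text)}</phoneme>"
    )


print(phoneme("manitoba", "ˌmænɪˈtoʊbə"))
# <phoneme alphabet="ipa" ph="ˌmænɪˈtoʊbə">manitoba</phoneme>
print(phoneme("mahogany", 'm@"hA:g@%ni:', alphabet="x-sampa"))
# <phoneme alphabet="x-sampa" ph='m@"hA:g@%ni:'>mahogany</phoneme>
```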

`voice`

<speak>
  <p>
    <s>Here comes the conversation.</s>
    <s><voice gender="female">I am a woman.</voice></s>
    <s><voice gender="male">I am a man.</voice></s>
    <s><voice language="pl-PL" gender="male">A ja jestem z Polski.</voice></s>
  </p>
</speak>

Choosing the voice, for example by gender or language.

`break`

<speak>
  <p>
    <s>Some text <break time="1s" /> and then text after 1 second.</s>
    <s>
      And then <break strength="medium" />text after medium-strength break
      <break strength="x-weak" />x-weak break and <break strength="x-strong" />x-strong break
    </s>
  </p>
</speak>

Pausing, either for a fixed time or with a relative strength.
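
The `time` attribute takes a number followed by `s` or `ms`. If you generate breaks from code, a quick check of that format can catch typos early. The following is a sketch only; `break_seconds` is a hypothetical helper and does not handle the `strength` keywords.

```python
import re

# A non-negative number followed by "ms" or "s", e.g. "250ms", "1s", "0.5s".
_TIME = re.compile(r"^(\d*\.?\d+)(ms|s)$")


def break_seconds(value):
    """Convert a break time value such as "1s" or "250ms" to seconds."""
    match = _TIME.match(value.strip())
    if match is None:
        raise ValueError(f"not a valid time value: {value!r}")
    number, unit = match.groups()
    seconds = float(number)
    return seconds / 1000 if unit == "ms" else seconds


print(break_seconds("1s"))     # 1.0
print(break_seconds("250ms"))  # 0.25
```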

`audio`

<speak>
  <p>Let's listen to the sound of the t-rex!</p>
  <p>
    <audio src="https://interactive-examples.mdn.mozilla.net/media/cc0-audio/t-rex-roar.mp3">
      <desc>T-rex roar</desc>
      ROOARRRRR
    </audio>
  </p>
</speak>

Playing an audio file. The text inside audio is the fallback that is read if the file cannot be played.

`prosody`

<speak>
  <p>
    <prosody rate="slow">This is the slow voice.</prosody>
  </p>
  <p>
    <prosody rate="x-fast">This is the very fast voice.</prosody>
  </p>
  <p>
    <prosody volume="x-soft">This is the very soft voice.</prosody>
  </p>
  <p>
    <prosody volume="x-loud">This is the very loud voice.</prosody>
  </p>
  <p>
    <prosody pitch="low">This is the low pitch voice.</prosody>
  </p>
  <p>
    <prosody pitch="high">This is the high pitch voice.</prosody>
  </p>
  <p>
    <prosody pitch="-10st">This is the customized pitch voice.</prosody>
  </p>
  <p>
    <prosody pitch="+10st">This is another customized pitch voice.</prosody>
  </p>
</speak>

Adjusting the rate, volume, and pitch of the voice.
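
The `st` suffix in the pitch values stands for semitones. As a rough guide to how large a shift is, the sketch below converts a semitone value to a frequency ratio using equal temperament, where 12 semitones are one octave. `semitone_ratio` is a hypothetical helper, and an engine may apply the shift differently internally.

```python
def semitone_ratio(pitch):
    """Return the frequency ratio for a pitch value such as "+10st"."""
    if not pitch.endswith("st"):
        raise ValueError(f"expected a semitone value like '+2st', got {pitch!r}")
    semitones = float(pitch[:-2])
    # 12 semitones double the frequency (one octave up).
    return 2 ** (semitones / 12)


print(f"{semitone_ratio('+10st'):.3f}")  # 1.782
print(f"{semitone_ratio('-10st'):.3f}")  # 0.561
```

So `+10st` is a little less than an octave up, and `-10st` a little less than an octave down.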

`emphasis`

<speak>
  <emphasis level="strong">This is an important announcement.</emphasis>
  <emphasis level="moderate">This is an important announcement.</emphasis>
  <emphasis level="none">This is an important announcement.</emphasis>
  <emphasis level="reduced">This is an important announcement.</emphasis>
</speak>

I like to call these prosody's presets.
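
The SSML specification defines four values for `level`: `strong`, `moderate`, `none`, and `reduced`, with `moderate` as the default. A small sketch of how generated markup could be checked against that list; `invalid_emphasis_levels` is a hypothetical helper and `loud` is a deliberately wrong value.

```python
import xml.etree.ElementTree as ET

# The four level values defined by the SSML specification.
ALLOWED_LEVELS = {"strong", "moderate", "none", "reduced"}


def invalid_emphasis_levels(ssml):
    """Return every emphasis level in the document that is not allowed."""
    root = ET.fromstring(ssml.strip())
    return [
        el.get("level")
        for el in root.iter("emphasis")
        # A missing level attribute means the default, "moderate".
        if el.get("level", "moderate") not in ALLOWED_LEVELS
    ]


sample = """
<speak>
  <emphasis level="strong">Fine.</emphasis>
  <emphasis level="loud">Not a real level.</emphasis>
  <emphasis>Default level.</emphasis>
</speak>
"""
print(invalid_emphasis_levels(sample))  # ['loud']
```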

`p` and `s`

<p><s>This is sentence one.</s><s>This is sentence two.</s></p>

Paragraphs and sentences.

`sub`

<sub alias="World Wide Web Consortium">W3C</sub>

Replacing text with an alias to be read instead. Useful for abbreviations, for instance.
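
To preview what will be read, you can resolve the aliases yourself. The sketch below walks the markup and substitutes each `sub` element with its `alias`. `spoken_text` is a hypothetical helper that handles only `sub`, so every other element simply contributes its text.

```python
import xml.etree.ElementTree as ET


def spoken_text(ssml):
    """Return the text of the document with sub elements replaced by alias."""
    parts = []

    def walk(el):
        if el.tag == "sub":
            # The alias is read instead of the element's own text.
            parts.append(el.get("alias", ""))
        else:
            if el.text:
                parts.append(el.text)
            for child in el:
                walk(child)
        if el.tail:
            parts.append(el.tail)

    walk(ET.fromstring(ssml.strip()))
    return " ".join("".join(parts).split())


print(spoken_text(
    '<speak>Published by the <sub alias="World Wide Web Consortium">W3C</sub>.</speak>'
))
# Published by the World Wide Web Consortium.
```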
