...

  • A Python class VoiceSynthesis, similar to SpeechSynthesizer in azure.cognitiveservices.speech.

  • A Python class VoiceClientConfig that stores the development configuration (API key and endpoint).

  • A Python class AudioFormat that specifies the audio format.

  • A Python class VoiceSynthesisResult that stores the result status, the audio data, and any necessary message.

  • A Python enum class VoiceSynthesisResultStatus that specifies the result status.

Code block
language: py
class VoiceSynthesis(
    config: VoiceClientConfig,
    format: AudioFormat
)
(method) def speak_ssml(ssml: str) -> VoiceSynthesisResult

The SSML string (ref1, ref2) passed to speak_ssml should contain at least two tags: voice and prosody. For instance:

Code block
language: xml
<speak>
    <voice language="en-US" name="en-US-BigBigNeural">
        <prosody rate="1.05">
            This is a sample sentence.
        </prosody>
    </voice>
</speak>
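
The tag requirement above can be checked before a request is sent. The following is a minimal sketch using only the standard library; the helper name missing_ssml_tags is hypothetical and not part of the interface described here.

```python
import xml.etree.ElementTree as ET

# Required tags, taken from the requirement stated above.
REQUIRED_TAGS = ("voice", "prosody")


def missing_ssml_tags(ssml: str) -> list[str]:
    """Return the required tags that do not appear anywhere in the SSML.

    Hypothetical helper; raises ET.ParseError if the SSML is not well-formed XML.
    """
    root = ET.fromstring(ssml.strip())
    # Drop any "{namespace}" prefix so namespaced SSML is handled as well.
    present = {element.tag.rsplit("}", 1)[-1] for element in root.iter()}
    return [tag for tag in REQUIRED_TAGS if tag not in present]


sample = """
<speak>
    <voice language="en-US" name="en-US-BigBigNeural">
        <prosody rate="1.05">
            This is a sample sentence.
        </prosody>
    </voice>
</speak>
"""

print(missing_ssml_tags(sample))               # []
print(missing_ssml_tags("<speak>Hi</speak>"))  # ['voice', 'prosody']
```

This only checks that the tags are present; it does not validate attributes or nesting order.
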
Code block
language: py
class VoiceClientConfig(
    api_key: str = "",
    endpoint: str = "wss://localhost/voice/synthesis/v1"
)
Code block
language: py
# PCM format only.
# The defaults are what we currently need.
class AudioFormat(
    samples_per_second: int = 16_000,
    bits_per_sample: int = 16,
    channels: int = 1,
)
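
One possible way to realize AudioFormat is a frozen dataclass. The sketch below is an assumption about the implementation, not the actual one; bytes_per_second and pcm_duration_seconds are hypothetical helpers added only to show how the three fields combine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioFormat:
    """PCM format description; the defaults mirror the signature above."""

    samples_per_second: int = 16_000
    bits_per_sample: int = 16
    channels: int = 1

    def bytes_per_second(self) -> int:
        """Hypothetical helper, not part of the specified interface."""
        return self.samples_per_second * (self.bits_per_sample // 8) * self.channels


def pcm_duration_seconds(audio_data: bytes, fmt: AudioFormat) -> float:
    """Duration of header-less PCM data in seconds (hypothetical helper)."""
    return len(audio_data) / fmt.bytes_per_second()


fmt = AudioFormat()
print(fmt.bytes_per_second())                       # 32000
print(pcm_duration_seconds(b"\x00" * 64_000, fmt))  # 2.0
```
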
Code block
language: py
class VoiceSynthesisResult
(property) status: VoiceSynthesisResultStatus
(property) audio_data: bytes
(property) message: str

The audio_data should contain only the raw audio samples, without any audio file header (for example, no WAV/RIFF header).
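
Because audio_data carries no header, a caller who wants a playable file has to add the container. A minimal sketch with the standard wave module follows; it assumes the PCM defaults of AudioFormat, and pcm_to_wav_bytes is a hypothetical helper.

```python
import io
import wave


def pcm_to_wav_bytes(
    audio_data: bytes,
    samples_per_second: int = 16_000,
    bits_per_sample: int = 16,
    channels: int = 1,
) -> bytes:
    """Wrap header-less PCM samples in a WAV container (hypothetical helper)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(bits_per_sample // 8)  # sample width in bytes
        wav_file.setframerate(samples_per_second)
        wav_file.writeframes(audio_data)
    return buffer.getvalue()


raw = b"\x00\x00" * 16_000  # one second of 16-bit mono silence
wav_bytes = pcm_to_wav_bytes(raw)
print(wav_bytes[:4])  # b'RIFF'
```
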

Code block
language: py
class VoiceSynthesisResultStatus(Enum):
    SynthesizingCompleted = 1
    Canceled = 2
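
A caller would typically branch on the status before touching the audio. The sketch below is one way to do that under stated assumptions: the dataclass stands in for VoiceSynthesisResult with plain fields instead of the read-only properties, and audio_or_raise is a hypothetical helper.

```python
from dataclasses import dataclass
from enum import Enum


class VoiceSynthesisResultStatus(Enum):
    SynthesizingCompleted = 1
    Canceled = 2


@dataclass
class VoiceSynthesisResult:
    # Plain fields stand in for the read-only properties of the real class.
    status: VoiceSynthesisResultStatus
    audio_data: bytes = b""
    message: str = ""


def audio_or_raise(result: VoiceSynthesisResult) -> bytes:
    """Return the audio on success, otherwise raise with the message (hypothetical helper)."""
    if result.status is VoiceSynthesisResultStatus.SynthesizingCompleted:
        return result.audio_data
    raise RuntimeError(f"Synthesis was canceled: {result.message}")


ok = VoiceSynthesisResult(VoiceSynthesisResultStatus.SynthesizingCompleted, b"\x01\x02")
print(audio_or_raise(ok))  # b'\x01\x02'

canceled = VoiceSynthesisResult(VoiceSynthesisResultStatus.Canceled, message="connection closed")
try:
    audio_or_raise(canceled)
except RuntimeError as error:
    print(error)  # Synthesis was canceled: connection closed
```
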

...