...
A Python class VoiceSynthesis, similar to SpeechSynthesizer under azure.cognitiveservices.speech.
A Python class VoiceClientConfig can store the client configuration (API key and endpoint).
A Python class AudioFormat can specify the output audio format.
A Python class VoiceSynthesisResult can store the result status, the audio data, and any associated message.
A Python class VoiceSynthesisResultStatus is an enum class that specifies the result status.
```
class VoiceSynthesis(
    config: VoiceClientConfig,
    format: AudioFormat,
)

    (method) def speak_ssml(ssml: str) -> VoiceSynthesisResult
```
The SSML string (ref1, ref2) passed to the speak_ssml method should contain at least two tags: voice and prosody. For instance:
```xml
<speak>
  <voice language="en-US" name="en-US-BigBigNeural">
    <prosody rate="1.05">
      This is a sample sentence.
    </prosody>
  </voice>
</speak>
```
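To show how such an SSML payload might be assembled in code, here is a minimal sketch; the build_ssml helper, its default voice name, and its default rate are assumptions for illustration, not part of the specification.

```python
from xml.sax.saxutils import escape

# Hypothetical helper: the function name and its defaults are illustrative only.
def build_ssml(text: str,
               voice_name: str = "en-US-BigBigNeural",
               language: str = "en-US",
               rate: float = 1.05) -> str:
    """Wrap plain text in the required voice and prosody tags."""
    return (
        "<speak>"
        f'<voice language="{language}" name="{voice_name}">'
        f'<prosody rate="{rate}">{escape(text)}</prosody>'
        "</voice>"
        "</speak>"
    )

ssml = build_ssml("This is a sample sentence.")
```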
```
class VoiceClientConfig(
    api_key: str = "",
    endpoint: str = "wss://localhost/voice/synthesis/v1",
)
```
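A possible way to construct the config, assuming the API key is read from an environment variable (the variable name VOICE_API_KEY is an assumption):

```python
import os

# Hypothetical setup; VOICE_API_KEY is an assumed environment variable name.
config = VoiceClientConfig(
    api_key=os.environ.get("VOICE_API_KEY", ""),
    endpoint="wss://localhost/voice/synthesis/v1",
)
```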
```
# PCM format only.
# The defaults are what we currently need.
class AudioFormat(
    samples_per_second: int = 16_000,
    bits_per_sample: int = 16,
    channels: int = 1,
)
```
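Assuming the constructor parameters are exposed as attributes of the same name, the byte rate and the duration of a raw PCM buffer follow directly from these values; a small sketch:

```python
fmt = AudioFormat()  # defaults: 16 kHz, 16-bit, mono

# 16_000 samples/s * 2 bytes/sample * 1 channel = 32_000 bytes per second
bytes_per_second = fmt.samples_per_second * (fmt.bits_per_sample // 8) * fmt.channels

def pcm_duration_seconds(audio_data: bytes) -> float:
    """Duration of a headerless PCM buffer under this format (illustrative helper)."""
    return len(audio_data) / bytes_per_second
```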
```
class VoiceSynthesisResult
    (property) status: VoiceSynthesisResultStatus
    (property) audio_data: bytes
    (property) message: str
```
The audio_data should not contain any audio file header; it holds raw PCM samples only.
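Because the buffer is headerless, a caller that needs a playable file must add the header itself; a minimal sketch using the standard wave module, assuming the default AudioFormat values:

```python
import wave

# Hypothetical helper: writes headerless PCM bytes to a WAV file by adding the RIFF header.
def save_as_wav(audio_data: bytes, path: str,
                samples_per_second: int = 16_000,
                bits_per_sample: int = 16,
                channels: int = 1) -> None:
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(bits_per_sample // 8)
        wav_file.setframerate(samples_per_second)
        wav_file.writeframes(audio_data)
```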
```python
from enum import Enum

class VoiceSynthesisResultStatus(Enum):
    SynthesizingCompleted = 1
    Canceled = 2
```
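Putting the pieces together, a hedged end-to-end sketch of how the classes might be used; the keyword arguments, the ssml variable, and the save_as_wav helper come from the earlier sketches and are assumptions rather than a confirmed API:

```python
config = VoiceClientConfig(api_key="", endpoint="wss://localhost/voice/synthesis/v1")
synthesizer = VoiceSynthesis(config=config, format=AudioFormat())

result = synthesizer.speak_ssml(ssml)  # ssml built as in the earlier sketch

if result.status is VoiceSynthesisResultStatus.SynthesizingCompleted:
    save_as_wav(result.audio_data, "output.wav")  # helper from the sketch above
else:  # VoiceSynthesisResultStatus.Canceled
    print(f"Synthesis canceled: {result.message}")
```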
...