(简体中文|English)
This demo is an implementation of starting the streaming speech synthesis service and accessing the service. It can be achieved with a single command using paddlespeech_server
and paddlespeech_client
or a few lines of code in python.
For service interface definition, please check:
see installation.
It is recommended to use paddlepaddle 2.4rc or above.
You can choose one way from easy, meduim and hard to install paddlespeech.
If you install in easy mode, you need to prepare the yaml file by yourself, you can refer to the yaml file in the conf directory.
The configuration file can be found in conf/tts_online_application.yaml
.
protocol
indicates the network protocol used by the streaming TTS service. Currently, both http and websocket are supported.engine_list
indicates the speech engine that will be included in the service to be started, in the format of <speech task>_<engine type>
.
tts
.online
indicates an engine that uses python for dynamic graph inference; online-onnx
indicates an engine that uses onnxruntime for inference. The inference speed of online-onnx is faster.am_block
indicates the number of valid frames in the chunk, and am_pad
indicates the number of frames added before and after am_block in a chunk. The existence of am_pad is used to eliminate errors caused by streaming inference and avoid the influence of streaming inference on the quality of synthesized audio.
voc_block
indicates the number of valid frames in the chunk, and voc_pad
indicates the number of frames added before and after the voc_block in a chunk. The existence of voc_pad is used to eliminate errors caused by streaming inference and avoid the influence of streaming inference on the quality of synthesized audio.
host
address in the configuration file with the local IP address.Command Line (Recommended)
Start the service (the configuration file uses http by default):
paddlespeech_server start --config_file ./conf/tts_online_application.yaml
Usage:
paddlespeech_server start --help
Arguments:
config_file
: yaml file of the app, defalut: ./conf/tts_online_application.yamllog_file
: log file. Default: ./log/paddlespeech.logOutput:
[2022-04-24 20:05:27,887] [ INFO] - The first response time of the 0 warm up: 1.0123658180236816 s
[2022-04-24 20:05:28,038] [ INFO] - The first response time of the 1 warm up: 0.15108466148376465 s
[2022-04-24 20:05:28,191] [ INFO] - The first response time of the 2 warm up: 0.15317344665527344 s
[2022-04-24 20:05:28,192] [ INFO] - **********************************************************************
INFO: Started server process [14638]
[2022-04-24 20:05:28] [INFO] [server.py:75] Started server process [14638]
INFO: Waiting for application startup.
[2022-04-24 20:05:28] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-24 20:05:28] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 20:05:28] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
Python API
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
server_executor = ServerExecutor()
server_executor(
config_file="./conf/tts_online_application.yaml",
log_file="./log/paddlespeech.log")
Output:
[2022-04-24 21:00:16,934] [ INFO] - The first response time of the 0 warm up: 1.268730878829956 s
[2022-04-24 21:00:17,046] [ INFO] - The first response time of the 1 warm up: 0.11168622970581055 s
[2022-04-24 21:00:17,151] [ INFO] - The first response time of the 2 warm up: 0.10413002967834473 s
[2022-04-24 21:00:17,151] [ INFO] - **********************************************************************
INFO: Started server process [320]
[2022-04-24 21:00:17] [INFO] [server.py:75] Started server process [320]
INFO: Waiting for application startup.
[2022-04-24 21:00:17] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-24 21:00:17] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-24 21:00:17] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
Command Line (Recommended)
Access http streaming TTS service:
If 127.0.0.1
is not accessible, you need to use the actual service IP address.
paddlespeech_client tts_online --server_ip 127.0.0.1 --port 8092 --protocol http --input "您好,欢迎使用百度飞桨语音合成服务。" --output output.wav
Usage:
paddlespeech_client tts_online --help
Arguments:
server_ip
: erver ip. Default: 127.0.0.1port
: server port. Default: 8092protocol
: Service protocol, choices: [http, websocket], default: http.input
: (required): Input text to generate.spk_id
: Speaker id for multi-speaker text to speech. Default: 0output
: Client output wave filepath. Default: None, which means not to save the audio to the local.play
: Whether to play audio, play while synthesizing, default value: False, which means not playing. Playing audio needs to rely on the pyaudio library.spk_id
does not take effect. Streaming TTS does not support changing sample rate, variable speed and volume.Output:
[2022-04-24 21:08:18,559] [ INFO] - tts http client start
[2022-04-24 21:08:21,702] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-24 21:08:21,703] [ INFO] - 首包响应:0.18863153457641602 s
[2022-04-24 21:08:21,704] [ INFO] - 尾包响应:3.1427218914031982 s
[2022-04-24 21:08:21,704] [ INFO] - 音频时长:3.825 s
[2022-04-24 21:08:21,704] [ INFO] - RTF: 0.8216266382753459
[2022-04-24 21:08:21,739] [ INFO] - 音频保存至:output.wav
Python API
from paddlespeech.server.bin.paddlespeech_client import TTSOnlineClientExecutor
import json
executor = TTSOnlineClientExecutor()
executor(
input="您好,欢迎使用百度飞桨语音合成服务。",
server_ip="127.0.0.1",
port=8092,
protocol="http",
spk_id=0,
output="./output.wav",
play=False)
Output:
[2022-04-24 21:11:13,798] [ INFO] - tts http client start
[2022-04-24 21:11:16,800] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-24 21:11:16,801] [ INFO] - 首包响应:0.18234872817993164 s
[2022-04-24 21:11:16,801] [ INFO] - 尾包响应:3.0013909339904785 s
[2022-04-24 21:11:16,802] [ INFO] - 音频时长:3.825 s
[2022-04-24 21:11:16,802] [ INFO] - RTF: 0.7846773683635238
[2022-04-24 21:11:16,837] [ INFO] - 音频保存至:./output.wav
Command Line (Recommended)
First modify the configuration file conf/tts_online_application.yaml
, set protocol
to websocket
.
Start the service:
paddlespeech_server start --config_file ./conf/tts_online_application.yaml
Usage:
paddlespeech_server start --help
Arguments:
config_file
: yaml file of the app, defalut: ./conf/tts_online_application.yamllog_file
: log file. Default: ./log/paddlespeech.logOutput:
[2022-04-27 10:18:09,107] [ INFO] - The first response time of the 0 warm up: 1.1551103591918945 s
[2022-04-27 10:18:09,219] [ INFO] - The first response time of the 1 warm up: 0.11204338073730469 s
[2022-04-27 10:18:09,324] [ INFO] - The first response time of the 2 warm up: 0.1051797866821289 s
[2022-04-27 10:18:09,325] [ INFO] - **********************************************************************
INFO: Started server process [17600]
[2022-04-27 10:18:09] [INFO] [server.py:75] Started server process [17600]
INFO: Waiting for application startup.
[2022-04-27 10:18:09] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:18:09] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:18:09] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
Python API
from paddlespeech.server.bin.paddlespeech_server import ServerExecutor
server_executor = ServerExecutor()
server_executor(
config_file="./conf/tts_online_application.yaml",
log_file="./log/paddlespeech.log")
Output:
[2022-04-27 10:20:16,660] [ INFO] - The first response time of the 0 warm up: 1.0945196151733398 s
[2022-04-27 10:20:16,773] [ INFO] - The first response time of the 1 warm up: 0.11222052574157715 s
[2022-04-27 10:20:16,878] [ INFO] - The first response time of the 2 warm up: 0.10494542121887207 s
[2022-04-27 10:20:16,878] [ INFO] - **********************************************************************
INFO: Started server process [23466]
[2022-04-27 10:20:16] [INFO] [server.py:75] Started server process [23466]
INFO: Waiting for application startup.
[2022-04-27 10:20:16] [INFO] [on.py:45] Waiting for application startup.
INFO: Application startup complete.
[2022-04-27 10:20:16] [INFO] [on.py:59] Application startup complete.
INFO: Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
[2022-04-27 10:20:16] [INFO] [server.py:211] Uvicorn running on http://0.0.0.0:8092 (Press CTRL+C to quit)
Command Line (Recommended)
Access websocket streaming TTS service:
If 127.0.0.1
is not accessible, you need to use the actual service IP address.
paddlespeech_client tts_online --server_ip 127.0.0.1 --port 8092 --protocol websocket --input "您好,欢迎使用百度飞桨语音合成服务。" --output output.wav
Usage:
paddlespeech_client tts_online --help
Arguments:
server_ip
: erver ip. Default: 127.0.0.1port
: server port. Default: 8092protocol
: Service protocol, choices: [http, websocket], default: http.input
: (required): Input text to generate.spk_id
: Speaker id for multi-speaker text to speech. Default: 0output
: Client output wave filepath. Default: None, which means not to save the audio to the local.play
: Whether to play audio, play while synthesizing, default value: False, which means not playing. Playing audio needs to rely on the pyaudio library.spk_id
does not take effect. Streaming TTS does not support changing sample rate, variable speed and volume.Output:
[2022-04-27 10:21:04,262] [ INFO] - tts websocket client start
[2022-04-27 10:21:04,496] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:21:04,496] [ INFO] - 首包响应:0.2124948501586914 s
[2022-04-27 10:21:07,483] [ INFO] - 尾包响应:3.199106454849243 s
[2022-04-27 10:21:07,484] [ INFO] - 音频时长:3.825 s
[2022-04-27 10:21:07,484] [ INFO] - RTF: 0.8363677006141812
[2022-04-27 10:21:07,516] [ INFO] - 音频保存至:output.wav
Python API
from paddlespeech.server.bin.paddlespeech_client import TTSOnlineClientExecutor
import json
executor = TTSOnlineClientExecutor()
executor(
input="您好,欢迎使用百度飞桨语音合成服务。",
server_ip="127.0.0.1",
port=8092,
protocol="websocket",
spk_id=0,
output="./output.wav",
play=False)
Output:
[2022-04-27 10:22:48,852] [ INFO] - tts websocket client start
[2022-04-27 10:22:49,080] [ INFO] - 句子:您好,欢迎使用百度飞桨语音合成服务。
[2022-04-27 10:22:49,080] [ INFO] - 首包响应:0.21017956733703613 s
[2022-04-27 10:22:52,100] [ INFO] - 尾包响应:3.2304444313049316 s
[2022-04-27 10:22:52,101] [ INFO] - 音频时长:3.825 s
[2022-04-27 10:22:52,101] [ INFO] - RTF: 0.8445606356352762
[2022-04-27 10:22:52,134] [ INFO] - 音频保存至:./output.wav
Вы можете оставить комментарий после Вход в систему
Неприемлемый контент может быть отображен здесь и не будет показан на странице. Вы можете проверить и изменить его с помощью соответствующей функции редактирования.
Если вы подтверждаете, что содержание не содержит непристойной лексики/перенаправления на рекламу/насилия/вульгарной порнографии/нарушений/пиратства/ложного/незначительного или незаконного контента, связанного с национальными законами и предписаниями, вы можете нажать «Отправить» для подачи апелляции, и мы обработаем ее как можно скорее.
Опубликовать ( 0 )