
✨feat: WebAPI & Docker #40

Open
breakstring wants to merge 5 commits into SparkAudio:main from breakstring:main

Conversation

@breakstring

1. Add Spark-TTS Web API with FastAPI implementation
2. Add Docker support for Spark-TTS deployment
- Implement comprehensive FastAPI-based TTS API service
- Add API endpoints for text-to-speech with voice cloning and creation
- Create example client script for API interaction
- Include environment configuration and startup script
- Add README with detailed API usage and configuration instructions
- Configure .env.example for flexible service setup
- Implement file cleanup and output management
- Support multiple audio input and output methods
- Create Dockerfile for building Spark-TTS images with flexible model inclusion
- Add docker_builder.sh script for easy image building
- Implement docker-compose.yml with multiple service configurations
- Add .dockerignore to optimize Docker build context
- Update README and run_api.sh to support Docker deployment
- Configure environment variables and service types for containerized deployment
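For readers following along, a docker-compose.yml of the kind described above might look roughly like the sketch below. This is an illustrative guess based on this thread (the `spark-tts:latest-full` tag, `SERVICE_TYPE` variable, and port 7860 appear later in the conversation); the service names, the `api` service's port, and the exact structure are assumptions, not the PR's actual file.

```yaml
# Hypothetical sketch of a multi-service compose file; names and ports
# other than 7860/SERVICE_TYPE=webui are assumptions.
services:
  webui:
    image: spark-tts:latest-full
    environment:
      - SERVICE_TYPE=webui
    ports:
      - "7860:7860"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
  api:
    image: spark-tts:latest-full
    environment:
      - SERVICE_TYPE=api
    ports:
      - "8000:8000"
```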
@breakstring breakstring mentioned this pull request Mar 8, 2025
@D34DC3N73R

Tested this out but I get the following error in startup logs:

ERROR:api.main:Model initialization failed: requires the protobuf library but it was not found in your environment. Checkout the instructions on the installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones that match your environment. Please note that you may need to restart your runtime after installation. 

Adding protobuf==4.21.12 to requirements.txt and building again solves the issue.
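One lightweight way to guard against this class of missing-dependency problem is to check that the pin is actually present in requirements.txt before rebuilding. A minimal sketch (the file contents below are illustrative, mirroring the pin suggested above):

```python
# Minimal sketch: verify a protobuf pin exists in a requirements file
# before rebuilding the image. The contents here are illustrative.
requirements = """\
transformers==4.46.2
fastapi==0.115.11
protobuf==4.21.12
"""

# Build a {package: version} map from the pinned lines.
pins = dict(
    line.split("==", 1) for line in requirements.splitlines() if "==" in line
)
assert pins.get("protobuf") == "4.21.12", "protobuf pin missing"
print(pins["protobuf"])  # prints 4.21.12
```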

@breakstring
Author


It's very strange: I checked my own environment and there is no protobuf package installed, yet there is no such error at runtime (neither in the docker logs nor in the local run logs).

```
(sparktts) azureuser@t4-westus2:~/Spark-TTS$ pip list
Package                  Version
------------------------ ------------
accelerate               0.26.0
aiofiles                 23.2.1
annotated-types          0.7.0
antlr4-python3-runtime   4.9.3
anyio                    4.8.0
audioread                3.0.1
certifi                  2025.1.31
cffi                     1.17.1
charset-normalizer       3.4.1
click                    8.1.8
decorator                5.2.1
einops                   0.8.1
einx                     0.3.0
fastapi                  0.115.11
ffmpy                    0.5.0
filelock                 3.17.0
frozendict               2.4.6
fsspec                   2025.2.0
gradio                   5.18.0
gradio_client            1.7.2
h11                      0.14.0
httpcore                 1.0.7
httpx                    0.28.1
huggingface-hub          0.29.2
idna                     3.10
Jinja2                   3.1.6
joblib                   1.4.2
lazy_loader              0.4
librosa                  0.10.2.post1
llvmlite                 0.44.0
markdown-it-py           3.0.0
MarkupSafe               2.1.5
mdurl                    0.1.2
mpmath                   1.3.0
msgpack                  1.1.0
networkx                 3.4.2
numba                    0.61.0
numpy                    2.1.3
nvidia-cublas-cu12       12.4.5.8
nvidia-cuda-cupti-cu12   12.4.127
nvidia-cuda-nvrtc-cu12   12.4.127
nvidia-cuda-runtime-cu12 12.4.127
nvidia-cudnn-cu12        9.1.0.70
nvidia-cufft-cu12        11.2.1.3
nvidia-curand-cu12       10.3.5.147
nvidia-cusolver-cu12     11.6.1.9
nvidia-cusparse-cu12     12.3.1.170
nvidia-nccl-cu12         2.21.5
nvidia-nvjitlink-cu12    12.4.127
nvidia-nvtx-cu12         12.4.127
omegaconf                2.3.0
orjson                   3.10.15
packaging                24.2
pandas                   2.2.3
pillow                   11.1.0
pip                      25.0
platformdirs             4.3.6
pooch                    1.8.2
psutil                   7.0.0
pycparser                2.22
pydantic                 2.10.6
pydantic_core            2.27.2
pydub                    0.25.1
Pygments                 2.19.1
python-dateutil          2.9.0.post0
python-dotenv            1.0.1
python-multipart         0.0.20
pytz                     2025.1
PyYAML                   6.0.2
regex                    2024.11.6
requests                 2.32.3
rich                     13.9.4
ruff                     0.9.9
safehttpx                0.1.6
safetensors              0.5.2
scikit-learn             1.6.1
scipy                    1.15.2
semantic-version         2.10.0
setuptools               75.8.0
shellingham              1.5.4
six                      1.17.0
sniffio                  1.3.1
soundfile                0.12.1
soxr                     0.5.0.post1
starlette                0.46.0
sympy                    1.13.1
threadpoolctl            3.5.0
tokenizers               0.20.3
tomlkit                  0.13.2
torch                    2.5.1
torchaudio               2.5.1
tqdm                     4.66.5
transformers             4.46.2
triton                   3.1.0
typer                    0.15.2
typing_extensions        4.12.2
tzdata                   2025.1
urllib3                  2.3.0
uvicorn                  0.34.0
websockets               15.0.1
wheel                    0.45.1
```

At the same time, I also used some other methods to check for the protobuf package, and it does not exist either.
[screenshot]
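For anyone who wants to run the same kind of check, here is one portable way to test whether protobuf is importable. This is a sketch of roughly the kind of availability check transformers performs before raising the ImportError quoted above, not the library's exact code:

```python
# Check whether google.protobuf can be imported in this environment,
# roughly what transformers does before raising its protobuf ImportError.
import importlib.util


def protobuf_available() -> bool:
    """Return True if google.protobuf is importable here."""
    try:
        return importlib.util.find_spec("google.protobuf") is not None
    except ModuleNotFoundError:
        # The parent "google" namespace package is missing entirely.
        return False


print("protobuf installed:", protobuf_available())
```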

@D34DC3N73R

This is the full error

```
:~/test-sparktts$ docker run -p 7860:7860 --name test-sparktts --gpus all -e SERVICE_TYPE=webui spark-tts:latest-full
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2447, in _from_pretrained
    tokenizer = cls(*init_inputs, **init_kwargs)
  File "/usr/local/lib/python3.12/site-packages/transformers/models/qwen2/tokenization_qwen2_fast.py", line 120, in __init__
    super().__init__(
  File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_fast.py", line 116, in __init__
    fast_tokenizer = TokenizerFast.from_file(fast_tokenizer_file)
Exception: expected value at line 1 column 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/webui.py", line 260, in <module>
    demo = build_ui(
  File "/app/webui.py", line 97, in build_ui
    model = initialize_model(model_dir, device=device)
  File "/app/webui.py", line 47, in initialize_model
    model = SparkTTS(model_dir, device)
  File "/app/cli/SparkTTS.py", line 44, in __init__
    self._initialize_inference()
  File "/app/cli/SparkTTS.py", line 48, in _initialize_inference
    self.tokenizer = AutoTokenizer.from_pretrained(f"{self.model_dir}/LLM")
  File "/usr/local/lib/python3.12/site-packages/transformers/models/auto/tokenization_auto.py", line 920, in from_pretrained
    return tokenizer_class.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2213, in from_pretrained
    return cls._from_pretrained(
  File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 2448, in _from_pretrained
    except import_protobuf_decode_error():
  File "/usr/local/lib/python3.12/site-packages/transformers/tokenization_utils_base.py", line 87, in import_protobuf_decode_error
    raise ImportError(PROTOBUF_IMPORT_ERROR.format(error_message))
ImportError: requires the protobuf library but it was not found in your environment. Checkout the instructions on the installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones that match your environment. Please note that you may need to restart your runtime after installation.
```

Adding this allows me to run the container after a rebuild

```
$ cat requirements.txt
einops==0.8.1
einx==0.3.0
numpy==2.2.3
omegaconf==2.3.0
packaging==24.2
safetensors==0.5.2
soundfile==0.12.1
soxr==0.5.0.post1
torch==2.5.1
torchaudio==2.5.1
tqdm==4.66.5
transformers==4.46.2
gradio==5.18.0
fastapi==0.115.11
uvicorn==0.34.0
python-dotenv==1.0.1
protobuf==4.21.12
```

From within the container

```
root@d0dad5f76940:/app# pip show protobuf
Name: protobuf
Version: 4.21.12
Summary:
Home-page: https://developers.google.com/protocol-buffers/
Author: protobuf@googlegroups.com
Author-email: protobuf@googlegroups.com
License: 3-Clause BSD License
Location: /usr/local/lib/python3.12/site-packages
Requires:
Required-by:
```
@breakstring
Author


Oops, that's the webui part.


I'm very sorry. I packaged the webui part into Docker but didn't test that part of the code, because the webui is existing code and I assumed it would work fine. I will take some time today to verify it.
Thank you very much for the clarification.

@breakstring
Author

[screenshot]
I just found a clean VM, set up the environment, completely rebuilt the image, and ran your command without encountering the protobuf error you mentioned. The warning on the first line is one I had seen before.

After starting, the WebUI also opens correctly. That said, there are some strange issues in the WebUI that mean I can sometimes generate audio but most of the time cannot, which is exactly why I wrapped this FastAPI-based WebAPI interface. Gradio is too difficult to work with...

@D34DC3N73R

You are correct on that. I completely wiped my build cache and downloaded the model fresh from HF and did not receive the error on startup. Sorry for the false report!

@phong-phuong

While your intent was to have separate images (one that includes the pretrained models and a lite one that doesn't), the commands here copy and delete files in separate layers, which only adds to the image size.

As a result, the lite image actually contains the pretrained models in its earlier layers twice: once in the /tmp folder and a second time in the final destination.

For reference, the pretrained models are around 3.67 GB.
Personally, I would avoid including the models in the image entirely and let the user mount them, to avoid this complexity and to avoid redundant copies of the models in both the Docker image store and on disk.

Lite image is 17 GB:

[screenshot]

Lite image should be 10 GB:

[screenshot]

```dockerfile
# Copy context
COPY . /tmp/context/                  # 1st copy (+3.67GB)

# Check if model directory exists
RUN if [ -d "/tmp/context/pretrained_models" ]; then \
        echo "Found pretrained_models directory"; \
    else \
        echo "pretrained_models directory not found"; \
    fi

# Decide whether to copy model files based on INCLUDE_MODELS parameter
RUN if [ "${INCLUDE_MODELS}" = "true" ]; then \
        echo "Including models in the image"; \
        if [ -d "/tmp/context/pretrained_models" ]; then \
            # 2nd copy (+3.67GB)
            cp -r /tmp/context/pretrained_models/* /app/pretrained_models/ || echo "No model files to copy"; \
        else \
            echo "Warning: pretrained_models directory not found in build context"; \
        fi; \
    else \
        echo "Models will need to be mounted at runtime"; \
    fi

# Clean up temporary directory
# Comment: this runs in a separate layer, so it does not reduce the image size
RUN rm -rf /tmp/context
```
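One way to fix the layering problem described above, sketched under the assumption that BuildKit is available: bind-mount the build context inside the `RUN` instruction instead of `COPY`ing it into a layer. The mount leaves no layer behind, so the models land in the image at most once (and not at all for the lite build). The `COPY api/` line and paths are illustrative, not the PR's actual Dockerfile:

```dockerfile
# syntax=docker/dockerfile:1
ARG INCLUDE_MODELS=false
FROM python:3.12-slim
ARG INCLUDE_MODELS
WORKDIR /app

# Copy only the application code into a layer; never COPY the whole context.
# (Directory names here are illustrative.)
COPY api/ /app/api/

# Bind-mount the build context for the duration of this RUN: nothing from
# /tmp/context persists as a layer, so the models are written exactly once,
# or skipped entirely for the lite image.
RUN --mount=type=bind,source=.,target=/tmp/context \
    mkdir -p /app/pretrained_models && \
    if [ "${INCLUDE_MODELS}" = "true" ]; then \
        cp -r /tmp/context/pretrained_models/. /app/pretrained_models/; \
    else \
        echo "Models will be mounted at runtime"; \
    fi
```

Alternatively, as suggested above, omit the models from the image entirely and mount them at runtime (`-v ./pretrained_models:/app/pretrained_models`), which also keeps a single copy on disk shared by all containers.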
@breakstring
Author


Thanks for your feedback. I'm traveling these days and will check it next week once I have time. @phong-phuong
