
comfyui-tts's Introduction

Text to Speech (TTS) for ComfyUI

Description

What This Is

ComfyUI-TTS is a tool that converts strings within ComfyUI to audio so you can hear what's written. My objective was to use it with LLM AI models, but I wanted to leave the door open for many other uses.

Where This Fits

  • TTS is "text to speech", which converts the written word to sound you can hear. It does not do the reverse, converting audio to text.
  • Piper-tts was the first TTS program I chose to implement because it's meant to be easy to integrate. Its feature set is less complete, but it is simple and it works.
  • ONNX models are used by Piper-tts, along with a JSON config file that should have the same name as the ONNX file with .json appended (for example, voice.onnx and voice.onnx.json). I noticed some of the downloadables are not named this way, and it's up to you to fix that (sorry).
  • ComfyUI lets us use Stable Diffusion through a flow graph layout.
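
The naming rule for the ONNX and JSON files can be checked with a few lines of Python. This is only a minimal sketch, not part of ComfyUI-TTS: the helper names `expected_config_path` and `find_models_missing_config` are made up for illustration, and it assumes the convention described above, where the config is the model file name with `.json` appended.

```python
from pathlib import Path


def expected_config_path(onnx_path: Path) -> Path:
    """Return the config path for a model: the full file name with .json appended."""
    return onnx_path.with_name(onnx_path.name + ".json")


def find_models_missing_config(models_dir: Path) -> list[Path]:
    """List .onnx files in models_dir that have no matching .onnx.json file."""
    return sorted(
        p for p in models_dir.glob("*.onnx")
        if not expected_config_path(p).is_file()
    )
```

Any model this reports probably needs its JSON file renamed, for example from `voice.json` to `voice.onnx.json`.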

Why I Made This

  • I wanted to integrate text generation and image generation AI in one interface and see what other people can come up with to use them. TTS is just one aspect of being able to use text generation.

Features:

  • Currently lets you load ONNX models in a fashion consistent with other ComfyUI models and use them to generate audio output from text.
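
For readers curious how a model dropdown like this is usually wired up in a ComfyUI custom node, here is a rough sketch. It is not the code used by ComfyUI-TTS: the class name `ExampleLoadTTSModel`, the helper `list_onnx_models`, and the `MODELS_DIR` constant are hypothetical, and the node returns only a path string rather than a loaded voice.

```python
from pathlib import Path

# Assumption: the models folder named in the installation steps of this README.
MODELS_DIR = Path("ComfyUI/custom_nodes/ComfyUI-TTS/models")


def list_onnx_models(models_dir: Path) -> list[str]:
    """Return the sorted file names of all .onnx models in models_dir."""
    if not models_dir.is_dir():
        return []
    return sorted(p.name for p in models_dir.glob("*.onnx"))


class ExampleLoadTTSModel:
    """Illustrative ComfyUI node: pick a Piper model from a dropdown."""

    CATEGORY = "TTS"
    RETURN_TYPES = ("STRING",)
    RETURN_NAMES = ("model_path",)
    FUNCTION = "load"

    @classmethod
    def INPUT_TYPES(cls):
        # A list as the first tuple element is rendered as a dropdown by ComfyUI.
        return {"required": {"model_name": (list_onnx_models(MODELS_DIR),)}}

    def load(self, model_name):
        # Hypothetical behaviour: return only the path; a real node would load the voice.
        return (str(MODELS_DIR / model_name),)


# ComfyUI discovers custom nodes through this module-level mapping.
NODE_CLASS_MAPPINGS = {"ExampleLoadTTSModel": ExampleLoadTTSModel}
```

Listing the files inside `INPUT_TYPES` is what makes the dropdown reflect whatever is in the models folder when ComfyUI starts.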

Upcoming Features:

  • Intend to expand the Piper-tts function options
  • Then going to start working on implementing basic XTTSv2

Installation

What you need first:

Highly Recommended

Steps if using Comfy Manager:

  1. Visit your Install Custom Nodes page, and search for ComfyUI-TTS.
  2. Hit Install and restart when prompted.
  3. Copy your ONNX and JSON files into ./ComfyUI/custom_nodes/ComfyUI-TTS/models/*
  4. Hit Ctrl+F5 to hard reload the browser window.
  5. The nodes should be in the TTS menu.

Steps if installing manually:

  1. Clone this repo into your custom_nodes folder.
  2. Install piper-tts with pip in the Python environment that ComfyUI uses.
  3. Copy your ONNX and JSON files into ./ComfyUI/custom_nodes/ComfyUI-TTS/models/*
  4. Hit Ctrl+F5 to hard reload the browser window.
  5. The nodes should be in the TTS menu.

If you can't install:

Either post an issue on GitHub, or ask on Element in Comfy's channel.

Usage

Instructions:

  1. Download ONNX and JSON files for the models, which can be found here. You will need at least one model. Different models produce different voices.

  2. Ensure the JSON file is named identically to the ONNX, but with .json appended.

  3. Place models in ComfyUI/custom_nodes/ComfyUI-TTS/models. They can be renamed if you want.

  4. Fire up/Restart ComfyUI and allow it to finish restarting.

  5. Hit Ctrl+F5 to ensure the browser is refreshed.

  6. Check your ComfyUI available nodes and find the TTS menu.

  7. Add the Load TTS Model node to load a model.

  8. Use the Speak Text node to speak your text.

If you get errors:

Either post an issue on GitHub, or ask on Element in Comfy's channel.

Examples

image

For Possible Contributors

Known Issues

  • This is a very recent release. Expect only basic functionality to work.

Conclusion

We appreciate your interest in TTS for ComfyUI. Feel free to explore and provide feedback or report any issues you encounter. Your contributions and suggestions are valuable to the project.

comfyui-tts's People

Contributors

daniel-lewis-ab


comfyui-tts's Issues

Opportunity to make it easier to use

Hi,

I was just trying to implement Piper-TTS for ComfyUI, and I noticed I can't get the sample rate from the PiperVoice class as a property, which would be useful for passing to sounddevice.play().

I'm sure there are other properties that would similarly be useful to someone if they were exposed.
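
One possible workaround in the meantime is to read the sample rate from the voice's JSON config file. The sketch below assumes the config stores it under `audio.sample_rate`, as Piper voice configs normally do; the function name `read_sample_rate` is made up for illustration.

```python
import json
from pathlib import Path


def read_sample_rate(config_path: Path) -> int:
    """Read audio.sample_rate from a Piper voice config JSON file."""
    with config_path.open(encoding="utf-8") as f:
        config = json.load(f)
    return int(config["audio"]["sample_rate"])
```

The returned value could then be passed as the `samplerate` argument of `sounddevice.play()`.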
