💀🔊 StableAudioWebUI 💀🔊

Hugging Face

A Lightweight Gradio Web interface for running Stable Audio Open 1.0

By @drbaph



image_2024-06-10_21-03-05



image_2024-06-10_21-02-10


Example

Ethereal_ambient_pads_with_pulsing_futuristic_bassline_and_hypnotic_shamanic_percussion.mp4

⚠ Disclaimer ⚠

I am not responsible for any content generated using this repository. By using this repository, you acknowledge that you are bound by the Stability AI license agreement and will only use this model for research or personal purposes. No commercial usage is allowed!


Recommended Settings

Prompt: Any
Sampler: dpmpp-3m-sde
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500
Duration: Max 47s
Seed: Any
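
As a rough sketch, the settings above can be collected into keyword arguments for a generation call. The keyword names below mirror the published Stable Audio Open example for `generate_diffusion_cond` in stable-audio-tools; whether gradio_app.py uses the same names internally is an assumption, and `build_generation_args` is a hypothetical helper, not part of this repository.

```python
import random

MAX_DURATION_S = 47  # upper limit from the recommended settings


def build_generation_args(prompt, duration_s=30, seed=None):
    """Collect the recommended settings into a dict of keyword arguments."""
    if not 0 < duration_s <= MAX_DURATION_S:
        raise ValueError(f"duration must be between 0 and {MAX_DURATION_S} seconds")
    if seed is None:
        # "Seed: Any": draw a random 32-bit seed when none is supplied
        seed = random.randint(0, 2**32 - 1)
    return {
        "conditioning": [{
            "prompt": prompt,
            "seconds_start": 0,
            "seconds_total": duration_s,
        }],
        "sampler_type": "dpmpp-3m-sde",
        "cfg_scale": 7,
        "sigma_min": 0.3,
        "sigma_max": 500,
        "seed": seed,
    }


args = build_generation_args("a dog barking", duration_s=10, seed=42)
print(args["sampler_type"], args["cfg_scale"], args["seed"])
```

The dict is only assembled here; passing it to a model still requires loading Stable Audio Open 1.0 as described in the setup steps.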

> Generated files are saved in the directory Output/YYYY-MM-DD/

> and are named after the prompt, e.g. 'your_prompt.mp3'

🚀 Updates (0.4)

[10/06/2024]


✅ Added (Random_Seed) Checkbox


✅ Implemented Enhanced Filename Handling and Security Measures

  • Filename Length Control: Truncated long prompts to a maximum of 50 characters for filenames, preventing excessively long filenames.
  • Enhanced Sanitization: Applied strict rules to replace non-alphanumeric characters with underscores (_), ensuring valid and safe filenames.
  • Unique Filename Generation: Introduced a system to append numeric suffixes to filenames to avoid overwriting existing files, ensuring each file is uniquely named.
  • Safe Directory Handling: Utilized secure methods for path creation and directory handling to avoid risks from user input influencing file paths.
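
The filename rules listed above could look roughly like the following. This is a minimal sketch, not the code from gradio_app.py: the function names, the `untitled` fallback for empty prompts, and the `_1`, `_2` suffix format are assumptions made for illustration.

```python
import re
from datetime import date
from pathlib import Path

MAX_NAME_LEN = 50  # truncation limit described in the update notes


def sanitize_prompt(prompt, max_len=MAX_NAME_LEN):
    """Replace every non-alphanumeric character with '_' and truncate."""
    cleaned = re.sub(r"[^A-Za-z0-9]", "_", prompt)
    # Fallback name for an empty prompt (an assumption, not from the README)
    return cleaned[:max_len] or "untitled"


def unique_output_path(prompt, root="Output", ext=".mp3", today=None):
    """Return a path under root/YYYY-MM-DD/ that does not exist yet."""
    day = (today or date.today()).isoformat()
    folder = Path(root) / day
    folder.mkdir(parents=True, exist_ok=True)
    stem = sanitize_prompt(prompt)
    candidate = folder / f"{stem}{ext}"
    counter = 1
    # Append a numeric suffix until the name is free
    while candidate.exists():
        candidate = folder / f"{stem}_{counter}{ext}"
        counter += 1
    return candidate


print(sanitize_prompt("a dog barking!"))
```

Because the sanitized name contains only letters, digits, and underscores, a prompt such as `../etc/passwd` cannot introduce path separators into the output path.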
Earlier updates

[08/06/2024]


✅ Added One-Click-Installer.bat for Windows NVIDIA / CPU Builds

✅ Optimised Code for efficiency

✅ Simplified UI


[06/06/2024]


✅ Updated UI elements to include Advanced Parameters dropdown

( CFG Scale, Sigma_min, Sigma_max )

✅ Added Use Half precision checkbox for Low VRAM inference

( Float 16 )

✅ Added choice for all Sampler types

( dpmpp-3m-sde, dpmpp-2m-sde, k-heun, k-lms, k-dpmpp-2s-ancestral, k-dpm-2, k-dpm-fast )

✅ Added link to the Repo


๐Ÿ“ Note: For Windows builds with Nvidia 30xx + or Float32 Capable CPU you can use the One-Click-Installer.bat to simplify the process, granted you have logged in to huggingface-cli and auth'd your token prior to running the batch script: Step 3 (the huggingface-cli is used for obtaining the model file)

Step 1: Start by cloning the repo:

git clone https://github.com/Saganaki22/StableAudioWebUI.git

Step 2: Set up the environment with the commands below (tested with 24GB of Nvidia VRAM, but it should also work with 12GB, since the WebUI includes a Load Half precision (Float16) option):

cd StableAudioWebUI
python -m venv myenv
myenv\Scripts\activate
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
pip install -r requirements.txt

(Note: if you have an older Nvidia GPU, you may need to use CUDA 11.8 instead)

pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

Step 3 (optional): If you do not have a Hugging Face account or have not used huggingface-cli before, create an account and then authenticate it with an access token (create a token at https://huggingface.co/settings/tokens):

huggingface-cli login

(Paste your token and follow the instructions. The token will not be displayed when pasted.)

If you want to run it on CPU, skip the 'pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121' command in Step 2 and the command after it, and instead run:

pip install -r requirements1.txt
pip install -r requirements.txt
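
After installing, it can be useful to confirm which device PyTorch will use before launching the app. The snippet below is a small, optional check and not part of this repository; `describe_device` is a hypothetical helper, and it only relies on `torch.cuda.is_available()`, `torch.cuda.get_device_name()`, and `torch.version.cuda`.

```python
def describe_device():
    """Report which device PyTorch would use, even if torch is missing."""
    try:
        import torch
    except ImportError:
        return "torch is not installed in this environment"
    if torch.cuda.is_available():
        # Name of the first visible GPU and the CUDA version torch was built with
        name = torch.cuda.get_device_name(0)
        return f"cuda ({name}, CUDA {torch.version.cuda})"
    return "cpu"


print(describe_device())
```

If this prints `cpu` on a machine with an Nvidia GPU, the CUDA build of torch from Step 2 was probably not installed into the active environment.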

Step 4: Run

python gradio_app.py

โญ Bonus

If you are using Windows and followed my setup instructions you could create a batch script to activate the enviroment and run the script all in one, what you need to do is:

Create a new text file in the same folder as gradio_app.py & paste this in the text file

@echo off
title StableAudioWebUI
call myenv\Scripts\activate
python gradio_app.py
pause

Then save the file as run.bat.

Screenshots (older build)

(All with random seeds)

Prompt: a dog barking
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image

a_dog_barking.mp4


Prompt: people clapping
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image

people_clapping.mp4


Prompt: didgeridoo
CFG: 7
Sigma_Min: 0.3
Sigma_Max: 500

image

didgeridoo_.mp4

Model Details

  • Model type: Stable Audio Open 1.0 is a latent diffusion model based on a transformer architecture.
  • Language(s): English
  • License: See the LICENSE file.
  • Commercial License: to use this model commercially, please refer to https://stability.ai/membership

Contributors

saganaki22, ameerazam08
