StoryTeller

A multimodal AI story teller, built with Stable Diffusion, GPT, and neural text-to-speech (TTS).

Given a prompt as an opening line of a story, GPT writes the rest of the plot; Stable Diffusion draws an image for each sentence; a TTS model narrates each line, resulting in a fully animated video of a short story, replete with audio and visuals.
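The flow described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the package's actual implementation: `Scene`, `split_sentences`, `tell_story`, and the three `fake_*` stand-ins are hypothetical names, the sentence splitter is deliberately naive, and treating the prompt as the first sentence of the story is an assumption.

```python
import re
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Scene:
    """One sentence of the story with its image and narration (hypothetical type)."""
    sentence: str
    image: str  # placeholder for an image object or file path
    audio: str  # placeholder for an audio clip or file path


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def tell_story(
    prompt: str,
    write: Callable[[str], str],
    paint: Callable[[str], str],
    speak: Callable[[str], str],
) -> List[Scene]:
    """Continue the prompt, then pair every sentence with an image and audio."""
    # Assumption: the prompt itself is kept as the opening line of the story.
    story = prompt + " " + write(prompt)
    return [Scene(s, paint(s), speak(s)) for s in split_sentences(story)]


# Stand-ins for the language model, the image model, and the TTS model.
def fake_write(prompt: str) -> str:
    return "It found a door. Nobody answered!"


def fake_paint(sentence: str) -> str:
    return f"<image for: {sentence}>"


def fake_speak(sentence: str) -> str:
    return f"<audio for: {sentence}>"


scenes = tell_story("A fox walked into the forest.", fake_write, fake_paint, fake_speak)
for scene in scenes:
    print(scene.sentence, scene.image, scene.audio, sep=" | ")
```

In a real pipeline, the per-sentence images and audio clips would then be stitched together into the final video.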

Installation

PyPI

Story Teller is available on PyPI.

$ pip install storyteller-core

Source

  1. Clone the repository.
$ git clone https://github.com/jaketae/storyteller.git
$ cd storyteller
  2. Install dependencies.
$ pip install .

Note: For Apple M1/M2 users, mecab-python3 is not available. You need to install mecab before running pip install. You can do this with Homebrew via brew install mecab. For more information, refer to SamuraiT/mecab-python3#84.

  3. (Optional) To develop locally, install the dev dependencies and the pre-commit hooks. The hooks automatically run linting and code quality checks before each commit.
$ pip install -e .[dev]
$ pre-commit install

Quickstart

The quickest way to run a demo is through the CLI. Simply type

$ storyteller

The final video will be saved as /out/out.mp4, alongside other intermediate images, audio files, and subtitles.

To override the defaults with custom parameters, pass the CLI flags as needed.

$ storyteller --help
usage: storyteller [-h] [--writer_prompt WRITER_PROMPT]
                   [--painter_prompt_prefix PAINTER_PROMPT_PREFIX] [--num_images NUM_IMAGES]
                   [--output_dir OUTPUT_DIR] [--seed SEED] [--max_new_tokens MAX_NEW_TOKENS]
                   [--writer WRITER] [--painter PAINTER] [--speaker SPEAKER]
                   [--writer_device WRITER_DEVICE] [--painter_device PAINTER_DEVICE]

optional arguments:
  -h, --help            show this help message and exit
  --writer_prompt WRITER_PROMPT
  --painter_prompt_prefix PAINTER_PROMPT_PREFIX
  --num_images NUM_IMAGES
  --output_dir OUTPUT_DIR
  --seed SEED
  --max_new_tokens MAX_NEW_TOKENS
  --writer WRITER
  --painter PAINTER
  --speaker SPEAKER
  --writer_device WRITER_DEVICE
  --painter_device PAINTER_DEVICE
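As an illustration, the flags listed above can be combined into a single invocation. The sketch below assembles such a command from Python; `build_command` is a hypothetical helper, and the prompt text, image count, and seed are made-up example values rather than recommended settings. The flag names are taken from the help output, but the value types they accept are an assumption.

```python
import shlex
from typing import Dict, List


def build_command(options: Dict[str, object]) -> List[str]:
    """Turn a mapping such as {"seed": 42} into ["storyteller", "--seed", "42"]."""
    command = ["storyteller"]
    for flag, value in options.items():
        command += [f"--{flag}", str(value)]
    return command


# Example values only; every flag name appears in `storyteller --help`.
command = build_command(
    {
        "writer_prompt": "Once upon a time, a fox found a map.",
        "num_images": 5,
        "output_dir": "out",
        "seed": 42,
    }
)

# Print a shell-safe version of the command for copy-pasting.
print(shlex.join(command))

# To run it directly (requires storyteller to be installed):
# import subprocess
# subprocess.run(command, check=True)
```

Passing the arguments as a list avoids shell quoting problems when the prompt contains spaces or punctuation.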

Usage

For more advanced use cases, you can also directly interface with Story Teller in Python code.

  1. Load the model with defaults.
from storyteller import StoryTeller

story_teller = StoryTeller.from_default()
story_teller.generate(...)
  2. Alternatively, configure the model with custom settings.
from storyteller import StoryTeller, StoryTellerConfig

config = StoryTellerConfig(
    writer="gpt2-large",
    painter="CompVis/stable-diffusion-v1-4",
    max_new_tokens=100,
)

story_teller = StoryTeller(config)
story_teller.generate(...)

License

Released under the MIT License.

Contributors

jaketae, christopherwoodall, itsyogesh
