
billionsongs's Introduction

This project mainly serves as a demonstration of Gradient, our TensorFlow binding for C# and other .NET languages.

The development instance can be accessed here: Billion Songs Dev.

What is it, and how does it work?

This is a deep learning-powered song lyrics generator, based on GPT-2 and wrapped as an ASP.NET Core website.

It generates songs word by word (or rather token by token), using the statistical relationships learned by a deep learning model called GPT-2. The actual generator code is in the GradientTextGenerator class.
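To illustrate what "token by token" means, here is a minimal Python sketch. It is not the GradientTextGenerator code: the `TOY_MODEL` table and the `generate` function are hypothetical stand-ins, and the toy model looks only at the previous token, whereas GPT-2 conditions on the whole preceding context.

```python
import random

# Hypothetical toy "model": maps the previous token to candidate next
# tokens with weights. GPT-2 uses the full context, not just one token.
TOY_MODEL = {
    "<start>": {"la": 3, "oh": 1},
    "la": {"la": 2, "oh": 1, "<end>": 1},
    "oh": {"la": 1, "baby": 2},
    "baby": {"<end>": 1},
}


def generate(model, max_tokens=20, seed=None):
    """Sample tokens one at a time until <end> or max_tokens is reached."""
    rng = random.Random(seed)
    tokens = []
    current = "<start>"
    for _ in range(max_tokens):
        choices = model[current]
        # Pick the next token in proportion to its weight.
        current = rng.choices(list(choices), weights=list(choices.values()))[0]
        if current == "<end>":
            break
        tokens.append(current)
    return tokens


if __name__ == "__main__":
    print(" ".join(generate(TOY_MODEL, seed=42)))
```

Each loop iteration depends on the token chosen in the previous one, so the steps cannot run in parallel. That is one reason generation is slow for a real model.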

Text generation is pretty slow even with a powerful GPU, so we have a bunch of caches in /Web to provide a better user experience. There is also PregeneratedSongProvider, which continuously creates new texts in the background to ensure that clicking the "Make Random" button gives an instant result.
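The background pre-generation idea can be sketched in Python as a bounded buffer filled by a worker thread. This is only an illustration of the pattern, not the actual PregeneratedSongProvider: the class name `PregeneratedProvider`, its methods, and the `capacity` parameter are assumptions made for this example.

```python
import queue
import threading


class PregeneratedProvider:
    """Keeps a small buffer of ready items, filled by a background thread."""

    def __init__(self, generate, capacity=4):
        self._generate = generate  # slow function that produces one text
        self._buffer = queue.Queue(maxsize=capacity)
        self._stop = threading.Event()
        self._worker = threading.Thread(target=self._fill, daemon=True)
        self._worker.start()

    def _fill(self):
        while not self._stop.is_set():
            item = self._generate()
            # The buffer may be full; retry with a timeout so that a
            # stop() request is noticed instead of blocking forever.
            while not self._stop.is_set():
                try:
                    self._buffer.put(item, timeout=0.1)
                    break
                except queue.Full:
                    continue

    def get(self, timeout=None):
        """Return a ready item; waits only if the buffer is empty."""
        return self._buffer.get(timeout=timeout)

    def stop(self):
        self._stop.set()
        self._worker.join()
```

A request handler would call `get()` and normally receive a text that was generated earlier, while the worker refills the buffer in the background.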

NOTE: this repository has git submodules, so clone it with --recurse-submodules. Learn about them here.

Prerequisites

  1. Download and install Python and TensorFlow as described in the Gradient documentation
  2. Install the Python package regex (python -m pip install regex --user)
  3. Install the latest .NET Core SDK

Run instructions

  1. After cloning the repository, enter the Web folder and run dotnet ef database update. That should create a songs.db file in the same directory.
  2. Edit appsettings.json (see appsettings.Development.json for an example):
    • add "DB": "sqlite"
    • modify DefaultConnection to "DefaultConnection": "Data Source=songs.db"
  3. Run dotnet run web. This should print some logs. Wait for Now listening on: http://, then open that URL in the browser. It will take up to 4 minutes to generate the first song.

Train instructions

NOTE: training requires a lot of RAM (>16GB) and will be slow without a GPU.

  1. Download the original 117M GPT-2 model by running one of the download_model.* scripts in External/Gradient-Samples/GPT-2 from that same directory.
  2. Download any lyrics dataset (I used Every song you have heard (almost)!) and extract it if needed.
  3. From the command line in the same directory (GPT-2), run dotnet run train --include *.csv --column Lyrics path/to/lyrics/folder --run Lyrics (change the column parameter to the name of the lyrics column in your dataset)

NOTE: the dev instance was trained with train -i "*.csv" --column=Lyrics Downloads\every-song-you-have-heard-almost -r Lyrics --checkpoint=fresh --save-every=100 -n 3. If training from an IDE, set the working directory to GPT-2 (which should contain the models subfolder downloaded previously).

  4. Interrupt the training process when the samples start looking good.
  5. Try the trained model by running dotnet run --run Lyrics
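The --column value in the training command has to match a column header in your dataset. If you are unsure what the lyrics column is called, a small Python helper like the one below can list the headers of every CSV file in a folder. The function name `list_csv_columns` is made up for this example, and it assumes the files are UTF-8 encoded with a header row.

```python
import csv
from pathlib import Path


def list_csv_columns(folder):
    """Return {file name: header row} for every .csv file in folder."""
    columns = {}
    for path in sorted(Path(folder).glob("*.csv")):
        with path.open(newline="", encoding="utf-8") as f:
            # The first row of each file is taken to be the header.
            header = next(csv.reader(f), [])
        columns[path.name] = header
    return columns


if __name__ == "__main__":
    for name, header in list_csv_columns("path/to/lyrics/folder").items():
        print(name, header)
```

Pass whichever header holds the song text as the --column argument.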
