
Introduction: Frodotype - a fantasy-flavoured text generator

Frodotype is a web app that generates text that is grammatically correct and logical most of the time. The project fine-tunes the GPT-2 model using the gpt-2-simple Python package. OpenAI offers three of its GPT-2 models for public use, but their size and complexity make them difficult to train on consumer hardware. Therefore, the smallest model (117M parameters) was retrained on a Google Deep Learning VM. The data used to fine-tune the model consists of 102 fantasy novels by 20 authors. This data was chosen for two reasons:

  1. Fantasy is the genre I read the most.
  2. I own the ebooks used in this project.

Frodotype was built using TensorFlow, Docker, Google Cloud Platform, JavaScript, Chart.js, and Bulma CSS. The bulk of the heavy lifting in Python was done by Max Woolf's gpt-2-simple project.

Gathering the data

The 102 books used were converted from the Amazon Kindle format (.azw) to plain text files. This process included stripping all images, formatting, and hyperlinks. The books were then manually stripped of their tables of contents, appendices, and glossaries. The final text file used to retrain the model was assembled with the following commands:

  1. Concatenate the files:

$ for f in *.txt; do (cat "${f}"; echo) >> unprocessed.txt; done

  2. Delete all non-ASCII characters:

$ LC_ALL=C tr -dc '\0-\177' < unprocessed.txt > processed.txt

  3. Remove numbers and dashes from the text:

$ tr -d '0-9-' < processed.txt > final.txt

Additional processing is done in the text-analysis notebook.
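For reference, the three shell steps above can be mirrored in pure Python, which is easier to extend with further cleanup. This is a minimal sketch, not the project's actual code; the file names follow the commands above, and `clean` reproduces the two `tr` filters:

```python
import glob
import re

def clean(text):
    # Delete all non-ASCII characters (equivalent of tr -dc '\0-\177') ...
    text = "".join(ch for ch in text if ord(ch) < 128)
    # ... then remove digits and dashes (equivalent of tr -d '0-9-').
    return re.sub(r"[0-9-]", "", text)

def build_corpus(pattern="*.txt", out="final.txt"):
    # Concatenate the book files, one trailing newline each, then clean.
    text = "\n".join(open(f, encoding="utf-8", errors="ignore").read()
                     for f in sorted(glob.glob(pattern)))
    with open(out, "w") as fh:
        fh.write(clean(text))
```

Running `build_corpus()` in the directory of converted `.txt` files produces the same `final.txt` as the shell pipeline.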

Training the model

The model was trained on a Google Deep Learning VM using a Tesla K80 GPU, TensorFlow 1.15, and CUDA 10.0.

The model was retrained using gpt-2-simple, a Python package that eases the process of tweaking hyperparameters. The model was trained for three different lengths. The one used in this app was trained for 45,000 steps, or approximately 90 hours (about 500 steps per hour). Two additional models were trained, at 25,000 steps and 80,000 steps. The smaller of the two had a much higher loss value, while the larger had a similar loss that began to increase towards the end.
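The fine-tuning run described above can be sketched with gpt-2-simple roughly as follows. The values mirror this README (45,000 steps on the smallest model, which OpenAI has since renamed from 117M to 124M); wrapping the calls in a function is purely illustrative, since actually running them needs a GPU and TensorFlow 1.x:

```python
# Sketch of the gpt-2-simple fine-tuning run; values mirror the README.
FINETUNE_ARGS = {
    "dataset": "final.txt",   # corpus built by the commands above
    "model_name": "124M",     # OpenAI's current name for the 117M model
    "steps": 45000,           # ~90 hours on a Tesla K80 (~500 steps/hour)
}

def finetune(args=FINETUNE_ARGS):
    # Requires TensorFlow 1.x and a CUDA-capable GPU; not run here.
    import gpt_2_simple as gpt2
    gpt2.download_gpt2(model_name=args["model_name"])
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess, args["dataset"],
                  model_name=args["model_name"],
                  steps=args["steps"])
```

After training, `gpt2.generate(sess)` samples text from the fine-tuned checkpoint.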
