tf-lstm-char-cnn

A TensorFlow port of Yoon Kim's Torch7 code. See also a similar project here, which failed to reproduce Kim's results and was apparently abandoned by its author. Many pieces of code are borrowed from it.

Installation

You need TensorFlow version 1.0 and Python (obviously).

Running

python train.py -h
python evaluate.py -h

Training

python train.py

will train the large model from Yoon Kim's paper.

Evaluate

python evaluate.py --load_model cv/epoch024_4.4962.model

evaluates this model on the test dataset.

Generate

python generate.py --load_model cv/epoch024_4.4962.model

generates random text from the loaded model.
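The checkpoint path used in the commands above, cv/epoch024_4.4962.model, appears to encode an epoch number and a loss value. The sketch below picks the checkpoint with the lowest value from a list of such names. It assumes that every checkpoint follows the same naming pattern and that the number after the underscore is a loss. The README states neither, and the helper names `parse_checkpoint` and `best_checkpoint` are hypothetical, not part of this repository.

```python
import re

# Assumed naming scheme, inferred from cv/epoch024_4.4962.model:
# "epoch" + zero-padded epoch number + "_" + loss value + ".model"
CHECKPOINT_RE = re.compile(r"epoch(\d+)_(\d+\.\d+)\.model")


def parse_checkpoint(path):
    """Return (epoch, loss) parsed from a checkpoint path, or None if it does not match."""
    match = CHECKPOINT_RE.search(path)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def best_checkpoint(paths):
    """Return the path whose embedded loss value is lowest, or None if none match."""
    candidates = []
    for path in paths:
        info = parse_checkpoint(path)
        if info is not None:
            candidates.append((info[1], path))
    if not candidates:
        return None
    # min() compares the loss first; ties fall back to the path string.
    return min(candidates)[1]


if __name__ == "__main__":
    # The first name is from this README; the second is a made-up example.
    names = ["cv/epoch024_4.4962.model", "cv/epoch010_4.6100.model"]
    print(parse_checkpoint(names[0]))
    print(best_checkpoint(names))
```

The selected path can then be passed to `--load_model` in the evaluate.py or generate.py commands.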

Model

[Figure: model graph]

The Model_1 graph is used for inference (computing validation loss and perplexity during training).

The Model graph is used for training.

Pre-trained model

Here is a ZIP file containing model files (created with TF 1.0). This model was trained with the default parameters and achieved the accuracy of the published result.

Pre-trained model (60 MB)

Results

[Figure: training]

Learning rate   Train/Valid/Test loss    Train/Valid/Test perplexity
1.0             3.815 / 4.407 / 4.369    35.40 / 82.02 / 79.03

Note that the model DOES reproduce the published result.
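The validation and test figures in the results above are consistent with perplexity being the exponential of the loss. The sketch below checks this, assuming the loss is a mean per-word cross-entropy in nats, which the README does not state. The training pair is left out because it does not follow this relation as closely, and the source does not say how the training figures were measured.

```python
import math

# Values copied from the results table (learning rate 1.0).
reported = {
    "valid": {"loss": 4.407, "perplexity": 82.02},
    "test": {"loss": 4.369, "perplexity": 79.03},
}


def perplexity(loss):
    """Perplexity for a mean cross-entropy loss, assumed to be measured in nats."""
    return math.exp(loss)


if __name__ == "__main__":
    for split, row in reported.items():
        computed = perplexity(row["loss"])
        print(f"{split}: exp({row['loss']}) = {computed:.2f}, reported {row['perplexity']}")
```

Small differences are expected because the table rounds the losses to three decimal places.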

Training times (legacy, with TF 0.12)

Timings were recorded on AWS EC2 instances:

  1. c4.8xlarge - 32 CPUs, no GPUs
  2. g2.2xlarge - 8 CPUs, 1 GPU (K520)
  3. p2.xlarge - 4 CPUs, 1 GPU (K80)

Timing            c4.8xlarge   g2.2xlarge   p2.xlarge
Secs per batch    0.98         2.85         0.32
Secs per epoch    1404         3779         428

Training takes about 3 hours (25 epochs) on the p2.xlarge machine.
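The 3-hour figure follows from the seconds-per-epoch row of the timing table. As a rough check, the sketch below multiplies each per-epoch time by 25 epochs. It assumes every epoch takes the same time and ignores evaluation and checkpointing overhead. These are the legacy TF 0.12 timings, so current numbers may differ.

```python
EPOCHS = 25  # number of epochs stated in the README

# Seconds per epoch, copied from the timing table above.
secs_per_epoch = {
    "c4.8xlarge": 1404,
    "g2.2xlarge": 3779,
    "p2.xlarge": 428,
}


def total_hours(secs, epochs=EPOCHS):
    """Estimated wall-clock hours for a full run, assuming a constant epoch time."""
    return secs * epochs / 3600.0


if __name__ == "__main__":
    for instance, secs in secs_per_epoch.items():
        print(f"{instance}: {total_hours(secs):.2f} hours")
```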

Generating random text

default charges during the straight summer of the collapse of it works that it <unk> to the u.s. is very <unk> 

mr. boyd 's remarks are crucial to the champion to a close <unk> in N he says 

explains it 's only one of the many really identified 

it did n't end a deal we do n't know what 's no longer viable 

<unk> <unk> <unk> 's <unk> hearing his head 

nora more than a$ N million has had no control in N it would do that could win any new other <unk> corporate and real-estate <unk> a future number of names <unk> at telling an <unk> outside the international institution 

the extensive approach is a real form of unpublished football <unk> as the <unk> bells on negotiation refund 

in other words the others who paid $ N a figure 

what 's really the <unk> who can use a vast hundred the transaction would give it a better use of the <unk> and 

the demise of the topic is fatal than we can do anything that is like the change in reality says justice <unk> <unk> of <unk> & <unk> a direct mail marketing firm 

when when lawyer <unk> <unk> <unk> as corporate crime remains a distant younger woman says i 've seen however an injection in the new york money fund 

some hearings have detectors with <unk> have survived mr. achenbaum 's veto incorporated the <unk> delicate center which <unk> a heavy medium that lawyers call are confusing big inquiries for the forecasting two outlets to had the claims 

the afghan people have a series that 's really heard to share legislation for <unk> without all their racial cosmetic ties 

he is going to hold his ...

Contributors

Nicole

David

derluke
