tensorflow_fizzbuzz's Introduction

Tensorflow Fizz Buzz

Fizz Buzz is a very old and widely used programming interview question. It has spawned many awesome solutions and parodies; the two that stand out to me are the enterprise Fizz Buzz repo and Joel Grus' TensorFlow Fizz Buzz.

While reading Joel's solution, I wondered whether it was possible to train the neural network fully, so that it reproduces the correct solution perfectly.

Why should or could this work?

Effectively, the function we want to approximate is based on the modulo function. This is a very basic, piecewise continuous function whose graph is a sawtooth. Of course, Fizz Buzz is not simply a modulo function, but a combination of the linear function f(x) = x and three modulo functions (mod 3, mod 5 and mod 15) with some logic attached to them. Additionally, the values of the function are not all numerical, but for all intents and purposes we can imagine the three special values mapping to three different negative numbers, or even to vectors in R^n rather than to points on the one-dimensional real axis.
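The target function described above can be sketched in plain Python as a four-class labelling problem. This is only an illustration: the class order, the `LABELS` tuple and the helper names are assumptions made here, not taken from fizzbuzz_tf.py.

```python
# Hypothetical class order: 0 = print the number itself, 1..3 = special values.
LABELS = ("number", "fizz", "buzz", "fizzbuzz")


def fizz_buzz_class(n):
    """Map an integer to a class index using the three modulo checks."""
    if n % 15 == 0:
        return 3
    if n % 5 == 0:
        return 2
    if n % 3 == 0:
        return 1
    return 0


def one_hot(cls, num_classes=4):
    """Represent a class as a vector in R^4 instead of a single number."""
    return [1.0 if i == cls else 0.0 for i in range(num_classes)]


def fizz_buzz(n):
    """Turn the class back into the string that Fizz Buzz should print."""
    cls = fizz_buzz_class(n)
    return str(n) if cls == 0 else LABELS[cls]


print([fizz_buzz(n) for n in range(1, 16)])
```

Encoding the special values as one-hot vectors corresponds to the "vector in R^n" view mentioned above, and it avoids imposing an artificial ordering on the three special outputs.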

My belief that this should be possible was based on the universal approximation theorem:

[T]he universal approximation theorem states that a feed-forward network with a single hidden layer containing a finite number of neurons can approximate continuous functions on compact subsets of R^n, under mild assumptions on the activation function.

It was originally shown for sigmoid activation functions, but has since been shown to be valid for a wide range of different activation functions.
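For reference, one common formal version of this statement (the form usually given for sigmoidal activations) is written out below. The symbols K, f, N, v_i, w_i, b_i and σ are notation introduced here and do not appear in the quote: K is a compact subset of R^n, f is a continuous real-valued function on K, and σ is the activation function.

```latex
\forall \varepsilon > 0 \;\; \exists N \in \mathbb{N},\;
v_i, b_i \in \mathbb{R},\; w_i \in \mathbb{R}^n \;\; (i = 1, \dots, N) :
\qquad
\sup_{x \in K} \Bigl|\, f(x) - \sum_{i=1}^{N} v_i \,
\sigma\bigl(w_i^{\top} x + b_i\bigr) \Bigr| < \varepsilon
```

Note that the theorem only guarantees that suitable weights exist. It says nothing about how many hidden neurons N are needed or whether training will actually find those weights.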

Of course, the function we want to approximate does not fulfill all the conditions of the theorem: it is not continuous. Hence, it is not immediately clear that we can approximate the Fizz Buzz function perfectly. However, its shape is simple enough that I was hopeful a neural network could fully learn the Fizz Buzz rules.

Improving the network

Joel's approach was already fairly successful; however, his learned model did not fully reproduce the right sequence. I changed a few things about the training data and the network architecture in order to improve the model:

  • The training data set: neural networks shine on problems with a LOT of training data. Hence, I extended the training set from the numbers 101 to 1024 to the range [101, 4096]. One could argue that this is still not a lot, but it turned out to be enough for our use case.
  • To accommodate the larger input range, I had to increase the number of binary digits used to encode each input.
  • I also increased the batch size from 128 to 250.
  • It turned out that we needed more hidden nodes (200 instead of 100) to learn the function's pattern.
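The data-side changes in the list above can be sketched without TensorFlow. The snippet assumes that the training set is every integer in [101, 4096], that 13 binary digits are used (the minimum needed for 4096), and that bits are stored least significant first; `NUM_DIGITS`, `TRAIN_RANGE` and the helper names are hypothetical and may differ from fizzbuzz_tf.py.

```python
from collections import Counter

NUM_DIGITS = 13                 # assumption: 4096 == 2**12 needs 13 bits
TRAIN_RANGE = range(101, 4097)  # the interval [101, 4096] from the text


def binary_encode(n, num_digits=NUM_DIGITS):
    """Bits of n, least significant bit first (bit order is an assumption)."""
    return [(n >> d) & 1 for d in range(num_digits)]


def fizz_buzz_class(n):
    """0 = number, 1 = fizz, 2 = buzz, 3 = fizzbuzz (hypothetical order)."""
    if n % 15 == 0:
        return 3
    if n % 5 == 0:
        return 2
    if n % 3 == 0:
        return 1
    return 0


train_x = [binary_encode(n) for n in TRAIN_RANGE]
train_y = [fizz_buzz_class(n) for n in TRAIN_RANGE]

# Share of inputs whose correct answer is simply the number itself.
class_counts = Counter(train_y)
majority_share = class_counts[0] / len(train_y)
print(len(train_x), dict(class_counts), majority_share)
```

Under these assumptions, 2132 of the 3996 training inputs are plain numbers, a share of about 0.5335. A network that always predicts "number" would reach exactly that accuracy, which is consistent with the plateau in the first epochs of the training log.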

Results

foo@bar:~$ python fizzbuzz_tf.py
Epoch: 1, train accuracy: 0.5335335335335335, epoch loss: 20.365147829055786
Epoch: 2, train accuracy: 0.5335335335335335, epoch loss: 18.559097170829773
Epoch: 3, train accuracy: 0.5335335335335335, epoch loss: 18.453301548957825
Epoch: 4, train accuracy: 0.5335335335335335, epoch loss: 18.446383833885193
Epoch: 5, train accuracy: 0.5335335335335335, epoch loss: 18.443661332130432
Epoch: 6, train accuracy: 0.5335335335335335, epoch loss: 18.455499053001404
...
Epoch: 2000, train accuracy: 1.0, epoch loss: 0.07465453259646893
['1' '2' 'fizz' '4' 'buzz' 'fizz' '7' '8' 'fizz' 'buzz' '11' 'fizz' '13'
 '14' 'fizzbuzz' '16' '17' 'fizz' '19' 'buzz' 'fizz' '22' '23' 'fizz'
 'buzz' '26' 'fizz' '28' '29' 'fizzbuzz' '31' '32' 'fizz' '34' 'buzz'
 'fizz' '37' '38' 'fizz' 'buzz' '41' 'fizz' '43' '44' 'fizzbuzz' '46' '47'
 'fizz' '49' 'buzz' 'fizz' '52' '53' 'fizz' 'buzz' '56' 'fizz' '58' '59'
 'fizzbuzz' '61' '62' 'fizz' '64' 'buzz' 'fizz' '67' '68' 'fizz' 'buzz'
 '71' 'fizz' '73' '74' 'fizzbuzz' '76' '77' 'fizz' '79' 'buzz' 'fizz' '82'
 '83' 'fizz' 'buzz' '86' 'fizz' '88' '89' 'fizzbuzz' '91' '92' 'fizz' '94'
 'buzz' 'fizz' '97' '98' 'fizz' 'buzz']
Number of correct predictions: 100
Incorrect predictions:  []

This looks absolutely great! Now, did we just learn a universal Fizz Buzz function? Let's try some higher numbers and check how good our model is. Uncommenting the respective parts in fizzbuzz_tf.py yields:

foo@bar:~$ python fizzbuzz_tf.py
...

Number of correct predictions: 226
Incorrect predictions:  [('fizz', '4098'), ('4099', 'fizz'), ('buzz', '4100'), ('fizz', 'buzz')
...

Obviously, our model performs terribly: only 226 correct predictions out of almost 1000. How could this be? The reason lies in the way we encoded our integers and chose the training set. Our training set only included numbers whose binary representation has up to 12 digits (plus 4096, which has 13). Hence, the inputs for the higher binary digits were essentially always zero during training, so the connections attached to them were never trained, and the model fails on larger numbers.
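The effect can be made concrete by counting how often each input bit is set. The sketch below assumes a 13-digit binary encoding and a hypothetical evaluation range of 1000 numbers starting at 4097; the actual range used in fizzbuzz_tf.py is not shown here.

```python
NUM_DIGITS = 13                # assumption: 13 binary input digits
TRAIN_RANGE = range(101, 4097)
TEST_RANGE = range(4097, 5097)  # hypothetical "higher numbers"


def bit_usage(numbers, num_digits=NUM_DIGITS):
    """For each bit position, count how many inputs have that bit set."""
    return [sum((n >> d) & 1 for n in numbers) for d in range(num_digits)]


train_usage = bit_usage(TRAIN_RANGE)
test_usage = bit_usage(TEST_RANGE)

# The highest bit is set exactly once in training (for 4096) ...
print("train, bit 12:", train_usage[12], "of", len(TRAIN_RANGE))
# ... but for every single input in the hypothetical test range.
print("test,  bit 12:", test_usage[12], "of", len(TEST_RANGE))
```

With a single training example carrying the highest bit, the weights attached to that input receive almost no useful gradient signal, yet every higher number relies on them.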

How could we solve this? One option is to choose a different encoding for our integers that doesn't suffer from this issue. For example, using the usual decimal encoding instead of binary might let the model cover a much larger range, depending on the training set.
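A decimal encoding could look roughly like the following sketch, which zero-pads each number to a fixed number of digits. The function names, the default of four digits and the optional one-hot variant are assumptions for illustration; this has not been run against the model, so whether it actually generalizes better remains open.

```python
def decimal_encode(n, num_digits=4):
    """Zero-padded decimal digits of n, most significant digit first."""
    digits = str(n).zfill(num_digits)
    if n < 0 or len(digits) > num_digits:
        raise ValueError(f"{n} does not fit into {num_digits} decimal digits")
    return [int(c) for c in digits]


def decimal_one_hot(n, num_digits=4):
    """Optional variant: one-hot encode each digit (10 inputs per digit)."""
    encoded = []
    for digit in decimal_encode(n, num_digits):
        encoded.extend(1.0 if digit == value else 0.0 for value in range(10))
    return encoded


print(decimal_encode(42))          # zero-padded digits
print(len(decimal_one_hot(4096)))  # 4 digits * 10 values
```

One reason this encoding is appealing is that divisibility by 5 depends only on the last decimal digit and divisibility by 3 only on the digit sum. The encoding still has a fixed width, though, so numbers with more digits than were seen in training would presumably hit the same problem again.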

tensorflow_fizzbuzz's Issues

add treatment of decimal "encoding"

Add a decimal "encoding" function (essentially just zero padding to some number of digits) and do a couple of runs and see how that compares.
