
model-zoo's Introduction

Flux Model Zoo

This repository contains various demonstrations of the Flux machine learning library. Any of these may freely be used as a starting point for your own models.

The models are broadly categorised into the folders vision (e.g. large convolutional neural networks (CNNs)), text (e.g. various recurrent neural networks (RNNs) and natural language processing (NLP) models), and games (reinforcement learning / RL). See the READMEs of the respective models for more information.

Usage

Each model comes with its own Julia project. To use this, open Julia in the project folder, and enter

using Pkg; Pkg.activate("."); Pkg.instantiate()

This will install all needed packages, at the exact versions recorded when the model was last updated. Then you can run the model code with include("<model-to-run>.jl"), or by running the model script line by line.

Models may also be run with NVIDIA GPU support, if you have CUDA installed. Most models have this capability by default, indicated by calls to gpu in the model code.
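For reference, a minimal sketch of the usual pattern; the model and data here are illustrative placeholders, not taken from any particular zoo example:

using Flux, CUDA   # in recent Flux versions, `using CUDA` enables GPU support
model = Chain(Dense(784 => 32, relu), Dense(32 => 10)) |> gpu
x = rand(Float32, 784, 64) |> gpu   # gpu is a no-op if no working GPU is found
y = model(x)                        # runs on the GPU when one is available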

Gitpod Online IDE

Each model can also be used in Gitpod; just open the repository in Gitpod.

  • Based on Gitpod's policies, free access is limited.
  • All of your work will be stored in Gitpod's cloud.
  • It isn't an officially maintained feature.

Contributing

We welcome contributions of new models and documentation.

Share a new model

If you want to share a new model, we suggest you follow these guidelines:

  • Models should be in a folder with a project and manifest file to pin all relevant packages.
  • Models should include a README(.md) to explain what the model is about, how to run it, and what results it achieves (if applicable).
  • Models should ideally be CPU/GPU agnostic and not depend directly on GPU functionality.
  • Please keep the code short, clean, and self-explanatory, with as little boilerplate as possible.

Create or improve documentation

You can contribute in one of the following ways:

  • Add or improve documentation for existing models, including the following information:
    • Give a brief introduction to the model’s architecture and the goal it achieves.
    • Describe the Flux API that the model demonstrates (high-level API, AD, custom operations, custom layers, etc.).
    • Add literature background for the model. More specifically, add articles, blog posts, videos, and any other resource that is helpful to better understand the model.
    • Briefly describe the learning technique being demonstrated (computer vision, regression, NLP, time series, etc.).
  • Write in-depth tutorials for a model: You can further extend the documentation of a model and create a tutorial explaining in more detail the architecture, the training routine, how to use your own data, and so forth. After you write a tutorial, open a PR adding it to the Tutorials section of the FluxML website.

Update a model

Each example lists the version of Flux for which it was most recently updated. Bringing them up to the latest version is a great way to learn! Flux has a NEWS page listing important changes. (For other packages, see their releases pages: MLUtils, MLDatasets, etc.)

To run the old examples, Flux v0.11 can be installed and run on Julia 1.6, the LTS version. Flux v0.12 works on Julia 1.8. Flux v0.14 is the latest right now; this and v0.13 are marked with ☀️. Models upgraded to use explicit gradients (v0.13.9+ or v0.14) are marked with a +.
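For orientation, a minimal sketch of the explicit-gradient training style that upgraded models use (assuming Flux ≥ 0.13.9; model, loss, x, and y are placeholders):

using Flux
opt_state = Flux.setup(Adam(), model)             # optimiser state for this model
grads = Flux.gradient(m -> loss(m(x), y), model)  # explicit gradient w.r.t. the model
Flux.update!(opt_state, model, grads[1])          # one optimisation step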

Examples in the Model Zoo

Vision

Text

Other & contributed models

Tutorials

Examples Elsewhere

MLJFlux is a bridge to MLJ.jl, a package for mostly non-neural-network machine learning. They have some examples of interest, which, like the model zoo's examples, each include a local Project and Manifest file.

model-zoo's People

Contributors

adarshkumar712, adinhobl, aditkumar72, avik-pal, carlolucibello, chrisrackauckas, darsnack, dhairyalgandhi, fishares, iblislin, jakee417, jldc, joostdup, kraftpunk97, lilianabs, logankilpatrick, luboshanus, maetshju, matsueushi, mcabbott, mcognetta, mikeinnes, roboneet, saswatpp, shreyas-kowshik, staticfloat, sudhanshuagrawal27, tejank10, touchesir, willtebbutt


model-zoo's Issues

Tutorial: throws an error when calling `m(x)` (output of Dense)

Here:

Throws:
ERROR: MethodError: no method matching make_makeargs(::Base.Broadcast.Broadcasted{Flux.Tracker.TrackedStyle,Nothing,typeof(identity),Tuple{Base.Broadcast.Broadcasted{Flux.Tracker.TrackedStyle,Nothing,typeof(+),Tuple{TrackedArray{…,Array{Float64,1}},TrackedArray{…,Array{Float64,1}}}}}})

Even when activating the project with ]activate .

Also, the Flux version in Project.toml is starting to be outdated; for instance, I don't think there is a derivative function anymore. So I guess one question is whether it's worth fixing this bug on Flux 0.6.8, or rewriting the tutorial for a more recent version of Flux.

julia 1.0.0 support?

It prints errors when using Metalhead:
WARNING: eval from module Broadcast to Metalhead:
Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):388 =#, Expr(:function, Expr(:where, Expr(:call, :flatten, Expr(:::, :bc, Expr(:curly, :Broadcasted, :Style))), :Style), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):389 =#, Expr(:&&, Expr(:call, :isflat, :bc), Expr(:return, :bc)), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):390 =#, :args = Expr(:call, :cat_nested, :bc), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):391 =#, Expr(:let, Expr(:block, :makeargs = Expr(:call, :make_makeargs, :bc), :f = Expr(:., :bc, :(:f))), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):392 =#, :newf = Expr(:macrocall, Symbol("@inline"), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):392 =#, Expr(:function, Expr(:where, Expr(:tuple, Expr(:::, :args, Expr(:curly, :Vararg, :Any, :N))), :N), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):393 =#, Expr(:call, :f, Expr(:..., Expr(:call, :makeargs, Expr(:..., :args))))))), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):395 =#, Expr(:return, Expr(:call, Expr(:curly, :Broadcasted, :Style), :newf, :args, Expr(:., :bc, :(:axes)))))))), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):398 =#, Expr(:macrocall, Symbol("@inline"), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):398 =#, Expr(:function, Expr(:call, :make_makeargs, :makeargs, Expr(:::, :t, Expr(:curly, :Tuple, Expr(:<:, :Broadcasted), Expr(:curly, :Vararg, :Any)))), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):399 =#, :bc = Expr(:ref, :t, 1), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):400 =#, Expr(:let, Expr(:block, :makeargs = Expr(:call, :make_makeargs, :makeargs, Expr(:call, :tail, :t)), :f = Expr(:., :bc, :(:f))), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):401 =#, Expr(:let, :makeargs = Expr(:call, :make_makeargs, :makeargs, Expr(:., :bc, :(:args))), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):402 =#, Expr(:tuple, :headargs, :tailargs) = Expr(:tuple, Expr(:call, :make_headargs, Expr(:., :bc, :(:args))), Expr(:call, :make_tailargs, Expr(:., :bc, :(:args)))), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):403 =#, Expr(:return, Expr(:macrocall, Symbol("@inline"), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):403 =#, Expr(:function, Expr(:where, Expr(:tuple, Expr(:::, :args, Expr(:curly, :Vararg, :Any, :N))), :N), Expr(:block, #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):404 =#, :args1 = Expr(:call, :makeargs, Expr(:..., :args)), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):405 =#, Expr(:tuple, :a, :b) = Expr(:tuple, Expr(:call, :headargs, Expr(:..., :args1)), Expr(:call, :tailargs, Expr(:..., :args1))), #= Symbol("/home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl"):406 =#, Expr(:tuple, Expr(:call, :f, Expr(:..., :a)), Expr(:..., :b))))))))))))))
** incremental compilation may be broken for this module **

WARNING: Method definition flatten(Base.Broadcast.Broadcasted{Style, Axes, F, Args} where Args<:Tuple where F where Axes) where {Style} in module Broadcast at broadcast.jl:296 overwritten at /home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl:389.
WARNING: Method definition make_makeargs(Any, Tuple{#s12, Vararg{Any, N} where N} where #s12<:(Base.Broadcast.Broadcasted{Style, Axes, F, Args} where Args<:Tuple where F where Axes where Style<:Union{Nothing, Base.Broadcast.BroadcastStyle})) in module Broadcast at broadcast.jl:334 overwritten at /home/qingzhi/.julia/packages/Flux/kJVo6/src/tracker/array.jl:399.
ERROR: LoadError: UndefVarError: Void not defined
Stacktrace:
[1] top-level scope at none:0
[2] include at ./boot.jl:317 [inlined]
[3] include_relative(::Module, ::String) at ./loading.jl:1038
[4] include(::Module, ::String) at ./sysimg.jl:29
[5] top-level scope at none:2
[6] eval at ./boot.jl:319 [inlined]
[7] eval(::Expr) at ./client.jl:389
[8] top-level scope at ./none:3
in expression starting at /home/qingzhi/.julia/packages/BSON/VAvDJ/src/BSON.jl:7
ERROR: LoadError: Failed to precompile BSON [fbb218c0-5317-5bc6-957e-2ee96dd4b1f0] to /home/qingzhi/.julia/compiled/v1.0/BSON/3tVCZ.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] macro expansion at ./logging.jl:313 [inlined]
[3] compilecache(::Base.PkgId, ::String) at ./loading.jl:1184
[4] _require(::Base.PkgId) at ./logging.jl:311
[5] require(::Base.PkgId) at ./loading.jl:852
[6] macro expansion at ./logging.jl:311 [inlined]
[7] require(::Module, ::Symbol) at ./loading.jl:834
[8] include at ./boot.jl:317 [inlined]
[9] include_relative(::Module, ::String) at ./loading.jl:1038
[10] include(::Module, ::String) at ./sysimg.jl:29
[11] top-level scope at none:2
[12] eval at ./boot.jl:319 [inlined]
[13] eval(::Expr) at ./client.jl:389
[14] top-level scope at ./none:3
in expression starting at /home/qingzhi/.julia/packages/Metalhead/8ZFyX/src/Metalhead.jl:4
ERROR: Failed to precompile Metalhead [dbeba491-748d-5e0e-a39e-b530a07fa0cc] to /home/qingzhi/.julia/compiled/v1.0/Metalhead/OYscp.ji.
Stacktrace:
[1] error(::String) at ./error.jl:33
[2] macro expansion at ./logging.jl:313 [inlined]
[3] compilecache(::Base.PkgId, ::String) at ./loading.jl:1184
[4] macro expansion at ./logging.jl:311 [inlined]
[5] _require(::Base.PkgId) at ./loading.jl:941
[6] require(::Base.PkgId) at ./loading.jl:852
[7] macro expansion at ./logging.jl:311 [inlined]
[8] require(::Module, ::Symbol) at ./loading.jl:834

Error running the text generation example

Looks like the LSTM signature has been changed: LSTM(x, y) gives:

MethodError: no method matching Flux.LSTMCell(::TrackedArray{…,Array{Float64,2}}, ::TrackedArray{…,Array{Float64,2}}, ::Flux.Tracker.TrackedReal{Float64}, ::TrackedArray{…,Array{Float64,1}}, ::TrackedArray{…,Array{Float64,1}})
Closest candidates are:
  Flux.LSTMCell(::A, ::A, ::V, !Matched::V, !Matched::V) where {A, V} at /home/ayush99/.julia/dev/Flux/src/layers/recurrent.jl:116

Stacktrace:
 [1] #LSTMCell#79(::typeof(Flux.glorot_uniform), ::Type, ::Int64, ::Int64) at /home/ayush99/.julia/dev/Flux/src/layers/recurrent.jl:125
 [2] Flux.LSTMCell(::Int64, ::Int64) at /home/ayush99/.julia/dev/Flux/src/layers/recurrent.jl:125
 [3] #LSTM#80(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Int64, ::Vararg{Int64,N} where N) at /home/ayush99/.julia/dev/Flux/src/layers/recurrent.jl:160
 [4] LSTM(::Int64, ::Vararg{Int64,N} where N) at /home/ayush99/.julia/dev/Flux/src/layers/recurrent.jl:160
 [5] top-level scope at none:0

Switch all datasets to be using MLDatasets.jl

Following up on FluxML/Flux.jl#580,
it would be good to get the model zoo off of depending on Flux for datasets,
and instead depend on MLDatasets.jl.
Once that is done, the datasets in Flux can be deprecated.

This is a follow-up to #15 in part, now that MLDatasets has matured and been registered.
For reference, suppressing the accept-download prompt mentioned in #15 (comment)
is done by setting ENV["DATADEPS_ALWAYS_ACCEPT"]="true".
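For illustration, a minimal sketch of loading MNIST through MLDatasets (this follows the API of recent MLDatasets versions; older versions used MNIST.traindata() instead):

ENV["DATADEPS_ALWAYS_ACCEPT"] = "true"    # skip the interactive download prompt
using MLDatasets
trainset = MLDatasets.MNIST(split=:train)
xtrain, ytrain = trainset[:]              # features and targets for all samples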

CNN model

I wonder if Flux is ready for CNNs. If so, could you please add a CNN model to show us how it works? Thanks!
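(For readers landing here: a minimal sketch of what a LeNet-style CNN looks like in recent Flux. The layer sizes assume a 28×28×1 MNIST-like input; this is illustrative, not the zoo's implementation.)

using Flux
m = Chain(
    Conv((5, 5), 1 => 6, relu),   # 28×28×1 -> 24×24×6
    MaxPool((2, 2)),              # -> 12×12×6
    Conv((5, 5), 6 => 16, relu),  # -> 8×8×16
    MaxPool((2, 2)),              # -> 4×4×16
    Flux.flatten,                 # -> 256-element vector
    Dense(256 => 120, relu),
    Dense(120 => 84, relu),
    Dense(84 => 10))              # 10 class scores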

Source of housing data example unclear.

In the code it says: "# This replicates the housing data example from the Knet.jl readme. Although we could have reused more of Flux (see the mnist example), the library's abstractions are very lightweight and don't force you into any particular strategy."

This is not clear, as no link is provided.

GsoC Proposal 2019

Hi all! I am a GSoC'19 aspirant and have been contributing to JuliaLang and model-zoo for a while. I want to contribute to Flux during the summer project. I have an idea in mind which comes in three stages of work.

The first step would be to create good implementations of some toy RL problems, just to get to know Flux well. The next step is to implement sparsification of neural networks via sensitivity-driven regularization, following the NIPS paper
http://papers.nips.cc/paper/7644-learning-sparse-neural-networks-via-sensitivity-driven-regularization

This would be tested on some known sparsifiable networks, such as VGGNet on ImageNet, and the last step is to experiment with it on RL problems.
Also, another idea I had in mind was to port the fast.ai courses to Julia, which would be great fun!

Please provide suggestions, as I would like to start working on the proposal right away. @MikeInnes @ViralBShah @dhairyagandhi96 @staticfloat

Speech example not working

julia> include("01-speech-blstm.jl")
Loading files
ERROR: LoadError: ArgumentError: invalid value for Enum BSONType: 32
Stacktrace:
 [1] enum_argument_error(::Symbol, ::UInt8) at ./Enums.jl:29
 [2] Type at ./Enums.jl:134 [inlined]
 [3] read at ./Enums.jl:15 [inlined]
 [4] parse_pairs(::IOStream) at /home/jfsantos/.julia/packages/BSON/kxdIr/src/read.jl:41
 [5] parse_doc at /home/jfsantos/.julia/packages/BSON/kxdIr/src/read.jl:49 [inlined]
 [6] parse at /home/jfsantos/.julia/packages/BSON/kxdIr/src/read.jl:93 [inlined]
 [7] #open#310(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::typeof(BSON.parse), ::String) at ./iostream.jl:369
 [8] open at ./iostream.jl:367 [inlined]
 [9] parse at /home/jfsantos/.julia/packages/BSON/kxdIr/src/read.jl:94 [inlined]
 [10] load at /home/jfsantos/.julia/packages/BSON/kxdIr/src/read.jl:96 [inlined]
 [11] macro expansion at /home/jfsantos/.julia/packages/BSON/kxdIr/src/BSON.jl:51 [inlined]
 [12] readData(::String) at /home/jfsantos/src/model-zoo/audio/speech-blstm/01-speech-blstm.jl:95
 [13] main() at /home/jfsantos/src/model-zoo/audio/speech-blstm/01-speech-blstm.jl:134
 [14] top-level scope at none:0
 [15] include at ./boot.jl:326 [inlined]
 [16] include_relative(::Module, ::String) at ./loading.jl:1038
 [17] include(::Module, ::String) at ./sysimg.jl:29
 [18] include(::String) at ./client.jl:403
 [19] top-level scope at none:0
in expression starting at /home/jfsantos/src/model-zoo/audio/speech-blstm/01-speech-blstm.jl:180

I wasn't able to narrow down what is causing the issue, as I still have little experience with BSON. The script for data generation seems to have run fine and created the train and test folders with bson files in them.

GSOC Proposal : Computer Vision Networks

I wish to implement the following network architectures for my GSoC. They are listed in order of implementation:

  • VGG16
  • VGG19
  • ResNet50
  • Xception
  • Inception V3
  • Inception ResNet V2
  • MobileNet
  • DenseNet
  • NASNet
  • R-FCN
  • Fast R-CNN
  • Faster R-CNN
  • RetinaNet
  • Mask R-CNN (*)

@MikeInnes how does this sound? I wish to test these implementations on the tiny-imagenet-200 dataset or CIFAR-10.

Problem with broadcasting .== in MNIST Conv model from the zoo

I'm getting a GPU compilation error with the conv model from the zoo:


Argument 4 to your kernel function is of type [something long that I can't copy/paste]
That type is not isbits, and such arguments are only allowed when they are unused by the kernel

This is at the last line, `Flux.train!(loss, params(m), train, opt, cb = evalcb)`. Everything works on the CPU.

Edit:
I've narrowed it down to the line `accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))`. If I remove the callback, `evalcb = throttle(() -> @show(accuracy(tX, tY)), 10)` -> `Flux.train!(loss, params(m), train, opt)` it works.
Edit:
Further narrowed it down to the `.==` in `onecold(m(tX)) .== onecold(tY)`
`mean(cpu(onecold(m(tX))) .== cpu(onecold(tY)))` works
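Spelled out, the workaround is to move both sides of the comparison to the CPU inside the accuracy function; a sketch, using the names from the example:

using Statistics: mean
using Flux: onecold, cpu
# Comparing a CuArray with a plain Array via .== fails to compile as a GPU
# kernel, so bring both onecold results back to the host first:
accuracy(x, y) = mean(cpu(onecold(m(x))) .== cpu(onecold(y)))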

julia mnist.jl has numerical issue (NaN params) after some training

$ julia mnist.jl                                       
WARNING: Array{T}(::Type{T}, m::Int, n::Int) is deprecated, use Array{T}(m, n) instead.                                                                
Stacktrace:                                                          
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70               
 [2] Array(::Type{Float64}, ::Int64, ::Int64) at ./deprecated.jl:57  
 [3] traindata() at /home/phuoc/.julia/v0.6/MNIST/src/MNIST.jl:88    
 [4] include_from_node1(::String) at ./loading.jl:569                
 [5] include(::String) at ./sysimg.jl:14                             
 [6] process_options(::Base.JLOptions) at ./client.jl:305            
 [7] _start() at ./client.jl:371                                     
while loading /home/phuoc/git/model-zoo/mnist/mnist.jl, in expression starting on line 5                                                               
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.                                                                           
Stacktrace:                                                          
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70               
 [2] Array(::Type{Float64}, ::Int64) at ./deprecated.jl:57           
 [3] traindata() at /home/phuoc/.julia/v0.6/MNIST/src/MNIST.jl:89    
 [4] include_from_node1(::String) at ./loading.jl:569                
 [5] include(::String) at ./sysimg.jl:14                             
 [6] process_options(::Base.JLOptions) at ./client.jl:305            
 [7] _start() at ./client.jl:371                                     
while loading /home/phuoc/git/model-zoo/mnist/mnist.jl, in expression starting on line 5                                                               
loss(x, y) = param(0.089922)                                         
loss(x, y) = param(0.0890893)     
loss(x, y) = param(0.0865485)                                        
loss(x, y) = param(0.0804525)     
loss(x, y) = param(0.0710139)     
loss(x, y) = param(0.0596167)     
loss(x, y) = param(0.0461451)        
loss(x, y) = param(NaN)              
ERROR: LoadError: BoundsError: attempt to access 10-element UnitRange{Int64} at index [0]                                                              
Stacktrace:                          
 [1] throw_boundserror(::UnitRange{Int64}, ::Int64) at ./abstractarray.jl:433                                                                          
 [2] getindex at ./range.jl:477 [inlined]                                  
 [3] argmax(::TrackedArray{…,Array{Float64,1}}, ::UnitRange{Int64}) at /home/phuoc/.julia/v0.6/Flux/src/onehot.jl:27                                   
 [4] include_from_node1(::String) at ./loading.jl:569                      
 [5] include(::String) at ./sysimg.jl:14                                   
 [6] process_options(::Base.JLOptions) at ./client.jl:305                  
 [7] _start() at ./client.jl:371     
while loading /home/phuoc/git/model-zoo/mnist/mnist.jl, in expression starting on line 18  

Project Proposal For Contribution to GSoC'19

Hi, I am a GSoC'19 aspirant looking to contribute to the model-zoo of Flux. I have a few ideas in mind for my work and wanted to check their feasibility. I was planning to implement generative models such as CycleGAN, pix2pix, and neural image captioning, as well as an object detection pipeline such as YOLO-v3. I also see that object detection modules such as Faster-RCNN and MobileNet-SSD have not been implemented, and I can also work on them. I have prior experience with generative models and object detection using deep learning. @MikeInnes @ViralBShah

Switch from MNIST.jl to MLDatasets.jl?

Would it be OK to swap the data-providing package for the mnist example? MNIST.jl has deprecation warnings on 0.6, and it seems like MLDatasets is more active. This would also allow the mnist example to be run with the Fashion-MNIST dataset.

If there are no objections, I'll submit a PR.
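For what it's worth, with MLDatasets the swap to Fashion-MNIST is essentially a one-line change (a sketch against the current MLDatasets API):

using MLDatasets
xtrain, ytrain = MLDatasets.FashionMNIST(split=:train)[:]  # same shapes as MNIST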

How do you apply the cmu phones model?

The code in the phonemes example looks amazingly simple. However, I cannot figure out how to apply the model at test time, i.e., what would a function predict(model, word) look like?

function predict(model, word)
   Xs = tokenise(word, alphabet)
   ## apply the RNN
end
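One plausible completion, as a hedged sketch: Flux's stateful RNNs need their hidden state reset between words; then the one-hot characters are fed one at a time and the outputs decoded with onecold. Here tokenise, alphabet, and a phones list are assumed to come from the example script:

using Flux: reset!, onecold
function predict(model, word)
    reset!(model)                     # clear hidden state left over from the last word
    Xs = tokenise(word, alphabet)     # sequence of one-hot character vectors
    Ys = [model(x) for x in Xs]       # run the RNN over the sequence
    [phones[onecold(y)] for y in Ys]  # decode each output to a phoneme
end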

60-minute blitz out of date

I'm working through the 60-minute blitz, but it appears it is out of date:

  • derivative appears to be succeeded by gradient
  • Even so, taking the second derivative of the polynomial example doesn't work:
using Flux.Tracker: gradient
f(x) = 3x^2 + 2x + 1
df(x) = gradient(f, x)
df(5) # (32.0 (tracked),)
ddf(x) = gradient(df, x)
ddf(5) # expect 6

Last line results in

ERROR: Function output is not scalar
Stacktrace:
 [1] losscheck(::Tuple{Flux.Tracker.TrackedReal{Float64}}) at /home/janis/.julia/packages/Flux/8XpDt/src/tracker/back.jl:171
 [2] gradient_(::Function, ::Int64) at /home/janis/.julia/packages/Flux/8XpDt/src/tracker/back.jl:72
 [3] #gradient#24 at /home/janis/.julia/packages/Flux/8XpDt/src/tracker/back.jl:182 [inlined]
 [4] gradient at /home/janis/.julia/packages/Flux/8XpDt/src/tracker/back.jl:182 [inlined]
 [5] ddf(::Int64) at ./REPL[69]:1
 [6] top-level scope at none:0
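(For later readers: with the Zygote-based gradient in Flux ≥ 0.10, this nested example does work. A quick sketch:)

using Flux  # `gradient` comes from Zygote in Flux ≥ 0.10
f(x) = 3x^2 + 2x + 1
df(x) = gradient(f, x)[1]   # take the first component of the returned tuple
df(5.0)                     # 32.0
ddf(x) = gradient(df, x)[1] # differentiate the derivative
ddf(5.0)                    # 6.0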

Addition Of CycleGAN model

As suggested by @dhairyagandhi96, I have chalked out an implementation plan for the CycleGAN model:

  • Write the UNet architecture for the generator [For 256x256 and 128x128 images]

  • Write the discriminator. The paper's implementation reference would be followed.

  • Training would be done on the apples2oranges dataset first followed by the horses2zebras dataset.

  • Formulation of the identity loss for the generators and discriminators and the adversarial losses.

The code will be organised in a separate repository with the utility functions, I/O, model definitions and training files.

What more details are required?
@staticfloat

Go with differentiable programming

Since AlphaGo is already implemented here with Flux, it probably shouldn't be hard to implement a similar experiment with differentiable programming.

Would it make sense? Or is it too complex for differentiable programming?

MNIST is using MSE loss

MNIST is a classification problem, and the model has a softmax output layer.
It should be using cross-entropy loss, not MSE.
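As a sketch, the fix amounts to swapping the loss function (m being the model from the example):

using Flux: crossentropy
# Cross-entropy matches a softmax output; MSE gives weak gradients here.
loss(x, y) = crossentropy(m(x), y)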

Use CuArrays for mnist.jl example

I tried the following but it does not seem to be supported yet:

using CuArrays
...
m = Chain(
  Dense(28^2, 32, σ),
  Dense(32, 10),
  softmax)
m = cu.(m)

julia> m[1].W                
Tracked 32×784 Array{Float64,2}:   
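(Broadcasting cu over a Chain doesn't convert the parameters, as the output above shows. In current Flux the supported way is the gpu function; a sketch, assuming Flux ≥ 0.13 with CUDA.jl, which replaced CuArrays:)

using Flux, CUDA
m = Chain(Dense(28^2 => 32, σ), Dense(32 => 10), softmax) |> gpu
m[1].weight  # now a CuArray (the field was renamed from W to weight in Flux 0.12)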

MNIST example fails on gpu

When I uncomment the using CuArrays line in the MNIST example it fails (Ubuntu 14.04 - Google Cloud instance 104GB RAM, NVIDIA GPU) with the following error:

ERROR: LoadError: MethodError: no method matching ∇conv_data!(::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}; pad=(0, 0), stride=(1, 1), dilation=(1, 1), flipkernel=0)
Closest candidates are:
  ∇conv_data!(::CuArray{T<:Union{Float16, Float32, Float64},N} where N, ::CuArray{T<:Union{Float16, Float32, Float64},N} where N, ::CuArray{T<:Union{Float16, Float32, Float64},N} where N, ::CuArray{T<:Union{Float16, Float32, Float64},N} where N; pad, stride, mode, alpha, dilation, workspace, algo) where T<:Union{Float16, Float32, Float64} at /home/jamesnorton/.julia/packages/CuArrays/f4Eke/src/dnn/nnlib.jl:115 got unsupported keyword argument "flipkernel"
  ∇conv_data!(::AbstractArray{T,4}, ::AbstractArray{T,4}, ::AbstractArray{T,4}, ::AbstractArray{T,4}; pad, stride, dilation, flipkernel) where T at /home/jamesnorton/.julia/packages/NNlib/x0XUf/src/conv.jl:84

The strange thing is that the second candidate looks like a match to me.

Stacktrace:

 [1] kwerr(::NamedTuple{(:pad, :stride, :dilation, :flipkernel),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64},Int64}}, ::Function, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./error.jl:97
 [2] (::getfield(NNlib, Symbol("#kw##∇conv_data!")))(::NamedTuple{(:pad, :stride, :dilation, :flipkernel),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64},Int64}}, ::typeof(NNlib.∇conv_data!), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./none:0
 [3] #∇conv_data#54(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Function, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/NNlib/x0XUf/src/conv.jl:39
 [4] (::getfield(NNlib, Symbol("#kw##∇conv_data")))(::NamedTuple{(:stride, :pad, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}, ::typeof(NNlib.∇conv_data), ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./none:0
 [5] (::getfield(Flux.Tracker, Symbol("##434#435")){Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:stride, :pad, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}},TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,4}}})(::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/array.jl:349
 [6] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##434#435")){Base.Iterators.Pairs{Symbol,Tuple{Int64,Int64},Tuple{Symbol,Symbol,Symbol},NamedTuple{(:stride, :pad, :dilation),Tuple{Tuple{Int64,Int64},Tuple{Int64,Int64},Tuple{Int64,Int64}}}},TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,4}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,4}},Flux.Tracker.Tracked{CuArray{Float32,4}}}}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:23
 [7] back(::Flux.Tracker.Tracked{CuArray{Float32,4}}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [8] foreach at ./abstractarray.jl:1836 [inlined]
 [9] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("#back#451")){2,getfield(Base.Broadcast, Symbol("##26#28")){getfield(Base.Broadcast, Symbol("##27#29")){typeof(+),getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##11#12"))}},getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##15#16"))}},getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##3#4"))}}},typeof(relu)},Tuple{TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,4}}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,4}},Flux.Tracker.Tracked{CuArray{Float32,4}}}}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [10] back(::Flux.Tracker.Tracked{CuArray{Float32,4}}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [11] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##438#439")){Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}},TrackedArray{…,CuArray{Float32,4}},Tuple{Int64,Int64},CuArray{Float32,4}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,4}},Nothing}}, ::CuArray{Float32,4}) at ./abstractarray.jl:1836
 [12] back(::Flux.Tracker.Tracked{CuArray{Float32,4}}, ::CuArray{Float32,4}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [13] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##385#386")){TrackedArray{…,CuArray{Float32,4}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,4}},Nothing}}, ::CuArray{Float32,2}) at ./abstractarray.jl:1836
 [14] back(::Flux.Tracker.Tracked{CuArray{Float32,2}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [15] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##424#425")){TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,2}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,2}},Flux.Tracker.Tracked{CuArray{Float32,2}}}}, ::CuArray{Float32,2}) at ./abstractarray.jl:1836
 [16] back(::Flux.Tracker.Tracked{CuArray{Float32,2}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [17] foreach(::Function, ::Tuple{Flux.Tracker.Tracked{CuArray{Float32,2}},Flux.Tracker.Tracked{CuArray{Float32,1}}}, ::Tuple{CuArray{Float32,2},CuArray{Float32,1}}) at ./abstractarray.jl:1836
 [18] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("#back#451")){2,getfield(Base.Broadcast, Symbol("##26#28")){getfield(Base.Broadcast, Symbol("##27#29")){typeof(+),getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##11#12"))}},getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##15#16"))}},getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##3#4"))}}},typeof(identity)},Tuple{TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,1}}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,2}},Flux.Tracker.Tracked{CuArray{Float32,1}}}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [19] back(::Flux.Tracker.Tracked{CuArray{Float32,2}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [20] foreach at ./abstractarray.jl:1836 [inlined]
 [21] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##426#427")){TrackedArray{…,CuArray{Float32,2}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,2}}}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [22] back(::Flux.Tracker.Tracked{CuArray{Float32,2}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [23] foreach(::Function, ::Tuple{Nothing,Flux.Tracker.Tracked{CuArray{Float32,2}},Nothing}, ::Tuple{CuArray{Float32,2},CuArray{Float32,2},Float32}) at ./abstractarray.jl:1836
 [24] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("#back#451")){3,getfield(Base.Broadcast, Symbol("##26#28")){getfield(Base.Broadcast, Symbol("##27#29")){typeof(*),getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##11#12"))}},getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##15#16"))}},getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##27#29")){typeof(CUDAnative.log),getfield(Base.Broadcast, Symbol("##9#10")){getfield(Base.Broadcast, Symbol("##11#12"))},getfield(Base.Broadcast, Symbol("##13#14")){getfield(Base.Broadcast, Symbol("##15#16"))},getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##5#6")){getfield(Base.Broadcast, Symbol("##3#4"))}}}}},typeof(*)},Tuple{Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}},TrackedArray{…,CuArray{Float32,2}},Int64}},Tuple{Nothing,Flux.Tracker.Tracked{CuArray{Float32,2}},Nothing}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [25] back(::Flux.Tracker.Tracked{CuArray{Float32,2}}, ::CuArray{Float32,2}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:45
 [26] foreach at ./abstractarray.jl:1836 [inlined]
 [27] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##397#398")){TrackedArray{…,CuArray{Float32,2}}},Tuple{Flux.Tracker.Tracked{CuArray{Float32,2}}}}, ::Float32) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [28] back(::Flux.Tracker.Tracked{Float32}, ::Float32) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:43
 [29] foreach at ./abstractarray.jl:1836 [inlined]
 [30] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##178#179")),Tuple{Flux.Tracker.Tracked{Float32}}}, ::Float32) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [31] back(::Flux.Tracker.Tracked{Float32}, ::Float64) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:43
 [32] foreach at ./abstractarray.jl:1836 [inlined]
 [33] back_(::Flux.Tracker.Call{getfield(Flux.Tracker, Symbol("##203#206")){Int64},Tuple{Flux.Tracker.Tracked{Float32},Nothing}}, ::Float32) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:26
 [34] back(::Flux.Tracker.Tracked{Float32}, ::Int64) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:43
 [35] back!(::Flux.Tracker.TrackedReal{Float32}) at /home/jamesnorton/.julia/packages/Flux/oN61x/src/tracker/back.jl:62
 [36] #train!#121(::getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##11#12")),Int64}}, ::Function, ::Function, ::Array{Tuple{CuArray{Float32,4},Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}},1}, ::getfield(Flux.Optimise, Symbol("##43#47"))) at /home/jamesnorton/.julia/packages/Juno/46C8i/src/progress.jl:111
 [37] (::getfield(Flux.Optimise, Symbol("#kw##train!")))(::NamedTuple{(:cb,),Tuple{getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##11#12")),Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Array{Tuple{CuArray{Float32,4},Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}},1}, ::Function) at ./none:0
 [38] top-level scope at none:0
 [39] include at ./boot.jl:317 [inlined]
 [40] include_relative(::Module, ::String) at ./loading.jl:1044
 [41] include(::Module, ::String) at ./sysimg.jl:29
 [42] exec_options(::Base.JLOptions) at ./client.jl:231
 [43] _start() at ./client.jl:425
in expression starting at /home/jamesnorton/cats_n_dogs/mnist.jl:39


cifar10 fails on GPU

If I add using CuArrays to the top of the script, I get the following error:

$ julia vision/cifar10/cifar10.jl 
ERROR: LoadError: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel.

Stacktrace:
 [1] check_invocation(::CUDAnative.CompilerContext, ::LLVM.Function) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/validation.jl:35
 [2] compile(::CUDAnative.CompilerContext) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:94
 [3] #compile#109(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::VersionNumber, ::Any, ::Any) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:45
 [4] compile at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:43 [inlined]
 [5] #compile#108(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::CUDAdrv.CuDevice, ::Function, ::Any) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:18
 [6] compile at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:16 [inlined]
 [7] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:269 [inlined]
 [8] #cufunction#123(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::getfield(GPUArrays, Symbol("##23#24")), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}}) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:240
 [9] cufunction(::Function, ::Type) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:240
 [10] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:208 [inlined]
 [11] macro expansion at ./gcutils.jl:87 [inlined]
 [12] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:205 [inlined]
 [13] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{Bool,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CuArray{Int64,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /home/tyler/.julia/packages/CuArrays/qZCAt/src/gpuarray_interface.jl:59
 [14] gpu_call(::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{Bool,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CuArray{Int64,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}, ::Int64) at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/abstract_gpu_interface.jl:151
 [15] gpu_call at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/abstract_gpu_interface.jl:128 [inlined]
 [16] copyto! at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/broadcast.jl:48 [inlined]
 [17] copyto! at ./broadcast.jl:797 [inlined]
 [18] copy(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Tuple{Base.OneTo{Int64}},typeof(==),Tuple{CuArray{Int64,1},Array{Int64,1}}}) at ./broadcast.jl:773
 [19] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(==),Tuple{CuArray{Int64,1},Array{Int64,1}}}) at ./broadcast.jl:753
 [20] accuracy(::CuArray{Float32,4}, ::Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}) at /home/tyler/code/model-zoo/vision/cifar10/cifar10.jl:114
 [21] macro expansion at ./show.jl:555 [inlined]
 [22] (::getfield(Main, Symbol("##33#34")))() at /home/tyler/code/model-zoo/vision/cifar10/cifar10.jl:118
 [23] (::getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##33#34")),Int64})(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function) at /home/tyler/.julia/packages/Flux/lz7S9/src/utils.jl:120
 [24] throttled at /home/tyler/.julia/packages/Flux/lz7S9/src/utils.jl:116 [inlined]
 [25] macro expansion at /home/tyler/.julia/packages/Flux/lz7S9/src/optimise/train.jl:75 [inlined]
 [26] macro expansion at /home/tyler/.julia/packages/Juno/TfNYn/src/progress.jl:133 [inlined]
 [27] #train!#12(::getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##33#34")),Int64}}, ::Function, ::Function, ::Tracker.Params, ::Array{Tuple{CuArray{Float32,4},Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}},1}, ::ADAM) at /home/tyler/.julia/packages/Flux/lz7S9/src/optimise/train.jl:69
 [28] (::getfield(Flux.Optimise, Symbol("#kw##train!")))(::NamedTuple{(:cb,),Tuple{getfield(Flux, Symbol("#throttled#18")){getfield(Flux, Symbol("##throttled#10#14")){Bool,Bool,getfield(Main, Symbol("##33#34")),Int64}}}}, ::typeof(Flux.Optimise.train!), ::Function, ::Tracker.Params, ::Array{Tuple{CuArray{Float32,4},Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}},1}, ::ADAM) at ./none:0
 [29] top-level scope at none:0
 [30] include at ./boot.jl:326 [inlined]
 [31] include_relative(::Module, ::String) at ./loading.jl:1038
 [32] include(::Module, ::String) at ./sysimg.jl:29
 [33] exec_options(::Base.JLOptions) at ./client.jl:267
 [34] _start() at ./client.jl:436
in expression starting at /home/tyler/code/model-zoo/vision/cifar10/cifar10.jl:124

Edit: hm, same problem in vision/mnist/conv.jl. The code does hit the GPU for a couple of seconds, though, according to watch nvidia-smi.

$ julia vision/mnist/conv.jl 
[ Info: activating new environment at ~/code/model-zoo/cuda.
  Updating registry at `~/.julia/registries/General`
  Updating git-repo `https://github.com/JuliaRegistries/General.git`
 Resolving package versions...
[ Info: Loading data set
[ Info: Constructing model...
[ Info: Beginning training loop...
ERROR: LoadError: GPU compilation of #23(CuArrays.CuKernelState, CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global}, Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}) failed
KernelError: passing and using non-bitstype argument

Argument 4 to your kernel function is of type Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}.
That type is not isbits, and such arguments are only allowed when they are unused by the kernel.

Stacktrace:
 [1] check_invocation(::CUDAnative.CompilerContext, ::LLVM.Function) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/validation.jl:35
 [2] compile(::CUDAnative.CompilerContext) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:94
 [3] #compile#109(::Bool, ::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::VersionNumber, ::Any, ::Any) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:45
 [4] compile at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:43 [inlined]
 [5] #compile#108(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::CUDAdrv.CuDevice, ::Function, ::Any) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:18
 [6] compile at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/compiler/driver.jl:16 [inlined]
 [7] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:269 [inlined]
 [8] #cufunction#123(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::typeof(CUDAnative.cufunction), ::getfield(GPUArrays, Symbol("##23#24")), ::Type{Tuple{CuArrays.CuKernelState,CUDAnative.CuDeviceArray{Bool,1,CUDAnative.AS.Global},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CUDAnative.CuDeviceArray{Int64,1,CUDAnative.AS.Global},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}}) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:240
 [9] cufunction(::Function, ::Type) at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:240
 [10] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:208 [inlined]
 [11] macro expansion at ./gcutils.jl:87 [inlined]
 [12] macro expansion at /home/tyler/.julia/packages/CUDAnative/PFgO3/src/execution.jl:205 [inlined]
 [13] _gpu_call(::CuArrays.CuArrayBackend, ::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{Bool,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CuArray{Int64,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}, ::Tuple{Tuple{Int64},Tuple{Int64}}) at /home/tyler/.julia/packages/CuArrays/qZCAt/src/gpuarray_interface.jl:59
 [14] gpu_call(::Function, ::CuArray{Bool,1}, ::Tuple{CuArray{Bool,1},Base.Broadcast.Broadcasted{Nothing,Tuple{Base.OneTo{Int64}},typeof(==),Tuple{Base.Broadcast.Extruded{CuArray{Int64,1},Tuple{Bool},Tuple{Int64}},Base.Broadcast.Extruded{Array{Int64,1},Tuple{Bool},Tuple{Int64}}}}}, ::Int64) at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/abstract_gpu_interface.jl:151
 [15] gpu_call at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/abstract_gpu_interface.jl:128 [inlined]
 [16] copyto! at /home/tyler/.julia/packages/GPUArrays/t8tJB/src/broadcast.jl:48 [inlined]
 [17] copyto! at ./broadcast.jl:797 [inlined]
 [18] copy(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Tuple{Base.OneTo{Int64}},typeof(==),Tuple{CuArray{Int64,1},Array{Int64,1}}}) at ./broadcast.jl:773
 [19] materialize(::Base.Broadcast.Broadcasted{Base.Broadcast.ArrayStyle{CuArray},Nothing,typeof(==),Tuple{CuArray{Int64,1},Array{Int64,1}}}) at ./broadcast.jl:753
 [20] accuracy(::CuArray{Float32,4}, ::Flux.OneHotMatrix{CuArray{Flux.OneHotVector,1}}) at /home/tyler/code/model-zoo/vision/mnist/conv.jl:85
 [21] top-level scope at /home/tyler/code/model-zoo/vision/mnist/conv.jl:100 [inlined]
 [22] top-level scope at ./none:0
 [23] include at ./boot.jl:326 [inlined]
 [24] include_relative(::Module, ::String) at ./loading.jl:1038
 [25] include(::Module, ::String) at ./sysimg.jl:29
 [26] exec_options(::Base.JLOptions) at ./client.jl:267
 [27] _start() at ./client.jl:436
in expression starting at /home/tyler/code/model-zoo/vision/mnist/conv.jl:94

I'm not sure if this is a problem in the model zoo or elsewhere. I'm on CUDA 10.0 and CUDNN 7.3.1. Here are the test results from CuArrays:

Test Summary:                          | Pass  Error  Total
CuArrays                               | 4011      4   4015
  GPUArrays test suite                 | 1023          1023
  Memory                               |    5             5
  Array                                |   19            19
  Adapt                                |    2             2
  Broadcast                            |   10            10
  Cufunc                               |    6             6
  Ref Broadcast                        |    1             1
  Broadcast Fix                        |    4             4
  Reduce                               |    6             6
  0D                                   |    2             2
  Slices                               |   17            17
  Reshape                              |    1             1
  LinearAlgebra.triu! with diagonal -2 |    1             1
  LinearAlgebra.triu! with diagonal -1 |    1             1
  LinearAlgebra.triu! with diagonal 0  |    1             1
  LinearAlgebra.triu! with diagonal 1  |    1             1
  LinearAlgebra.triu! with diagonal 2  |    1             1
  LinearAlgebra.tril! with diagonal -2 |    1             1
  LinearAlgebra.tril! with diagonal -1 |    1             1
  LinearAlgebra.tril! with diagonal 0  |    1             1
  LinearAlgebra.tril! with diagonal 1  |    1             1
  LinearAlgebra.tril! with diagonal 2  |    1             1
  Utilities                            |    2             2
  accumulate                           |    8             8
  logical indexing                     |   15            15
  CUDNN                                |   41            41
  CUBLAS                               | 1164          1164
  CUSPARSE                             | 1140          1140
  CUSOLVER                             |  269      4    273
    elty = Float32                     |   67      1     68
      Cholesky (po)                    |    8             8
      getrf!                           |           1      1
      getrs!                           |    3             3
      geqrf!                           |    1             1
      ormqr!                           |    4             4
      orgqr!                           |    2             2
      sytrf!                           |    4             4
      gebrd!                           |    4             4
      syevd!                           |    3             3
      sygvd!                           |    5             5
      syevj!                           |    4             4
      svd with QRAlgorithm method      |    7             7
      svd with QRAlgorithm method      |    1             1
      svd with JacobiAlgorithm method  |    7             7
      svd with JacobiAlgorithm method  |    7             7
      qr                               |    7             7
    elty = Float64                     |   67      1     68
      Cholesky (po)                    |    8             8
      getrf!                           |           1      1
      getrs!                           |    3             3
      geqrf!                           |    1             1
      ormqr!                           |    4             4
      orgqr!                           |    2             2
      sytrf!                           |    4             4
      gebrd!                           |    4             4
      syevd!                           |    3             3
      sygvd!                           |    5             5
      syevj!                           |    4             4
      svd with QRAlgorithm method      |    7             7
      svd with QRAlgorithm method      |    1             1
      svd with JacobiAlgorithm method  |    7             7
      svd with JacobiAlgorithm method  |    7             7
      qr                               |    7             7
    elty = Complex{Float32}            |   67      1     68
      Cholesky (po)                    |    8             8
      getrf!                           |           1      1
      getrs!                           |    3             3
      geqrf!                           |    1             1
      ormqr!                           |    4             4
      orgqr!                           |    2             2
      sytrf!                           |    4             4
      gebrd!                           |    4             4
      syevd!                           |    3             3
      sygvd!                           |    5             5
      syevj!                           |    4             4
      svd with QRAlgorithm method      |    7             7
      svd with QRAlgorithm method      |    1             1
      svd with JacobiAlgorithm method  |    7             7
      svd with JacobiAlgorithm method  |    7             7
      qr                               |    7             7
    elty = Complex{Float64}            |   67      1     68
      Cholesky (po)                    |    8             8
      getrf!                           |           1      1
      getrs!                           |    3             3
      geqrf!                           |    1             1
      ormqr!                           |    4             4
      orgqr!                           |    2             2
      sytrf!                           |    4             4
      gebrd!                           |    4             4
      syevd!                           |    3             3
      sygvd!                           |    5             5
      syevj!                           |    4             4
      svd with QRAlgorithm method      |    7             7
      svd with QRAlgorithm method      |    1             1
      svd with JacobiAlgorithm method  |    7             7
      svd with JacobiAlgorithm method  |    7             7
      qr                               |    7             7
  CUFFT                                |  150           150
  CURAND                               |   32            32
  CUSPARSE + CUSOLVER                  |   84            84
ERROR: LoadError: Some tests did not pass: 4011 passed, 0 failed, 4 errored, 0 broken.
in expression starting at /home/tyler/.julia/packages/CuArrays/qZCAt/test/runtests.jl:25
ERROR: Package CuArrays errored during testing

And here are the test results for Flux:

[ Info: Testing Flux/CUDNN
batch_size = 1: Error During Test at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
  Got exception outside of a @test
  CUDNNError(code 3, CUDNN_STATUS_BAD_PARAM)
  Stacktrace:
   [1] macro expansion at /home/tyler/.julia/packages/CuArrays/qZCAt/src/dnn/error.jl:19 [inlined]
   [2] cudnnRNNBackwardData(::Flux.CUDA.RNNDesc{Float32}, ::Int64, ::Array{CuArrays.CUDNN.TensorDesc,1}, ::CuArray{Float32,1}, ::Array{CuArrays.CUDNN.TensorDesc,1}, ::CuArray{Float32,1}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,1}, ::Ptr{Nothing}, ::CUDAdrv.CuPtr{Nothing}, ::CuArrays.CUDNN.FilterDesc, ::CuArray{Float32,1}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,1}, ::Ptr{Nothing}, ::CUDAdrv.CuPtr{Nothing}, ::Array{CuArrays.CUDNN.TensorDesc,1}, ::CuArray{Float32,1}, ::CuArrays.CUDNN.TensorDesc, ::CuArray{Float32,1}, ::Ptr{Nothing}, ::CUDAdrv.CuPtr{Nothing}, ::CuArray{UInt8,1}, ::CuArray{UInt8,1}) at /home/tyler/.julia/packages/Flux/lz7S9/src/cuda/curnn.jl:170
   [3] backwardData(::Flux.CUDA.RNNDesc{Float32}, ::CuArray{Float32,1}, ::CuArray{Float32,1}, ::CuArray{Float32,1}, ::Nothing, ::CuArray{Float32,1}, ::Nothing, ::CuArray{UInt8,1}) at /home/tyler/.julia/packages/Flux/lz7S9/src/cuda/curnn.jl:187
   [4] backwardData(::Flux.CUDA.RNNDesc{Float32}, ::CuArray{Float32,1}, ::CuArray{Float32,1}, ::CuArray{Float32,1}, ::CuArray{Float32,1}, ::CuArray{UInt8,1}) at /home/tyler/.julia/packages/Flux/lz7S9/src/cuda/curnn.jl:195
   [5] (::getfield(Flux.CUDA, Symbol("##8#9")){Flux.GRUCell{TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,1}}},TrackedArray{…,CuArray{Float32,1}},TrackedArray{…,CuArray{Float32,1}},CuArray{UInt8,1},Tuple{CuArray{Float32,1},CuArray{Float32,1}}})(::Tuple{CuArray{Float32,1},CuArray{Float32,1}}) at /home/tyler/.julia/packages/Flux/lz7S9/src/cuda/curnn.jl:306
   [6] back_(::Tracker.Call{getfield(Flux.CUDA, Symbol("##8#9")){Flux.GRUCell{TrackedArray{…,CuArray{Float32,2}},TrackedArray{…,CuArray{Float32,1}}},TrackedArray{…,CuArray{Float32,1}},TrackedArray{…,CuArray{Float32,1}},CuArray{UInt8,1},Tuple{CuArray{Float32,1},CuArray{Float32,1}}},Tuple{Tracker.Tracked{CuArray{Float32,1}},Tracker.Tracked{CuArray{Float32,1}},Tracker.Tracked{CuArray{Float32,2}},Tracker.Tracked{CuArray{Float32,2}},Tracker.Tracked{CuArray{Float32,1}}}}, ::Tuple{CuArray{Float32,1},CuArray{Float32,1}}, ::Bool) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:35
   [7] back(::Tracker.Tracked{Tuple{CuArray{Float32,1},CuArray{Float32,1}}}, ::Tuple{CuArray{Float32,1},Int64}, ::Bool) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:58
   [8] (::getfield(Tracker, Symbol("##13#14")){Bool})(::Tracker.Tracked{Tuple{CuArray{Float32,1},CuArray{Float32,1}}}, ::Tuple{CuArray{Float32,1},Int64}) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:38
   [9] foreach(::Function, ::Tuple{Tracker.Tracked{Tuple{CuArray{Float32,1},CuArray{Float32,1}}},Nothing}, ::Tuple{Tuple{CuArray{Float32,1},Int64},Nothing}) at ./abstractarray.jl:1867
   [10] back_(::Tracker.Call{getfield(Tracker, Symbol("##356#358")){Tracker.TrackedTuple{Tuple{CuArray{Float32,1},CuArray{Float32,1}}},Int64},Tuple{Tracker.Tracked{Tuple{CuArray{Float32,1},CuArray{Float32,1}}},Nothing}}, ::CuArray{Float32,1}, ::Bool) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:38
   [11] back(::Tracker.Tracked{CuArray{Float32,1}}, ::CuArray{Float32,1}, ::Bool) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:58
   [12] back!(::TrackedArray{…,CuArray{Float32,1}}, ::CuArray{Float32,1}) at /home/tyler/.julia/packages/Tracker/6wcYJ/src/back.jl:77
   [13] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:23
   [14] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
   [15] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
   [16] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
   [17] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
   [18] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
   [19] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
   [20] include at ./boot.jl:326 [inlined]
   [21] include_relative(::Module, ::String) at ./loading.jl:1038
   [22] include(::Module, ::String) at ./sysimg.jl:29
   [23] include(::String) at ./client.jl:403
   [24] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/cuda.jl:45
   [25] include at ./boot.jl:326 [inlined]
   [26] include_relative(::Module, ::String) at ./loading.jl:1038
   [27] include(::Module, ::String) at ./sysimg.jl:29
   [28] include(::String) at ./client.jl:403
   [29] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/runtests.jl:30
   [30] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
   [31] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/runtests.jl:11
   [32] include at ./boot.jl:326 [inlined]
   [33] include_relative(::Module, ::String) at ./loading.jl:1038
   [34] include(::Module, ::String) at ./sysimg.jl:29
   [35] include(::String) at ./client.jl:403
   [36] top-level scope at none:0
   [37] eval(::Module, ::Any) at ./boot.jl:328
   [38] exec_options(::Base.JLOptions) at ./client.jl:243
   [39] _start() at ./client.jl:436
batch_size = 5: Test Failed at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:26
  Expression: ((rnn.cell).Wi).grad ≈ collect(((curnn.cell).Wi).grad)
   Evaluated: Float32[0.0264221 0.0311623 … 0.0401529 0.0481648; -0.00580488 -0.00684245 … -0.000633113 -0.00512767; … ; -0.488995 -0.397307 … -0.543058 -0.593239; -1.59764 -1.72794 … -1.24722 -2.0625] ≈ Float32[0.0163969 0.0200817 … 0.0357488 0.036319; -0.00383122 -0.00466102 … 0.000233916 -0.00279561; … ; -0.655613 -0.581466 … -0.616253 -0.790114; -1.32217 -1.42347 … -1.1262 -1.737]
Stacktrace:
 [1] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:26
 [2] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [3] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
 [4] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [5] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
 [6] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
 [7] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
batch_size = 5: Test Failed at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:27
  Expression: ((rnn.cell).Wh).grad ≈ collect(((curnn.cell).Wh).grad)
   Evaluated: Float32[-0.0155654 0.00326925 … -0.0217709 -0.00794538; 0.00331486 -0.000961465 … 0.00208863 -0.000682507; … ; 0.0600955 -0.0379895 … 0.0877803 0.056216; 0.13519 -0.049922 … 0.1269 0.0464798] ≈ Float32[-0.0102084 0.00063507 … -0.0177858 -0.00776742; 0.00226024 -0.000442876 … 0.00130408 -0.000717541; … ; 0.0800566 -0.047805 … 0.10263 0.0568791; 0.120198 -0.04255 … 0.115747 0.0459818]
Stacktrace:
 [1] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:27
 [2] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [3] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
 [4] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [5] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
 [6] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
 [7] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
batch_size = 5: Test Failed at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:28
  Expression: ((rnn.cell).b).grad ≈ collect(((curnn.cell).b).grad)
   Evaluated: Float32[0.0431294, -0.00524064, 0.00328069, 0.0383973, -0.0277038, -0.190451, 0.378964, 0.113537, -1.10465, -0.109885, -0.380812, -0.806893, -0.591657, -0.840546, -2.16732] ≈ Float32[0.0309576, -0.0028444, 0.00784941, 0.0508012, -0.0251556, -0.116478, 0.196908, 0.156512, -1.30342, -0.109, -0.0338723, -0.537076, -0.840815, -1.04284, -1.83286]
Stacktrace:
 [1] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:28
 [2] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [3] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
 [4] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [5] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
 [6] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
 [7] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
batch_size = 5: Test Failed at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:29
  Expression: ((rnn.cell).h).grad ≈ collect(((curnn.cell).h).grad)
   Evaluated: Float32[-0.0128217, -0.706767, -0.358611, -1.41596, -0.547122] ≈ Float32[0.0667142, -0.127551, -0.454483, -1.79769, -0.414044]
Stacktrace:
 [1] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:29
 [2] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [3] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:7
 [4] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1156
 [5] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
 [6] top-level scope at /build/source/usr/share/julia/stdlib/v1.1/Test/src/Test.jl:1083
 [7] top-level scope at /home/tyler/.julia/packages/Flux/lz7S9/test/cuda/curnn.jl:4
Test Summary:        | Pass  Fail  Error  Total
Flux                 |  246     4      1    251
  Throttle           |   11                  11
  Jacobian           |    1                   1
  Initialization     |   12                  12
  Params             |    2                   2
  Basic Stacking     |    1                   1
  Precision          |    6                   6
  Stacking           |    3                   3
  onecold            |    4                   4
  Optimise           |   11                  11
  Optimiser          |    3                   3
  Training Loop      |    2                   2
  basic              |   25                  25
  Dropout            |    8                   8
  BatchNorm          |   14                  14
  InstanceNorm       |   16                  16
  GroupNorm          |   16                  16
  losses             |   30                  30
  Pooling            |    2                   2
  CNN                |    1                   1
  Depthwise Conv     |    4                   4
  Tracker            |    4                   4
  CuArrays           |    8                   8
  CUDNN BatchNorm    |   10                  10
  RNN                |   40     4      1     45
    R = Flux.RNN     |   16                  16
    R = Flux.GRU     |    6     4      1     11
      batch_size = 1 |    2            1      3
      batch_size = 5 |    4     4             8
    R = Flux.LSTM    |   18                  18
ERROR: LoadError: Some tests did not pass: 246 passed, 4 failed, 1 errored, 0 broken.
in expression starting at /home/tyler/.julia/packages/Flux/lz7S9/test/runtests.jl:9
ERROR: Package Flux errored during testing

This appears to be a known issue: FluxML/Flux.jl#267

conversion to pointer not defined for CuArray{Float32,2}

With the line `using CuArrays` uncommented, I get the following error:

# julia conv.jl 
ERROR: LoadError: conversion to pointer not defined for CuArray{Float32,2}
Stacktrace:
 [1] #conv2d!#43(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Int64, ::Float32, ::Function, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at /root/.julia/v0.6/NNlib/src/impl/conv.jl:174
 [2] (::NNlib.#kw##conv2d!)(::Array{Any,1}, ::NNlib.#conv2d!, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./<missing>:0
 [3] (::NNlib.#kw##conv!)(::Array{Any,1}, ::NNlib.#conv!, ::CuArray{Float32,4}, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./<missing>:0
 [4] #conv#53(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Function, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at /root/.julia/v0.6/NNlib/src/conv.jl:29
 [5] (::NNlib.#kw##conv)(::Array{Any,1}, ::NNlib.#conv, ::CuArray{Float32,4}, ::CuArray{Float32,4}) at ./<missing>:0
 [6] track(::Flux.Tracker.Call{Flux.Tracker.#_conv,Tuple{CuArray{Float32,4},TrackedArray{…,CuArray{Float32,4}},Tuple{Int64,Int64},Tuple{Int64,Int64}}}) at /root/.julia/v0.6/Flux/src/tracker/Tracker.jl:41
 [7] #conv#23(::Tuple{Int64,Int64}, ::Tuple{Int64,Int64}, ::Function, ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}) at /root/.julia/v0.6/Flux/src/tracker/array.jl:252
 [8] (::NNlib.#kw##conv)(::Array{Any,1}, ::NNlib.#conv, ::CuArray{Float32,4}, ::TrackedArray{…,CuArray{Float32,4}}) at ./<missing>:0
 [9] (::Flux.Conv{2,NNlib.#relu,TrackedArray{…,CuArray{Float32,4}},TrackedArray{…,CuArray{Float32,1}}})(::CuArray{Float32,4}) at /root/.julia/v0.6/Flux/src/layers/conv.jl:39
 [10] mapfoldl_impl(::Base.#identity, ::Flux.##81#82, ::CuArray{Float32,4}, ::Array{Any,1}, ::Int64) at ./reduce.jl:43
 [11] (::Flux.Chain)(::CuArray{Float32,4}) at /root/.julia/v0.6/Flux/src/layers/basic.jl:31
 [12] include_from_node1(::String) at ./loading.jl:576
 [13] include(::String) at ./sysimg.jl:14
 [14] process_options(::Base.JLOptions) at ./client.jl:305
 [15] _start() at ./client.jl:371
while loading model-zoo/mnist/conv.jl, in expression starting on line 30

Better output of test prediction in "Housing" model

When the housing.jl file is run, the last line is not printed to the console (it outputs correctly when run line by line).
Also, the test example is currently the first record in the dataset; I think it would be better to use a random record from the dataset for the prediction.
All of this can be implemented in 2–3 lines of code, as sketched below.
I can work on this if no one else is already working on it.
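
A minimal sketch of those lines (the names `model`, `test_x`, and `test_y` are my assumptions, not necessarily those used in housing.jl):

# pick a random test record and print the prediction explicitly,
# so it also shows when the file is run as a script
i = rand(1:size(test_x, 2))
println("prediction: ", model(test_x[:, i]), "  actual: ", test_y[i])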

kl-divergence formula in vae seems wrong

The KL divergence between q = N(μ, σ²) and the standard normal p = N(0, 1) should be

kl_q_p(μ, logσ) = 0.5 * sum(σ.^2 .+ μ.^2 .- 1 .- log.(σ.^2))

but the formula in vae is

kl_q_p(μ, logσ) = 0.5 * sum(exp.(2 .* logσ) + μ.^2 .- 1 .+ logσ.^2)

it should be

kl_q_p(μ, logσ) = 0.5 * sum(exp.(2 .* logσ) + μ.^2 .- 1 .- 2 .* logσ) 

i.e., the last term should change from `+ logσ.^2` to `- 2 .* logσ`.
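
A quick numeric sanity check of the corrected formula (my addition, plain Julia): compare the closed form against a Monte Carlo estimate of KL(q ‖ p).

using Statistics

μ, logσ = 0.7, -0.3
σ = exp(logσ)

# corrected closed form: 0.5 * (σ² + μ² - 1 - 2logσ)
kl_closed = 0.5 * (exp(2logσ) + μ^2 - 1 - 2logσ)

# Monte Carlo estimate of E_q[log q(z) - log p(z)], with z ~ N(μ, σ²)
z = μ .+ σ .* randn(10^6)
logq = @. -0.5 * ((z - μ) / σ)^2 - logσ - 0.5 * log(2π)
logp = @. -0.5 * z^2 - 0.5 * log(2π)
kl_mc = mean(logq - logp)

@show kl_closed kl_mc   # these agree to a few decimal places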

GSoC Proposal

I want to participate in GSoC, make a contribution to the Flux.jl model zoo, and implement the following algorithms:

  • RBMs
  • Siamese nets
  • SARSA(λ)
  • TD(λ)
  • An implementation of a sequence-to-sequence network
  • PixelRNN
  • Attention network for generating captions
  • GRUs
  • An easy-to-call implementation of all the learning updates
  • VAE
  • DCGANs
  • DenseNet

And apart from model implementations, I would like to propose a tutorial repo under Flux.jl in which each implementation will be explained in a Jupyter notebook.

@MikeInnes is this a good enough project for GSoC? I would be glad to hear any feedback from you!

word based language model with word embeddings

Hello,

I want to build a word-level language model, so I was reading the char-rnn code in the zoo. In that code, one-hot encodings are used to represent the characters. I was wondering how we would create word embeddings and feed them to the same network, and how the softmax would handle word-embedding vectors.

Here is the code that I "wrote":

using Flux
using Flux: batchseq, chunk
using Base.Iterators: partition

lines = readlines("input.txt")  # assumed corpus, one sentence per line

@time sentences = [map(word -> String(word), split(line, " ")) for line in lines]
@time vocabulary = sort(unique(vcat(sentences...)))
# adding 'eos' as end-of-sentence tag
in("eos", vocabulary) || push!(vocabulary, "eos")

# if we create one-hot vectors then we will have a 65k x 252k binary matrix! Too much!
# @time hots = map(word -> Flux.onehot(word, vocabulary), vcat(sentences...));
# the alternative is to create word embeddings!

N = length(vocabulary)
embedSize = 128

dictionary = Dict{String,Int32}()
@time for (v, k) in enumerate(vocabulary)
    dictionary[k] = v
end
wordEmbeddings = rand(Float32, embedSize, N)  # each column refers to one word in the vocabulary
# wordEmbeddings[:, dictionary[word]] gives the embedding of that specific word

seqlen = 50
nbatch = 50

eos = wordEmbeddings[:, dictionary["eos"]]

@time text = map(sentence -> map(word -> wordEmbeddings[:, dictionary[word]], sentence), sentences)
text = vcat(text...)

Xs = collect(partition(batchseq(chunk(text, nbatch), eos), seqlen))
Ys = collect(partition(batchseq(chunk(text[2:end], nbatch), eos), seqlen))

# this is where we create the model, a one-layer GRU; note that its input
# size must be embedSize (the embedding dimension), not N
m = Chain(
  GRU(embedSize, 128),
  Dense(128, N),
  softmax)
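
One possible way to make the embeddings trainable (my sketch, not an answer from this thread, assuming an older Tracker-era Flux where `param` marks trainable arrays): multiplying the embedding matrix by a one-hot vector selects a column, so the matrix acts as a lookup table, and the final Dense(128, N) plus softmax still produces a distribution over the N vocabulary words.

using Flux
using Flux: onehot

embed = param(randn(Float32, embedSize, N))   # trainable embedding matrix
m = Chain(x -> embed * x,                     # one-hot input selects a column
          GRU(embedSize, 128),
          Dense(128, N),
          softmax)

m(onehot("eos", vocabulary))                  # a distribution over N words
# NB: when training, `embed` must be included in the parameter list,
# since `params(m)` cannot see inside the closure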

Run everything on GPUs

There should be an easy way to run the whole model zoo on GPUs: some kind of flag at the top of each file, or perhaps an environment variable.
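
A rough sketch of what such a switch might look like (my illustration; MODEL_ZOO_GPU is a made-up variable name):

using Flux

# opt in to GPU execution via an environment variable; Flux's `gpu`
# falls back to a no-op when CUDA support isn't loaded
const USE_GPU = get(ENV, "MODEL_ZOO_GPU", "false") == "true"
maybegpu(x) = USE_GPU ? gpu(x) : x

m = maybegpu(Chain(Dense(28^2, 32, relu), Dense(32, 10), softmax))
x = maybegpu(rand(Float32, 28^2, 16))
m(x)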

Bad results from char-rnn after CPU training

I might be doing things the wrong way, but after playing with the char-rnn example for some time, I wonder whether other people can get good performance from it. It is based on a blog post that shows impressive text-generation capabilities after training, and char-rnn.jl seems to be trying to replicate them, but my results are nowhere near those of the blog post. Are people tweaking the example code to get better results, or can anyone get good results with the "vanilla" settings?

GSoC Proposal 2018

I am interested in contributing to the model zoo as part of GSoC. Below are the models I want to implement over the course of the summer:

1. KNN
2. Apriori
3. Naive Bayes Classification
4. Association Rules
5. Q-Learning
6. K-means
7. Deep Q Network
8. Genetic algorithm

@MikeInnes Is this project good enough for GSoC?

The MNIST model needs nursing

When running the mnist.jl example (by simply copying everything into the REPL), I get the following:

julia> using Flux, MNIST

julia> using Flux: onehotbatch, argmax, mse, throttle

julia> using Base.Iterators: repeated

julia> x, y = traindata()
WARNING: Array{T}(::Type{T}, m::Int, n::Int) is deprecated, use Array{T}(m, n) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] Array(::Type{Float64}, ::Int64, ::Int64) at ./deprecated.jl:57
 [3] traindata() at /home/troels/.julia/v0.6/MNIST/src/MNIST.jl:88
 [4] eval(::Module, ::Any) at ./boot.jl:235
 [5] eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:66
 [6] macro expansion at ./REPL.jl:97 [inlined]
 [7] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73
while loading no file, in expression starting on line 0
WARNING: Array{T}(::Type{T}, m::Int) is deprecated, use Array{T}(m) instead.
Stacktrace:
 [1] depwarn(::String, ::Symbol) at ./deprecated.jl:70
 [2] Array(::Type{Float64}, ::Int64) at ./deprecated.jl:57
 [3] traindata() at /home/troels/.julia/v0.6/MNIST/src/MNIST.jl:89
 [4] eval(::Module, ::Any) at ./boot.jl:235
 [5] eval_user_input(::Any, ::Base.REPL.REPLBackend) at ./REPL.jl:66
 [6] macro expansion at ./REPL.jl:97 [inlined]
 [7] (::Base.REPL.##1#2{Base.REPL.REPLBackend})() at ./event.jl:73
while loading no file, in expression starting on line 0
([0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0; … ; 0.0 0.0 … 0.0 0.0; 0.0 0.0 … 0.0 0.0], [5.0, 0.0, 4.0, 1.0, 9.0, 2.0, 1.0, 3.0, 1.0, 4.0  …  9.0, 2.0, 9.0, 5.0, 1.0, 8.0, 3.0, 5.0, 6.0, 8.0])

julia> y = onehotbatch(y, 0:9)
10×60000 Flux.OneHotMatrix:
 false   true  false  false  false  false  false  …  false  false  false  false  false  false  false
 false  false  false   true  false  false   true     false   true  false  false  false  false  false
 false  false  false  false  false   true  false     false  false  false  false  false  false  false
 false  false  false  false  false  false  false     false  false  false   true  false  false  false
 false  false   true  false  false  false  false     false  false  false  false  false  false  false
  true  false  false  false  false  false  false  …   true  false  false  false   true  false  false
 false  false  false  false  false  false  false     false  false  false  false  false   true  false
 false  false  false  false  false  false  false     false  false  false  false  false  false  false
 false  false  false  false  false  false  false     false  false   true  false  false  false   true
 false  false  false  false   true  false  false     false  false  false  false  false  false  false

julia> m = Chain(
                Dense(28^2, 32, σ),
                Dense(32, 10),
                softmax)
Chain(Dense(784, 32, NNlib.σ), Dense(32, 10), NNlib.softmax)

julia> loss(x, y) = mse(m(x), y)
loss (generic function with 1 method)

julia> Flux.train!(loss, repeated((x,y), 1000), SGD(params(m), 0.1),
                          cb = throttle(() -> @show(loss(x, y)), 5))
loss(x, y) = param(0.0899937)
loss(x, y) = param(0.0896123)
loss(x, y) = param(0.0887763)
loss(x, y) = param(0.0872013)
loss(x, y) = param(0.084333)
loss(x, y) = param(0.0793369)
loss(x, y) = param(0.0737908)
loss(x, y) = param(0.0664382)
loss(x, y) = param(0.0584718)
loss(x, y) = param(0.0506686)
loss(x, y) = param(0.0428273)
loss(x, y) = param(NaN)
loss(x, y) = param(NaN)
loss(x, y) = param(NaN)

... after which I terminated the run.

As you can see (in addition to the warnings), the training somehow fails: the loss collapses to NaN. (This happens after some time, every time I've repeated the process.)
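
For what it's worth, a possible mitigation (my guess, not a confirmed diagnosis): mse on softmax outputs can be numerically fragile here, and pairing softmax with crossentropy is the more usual choice. Continuing the REPL session above:

using Flux: crossentropy

# same training loop as above, but with a loss better matched to softmax outputs
loss(x, y) = crossentropy(m(x), y)
Flux.train!(loss, repeated((x, y), 1000), SGD(params(m), 0.1),
            cb = throttle(() -> @show(loss(x, y)), 5))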

spell out acronyms in README.md

A friendly suggestion: Could you spell out the acronyms in the README.md? Like "RNN (Recurrent Neural Net)" or vice versa?

This would make the page much more comprehensible to people who are not already steeped in machine learning.

GSoC Proposal

I am interested in contributing to the model zoo of FluxML as part of GSoC'18. Below are the models which I want to implement over the course of summer:

@MikeInnes Does it sound like a good project to work on?

Next steps for notebooks and tutorials

With #59 we have good infrastructure for generating notebooks. There are a couple more steps we need:

  • Content. This could be taken from existing open-source tutorials (#68) and we should also port existing notebooks like these and these.
  • Auto build notebooks – we can do this on our CI, like Documenter, and push them to a branch of this repo.
  • JuliaBox – really just the same as the above, but push them to the JuliaBoxTutorials repo, which gets synced to JuliaBox users' drives.
  • Flux Website – we should have a tutorials section on the website, built from these notebooks. We'll need to build HTML from the notebooks we have (ideally with niceties like ToC, but that's secondary) and push them to the Flux website's github pages.

Useful resources:

Dataset, DataLoaders, transforms for julia

Hi,
I am a GSoC-19 aspirant. While going through some of the code in the model zoo, I noticed that data is currently loaded in an ad-hoc manner. I see that MLDatasets tries to improve upon this, but I was wondering if we could have something like PyTorch's Datasets, DataLoaders, and Transforms. I can work on that, and I can also contribute to the zoo by adding examples from the zero/few-shot learning domain and by creating wrappers around some common generative models, such as mixture models and VAEs, plus a general wrapper for GAN/WGAN/WGAN-GP that one could call with custom generators and discriminators.

I see this coming off as a single tutorial demonstrating the use of the loaders and the train/test APIs (with some interesting few-shot learning examples).

Thoughts?

@ViralBShah @staticfloat @avik-pal @dhairyagandhi96
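
(Flux did later gain Flux.Data.DataLoader, and MLUtils now provides DataLoader too.) As a sketch of the idea, a toy PyTorch-style loader in plain Julia could look like this (my illustration, not an existing API):

using Random

# an iterator over shuffled mini-batches of (X, Y), samples stored column-wise
struct DataLoader{X,Y}
    x::X
    y::Y
    batchsize::Int
end

function Base.iterate(d::DataLoader, idxs = randperm(size(d.x, 2)))
    isempty(idxs) && return nothing
    n = min(d.batchsize, length(idxs))
    batch, rest = idxs[1:n], idxs[n+1:end]
    (d.x[:, batch], d.y[:, batch]), rest
end

# usage: for (xb, yb) in DataLoader(X, Y, 128) ... end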

No decrease in loss when using model without chain

I am using the same code as in mlp.jl, but I kept the same model without using Chain; it looks like this:

l1 = Dense(28^2,32, relu)
l2 = Dense(32, 10)
m(x) = softmax(l2(l1(x)))

Now whenever I run the model, the loss doesn't decrease during training.
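
A likely cause (my assumption; not confirmed in the thread): m here is a plain function, so params(m) cannot discover the weights of l1 and l2, and the optimiser has nothing to update. Collecting the parameters from the layers themselves should restore training:

using Flux

l1 = Dense(28^2, 32, relu)
l2 = Dense(32, 10)
m(x) = softmax(l2(l1(x)))

# collect the trainable parameters from the layers directly, since the
# closure `m` is opaque to `params`
ps = Flux.params(l1, l2)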

In MNIST MLP, `argmax` and `mean` problems

In model-zoo/vision/mnist/mlp.jl

accuracy(x, y) = mean(argmax(m(x)) .== argmax(y))

Throws a warning:

Warning: `argmax(...)` is deprecated, use `onecold(...)` instead.

The suggested `onecold` replacement is missing. And calling it throws an error:

ERROR: UndefVarError: mean not defined

Adding `using Statistics` fixes this issue; perhaps it is lacking somewhere?
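
Putting both fixes together, the accuracy definition becomes:

using Statistics: mean
using Flux: onecold

accuracy(x, y) = mean(onecold(m(x)) .== onecold(y))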

ForwardDiff 0.5.0 required by Flux.jl, but version 0.10.1 used in Manifest.toml

In cifar10/Manifest.toml, the ForwardDiff version is 0.10.1, but the Flux package requires version 0.5.0 of it to work. Running the project code (after instantiating the project with `instantiate`, which of course uses Manifest.toml) gives an error saying Float32() cannot be run on a ForwardDiff object.
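
For anyone reproducing this, a small sketch (standard Pkg commands, not taken from the report) to confirm which ForwardDiff version the manifest pins:

using Pkg
Pkg.activate(".")            # from inside the cifar10 folder
Pkg.instantiate()            # installs exactly what Manifest.toml pins
Pkg.status("ForwardDiff")    # should report v0.10.1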
