
Comments (10)

denizyuret avatar denizyuret commented on June 2, 2024

from autograd.jl.

ekinakyurek avatar ekinakyurek commented on June 2, 2024


ekinakyurek avatar ekinakyurek commented on June 2, 2024

Yeah, full works for me! The problem with this interface, though, is that you don't know in advance what your gradient type will be.


denizyuret avatar denizyuret commented on June 2, 2024


ekinakyurek avatar ekinakyurek commented on June 2, 2024

I realized that this has broken Knet. When you use the Adam optimizer with gclip and get a Sparse gradient, gclip fails.


denizyuret avatar denizyuret commented on June 2, 2024

I can't replicate this; the following works fine. Please provide a minimal example.

using Knet

# Load data (mnistdata basically replicates mnist.ipynb)
include(Knet.dir("data","mnist.jl"))
dtrn,dtst = mnistdata(xsize=(784,:),xtype=Array)

struct Foo; w; end

model = Foo(param(10,784))

# We turn Foo instances into callable objects for prediction:
(m::Foo)(x) = (I = (a->a[1]).(vec(argmax(x,dims=1))); m.w[:,I])

# model(x) gives predictions, let model(x,y) give the loss
(m::Foo)(x, y) = nll(m(x), y)

@info "training..."
@time Knet.minimize!(model, dtst, Adam(lr=0.0001,gclip=0.1))


denizyuret avatar denizyuret commented on June 2, 2024

The dy/sparsebugs branch now implements + for two Sparse values; please test.


ekinakyurek avatar ekinakyurek commented on June 2, 2024

Although I didn't run your example, I believe you didn't get the error because your gradients don't exceed the gclip value. Here is a simpler example you can replicate without downloading anything.

julia> using Knet

julia> function foo(w)
           s = 0.0
           for i=1:length(w); s+=w[i]; end
           return s
       end

foo (generic function with 1 method)

julia> w = Param(randn(2,2))
2×2 Param{Array{Float64,2}}:
  0.427868   0.657678
 -0.332868  -1.50003

julia> J = @diff foo(w)
T(-0.7473544438700652)

julia> update!(value(w), grad(J,w), Adam(gclip=0.1))
ERROR: MethodError: lmul!(::Float64, ::AutoGrad.Sparse{Float64,2}) is ambiguous. Candidates:
  lmul!(a, x::AutoGrad.Sparse{T,N}) where {T, N} in AutoGrad at /kuacc/users/eakyurek13/.julia/packages/AutoGrad/9MrCC/src/sparse.jl:51
  lmul!(s::Number, X::AbstractArray) in LinearAlgebra at /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.2/LinearAlgebra/src/generic.jl:100
Possible fix, define
  lmul!(::Number, ::AutoGrad.Sparse{T,N})
Stacktrace:
 [1] gclip!(::AutoGrad.Sparse{Float64,2}, ::Float64) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:613
 [2] update!(::Array{Float64,2}, ::AutoGrad.Sparse{Float64,2}, ::Adam) at /kuacc/users/eakyurek13/.julia/packages/Knet/IIjk8/src/update.jl:537
 [3] top-level scope at REPL[6]:1
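
To illustrate why the error only shows up when the gradient norm exceeds the gclip value, here is a rough sketch of norm-based clipping. This is not Knet's actual source: the name clip_by_norm! and the handling of gclip == 0 are assumptions made for illustration. The point is that the in-place lmul! scaling is reached only on the "too large" branch, which is the call that appears in the stack trace above.

```julia
using LinearAlgebra

# Hypothetical stand-in for norm-based gradient clipping (not Knet's code).
function clip_by_norm!(g::AbstractArray, gclip::Real)
    gclip == 0 && return g          # assumption: 0 means clipping is disabled
    gnorm = norm(g)
    gnorm <= gclip && return g      # small gradients are returned untouched
    return lmul!(gclip / gnorm, g)  # otherwise rescale in place so norm(g) == gclip
end

small = fill(0.01, 2, 2)   # norm is 0.02, below the threshold
large = ones(2, 2)         # norm is 2.0, above the threshold

clip_by_norm!(small, 0.1)  # never reaches lmul!
clip_by_norm!(large, 0.1)  # reaches lmul! and rescales
```

With a dense Array both calls succeed; with a gradient type that has an ambiguous lmul! method, only the second kind of call would fail.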


denizyuret avatar denizyuret commented on June 2, 2024

You are right, it was an ambiguity issue. I will create a PR now.
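
For anyone reading later, the ambiguity can be reproduced in isolation with a small sketch. MySparse and scale! below are hypothetical stand-ins for AutoGrad.Sparse and lmul!, and the third method only mirrors the "Possible fix" suggested in the error message; it is not necessarily what the PR does.

```julia
# Hypothetical wrapper type standing in for AutoGrad.Sparse.
struct MySparse{T,N} <: AbstractArray{T,N}
    data::Array{T,N}
end
Base.size(x::MySparse) = size(x.data)
Base.getindex(x::MySparse, i::Int...) = x.data[i...]

# Two methods, each more specific than the other in exactly one argument.
scale!(s::Number, X::AbstractArray) = "generic"
scale!(a, x::MySparse{T,N}) where {T,N} = "sparse, untyped scalar"

# For (Float64, MySparse) neither method wins, so the call throws a MethodError.
ambiguous = try
    scale!(2.0, MySparse(ones(2, 2)))
    false
catch err
    err isa MethodError
end

# A method that is more specific in both arguments resolves the ambiguity.
scale!(a::Number, x::MySparse{T,N}) where {T,N} = "sparse, Number scalar"
resolved = scale!(2.0, MySparse(ones(2, 2)))
```

Here ambiguous ends up true and resolved is the string returned by the third method.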


denizyuret avatar denizyuret commented on June 2, 2024

Fixed in current master.

