Comments (5)

lucidrains commented on May 20, 2024

@ken012git 🙏 do you want to try 0.2.4? i think i found the issue 🤦‍♂️

from imagen-pytorch.

lucidrains commented on May 20, 2024

@ken012git forgot the residual 🤦 and also needed a feedforward after it anyways

lucidrains commented on May 20, 2024

@ken012git thank you for the experiments! basically, in a lot of papers, researchers remove attention past a certain token length (1024 or 2048) since it is prohibitively expensive due to the quadratic compute. but i like to substitute them with linear attention, even if it is a bit weaker. my favorite linear attention remains https://arxiv.org/abs/1812.01243 , and here i am also giving it a depthwise conv recommended by the primer paper
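
For readers following along, here is a minimal sketch of the idea behind the linked paper (efficient attention), written in plain Python with no dependencies. It is an illustration under stated assumptions, not the code used in imagen-pytorch: the function names are hypothetical, there is a single head, and scaling, projections, and the depthwise conv are left out. Queries are softmax-normalised over their feature dimension and keys over the sequence dimension, so the keys and values can be multiplied first, and the n x n score matrix is never built.

```python
import math

def softmax(xs):
    # numerically stable softmax over a list of floats
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def normalise(q, k):
    # queries: softmax over the d features of each position
    q = [softmax(row) for row in q]
    # keys: softmax over the n positions, separately for each feature
    k = transpose([softmax(col) for col in transpose(k)])
    return q, k

def efficient_attention(q, k, v):
    """q, k: n x d, v: n x e. Cost grows linearly in n."""
    q, k = normalise(q, k)
    context = matmul(transpose(k), v)   # d x e, independent of n in size
    return matmul(q, context)           # n x e

def quadratic_reference(q, k, v):
    """Same result, but materialises the n x n matrix (for comparison only)."""
    q, k = normalise(q, k)
    scores = matmul(q, transpose(k))    # n x n
    return matmul(scores, v)            # n x e

# toy inputs: n = 6 positions, d = 2 key features, e = 3 value features
q = [[0.1 * i, -0.2 * i] for i in range(6)]
k = [[0.3 * i, 0.05 * i * i] for i in range(6)]
v = [[float(i), 1.0, -float(i)] for i in range(6)]
out = efficient_attention(q, k, v)
```

Both functions return the same values because matrix multiplication is associative. The difference is only in the intermediate: d x e for the linear form versus n x n for the reference. Note that this is a different normalisation from standard softmax attention, which is why it can be somewhat weaker.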

ken012git commented on May 20, 2024

Sure! Thanks for your immediate response!

I would also like to know what causes the issue. =)

ken012git commented on May 20, 2024

Hi @lucidrains ,

I have tested v0.2.4 and the issue seems to be gone. Thanks!

# test model, resolution 64
from imagen_pytorch import Unet

unet1 = Unet(
    dim = 32,
    cond_dim = 512,
    dim_mults = (1, 2, 4, 8),
    num_resnet_blocks = (2, 2, 2, 2),    # small
    layer_attns = (False, False, False, True),
    layer_cross_attns = (False, False, False, True),
    # use_linear_attn = False,    # baseline run (blue curve)
    use_linear_attn = True,       # linear attention run (red curve)
)

Loss curve (blue: use_linear_attn = False, red: use_linear_attn = True):
[image: Screen Shot 2022-06-13 at 4 08 59 PM]

Early-stage results (left: use_linear_attn = False, right: use_linear_attn = True):
[image: Screen Shot 2022-06-13 at 4 10 25 PM]

I am wondering whether we should use transformers or linear attention layers at this line, which is configured by use_linear_attn.

Could you point me to relevant papers? Thanks
