Comments (3)

jaindhairyahere commented on August 25, 2024

@krishnadubba Have you successfully implemented the strided version btw? Could you share the code change?

I was able to reproduce the patterns using this function.

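```python
import tensorflow as tf


def sparse_attention_mask(n_tokens, stride_length=3, c=2):
  # Index grids: Q[i, j] = i (query position), K[i, j] = j (key position).
  x = tf.reshape(tf.range(n_tokens), [n_tokens, 1])
  y = tf.transpose(x)
  # Use int32 zeros so the additions below do not mix dtypes with tf.range.
  z = tf.zeros((n_tokens, n_tokens), dtype=tf.int32)
  Q = z + x
  K = z + y

  # Each query may attend only to itself and to earlier positions.
  causal_attention_mask = (Q >= K)

  # Fixed pattern: keys in the same block of stride_length tokens,
  # or keys in the last c columns of any block.
  fixed_mask_1 = tf.equal(Q // stride_length, K // stride_length)
  # floormod is always < stride_length, so no upper bound is needed.
  fixed_mask_2 = tf.math.floormod(K, stride_length) >= stride_length - c
  combined_mask_fixed = tf.logical_and(causal_attention_mask, tf.logical_or(fixed_mask_1, fixed_mask_2))

  # Strided pattern: the previous stride_length positions,
  # or every stride_length-th position further back.
  stride_mask_1 = tf.less_equal(Q - K, stride_length)
  stride_mask_2 = tf.equal(tf.math.floormod(Q - K, stride_length), 0)
  combined_mask_stride = tf.logical_and(causal_attention_mask, tf.logical_or(stride_mask_1, stride_mask_2))

  # Shape both masks as [batch, heads, query, key] so they broadcast.
  return (tf.reshape(combined_mask_fixed, [1, 1, n_tokens, n_tokens]),
          tf.reshape(combined_mask_stride, [1, 1, n_tokens, n_tokens]))
```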
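To inspect the patterns without plotting, here is a minimal pure-Python sketch that rebuilds the same two masks with plain loops and prints them as text grids. It is only meant as a sanity check on small inputs: the helper names `reference_masks` and `render` are hypothetical and not part of the repository, and the conditions are copied from the function above.

```python
def reference_masks(n_tokens, stride_length=3, c=2):
    """Rebuild the fixed and strided masks as nested lists (q = query row, k = key column)."""
    fixed, strided = [], []
    for q in range(n_tokens):
        fixed_row, strided_row = [], []
        for k in range(n_tokens):
            causal = q >= k
            # Fixed pattern: same block, or one of the last c columns of any block.
            same_block = q // stride_length == k // stride_length
            summary_col = k % stride_length >= stride_length - c
            fixed_row.append(causal and (same_block or summary_col))
            # Strided pattern: recent positions, or every stride_length-th position back.
            local = q - k <= stride_length
            on_stride = (q - k) % stride_length == 0
            strided_row.append(causal and (local or on_stride))
        fixed.append(fixed_row)
        strided.append(strided_row)
    return fixed, strided


def render(mask):
    """Draw a mask as text: 'x' where attention is allowed, '.' where it is blocked."""
    return "\n".join("".join("x" if cell else "." for cell in row) for row in mask)


fixed, strided = reference_masks(8, stride_length=3, c=1)
print("fixed:")
print(render(fixed))
print("strided:")
print(render(strided))
```

Comparing these grids with the boolean output of `sparse_attention_mask` for the same arguments is a quick way to confirm that both agree.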

from sparse_attention.

pengfeiZhao1993 commented on August 25, 2024

After reading `attention.py`, I found that the base code only contains separate implementations of the two steps of strided attention, which it calls the "first / second step of strided attention". You may therefore need to implement an integrated version of strided attention yourself, for example a two-head sparse self-attention in which each head corresponds to one of those two steps.
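One way to read this suggestion is to give each of two heads its own mask, one per step of strided attention. The sketch below is a pure-Python illustration of that idea, not the code from `attention.py`: the function name `strided_head_masks` and the [head][query][key] layout are assumptions made here, and the two conditions mirror `stride_mask_1` and `stride_mask_2` from the function posted earlier in this thread.

```python
def strided_head_masks(n_tokens, stride_length=3):
    """Return [local_head, strided_head], each an n_tokens x n_tokens boolean grid."""
    local_head, strided_head = [], []
    for q in range(n_tokens):
        # Head 0 (first step): the query and the stride_length positions before it.
        local_head.append(
            [q >= k and q - k <= stride_length for k in range(n_tokens)]
        )
        # Head 1 (second step): every stride_length-th position at or before the query.
        strided_head.append(
            [q >= k and (q - k) % stride_length == 0 for k in range(n_tokens)]
        )
    return [local_head, strided_head]


masks = strided_head_masks(8, stride_length=3)

# Taking the element-wise OR of the two heads gives the combined strided pattern.
union = [
    [a or b for a, b in zip(local_row, strided_row)]
    for local_row, strided_row in zip(masks[0], masks[1])
]
print(union[7])
```

Stacked into a boolean tensor of shape [1, 2, n_tokens, n_tokens], these grids would give each head only its own pattern. Whether a given attention layer accepts a per-head mask of that shape is something to check for the layer in use.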

benathi commented on August 25, 2024

@krishnadubba Have you successfully implemented the strided version btw? Could you share the code change?
