
Comments (10)

athas commented on June 2, 2024

Iotas are simplified away entirely after the tiling pass, so they will not show up in the generated code.

from futhark.

sortraev commented on June 2, 2024

Yes, sorry, what's important in the code is not the use of iota but the use of explicit indexing. For all the example tensor contraction expressions I have tested thus far, the problem (of extra dimensions, and hence extra variance added onto the one redomap array) does not seem to occur when I rewrite the expression to not use explicit indexing, as in the example given in my EDIT2.

Or perhaps I misunderstood your reply? Can you please elaborate?


athas commented on June 2, 2024

If you explicitly index an iota, the iota goes away. Pre-tiling, we still have redomaps, so the iota is not explicitly indexed (at least not fully). Post-tiling, those redomaps get turned into loops with explicit indexing, and then the indexing of the iota is simplified away. I'm not quite sure what the question is now.
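The simplification described above can be pictured as a rewrite rule: indexing `iota n` at position `i` is just `i`. Below is a minimal sketch in plain Python. The tuple-based expression encoding and the name `simplify` are hypothetical illustrations, not Futhark's actual IR or simplifier, and the sketch ignores bounds checking.

```python
# Toy expression encoding (hypothetical, not Futhark's IR):
#   ("var", name)      a variable
#   ("iota", n)        the array [0, 1, ..., n-1]
#   ("index", arr, i)  the element arr[i]

def simplify(expr):
    """Recursively rewrite index(iota(n), i) to i."""
    if not isinstance(expr, tuple):
        return expr  # tags, names, and literals are left alone
    tag, *args = expr
    args = [simplify(a) for a in args]
    if tag == "index" and isinstance(args[0], tuple) and args[0][0] == "iota":
        return args[1]  # the iota disappears; only the index expression remains
    return (tag, *args)
```

In this toy model, an iota that is only ever consumed by a map is never the first argument of an `index` node, so the rule has nothing to fire on until the map has been turned into a loop with explicit indexing.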


sortraev commented on June 2, 2024

I think my original question remains, but allow me to simplify:

The two input arrays to the contraction have dimensions xsss : [Q][A][I] and ysss : [B][Q][J], and the result has dimensions [I][B][J][A]. Hence the segspace has dimensions [I][B][J][A], and the reduced dimension is [Q].

Given the map nest in example_tc5 (second snippet in my OP), the first input slice to the redomap should be variant only to the two maps labeled "dim 0" and "dim 3" (the I and A of the segspace), while the second input slice should be variant only to the other two maps labeled "dim 1" and "dim 2" (the B and J of the segspace).
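As a reference for the expected variance, here is a minimal sketch of the contraction in plain Python, with nested lists standing in for arrays. The function name `contract` and the loop structure are assumptions that mirror the map nest described above; this is not the original `example_tc5` source.

```python
def contract(xsss, ysss):
    """result[i][b][j][a] = sum over q of xsss[q][a][i] * ysss[b][q][j].

    xsss has shape [Q][A][I], ysss has shape [B][Q][J],
    and the result has shape [I][B][J][A].
    """
    Q, A, I = len(xsss), len(xsss[0]), len(xsss[0][0])
    J = len(ysss[0][0])
    result = []
    for i in range(I):                  # "dim 0" (I)
        plane_i = []
        for ys in ysss:                 # "dim 1" (B); ys has shape [Q][J]
            plane_b = []
            for j in range(J):          # "dim 2" (J)
                # Second redomap input: depends on b and j only.
                y_slice = [ys[q][j] for q in range(Q)]
                row = []
                for a in range(A):      # "dim 3" (A)
                    # First redomap input: depends on a and i only.
                    x_slice = [xsss[q][a][i] for q in range(Q)]
                    row.append(sum(x * y for x, y in zip(x_slice, y_slice)))
                plane_b.append(row)
            plane_i.append(plane_b)
        result.append(plane_i)
    return result
```

Note that `y_slice` is computed inside the loop over `i` but never reads `i`, which is the invariance the question is about.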

Inspecting the IR right before entering loop tiling, I see that the first redomap slice, index_8309, comes from xsss as expected. However, the second redomap slice, map2_arg2_8307, comes from some 4D array map2_arg2_r_r_r_8279, even though ysss is supposed to be 3D. Moreover, map2_arg2_r_r_r_8279 seems to be indexed using gtid_8301, which corresponds to the I dimension of the space, i.e. one of the dimensions on which ysss is supposed to be invariant (see the snippet below).

In other words, the size [Q] slice of ysss going into the redomap is now variant to one more outer map dimension than it should be.

let {map2_arg2_8307 : [Q_6857]f32} =
  map2_arg2_r_r_r_8279[gtid_8301, gtid_8302, gtid_8303, 0i64 :+ Q_6857 * 1i64]
let {index_8309 : [Q_6857]f32} =
  xsss_6860[0i64 :+ Q_6857 * 1i64, gtid_8304, gtid_8301]


athas commented on June 2, 2024

It ultimately occurs because the program prior to flattening (--extract-kernels) is not a perfect nest of maps. The relevant part looks like this:

  let {defunc_0_map_res_7279 : [I_6458][B_6460][J_6461][A_6457]f32} =
    map(I_6458,
        {iota_res_7223},
        \ {eta_p_7225 : i64}
          : {[B_6460][J_6461][A_6457]f32} ->
          let {defunc_0_map_res_7278 : [B_6460][J_6461][A_6457]f32} =
            map(B_6460,
                {ysss_6463},
                \ {ysss_elem_7254 : [Q_6459][J_6461]f32}
                  : {[J_6461][A_6457]f32} ->
                  let {defunc_0_map_res_7277 : [J_6461][A_6457]f32} =
                    map(J_6461,
                        {iota_res_7227},
                        \ {eta_p_7232 : i64}
                          : {[A_6457]f32} ->
                          let {map2_arg2_7233 : [Q_6459]f32} =
                            ysss_elem_7254[0i64 :+ Q_6459 * 1i64, eta_p_7232]
                          let {defunc_0_map_res_7276 : [A_6457]f32} =
                            map(A_6457,
                                {iota_res_7228},
                                \ {eta_p_7235 : i64}
                                  : {f32} ->
                                  let {index_7275 : [Q_6459]f32} =
                                    xsss_6462[0i64 :+ Q_6459 * 1i64, eta_p_7235, eta_p_7225]
                                  let {defunc_res_7274 : f32} =
                                    redomap(Q_6459,
                                            {index_7275, map2_arg2_7233},
                                            {\ {eta_p_7242 : f32,
                                                eta_p_7243 : f32}
                                              : {f32} ->
                                              let {+_res_7244 : f32} =
                                                fadd32(eta_p_7242, eta_p_7243)
                                              in {+_res_7244},
                                            {0.0f32}},
                                            \ {eta_p_7256 : f32,
                                               eta_p_7257 : f32}
                                              : {f32} ->
                                              let {defunc_0_f_res_7258 : f32} =
                                                fmul32(eta_p_7256, eta_p_7257)
                                              in {defunc_0_f_res_7258})
                                  in {defunc_res_7274})
                          in {defunc_0_map_res_7276})
                  in {defunc_0_map_res_7277})
          in {defunc_0_map_res_7278})

During flattening, that slice (map2_arg2_7233) is distributed. A distributed slice is implemented as a SegMap containing a slice.
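To illustrate what distributing the slice means for variance, here is a rough sketch in plain Python. It assumes the distributed array has shape [I][B][J][Q], which is inferred from the indexing in the snippet earlier in the thread rather than confirmed by the compiler output; the function name `distribute_slice` is hypothetical.

```python
def distribute_slice(ysss, I):
    """Materialize the slice ysss[b][:, j] for every (i, b, j) point.

    ysss has shape [B][Q][J]; the result has the assumed shape [I][B][J][Q].
    """
    Q, J = len(ysss[0]), len(ysss[0][0])
    return [[[[ys[q][j] for q in range(Q)]   # the slice itself, length Q
              for j in range(J)]
             for ys in ysss]
            for _ in range(I)]               # i is never read: pure replication
```

Every `i` plane of the result is identical, so indexing the distributed array with the I index adds a syntactic dependence on I without any real dependence on it. That matches the extra variance reported for map2_arg2_8307.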


athas commented on June 2, 2024

I removed the "somewhat inefficiently" part; that is in isolation the right way to flatten a slice.


athas commented on June 2, 2024

If this is a problem, it would certainly be possible to get rid of the segmap, by extending our index-simplification rules a bit.


sortraev commented on June 2, 2024

Thanks! I think I got the gist of it.

For now, one condition I require is that each redomap array must be variant to at least one and at most (k-1) of k inner dimensions, and conversely, for each inner dimension there is exactly one redomap array variant to it. The latter condition here is a simplifying assumption which should eventually be elided, perhaps replaced with something along the lines of "for each inner dimension there is at least one and at most (n-1) redomap arrays variant to it, for n redomap arrays", but I'm not quite there yet.
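The condition as stated can be written down as a small check. This is a sketch in plain Python, not code from the actual module: the function name, the dictionary representation of variance, and the choice of the four segspace dimensions as the "inner dimensions" are all assumptions made for illustration.

```python
def satisfies_variance_condition(variance, inner_dims):
    """Check the two requirements on redomap input arrays.

    variance maps each redomap array name to the set of inner
    dimensions it is variant to; inner_dims lists the k inner dimensions.
    """
    k = len(inner_dims)
    dims_set = set(inner_dims)
    # Each array is variant to at least one and at most k-1 inner dimensions.
    for dims in variance.values():
        if not (1 <= len(dims & dims_set) <= k - 1):
            return False
    # For each inner dimension, exactly one array is variant to it.
    for d in inner_dims:
        if sum(d in dims for dims in variance.values()) != 1:
            return False
    return True


# Expected variance for the contraction discussed in this thread.
expected = {"index_8309": {"I", "A"}, "map2_arg2_8307": {"B", "J"}}
# Variance after the slice has been distributed over I, B and J.
observed = {"index_8309": {"I", "A"}, "map2_arg2_8307": {"I", "B", "J"}}
```

With `inner_dims = ["I", "B", "J", "A"]`, the `expected` table passes the check, while `observed` fails the second requirement because two arrays are now variant to I.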

I can't say for certain that this condition cannot be expressed differently, or that it wouldn't be made redundant by other conditions which I have yet to formulate, but for now I'd say it is a bit in the way of determining variance, yes.

Regardless, I have plenty of example source programs that do not trigger this behavior, so I should be able to reach a working prototype, and there's no rush to do anything about it. On the other hand, if you have suggestions for a different strategy to implement in my module that would not require changes elsewhere, I'm all ears!


athas commented on June 2, 2024

Programming with explicit indexes is wrong. It should be done with replicates. But I think I will implement the simplifications anyway - it is not so difficult.


sortraev commented on June 2, 2024

> Programming with explicit indexes is wrong.

I agree! I only tested that version of the program because using combinations of flatten/transpose/unflatten to obtain the correct permutation of indices can be a little tedious (until you get used to it and start working them out step by step), so I imagined some users (especially newbs) might opt for doing it that way, even if I do agree that bad coding style warrants punishment in most cases.

> It should be done with replicates.

Oo, this sounds interesting. Can you demonstrate?

> But I think I will implement the simplifications anyway - it is not so difficult.

Awesome! :)

