Comments (10)
Iotas are simplified away entirely after the tiling pass, so they will not show up in the generated code.
from futhark.
Yes, sorry, what's important in the code is not the use of `iota` but the use of explicit indexing. For all the example tensor contraction expressions I have tested thus far, the problem (of extra dimensions, and hence extra variance added onto the one redomap array) does not seem to occur when I rewrite the expression to not use explicit indexing, as in the example given in my EDIT2.
Or perhaps I misunderstood your reply? Can you please elaborate?
from futhark.
If you explicitly index an `iota`, the `iota` goes away. Pre-tiling, we still have `redomap`s, so the `iota` is not explicitly indexed (at least not fully). Post-tiling, those `redomap`s get turned into `loop`s with explicit indexing, and then the indexing of `iota` is simplified away. I'm not quite sure what the question is now.
from futhark.
I think my original question remains, but allow me to simplify:
The two input arrays to the contraction have dimensions `xsss : [Q][A][I]` and `ysss : [B][Q][J]`, and the result has dimensions `[I][B][J][A]`. Hence the segspace has dimensions `[I][B][J][A]`, and the reduced dimension is `[Q]`.
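For reference, the contraction described above can be written as a plain loop nest. This is a minimal Python sketch rather than Futhark; the function name `contract` and the nested-list representation are assumptions made purely for illustration, and the index expression is simply what the stated shapes imply.

```python
def contract(xsss, ysss):
    """result[i][b][j][a] = sum over q of xsss[q][a][i] * ysss[b][q][j]."""
    Q, A, I = len(xsss), len(xsss[0]), len(xsss[0][0])
    B, J = len(ysss), len(ysss[0][0])
    assert all(len(plane) == Q for plane in ysss), "Q must match in both inputs"
    # Outer four comprehensions follow the segspace order [I][B][J][A];
    # the innermost sum is the reduction over Q.
    return [[[[sum(xsss[q][a][i] * ysss[b][q][j] for q in range(Q))
               for a in range(A)]
              for j in range(J)]
             for b in range(B)]
            for i in range(I)]


# Tiny example with Q=2, A=1, I=1 and B=1, J=1.
xsss = [[[1.0]], [[2.0]]]      # shape [Q][A][I]
ysss = [[[3.0], [4.0]]]        # shape [B][Q][J]
result = contract(xsss, ysss)  # shape [I][B][J][A]
```

Note that the `xsss` term reads only `i` and `a`, and the `ysss` term reads only `b` and `j`, which is the variance pattern discussed next.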
Given the map nest in `example_tc5` (second snippet in my OP), the first input slice to the redomap should be variant only to the two `map`s labeled "dim 0" and "dim 3" (the `I` and `A` of the segspace), while the second input slice should be variant only to the other two `map`s labeled "dim 1" and "dim 2" (the `B` and `J` of the segspace).
Inspecting the IR right before entering loop tiling, I see that the first redomap slice, `index_8309`, comes from `xsss` as expected. However, the second redomap slice, `map2_arg2_8307`, comes from some 4D array `map2_arg2_r_r_r_8279`, but `ysss` is supposed to be 3D, and `map2_arg2_r_r_r_8279` seems to be indexed using `gtid_8301`, which corresponds to the `I` dimension of the space, i.e. one of the dimensions on which `ysss` is supposed to be invariant (see the snippet below).
In other words, the size `[Q]` slice of `ysss` going into the redomap is now variant to one more outer `map` dimension than it should be.
let {map2_arg2_8307 : [Q₄_6857]f32} =
  map2_arg2_r_r_r_8279[gtid_8301, gtid_8302, gtid_8303, 0i64 :+ Q₄_6857 * 1i64]
let {index_8309 : [Q₄_6857]f32} =
  xsss_6860[0i64 :+ Q₄_6857 * 1i64, gtid_8304, gtid_8301]
from futhark.
It ultimately occurs because the program prior to flattening (`--extract-kernels`) is not a perfect nest of `map`s. The relevant part looks like this:
let {defunc_0_map_res_7279 : [I_6458][B_6460][J_6461][A_6457]f32} =
  map(I_6458,
      {iota_res_7223},
      \ {eta_p_7225 : i64}
        : {[B_6460][J_6461][A_6457]f32} ->
        let {defunc_0_map_res_7278 : [B_6460][J_6461][A_6457]f32} =
          map(B_6460,
              {ysss_6463},
              \ {ysss_elem_7254 : [Q_6459][J_6461]f32}
                : {[J_6461][A_6457]f32} ->
                let {defunc_0_map_res_7277 : [J_6461][A_6457]f32} =
                  map(J_6461,
                      {iota_res_7227},
                      \ {eta_p_7232 : i64}
                        : {[A_6457]f32} ->
                        let {map2_arg2_7233 : [Q_6459]f32} =
                          ysss_elem_7254[0i64 :+ Q_6459 * 1i64, eta_p_7232]
                        let {defunc_0_map_res_7276 : [A_6457]f32} =
                          map(A_6457,
                              {iota_res_7228},
                              \ {eta_p_7235 : i64}
                                : {f32} ->
                                let {index_7275 : [Q_6459]f32} =
                                  xsss_6462[0i64 :+ Q_6459 * 1i64, eta_p_7235, eta_p_7225]
                                let {defunc_res_7274 : f32} =
                                  redomap(Q_6459,
                                          {index_7275, map2_arg2_7233},
                                          {\ {eta_p_7242 : f32,
                                              eta_p_7243 : f32}
                                             : {f32} ->
                                             let {+_res_7244 : f32} =
                                               fadd32(eta_p_7242, eta_p_7243)
                                             in {+_res_7244},
                                           {0.0f32}},
                                          \ {eta_p_7256 : f32,
                                             eta_p_7257 : f32}
                                            : {f32} ->
                                            let {defunc_0_f_res_7258 : f32} =
                                              fmul32(eta_p_7256, eta_p_7257)
                                            in {defunc_0_f_res_7258})
                                in {defunc_res_7274})
                        in {defunc_0_map_res_7276})
                in {defunc_0_map_res_7277})
        in {defunc_0_map_res_7278})
During flattening, that slice (`map2_arg2_7233`) is distributed. A distributed slice is implemented as a `SegMap` containing a slice.
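One way to picture the effect, as a rough Python sketch rather than what the compiler literally emits: because the slice sits inside the outermost `map` over `I`, distributing it yields an array indexed by every enclosing dimension, `I` included, even though its values never depend on `i`. The function name `distribute_slice` and the nested-list layout are assumptions for illustration only.

```python
def distribute_slice(ysss, I):
    """Materialise the slice ysss[b][:, j] for every (i, b, j) of the outer maps.

    The result has shape [I][B][J][Q]; the leading I dimension is pure
    replication, since the slice itself only reads b and j.
    """
    B, Q, J = len(ysss), len(ysss[0]), len(ysss[0][0])
    return [[[[ysss[b][q][j] for q in range(Q)]
              for j in range(J)]
             for b in range(B)]
            for _ in range(I)]


ysss = [[[1.0, 2.0], [3.0, 4.0]]]     # shape [B][Q][J] with B=1, Q=2, J=2
slices = distribute_slice(ysss, I=3)  # shape [I][B][J][Q]
```

This matches the shape of the 4D array observed before tiling, where the `[Q]` slice is selected with three outer indices.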
from futhark.
I removed the "somewhat inefficiently" part; that is in isolation the right way to flatten a slice.
from futhark.
If this is a problem, it would certainly be possible to get rid of the segmap, by extending our index-simplification rules a bit.
from futhark.
Thanks! I think I got the gist of it.
For now, one condition I require is that each redomap array must be variant to at least one and at most `(k-1)` of `k` inner dimensions, and conversely, for each inner dimension there is exactly one redomap array variant to it. The latter condition here is a simplifying assumption which should eventually be lifted, perhaps replaced with something along the lines of "for each inner dimension there is at least one and at most `(n-1)` redomap arrays variant to it, for `n` redomap arrays", but I'm not quite there yet.
I can't say for certain whether this condition cannot be expressed differently, or that it wouldn't be made redundant under other conditions which I have yet to formulate, but for now I'd say it is a bit in the way of determining variance, yes.
Regardless, I have plenty of example source programs that do not provoke this behaviour, so I should be able to reach a working prototype; there's no rush to do anything about it. On the other hand, if you have suggestions on a different strategy to implement in my module that would not require changes elsewhere, I'm all ears!
from futhark.
Programming with explicit indexes is wrong. It should be done with replicates. But I think I will implement the simplifications anyway - it is not so difficult.
from futhark.
> Programming with explicit indexes is wrong.
I agree! I only tested that version of the program because the method of using combinations of flatten/transpose/unflatten to obtain the correct permutation of indices can be a little tedious (until you get used to it and start working them out step by step), and so I imagined some users (especially newbs) might opt to do it that way, even if I do agree that bad coding style warrants punishment in most cases.
> It should be done with replicates.
Oo, this sounds interesting. Can you demonstrate?
> But I think I will implement the simplifications anyway - it is not so difficult.
Awesome! :)
from futhark.