Comments (3)
Hmm, if I run the script above it still gives the wrong answer, though the test passes.
from xla.
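The original script isn't shown here, but the IR dump in the later comment suggests it exercised an elementwise `torch.max` over broadcastable shapes `[4,2]` and `[3,1,2]`. A hypothetical repro sketch under that assumption:

```python
# Hypothetical repro sketch (the actual script is not shown in this thread).
# Shapes are taken from the IR dump: %1 = f32[4,2], %0 = f32[3,1,2].
import torch

a = torch.rand(4, 2)     # matches %1 = f32[4,2]
b = torch.rand(3, 1, 2)  # matches %0 = f32[3,1,2]

# Two-tensor torch.max is elementwise max with broadcasting.
out = torch.max(a, b)
print(out.shape)  # eager CPU correctly reports torch.Size([3, 4, 2])
```

On eager CPU the broadcast result shape is `[3, 4, 2]`, which is what makes the `torch.Size([4, 2])` reported below look wrong.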
I think the issue is with shape reporting; the data looks right AFAICT:
tensor([[[0.2436, 0.8010],
         [0.2436, 0.8010],
         [0.5483, 0.8010],
         [0.7193, 0.8010]],

        [[0.3733, 0.9327],
         [0.3733, 0.9327],
         [0.5483, 0.9327],
         [0.7193, 0.9327]],

        [[0.9229, 0.3699],
         [0.9229, 0.7067],
         [0.9229, 0.7648],
         [0.9229, 0.3699]]])
torch.Size([4, 2])
from xla.
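The printed data above clearly has three `[4, 2]` blocks, i.e. shape `[3, 4, 2]`, while the tensor reports `torch.Size([4, 2])`. PyTorch's own broadcasting rules agree with the data, which can be sanity-checked directly:

```python
import torch

# Broadcasting [4,2] against [3,1,2]: right-align the shapes, so [1,4,2]
# vs [3,1,2] gives [3,4,2]. torch.broadcast_shapes applies exactly this rule.
print(torch.broadcast_shapes((4, 2), (3, 1, 2)))  # torch.Size([3, 4, 2])
```

So the data is the correct broadcast result; only the reported size is stale.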
Shape inference is broken: the lazy `aten::max` node reports `f32[4,2]` while the lowered HLO produces `f32[3,4,2]`.
IR {
  %0 = f32[3,1,2]{2,1,0} xla::device_data(), device=CPU:0
  %1 = f32[4,2]{1,0} xla::device_data(), device=CPU:0
  %2 = f32[4,2]{1,0} aten::max(%1, %0)
}
HloModule DumpHloComputation.9
ENTRY %DumpHloComputation.9 (param_0.1: f32[3,1,2], param_1.2: f32[4,2]) -> (f32[3,4,2]) {
  %param_1.2 = f32[4,2]{1,0} parameter(1)
  %broadcast.3 = f32[3,4,2]{2,1,0} broadcast(f32[4,2]{1,0} %param_1.2), dimensions={1,2}
  %param_0.1 = f32[3,1,2]{2,1,0} parameter(0)
  %broadcast.4 = f32[3,1,2]{2,1,0} broadcast(f32[3,1,2]{2,1,0} %param_0.1), dimensions={0,1,2}
  %reshape.5 = f32[3,2]{1,0} reshape(f32[3,1,2]{2,1,0} %broadcast.4)
  %broadcast.6 = f32[3,4,2]{2,1,0} broadcast(f32[3,2]{1,0} %reshape.5), dimensions={0,2}
  %maximum.7 = f32[3,4,2]{2,1,0} maximum(f32[3,4,2]{2,1,0} %broadcast.3, f32[3,4,2]{2,1,0} %broadcast.6)
  ROOT %tuple.8 = (f32[3,4,2]{2,1,0}) tuple(f32[3,4,2]{2,1,0} %maximum.7)
}
from xla.
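A NumPy sketch of the HLO above shows that the lowered computation really does produce a `[3, 4, 2]` result, so the discrepancy is only in the shape the lazy `aten::max` IR node reports (`f32[4,2]`), not in the computed data:

```python
# Step-by-step NumPy equivalent of the HLO dump above (a sketch, not the
# actual XLA execution path).
import numpy as np

param_0 = np.random.rand(3, 1, 2).astype(np.float32)  # f32[3,1,2]
param_1 = np.random.rand(4, 2).astype(np.float32)     # f32[4,2]

# %broadcast.3: broadcast f32[4,2] into f32[3,4,2] along output dims {1,2}
b3 = np.broadcast_to(param_1, (3, 4, 2))

# %reshape.5: squeeze f32[3,1,2] to f32[3,2]
r5 = param_0.reshape(3, 2)

# %broadcast.6: broadcast f32[3,2] into f32[3,4,2], input dims mapped to
# output dims {0,2} (i.e. re-insert the size-1 middle axis and expand it)
b6 = np.broadcast_to(r5[:, None, :], (3, 4, 2))

# %maximum.7
result = np.maximum(b3, b6)
print(result.shape)  # (3, 4, 2)
```

`result` matches a direct broadcasted `np.maximum(param_1, param_0)`, confirming the HLO lowering itself is correct.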