Comments (15)
- Original PR: #7157
- Reason to backport: Urgent fix for while_loop in the 2.4 release
- 2.4 backport PR link: #7306
from xla.
- Original PR: #7155
- Reason to backport: Fix a bug in distributed checkpointing with padded tensors
- 2.4 backport PR link: #7243
- original PR: #7236
- Reason to backport: Include build dependencies for CUDA 12.4
- 2.4 backport PR link: #7244
- original PR: #7219
- Reason to backport: Minor addition to Triton functionality to support CUDA plugins
- 2.4 backport PR link: #7303
- original PR: #7254 #7258
- Reason to backport: OpenXLA pin update, requested by Aman, to enable new telemetry in PyTorch/XLA
- 2.4 backport PR link: #7261 #7262
- Original PRs: #7249 and #7268
- Reason to backport: Enables new libtpu telemetry and detection of the CUDA plugin package (torch_xla_cuda_plugin) by default.
- Backport link: #7270
- Original PR: Enable bucketized all-reduce for gradients #7216
- Reason to backport: Parity with Neuron branch r2.1_aws_neuron
- Backport link: WIP
Reason:
This is a regression. Without this fix, PyTorch/XLA always emits log spam about the PJRT version on load.
Risk:
Low.
Reason:
This is a regression fix. Without this fix, a number of xlml tests will fail.
Risk:
Low since it is just a revert.
Reason:
It's a cherry-picked hotfix for "AttributeError: module 'numpy' has no attribute 'product'" in r2.4 CI.
Risk:
Low, since the issue on master has been fixed by the original PR.
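For context on the error above: NumPy 2.0 removed the np.product alias, while np.prod is available in both NumPy 1.x and 2.x. The snippet below is only a minimal sketch of that kind of replacement; the helper name shape_numel is hypothetical and is not taken from the original PR.

```python
import numpy as np


def shape_numel(shape):
    """Return the number of elements implied by a shape tuple.

    Hypothetical helper for illustration only; not from torch_xla.
    """
    # np.prod works on NumPy 1.x and 2.x; np.product raises
    # AttributeError on NumPy 2.x because the alias was removed.
    return int(np.prod(shape, dtype=np.int64))


if __name__ == "__main__":
    print(shape_numel((2, 3, 4)))
```

Calling np.prod directly avoids the need for a version check, since it behaves the same way on both major NumPy versions.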
original prs
backport prs
Reason:
I want to enable eager mode for the 2.4 release, since I want to make it the default in 2.6.
Risk:
Low. All of the features are guarded by torch_xla.experimental.eager_mode, so these PRs should be a no-op if eager mode is disabled.
Original PR
Backport PR
Reason:
To fix CI
Risk:
Low, since this doesn't change anything in torch_xla library.
Original PR:
#7640
Backport PR:
#7684
Reason:
To fix upstream pytorch build
Risk:
Low, this doesn't change anything in torch_xla library
Original: #7231
Backport: #7708
Support MoE.
Original PR
Backport PR:
Reason:
Backport the docs for eager mode.
Risk:
Low, this is just a doc change.