Comments (10)
@sitamgithub-MSIT we haven't heard an update in a bit and just wondering if you're still working on the issue?
Yes, I am working on it. I am checking this example on Hugging Face for Gemma. I am thinking about reproducing the same for CodeGemma, though.
from xla.
/assigntome
from xla.
@duncantech I am thinking of training the latest Gemma model with PyTorch/XLA. Is that okay?
from xla.
I think the Gemma model should work out of the box. Take a look at https://github.com/google/gemma_pytorch#try-it-out-with-pytorchxla. Feel free to give it a try and see if we can improve anything.
from xla.
I think the Gemma model should work out of the box. Take a look at https://github.com/google/gemma_pytorch#try-it-out-with-pytorchxla. Feel free to give it a try and see if we can improve anything.
OK. I will look into the Gemma part.
For a different model I am trying, I need to know a few things: do I need to use a free cloud TPU provider, for example Kaggle or Colab TPUs, or is it necessary to do it with the v5 in Google Cloud?
from xla.
That part I think @duncantech can answer.
from xla.
You can work with a free TPU provider if you'd like to get things started.
We should also be able to give you a small number of v5e TPUs to try with.
from xla.
@sitamgithub-MSIT we haven't heard an update in a bit and just wondering if you're still working on the issue?
from xla.
@duncantech I am preparing a script to run on TPUs. Since I am using CodeGemma, which comes with 7B parameters, it will not fit in Colab unless we use a 4-bit version. So should I use the bitsandbytes configuration for that? Or should I just train it in the cloud and see if everything works?
from xla.
You can try with the 4-bit version and see what the performance is like, since that would be easier for others to run in the future!
from xla.
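To put rough numbers on the memory question raised above, here is a minimal back-of-the-envelope sketch. It assumes a nominal 7e9 parameters (the "7B" figure quoted in the thread; the actual parameter count may differ) and counts only the model weights. The helper name `weight_memory_gib` is hypothetical and is not part of any library.

```python
def weight_memory_gib(num_params: float, bits_per_param: int) -> float:
    """Approximate memory needed for the model weights alone, in GiB."""
    return num_params * bits_per_param / 8 / 2**30


# Assumption: "7B" taken at face value from the thread, not a measured count.
NOMINAL_PARAMS = 7e9

for label, bits in [("float32", 32), ("bfloat16", 16), ("int8", 8), ("4-bit", 4)]:
    print(f"{label:>8}: {weight_memory_gib(NOMINAL_PARAMS, bits):5.1f} GiB")
```

These figures are a lower bound. Training also needs memory for gradients, optimizer state, and activations, so whether a given free accelerator is large enough still has to be checked against the actual device.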