Comments (9)
Hi @ShyFoo,
Thank you for reaching out. While we haven't tried this directly yet, DALI provides compatibility with PyTorch, so you can create a DALI iterator that behaves like a PyTorch one.
Have you tried this yet? Have you encountered any issues?
from dali.
I just checked and unfortunately, the HuggingFace trainer (I checked only one example so far) expects a PyTorch DataIterator (it checks the type of the passed object). I still believe it is possible to make DALI work there, but it may need a few adjustments to the internal methods. If you have some spare time, we would be more than happy to review a PR with such an example.
Thanks for your reply. Yeah, I have tried integrating DALI into the HuggingFace trainer. As you mentioned, it seems to expect either torch.utils.data.Dataset or torch.utils.data.IterableDataset as input. It might be possible to write a custom data collator for the HuggingFace trainer that preprocesses data in a DALI pipeline, but I'm not sure if it will work. If you have any ideas about this, please let me know. I'm sure it would be a huge step for the whole community!
I did some basic investigation:
- for CV tasks, DALI needs a wrapper like this:

    from nvidia.dali.plugin.pytorch import DALIGenericIterator

    class DALIwrapper(DALIGenericIterator):
        def __init__(self, *args, **kargs):
            super().__init__(*args, **kargs)

        def __next__(self):
            # The base iterator returns one dict per pipeline; keep the first.
            batch = super().__next__()
            return batch[0]

    pipe_train = GetDaliPipeline(...)
    data_train = DALIwrapper([pipe_train], ["pixel_values", "labels"], reader_name="Reader")

- the trainer needs to have these methods replaced, so the DALI dataloader is used as is without wrapping it into a PyTorch DataIterator:

    trainer.get_train_dataloader = lambda: data_train
    trainer.get_eval_dataloader = lambda x: x

- there is probably another layer of complexity when using DALI with a distributed strategy, which is handled by accelerate
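The first two points can be tried out without a GPU using stand-ins. This is only a sketch of the pattern, not a tested integration: `FakeDALIIterator` and `FakeTrainer` are hypothetical classes that mimic the relevant behaviour (an iterator that yields a list with one dict per pipeline, and a trainer that builds its own loader by default), and the batch values are made up.

```python
class FakeDALIIterator:
    """Hypothetical stand-in for DALIGenericIterator: yields a list with one dict per pipeline."""

    def __init__(self, batches, output_map):
        self._batches = list(batches)
        self._output_map = output_map
        self._pos = 0

    def __iter__(self):
        self._pos = 0
        return self

    def __next__(self):
        if self._pos >= len(self._batches):
            raise StopIteration
        values = self._batches[self._pos]
        self._pos += 1
        # A single pipeline is assumed, so the list has exactly one dict.
        return [dict(zip(self._output_map, values))]

    def __len__(self):
        return len(self._batches)


class SinglePipelineWrapper(FakeDALIIterator):
    """Unwraps the per-pipeline list so each batch is a plain dict."""

    def __next__(self):
        return super().__next__()[0]


class FakeTrainer:
    """Hypothetical stand-in for a trainer that would otherwise build its own loader."""

    def get_train_dataloader(self):
        raise RuntimeError("default loader construction is not wanted here")

    def get_eval_dataloader(self, eval_dataset=None):
        raise RuntimeError("default loader construction is not wanted here")


data_train = SinglePipelineWrapper(
    batches=[([0.1, 0.2], [1, 0]), ([0.3, 0.4], [0, 1])],
    output_map=["pixel_values", "labels"],
)

trainer = FakeTrainer()
# Replace the loader factories on the instance so the iterator is returned unchanged.
trainer.get_train_dataloader = lambda: data_train
trainer.get_eval_dataloader = lambda eval_dataset=None: eval_dataset

batches = list(trainer.get_train_dataloader())
```

Each element of `batches` is a dict keyed by `pixel_values` and `labels`, which is the shape the wrapper above is meant to hand to the training loop.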
For the third point, get_train_dataloader returns self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params)) for distributed training or evaluation. So what should I do in this setting? I would appreciate your help.
@ShyFoo - I'm afraid I don't know the answer. My idea was to avoid wrapping DALI into a DataLoader, as it already serves as one. If you have time, I would be more than happy to learn about your findings.
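One possible starting point, untested with accelerate, is to let each process build its own DALI pipeline over its own shard instead of relying on the DataLoader being prepared. DALI readers accept `shard_id` and `num_shards`, and a pipeline accepts `device_id`. The sketch below only derives those values from the `RANK`, `WORLD_SIZE`, and `LOCAL_RANK` environment variables that torchrun-style launchers set; the helper name `shard_info_from_env` is made up for illustration, and whether this plays well with the trainer's distributed setup is an open question.

```python
import os


def shard_info_from_env(env=None):
    """Derive per-process sharding values from launcher environment variables.

    Falls back to a single-process setup when the variables are missing.
    """
    env = os.environ if env is None else env
    world_size = int(env.get("WORLD_SIZE", "1"))
    rank = int(env.get("RANK", "0"))
    local_rank = int(env.get("LOCAL_RANK", "0"))
    if not 0 <= rank < world_size:
        raise ValueError(f"rank {rank} is outside world size {world_size}")
    return {"shard_id": rank, "num_shards": world_size, "device_id": local_rank}


# Example with an explicit mapping instead of the real environment.
info = shard_info_from_env({"WORLD_SIZE": "4", "RANK": "2", "LOCAL_RANK": "2"})
```

The resulting `shard_id` and `num_shards` would go to the reader and `device_id` to the pipeline when it is constructed in each process.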
Got it. I have tried using DALI + DeepSpeed, where I have to define the training loop and other utility functions myself instead of relying on the HuggingFace Trainer for convenience, so I would prefer to use DALI, DeepSpeed, and the HuggingFace Trainer together if possible.
If you have any examples of DALI + DeepSpeed, feel free to post them as a PR to DALI - we would be more than happy to enrich our documentation and our collection of samples.
Well, I'd be happy to.