
luqqiu / dali


This project forked from nvidia/dali


A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.

Home Page: https://docs.nvidia.com/deeplearning/sdk/dali-developer-guide/index.html

License: Apache License 2.0

CMake 2.36% C++ 64.58% Python 18.88% C 0.98% Cuda 11.84% Shell 1.31% Dockerfile 0.04%

dali's People

Contributors

a-sansanwal, alexbula, awolant, azrael417, banasraf, cclauss, cliffwoolley, drivanov, epaminondas, hartb, jantonguirao, januszl, joehandzik, kh4l, klecki, kychennv, madebyollin, mwbyeon, mzient, pawela, pribalta, ptrendx, sfc-gh-rmaj, slayton58, stiepan, szalpal, thetimemaster, willthefrog, winggan, wojciechpr


dali's Issues

DALI CPU data loader still requires libcuda.so

Following the DALI tutorial, I am able to launch the training in a Docker image with the original training script:

$ nvidia-docker run --ipc=host -it -v /home/ec2-user/data:/data --network=host -v /home/ec2-user/DALI:/DALI nvcr.io/nvidia/pytorch:21.05-py3
$ cd /DALI/docs/examples/use_cases/pytorch/resnet50/
$ python -m torch.distributed.launch --nproc_per_node=1 \
  --nnodes=2 --node_rank=1 \
  --master_addr="ip-172-31-44-53.ec2.internal" --master_port=443 \
  main.py --dali_cpu --arch resnet50 --workers 1 --batch-size 16 --epochs 1 --lr 4.096 /data

I can strip the GPU training logic from the script so that it only reads ImageNet data using DALI on the CPU. The updated script is located here.
The updated script also works with the nvidia-docker command above.

However, when I try to run the DALI data loader with no GPU involved:

$ docker run --ipc=host -it -v /home/ec2-user/data:/data -v /home/ec2-user/DALI:/DALI   nvcr.io/nvidia/pytorch:21.05-py3 bash
$ cd /DALI/docs/examples/use_cases/pytorch/resnet50/
$ python main.py --dali_cpu --arch resnet50 --workers 1 --batch-size 16 --epochs 1 --lr 4.096 /data

The script errors out with:

root@4fd5caa961fc:/DALI/docs/examples/use_cases/pytorch/resnet50# python main.py --dali_cpu --arch resnet50 --workers 1 --batch-size 16 --epochs 1 --lr 4.096 /data
dali device is cpu, decoder device is cpu
dlopen "libcuda.so" failed!
Traceback (most recent call last):
  File "main.py", line 291, in <module>
    main()
  File "main.py", line 186, in main
    pipe.build()
  File "/opt/conda/lib/python3.8/site-packages/nvidia/dali/pipeline.py", line 657, in build
    self._init_pipeline_backend()
  File "/opt/conda/lib/python3.8/site-packages/nvidia/dali/pipeline.py", line 562, in _init_pipeline_backend
    self._pipe = b.Pipeline(self._max_batch_size,
RuntimeError: [/opt/dali/dali/core/device_guard.cc:33] Assert on "cuInitChecked()" failed: Failed to load libcuda.so. Check your library paths and if the driver is installed correctly.
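
The "dlopen" message above can be checked independently of DALI. The following is a minimal sketch, assuming only the Python standard library; the function name `cuda_driver_loadable` and the candidate file names are illustrative and not part of DALI. It tries to load the CUDA driver library the same way a dlopen call would, which may help confirm whether the container simply has no driver library on its library path.

```python
import ctypes


def cuda_driver_loadable(names=("libcuda.so", "libcuda.so.1")):
    """Return True if any of the given library names can be loaded.

    `names` is an assumption for illustration: the traceback mentions
    "libcuda.so"; "libcuda.so.1" is the usual versioned name on Linux.
    """
    for name in names:
        try:
            ctypes.CDLL(name)  # wraps dlopen on Linux
            return True
        except OSError:
            # Library not found or not loadable; try the next candidate.
            continue
    return False


if __name__ == "__main__":
    print("CUDA driver library loadable:", cuda_driver_loadable())
```

Running this inside both containers (the one started with nvidia-docker and the one started with plain docker) should show whether the difference is the presence of the driver library rather than anything in the training script.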

My goal is to run the DALI data loader on an instance without a GPU, in a Docker image without GPU support.
My questions are:

  • Is a GPU instance needed to run the DALI data loader, even with --dali_cpu?
  • Can the DALI data loader run in Docker containers started without the NVIDIA runtime (plain docker run instead of nvidia-docker)?
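
One thing that may be worth checking for these questions: the DALI documentation describes `device_id=None` on `Pipeline` as a request to use neither the GPU nor the CUDA runtime. The sketch below is an assumption-laden illustration, not a confirmed fix. It has not been verified that main.py passes an integer `device_id` when `--dali_cpu` is set, nor that the DALI version in the 21.05 image supports CPU-only mode. The helper `pipeline_kwargs` is hypothetical; it only builds the constructor arguments and does not import DALI.

```python
def pipeline_kwargs(batch_size, num_threads, driver_loadable, gpu_id=0):
    """Build keyword arguments for a DALI Pipeline constructor.

    Hypothetical helper for illustration. When the CUDA driver is not
    loadable, device_id is set to None, which the DALI documentation
    describes as CPU-only mode (only CPU operators may then be used).
    """
    device_id = gpu_id if driver_loadable else None
    return {
        "batch_size": batch_size,
        "num_threads": num_threads,
        "device_id": device_id,
    }


if __name__ == "__main__":
    # Values taken from the command line in the report: batch size 16, 1 worker.
    print(pipeline_kwargs(batch_size=16, num_threads=1, driver_loadable=False))
```

The resulting dictionary could be passed as `Pipeline(**kwargs)` or to a pipeline-definition call in the script. Every operator in the pipeline would also have to run on the CPU for this to apply.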
