
pipeswitch's People

Contributors

baizh1994, chuheng001


pipeswitch's Issues

How to time 'model.to(device)' correctly?

I am using PyTorch's API in my Python code to measure the time it takes to move different layers of resnet152 to the device (GPU, V100). However, I cannot get a stable result.
Here is my code:

import time

import torch
import torch.nn as nn
import torchvision

device = torch.device('cuda:3' if torch.cuda.is_available() else 'cpu')
model = torchvision.models.resnet152(pretrained=True)

def todevice(_model_, _device_=device):
    t0 = time.perf_counter()
    _model_.to(_device_)
    if _device_.type == 'cuda':
        torch.cuda.synchronize(_device_)  # wait for the async copy to finish
    t1 = time.perf_counter()
    print("model to device %s cost: %s ms" % (_device_, (t1 - t0) * 1000))

model1 = nn.Sequential(*list(model.children())[:6])  # first 6 blocks of resnet152
todevice(model1)

When I run this code at different times, I get different answers, and some of them are ridiculous, up to 200 ms.
Also, there are 4 GPUs in my machine; I don't know whether the extra GPUs affect my result.
Could you tell me how to time model.to(device) correctly?
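A pattern that usually stabilizes such measurements is to synchronize before and after the copy, repeat the transfer, and report the median, since the first run also pays CUDA-context and allocator warm-up costs. A minimal sketch (`time_to_device` is a hypothetical helper, and a toy model stands in for resnet152):

```python
import time

import torch
import torch.nn as nn

def time_to_device(model, device, repeats=5):
    """Median time (ms) to move a model's parameters to `device`."""
    times = []
    for _ in range(repeats):
        model.cpu()  # move back so every run measures a real transfer
        if device.type == 'cuda':
            torch.cuda.synchronize(device)  # drain pending work first
        t0 = time.perf_counter()
        model.to(device)
        if device.type == 'cuda':
            torch.cuda.synchronize(device)  # wait for the copy to finish
        times.append((time.perf_counter() - t0) * 1000)
    return sorted(times)[len(times) // 2]  # median resists outliers

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
toy = nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024))
print("median: %.2f ms" % time_to_device(toy, device))
```

Pinning the process to a single GPU with CUDA_VISIBLE_DEVICES can also remove interference from jobs running on the other cards.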

Which PyTorch version is used for this repo?

I tried to run the pipeswitch section, but hit the error below:
python main.py model_list.txt
TIMESTAMP, frontend, start, 1608555799.825023
Traceback (most recent call last):
File "main.py", line 60, in
main()
File "main.py", line 24, in main
torch.cuda.allocate_shared_cache()
AttributeError: module 'torch.cuda' has no attribute 'allocate_shared_cache'

So which torch version should I use to fix it?

Thx,
Lei
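For what it's worth, allocate_shared_cache is not part of stock PyTorch; it appears to exist only in the modified PyTorch build that PipeSwitch's setup instructions describe, so any pip/conda wheel will raise this AttributeError regardless of version. A quick sanity check (assumes nothing beyond torch being importable):

```python
import torch

# Stock wheels do not define torch.cuda.allocate_shared_cache; only the
# patched PyTorch build from the PipeSwitch instructions does.
print("torch", torch.__version__)
if hasattr(torch.cuda, 'allocate_shared_cache'):
    print("patched build detected: allocate_shared_cache is available")
else:
    print("stock build: rebuild PyTorch from the repo's modified sources")
```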

../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’ CUDACachingAllocator::recordStream(data_ptr, cuda_stream);

Hi, guys:
When I use the overwritten CUDACachingAllocator.cpp to compile PyTorch, I encounter the following error. Can you give me some suggestions? Thank you very much!

[3308/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o 
/usr/bin/c++ -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -Dc10_cuda_EXPORTS -Iaten/src -I../aten/src -I. -I../ -I../cmake/../third_party/benchmark/include -I../cmake/../third_party/cudnn_frontend/include -Icaffe2/contrib/aten -I../third_party/onnx -Ithird_party/onnx -I../third_party/foxi -Ithird_party/foxi -I/usr/local/cuda-10.2/include -I../c10/cuda/../.. -I../c10/.. -isystem third_party/gloo -isystem ../cmake/../third_party/gloo -isystem ../cmake/../third_party/googletest/googlemock/include -isystem ../cmake/../third_party/googletest/googletest/include -isystem ../third_party/protobuf/src -isystem /home/baofu/anaconda3/include -isystem ../third_party/gemmlowp -isystem ../third_party/neon2sse -isystem ../third_party/XNNPACK/include -isystem ../third_party -isystem ../cmake/../third_party/eigen -isystem /home/baofu/anaconda3/include/python3.7m -isystem /home/baofu/anaconda3/lib/python3.7/site-packages/numpy/core/include -isystem ../cmake/../third_party/pybind11/include -isystem /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent -isystem /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include -isystem /usr/lib/openmpi/include -isystem /usr/lib/openmpi/include/openmpi -isystem ../cmake/../third_party/cub -isystem ../third_party/ideep/mkl-dnn/include -isystem ../third_party/ideep/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function 
-Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -O3 -DNDEBUG -DNDEBUG -fPIC -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -DC10_CUDA_BUILD_MAIN_LIB -fvisibility=hidden -std=gnu++14 -MD -MT c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o -MF c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o.d -o c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o -c ../c10/cuda/CUDAStream.cpp
In file included from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDAStream.cpp:2:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/impl/CUDAGuardImpl.h:7,
                 from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDAStream.cpp:2:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3323/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o 
/usr/bin/c++ [same flags as the CUDAStream.cpp command above] -o c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o -c ../c10/cuda/impl/CUDAGuardImpl.cpp
In file included from ../c10/cuda/impl/CUDAGuardImpl.cpp:1:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/impl/CUDAGuardImpl.h:7,
                 from ../c10/cuda/impl/CUDAGuardImpl.cpp:1:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3329/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o 
/usr/bin/c++ [same flags as the CUDAStream.cpp command above] -o c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o -c ../c10/cuda/CUDACachingAllocator.cpp
In file included from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDACachingAllocator.cpp:5:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/CUDACachingAllocator.cpp:3:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3335/6112] Building CXX object third_party/fbgemm/CMakeFiles/fbgemm_avx2.dir/src/FbgemmI8Depthwise3DAvx2.cc.o
ninja: build stopped: subcommand failed.

Is there a Docker image?

Hi,

I am trying to reproduce your results by following the README.

While trying the ready_model step, I hit the error below. Which Python package should I install to fix it?

python ready_model.py resnet152
Traceback (most recent call last):
File "ready_model.py", line 10, in
from experiments.helper import get_model
ModuleNotFoundError: No module named 'experiments'

Thx,
Lei
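Until the package is installed, the usual workaround for this kind of ModuleNotFoundError is to make the repository root visible on the module search path. A sketch (the layout of ready_model.py relative to the `experiments` package is my assumption):

```python
import os
import sys

# Fall back to the working directory when __file__ is absent (e.g. a REPL).
here = globals().get('__file__', os.path.join(os.getcwd(), 'ready_model.py'))
# Assume ready_model.py sits one level below the repository root that
# contains the `experiments` package; adjust if the layout differs.
repo_root = os.path.dirname(os.path.dirname(os.path.abspath(here)))
sys.path.insert(0, repo_root)

# `from experiments.helper import get_model` should now resolve,
# provided `experiments/` really lives in repo_root.
```

Running the script from the repository root with `PYTHONPATH=.` achieves the same thing without editing the file.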

Model-aware grouping code

Hi,

May I know if there is Python code available for the model-aware grouping algorithm (Algorithm 1) mentioned in your paper?
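I have not found it in the repo either. As a starting point, here is a minimal sketch of a greedy size-based grouping; this is my own simplification, not the paper's Algorithm 1, since it only caps each group's parameter bytes and omits the pipeline cost model:

```python
def group_layers(layer_sizes, max_group_bytes):
    """Greedily pack consecutive layers into groups no larger than the cap.

    layer_sizes: per-layer parameter sizes in bytes, in execution order.
    Returns a list of lists of layer indices.
    """
    groups, current, current_bytes = [], [], 0
    for i, size in enumerate(layer_sizes):
        # Close the current group when adding this layer would exceed the cap.
        if current and current_bytes + size > max_group_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(i)
        current_bytes += size
    if current:
        groups.append(current)
    return groups

print(group_layers([4, 4, 4, 10, 2], 8))  # → [[0, 1], [2], [3], [4]]
```

A layer larger than the cap still gets a group of its own, which mirrors the fact that a single layer's transfer cannot be split by grouping alone.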
