
pipeswitch's People

Contributors

baizh1994, chuheng001


pipeswitch's Issues

How to time 'model.to(device)' correctly?

I am using PyTorch's API in my Python code to measure the time it takes to move different layers of resnet152 to the device (GPU, V100). However, I cannot get a stable result.
Here is my code:

import time

import torch
import torch.nn as nn
import torchvision

device = torch.device('cuda:3' if torch.cuda.is_available() else 'cpu')
model = torchvision.models.resnet152(pretrained=True)

def todevice(_model_, _device_=device):
    t0 = time.perf_counter()
    _model_.to(_device_)
    if _device_.type == 'cuda':
        torch.cuda.synchronize(_device_)  # wait for the async copy to finish
    t1 = time.perf_counter()
    print("model to device %s cost: %s ms" % (_device_, (t1 - t0) * 1000))

model1 = nn.Sequential(*list(model.children())[:6])  # first 6 blocks of resnet152
todevice(model1)

When I run this code at different times, I get different answers, and some of them are ridiculous, up to 200 ms.
Also, there are 4 GPUs in my machine; I don't know whether the extra GPUs affect my result.
Could you tell me how to time model.to(device) correctly?
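A pattern that usually stabilizes such measurements is to synchronize before and after the copy, repeat the transfer, and report the median, since the first run also pays CUDA-context and allocator warm-up costs. A minimal sketch (`time_to_device` is a hypothetical helper, and a toy model stands in for resnet152):

```python
import time

import torch
import torch.nn as nn

def time_to_device(model, device, repeats=5):
    """Median time (ms) to move a model's parameters to `device`."""
    times = []
    for _ in range(repeats):
        model.cpu()  # move back so every run measures a real transfer
        if device.type == 'cuda':
            torch.cuda.synchronize(device)  # drain pending work first
        t0 = time.perf_counter()
        model.to(device)
        if device.type == 'cuda':
            torch.cuda.synchronize(device)  # wait for the copy to finish
        times.append((time.perf_counter() - t0) * 1000)
    return sorted(times)[len(times) // 2]  # median resists outliers

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
toy = nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024))
print("median: %.2f ms" % time_to_device(toy, device))
```

Pinning the process to a single GPU with CUDA_VISIBLE_DEVICES can also remove interference from jobs running on the other cards.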

Which PyTorch version is used for this repo?

I tried to run the pipeswitch section, but hit the error below:
python main.py model_list.txt
TIMESTAMP, frontend, start, 1608555799.825023
Traceback (most recent call last):
File "main.py", line 60, in
main()
File "main.py", line 24, in main
torch.cuda.allocate_shared_cache()
AttributeError: module 'torch.cuda' has no attribute 'allocate_shared_cache'

So which torch version should I use to fix it?

Thx,
Lei
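For what it's worth, allocate_shared_cache is not part of stock PyTorch; it appears to exist only in the modified PyTorch build that PipeSwitch's setup instructions describe, so any pip/conda wheel will raise this AttributeError regardless of version. A quick sanity check (assumes nothing beyond torch being importable):

```python
import torch

# Stock wheels do not define torch.cuda.allocate_shared_cache; only the
# patched PyTorch build from the PipeSwitch instructions does.
print("torch", torch.__version__)
if hasattr(torch.cuda, 'allocate_shared_cache'):
    print("patched build detected: allocate_shared_cache is available")
else:
    print("stock build: rebuild PyTorch from the repo's modified sources")
```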

../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’ CUDACachingAllocator::recordStream(data_ptr, cuda_stream);

Hi, guys:
When I use the overwritten CUDACachingAllocator.cpp to compile PyTorch, I encounter the following error. Can you give me some suggestions? Thank you very much!

[3308/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o 
/usr/bin/c++ -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DIDEEP_USE_MKL -DMAGMA_V2 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DTH_BLAS_MKL -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -Dc10_cuda_EXPORTS -Iaten/src -I../aten/src -I. -I../ -I../cmake/../third_party/benchmark/include -I../cmake/../third_party/cudnn_frontend/include -Icaffe2/contrib/aten -I../third_party/onnx -Ithird_party/onnx -I../third_party/foxi -Ithird_party/foxi -I/usr/local/cuda-10.2/include -I../c10/cuda/../.. -I../c10/.. -isystem third_party/gloo -isystem ../cmake/../third_party/gloo -isystem ../cmake/../third_party/googletest/googlemock/include -isystem ../cmake/../third_party/googletest/googletest/include -isystem ../third_party/protobuf/src -isystem /home/baofu/anaconda3/include -isystem ../third_party/gemmlowp -isystem ../third_party/neon2sse -isystem ../third_party/XNNPACK/include -isystem ../third_party -isystem ../cmake/../third_party/eigen -isystem /home/baofu/anaconda3/include/python3.7m -isystem /home/baofu/anaconda3/lib/python3.7/site-packages/numpy/core/include -isystem ../cmake/../third_party/pybind11/include -isystem /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent -isystem /usr/lib/openmpi/include/openmpi/opal/mca/event/libevent2021/libevent/include -isystem /usr/lib/openmpi/include -isystem /usr/lib/openmpi/include/openmpi -isystem ../cmake/../third_party/cub -isystem ../third_party/ideep/mkl-dnn/include -isystem ../third_party/ideep/include -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function 
-Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Werror=cast-function-type -Wno-stringop-overflow -O3 -DNDEBUG -DNDEBUG -fPIC -DCAFFE2_USE_GLOO -DCUDA_HAS_FP16=1 -DHAVE_GCC_GET_CPUID -DUSE_AVX -DUSE_AVX2 -DTH_HAVE_THREAD -DC10_CUDA_BUILD_MAIN_LIB -fvisibility=hidden -std=gnu++14 -MD -MT c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o -MF c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o.d -o c10/cuda/CMakeFiles/c10_cuda.dir/CUDAStream.cpp.o -c ../c10/cuda/CUDAStream.cpp
In file included from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDAStream.cpp:2:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/impl/CUDAGuardImpl.h:7,
                 from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDAStream.cpp:2:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3323/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o 
/usr/bin/c++ [same flags as the CUDAStream.cpp command above] -o c10/cuda/CMakeFiles/c10_cuda.dir/impl/CUDAGuardImpl.cpp.o -c ../c10/cuda/impl/CUDAGuardImpl.cpp
In file included from ../c10/cuda/impl/CUDAGuardImpl.cpp:1:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/impl/CUDAGuardImpl.h:7,
                 from ../c10/cuda/impl/CUDAGuardImpl.cpp:1:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3329/6112] Building CXX object c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o
FAILED: c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o 
/usr/bin/c++ [same flags as the CUDAStream.cpp command above] -o c10/cuda/CMakeFiles/c10_cuda.dir/CUDACachingAllocator.cpp.o -c ../c10/cuda/CUDACachingAllocator.cpp
In file included from ../c10/cuda/CUDAGuard.h:7,
                 from ../c10/cuda/CUDACachingAllocator.cpp:5:
../c10/cuda/impl/CUDAGuardImpl.h: In member function ‘virtual void c10::cuda::impl::CUDAGuardImpl::recordDataPtrOnStream(const c10::DataPtr&, const c10::Stream&) const’:
../c10/cuda/impl/CUDAGuardImpl.h:176:40: error: cannot convert ‘const c10::DataPtr’ to ‘void*’
     CUDACachingAllocator::recordStream(data_ptr, cuda_stream);
                                        ^~~~~~~~
In file included from ../c10/cuda/CUDACachingAllocator.cpp:3:
../c10/cuda/CUDACachingAllocator.h:56:38: note:   initializing argument 1 of ‘void c10::cuda::CUDACachingAllocator::recordStream(void*, c10::cuda::CUDAStream)’
 C10_CUDA_API void recordStream(void *ptr, CUDAStream stream);
                                ~~~~~~^~~
[3335/6112] Building CXX object third_party/fbgemm/CMakeFiles/fbgemm_avx2.dir/src/FbgemmI8Depthwise3DAvx2.cc.o
ninja: build stopped: subcommand failed.

Is there a Docker image?

Hi,

I am trying to reproduce your results by following the README.

While trying the ready_model step, I hit the error below. Which Python package should I install to fix it?

python ready_model.py resnet152
Traceback (most recent call last):
File "ready_model.py", line 10, in
from experiments.helper import get_model
ModuleNotFoundError: No module named 'experiments'

Thx,
Lei
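Until the package is installed, the usual workaround for this kind of ModuleNotFoundError is to make the repository root visible on the module search path. A sketch (the layout of ready_model.py relative to the `experiments` package is my assumption):

```python
import os
import sys

# Fall back to the working directory when __file__ is absent (e.g. a REPL).
here = globals().get('__file__', os.path.join(os.getcwd(), 'ready_model.py'))
# Assume ready_model.py sits one level below the repository root that
# contains the `experiments` package; adjust if the layout differs.
repo_root = os.path.dirname(os.path.dirname(os.path.abspath(here)))
sys.path.insert(0, repo_root)

# `from experiments.helper import get_model` should now resolve,
# provided `experiments/` really lives in repo_root.
```

Running the script from the repository root with `PYTHONPATH=.` achieves the same thing without editing the file.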

Model-aware grouping code

Hi,

May I know if there is Python code available for the model-aware grouping algorithm (Algorithm 1) mentioned in your paper?
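I have not found it in the repo either. As a starting point, here is a minimal sketch of a greedy size-based grouping; this is my own simplification, not the paper's Algorithm 1, since it only caps each group's parameter bytes and omits the pipeline cost model:

```python
def group_layers(layer_sizes, max_group_bytes):
    """Greedily pack consecutive layers into groups no larger than the cap.

    layer_sizes: per-layer parameter sizes in bytes, in execution order.
    Returns a list of lists of layer indices.
    """
    groups, current, current_bytes = [], [], 0
    for i, size in enumerate(layer_sizes):
        # Close the current group when adding this layer would exceed the cap.
        if current and current_bytes + size > max_group_bytes:
            groups.append(current)
            current, current_bytes = [], 0
        current.append(i)
        current_bytes += size
    if current:
        groups.append(current)
    return groups

print(group_layers([4, 4, 4, 10, 2], 8))  # → [[0, 1], [2], [3], [4]]
```

A layer larger than the cap still gets a group of its own, which mirrors the fact that a single layer's transfer cannot be split by grouping alone.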
