keyu-tian / spark

[ICLR'23 Spotlight🔥] The first successful BERT/MAE-style pretraining on any convolutional network; PyTorch implementation of "Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling"

Home Page: https://arxiv.org/abs/2301.03580

License: MIT License

Python 55.50% Jupyter Notebook 44.50%
bert convnet convolutional-neural-networks masked-image-modeling pre-trained-model self-supervised-learning sparse-convolution ssl cnn iclr

spark's Introduction

SparK: the first successful BERT/MAE-style pretraining on any convolutional network

This is the official implementation of the ICLR paper Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling, which can pretrain any CNN (e.g., ResNet) in a BERT-style self-supervised manner. We have tried our best to keep the codebase clean, short, easy to read, state-of-the-art, and reliant on only minimal dependencies.

(Demo video: SparK_demo_22s_4k_wo_bages.1.mp4)


🔥 News

🕹️ Colab Visualization Demo

Check pretrain/viz_reconstruction.ipynb to visualize the reconstructions of SparK-pretrained models.

We also provide pretrain/viz_spconv.ipynb that shows the "mask pattern vanishing" issue of dense conv layers.
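If you just want the gist without opening the notebook, here is a minimal, self-contained sketch (illustrative only, not code from this repo) of what "mask pattern vanishing" means: a plain dense conv lets values from unmasked positions bleed into masked ones, while re-applying the mask after the conv, roughly what SparK's sparse wrappers arrange, keeps the pattern intact.

# Illustrative sketch of "mask pattern vanishing" (not repo code).
import torch
import torch.nn as nn

x = torch.randn(1, 3, 8, 8)                      # a tiny feature map
mask = (torch.rand(1, 1, 8, 8) > 0.6).float()    # 1 = kept, 0 = masked
conv = nn.Conv2d(3, 3, kernel_size=3, padding=1, bias=False)

dense_out = conv(x * mask)                       # plain dense conv on the masked input
leak = (dense_out.abs().sum(1, keepdim=True)[mask == 0] > 0).float().mean()
print(leak)                                      # close to 1.0: masked positions got filled in

masked_out = conv(x * mask) * mask               # re-apply the mask after the conv
leak = (masked_out.abs().sum(1, keepdim=True)[mask == 0] > 0).float().mean()
print(leak)                                      # 0.0: the mask pattern survives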

What's new here?

🔥 Pretrained CNN beats pretrained Swin-Transformer:

🔥 After SparK pretraining, smaller models can beat un-pretrained larger models:

🔥 All models can benefit, showing a scaling behavior:

🔥 Generative self-supervised pretraining surpasses contrastive learning:

See our paper for more analysis, discussions, and evaluations.

Todo list

catalog

Pretrained weights (self-supervised; w/o decoder; can be directly finetuned)

Note: for network definitions, we directly use timm.models.ResNet and official ConvNeXt.

reso.: the input image resolution; acc@1: ImageNet-1K fine-tuned top-1 accuracy

| arch. | reso. | acc@1 | #params | FLOPs | weights (self-supervised, without SparK's decoder) |
|---|---|---|---|---|---|
| ResNet50 | 224 | 80.6 | 26M | 4.1G | resnet50_1kpretrained_timm_style.pth |
| ResNet101 | 224 | 82.2 | 45M | 7.9G | resnet101_1kpretrained_timm_style.pth |
| ResNet152 | 224 | 82.7 | 60M | 11.6G | resnet152_1kpretrained_timm_style.pth |
| ResNet200 | 224 | 83.1 | 65M | 15.1G | resnet200_1kpretrained_timm_style.pth |
| ConvNeXt-S | 224 | 84.1 | 50M | 8.7G | convnextS_1kpretrained_official_style.pth |
| ConvNeXt-B | 224 | 84.8 | 89M | 15.4G | convnextB_1kpretrained_official_style.pth |
| ConvNeXt-L | 224 | 85.4 | 198M | 34.4G | convnextL_1kpretrained_official_style.pth |
| ConvNeXt-L | 384 | 86.0 | 198M | 101.0G | convnextL_384_1kpretrained_official_style.pth |
Pretrained weights (with SparK's UNet-style decoder; can be used to reconstruct images)

| arch. | reso. | acc@1 | #params | FLOPs | weights (self-supervised, with SparK's decoder) |
|---|---|---|---|---|---|
| ResNet50 | 224 | 80.6 | 26M | 4.1G | res50_withdecoder_1kpretrained_spark_style.pth |
| ResNet101 | 224 | 82.2 | 45M | 7.9G | res101_withdecoder_1kpretrained_spark_style.pth |
| ResNet152 | 224 | 82.7 | 60M | 11.6G | res152_withdecoder_1kpretrained_spark_style.pth |
| ResNet200 | 224 | 83.1 | 65M | 15.1G | res200_withdecoder_1kpretrained_spark_style.pth |
| ConvNeXt-S | 224 | 84.1 | 50M | 8.7G | cnxS224_withdecoder_1kpretrained_spark_style.pth |
| ConvNeXt-L | 384 | 86.0 | 198M | 101.0G | cnxL384_withdecoder_1kpretrained_spark_style.pth |

Installation & Running

We highly recommend using torch==1.10.0, torchvision==0.11.1, and timm==0.5.4 for reproduction. Check INSTALL.md to install all pip dependencies.

  • Loading pretrained model weights in 3 lines
# download our weights `resnet50_1kpretrained_timm_style.pth` first
import torch, timm
res50, state = timm.create_model('resnet50'), torch.load('resnet50_1kpretrained_timm_style.pth', 'cpu')
res50.load_state_dict(state.get('module', state), strict=False)     # just in case the model weights are actually saved in state['module']
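
With strict=False, load_state_dict also returns the missing and unexpected keys, which makes for a quick sanity check. Below is a hedged variant for setting the model up for fine-tuning; num_classes=1000 is only an example value, and expecting only the classifier head to be missing is an assumption for these self-supervised weights.

# Hypothetical fine-tuning setup; num_classes=1000 is only an example value.
import torch, timm

state = torch.load('resnet50_1kpretrained_timm_style.pth', map_location='cpu')
res50 = timm.create_model('resnet50', num_classes=1000)   # fresh classification head
missing, unexpected = res50.load_state_dict(state.get('module', state), strict=False)
print('missing keys   :', missing)      # typically just the classifier head (e.g. fc.weight, fc.bias)
print('unexpected keys:', unexpected)   # ideally empty for the timm-style weights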

Acknowledgement

We referred to these useful codebases:

License

This project is under the MIT license. See LICENSE for more details.

Citation

If you find this project useful, please consider giving us a star ⭐ or citing us in your work 📖:

@Article{tian2023designing,
  author  = {Keyu Tian and Yi Jiang and Qishuai Diao and Chen Lin and Liwei Wang and Zehuan Yuan},
  title   = {Designing BERT for Convolutional Networks: Sparse and Hierarchical Masked Modeling},
  journal = {arXiv:2301.03580},
  year    = {2023},
}

spark's People

Contributors

ifighting, keyu-tian


spark's Issues

About reproducing MAE with ViT-S

I notice that you reproduce the MAE result with ViT-S using the official code, but the official code does not include a decoder configuration for ViT-S.
How many decoder layers did you use for ViT-S, and what were the decoder_dim and num_heads?

How low should the loss get before the weights can be used as pretrained weights?

I changed the 2D operations to use a height of 1 so that they fit 1D time-series signals. I previously tried a 1D ViT MAE, which finished pretraining and could be fine-tuned with decent results. My current question is: how far does the loss here need to drop? I switched to an MSE loss, and when it reached 0.0002 the reconstructions visualized with those weights looked very poor. I noticed that min_loss is set to 1e-9 in your code. Could you share roughly what value the loss converged to before you stopped training? Thanks.

error in downstream_imagenet

Thanks for your great work! However, I ran into an error in downstream_imagenet.
I ran the command '~/SparK/downstream_imagenet$ bash ./main.sh exp1 --data_path=/home/users/data --model=resnet50 --resume_from=/home/users/SparK/pretrain/output/resnet50_1kpretrained.pth --bs=32'

and the following error occurred:
[05-17 12:28:48] (nstream_imagenet/main.py, line 49)=> [FT start] from ep0
[05-17 12:28:48] (nstream_imagenet/main.py, line 58)=> [loader_train.sampler.set_epoch(0)]
Traceback (most recent call last):
File "/home/users/SparK/downstream_imagenet/main.py", line 189, in
main_ft()
File "/home/users/SparK/downstream_imagenet/main.py", line 60, in main_ft
train_loss, train_acc = fine_tune_one_epoch(ep, args, tb_lg, loader_train, iters_train, criterion, mixup_fn, model, model_ema, optimizer, params_req_grad)
File "/home/users/SparK/downstream_imagenet/main.py", line 129, in fine_tune_one_epoch
inp, tar = mixup_fn(inp, tar)
ValueError: too many values to unpack (expected 2)
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name=3 encoding='UTF-8'>
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 3006233) of binary: /usr/bin/python3

Modified to 3D network

Hi, I am very interested in your research and want to use it on a 3D dataset. Which code and which files need to be modified to adapt it to a 3D network? I would appreciate your reply.

The finetuning weights on imagenet

Thanks for your wonderful work! The tutorial is quite friendly for reproduction, but I also wonder whether it would be possible to get the fine-tuned weights on ImageNet?

Contrastive learning methods performance in paper

Thanks for your great work! Do the contrastive learning methods' results on ImageNet refer to fine-tuning results? If so, since those papers only report linear-evaluation results, where did you get the fine-tuning scores?

How to install Tap?

I installed Tap via pip3 install tap, but ran into the following problem:

from tap import Tap
ImportError: cannot import name 'Tap' from 'tap' 
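
In case it helps other readers: the Tap class comes from the typed-argument-parser package, not from the PyPI package named tap, so the usual fix (assuming nothing else named tap shadows it in your environment) is:

pip3 uninstall tap
pip3 install typed-argument-parser

after which from tap import Tap should resolve.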

Finetuning epochs

Dear authors,

Regarding U2 from reviewer 71p6: "The sparse convolution seems cannot largely speed up the training process like MAE for ViT and the model requires 300 epoch fine-tuning that is the same as the configuration of training from scratch."

I think that the reviewer had in mind that for MAE, the authors only fine-tuned for 100 (ViT-B) and 50 (ViT-L/H) epochs (MAE paper, A.1, Table 9). The supervised models that were trained from scratch in the MAE paper were trained for 300 epochs (MAE paper, A.2, Table 11).

SparK was fine-tuned for 300 epochs, which is the same configuration as training supervised ViTs from scratch. MAE thus achieved comparable performance with 3-6x fewer fine-tuning epochs.

Can you also report your results with the same fine-tuning configuration (50/100 epochs) as used in MAE? Only then can the comparison be fair.

Comment from: https://openreview.net/forum?id=NRxydtWup1S

How do I know the gains come from pretraining rather than from longer fine-tuning?

I notice that you set the fine-tuning epochs to 200 or 400.

'convnext_base': (4096, 400, 20, 'adam', 0.0001, 0.7, 0.01, 0.8, 3, 0.4, 0.9999),

However,

  • the standard supervised training only runs for 300 epochs (w/o 1600 or 800 pretraining epochs).
  • in MIM works (e.g. MAE and ConvNextV2), they typically finetune 100 epochs.

Did you try a 100-epoch schedule? Could you also kindly share the results under such a setting?

Pretraining with my own dataset

Hi, thanks for your work. Whenever I try to pretrain on my own dataset, the following error happens:

torchrun --data_path=/home/user/augdata --exp_name=ptaugdata --exp_dir=/home/user/models
usage: torchrun [-h] [--nnodes NNODES] [--nproc-per-node NPROC_PER_NODE] [--rdzv-backend RDZV_BACKEND] [--rdzv-endpoint RDZV_ENDPOINT] [--rdzv-id RDZV_ID] [--rdzv-conf RDZV_CONF] [--standalone]
[--max-restarts MAX_RESTARTS] [--monitor-interval MONITOR_INTERVAL] [--start-method {spawn,fork,forkserver}] [--role ROLE] [-m] [--no-python] [--run-path] [--log-dir LOG_DIR]
[-r REDIRECTS] [-t TEE] [--node-rank NODE_RANK] [--master-addr MASTER_ADDR] [--master-port MASTER_PORT] [--local-addr LOCAL_ADDR]
training_script ...
torchrun: error: the following arguments are required: training_script, training_script_args

Do I need to specify training_script and training_script_args?
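
Yes: torchrun itself only takes launcher flags, and the training script plus its arguments must follow them. Following the pattern of the torchrun command quoted in a later issue here (GPU count, port, and paths below are placeholders), something like this should work:

torchrun --nproc_per_node=1 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=23456 pretrain/main.py --data_path=/home/user/augdata --exp_name=ptaugdata --exp_dir=/home/user/models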

Lr layer decay

Thanks for your excellent work! I noticed that layer-wise LR decay is not used for ImageNet fine-tuning, but it is used for detection. Why not use layer decay as in transformer fine-tuning? What effect would adopting layer decay have on ImageNet fine-tuning?

A question about data leakage.

Hi @keyu-tian! Thank you for proposing this amazing work. I have a question about data leakage. Here is my situation:
I have an unlabeled dataset of about 200k images. Can I use the whole dataset to pretrain a backbone, and then use that backbone for detection on the same dataset (a small labeled subset of it)? Would that cause a data-leakage issue, since SparK training may have seen images that end up in the validation set?
My final goal is to count the objects in the whole 200k-image dataset via object detection.

How do you adjust lr and wd?

Hi,

Thanks so much for your great work!
I am confused about how you adjust the lr and weight decay. In pretrain/utils/lr_control.py (lr_wd_annealing), it looks to me as if the lr and wd values are only read, not changed.

Thank you!
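
For context, and only as an illustration rather than the repo's actual lr_control.py: schedulers of this kind usually take effect by writing the computed values back into optimizer.param_groups on every iteration, so the lr and wd do change even though no torch scheduler object is involved. A generic sketch:

# Generic cosine annealing sketch (illustrative, not the repo's implementation).
import math

def anneal_lr_wd(optimizer, step, total_steps, peak_lr, final_lr, peak_wd, final_wd):
    t = step / max(1, total_steps - 1)
    cos = 0.5 * (1.0 + math.cos(math.pi * t))   # 1 -> 0 over training
    lr = final_lr + (peak_lr - final_lr) * cos
    wd = final_wd + (peak_wd - final_wd) * cos
    for group in optimizer.param_groups:        # the adjustment happens here, in place
        group['lr'] = lr
        group['weight_decay'] = wd
    return lr, wd

Calling such a function once per iteration, before optimizer.step(), is the usual pattern.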

How to convert a sparse model to a dense model?

After finishing pretraining ResNet-50, we get a ResNet-50 weight file in the sparse form. I'd like to use a (dense) ResNet-50 as the backbone in my other projects. How can I convert the sparse model to a dense one? Is there a convenient function like SparseEncoder.dense_model_to_sparse in encoder.py, but in the reverse direction?
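
Two hedged pointers, based only on what is written elsewhere in this README and in these issues. First, the released checkpoints in the first table above are already timm-style and load directly into a dense timm ResNet (see the three-line snippet in the README). Second, for a checkpoint saved by your own pretraining run, the key names quoted in other issues here (sparse_encoder.sp_cnn.*) suggest that stripping that prefix recovers a dense-style state dict; the path and prefix below are assumptions, so verify them against your own file:

# Hypothetical sparse-to-dense export; the prefix and path are assumptions, check them against your checkpoint.
import torch, timm

ckpt = torch.load('resnet50_1kpretrained.pth', map_location='cpu')
state = ckpt.get('module', ckpt)

prefix = 'sparse_encoder.sp_cnn.'
dense_state = {k[len(prefix):]: v for k, v in state.items() if k.startswith(prefix)}

dense_res50 = timm.create_model('resnet50')
missing, unexpected = dense_res50.load_state_dict(dense_state, strict=False)
print('missing:', missing)          # ideally only the classifier head
print('unexpected:', unexpected)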

Pretrained weights on ResNet18?

Hi,

Thanks so much for your great work!
I wonder if you could publish pretrained weights for ResNet-18, for memory-restricted settings?

Thank you!

downstream_imagenet fine-tuning question

I was running downstream_imagenet code like this:
bash ./main.sh exp1 --data_path=/home/users/datacopy --model=resnet50 --resume_from=/home/users/SparK/pretrain/output_pretraining/resnet50_1kpretrained.pth --bs=16

and the error happened like this:

[05-17 02:23:25] (nstream_imagenet/main.py, line 48)=> [FT start] ep_eval=[0, 5, 10, ..., 95, 100, 101, 102, ..., 298, 299]
[05-17 02:23:25] (nstream_imagenet/main.py, line 49)=> [FT start] from ep0
[05-17 02:23:25] (nstream_imagenet/main.py, line 58)=> [loader_train.sampler.set_epoch(0)]
[05-17 02:23:44] (nstream_imagenet/main.py, line 165)=> [ep0 it 3/375] L: 0.7013 Acc: 0.00 lr: 6.8e-08 ~ 8.2e-07 Remain: 0:28:38
[05-17 02:24:14] (nstream_imagenet/main.py, line 165)=> [ep0 it187/375] L: 0.6930 Acc: 0.00 lr: 1.1e-06 ~ 1.3e-05 Remain: 0:00:48
Traceback (most recent call last):
File "/home//downstream_imagenet/main.py", line 189, in
main_ft()
File "/home/downstream_imagenet/main.py", line 60, in main_ft
train_loss, train_acc = fine_tune_one_epoch(ep, args, tb_lg, loader_train, iters_train, criterion, mixup_fn, model, model_ema, optimizer, params_req_grad)
File "/home/users/SparK/downstream_imagenet/main.py", line 129, in fine_tune_one_epoch
inp, tar = mixup_fn(inp, tar)
File "/home/users/.local/lib/python3.10/site-packages/timm/data/mixup.py", line 210, in call
assert len(x) % 2 == 0, 'Batch size should be even when using this'
AssertionError: Batch size should be even when using this
sys:1: ResourceWarning: unclosed file <_io.TextIOWrapper name=3 encoding='UTF-8'>
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2986007) of binary: /usr/bin/python3
Traceback (most recent call last):
File "/home/users/.local/bin/torchrun", line 8, in
sys.exit(main())
File "/home/users/.local/lib/python3.10/site-packages/torch/distributed/elastic/multiprocessing/errors/init.py", line 346, in wrapper
return f(*args, **kwargs)
File "/home/users/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 794, in main
run(args)
File "/home/users/.local/lib/python3.10/site-packages/torch/distributed/run.py", line 785, in run
elastic_launch(
File "/home/users/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 134, in call
return launch_agent(self._config, self._entrypoint, list(args))
File "/home/users/.local/lib/python3.10/site-packages/torch/distributed/launcher/api.py", line 250, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:

============================================================
main.py FAILED
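
A note on the assertion, as a guess from the traceback rather than an official answer: timm's Mixup mixes samples pair-wise and therefore requires an even batch, and a common way to hit an odd batch is the last, partial batch of an epoch (or an odd per-GPU batch after the global batch is split across ranks). Keeping the per-GPU batch size even and dropping incomplete batches avoids it; a minimal sketch with placeholder dataset and values:

# Sketch: guarantee even, complete batches for timm's pair-wise Mixup (illustrative values).
import torchvision.transforms as T
from torchvision.datasets import FakeData
from torch.utils.data import DataLoader, DistributedSampler

dataset = FakeData(size=64, image_size=(3, 224, 224), num_classes=10, transform=T.ToTensor())
# num_replicas/rank are hard-coded only so this sketch runs standalone
sampler = DistributedSampler(dataset, num_replicas=1, rank=0, shuffle=True, drop_last=True)
loader_train = DataLoader(dataset, batch_size=32, sampler=sampler,
                          drop_last=True)   # no odd-sized leftover batch reaches Mixup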

What should I do when the input shape is a rectangle?

Thank you for your excellent work. I want to do some work on an audio classification task, and I found it impossible to keep the input shape square. What should I do when the input tensor is a rectangle, like (128, 600)?
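
One thing worth checking, as an assumption rather than an official answer: the masking is defined on the grid of the most-downsampled feature map, so both input sides should at least be divisible by the network's total stride (32 for ResNet). For a 128x600 input, that means padding or cropping the time axis to a multiple of 32, e.g. 608:

# Sketch: pad a rectangular spectrogram so both sides divide the total stride (assuming stride 32).
import torch
import torch.nn.functional as F

x = torch.randn(1, 1, 128, 600)          # (batch, channels, freq, time)
stride = 32
pad_h = (-x.shape[-2]) % stride          # 0 here, since 128 is already a multiple of 32
pad_w = (-x.shape[-1]) % stride          # 8, so the time axis becomes 608
x = F.pad(x, (0, pad_w, 0, pad_h))
print(x.shape, x.shape[-2] // stride, x.shape[-1] // stride)  # the mask grid would be 4 x 19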

The time of training resnet50

Thanks for your wonderful work! I find the training time of ResNet-50 on a V100 to be very long. Could you share the training log for ResNet-50, including the training time? Thanks!

a small model such as Mobilenet v2 for pre-training

Thank you for your excellent work. Replacing the transformer with a CNN does make deployment friendlier. Furthermore, I'm wondering whether using a smaller model such as MobileNetV2 for pre-training and then fine-tuning downstream would be effective?

torch.distributed.elastic.multiprocessing.api.SignalException

Hi,

Thanks for your great work!

I trained with your code but always got the above exception after training for 1 or 2 hours.
I searched and found it was because the terminal window was closed, even though the model was running in the background with nohup.

Have you met the same problem? How did you train your models in the background?

Thank you!

Self-supervised training time is too long compared to MAE

Thanks for your work. We compared SparK with MAE (ConvNeXt-Base vs. Swin-Base); the training time of SparK is about 6.5 times that of MAE. Is there any way to improve training efficiency? Is the lengthy training time caused by insufficient hardware optimization of sparse convolution?

q

Hello, I don't know whether you have done any research involving NLP. I want to ask whether there is a better pre-training model now, or how BERT could be improved.

The finetuning hyperparameters of resnet50

The hyperparameter settings (batch size and learning rate) in the paper seem inconsistent with those in the code. Which ones reproduce the performance (80.6 acc@1) reported in the paper?

Question about sparse convolution

Hello, after reading the code I found that sp_conv_forward uses returning_active_ex=True. Does that mean no sparse computation is performed in the convolution itself, and the sparse computation happens only in the BatchNorm and LayerNorm layers?
Thanks for your work!

Training problem

Did you encounter this problem during training?
(screenshot attached in the original issue)

How to fine-tune ResNet on my own dataset?

Hi! I find that I can pretrain ResNet on my own dataset by replacing the function build_dataset_to_pretrain and running /pretrain/main.py; I can also fine-tune ResNet on ImageNet by running /downstream_imagenet/main.py. But how can I fine-tune ResNet on my own dataset? When I try to run /pretrain/main.py with --resume_from=/mypath-to-res50_withdecoder_1kpretrained_spark_style.pth, I get an error like:
File "/workspace/user_code/SparK/pretrain/utils/misc.py", line 180, in load_checkpoint
missing, unexpected = model_without_ddp.load_state_dict(checkpoint['module'], strict=False)
KeyError: 'module'
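
A hedged observation based only on this traceback: load_checkpoint indexes checkpoint['module'], but the checkpoint being resumed apparently does not nest its weights under a 'module' entry. The README's own loading snippet guards against exactly this with state.get('module', state); inspecting the checkpoint first shows which case you are in (the path below is the one from the issue):

# Inspect how the checkpoint is organized before resuming (path taken from the issue above).
import torch

ckpt = torch.load('res50_withdecoder_1kpretrained_spark_style.pth', map_location='cpu')
print(list(ckpt.keys())[:10])        # is there a top-level 'module' entry, or a flat state dict?
state = ckpt.get('module', ckpt)     # the same guard used in the README's loading snippet
print(len(state), 'entries in the state dict')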

vis_reconstruction.py question

File "E:/code/SparK-main/SparK-main-latest/pretrain/vis_reconstruction.py", line 55, in build_spark
assert len(missing) == 0, f'load_state_dict missing keys: {missing}'
AssertionError: load_state_dict missing keys: ['imn_m', 'imn_s', 'norm_black', 'sparse_encoder.sp_cnn.conv1.weight', 'sparse_encoder.sp_cnn.bn1.weight', 'sparse_encoder.sp_cnn.bn1.bias', 'sparse_encoder.sp_cnn.bn1.running_mean', 'sparse_encoder.sp_cnn.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv1.weight', 'sparse_encoder.sp_cnn.layer1.0.bn1.weight', 'sparse_encoder.sp_cnn.layer1.0.bn1.bias', 'sparse_encoder.sp_cnn.layer1.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv2.weight', 'sparse_encoder.sp_cnn.layer1.0.bn2.weight', 'sparse_encoder.sp_cnn.layer1.0.bn2.bias', 'sparse_encoder.sp_cnn.layer1.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.0.conv3.weight', 'sparse_encoder.sp_cnn.layer1.0.bn3.weight', 'sparse_encoder.sp_cnn.layer1.0.bn3.bias', 'sparse_encoder.sp_cnn.layer1.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer1.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer1.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv1.weight', 'sparse_encoder.sp_cnn.layer1.1.bn1.weight', 'sparse_encoder.sp_cnn.layer1.1.bn1.bias', 'sparse_encoder.sp_cnn.layer1.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv2.weight', 'sparse_encoder.sp_cnn.layer1.1.bn2.weight', 'sparse_encoder.sp_cnn.layer1.1.bn2.bias', 'sparse_encoder.sp_cnn.layer1.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.1.conv3.weight', 'sparse_encoder.sp_cnn.layer1.1.bn3.weight', 'sparse_encoder.sp_cnn.layer1.1.bn3.bias', 'sparse_encoder.sp_cnn.layer1.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv1.weight', 'sparse_encoder.sp_cnn.layer1.2.bn1.weight', 'sparse_encoder.sp_cnn.layer1.2.bn1.bias', 'sparse_encoder.sp_cnn.layer1.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv2.weight', 'sparse_encoder.sp_cnn.layer1.2.bn2.weight', 'sparse_encoder.sp_cnn.layer1.2.bn2.bias', 'sparse_encoder.sp_cnn.layer1.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer1.2.conv3.weight', 'sparse_encoder.sp_cnn.layer1.2.bn3.weight', 'sparse_encoder.sp_cnn.layer1.2.bn3.bias', 'sparse_encoder.sp_cnn.layer1.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer1.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv1.weight', 'sparse_encoder.sp_cnn.layer2.0.bn1.weight', 'sparse_encoder.sp_cnn.layer2.0.bn1.bias', 'sparse_encoder.sp_cnn.layer2.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv2.weight', 'sparse_encoder.sp_cnn.layer2.0.bn2.weight', 'sparse_encoder.sp_cnn.layer2.0.bn2.bias', 'sparse_encoder.sp_cnn.layer2.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.0.conv3.weight', 'sparse_encoder.sp_cnn.layer2.0.bn3.weight', 'sparse_encoder.sp_cnn.layer2.0.bn3.bias', 'sparse_encoder.sp_cnn.layer2.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.bias', 
'sparse_encoder.sp_cnn.layer2.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer2.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv1.weight', 'sparse_encoder.sp_cnn.layer2.1.bn1.weight', 'sparse_encoder.sp_cnn.layer2.1.bn1.bias', 'sparse_encoder.sp_cnn.layer2.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv2.weight', 'sparse_encoder.sp_cnn.layer2.1.bn2.weight', 'sparse_encoder.sp_cnn.layer2.1.bn2.bias', 'sparse_encoder.sp_cnn.layer2.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.1.conv3.weight', 'sparse_encoder.sp_cnn.layer2.1.bn3.weight', 'sparse_encoder.sp_cnn.layer2.1.bn3.bias', 'sparse_encoder.sp_cnn.layer2.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv1.weight', 'sparse_encoder.sp_cnn.layer2.2.bn1.weight', 'sparse_encoder.sp_cnn.layer2.2.bn1.bias', 'sparse_encoder.sp_cnn.layer2.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv2.weight', 'sparse_encoder.sp_cnn.layer2.2.bn2.weight', 'sparse_encoder.sp_cnn.layer2.2.bn2.bias', 'sparse_encoder.sp_cnn.layer2.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.2.conv3.weight', 'sparse_encoder.sp_cnn.layer2.2.bn3.weight', 'sparse_encoder.sp_cnn.layer2.2.bn3.bias', 'sparse_encoder.sp_cnn.layer2.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv1.weight', 'sparse_encoder.sp_cnn.layer2.3.bn1.weight', 'sparse_encoder.sp_cnn.layer2.3.bn1.bias', 'sparse_encoder.sp_cnn.layer2.3.bn1.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn1.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv2.weight', 'sparse_encoder.sp_cnn.layer2.3.bn2.weight', 'sparse_encoder.sp_cnn.layer2.3.bn2.bias', 'sparse_encoder.sp_cnn.layer2.3.bn2.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn2.running_var', 'sparse_encoder.sp_cnn.layer2.3.conv3.weight', 'sparse_encoder.sp_cnn.layer2.3.bn3.weight', 'sparse_encoder.sp_cnn.layer2.3.bn3.bias', 'sparse_encoder.sp_cnn.layer2.3.bn3.running_mean', 'sparse_encoder.sp_cnn.layer2.3.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv1.weight', 'sparse_encoder.sp_cnn.layer3.0.bn1.weight', 'sparse_encoder.sp_cnn.layer3.0.bn1.bias', 'sparse_encoder.sp_cnn.layer3.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv2.weight', 'sparse_encoder.sp_cnn.layer3.0.bn2.weight', 'sparse_encoder.sp_cnn.layer3.0.bn2.bias', 'sparse_encoder.sp_cnn.layer3.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.0.conv3.weight', 'sparse_encoder.sp_cnn.layer3.0.bn3.weight', 'sparse_encoder.sp_cnn.layer3.0.bn3.bias', 'sparse_encoder.sp_cnn.layer3.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer3.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv1.weight', 'sparse_encoder.sp_cnn.layer3.1.bn1.weight', 'sparse_encoder.sp_cnn.layer3.1.bn1.bias', 'sparse_encoder.sp_cnn.layer3.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv2.weight', 
'sparse_encoder.sp_cnn.layer3.1.bn2.weight', 'sparse_encoder.sp_cnn.layer3.1.bn2.bias', 'sparse_encoder.sp_cnn.layer3.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.1.conv3.weight', 'sparse_encoder.sp_cnn.layer3.1.bn3.weight', 'sparse_encoder.sp_cnn.layer3.1.bn3.bias', 'sparse_encoder.sp_cnn.layer3.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv1.weight', 'sparse_encoder.sp_cnn.layer3.2.bn1.weight', 'sparse_encoder.sp_cnn.layer3.2.bn1.bias', 'sparse_encoder.sp_cnn.layer3.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv2.weight', 'sparse_encoder.sp_cnn.layer3.2.bn2.weight', 'sparse_encoder.sp_cnn.layer3.2.bn2.bias', 'sparse_encoder.sp_cnn.layer3.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.2.conv3.weight', 'sparse_encoder.sp_cnn.layer3.2.bn3.weight', 'sparse_encoder.sp_cnn.layer3.2.bn3.bias', 'sparse_encoder.sp_cnn.layer3.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.2.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv1.weight', 'sparse_encoder.sp_cnn.layer3.3.bn1.weight', 'sparse_encoder.sp_cnn.layer3.3.bn1.bias', 'sparse_encoder.sp_cnn.layer3.3.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv2.weight', 'sparse_encoder.sp_cnn.layer3.3.bn2.weight', 'sparse_encoder.sp_cnn.layer3.3.bn2.bias', 'sparse_encoder.sp_cnn.layer3.3.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.3.conv3.weight', 'sparse_encoder.sp_cnn.layer3.3.bn3.weight', 'sparse_encoder.sp_cnn.layer3.3.bn3.bias', 'sparse_encoder.sp_cnn.layer3.3.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.3.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv1.weight', 'sparse_encoder.sp_cnn.layer3.4.bn1.weight', 'sparse_encoder.sp_cnn.layer3.4.bn1.bias', 'sparse_encoder.sp_cnn.layer3.4.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv2.weight', 'sparse_encoder.sp_cnn.layer3.4.bn2.weight', 'sparse_encoder.sp_cnn.layer3.4.bn2.bias', 'sparse_encoder.sp_cnn.layer3.4.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.4.conv3.weight', 'sparse_encoder.sp_cnn.layer3.4.bn3.weight', 'sparse_encoder.sp_cnn.layer3.4.bn3.bias', 'sparse_encoder.sp_cnn.layer3.4.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.4.bn3.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv1.weight', 'sparse_encoder.sp_cnn.layer3.5.bn1.weight', 'sparse_encoder.sp_cnn.layer3.5.bn1.bias', 'sparse_encoder.sp_cnn.layer3.5.bn1.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn1.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv2.weight', 'sparse_encoder.sp_cnn.layer3.5.bn2.weight', 'sparse_encoder.sp_cnn.layer3.5.bn2.bias', 'sparse_encoder.sp_cnn.layer3.5.bn2.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn2.running_var', 'sparse_encoder.sp_cnn.layer3.5.conv3.weight', 'sparse_encoder.sp_cnn.layer3.5.bn3.weight', 'sparse_encoder.sp_cnn.layer3.5.bn3.bias', 'sparse_encoder.sp_cnn.layer3.5.bn3.running_mean', 'sparse_encoder.sp_cnn.layer3.5.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv1.weight', 'sparse_encoder.sp_cnn.layer4.0.bn1.weight', 'sparse_encoder.sp_cnn.layer4.0.bn1.bias', 'sparse_encoder.sp_cnn.layer4.0.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv2.weight', 
'sparse_encoder.sp_cnn.layer4.0.bn2.weight', 'sparse_encoder.sp_cnn.layer4.0.bn2.bias', 'sparse_encoder.sp_cnn.layer4.0.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.0.conv3.weight', 'sparse_encoder.sp_cnn.layer4.0.bn3.weight', 'sparse_encoder.sp_cnn.layer4.0.bn3.bias', 'sparse_encoder.sp_cnn.layer4.0.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.0.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.0.downsample.0.weight', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.weight', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.bias', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.running_mean', 'sparse_encoder.sp_cnn.layer4.0.downsample.1.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv1.weight', 'sparse_encoder.sp_cnn.layer4.1.bn1.weight', 'sparse_encoder.sp_cnn.layer4.1.bn1.bias', 'sparse_encoder.sp_cnn.layer4.1.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv2.weight', 'sparse_encoder.sp_cnn.layer4.1.bn2.weight', 'sparse_encoder.sp_cnn.layer4.1.bn2.bias', 'sparse_encoder.sp_cnn.layer4.1.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.1.conv3.weight', 'sparse_encoder.sp_cnn.layer4.1.bn3.weight', 'sparse_encoder.sp_cnn.layer4.1.bn3.bias', 'sparse_encoder.sp_cnn.layer4.1.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.1.bn3.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv1.weight', 'sparse_encoder.sp_cnn.layer4.2.bn1.weight', 'sparse_encoder.sp_cnn.layer4.2.bn1.bias', 'sparse_encoder.sp_cnn.layer4.2.bn1.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn1.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv2.weight', 'sparse_encoder.sp_cnn.layer4.2.bn2.weight', 'sparse_encoder.sp_cnn.layer4.2.bn2.bias', 'sparse_encoder.sp_cnn.layer4.2.bn2.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn2.running_var', 'sparse_encoder.sp_cnn.layer4.2.conv3.weight', 'sparse_encoder.sp_cnn.layer4.2.bn3.weight', 'sparse_encoder.sp_cnn.layer4.2.bn3.bias', 'sparse_encoder.sp_cnn.layer4.2.bn3.running_mean', 'sparse_encoder.sp_cnn.layer4.2.bn3.running_var', 'dense_decoder.dec.0.up_sample.weight', 'dense_decoder.dec.0.up_sample.bias', 'dense_decoder.dec.0.conv.0.weight', 'dense_decoder.dec.0.conv.1.weight', 'dense_decoder.dec.0.conv.1.bias', 'dense_decoder.dec.0.conv.1.running_mean', 'dense_decoder.dec.0.conv.1.running_var', 'dense_decoder.dec.0.conv.3.weight', 'dense_decoder.dec.0.conv.4.weight', 'dense_decoder.dec.0.conv.4.bias', 'dense_decoder.dec.0.conv.4.running_mean', 'dense_decoder.dec.0.conv.4.running_var', 'dense_decoder.dec.1.up_sample.weight', 'dense_decoder.dec.1.up_sample.bias', 'dense_decoder.dec.1.conv.0.weight', 'dense_decoder.dec.1.conv.1.weight', 'dense_decoder.dec.1.conv.1.bias', 'dense_decoder.dec.1.conv.1.running_mean', 'dense_decoder.dec.1.conv.1.running_var', 'dense_decoder.dec.1.conv.3.weight', 'dense_decoder.dec.1.conv.4.weight', 'dense_decoder.dec.1.conv.4.bias', 'dense_decoder.dec.1.conv.4.running_mean', 'dense_decoder.dec.1.conv.4.running_var', 'dense_decoder.dec.2.up_sample.weight', 'dense_decoder.dec.2.up_sample.bias', 'dense_decoder.dec.2.conv.0.weight', 'dense_decoder.dec.2.conv.1.weight', 'dense_decoder.dec.2.conv.1.bias', 'dense_decoder.dec.2.conv.1.running_mean', 'dense_decoder.dec.2.conv.1.running_var', 'dense_decoder.dec.2.conv.3.weight', 'dense_decoder.dec.2.conv.4.weight', 'dense_decoder.dec.2.conv.4.bias', 'dense_decoder.dec.2.conv.4.running_mean', 'dense_decoder.dec.2.conv.4.running_var', 
'dense_decoder.dec.3.up_sample.weight', 'dense_decoder.dec.3.up_sample.bias', 'dense_decoder.dec.3.conv.0.weight', 'dense_decoder.dec.3.conv.1.weight', 'dense_decoder.dec.3.conv.1.bias', 'dense_decoder.dec.3.conv.1.running_mean', 'dense_decoder.dec.3.conv.1.running_var', 'dense_decoder.dec.3.conv.3.weight', 'dense_decoder.dec.3.conv.4.weight', 'dense_decoder.dec.3.conv.4.bias', 'dense_decoder.dec.3.conv.4.running_mean', 'dense_decoder.dec.3.conv.4.running_var', 'dense_decoder.dec.4.up_sample.weight', 'dense_decoder.dec.4.up_sample.bias', 'dense_decoder.dec.4.conv.0.weight', 'dense_decoder.dec.4.conv.1.weight', 'dense_decoder.dec.4.conv.1.bias', 'dense_decoder.dec.4.conv.1.running_mean', 'dense_decoder.dec.4.conv.1.running_var', 'dense_decoder.dec.4.conv.3.weight', 'dense_decoder.dec.4.conv.4.weight', 'dense_decoder.dec.4.conv.4.bias', 'dense_decoder.dec.4.conv.4.running_mean', 'dense_decoder.dec.4.conv.4.running_var', 'dense_decoder.proj.weight', 'dense_decoder.proj.bias', 'densify_norms.0.weight', 'densify_norms.0.bias', 'densify_norms.1.weight', 'densify_norms.1.bias', 'densify_norms.2.weight', 'densify_norms.2.bias', 'densify_norms.3.weight', 'densify_norms.3.bias', 'densify_projs.0.weight', 'densify_projs.0.bias', 'densify_projs.1.weight', 'densify_projs.1.bias', 'densify_projs.2.weight', 'densify_projs.2.bias', 'densify_projs.3.weight', 'densify_projs.3.bias', 'mask_tokens.0', 'mask_tokens.1', 'mask_tokens.2', 'mask_tokens.3']
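
Every parameter of the SparK model is reported missing here, which usually means the checkpoint's keys do not match the model at all; two likely causes (assumptions drawn from the key names above, not an official answer) are passing an encoder-only timm-style checkpoint instead of one of the *_withdecoder_1kpretrained_spark_style.pth files from the second table above, or weights nested under a key that the script does not unwrap. A quick way to check a checkpoint before visualizing with it:

# Illustrative check that a checkpoint actually contains SparK's decoder weights.
import torch

ckpt = torch.load('res50_withdecoder_1kpretrained_spark_style.pth', map_location='cpu')
state = ckpt.get('module', ckpt)
has_decoder = any(k.startswith('dense_decoder.') for k in state)
has_mask_tokens = any(k.startswith('mask_tokens.') for k in state)
print('decoder weights:', has_decoder, '| mask tokens:', has_mask_tokens)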

fatal: not a git repository (or any parent up to mount point /)

When running the command:
torchrun --nproc_per_node=4 --nnodes=1 --node_rank=0 --master_addr=localhost --master_port=0 ./pretrain/main.py --data_path=/data/ --exp_name=loftr_backbone1 --exp_dir=/Logs --model=loftr_backbone --bs=4
I come across the following error:


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


fatal: not a git repository (or any parent up to mount point /)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
(the two lines above are repeated several times in the output)

I did not use git clone to download this repo; I used the zip download instead. I guess the error has something to do with that?
Do you know how to fix this? Thanks!

Optimized sparse conv from open3d

This repo uses masked dense convolutions because they are well optimized in PyTorch. Would an optimized sparse-convolution implementation (such as the one in Open3D) speed things up, or would it be too complicated to get it working here?
