models's Issues

Run-Time and Memory Measurement

(Regarding eppmvsnet)
Hi,
I am trying to measure the runtime and memory usage of a set of methods, as shown in Table 3 of your paper, but I did not get the same numbers. Could you provide more details on how you measured them? Thanks!
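For comparison while waiting for the authors' protocol, here is a minimal sketch of one common way to time a call and record peak host memory. It is not the procedure used for Table 3. `dummy_inference` is a hypothetical stand-in for a model forward pass, and `tracemalloc` only tracks Python-heap allocations on the host, so device (GPU/Ascend) memory has to be read from the framework or driver tools instead.

```python
import statistics
import time
import tracemalloc


def measure_runtime(fn, *args, warmup=3, repeats=10):
    """Median wall-clock seconds per call, after a few untimed warm-up calls."""
    for _ in range(warmup):
        fn(*args)
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def measure_peak_host_memory(fn, *args):
    """Peak Python-heap bytes allocated during one call (host side only)."""
    tracemalloc.start()
    try:
        fn(*args)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def dummy_inference(n):
    # Hypothetical stand-in for a model forward pass.
    return [i * i for i in range(n)]


median_s = measure_runtime(dummy_inference, 100_000)
peak_bytes = measure_peak_host_memory(dummy_inference, 100_000)
print(f"median {median_s * 1e3:.2f} ms, peak host memory {peak_bytes} bytes")
```

Timing and memory tracing are kept in separate passes because tracing slows the call down. Differences in warm-up, batch size, input resolution, and whether data loading is included can each explain a mismatch with published numbers.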

How to load the model's parameters for prediction after training pangu-alpha with data parallelism and optimizer parallelism?

Task Description

pangu-alpha is trained with data parallelism plus optimizer parallelism. How should its parameters be loaded for prediction?

Task Goal

The MindSpore tutorial and course include several instructions on using a distributed model for training and prediction (model loading), but those instructions cover only data parallelism and automatic parallelism. In those two cases only one checkpoint file is generated, so loading the model for prediction is straightforward. However, neither the tutorial nor the README explains how to load the model when it was trained with data parallelism plus optimizer parallelism. In this case each card generates a different checkpoint file, and I am not sure which one should be loaded for prediction. For example, if I train on 64 cards and want to predict on 1 card or 8 cards, which of the checkpoint files should I use, and how?
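To make the problem concrete, here is a minimal, framework-free sketch of why no single per-card file is enough. It assumes each training rank holds a contiguous block of rows of a parameter (sharding along dimension 0), so the full parameter has to be reassembled in rank order and then re-split for the prediction card count. The helper names and the 8x2 parameter are hypothetical, and the real layout is defined by the strategy file saved during training, not by this code.

```python
def split_rows(full, num_ranks):
    """Split a parameter's rows evenly across ranks, as a sharded save would."""
    if len(full) % num_ranks != 0:
        raise ValueError("row count must be divisible by the number of ranks")
    step = len(full) // num_ranks
    return [full[i * step:(i + 1) * step] for i in range(num_ranks)]


def merge_row_slices(slices_by_rank):
    """Concatenate per-rank row slices (in rank order) into one full parameter."""
    merged = []
    for rank_slice in slices_by_rank:
        merged.extend(rank_slice)
    return merged


# Hypothetical 8x2 parameter sharded over 4 training ranks.
full_param = [[float(r * 2 + c) for c in range(2)] for r in range(8)]
train_slices = split_rows(full_param, 4)

# Merging in rank order recovers the full tensor for single-card prediction.
restored = merge_row_slices(train_slices)

# Re-splitting gives the layout for a different card count, e.g. 2 cards.
predict_slices = split_rows(restored, 2)
```

In MindSpore itself, the function that appears intended for this case is `mindspore.load_distributed_checkpoint`, which takes the network, the per-rank checkpoint files in rank order, and the training and prediction strategies. Please check the signature against the installed version before relying on it.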

Any update on the training code?

Thanks for your great work.

I have been trying to reproduce your work [Semi-Supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation (DDM)], but it seems that I'm missing a few important parts.
Is there any plan to provide the training code and procedure?

cv/FDA-BNN missing files

Thanks for the awesome work, but some files are missing from cv/FDA-BNN, such as the trainer and the config files.

Are there any plans to upload these files?

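Question about the code implementation of Equation 11 in Section 3.3 of the paper

Hello authors, I read Equation 11 in Section 3.3 of the original paper. It contains two fully connected layers, and the second one also has a skip connection. The code, however, differs from the paper. Lines 252-261 of the construct method of the AutoDisModel class in autodis.py implement the AutoDis embedding, but for Equation 11 the implementation uses only one fully connected layer. Could you explain why?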

RuntimeError: For 'Reshape', the size of 'input_x': {3456} is not equal to the size of the first output: {5760}

I am using the dataset you provided, but training fails. How can I solve this problem?

root@0563a279aa9b:/data# DEVICE_ID=0 python train.py
Start time : 2022-09-22 08:07:09

infos : {'dataset_path': './dataset/', 'backbone_pretrained': './src/model/res2net_pretrained.ckpt', 'dataset_train': 'PASCAL_SBD', 'datasets_val': ['GrabCut', 'Berkeley'], 'epochs': 33, 'train_only_epochs': 32, 'val_robot_interval': 1, 'lr': 0.007, 'batch_size': 8, 'max_num': 0, 'size': (384, 384), 'device': 'CPU', 'num_workers': 4, 'itis_pro': 0.7, 'max_point_num': 20, 'record_point_num': 5, 'pred_tsh': 0.5, 'miou_target': [0.9, 0.9], 'resume': None, 'snapshot_path': './snapshot'}

Traceback (most recent call last):
  File "train.py", line 35, in <module>
    mine = Trainer(p)
  File "/data/src/trainer.py", line 111, in __init__
    size=p["size"][0], backbone_pretrained=p["backbone_pretrained"]
  File "/data/src/model/fcanet.py", line 295, in __init__
    resnet.load_pretrained_model(backbone_pretrained)
  File "/data/src/model/res2net.py", line 267, in load_pretrained_model
    tmp[:, :3, :, :] = parameter_dict["conv1_0.weight"]
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/common/tensor.py", line 344, in __setitem__
    out = tensor_operator_registry.get('setitem')(self, index, value)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 67, in _tensor_setitem
    return tensor_setitem_by_tuple(self, index, value)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 803, in tensor_setitem_by_tuple
    return tensor_setitem_by_tuple_with_tensor(self, index, value)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 956, in tensor_setitem_by_tuple_with_tensor
    tuple_index, value, idx_advanced = remove_expanded_dims(tuple_index, F.shape(data), value)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/composite/multitype_ops/_compile_utils.py", line 1156, in remove_expanded_dims
    value = F.reshape(value, value_shape)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/function/array_func.py", line 857, in reshape
    return reshape_(input_x, input_shape)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 294, in __call__
    return _run_op(self, self.name, args)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/common/api.py", line 98, in wrapper
    results = fn(*arg, **kwargs)
  File "/usr/local/python-3.7.5/lib/python3.7/site-packages/mindspore/ops/primitive.py", line 748, in _run_op
    output = real_run_op(obj, op_name, args)
RuntimeError: For 'Reshape', the size of 'input_x': {3456} is not equal to the size of the first output: {5760}

- C++ Call Stack: (For framework developers)

mindspore/ccsrc/plugin/device/cpu/kernel/memcpy_cpu_kernel.cc:37 Launch
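
A quick arithmetic check on the two sizes in the error message may help narrow this down. The sketch below only uses the numbers from the log. The channel counts (3 in the pretrained `conv1_0.weight`, 5 in the destination `tmp`) are assumptions suggested by the `[:, :3, :, :]` slice and by the ratio of the sizes, not something the log states.

```python
from fractions import Fraction

value_size = 3456    # size of 'input_x' reported in the error
target_size = 5760   # size of the first output reported in the error

# The two sizes reduce to a small whole-number ratio.
ratio = Fraction(value_size, target_size)
print(ratio)  # 3/5

# Assumed channel counts: 3 in the checkpoint weight, 5 in the destination tensor.
src_channels, dst_channels = 3, 5
per_channel = value_size // src_channels

# If the assumption holds, scaling the 3-channel weight to 5 channels
# gives exactly the size that Reshape was asked to produce.
assert per_channel * dst_channels == target_size
```

If that reading is right, the assigned value is being reshaped to the size of the whole destination tensor instead of the size of the three-channel slice, which points at how this MindSpore build handles the slice assignment on CPU, not at the dataset.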

Question about Figure 4

Hello authors, could you please provide the code for Figure 4? Thank you very much.
