magnet's Issues

Some problems about Gleason dataset

Hi, thanks a lot for sharing your interesting and great work. I downloaded the Gleason dataset, but I have some questions.
1. How did you fuse the annotations into the final labels for your work? By majority vote?
2. The labels seem to contain more than four classes (0, 1, 3, 4, 5, 6). How do you map them to the four classes (benign, Grade 3, Grade 4, and Grade 5)? Do you only compute metrics for classes 1, 3, 4, and 5?
3. Could you provide the training and testing filenames?
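
The label-fusion procedure the authors used is not stated in this thread. As a minimal sketch of what a pixel-wise majority vote could look like, assuming each annotator's label map is a 2D list of integer class IDs: the `REMAP` table below is purely hypothetical and only illustrates how the six raw IDs might be collapsed to four classes.

```python
from collections import Counter

# HYPOTHETICAL mapping from raw IDs (0, 1, 3, 4, 5, 6) to four classes.
# The thread does not state the mapping the authors used.
REMAP = {0: 0, 1: 0, 3: 1, 4: 2, 5: 3, 6: 0}


def majority_vote(label_maps):
    """Fuse several annotators' label maps by pixel-wise majority vote.

    label_maps: list of equally sized 2D lists of integer class IDs.
    Ties are broken in favour of the smallest class ID (an arbitrary choice).
    """
    height, width = len(label_maps[0]), len(label_maps[0][0])
    fused = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            counts = Counter(m[y][x] for m in label_maps)
            best = max(counts.values())
            fused[y][x] = min(c for c, n in counts.items() if n == best)
    return fused


def remap(label_map, table=REMAP):
    """Apply a raw-ID -> class-ID lookup table to a 2D label map."""
    return [[table[v] for v in row] for row in label_map]
```

Other fusion schemes are possible; majority voting is used here only because it is the one the question names.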

Questions about Binary Semantic Segmentation

Hi, thank you for sharing your code; it works well on the public datasets.
Now I want to train and test this network on my own dataset, which is a binary semantic segmentation task.
So I changed Class_Num to 2, but I ran into some problems I cannot solve.
While training the backbone, I get the following RuntimeError:

File "train.py", line 331, in <module>
main()
File "train.py", line 297, in main
writer_dict,
File "/home/yuming/Documents/MagNet-main/backbone/lib/core/function.py", line 49, in train
losses, _ = model(images, labels)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
return self.module(*inputs[0], **kwargs[0])
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuming/Documents/MagNet-main/backbone/lib/utils/utils.py", line 34, in forward
loss = self.loss(outputs, labels)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/home/yuming/Documents/MagNet-main/backbone/lib/core/criterion.py", line 37, in forward
return sum([w * self._forward(x, target) for (w, x) in zip(weights, score)])
File "/home/yuming/Documents/MagNet-main/backbone/lib/core/criterion.py", line 37, in <listcomp>
return sum([w * self._forward(x, target) for (w, x) in zip(weights, score)])
File "/home/yuming/Documents/MagNet-main/backbone/lib/core/criterion.py", line 25, in _forward
loss = self.criterion(score.contiguous(), target)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/modules/module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/modules/loss.py", line 1121, in forward
ignore_index=self.ignore_index, reduction=self.reduction)
File "/opt/conda_envs/yuming/envs/hu/lib/python3.7/site-packages/torch/nn/functional.py", line 2824, in cross_entropy
return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index)
RuntimeError: weight tensor should be defined either for all or no classes at /pytorch/aten/src/THCUNN/generic/SpatialClassNLLCriterion.cu:27
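
The error message says that the class-weight tensor handed to the cross-entropy loss must have exactly one entry per class (or be absent altogether). A minimal sketch of a pre-flight check, assuming the weights are available as a plain list before they are turned into a tensor; `check_class_weights` is a hypothetical helper and not part of MagNet or PyTorch.

```python
def check_class_weights(class_weights, num_classes):
    """Raise ValueError unless there is exactly one weight per class.

    Passing None (no weights at all) is also accepted, which mirrors the
    "either for all or no classes" wording of the error message.
    """
    if class_weights is None:
        return
    if len(class_weights) != num_classes:
        raise ValueError(
            f"got {len(class_weights)} class weights for {num_classes} classes"
        )
```

If the dataset class still carries a weight list sized for a different number of classes, a check like this would fail right away with a readable message instead of deep inside the loss.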

While training MagNet, I get the following IndexError:
File "train.py", line 193, in <module>
main()
File "train.py", line 38, in main
dataset = get_dataset_with_name(opt.dataset)(opt)
File "/home/yuming/Documents/MagNet-main/magnet/dataset/cityscapes.py", line 11, in __init__
super().__init__(opt)
File "/home/yuming/Documents/MagNet-main/magnet/dataset/base.py", line 44, in __init__
self.data += [self.parse_info(line)]
File "/home/yuming/Documents/MagNet-main/magnet/dataset/base.py", line 84, in parse_info
info["label"] = os.path.join(self.root, tokens[1])
IndexError: list index out of range

I've tried a lot, but the situation has not improved.
I would appreciate it if you could help me!
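
The last frame of the IndexError traceback shows `parse_info` reading `tokens[1]`, which suggests that at least one line of the datalist yields fewer than two tokens. A minimal sketch for locating such lines, assuming (this is a guess, not confirmed in the thread) that each line holds an image path and a label path separated by whitespace or a tab; `find_short_lines` is a hypothetical helper.

```python
def find_short_lines(lines, min_tokens=2):
    """Return (line_number, text) for every line with too few tokens.

    Assumes tokens are separated by whitespace; adjust the split if the
    dataset loader uses a different delimiter.
    """
    bad = []
    for number, line in enumerate(lines, start=1):
        if len(line.split()) < min_tokens:
            bad.append((number, line.rstrip("\n")))
    return bad
```

It can be called on an open file handle, for example `find_short_lines(open(path))` with `path` pointing at the datalist. Blank lines are reported too, since a trailing empty line would also produce fewer than two tokens.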

Some questions about the training process

How should the segmentation model and the refinement model be trained?
I am trying to retrain MagNet on the DeepGlobe dataset, but the README gives no example of training MagNet without pretrained backbone parameters. In train.py, the segmentation model is set to eval mode, so its parameters are not updated during training.
For this reason, I changed model.eval() to model.train() on line 46 of train.py. However, the IoU fluctuates during training and shows only a tiny increase after 100 epochs.
Therefore, I would like to know how to train the segmentation model and the refinement model. Are the two models trained separately?

The epoch_IoU of the retrained refinement network only reaches 0.35 on the DeepGlobe dataset

I tried to retrain the segmentation backbone and the refinement network following the guideline in the README: https://github.com/VinAIResearch/MagNet#training-backbone-networks.
The best_mIoU of the retrained FPN backbone is 0.6363, which is close to the baseline IoU of 0.6722 shown in the README.
[image]
Given this, the refinement network retrained on the retrained backbone should perform close to the one retrained on the pretrained backbone.
When retraining the refinement network with the pretrained backbone, epoch_IoU changed as shown in the following image:
[image1]
With the retrained backbone, epoch_IoU changed as shown in the following image:
[image2]
With the retrained backbone, epoch_IoU only reaches 0.35.
I tried to find the difference between the pretrained backbone and the retrained backbone.
I extracted the validation part of backbone/train.py to evaluate the pretrained backbone: https://github.com/DwRolin/temp_code/blob/main/eval_pretrain.py
Strangely, the MeanIU of the pretrained backbone is only 0.07.
I would like to know what causes this contradiction and how to make the retrained refinement network work well.
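
When an evaluation script reports an unexpectedly low score, it can help to check the metric itself on a tiny, hand-verifiable case. The sketch below computes mean IoU from a confusion matrix in plain Python. It is a generic formulation and not taken from the MagNet or backbone code; `mean_iou` is a hypothetical helper.

```python
def mean_iou(confusion):
    """Mean intersection-over-union from a square confusion matrix.

    confusion[i][j] is the number of pixels whose true class is i and whose
    predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        true_pos = confusion[c][c]
        false_neg = sum(confusion[c]) - true_pos
        false_pos = sum(row[c] for row in confusion) - true_pos
        union = true_pos + false_pos + false_neg
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(true_pos / union)
    return sum(ious) / len(ious) if ious else 0.0
```

For the matrix `[[2, 0], [1, 1]]` the per-class IoUs are 2/3 and 1/2, so the mean is 7/12.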

RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)

I ran the script from the section "To test with a Deepglobe image" (python demo.py ......) and got the following error:

"RuntimeError: indices should be either on cpu or on the same device as the indexed tensor (cpu)"
Could you please help me with that?
Here is the command I ran; I added the --sub_batch_size argument:
python demo.py \
  --dataset deepglobe \
  --image data/639004_sat.jpg \
  --scales 612-612,1224-1224,2448-2448 \
  --crop_size 612 612 \
  --input_size 508 508 \
  --model fpn \
  --pretrained checkpoints/deepglobe_fpn.pth \
  --pretrained_refinement checkpoints/deepglobe_refinement.pth \
  --num_classes 7 \
  --n_points 0.75 \
  --n_patches -1 \
  --smooth_kernel 11 \
  --save_pred \
  --save_dir test_results/demo \
  --sub_batch_size 1

Why is the backbone input_size set to 508×508 in the DeepGlobe experiment?

In the Cityscapes experiment, the backbone input size is 256×128 and the original image size is 2048×1024, so the input samples are obtained by integer downsampling (a factor of 8). In the DeepGlobe experiment, however, the input size is 508×508 and the original image size is 2448×2448, so the input samples are not obtained by integer downsampling.

In the refinement stage, the coarse segmentation map is interpolated to the original size. Will the non-integer interpolation cause a loss of object edge detail, and is the 508×508 input size necessary?
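
The arithmetic behind this question can be checked directly. The sketch below uses exact fractions so that non-integer ratios are not hidden by rounding; `downsample_factor` is a hypothetical helper, and the sizes are the ones quoted in the question.

```python
from fractions import Fraction


def downsample_factor(original, target):
    """Return the exact per-axis ratio original/target as Fractions."""
    return tuple(Fraction(o, t) for o, t in zip(original, target))


cityscapes = downsample_factor((2048, 1024), (256, 128))
deepglobe = downsample_factor((2448, 2448), (508, 508))

print(cityscapes)  # both ratios are the integer 8
print(deepglobe)   # both ratios are 612/127, roughly 4.82
```

For comparison, the 612×612 crop size used in the DeepGlobe commands elsewhere on this page divides 2448 exactly (a factor of 4).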

a small question about Deepglobe dataset

Hi,
First of all, thank you for your excellent work. However, I have a small question about the DeepGlobe dataset.

  1. As far as I know, the original validation and test sets do not provide labels. Are the val and test sets in the current data all split from the original training set?
  2. According to your and GLNet's split files (.txt), the total number of training samples is 454 rather than 455. Why do your paper and previous related papers report 455?

Thank you for your attention!

the inputs of the "refinement model" are different between train.py and test.py

Hello @hmchuong,
Thank you for your source code. Reading it, I noticed that the inputs of the "refinement model" differ between train.py and test.py. In more detail, the two inputs of the "refinement model" in train.py (crop_preds, fine_pred) are both derived from the backbone model, while the two inputs in test.py (scale_early_preds, coarse_preds) are derived from the backbone model and from the refinement model of the previous stage, respectively.

Can you explain why the inputs of the "refinement model" are different between train.py and test.py? Thank you very much!

In train.py

coarse_pred = model(coarse_image).softmax(1) # "model" is backbone model
fine_pred = model(fine_image).softmax(1)
crop_preds = roi_align(coarse_pred, coords, output_size=(opt.input_size[1], opt.input_size[0]))
logits = refinement_model(crop_preds, fine_pred) # --> "crop_preds" and "fine_pred" derived from the backbone model

#--------------------

In test.py

scale_early_preds = get_batch_predictions(model, sub_batch_size, scale_image_patches.to(device))
coarse_preds = roi_align(final_output, [coords[selected_patch_ids]], output_size=(opt.input_size[1], opt.input_size[0])) # "final_output" derived from the refinement model --> "coarse_preds" derived from the refinement model
.
.
fine_pred = get_batch_predictions(
refinement_models[min(len(refinement_models), idx) - 1],
sub_batch_size,
scale_early_preds, # --> "scale_early_preds" from backbone model
coarse_preds, #---> "coarse_preds" from the refinement model of the previous stage
)
.
.
final_output = (
final_output.reshape(1, opt.num_classes, scale[0] * scale[1])
.scatter_(2, error_point_indices, fine_pred) # "fine_pred" derived from the refinement model
.view(1, opt.num_classes, scale[1], scale[0])
)

error about get_gaussian_kernel2d

When I run the demo bash script you provided, I get the following error:

Traceback (most recent call last):
  File "demo.py", line 12, in <module>
    from magnet.utils.blur import MedianBlur
  File "/MagNet/magnet/utils/blur.py", line 125
    kernel: torch.Tensor = get_gaussian_kernel2d(kernel_size, sigma).repeat(channel, 1, 1, 1)
          ^
SyntaxError: invalid syntax
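
The parser stops at an annotated assignment (`kernel: torch.Tensor = ...`). Variable annotations were introduced in Python 3.6 (PEP 526), so one plausible cause, not confirmed in this thread, is that the script was run with an older interpreter. A minimal sketch for checking this; `supports_variable_annotations` is a hypothetical helper.

```python
import sys


def supports_variable_annotations(version_info=sys.version_info):
    """Variable annotations (PEP 526) were added in Python 3.6."""
    return tuple(version_info[:2]) >= (3, 6)


if not supports_variable_annotations():
    print("This interpreter is older than Python 3.6:", sys.version.split()[0])
```

The check itself avoids newer syntax so that it can still run on an old interpreter.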

some details about the results of experiment

Thank you for sharing your work.
I'm confused by the FPN result reported in your experiments (Table 8, the results on the DeepGlobe dataset). I used your test set and model parameters, but only got 62.86, which is lower than 67.86.
Did you use any data augmentation when testing the other networks?
Could you please give some more details about that? Thanks.

About the Gleason dataset

My problem is the same as in issue #11.

Could you provide the train and test filenames and share the labels you used?

Thank you so much!

How to apply train.py trained parameters to test.py?

Hello! How do I apply the refinement module parameters obtained by running train.py to test.py? I see that you provide refinement module parameters for three scales in the Cityscapes test.

Patches and refined locations

Hi!
If we are using 256x128 patches and we refine 32768 locations in them, doesn't this mean that we are using only the output of the refinement network, overwriting all the pixels predicted by the backbone? Am I missing something? Doesn't "locations" mean pixels? Thank you in advance.
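
The arithmetic behind this question, assuming that "locations" are individual pixels of one patch (which is exactly what the question asks and is not confirmed here):

```python
patch_width, patch_height = 256, 128
pixels_per_patch = patch_width * patch_height

refined_locations = 32768
fraction_refined = refined_locations / pixels_per_patch

print(pixels_per_patch)   # total pixels in one 256x128 patch
print(fraction_refined)   # share of the patch covered by the refined locations
```

Under that assumption the two numbers coincide, so every pixel of the patch would be refined.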

How to set parameter 'sub_batch_size'?

Hi, thanks for contributing the code. I found that the parameter 'sub_batch_size' was added in 'magnet/options/test.py', but the demo instructions in README.md do not mention it. How should I set it correctly for this model?
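
The thread does not document `sub_batch_size`. Judging only from its name and from calls such as `get_batch_predictions(model, sub_batch_size, ...)` quoted in another issue on this page, it plausibly caps how many patches go through the network in one forward pass. A minimal sketch of that idea in plain Python; `run_in_sub_batches` and `predict` are hypothetical names, not MagNet functions.

```python
def run_in_sub_batches(items, sub_batch_size, predict):
    """Apply `predict` to `items` in chunks of at most `sub_batch_size`.

    `predict` takes a list of items and returns a list of results of the
    same length; the chunk results are concatenated in order.
    """
    if sub_batch_size < 1:
        raise ValueError("sub_batch_size must be a positive integer")
    results = []
    for start in range(0, len(items), sub_batch_size):
        results.extend(predict(items[start:start + sub_batch_size]))
    return results
```

If the option works this way, a smaller value would lower peak memory at the cost of more forward passes, without changing the combined output.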

# of required GPUs to reproduce Best outputs

Hello !
Thanks for your great contribution in this field.

I'm setting up to follow your work (MagNet) and wonder how many GPUs are required to run your code.
Specifically, I want to work on the DeepGlobe dataset first, using the following command.
Please tell me the number and memory size of the GPUs you used in this experiment!

Best regards,
Yooseung

========================================================================
python train.py --dataset deepglobe \
  --root data/deepglobe \
  --datalist data/list/deepglobe/train.txt \
  --scales 612-612,1224-1224,2448-2448 \
  --crop_size 612 612 \
  --input_size 508 508 \
  --num_workers 8 \
  --model fpn \
  --pretrained checkpoints/deepglobe_fpn.pth \
  --num_classes 7 \
  --batch_size 8 \
  --task_name deepglobe_refinement \
  --lr 0.001

or in short, run the script below
sh scripts/deepglobe/train_magnet.sh

About the result of deepglobe dataset

Hi!
I have some questions about reproducing the results on the DeepGlobe dataset.
This is the result of training the backbone network and the refinement modules myself:
[image]
I have run many experiments and still cannot reproduce the results of the original paper:
[image]
I would like to know what is wrong. Also, there are two groups of Coarse IoU and Refinement IoU. What do they each represent?
I hope to get your answer. Thank you very much!!!

missing file "hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml"

Hi @hmchuong ,
I tried to train the backbone networks following the instructions:

In ./backbone

python train.py --cfg experiments/cityscapes/hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml

Could you upload the file "hrnet_ocr_w18_train_256x128_sgd_lr1e-2_wd5e-4_bs_12_epoch484.yaml", or suggest any other way to run the command above? (I don't see a --cfg argument in the code.)
Thank you.

demo.py: error: --sub_batch_size

usage: demo.py [-h] --dataset DATASET [--root ROOT] [--datalist DATALIST] --scales SCALES --crop_size N
[N ...] --input_size N [N ...] [--num_workers NUM_WORKERS] --model MODEL --num_classes
NUM_CLASSES --pretrained PRETRAINED
[--pretrained_refinement PRETRAINED_REFINEMENT [PRETRAINED_REFINEMENT ...]] [--image IMAGE]
--sub_batch_size SUB_BATCH_SIZE [--n_patches N_PATCHES] --n_points N_POINTS
[--smooth_kernel SMOOTH_KERNEL] [--save_pred] [--save_dir SAVE_DIR]
demo.py: error: --sub_batch_size

AttributeError in test_magnet.sh Script

I encountered an AttributeError while running the test_magnet.sh script in the MagNet project. The issues seem to come from deprecated use of PyTorch and NumPy functions.

Steps to Reproduce:

  1. Run the test_magnet.sh script.
  2. Observe output:
    /MagNet/magnet/utils/metrics.py:22: FutureWarning: In the future np.bool will be defined as the corresponding NumPy scalar.
    AttributeError: module 'numpy' has no attribute 'bool'.

Proposed Solution
Replace np.bool with bool in metrics.py. Suggested change in line 22:
k = (x >= 0) & (y < n) & (x != ignore_label) & (mask.astype(bool))

NumPy version: 1.26.4, PyTorch version: 1.12.1

RuntimeError: CUDA error: out of memory

Traceback (most recent call last):
File "train.py", line 331, in <module>
main()
File "train.py", line 120, in main
scale_factor=config.TRAIN.SCALE_FACTOR,
File "/home/cv428/Students/LH/MagNet-main/backbone/lib/datasets/cityscapes.py", line 118, in __init__
1.0507,
RuntimeError: CUDA error: out of memory

I am using a 3080 (12 GB).
I set:
BASE_SIZE: 8
BATCH_SIZE_PER_GPU: 1
SCALE_FACTOR: 1
but it still runs out of memory.
Please help!

Training details on methods in Table 4

Hi,

Thanks for your interesting work.

I'm curious about some details regarding the comparison methods in Table 4 of your paper. Are all methods compared in the table trained and tested on 256x128 images, as you mention in Sec. 4.2? Could you provide more details on how you trained your model compared to the baseline "downsample" and "patching" methods?

A minor side note: from your backbone training config, it seems that you are using HRNetv2-W18s rather than HRNetv2-W18. Am I missing something?

Thanks again.
