Comments (39)

sacmehta commented on August 28, 2024

You didn’t turn on the decoder flag, so you ended up training ESPNet-C and not ESPNet.

Please use your current model as the pretrained encoder (instead of ours) and train it with the decoder flag.

from espnet.

sacmehta commented on August 28, 2024

Try batch size of 6 or 8.

hsyi commented on August 28, 2024

Yeah, I found it.
I resized the feature map.
I should upsample the model output.
It works now.
Thank you!

hsyi commented on August 28, 2024

[image]
yes

sacmehta commented on August 28, 2024

Something is wrong with your evaluation.

We achieve mIOU of 53.3 and 61.4 for ESPNet-C and ESPNet on the Cityscapes validation set. Are you using the evaluation scripts provided by the Cityscapes dataset?

wldeephi commented on August 28, 2024

I am using the evaluation scripts provided by the Cityscapes dataset, and the mIOU is 43.5% with your released ESP_C model (p=2, q=8). Can you give me some advice? Thanks very much.

sacmehta commented on August 28, 2024

I am assuming that you are using the VisualizeResults.py file in the test folder. This file generates the label images at a resolution of 1024x512.

To use the Cityscapes scripts, you need to upsample these generated images by a factor of 2 so that your label image size is the same as the input image, i.e. 2048x1024. Could you please tell me how you are upsampling your label images to get to this size?
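
For reference, upsampling a label image by an integer factor with nearest-neighbor interpolation can be sketched roughly as follows. This is a minimal NumPy sketch, not code from the ESPNet repository: the function name upsample_labels_nearest and the random stand-in mask (with an arbitrary 20 class IDs) are assumptions for illustration only.

```python
import numpy as np

def upsample_labels_nearest(mask: np.ndarray, factor: int = 2) -> np.ndarray:
    """Nearest-neighbor upsampling of a 2-D label mask by an integer factor."""
    # Repeating rows and columns only copies existing class IDs,
    # so no new label values can appear in the output.
    return np.repeat(np.repeat(mask, factor, axis=0), factor, axis=1)

# Stand-in for a 1024x512 (width x height) prediction; a real mask would
# come from the model. The 20 class IDs are an arbitrary choice.
small = np.random.randint(0, 20, size=(512, 1024), dtype=np.uint8)
full = upsample_labels_nearest(small, factor=2)
print(full.shape)  # (1024, 2048), i.e. 2048x1024 as width x height
```

Because values are only copied, the set of class IDs in the upsampled mask is a subset of those in the original.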

wldeephi commented on August 28, 2024

I am using the VisualizeResults.py file with input_size=2048x1024. Is that the same as running at 1024x512 and then upsampling by a factor of 2?

sacmehta commented on August 28, 2024

We never trained/tested our models at this high resolution because it demands enormous resources that are not available on embedded devices. Could you try generating the results at 1024x512 resolution and see if you are able to reproduce the reported numbers?

Note 1: if you are upsampling segmentation masks, then please use nearest neighbor interpolation.

Note 2: if you are upsampling the feature maps of the last layer, then use bilinear interpolation and then apply softmax to get the final feature maps.
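
To illustrate why the two notes differ, here is a small 1-D sketch using NumPy only. The class IDs, scores, and sample positions are made up for illustration, and np.interp stands in for bilinear interpolation along a single axis. Interpolating class IDs directly can invent a class that is not present at the boundary, while interpolating the per-class scores and then picking the best class per pixel cannot.

```python
import numpy as np

# Two neighbouring pixels: the left one is class 0, the right one is class 2.
labels = np.array([0.0, 2.0])
# Per-class scores for the same two pixels (3 classes, one column per pixel).
scores = np.array([[1.0, 0.0],   # class 0
                   [0.0, 0.0],   # class 1
                   [0.0, 1.0]])  # class 2

src = np.array([0.0, 1.0])            # source pixel positions
dst = np.array([0.0, 0.4, 0.6, 1.0])  # positions of the upsampled pixels

# Problematic: linear interpolation of class IDs invents class 1 at the boundary.
interp_labels = np.rint(np.interp(dst, src, labels)).astype(int)

# Safer: interpolate each class's score map, then take the argmax per pixel.
interp_scores = np.stack([np.interp(dst, src, s) for s in scores])
from_scores = interp_scores.argmax(axis=0)

print(interp_labels.tolist())  # [0, 1, 1, 2]
print(from_scores.tolist())    # [0, 0, 2, 2]
```

Since softmax does not change which class has the largest score at a pixel, the argmax of the interpolated scores gives the same mask with or without it.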

P.S. you can finetune ESPNet models at high resolution. I believe fine tuning at high resolutions will further improve its accuracy.

wldeephi commented on August 28, 2024

OK, thanks very much for your real-time answers and very useful advice. I will run experiments following your suggestions and share the results later.

hsyi commented on August 28, 2024

May I ask about your mIOU during training? I can only get 0.47 mIOU on the split validation set and 0.50 mIOU on the training set during training.

sacmehta commented on August 28, 2024

Which scripts are you using to evaluate mIOU?

hsyi commented on August 28, 2024

I have not evaluated the model on the test set of the Cityscapes dataset; the mIOU numbers are from the training log.
[image]
[image]

sacmehta commented on August 28, 2024

This looks good. Train it for 300 epochs and you should be okay.

Note: the official mIOU metric used for evaluation on Cityscapes is different from the one in our code. Please evaluate your best model (the one with the minimum validation loss) on the Cityscapes validation set using their scripts to compare with the number reported in the paper.

hsyi commented on August 28, 2024

Thank you for your quick answer. I'm waiting for the 300 epochs to finish.

sacmehta commented on August 28, 2024

For our numbers on the Cityscapes validation set, please see Table 2(f) in the paper which reports the data for both ESPNet-C as well as ESPNet.

Good luck and thanks for showing interest in our work!

hsyi commented on August 28, 2024

Hi, this is my full training log. I used the pretrained encoder from your GitHub to train ESPNet, but it seems that I can't get performance as good as yours. Do you have any advice for training? How should I tune the hyperparameters?
trainValLog.txt

sacmehta commented on August 28, 2024

Hi,

This looks good to me. Please evaluate it using the Cityscapes dataset scripts, because they use a weighted mIOU, which is different from the one we provide. Once you evaluate with those, you will see similar performance.

Note: please use your best model for generating results, that is, the model with the lowest validation loss.
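
Picking the checkpoint with the lowest validation loss can be sketched as below. The (epoch, validation loss) pairs are hypothetical values for illustration; real ones would have to be read from your own training log, whose format is not assumed here.

```python
# Hypothetical (epoch, validation loss) pairs; real values would come
# from the training log.
val_log = [(1, 0.92), (2, 0.71), (3, 0.64), (4, 0.66), (5, 0.69)]

# min() with a key picks the entry whose validation loss is smallest.
best_epoch, best_loss = min(val_log, key=lambda row: row[1])
print(best_epoch, best_loss)  # 3 0.64
```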

hsyi commented on August 28, 2024

Thank you @sacmehta, I'll try tomorrow. It's really nice of you ^_^

hsyi commented on August 28, 2024

Hello, this is my result with bilinear interpolation on the val set, and I think it's worse than yours.
[image]

sacmehta commented on August 28, 2024

Could you please tell me if you used ESPNet-C or ESPNet?

hsyi commented on August 28, 2024

ESPNet. I used the pretrained ESPNet-C that you provided in the code to train the model.

sacmehta commented on August 28, 2024

Which configuration?

hsyi commented on August 28, 2024

All hyperparameters are the default values in your code. Do you have any advice?

sacmehta commented on August 28, 2024

Can you share the command you used to train the model?

hsyi commented on August 28, 2024

python main.py
Look at this:
[image]

hsyi commented on August 28, 2024

Sorry, I don't remember whether I changed the setting during training with the command -decoder=True, but I'll figure it out.

hsyi commented on August 28, 2024

Yeah, in the model dict there are two up layers, and the folder containing the results is "results_enc__dec_2_8".
So I think I did turn it on during training with the command line.
Do you have any suggestions?

hsyi commented on August 28, 2024

But the number of parameters is different. I must have done something wrong.
Thank you for your help. I'm going to retrain the model.

hsyi commented on August 28, 2024

I believe this is my final result for ESPNet. I have checked the number of "parameters" in the training log, and it is right for ESPNet.
May I ask you for any advice on hyperparameters? Are you using the default settings in your code to train your model?

hsyi commented on August 28, 2024

[image]
Hi, this is the evaluated result on the Cityscapes val dataset (with bilinear resize to 2048x1024).
I used your espnet_p2_q8 with decoder provided in the code.
Did you release your best model, or have I done something wrong?

sacmehta commented on August 28, 2024

The provided model is our best model on the validation set.

Did you resize feature maps or segmentation masks using bilinear interpolation?

sacmehta commented on August 28, 2024

Also, check the unique values in the generated segmentation masks. They should be between 0 and the number of classes. You can check this using numpy.unique.
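
A minimal sketch of such a check is shown below. The helper name check_mask, the class count of 20, and the toy mask are assumptions for illustration, and valid IDs are taken to be 0 to num_classes - 1; adjust these to your own setup.

```python
import numpy as np

NUM_CLASSES = 20  # assumed for illustration; use your model's class count

def check_mask(mask, num_classes=NUM_CLASSES):
    """Return all unique values in the mask and those outside 0..num_classes-1."""
    values = np.unique(mask)
    bad = values[(values < 0) | (values >= num_classes)]
    return values, bad

# Toy mask; 128 is an out-of-range value of the kind a wrong resize can produce.
mask = np.array([[0, 7, 7], [19, 3, 128]], dtype=np.uint8)
values, bad = check_mask(mask)
print(values.tolist())  # [0, 3, 7, 19, 128]
print(bad.tolist())     # [128]
```

An empty list of bad values means every pixel holds a class ID in the assumed range.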

sacmehta commented on August 28, 2024

Are you able to attain the reported accuracy?

hsyi commented on August 28, 2024

Hi @sacmehta, have you done anything to augment the original dataset? I can only get 0.59, below your 0.61, on the Cityscapes validation set.

sacmehta commented on August 28, 2024

No, we didn’t use any additional augmentation.

A deviation of +/- 2 points is kind of expected. We used a batch size of 12 for ESPNet-C and 6 for ESPNet. What batch size did you use?

hsyi commented on August 28, 2024

Same as you, 6 for ESPNet.
Thank you very much!

acgtyrant commented on August 28, 2024

@sacmehta Is it necessary to train for 300 epochs? The val loss and mIoU barely change after the early epochs.

acgtyrant commented on August 28, 2024

My result is val mIoU 0.601.
