Comments (14)
Hi there, I updated the script to call compute_validation_loss at the validation epoch check, right after compute_validation_map, but I keep getting an AttributeError (see below) because PyTorch is receiving a list instead of a tensor.
According to Issue 243, "prepare_data" should handle this list-to-tensor conversion, but it is already being called on the line right before "out = net(images)". Do you have any suggestions for what to try next? Thanks for your help!
In train.py:
# This is done per epoch
if args.validation_epoch > 0:
    if epoch % args.validation_epoch == 0 and epoch > 0:
        compute_validation_map(epoch, iteration, yolact_net, val_dataset, log if args.log else None)
        compute_validation_loss(yolact_net,val_data_loader,MultiBoxLoss, log if args.log else None)
The only other code I've changed is adding logging to compute_validation_loss (the script hasn't gotten that far yet):
def compute_validation_loss(net, data_loader, criterion,log:Log=None):
    global loss_types
    net = CustomDataParallel(NetLoss(net, criterion))
    with torch.no_grad():
        losses = {}
        # Don't switch to eval mode because we want to get losses
        iterations = 0
        for datum in data_loader:
            images, targets, masks, num_crowds = prepare_data(datum)
            out = net(images)
            wrapper = ScatterWrapper(targets, masks, num_crowds)
            _losses = criterion(out, wrapper, wrapper.make_mask())
            for k, v in _losses.items():
                v = v.mean().item()
                if k in losses:
                    losses[k] += v
                else:
                    losses[k] = v
            if log is not None:
                precision = 5
                loss_info = {k: round(losses[k].item(), precision) for k in losses}
                loss_info['T'] = round(loss.item(), precision)
                log.log('val', loss=loss_info, epoch=epoch, iter=iteration,
                        lr=round(cur_lr, 10), elapsed=elapsed)
            iterations += 1
            if args.validation_size <= iterations * args.batch_size:
                break
        for k in losses:
            losses[k] /= iterations
        loss_labels = sum([[k, losses[k]] for k in loss_types if k in losses], [])
        print(('Validation ||' + (' %s: %.3f |' * len(losses)) + ')') % tuple(loss_labels), flush=True)
Error:
<class 'list'>
Traceback (most recent call last):
  File "train.py", line 523, in <module>
    train()
  File "train.py", line 377, in train
    compute_validation_loss(yolact_net,val_data_loader,MultiBoxLoss, log if args.log else None)
  File "train.py", line 480, in compute_validation_loss
    out = net(images)
  File "C:\ProgramData\Anaconda3\envs\yolact-env-py37\lib\site-packages\torch\nn\modules\module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\dany\ml\yolact\yolact.py", line 568, in forward
    _, _, img_h, img_w = x.size()
AttributeError: 'list' object has no attribute 'size'
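The traceback shows that the forward method in yolact.py received a Python list where it expected a single batched tensor with a .size() method. A quick way to confirm what is actually being passed is to inspect the input just before the forward call. The sketch below is only an illustration: describe_batch is a hypothetical helper (not part of yolact), and FakeTensor is a stand-in class so the example runs without PyTorch.

```python
class FakeTensor:
    """Stand-in for a tensor; it only mimics the .size() call used by the check."""

    def __init__(self, *shape):
        self._shape = tuple(shape)

    def size(self):
        return self._shape


def describe_batch(x):
    """Return a short description of what would be handed to forward()."""
    if hasattr(x, "size"):
        return "tensor-like with size %s" % (x.size(),)
    if isinstance(x, (list, tuple)):
        return "%s of %d items; forward() would fail on .size()" % (type(x).__name__, len(x))
    return "unexpected type %s" % type(x).__name__


# A batched tensor-like object passes the check; a list of images does not.
print(describe_batch(FakeTensor(8, 3, 550, 550)))
print(describe_batch([FakeTensor(3, 550, 550)] * 8))
```

If the check reports a list, the batch most likely needs to go through a wrapper that knows how to split it (later comments in this thread call the wrapped net on the raw datum) or be stacked into one tensor before reaching the model.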
from yolact.
I removed that function because I hadn't updated it in a long while. I'll add it back and update it though (check my latest commit).
To use it, create a data_loader for val_dataset in the same way I create one for dataset, and then call the function with the proper arguments.
Fair warning though, the evaluation will take a while (you might want to reduce the number of validation examples being evaluated; check the arguments to train.py).
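The loops posted in this thread limit the validation pass by stopping once validation_size <= iterations * batch_size. A minimal sketch of that cap, written against a plain Python iterable so it runs on its own; limited_batches is a hypothetical helper name, not a function in train.py.

```python
import math
from itertools import islice


def limited_batches(loader, validation_size, batch_size):
    """Yield only as many batches as are needed to cover validation_size examples."""
    # At least one batch is always evaluated, matching the break-after-increment loop.
    max_batches = max(1, math.ceil(validation_size / batch_size))
    return islice(loader, max_batches)


# Ten fake batches of 8 examples each, capped at 20 validation examples.
fake_loader = [[i] * 8 for i in range(10)]
used = list(limited_batches(fake_loader, validation_size=20, batch_size=8))
print(len(used))
```

The same helper could wrap a real data loader, since it only relies on the loader being iterable.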
from yolact.
I don't know if it is still needed, but I have found that this works for me:
Creating a data loader for the validation set
val_data_loader = data.DataLoader(val_dataset, args.batch_size,
                                  num_workers=args.num_workers,
                                  shuffle=True, collate_fn=detection_collate,
                                  pin_memory=True)
Creating a dictionary for the validation loss averages
val_loss_avgs = { k: MovingAverage(100) for k in loss_types }
Add the following to calculate the validation loss the same way the training loss is calculated. It goes after the for loop that computes the training loss and before the validation mAP calculation.
# This is done per epoch
if epoch > 0:
    # Calculates the loss on the validation dataset.
    print('Calculating validation losses, this may take a while...')
    val_iteration = 0
    for dat in val_data_loader:
        #if len(dat[0]) < args.batch_size:
        # Zero the grad to get ready to compute gradients
        optimizer.zero_grad()
        # Forward Pass + Compute loss at the same time (see CustomDataParallel and NetLoss)
        losses = net(dat)
        losses = { k: (v).mean() for k,v in losses.items() } # Mean here because Dataparallel
        loss = sum([losses[k] for k in losses])
        # no_inf_mean removes some components from the loss, so make sure to backward through all of it
        # all_loss = sum([v.mean() for v in losses.values()])
        # Backprop
        loss.backward() # Do this to free up vram even if loss is not finite
        if torch.isfinite(loss).item():
            optimizer.step()
        # Add the loss to the moving average for bookkeeping
        for k in losses:
            val_loss_avgs[k].add(losses[k].item())
        val_iteration += 1
        if args.validation_size <= val_iteration * args.batch_size:
            break
    total = sum([val_loss_avgs[k].get_avg() for k in losses])
    loss_labels = sum([[k, val_loss_avgs[k].get_avg()] for k in loss_types if k in losses], [])
    print(('Validation Loss ||' + (' %s: %.3f |' * len(losses)) + ' T: %.3f '+')') % tuple(loss_labels+[total]), flush=True)
    if args.log:
        precision = 5
        loss_info = {k: round(losses[k].item(), precision) for k in losses}
        loss_info['T'] = round(loss.item(), precision)
        log.log('val', loss=loss_info, epoch=epoch, iter=iteration)
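Note that the loop above still calls loss.backward() and optimizer.step() on validation batches, so the weights are updated with validation data. A validation pass normally only accumulates the losses. The sketch below shows just that accumulation and the per-key averaging, using plain floats and a stand-in loss function so it runs without PyTorch; average_losses and fake_loss_fn are hypothetical names. In a real run the forward call would presumably sit inside torch.no_grad(), with no backward or optimizer step.

```python
def average_losses(batches, loss_fn):
    """Sum each loss component over all batches, then divide by the batch count."""
    totals = {}
    count = 0
    for batch in batches:
        for key, value in loss_fn(batch).items():
            totals[key] = totals.get(key, 0.0) + float(value)
        count += 1
    if count == 0:
        return {}
    return {key: total / count for key, total in totals.items()}


def fake_loss_fn(batch):
    # Stand-in for the wrapped network: returns one value per loss type.
    return {"B": 0.5 * batch, "C": 1.0 * batch}


averages = average_losses([1, 2, 3], fake_loss_fn)
total = sum(averages.values())
print(" | ".join("%s: %.3f" % (k, v) for k, v in averages.items()), "| T: %.3f" % total)
```

Averaging over exactly the batches that were evaluated keeps the reported number comparable between epochs, as long as the validation loader visits the same examples each time.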
I also hit an index out of range error, which I fixed with the following change to the loop in "prepare_data". It fixes the error without breaking anything.
for device, alloc in zip(devices, allocation):
    for _ in range(min(alloc,len(datum[0]))):
        images[cur_idx] = gradinator(images[cur_idx].to(device))
        targets[cur_idx] = gradinator(targets[cur_idx].to(device))
        masks[cur_idx] = gradinator(masks[cur_idx].to(device))
        cur_idx += 1
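An index error in this loop is consistent with the last batch of an epoch being smaller than the per-device allocation, so that cur_idx runs past the end of the batch. The min(alloc, len(datum[0])) change guards each device separately; with more than one device the running total could still exceed the batch length. A minimal sketch of clamping against the remaining items instead (clamp_allocation is a hypothetical helper, not part of yolact):

```python
def clamp_allocation(allocation, batch_len):
    """Shrink per-device allocations so they never add up to more than batch_len."""
    remaining = batch_len
    clamped = []
    for alloc in allocation:
        take = min(alloc, remaining)  # never take more than what is left
        clamped.append(take)
        remaining -= take
    return clamped


# Two devices expecting 4 images each, but the final batch only has 5 images.
print(clamp_allocation([4, 4], 5))
```

Another option is to pass drop_last=True to the DataLoader so that a short final batch is never produced, at the cost of skipping a few examples.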
Using this I managed to get validation loss calculated and added to the log.
from yolact.
Thanks!
from yolact.
Doesn't seem to be fixed for me. For those interested, here is what I think the code needs to be:
First create a new instance of the dataloader for the validation dataset:
data_loader_val = data.DataLoader(val_dataset, len(val_dataset), # WARNING: using the full length of val_dataset might cause a memory overflow...
                                  num_workers=args.num_workers,
                                  shuffle=True, collate_fn=detection_collate,
                                  generator=torch.Generator(device='cuda'),
                                  pin_memory=True)
And then change your compute_validation_loss to this:
def compute_validation_loss(net, data_loader):
    # Calculates the loss on the validation dataset.
    print('Calculating validation losses, this may take a while...')
    global loss_types
    with torch.no_grad():
        losses = {}
        # Don't switch to eval mode here. Warning: this is viable but changes the interpretation of the validation loss.
        for datum in data_loader:
            losses = net.forward(datum)
            losses = { k: (v).mean() for k,v in losses.items() }
            loss = sum([losses[k] for k in losses])
        loss_labels = sum([[k, losses[k]] for k in loss_types if k in losses], [])
        print(('Validation Loss||' + (' %s: %.3f |' * len(losses)) + ')') % tuple(loss_labels), flush=True)
from yolact.
@LukasMahieuArinti
Hi, I tried the solution you suggested, but I still get the same type of error. Is there any other solution?
Error:
Traceback (most recent call last):
  File "train.py", line 530, in <module>
    train()
  File "train.py", line 378, in train
    compute_validation_loss(yolact_net, data_loader_val)
  File "train.py", line 503, in compute_validation_loss
    losses = net.forward(datum)
  File "/home/graduate/shancheng/yolact/yolact.py", line 566, in forward
    _, _, img_h, img_w = x.size()
AttributeError: 'list' object has no attribute 'size'
from yolact.
Weird, works fine for me. Are you sure you are passing the correct arguments to the validation loss function?
The 'net' argument should be the Yolact model as defined in this piece of code:
yolact_net = Yolact()
net = yolact_net
Also, make sure you only call this function once per epoch and only after at least one epoch has completed. It doesn't make much sense to call it more often or earlier.
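The once-per-epoch condition described here matches the check used in the original post (epoch % args.validation_epoch == 0 and epoch > 0). As a small standalone sketch, with should_validate as a hypothetical helper name:

```python
def should_validate(epoch, validation_epoch):
    """True when a validation pass is due: never at epoch 0, then every validation_epoch epochs."""
    if validation_epoch <= 0:
        return False  # validation disabled
    return epoch > 0 and epoch % validation_epoch == 0


# With validation_epoch=2, list the epochs (out of 0..6) that trigger validation.
due = [epoch for epoch in range(7) if should_validate(epoch, validation_epoch=2)]
print(due)
```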
If you want to look at some code: I recently noticed that the yolactedge repository has a very similar implementation to compute the validation loss.
from yolact.
@ChangShanCheng did you find a solution for "AttributeError: 'list' object has no attribute 'size'"? I am getting the same error when trying to calculate the validation loss.
from yolact.
@bhuvanofc Not yet. I have also confirmed that "net" is the Yolact model, so I'm still trying to figure it out.
from yolact.
@ChangShanCheng Thank you. Please let me know if you find a solution. Also, do you happen to know how I could calculate training accuracy for YOLACT++ during the training process?
from yolact.
@dbolya Is there any solution for this error? I need the validation loss for my thesis.
from yolact.
Replying to the earlier comment that adds a validation data loader, a per-epoch validation loss loop, and a change to the loop in "prepare_data":
I used the last piece of code (the "prepare_data" change), but I still get an index out of range error. Do you know why? Looking forward to your reply.
from yolact.
Replying to the original post (the "AttributeError: 'list' object has no attribute 'size'" raised from compute_validation_loss):
Have you solved the problem?
from yolact.
@peter-zhang-1020 did you solve it yet?
Any help would be appreciated :)
Edit: I made it work by using @konrad-ivelic's solution, but with shuffle=False in the val_data_loader.
Furthermore, I didn't use this part:
for device, alloc in zip(devices, allocation):
    for _ in range(min(alloc,len(datum[0]))):
        images[cur_idx] = gradinator(images[cur_idx].to(device))
        targets[cur_idx] = gradinator(targets[cur_idx].to(device))
        masks[cur_idx] = gradinator(masks[cur_idx].to(device))
        cur_idx += 1
Also, it only works with ResNet101 as the backbone, not ResNet50.
from yolact.