packtpublishing / pytorch-computer-vision-cookbook
PyTorch Computer Vision Cookbook, Published by Packt
License: MIT License
In the implementation:

def xyxyh2xywh(xyxy, image_size=416):
    xywh = torch.zeros(xyxy.shape[0], 6)
    xywh[:, 2] = (xyxy[:, 0] + xyxy[:, 2]) / 2. / image_size
    xywh[:, 3] = (xyxy[:, 1] + xyxy[:, 3]) / 2. / image_size
    xywh[:, 5] = (xyxy[:, 2] - xyxy[:, 0]) / image_size
    xywh[:, 4] = (xyxy[:, 3] - xyxy[:, 1]) / image_size
    xywh[:, 1] = xyxy[:, 6]
    return xywh
I think it should be:

def xyxyh2xywh(xyxy, image_size=416):
    xywh = torch.zeros(xyxy.shape[0], 6)
    xywh[:, 2] = (xyxy[:, 0] + xyxy[:, 2]) / 2. / image_size
    xywh[:, 3] = (xyxy[:, 1] + xyxy[:, 3]) / 2. / image_size
    xywh[:, 4] = (xyxy[:, 2] - xyxy[:, 0]) / image_size
    xywh[:, 5] = (xyxy[:, 3] - xyxy[:, 1]) / image_size
    xywh[:, 1] = xyxy[:, 6]
    return xywh
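A quick numerical check of the corrected index order. This is a minimal NumPy sketch mirroring the torch snippet (the output column layout [?, label, cx, cy, w, h] and the label living in input column 6 are assumptions taken from the snippet, not verified against the book's code):

```python
import numpy as np

def xyxyh2xywh(xyxy, image_size=416):
    # assumed output layout: [unused, label, cx, cy, w, h]
    xywh = np.zeros((xyxy.shape[0], 6))
    xywh[:, 2] = (xyxy[:, 0] + xyxy[:, 2]) / 2. / image_size  # center x
    xywh[:, 3] = (xyxy[:, 1] + xyxy[:, 3]) / 2. / image_size  # center y
    xywh[:, 4] = (xyxy[:, 2] - xyxy[:, 0]) / image_size       # width
    xywh[:, 5] = (xyxy[:, 3] - xyxy[:, 1]) / image_size       # height
    xywh[:, 1] = xyxy[:, 6]                                   # class label
    return xywh

# a 100x50 box from (0, 0) to (100, 50), padding columns, label 3
box = np.array([[0., 0., 100., 50., 1., 1., 3.]])
out = xyxyh2xywh(box, image_size=416)
# with the corrected order, column 4 holds the width 100/416
# and column 5 holds the height 50/416
```

With the original (swapped) assignments, width and height would land in each other's columns, which silently corrupts every downstream IoU and loss computation.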
To make Chapter 1 run out of the box, I added

import os
if not os.path.exists("models"):
    os.makedirs("models")

to the section "Store and Load Models".
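As a side note, the same guard can be collapsed into a single call, since os.makedirs accepts an exist_ok flag (a minor alternative, not from the original snippet):

```python
import os

# equivalent to the existence check above: creates "models" if it is
# missing, and silently does nothing if the directory already exists
os.makedirs("models", exist_ok=True)
```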
Hello,
I don't understand how the code snippet below works. Can anyone explain it to me?
In 1), the shape of "obj_mask" is [8, 3, 13, 13] because I'm using a batch size of 8.
In 2), the shapes of "batch_inds" and "best_anchor_ind" are both [90] because the program found 90 bounding boxes in the batch.
Now, the original shape is [8, 3, 13, 13], but the shape implied by the indexing looks like [90, 90, 13, 13].
Is this right? The program runs without errors, but I'm wondering whether the code is correct.
sizeT = batch_size, num_anchors, grid_size, grid_size
obj_mask = torch.zeros(sizeT, device=device, dtype=torch.uint8)   # 1)
noobj_mask = torch.ones(sizeT, device=device, dtype=torch.uint8)
tx = torch.zeros(sizeT, device=device, dtype=torch.float32)
ty = torch.zeros(sizeT, device=device, dtype=torch.float32)
tw = torch.zeros(sizeT, device=device, dtype=torch.float32)
th = torch.zeros(sizeT, device=device, dtype=torch.float32)

sizeT = batch_size, num_anchors, grid_size, grid_size, num_cls
tcls = torch.zeros(sizeT, device=device, dtype=torch.float32)

target_bboxes = target[:, 2:] * grid_size
t_xy = target_bboxes[:, :2]
t_wh = target_bboxes[:, 2:]
t_x, t_y = t_xy.t()
t_w, t_h = t_wh.t()
grid_i, grid_j = t_xy.long().t()

iou_with_anchors = [get_iou_WH(anchor, t_wh) for anchor in anchors]
iou_with_anchors = torch.stack(iou_with_anchors)
best_iou_wa, best_anchor_ind = iou_with_anchors.max(0)

batch_inds, target_labels = target[:, :2].long().t()
obj_mask[batch_inds, best_anchor_ind, grid_j, grid_i] = 1    # 2)
noobj_mask[batch_inds, best_anchor_ind, grid_j, grid_i] = 0
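For what it's worth, line 2) is integer (advanced) indexing: the four length-90 index arrays are zipped element-wise, so 90 individual cells are written, not a [90, 90, 13, 13] block. NumPy follows the same indexing rules as PyTorch here, so a minimal sketch:

```python
import numpy as np

mask = np.zeros((8, 3, 13, 13), dtype=np.uint8)

# four index arrays of equal length are paired element-wise:
# (b[0], a[0], j[0], i[0]), (b[1], a[1], j[1], i[1]), ...
b = np.array([0, 0, 5, 7])
a = np.array([1, 2, 0, 2])
j = np.array([4, 4, 9, 12])
i = np.array([3, 8, 1, 0])

mask[b, a, j, i] = 1

# exactly len(b) == 4 cells were set, not a 4 x 4 x 13 x 13 block
```

So the snippet is correct: each of the 90 ground-truth boxes flips exactly one (batch, anchor, row, col) entry of the mask.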
I am not sure why, but even with a batch size of 4, CUDA goes out of memory. I think many unwanted things (like gradients) are kept in memory, so memory is insufficient. I have 6 GB of memory; can you alter the code to clear unwanted things out of memory and make it more efficient?
I also tried in a Kaggle kernel, and CUDA goes out of memory there too.
Hi,
I'm using the code from Chapter 5 as a guide to train a YOLO model, but I'm struggling to use an image with dimensions other than the 416 used in the book. I've edited the create_layers function so that it takes the image size as an input and passes it to YOLOLayer, but when I try to train the model, I get the error:
File "", line 97, in forward
x = torch.cat([layer_outputs[int(l_i)]
RuntimeError: Sizes of tensors must match except in dimension 1. Got 34 and 33 in dimension 2
Do you have any suggestions for using a different image size? I could not find any details about this in the textbook either. Thank you.
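For context (this is general YOLOv3 behavior, not something stated in the book): the network downsamples by strides of 32, 16, and 8, so route/concatenation layers only line up when the input size is a multiple of 32. An input that is not divisible by 32 produces grids that disagree by one cell (e.g. 34 vs 33), which matches the error above. A minimal pre-flight check:

```python
def check_yolo_input_size(img_size, strides=(32, 16, 8)):
    """Return (ok, grid sizes). YOLOv3-style concat layers require
    img_size to be divisible by the largest stride (32)."""
    grids = [img_size / s for s in strides]
    ok = img_size % max(strides) == 0
    return ok, grids

# 416 works: grids 13.0, 26.0, 52.0 are all whole numbers
# 540 fails: 540 / 16 = 33.75, so one path rounds to 34 and another to 33
```

So the likely fix is to pick the nearest multiple of 32 (e.g. 512 or 544) rather than an arbitrary size.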
I have trained your model on the HMDB51 dataset using a smaller number of classes, but the accuracy always remains constant, even with different learning rates or a different lr_scheduler.
Please guide me further.
Traceback (most recent call last):
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '_labels'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 37, in <module>
histo_dataset = histoCancerDataset(data_dir, data_transformer, "train")
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 23, in __init__
self.labels = [labels_df.loc[filename[:-4]].values[0] for filename in filenames]
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 23, in <listcomp>
self.labels = [labels_df.loc[filename[:-4]].values[0] for filename in filenames]
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 1768, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 1965, in _getitem_axis
return self._get_label(key, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 625, in _get_label
return self.obj._xs(label, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\generic.py", line 3537, in xs
loc = self.index.get_loc(key)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '_labels'
I downloaded the HMDB dataset and extracted the main folder, but I was getting an error (NotADirectoryError) in chapter10.py when trying to set up path2ajpgs, and I'm sure the path is correct.
In the code, "path2train" has been used instead of "path2test".
Changing these names fixes the error.
This error is actually in the book itself; I am studying with this book, coding every single project, and trying to make the code more advanced.
I saw your book here: https://circuitnoob.com/pytorch-computer-vision-cookbook-over-70-recipes-to-solve-computer-vision-and-image-processing-problems-using-pytorch-1-x/ .
I wanted to make sure that was ok. Is it allowed to be there?
It seems there is a mistake in the code for Chapter 10. Using the original dataset and code as in the book does not work: the model seems not to learn and is stuck at 1.98% accuracy.
Even after signing into a free account the Download link https://amd.grand-challenge.org/download/ says "Forbidden"
Hello,
get_coco_dataset.sh: 29: get_coco_dataset.sh: Syntax error: "(" unexpected
Can you tell me how to solve this problem?
Thank you.
I don't see any data files. Please, can you upload those too? Thank you.
Amazing work, thank you for sharing the code.
The accuracy of the model is 0 or 0.05 on the validation set.
What's the issue? I just copied your code.
Thank you for your time; looking forward to your reply.
The non-max suppression code did not run; I had to change the line
detections[0, :4] = (ww * detections[supp_inds, :4]).sum(0) / ww.sum()
into
detections[0, :4] = (ww.view(-1, 1) * detections[supp_inds, :4]).sum(0) / ww.sum()
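The fix is a broadcasting issue: ww has shape (k,) while detections[supp_inds, :4] has shape (k, 4), and a 1-D tensor only broadcasts against the trailing dimension, so it must be reshaped to (k, 1) to weight rows. A NumPy sketch of the same rule (PyTorch broadcasts identically; `reshape(-1, 1)` plays the role of `.view(-1, 1)`):

```python
import numpy as np

ww = np.array([0.9, 0.6, 0.3])   # k = 3 per-box weights
boxes = np.ones((3, 4))          # k x 4 box coordinates

# (3,) vs (3, 4): shapes align from the right, 3 != 4 -> error
try:
    bad = ww * boxes
except ValueError:
    bad = None

# (3, 1) vs (3, 4): the weight column broadcasts across the 4 coords
weighted = ww.reshape(-1, 1) * boxes      # shape (3, 4)
merged = weighted.sum(0) / ww.sum()       # weighted-average box
```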
from sklearn.model_selection import StratifiedShuffleSplit
sss = StratifiedShuffleSplit(n_splits=2, test_size=0.5, random_state=42)
train_indx, test_indx = next(sss.split(unique_ids, unique_labels))
train_ids = [unique_ids[ind] for ind in train_indx]
train_labels = [unique_labels[ind] for ind in train_indx]
print(len(train_ids), len(train_labels))
test_ids = [unique_ids[ind] for ind in test_indx]
test_labels = [unique_labels[ind] for ind in test_indx]
print(len(test_ids), len(test_labels))
ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
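For reference, that ValueError means at least one label in unique_labels occurs only once, and stratified splitting needs at least two samples per class (one for each side of the split). One common workaround, sketched here with hypothetical data rather than the book's dataset, is to drop the singleton classes before splitting:

```python
from collections import Counter

# hypothetical ids/labels standing in for unique_ids / unique_labels
unique_ids = ["v1", "v2", "v3", "v4", "v5"]
unique_labels = ["run", "run", "jump", "jump", "wave"]  # "wave": 1 member

counts = Counter(unique_labels)
keep = [i for i, lab in enumerate(unique_labels) if counts[lab] >= 2]

ids_ok = [unique_ids[i] for i in keep]
labels_ok = [unique_labels[i] for i in keep]
# with the singleton class removed, StratifiedShuffleSplit can stratify
```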
The output of the YOLO layers uses the transform_outputs method to convert the box predictions into pixels.
When training, we need to revert this back to the original network output of the YOLO layer; the transform_bbox method is used for this. It does revert the operations of transform_outputs, except for the last step of transform_outputs, where the boxes are scaled by the stride factor (32, 16, or 8, depending on the layer). When calculating the loss, we compare the calculated targets (via get_yolo_targets) directly against the 'reverted' YOLO network outputs. To me this looks wrong, as the reverted YOLO outputs still have the stride factor in them.
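To illustrate the concern numerically (hypothetical numbers; the convention that get_yolo_targets produces grid-cell coordinates while transform_outputs multiplies by the stride is taken from the description above): a target in grid units and a prediction still scaled to pixels differ by the stride factor unless it is divided out before the loss.

```python
# assumed convention from the discussion: targets in grid units,
# transform_outputs multiplies grid coordinates by the stride for pixels
stride = 32            # 416 / 13 grid at the coarsest YOLO layer
target_cx_grid = 6.5   # target center in grid units (get_yolo_targets)

pred_cx_pixels = 208.0                  # the same center, in pixels
pred_cx_grid = pred_cx_pixels / stride  # divide the stride back out

# if the stride is left in, the loss sees a huge spurious error term
mismatch = pred_cx_pixels - target_cx_grid
```

Under these assumptions the two coordinates only agree after dividing by the stride, which supports the point that comparing targets against outputs that still carry the stride factor would be inconsistent.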