PyTorch Computer Vision Cookbook

This is the code repository for PyTorch Computer Vision Cookbook, published by Packt.

Over 70 recipes to master the art of computer vision with deep learning and PyTorch 1.x

What is this book about?

This book enables you to solve the trickiest of problems in computer vision using deep learning algorithms and techniques. You will learn to use several different algorithms for different CV problems such as classification, detection, segmentation, and more using PyTorch. The book is packed with best practices for training and deploying CV applications.

This book covers the following exciting features:

  • Develop, train and deploy deep learning algorithms using PyTorch 1.x
  • Understand how to fine-tune and change hyperparameters to train deep learning algorithms
  • Perform various CV tasks such as classification, detection, and segmentation
  • Implement a neural style transfer network based on CNNs and pre-trained models
  • Generate new images and implement adversarial attacks using GANs
  • Implement video classification models based on RNN, LSTM, and 3D-CNN
  • Discover best practices for training and deploying deep learning algorithms for CV applications

If you feel this book is for you, get your copy today!

https://www.packtpub.com/

Instructions and Navigations

All of the code is organized into folders.

The code will look like the following:

import torch

# define a tensor with specific data type
x = torch.ones(2, 2, dtype=torch.int8)
print(x)
print(x.dtype)

Output:

tensor([[1, 1],
        [1, 1]], dtype=torch.int8)
torch.int8

Following is what you need for this book: Computer vision professionals, data scientists, deep learning engineers, and AI developers looking for quick solutions for various computer vision problems will find this book useful. Intermediate-level knowledge of computer vision concepts, along with Python programming experience, is required.

With the following software and hardware list you can run all code files present in the book (Chapter 1-10).

Software and Hardware List

Chapter | Software required | OS required
1 - 10 | Python 3.5+, PyTorch 1.x, GPU (preferred) | Windows, Mac OS X, and Linux (Any)

We also provide a PDF file that has color images of the screenshots/diagrams used in this book. Click here to download it.

Get to Know the Author

Michael Avendi is a principal data scientist with vast experience in deep learning, computer vision, and medical imaging analysis. He works on the research and development of data-driven algorithms for various imaging problems, including medical imaging applications. His research papers have been published in major medical journals, including the Medical Imaging Analysis journal. Michael Avendi is an active Kaggle participant and was awarded a top prize in a Kaggle competition in 2017.

Suggestions and Feedback

Click here if you have any feedback or suggestions.

pytorch-computer-vision-cookbook's People

Contributors

casijoe5231, manikandankurup-packt, mnauf, packt-itservice

pytorch-computer-vision-cookbook's Issues

lacking data

I don't see any data files. Could you please upload those too? Thank you.

Chapter 6 - CUDA goes out of memory even with batch size 4.

I am not sure why, but CUDA runs out of memory even with a batch size of 4. I think many unneeded things (like gradients) are kept in memory, so the memory is insufficient. I have 6 GB of memory. Can you alter the code to clear unneeded things out of memory and make it more efficient?

I also tried a Kaggle kernel, and CUDA ran out of memory there too.
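
The issue does not say which part of the Chapter 6 code holds on to memory, so the following is only a generic sketch of two common habits that reduce what autograd keeps alive: accumulating the loss as a Python float, and running evaluation under torch.no_grad(). The tiny nn.Linear model and the random tensors are stand-ins, not the book's segmentation code.

```python
import torch
from torch import nn

model = nn.Linear(4, 2)            # stand-in for the real network (assumption)
loss_fn = nn.MSELoss()
x, y = torch.randn(8, 4), torch.randn(8, 2)

# Training step: keep only the Python float, not the loss tensor and its graph.
loss = loss_fn(model(x), y)
running_loss = loss.item()

# Evaluation: no_grad() stops autograd from recording a graph at all.
model.eval()
with torch.no_grad():
    val_out = model(x)

print(type(running_loss).__name__, val_out.requires_grad)
```

If memory still runs out with these in place, the input resolution or the model size is the more likely cause, and neither habit will help with that.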

Chapter 5 training code question/error ?

The outputs of the YOLO layers use the transform_outputs method to convert the box predictions into pixels.
When training, we need to revert this back to the original network output of the YOLO layer. To do this, the transform_bbox method is used. It does revert the operations of transform_outputs, except for the last step of transform_outputs, where the boxes are scaled by the stride factor (32, 16 or 8, depending on the layer). When calculating the loss, we directly compare the calculated targets (via get_yolo_targets) with the 'reverted' YOLO network outputs. To me this looks wrong, as the reverted YOLO outputs still include the stride factor.

Chapter6-deployment Error

In the code, "path2train" has been used instead of "path2test".
Changing the name fixes the error.
This error is also in the book itself. I am working through every project in this book and its code, while also trying to make the code more advanced.

chapter2 I have a KeyError: '_labels'

Traceback (most recent call last):
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
return self._engine.get_loc(key)
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '_labels'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 37, in <module>
histo_dataset = histoCancerDataset(data_dir, data_transformer, "train")
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 23, in __init__
self.labels = [labels_df.loc[filename[:-4]].values[0] for filename in filenames]
File "C:/Users/jewoo/PycharmProjects/vision/chapter/customDataset.py", line 23, in <listcomp>
self.labels = [labels_df.loc[filename[:-4]].values[0] for filename in filenames]
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 1768, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 1965, in _getitem_axis
return self._get_label(key, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexing.py", line 625, in _get_label
return self.obj._xs(label, axis=axis)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\generic.py", line 3537, in xs
loc = self.index.get_loc(key)
File "C:\Users\jewoo\anaconda3\envs\totti\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: '_labels'
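
The failing key '_labels' is what filename[:-4] yields for a name such as "_labels.csv", so one possible cause is a non-image file sitting in the folder that is listed into filenames. The traceback does not show the folder contents, so this is an assumption. A minimal sketch of filtering the listing before the label lookup follows; the helper image_ids, the ".tif" extension, and the plain dict standing in for labels_df are all hypothetical.

```python
import os

def image_ids(filenames, known_ids, ext=".tif"):
    """Split file names into those with the expected extension and a known label id, and the rest."""
    kept, skipped = [], []
    for name in filenames:
        stem, suffix = os.path.splitext(name)
        if suffix.lower() == ext and stem in known_ids:
            kept.append(name)
        else:
            skipped.append(name)
    return kept, skipped

# Hypothetical directory listing and label ids, for illustration only.
labels = {"a1b2": 0, "c3d4": 1}
kept, skipped = image_ids(["a1b2.tif", "_labels.csv", "c3d4.tif"], labels)
print(kept)
print(skipped)
```

Printing the skipped names is usually enough to see which stray file triggered the KeyError.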

Chapter 10 Resnt18RNN model accuracy is 0.

Amazing work, thank you for sharing the code.
The accuracy of the model is 0 or 0.05 on the validation set.
What is the issue? I just copied your code.

Thank you for your time. I look forward to your reply.

Chapter 10

It seems there is a mistake in the code for chapter 10. Using the original dataset and code as in the book does not work. The model seems not to learn and is stuck at 1.98% accuracy.

Chapter 5 get_yolo_targets

Hello,

I don't understand how the code snippet below works. Can anyone explain it to me?
In 1), the shape of "obj_mask" is [8, 3, 13, 13] because I'm using a batch size of 8.
In 2), "batch_inds" and "best_anchor_ind" each have 90 elements because the program found 90 bounding boxes in the batch.
So the original shape is [8, 3, 13, 13], but the shape being indexed appears to be [90, 90, 13, 13].
Is this right? The program runs without errors, but I'm wondering whether the code is correct.

sizeT=batch_size, num_anchors, grid_size, grid_size
obj_mask = torch.zeros(sizeT,device=device,dtype=torch.uint8)             # 1)
noobj_mask = torch.ones(sizeT,device=device,dtype=torch.uint8)
tx = torch.zeros(sizeT, device=device, dtype=torch.float32)
ty= torch.zeros(sizeT, device=device, dtype=torch.float32)
tw= torch.zeros(sizeT, device=device, dtype=torch.float32)
th= torch.zeros(sizeT, device=device, dtype=torch.float32)

sizeT=batch_size, num_anchors, grid_size, grid_size, num_cls
tcls= torch.zeros(sizeT, device=device, dtype=torch.float32)

target_bboxes = target[:, 2:] * grid_size
t_xy = target_bboxes[:, :2]
t_wh = target_bboxes[:, 2:]
t_x, t_y = t_xy.t()
t_w, t_h = t_wh.t()

grid_i, grid_j = t_xy.long().t()

iou_with_anchors=[get_iou_WH(anchor, t_wh) for anchor in anchors]
iou_with_anchors = torch.stack(iou_with_anchors)
best_iou_wa, best_anchor_ind = iou_with_anchors.max(0)

batch_inds, target_labels = target[:, :2].long().t()
obj_mask[batch_inds, best_anchor_ind, grid_j, grid_i] = 1                 # 2)
noobj_mask[batch_inds, best_anchor_ind, grid_j, grid_i] = 0
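
For the question above: when every dimension is indexed with a 1-D integer tensor of the same length, PyTorch pairs the index tensors element by element, so N boxes address N individual cells rather than an [N, N, 13, 13] block. A small sketch with four hypothetical boxes instead of 90 (the index values are made up for illustration):

```python
import torch

batch_size, num_anchors, grid_size = 8, 3, 13
obj_mask = torch.zeros(batch_size, num_anchors, grid_size, grid_size, dtype=torch.uint8)

# Hypothetical targets: four boxes, one index of each kind per box.
batch_inds = torch.tensor([0, 0, 5, 7])
best_anchor_ind = torch.tensor([2, 1, 0, 2])
grid_j = torch.tensor([3, 12, 6, 0])
grid_i = torch.tensor([4, 0, 6, 9])

# The four index tensors are zipped element-wise, so exactly one cell per box is set.
obj_mask[batch_inds, best_anchor_ind, grid_j, grid_i] = 1

print(obj_mask.shape)       # the mask keeps its original shape
print(int(obj_mask.sum()))  # number of cells that were set
```

If two boxes map to the same (batch, anchor, row, column) cell, that cell is simply written twice, so the sum can be smaller than the number of boxes.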

Chapter 5 method xyxyh2xywh

In the implementation:

  def xyxyh2xywh(xyxy, image_size=416):
      xywh = torch.zeros(xyxy.shape[0],6)
      xywh[:,2] = (xyxy[:, 0] + xyxy[:, 2]) / 2./img_size
      xywh[:,3] = (xyxy[:, 1] + xyxy[:, 3]) / 2./img_size
      xywh[:,5] = (xyxy[:, 2] - xyxy[:, 0])/img_size 
      xywh[:,4] = (xyxy[:, 3] - xyxy[:, 1])/img_size
      xywh[:,1]= xyxy[:,6]    
      return xywh

I think it should be:

  def xyxyh2xywh(xyxy, image_size=416):
      xywh = torch.zeros(xyxy.shape[0],6)
      xywh[:,2] = (xyxy[:, 0] + xyxy[:, 2]) / 2./img_size
      xywh[:,3] = (xyxy[:, 1] + xyxy[:, 3]) / 2./img_size
      xywh[:,4] = (xyxy[:, 2] - xyxy[:, 0])/img_size 
      xywh[:,5] = (xyxy[:, 3] - xyxy[:, 1])/img_size
      xywh[:,1]= xyxy[:,6]    
      return xywh
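
Both quoted versions also declare the parameter as image_size while the body reads img_size, which would raise a NameError unless img_size exists as a global. The sketch below uses a single name and shows the column order the issue argues for (width from the x extent, then height from the y extent). It is a simplified illustration on plain four-column boxes; the function name corners_to_center and the four-column layout are assumptions, not the book's six-column format.

```python
import torch

def corners_to_center(xyxy, img_size=416):
    """Convert [x1, y1, x2, y2] pixel boxes to normalized [xc, yc, w, h]."""
    xywh = torch.zeros_like(xyxy, dtype=torch.float32)
    xywh[:, 0] = (xyxy[:, 0] + xyxy[:, 2]) / 2.0 / img_size  # x center
    xywh[:, 1] = (xyxy[:, 1] + xyxy[:, 3]) / 2.0 / img_size  # y center
    xywh[:, 2] = (xyxy[:, 2] - xyxy[:, 0]) / img_size        # width from the x extent
    xywh[:, 3] = (xyxy[:, 3] - xyxy[:, 1]) / img_size        # height from the y extent
    return xywh

# One hypothetical box that is wider than it is tall.
boxes = torch.tensor([[104.0, 208.0, 312.0, 260.0]])
result = corners_to_center(boxes)
print(result)
```

A box that is clearly wider than tall, as here, makes a swapped width and height easy to spot in the output.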

Chapter5 error

Hello,

I got some errors at Chapter5

Can you tell me how to solve this issue?

Thank you.

NotADirectoryError

I downloaded the HMDB dataset and extracted the main folder, but I was getting an error (NotADirectoryError) in chapter10.py when I was trying to set path2ajpgs, and I'm sure the path is correct.
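
The issue does not include the traceback, so the cause is not known. One common way to hit NotADirectoryError is calling os.listdir on every entry of a dataset folder when some entries are files (archives, hidden files) rather than class folders. A minimal sketch of listing only the sub-directories follows; the helper list_subdirs and the folder names are illustrative assumptions, not code from chapter10.py.

```python
import os
import tempfile

def list_subdirs(root):
    """Return sorted names of the entries in root that are directories."""
    return sorted(name for name in os.listdir(root)
                  if os.path.isdir(os.path.join(root, name)))

# Build a throwaway folder that mixes class folders with a stray archive file.
with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "brush_hair"))
    os.mkdir(os.path.join(root, "cartwheel"))
    open(os.path.join(root, "brush_hair.rar"), "w").close()
    subdirs = list_subdirs(root)
    print(subdirs)
```

Comparing this list with a plain os.listdir of the same folder shows which entries are not directories.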

Non max suppression run time issue (Chapter 05)

The non-max suppression code did not run; I had to change the line
detections[0, :4] = (ww * detections[supp_inds, :4]).sum(0) / ww.sum()
into
detections[0, :4] = (ww.view(-1, 1) * detections[supp_inds, :4]).sum(0) / ww.sum()
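
The change above is a broadcasting fix: ww holds one weight per overlapping detection, with shape (n,), while detections[supp_inds, :4] has shape (n, 4). A sketch with three hypothetical detections (all values are made up) shows the failure and the reshaped version:

```python
import torch

# Hypothetical values: 3 overlapping detections, 7 columns each; the first four are box coordinates.
detections = torch.arange(21, dtype=torch.float32).view(3, 7)
ww = torch.tensor([0.5, 0.3, 0.2])  # one weight per detection, shape (3,)

boxes = detections[:, :4]           # shape (3, 4)
try:
    ww * boxes                      # (3,) against (3, 4): trailing sizes 3 and 4 do not match
    failed = False
except RuntimeError:
    failed = True

# Reshaping the weights to a column (3, 1) lets them broadcast across the 4 coordinates.
merged = (ww.view(-1, 1) * boxes).sum(0) / ww.sum()
print(failed, merged)
```

Note that with exactly four overlapping detections the original line would run without an error, but it would weight the coordinates instead of the detections, so the reshaped form is the safer one in every case.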

Error when running the module below

from sklearn.model_selection import StratifiedShuffleSplit

sss = StratifiedShuffleSplit(n_splits=2, test_size=0.5, random_state=42)
train_indx, test_indx = next(sss.split(unique_ids, unique_labels))

train_ids = [unique_ids[ind] for ind in train_indx]
train_labels = [unique_labels[ind] for ind in train_indx]
print(len(train_ids), len(train_labels))

test_ids = [unique_ids[ind] for ind in test_indx]
test_labels = [unique_labels[ind] for ind in test_indx]
print(len(test_ids), len(test_labels))

ValueError: The least populated class in y has only 1 member, which is too few. The minimum number of groups for any class cannot be less than 2.
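
The ValueError means at least one label occurs only once in unique_labels, and a stratified split needs every class on both sides. A minimal sketch for finding such labels before calling StratifiedShuffleSplit is shown below; the helper drop_rare_classes and the sample ids and labels are hypothetical.

```python
from collections import Counter

def drop_rare_classes(ids, labels, min_count=2):
    """Keep only samples whose label occurs at least min_count times; also report the rare labels."""
    counts = Counter(labels)
    kept = [(i, y) for i, y in zip(ids, labels) if counts[y] >= min_count]
    kept_ids = [i for i, _ in kept]
    kept_labels = [y for _, y in kept]
    rare = sorted(y for y, c in counts.items() if c < min_count)
    return kept_ids, kept_labels, rare

# Made-up ids and labels, for illustration only.
unique_ids = ["v1", "v2", "v3", "v4", "v5"]
unique_labels = ["run", "run", "jump", "jump", "wave"]
ids, labels, rare = drop_rare_classes(unique_ids, unique_labels)
print(rare)
```

A long list of rare labels often points to a data loading problem, such as a wrong path or a partly extracted dataset, in which case dropping the classes would only hide it.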

Training yolo model with different image size

Hi,
I'm using the code from Chapter 5 as a guide to train a yolo model, but I'm struggling with trying to use an image of different dimensions than the 416 that is used in the book. I've edited the create_layers function so that it takes image size as an input and passes it to YOLOLayer, but when I try to train a model, I get the error:

File "", line 97, in forward
x = torch.cat([layer_outputs[int(l_i)]
RuntimeError: Sizes of tensors must match except in dimension 1. Got 34 and 33 in dimension 2

Do you have any suggestions for using a different image size? I could not find any details about this in the book either. Thank you.
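
A size mismatch such as "34 and 33" at a concatenation is what happens when the input size is not a multiple of the largest stride (32): the coarsest feature map is rounded up, and its 2x upsampled copy no longer matches the map it is concatenated with. The sketch below reproduces that arithmetic under the assumption that each of the five downsampling layers is a 3x3, stride-2, padding-1 convolution, which rounds the size up. The function names are hypothetical and the size 520 is only an example that yields the same 34 versus 33 pattern, not the size used in the issue.

```python
def feature_map_sizes(img_size, num_downsamples=5):
    """Spatial size after each stride-2 layer, assuming each one rounds up."""
    sizes = []
    size = img_size
    for _ in range(num_downsamples):
        size = (size + 1) // 2  # ceil(size / 2)
        sizes.append(size)
    return sizes

def route_sizes_match(img_size):
    """Check that each 2x upsampled map matches the map one level up, as a concatenation requires."""
    sizes = feature_map_sizes(img_size)
    return sizes[-1] * 2 == sizes[-2] and sizes[-2] * 2 == sizes[-3]

for s in (416, 520, 608):
    print(s, feature_map_sizes(s), route_sizes_match(s))
```

Under these assumptions, choosing an image size that is a multiple of 32 (for example 608) keeps all concatenated maps the same size.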
