
willbrennan / semanticsegmentation


A framework for training segmentation models in PyTorch on labelme annotations, with pretrained examples of skin, cat, and pizza-topping segmentation.

License: MIT License

pytorch torchvision computer-vision semantic-segmentation segmentation skin-segmentation skin-detection pizza-toppings pizza cats

semanticsegmentation's Introduction

Semantic Segmentation

Overview

This project started as a replacement for the Skin Detection project, which used traditional computer vision techniques. This project implements two models,

  • FCNResNet101 from torchvision for accurate segmentation
  • BiSeNetV2 for real-time segmentation

These models are trained with masks generated from labelme annotations. As labelme annotations allow multiple categories per pixel, we use multi-label semantic segmentation. Both the accurate and real-time models are in the pretrained directory.
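To make the multi-label setup concrete, here is a minimal sketch of how a per-category target can be built from a labelme annotation. The JSON field names follow labelme's format; the file name and category list are hypothetical, and the repo's own dataset code may differ.

import json

import cv2
import numpy as np

categories = ['cat', 'bird']  # hypothetical category list
annotation = json.load(open('image_0001.json'))  # a labelme annotation file

# one binary channel per category; unlike multi-class masks, channels may overlap
mask = np.zeros((len(categories), annotation['imageHeight'], annotation['imageWidth']), dtype=np.uint8)

for shape in annotation['shapes']:
    if shape['label'] in categories:
        points = np.array(shape['points'], dtype=np.int32)
        cv2.fillPoly(mask[categories.index(shape['label'])], [points], 1)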

Getting Started

The pretrained models are stored in the repo with git-lfs; when you clone, make sure you've pulled the files by running,

git lfs pull

or by downloading them from GitHub directly. This project uses conda to manage its environment; once conda is installed, we create the environment and activate it,

conda env create -f environment.yml
conda activate semantic_segmentation

On Windows, PowerShell needs to be initialised and the execution policy needs to be modified,

conda init powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser

Pre-Trained Segmentation Projects

This project comes bundled with several pretrained models, which can be found in the pretrained directory. To infer segmentation masks on your images, run evaluate_images.py,

# to display the output
python evaluate_images.py --images ~/Pictures/ --model pretrained/model_segmentation_skin_30.pth --model-type FCNResNet101 --display
# to save the output
python evaluate_images.py --images ~/Pictures/ --model pretrained/model_segmentation_skin_30.pth --model-type FCNResNet101 --save

To run the real-time model, change the --model-type,

# to display the output
python evaluate_images.py --images ~/Pictures/ --model pretrained/model_segmentation_realtime_skin_30.pth --model-type BiSeNetV2 --display
# to save the output
python evaluate_images.py --images ~/Pictures/ --model pretrained/model_segmentation_realtime_skin_30.pth --model-type BiSeNetV2 --save
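If you'd rather call a model from Python, the sketch below loads the skin model and thresholds each category channel. load_model and the models registry are the same helpers evaluate_images.py uses; the ImageNet normalisation, the image path, and the 0.5 threshold are assumptions here.

import cv2
import torch

from semantic_segmentation import models
from semantic_segmentation import load_model

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
state_dict = torch.load('pretrained/model_segmentation_skin_30.pth', map_location=device)
model = load_model(models['FCNResNet101'], state_dict).to(device).eval()

image = cv2.cvtColor(cv2.imread('my_photo.jpg'), cv2.COLOR_BGR2RGB)  # hypothetical image
x = torch.from_numpy(image).permute(2, 0, 1).float() / 255.0  # HWC uint8 -> CHW float
mean = torch.tensor([0.485, 0.456, 0.406]).view(3, 1, 1)
std = torch.tensor([0.229, 0.224, 0.225]).view(3, 1, 1)
x = ((x - mean) / std).unsqueeze(0).to(device)

with torch.no_grad():
    logits = model(x)['out']             # (1, num_categories, H, W)
    masks = torch.sigmoid(logits) > 0.5  # multi-label: threshold each channel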

Skin Segmentation

This model was trained with a custom dataset of 150 images taken from COCO, to which skin-segmentation annotations were added. The dataset includes a wide variety of skin colours and lighting conditions, making this model more robust than the Skin Detection project. This model detects,

  • skin

Skin Segmentation

Pizza Topping Segmentation

This was trained with a custom dataset of 89 images taken from COCO, to which pizza-topping annotations were added. There are very few images for each type of topping, so this model performs quite badly and needs many more images to behave well!

  • 'chilli', 'ham', 'jalapenos', 'mozzarella', 'mushrooms', 'olive', 'pepperoni', 'pineapple', 'salad', 'tomato'

Pizza Toppings

Cat and Bird Segmentation

Annotated images of birds and cats were taken from COCO using the extract_from_coco script, and the model was then trained on them.

  • cat, bird

Demo on Cat & Birds

Training New Projects

To train a new project, you can either create new labelme annotations for your images or convert existing COCO annotations. To launch labelme, run,

labelme

and start annotating your images! You'll need a couple of hundred. Alternatively, if your category is already in COCO, you can run the conversion tool to create labelme annotations from it.

python extract_from_coco.py --images ~/datasets/coco/val2017 --annotations ~/datasets/coco/annotations/instances_val2017.json --output ~/datasets/my_cat_images_val --categories cat
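For a sense of what the conversion works from, here is a rough sketch of pulling per-instance masks for a category with pycocotools (which the environment includes). The real extract_from_coco.py also writes labelme-style JSON, so treat this as illustration only.

import os.path

from pycocotools.coco import COCO

coco = COCO(os.path.expanduser('~/datasets/coco/annotations/instances_val2017.json'))
cat_ids = coco.getCatIds(catNms=['cat'])

for img_id in coco.getImgIds(catIds=cat_ids):
    for ann in coco.loadAnns(coco.getAnnIds(imgIds=img_id, catIds=cat_ids, iscrowd=False)):
        binary_mask = coco.annToMask(ann)  # (height, width) uint8 mask for one instance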

Once you've got a directory of labelme annotations, you can check how the images will be shown to the model during training by running,

python check_dataset.py --dataset ~/datasets/my_cat_images_val
# to show our dataset with training augmentation
python check_dataset.py --dataset ~/datasets/my_cat_images_val --use-augmentation

If you're happy with the images and how they'll appear in training, then train the model using,

python train.py --train ~/datasets/my_cat_images_train --val ~/datasets/my_cat_images_val --model-tag segmentation_cat --model-type FCNResNet101

This may take some time depending on how many images you have. Tensorboard logs are available in the logs directory. To run your trained model on a directory of images, run,

# to display the output
python evaluate_images.py --images ~/Pictures/my_cat_imgs --model models/model_segmentation_cat_30.pth --model-type FCNResNet101 --display 
# to save the output
python evaluate_images.py --images ~/Pictures/my_cat_imgs --model models/model_segmentation_cat_30.pth --model-type FCNResNet101 --save
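To follow the Tensorboard logs while a run is in progress, point tensorboard at the logs directory,

tensorboard --logdir logs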

semanticsegmentation's People

Contributors

saskra, willbrennan


semanticsegmentation's Issues

Transfer learning

Unfortunately, skin segmentation works very poorly on my own smartphone photos. You wouldn't happen to have a short tutorial on how I could use my own ground truth masks on these images to improve your model via transfer learning/fine tuning? Do you have examples of the masks used so I can generate mine in the same format (I use a different program than labelme)? Did you create these masks yourself, or are they part of the COCO dataset?

Or is there just some pre-processing of the images that I should have done?

Assertion Error when replicating this code with Google Colab

I get the following error code as I'm attempting to replicate this code on Google Colab, because I don't have a GPU on my local machine. I also get an assertion error when running on my local machine after altering the cuda code to tolerate a CPU.

I've never worked with git lfs before, so I'm assuming the problem might be there, but I'm wondering if others may be having the same issue.

Screenshot of my colab code attached

INFO:root:loading FCNResNet101 from pretrained/model_segmentation_skin_30.pth
INFO:root:creating model with categories: ['skin']
/usr/local/lib/python3.6/dist-packages/torch/nn/modules/container.py:550: UserWarning: Setting attributes on ParameterDict is not supported.
warnings.warn("Setting attributes on ParameterDict is not supported.")
Downloading: "https://download.pytorch.org/models/resnet101-5d3b4d8f.pth" to /root/.cache/torch/hub/checkpoints/resnet101-5d3b4d8f.pth
100% 170M/170M [00:03<00:00, 44.7MB/s]
Downloading: "https://download.pytorch.org/models/fcn_resnet101_coco-7ecb50ca.pth" to /root/.cache/torch/hub/checkpoints/fcn_resnet101_coco-7ecb50ca.pth
100% 208M/208M [00:06<00:00, 33.3MB/s]
INFO:root:evaluating images from /root/Pictures/
Traceback (most recent call last):
File "evaluate_images.py", line 72, in <module> for image_file in find_files(image_dir, ['.png', '.jpg', '.jpeg']):
File "evaluate_images.py", line 31, in find_files assert dir_path.exists()
AssertionError


Working on Colab - import an image and test skin model

Hi Will,

Thank you so much for sharing your work. I would like to adopt it for a pilot project using the Colab environment.

First, I would like to replicate the demo you showed; could you please help in addressing the issues I found? I'm using Colab.

(I also include issues and solutions not strictly relevant to your library, for other people who may experience the same troubles.)

Could not activate the working env in Colab.

I solved it by making the dependencies explicit in the yml:

!conda install -y -c pytorch -c conda-forge pytorch==1.5 python==3.7.5 pip torchvision==0.6.0 cudatoolkit=10.1 numpy==1.16.3
!pip install pytorch-ignite opencv-python pycocotools tensorboard albumentations yapf pytest labelme ignite 

However, I cannot run the library:

from semantic_segmentation import models
from semantic_segmentation import load_model
from semantic_segmentation import draw_results

because ignite module cannot be found.

I tried any of the following:

!conda install ignite -c pytorch -y
!pip install pytorch-ignite --upgrade
!pip install git+https://github.com/pytorch/ignite

But all of them failed; the ignite module still could not be found (ModuleNotFoundError).

I tried to install it from repo:

!git clone https://github.com/pytorch/ignite
!python ./ignite/setup.py install

In this case ignite is installed, but I cannot load its modules as expected:

from ignite import engine
ImportError: cannot import name 'engine'

Could you help me get your project set up properly?

Locating the dataset

You mentioned you used a dataset of about 150 labelled images.
Where can I find it?

coreml conversion error

Hi! When I convert the pytorch model to ONNX (for coreml conversion), I get an unpickling error while loading the ".pth" weights file. Can you help with converting the pytorch model to ONNX?

Skin segmentation dataset

Could you please share the training data (images and labels) for the model training in skin segmentation?
(150 images from coco)

Semantic Segmentation Tool

I want to train the net on my own datasets, but I do not know which tool I need to use to tag them.

Can you give me the name of the tool?

Thanks

real time

Hello and thank you for sharing your work!
I do not entirely understand how the real-time model for skin segmentation works.
Do you have some instructions regarding the input, the output, and the purpose?
Thank you, Lucia

Working with high resolution images

Hi Will,

Thanks for sharing this awesome project.
So, I am trying to run this with high-resolution images and the results were not good.

Here are some examples
Input images:
14
15

These are the results:
mask_skin_14
mask_skin_15

Then I reduced the Obama image to 1024x1024 (previously 3000x2284) and the other image from 1300x1241 to 512x512.
New results:

mask_skin_15_1024
mask_skin_14-512x

I was wondering if this happens because the model was trained with lower-resolution images. Is that right?
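For reference, this is roughly how I downscaled the images before running inference again (a sketch assuming OpenCV; the 1024 cap and file names are placeholders):

import cv2

image = cv2.imread('14.jpg')  # placeholder filename
scale = 1024 / max(image.shape[:2])
if scale < 1.0:  # only ever shrink, never enlarge
    image = cv2.resize(image, None, fx=scale, fy=scale, interpolation=cv2.INTER_AREA)
cv2.imwrite('14_1024.jpg', image)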

CUDA error: no kernel image

I am encountering "RuntimeError: CUDA error: no kernel image is available for execution on the device" error when trying to run model_segmentation_skin.

I tried downgrading my CUDA 11.2 installation to 10.1, but am still encountering the error. Is the current environment not compatible with my system? Any recommendations to resolve the issue would be greatly appreciated.

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Sun_Jul_28_19:07:16_PDT_2019
Cuda compilation tools, release 10.1, V10.1.243
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.56       Driver Version: 460.56       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  GeForce RTX 307...  Off  | 00000000:01:00.0  On |                  N/A |
| N/A   42C    P8    16W /  N/A |    494MiB /  7982MiB |      2%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A      1652      G   /usr/lib/xorg/Xorg                304MiB |
|    0   N/A  N/A      1833      G   /usr/bin/gnome-shell              119MiB |
|    0   N/A  N/A      3073      G   ...AAAAAAAAA= --shared-files       68MiB |
+-----------------------------------------------------------------------------+

Multiple Classes

'precision': metrics.Precision(thresholded_transform(threshold=0.5)),

If I see it correctly, you have also solved semantic segmentation for multiple classes with this source code, and not only binary decisions like skin vs. background, for example in the pizza-topping task. Could it be, however, that something other than the threshold transformation in the line above must be used for the metrics? And that, e.g., the sigmoid output transformation in this evaluator would also have to be adjusted? Do you happen to have a solution for this?
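For reference, this is roughly the shape of transform I mean; a sketch only, since the actual thresholded_transform in semantic_segmentation/metrics.py may differ.

import torch

def thresholded_transform(threshold=0.5):
    # multi-label: sigmoid each category channel independently, then threshold
    def transform(output):
        y_pred, y = output
        return (torch.sigmoid(y_pred) > threshold).long(), y.long()
    return transform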

Real-Time segmentation

Hi!

First of all thank you for your work!

I would be interested in real-time skin segmentation!

  • Daniel

coremltools conversion from pytorch to coreml

@WillBrennan Do you have experience using coremltools to convert pytorch models to coreml?

I created a script to convert the "model_segmentation_skin_30.pth" model to coreml format, but have encountered issues. The coreml model is generated successfully, but the segmentation outputs are not working. I believe there may be an issue with the conversion script.

import urllib
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

import torch
import torch.nn as nn
import torchvision
import json

from torchvision import transforms
from PIL import Image

from semantic_segmentation.models.fcn import FCNResNet101

from coremltools.converters.mil import register_torch_op
from coremltools.converters.mil.frontend.torch.ops import _get_inputs
from coremltools.converters.mil.mil import Builder as mb
import coremltools as ct

@register_torch_op
def type_as(context, node):
    inputs = _get_inputs(context, node)
    context.add(mb.cast(x=inputs[0], dtype='int32'), node.name)


labels = ['skin']

device = torch.device('cpu')

# Load the model 

model = FCNResNet101(categories=labels)
model.load_state_dict(torch.load("./pretrained/model_segmentation_skin_30.pth", map_location=device))
model.eval()

# Load a sample image
input_image = Image.open("./test/test-img.jpg")
input_image.show()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(
        mean=[0.485, 0.456, 0.406],
        std=[0.229, 0.224, 0.225],
    ),
])

input_tensor = preprocess(input_image)
input_batch = input_tensor.unsqueeze(0)

with torch.no_grad():
    output = model(input_batch)['out'][0]
torch_predictions = output.argmax(0)



def display_segmentation(input_image, output_predictions):
    # Create a color pallette, selecting a color for each class
    palette = torch.tensor([2 ** 25 - 1, 2 ** 15 - 1, 2 ** 21 - 1])
    colors = torch.as_tensor([i for i in range(21)])[:, None] * palette
    colors = (colors % 255).numpy().astype("uint8")

    # Plot the semantic segmentation predictions of 21 classes in each color
    r = Image.fromarray(
        output_predictions.byte().cpu().numpy()
    ).resize(input_image.size)
    r.putpalette(colors)

    # Overlay the segmentation mask on the original image
    alpha_image = input_image.copy()
    alpha_image.putalpha(255)
    r = r.convert("RGBA")
    r.putalpha(128)
    seg_image = Image.alpha_composite(alpha_image, r)
    # display(seg_image) -- doesn't work
    seg_image.show()

display_segmentation(input_image, torch_predictions)

# Wrap the Model to Allow Tracing*
class WrappedFCNResNet(nn.Module):
    
    def __init__(self):
        super(WrappedFCNResNet, self).__init__()
        self.model = FCNResNet101(categories=labels)
        self.model.load_state_dict(torch.load("./pretrained/model_segmentation_skin_30.pth", map_location=device))
        self.model.eval()
    def forward(self, x):
        res = self.model(x)
        x = res["out"]
        return x
        
# Trace the Wrapped Model
traceable_model = WrappedFCNResNet().eval()
trace = torch.jit.trace(traceable_model, input_batch)

# Convert the model
mlmodel = ct.convert(
    trace,
    inputs=[ct.TensorType(name="input", shape=input_batch.shape)],
)

# Save the model without new metadata
mlmodel.save("SkinSegmentation_no_metadata.mlmodel")

# Load the saved model
mlmodel = ct.models.MLModel("SkinSegmentation_no_metadata.mlmodel")

# Add new metadata for preview in Xcode
labels_json = {"labels": ["skin"]}

mlmodel.user_defined_metadata["com.apple.coreml.model.preview.type"] = "SkinSegmentation"
mlmodel.user_defined_metadata['com.apple.coreml.model.preview.params'] = json.dumps(labels_json)

mlmodel.save("SkinSegmentation_plus_metadata.mlmodel")

Does anything look off to you in the above script?

Real-time evaluation error

Hi!

I couldn't run the evaluation; there is something wrong with loading the model:

python evaluate_images.py --images /media/sztaki/pi/video5 --model pretrained/model_segmentation_realtime_skin_30.pth --model-type BiSeNetV2 --save

INFO:root:loading BiSeNetV2 from pretrained/model_segmentation_realtime_skin_30.pth
Traceback (most recent call last):
File "evaluate_images.py", line 58, in
model = load_model(models[args.model_type], torch.load(args.model))
File "/home/terbe/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/serialization.py", line 426, in load
return _load(f, map_location, pickle_module, **pickle_load_args)
File "/home/terbe/anaconda3/envs/pytorch/lib/python3.6/site-packages/torch/serialization.py", line 603, in _load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

Any idea what the problem could be?
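A quick way to tell whether the .pth is real weights or an un-pulled git-lfs pointer (a pointer is plain text beginning with "version", which would match the invalid load key 'v'):

with open('pretrained/model_segmentation_realtime_skin_30.pth', 'rb') as f:
    print(f.read(40))
# b'version https://git-lfs.github.com/spec/' means the weights were never
# pulled; run `git lfs pull` as in the Getting Started section.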

Cannot launch "labelme"

I would like to add training data to the model, but am unable to launch labelme from the environment:

qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This application failed to start because no Qt platform plugin could be initialized. Reinstalling the application may fix this problem.

Available platform plugins are: eglfs, linuxfb, minimal, minimalegl, offscreen, vnc, wayland-egl, wayland, wayland-xcomposite-egl, wayland-xcomposite-glx, webgl, xcb.

Aborted (core dumped)

Am I missing a dependency? FYI - I am running Ubuntu 20.04.

Training error

Hello @WillBrennan ,

I tried to train using my own data / annotation files, but received the following error:

python train.py --train /home/yoda/Development/sportModel/Training --val /home/yoda/Development/sportModel/Validation --model-tag segmentation_sport --model-type FCNResNet101
INFO:root:running training on cuda
INFO:root:creating dataset and data loaders
INFO:root:loaded 240 annotations from /home/yoda/Development/sportModel/Training
INFO:root:use augmentation: True
INFO:root:categories: ['baseball', 'golf_ball', 'basketball']
INFO:root:loaded 40 annotations from /home/yoda/Development/sportModel/Validation
INFO:root:use augmentation: False
INFO:root:categories: ['baseball', 'golf_ball', 'basketball']
INFO:root:creating dataloaders with 16 workers and a batch-size of 2
/home/yoda/anaconda3/envs/semantic_segmentation/lib/python3.9/site-packages/torch/utils/data/dataloader.py:478: UserWarning: This DataLoader will create 16 worker processes in total. Our suggested max number of worker in current system is 12, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
  warnings.warn(_create_warning_msg(
INFO:root:creating FCNResNet101 and optimizer with initial lr of 0.0001
INFO:root:creating model with categories: ['baseball', 'golf_ball', 'basketball']
INFO:root:creating trainer and evaluator engines
Traceback (most recent call last):
  File "/home/yoda/Desktop/sport-segmentation/SemanticSegmentation/train.py", line 87, in <module>
    'IoU@0.3': IoUMetric(thresholded_transform(threshold=0.3)),
  File "/home/yoda/Desktop/sport-segmentation/SemanticSegmentation/semantic_segmentation/metrics.py", line 18, in __init__
    super().__init__(output_transform=output_transform, device=device)
  File "/home/yoda/anaconda3/envs/semantic_segmentation/lib/python3.9/site-packages/ignite/metrics/metric.py", line 224, in __init__
    if torch.device(device).type == "xla":
TypeError: Device() received an invalid combination of arguments - got (NoneType), but expected one of:
 * (torch.device device)
      didn't match because some of the arguments have invalid types: (NoneType)
 * (str type, int index)

Any idea why this might be occurring and how to resolve it? Thanks!

segmented images

Hi, thank you for sharing this work!
Could you please provide the 150 images used for skin detector and relative annotated masks? It would be really useful to understand how annotations were made. Thanks

Understanding the terminal output during training

Can anyone help me understand the terminal output during training? I followed these steps: https://github.com/WillBrennan/SemanticSegmentation#training-new-projects I only adjusted minor things.

My output looks like this (with some truncation, line breaks and indentation on my part for clarity):

python train.py --train datasets/my_cat_images_train --val datasets/my_cat_images_val --model-tag segmentation_cat --model-type FCNResNet101
INFO:root:running training on cuda
INFO:root:creating dataset and data loaders
INFO:root:loaded 4114 annotations from datasets/my_cat_images_train
INFO:root:use augmentation: True
INFO:root:categories: ['cat']
INFO:root:loaded 184 annotations from datasets/my_cat_images_val
INFO:root:use augmentation: False
INFO:root:categories: ['cat']
INFO:root:creating dataloaders with 19 workers and a batch-size of 6
INFO:root:creating FCNResNet101 and optimizer with initial lr of 0.0001
INFO:root:creating model with categories: ['cat']
INFO:root:creating trainer and evaluator engines
INFO:root:creating summary writer with tag segmentation_cat
INFO:root:attaching lr scheduler
INFO:root:attaching event driven calls
INFO:root:training...
	INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=2.
	
INFO:root:epoch[1] - iteration[500/685] loss: 0.145
INFO:root:epoch: 1 - current lr: 0.0001
	INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=1.
	INFO:ignite.engine.engine.Engine:Epoch[1] Complete. Time taken: 00:00:09
	INFO:ignite.engine.engine.Engine:Engine run complete. Time taken: 00:00:09
INFO:root:loss: 0.147 precision: 0.833 recall: 0.880 IoU@0.3: 0.724 IoU@0.5: 0.748 Epoch: 1
	INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=1.
	INFO:ignite.engine.engine.Engine:Epoch[1] Complete. Time taken: 00:00:09
	INFO:ignite.engine.engine.Engine:Engine run complete. Time taken: 00:00:09
INFO:root:loss: 0.123 precision: 0.873 recall: 0.911 IoU@0.3: 0.781 IoU@0.5: 0.804 Epoch: 1
	INFO:ignite.engine.engine.Engine:Epoch[1] Complete. Time taken: 00:10:52
	
INFO:root:epoch[2] - iteration[500/685] loss: 0.416
INFO:root:epoch: 2 - current lr: 9e-05
	INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=1.
	INFO:ignite.engine.engine.Engine:Epoch[1] Complete. Time taken: 00:00:09
	INFO:ignite.engine.engine.Engine:Engine run complete. Time taken: 00:00:09
INFO:root:loss: 0.161 precision: 0.854 recall: 0.880 IoU@0.3: 0.745 IoU@0.5: 0.765 Epoch: 2
	INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=1.
	INFO:ignite.engine.engine.Engine:Epoch[1] Complete. Time taken: 00:00:09
	INFO:ignite.engine.engine.Engine:Engine run complete. Time taken: 00:00:09
INFO:root:loss: 0.116 precision: 0.872 recall: 0.922 IoU@0.3: 0.787 IoU@0.5: 0.812 Epoch: 2
	INFO:ignite.engine.engine.Engine:Epoch[2] Complete. Time taken: 00:10:52
	INFO:ignite.engine.engine.Engine:Engine run complete. Time taken: 00:21:45

Here's what I'd like to know about it:

  • The outputs during training starting with "INFO:root:" are apparently defined in semantic_segmentation/engines.py, but where do the others ("INFO:ignite.engine.engine.Engine:") come from?
  • Why do two epochs run through as desired, yet after the initially correct "max_epochs=2", "max_epochs=1" is printed several times?
  • Why is "Epoch[1]" printed several times during the second epoch?
  • Why does it say several times in the middle of an epoch that the epoch or the run is "complete" and that a new run is starting?
  • What does "Time taken: 00:00:09" refer to, given that the real runtime is significantly longer?
  • Why are there intermediate results for the metrics during the epochs, although the function decorated with @trainer.on(engine.Events.EPOCH_COMPLETED) should only run when the epoch is finished? And why does the tensorboard contain only the second output per epoch?

Wall time,Step,Value
1667398091.2036688,1,0.8042078614234924
1667398743.5209541,2,0.8115814328193665

Possibly some of my questions come from the fact that I had only used TensorFlow and not Torch before. Thanks in advance to anyone who can help me understand at least some of these oddities.

CPU

Hello, I was trying to use your skin segmentation model on a CPU, but I am not able to run prediction on an image. Is that possible? Would you help me with the code modifications, please?
Thank you, Lucia

Input Shape

@WillBrennan What is the input shape of the model tensor for model_segmentation_skin_30.pth? [1,3,480,480] ?

Error in training

I tried to train BiSeNetV2 on my own data according to your instructions, but I got the following error:

INFO:root:creating dataset and data loaders
INFO:root:loaded 6800 annotations from /home/nvidia/Documents/Projects/SemanticSegmentation/RailSem19_LabelMe/train
INFO:root:use augmentation: True
INFO:root:categories: ['rail']
INFO:root:loaded 1700 annotations from /home/nvidia/Documents/Projects/SemanticSegmentation/RailSem19_LabelMe/val
INFO:root:use augmentation: False
INFO:root:categories: ['rail']
INFO:root:creating dataloaders with 16 workers and a batch-size of 2
INFO:root:creating BiSeNetV2 and optimizer with initial lr of 0.0001
INFO:root:creating model with categories: ['rail']
INFO:root:creating trainer and evaluator engines
INFO:root:creating summary writer with tag seg_train
INFO:root:attaching lr scheduler
INFO:root:attaching event driven calls
INFO:root:training...
INFO:ignite.engine.engine.Engine:Engine run starting with max_epochs=30.
/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py:552: UserWarning: Setting attributes on ParameterDict is not supported.
warnings.warn("Setting attributes on ParameterDict is not supported.")
/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py:645: UserWarning: nn.ParameterDict is being used with DataParallel but this is not supported. This dict will appear empty for the models replicated on each GPU except the original one.
warnings.warn("nn.ParameterDict is being used with DataParallel but this is not "
ERROR:ignite.engine.engine.Engine:Current run is terminating due to exception: Caught ValueError in replica 0 on device 0.
Original Traceback (most recent call last):
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 61, in _worker
output = module(*input, **kwargs)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/Documents/Projects/SemanticSegmentation/semantic_segmentation/models/bisenetv2.py", line 304, in forward
x_semantic = self.semantic(x)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/Documents/Projects/SemanticSegmentation/semantic_segmentation/models/bisenetv2.py", line 193, in forward
x = self.stage5(x)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/container.py", line 117, in forward
input = module(input)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/Documents/Projects/SemanticSegmentation/semantic_segmentation/models/bisenetv2.py", line 146, in forward
x_gap = self.conv_project(x_gap)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/Documents/Projects/SemanticSegmentation/semantic_segmentation/models/bisenetv2.py", line 23, in forward
return F.leaky_relu(self.bn(self.conv(x)))
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/batchnorm.py", line 136, in forward
self.weight, self.bias, bn_training, exponential_average_factor, self.eps)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2054, in batch_norm
_verify_batch_size(input.size())
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/functional.py", line 2037, in _verify_batch_size
raise ValueError('Expected more than 1 value per channel when training, got input size {}'.format(size))
ValueError: Expected more than 1 value per channel when training, got input size torch.Size([1, 128, 1, 1])

ERROR:ignite.engine.engine.Engine:Engine run is terminating due to exception: Caught ValueError in replica 0 on device 0.
(same original traceback as above)

Traceback (most recent call last):
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 775, in _internal_run
self._handle_exception(e)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 745, in _internal_run
time_taken = self._run_once_on_dataset()
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 850, in _run_once_on_dataset
self._handle_exception(e)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 469, in _handle_exception
raise e
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/engine.py", line 833, in _run_once_on_dataset
self.state.output = self._process_function(self, self.state.batch)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/ignite/engine/init.py", line 103, in _update
y_pred = model(x)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 161, in forward
outputs = self.parallel_apply(replicas, inputs, kwargs)
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/data_parallel.py", line 171, in parallel_apply
return parallel_apply(replicas, inputs, kwargs, self.device_ids[:len(replicas)])
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/nn/parallel/parallel_apply.py", line 86, in parallel_apply
output.reraise()
File "/home/nvidia/anaconda3/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
raise self.exc_type(msg)
ValueError: Caught ValueError in replica 0 on device 0.
(same original traceback as above)

How can I solve it?
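The failing shape torch.Size([1, 128, 1, 1]) suggests BatchNorm received a single-sample batch, which typically happens when the last batch of an epoch contains one image. A workaround sketch, assuming the loaders are built roughly as in train.py (the names here are placeholders):

from torch.utils.data import DataLoader

train_loader = DataLoader(train_dataset, batch_size=2, shuffle=True,
                          num_workers=16, drop_last=True)  # drop the final short batch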

Running out of memory

Sorry to interrupt your peace. I've been trying to run your code to train my own model on skin detection, but it always says it runs out of memory. I've checked on the internet, and all the answers I get say that I should free the data in each iteration, or things like that.
When I try to use the pretrained models, it also runs out of memory if the picture is larger than about 1 MB.

The thing is, I don't think your code is broken, so there must be something I'm doing wrong. Do you have any idea what it could be?

I have 4 GB of GPU memory and 12 GB of RAM, so memory shouldn't actually be a problem, I think. I'm running it on Windows 10 with an NVIDIA GTX 1050 Ti GPU that I bought just to run your code :'(
