
yolop's Introduction

You Only 👀 Once for Panoptic 🚗 Perception

You Only Look Once for Panoptic Driving Perception

by Dong Wu, Manwen Liao, Weitian Zhang, Xinggang Wang 📧, Xiang Bai, Wenqing Cheng, Wenyu Liu, School of EIC, HUST

(📧) corresponding author.

arXiv technical report (Machine Intelligence Research 2022)


中文文档 (Chinese documentation)

The Illustration of YOLOP

yolop

Contributions

  • We put forward an efficient multi-task network that can jointly handle three crucial tasks in autonomous driving: object detection, drivable area segmentation and lane detection, saving computational cost and inference time while improving the performance of each task. Our work is the first to reach real time on embedded devices while maintaining state-of-the-art performance on the BDD100K dataset.

  • We design ablative experiments to verify the effectiveness of our multi-task scheme, showing that the three tasks can be learned jointly without tedious alternating optimization.

  • We design ablative experiments showing that the grid-based prediction mechanism of the detection task is more closely related to that of the semantic segmentation task, which may serve as a reference for other multi-task learning research.

Results


Traffic Object Detection Result

Model Recall(%) mAP50(%) Speed(fps)
Multinet 81.3 60.2 8.6
DLT-Net 89.4 68.4 9.3
Faster R-CNN 81.2 64.9 8.8
YOLOv5s 86.8 77.2 82
YOLOP(ours) 89.2 76.5 41

Drivable Area Segmentation Result

Model mIOU(%) Speed(fps)
Multinet 71.6 8.6
DLT-Net 71.3 9.3
PSPNet 89.6 11.1
YOLOP(ours) 91.5 41

Lane Detection Result:

Model mIOU(%) IOU(%)
ENet 34.12 14.64
SCNN 35.79 15.84
ENet-SAD 36.56 16.02
YOLOP(ours) 70.50 26.20

Ablation Studies 1: End-to-end vs. Step-by-step:

Training_method Recall(%) AP(%) mIoU(%) Accuracy(%) IoU(%)
ES-W 87.0 75.3 90.4 66.8 26.2
ED-W 87.3 76.0 91.6 71.2 26.1
ES-D-W 87.0 75.1 91.7 68.6 27.0
ED-S-W 87.5 76.1 91.6 68.0 26.8
End-to-end 89.2 76.5 91.5 70.5 26.2

Ablation Studies 2: Multi-task vs. Single task:

Training_method Recall(%) AP(%) mIoU(%) Accuracy(%) IoU(%) Speed(ms/frame)
Det(only) 88.2 76.9 - - - 15.7
Da-Seg(only) - - 92.0 - - 14.8
Ll-Seg(only) - - - 79.6 27.9 14.8
Multitask 89.2 76.5 91.5 70.5 26.2 24.4

Ablation Studies 3: Grid-based vs. Region-based:

Training_method Recall(%) AP(%) mIoU(%) Accuracy(%) IoU(%) Speed(ms/frame)
R-CNNP Det(only) 79.0 67.3 - - - -
R-CNNP Seg(only) - - 90.2 59.5 24.0 -
R-CNNP Multitask 77.2(-1.8) 62.6(-4.7) 86.8(-3.4) 49.8(-9.7) 21.5(-2.5) 103.3
YOLOP Det(only) 88.2 76.9 - - - -
YOLOP Seg(only) - - 91.6 69.9 26.5 -
YOLOP Multitask 89.2(+1.0) 76.5(-0.4) 91.5(-0.1) 70.5(+0.6) 26.2(-0.3) 24.4

Notes:

  • The works we use for reference include MultiNet (paper, code), DLT-Net (paper), Faster R-CNN (paper, code), YOLOv5 (code), PSPNet (paper, code), ENet (paper, code), SCNN (paper, code) and SAD-ENet (paper, code). Thanks for their wonderful works.
  • In Table 4, E, D, S and W refer to the Encoder, the Detect head, the two Segment heads and the Whole network. So the algorithm "first train the Encoder and Detect head, then freeze them and train the two Segment heads, and finally train the entire network jointly on all three tasks" is denoted ED-S-W, and similarly for the others.

Visualization

Traffic Object Detection Result

detect result

Drivable Area Segmentation Result

Lane Detection Result

Notes:

  • The visualization of the lane detection result has been post-processed by quadratic fitting; a rough sketch of this step is shown below.
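
The repository does not ship the fitting code, so the snippet below is only a minimal sketch under simplifying assumptions: `ll_seg_mask` is a binary H x W lane mask from the lane-line head, its nonzero pixels belong to a single lane line, and the curve is modelled as x = f(y):

import numpy as np
import cv2

def fit_lane_quadratic(ll_seg_mask, canvas, color=(0, 0, 255)):
    # Collect the coordinates of all lane pixels in the binary mask.
    ys, xs = np.nonzero(ll_seg_mask)
    if len(xs) < 3:
        return canvas                      # not enough points to fit a quadratic
    # Fit x = a*y^2 + b*y + c (lanes are roughly vertical, so x is modelled as f(y)).
    a, b, c = np.polyfit(ys, xs, deg=2)
    y_fit = np.arange(ys.min(), ys.max() + 1)
    x_fit = a * y_fit ** 2 + b * y_fit + c
    pts = np.stack([x_fit, y_fit], axis=1).astype(np.int32).reshape(-1, 1, 2)
    cv2.polylines(canvas, [pts], isClosed=False, color=color, thickness=2)
    return canvas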

Project Structure

├─inference
│ ├─images   # inference images
│ ├─output   # inference result
├─lib
│ ├─config/default   # configuration of training and validation
│ ├─core    
│ │ ├─activations.py   # activation function
│ │ ├─evaluate.py   # calculation of metric
│ │ ├─function.py   # training and validation of model
│ │ ├─general.py   # calculation of metrics, NMS, data-format conversion, visualization
│ │ ├─loss.py   # loss function
│ │ ├─postprocess.py   # postprocess(refine da-seg and ll-seg, unrelated to paper)
│ ├─dataset
│ │ ├─AutoDriveDataset.py   # superclass dataset, general functions
│ │ ├─bdd.py   # subclass dataset, BDD-specific functions
│ │ ├─hust.py   # subclass dataset (campus scene, unrelated to paper)
│ │ ├─convert.py 
│ │ ├─DemoDataset.py   # demo dataset (image, video and stream)
│ ├─models
│ │ ├─YOLOP.py    # Setup and Configuration of model
│ │ ├─light.py    # Model lightweight(unrelated to paper, zwt)
│ │ ├─common.py   # calculation modules
│ ├─utils
│ │ ├─augmentations.py    # data augmentation
│ │ ├─autoanchor.py   # auto anchor(k-means)
│ │ ├─split_dataset.py  # (Campus scene, unrelated to paper)
│ │ ├─utils.py  # logging, device selection, time measurement, optimizer selection, model saving & initialization, distributed training
│ ├─run
│ │ ├─dataset/training time  # Visualization, logging and model_save
├─tools
│ │ ├─demo.py    # demo (folder, camera)
│ │ ├─test.py    
│ │ ├─train.py    
├─toolkits
│ │ ├─deploy    # Deployment of model
│ │ ├─datapre    # Generation of gt(mask) for drivable area segmentation task
├─weights    # pretrained model weights

Requirement

This codebase has been developed with Python 3.7, PyTorch 1.7+ and torchvision 0.8+:

conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.2 -c pytorch

See requirements.txt for additional dependencies and version requirements.

pip install -r requirements.txt

Data preparation

Download

We recommend organizing the dataset directory as follows:

# Files that share the same id correspond to each other across the folders
├─dataset root
│ ├─images
│ │ ├─train
│ │ ├─val
│ ├─det_annotations
│ │ ├─train
│ │ ├─val
│ ├─da_seg_annotations
│ │ ├─train
│ │ ├─val
│ ├─ll_seg_annotations
│ │ ├─train
│ │ ├─val

Update your dataset path in ./lib/config/default.py.
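
For reference, the dataset-related entries in ./lib/config/default.py look like the following (the paths shown here are placeholders to replace with your own):

_C.DATASET.DATAROOT = '/path/to/bdd100k/images'        # the path of the images folder
_C.DATASET.LABELROOT = '/path/to/det_annotations'      # the path of the det_annotations folder
_C.DATASET.MASKROOT = '/path/to/da_seg_annotations'    # the path of the da_seg_annotations folder
_C.DATASET.LANEROOT = '/path/to/ll_seg_annotations'    # the path of the ll_seg_annotations folder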

Training

You can set the training configuration in ./lib/config/default.py (including loading of a preliminary model, loss, data augmentation, optimizer, warm-up and cosine annealing, auto-anchor, training epochs and batch size).

If you want to try alternating optimization or train the model for a single task, set the corresponding flag in ./lib/config/default.py to True. (In the following, all flags are False, which means the tasks are trained end to end.)

# Alternating optimization
_C.TRAIN.SEG_ONLY = False           # Only train two segmentation branchs
_C.TRAIN.DET_ONLY = False           # Only train detection branch
_C.TRAIN.ENC_SEG_ONLY = False       # Only train encoder and two segmentation branchs
_C.TRAIN.ENC_DET_ONLY = False       # Only train encoder and detection branch

# Single task 
_C.TRAIN.DRIVABLE_ONLY = False      # Only train da_segmentation task
_C.TRAIN.LANE_ONLY = False          # Only train ll_segmentation task
_C.TRAIN.DET_ONLY = False          # Only train detection task
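
For example, the staged schedules from Ablation Study 1 (such as ED-S-W) can be reproduced by toggling these flags between separate runs of tools/train.py. The mapping below only illustrates the notation and is not an official training recipe:

# Stage 1 (E-D): train the encoder and the detection head only
_C.TRAIN.ENC_DET_ONLY = True

# Stage 2 (S): freeze the encoder and detection head, train the two segmentation heads
_C.TRAIN.ENC_DET_ONLY = False
_C.TRAIN.SEG_ONLY = True

# Stage 3 (W): train the whole network jointly on all three tasks
_C.TRAIN.SEG_ONLY = False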

Start training:

python tools/train.py

Multi GPU mode:

python -m torch.distributed.launch --nproc_per_node=N tools/train.py  # N: the number of GPUs

Evaluation

You can set the evaluation configuration in ./lib/config/default.py (including batch size and the threshold values for NMS).

Start evaluating:

python tools/test.py --weights weights/End-to-end.pth

Demo Test

We provide two testing methods.

Folder

You can store images or a video in the folder given by --source, and the inference results will be saved to --save-dir.

python tools/demo.py --source inference/images

Camera

If a camera is connected to your computer, you can set the source to the camera number (the default is 0).

python tools/demo.py --source 0

Demonstration

input output

Deployment

Our model can run inference in real time on a Jetson TX2, with a ZED camera capturing images. We use TensorRT for acceleration. Code for model deployment and inference is provided in ./toolkits/deploy.

Segmentation Label(Mask) Generation

You can generate the labels for the drivable area segmentation task by running:

python toolkits/datasetpre/gen_bdd_seglabel.py
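
If you need to adapt this step to your own labels, the core of such a generator is just rasterizing the drivable-area polygons into a binary mask. Below is a minimal sketch, assuming BDD100K-style JSON where each object carries a category name and a poly2d list of (x, y, type) points; the exact keys and layout used by gen_bdd_seglabel.py may differ:

import json
import numpy as np
import cv2

def gen_da_mask(label_json, out_png, height=720, width=1280):
    # Rasterize every drivable-area polygon of one frame into a binary mask.
    mask = np.zeros((height, width), dtype=np.uint8)
    with open(label_json) as f:
        objects = json.load(f)["frames"][0]["objects"]   # assumed label layout
    for obj in objects:
        if obj.get("category", "").startswith("area/"):   # e.g. "area/drivable"
            pts = np.array([[p[0], p[1]] for p in obj["poly2d"]], dtype=np.int32)
            cv2.fillPoly(mask, [pts], 255)
    cv2.imwrite(out_png, mask)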

Model Transfer

Before running inference with the TensorRT C++ API, you need to convert the .pth file into a binary file that can be read by C++:

python toolkits/deploy/gen_wts.py

After running the above command, you obtain a binary file named yolop.wts.
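
For reference, this kind of conversion typically dumps every tensor of the state dict as a name, an element count and hex-encoded floats (the tensorrtx-style .wts format). The sketch below only illustrates the idea and is not a substitute for the provided script; the model builder import and the checkpoint key are assumptions:

import struct
import torch
from lib.config import cfg
from lib.models import get_net          # assumed model builder used by the repo

model = get_net(cfg)
ckpt = torch.load("weights/End-to-end.pth", map_location="cpu")
model.load_state_dict(ckpt["state_dict"])   # assumed checkpoint key
model.eval()

state = model.state_dict()
with open("yolop.wts", "w") as f:
    # First line: number of tensors; then one line per tensor.
    f.write(f"{len(state)}\n")
    for name, tensor in state.items():
        values = tensor.reshape(-1).cpu().numpy()
        f.write(f"{name} {len(values)}")
        for v in values:
            f.write(" " + struct.pack(">f", float(v)).hex())
        f.write("\n")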

Running Inference

TensorRT needs an engine file for inference, and building an engine is time-consuming, so it is convenient to save the engine file and reuse it on later runs. This process is integrated in main.cpp, which decides whether to build a new engine based on whether your engine file already exists.
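
The same caching logic, expressed in Python with the TensorRT API purely for illustration (the repository's implementation lives in main.cpp and builds the network from yolop.wts; the engine path below is a placeholder):

import os
import tensorrt as trt

ENGINE_PATH = "yolop.engine"                  # placeholder path
logger = trt.Logger(trt.Logger.WARNING)

if os.path.exists(ENGINE_PATH):
    # Reuse the cached engine instead of rebuilding it on every run.
    runtime = trt.Runtime(logger)
    with open(ENGINE_PATH, "rb") as f:
        engine = runtime.deserialize_cuda_engine(f.read())
else:
    # Build the network (the C++ code does this from yolop.wts), serialize it,
    # and write the result to ENGINE_PATH so later runs can skip the slow build.
    pass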

Third-Party Resources

Citation

If you find our paper and code useful for your research, please consider giving it a star ⭐ and a citation 📝:

@article{wu2022yolop,
  title={Yolop: You only look once for panoptic driving perception},
  author={Wu, Dong and Liao, Man-Wen and Zhang, Wei-Tian and Wang, Xing-Gang and Bai, Xiang and Cheng, Wen-Qing and Liu, Wen-Yu},
  journal={Machine Intelligence Research},
  pages={1--13},
  year={2022},
  publisher={Springer}
}


yolop's Issues

Anchor setup

Dear Authors,

thanks for your great work. I have a basic question:

[1, [[3, 9, 5, 11, 4, 20], [7, 18, 6, 39, 12, 31], [19, 50, 38, 81, 68, 157]], [128, 256, 512]]],

I believe this is your default anchors,
[3, 9, 5, 11, 4, 20], [7, 18, 6, 39, 12, 31], [19, 50, 38, 81, 68, 157]
My question is why the anchors are vertical instead of horizontal (for horizontal vehicles) ?
stride 8: [3, 9, 5, 11, 4, 20]:
stride 16: [7, 18, 6, 39, 12, 31],
stride 32: [19, 50, 38, 81, 68, 157]
I think the order is [anchor1_w, anchor1_h, anchor2_w, anchor2_h, anchor3_w, anchor3_h] for each layer
wh = (y[..., 2:4] * 2) ** 2 * self.anchor_grid[i].view(1, self.na, 1, 1, 2) # wh (bs,na,ny,nx,2)

One reason I can think of is that maybe autoanchor resizes 1280x720 -> 640x640 (not keeping the aspect ratio), which would make the anchors more vertical-looking (but I have not looked into the code yet).
However, inference in YOLOP seems to keep the aspect ratio.

Above is my question. Thanks again for the great work.

Issues while running train.py

Just checking if there are any suggestions/advice to get around this error:

RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 7.79 GiB total capacity; 6.47 GiB already allocated; 37.38 MiB free; 6.53 GiB reserved in total by PyTorch)

Traceback (most recent call last):
File "tools/train.py", line 395, in
main()
File "tools/train.py", line 323, in main
epoch, num_batch, num_warmup, writer_dict, logger, device, rank)
File "/home/jvilela/YOLOP/lib/core/function.py", line 76, in train
outputs = model(input)
File "/home/jvilela/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jvilela/YOLOP/lib/models/YOLOP.py", line 555, in forward
x = block(x)
File "/home/jvilela/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jvilela/YOLOP/lib/models/common.py", line 132, in forward
return self.cv4(self.act(self.bn(torch.cat((y1, y2), dim=1))))
File "/home/jvilela/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jvilela/YOLOP/lib/models/common.py", line 97, in forward
return self.act(self.bn(self.conv(x)))
File "/home/jvilela/.local/lib/python3.6/site-packages/torch/nn/modules/module.py", line 727, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/jvilela/YOLOP/lib/models/common.py", line 82, in forward
return x * F.hardtanh(x + 3, 0., 6.) / 6. # for torchscript, CoreML and ONNX
File "/home/jvilela/.local/lib/python3.6/site-packages/torch/nn/functional.py", line 1188, in hardtanh
result = torch._C._nn.hardtanh(input, min_val, max_val)
RuntimeError: CUDA out of memory. Tried to allocate 46.00 MiB (GPU 0; 7.79 GiB total capacity; 6.47 GiB already allocated; 37.38 MiB free; 6.53 GiB reserved in total by PyTorch)

jvilela@soads-gpu-4:~/YOLOP$ nvidia-smi
Wed Oct 13 19:41:45 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.57.02 Driver Version: 470.57.02 CUDA Version: 11.4 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:65:00.0 On | N/A |
| 18% 38C P8 13W / 250W | 280MiB / 7973MiB | 4% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 1955 G /usr/lib/xorg/Xorg 18MiB |
| 0 N/A N/A 2076 G /usr/bin/gnome-shell 49MiB |
| 0 N/A N/A 4417 G /usr/lib/xorg/Xorg 105MiB |
| 0 N/A N/A 4578 G /usr/bin/gnome-shell 75MiB |
| 0 N/A N/A 5283 G ...AAAAAAAAA= --shared-files 27MiB |
+-----------------------------------------------------------------------------+

demo.py

(Two screenshots attached.)
Why does this error occur?

'list' object has no attribute 'seek'

My Python version is 3.8, and when I run python tools/test.py the following error arises. Can you help me? Thank you!

File "tools/test.py", line 153, in
main()
File "tools/test.py", line 86, in main
checkpoint = torch.load(checkpoint_file)
File "/home/kasm-user/anaconda3/envs/torch19/lib/python3.8/site-packages/torch/serialization.py", line 594, in load
with _open_file_like(f, 'rb') as opened_file:
File "/home/kasm-user/anaconda3/envs/torch19/lib/python3.8/site-packages/torch/serialization.py", line 235, in _open_file_like
return _open_buffer_reader(name_or_buffer)
File "/home/kasm-user/anaconda3/envs/torch19/lib/python3.8/site-packages/torch/serialization.py", line 220, in init
_check_seekable(buffer)
File "/home/kasm-user/anaconda3/envs/torch19/lib/python3.8/site-packages/torch/serialization.py", line 311, in _check_seekable
raise_err_msg(["seek", "tell"], e)
File "/home/kasm-user/anaconda3/envs/torch19/lib/python3.8/site-packages/torch/serialization.py", line 304, in raise_err_msg
raise type(e)(msg)
AttributeError: 'list' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

dataset not accessible to opencv methods while training

After downloading the images and annotations/labels, I made the corresponding changes in the default.py config and ran python tools/train.py. The data seemed to load successfully, but then some of the dataloader worker threads threw errors.

$ python3 tools/train.py
=> creating runs/BddDataset/_2021-08-30-23-52
Namespace(conf_thres=0.001, dataDir='', iou_thres=0.6, local_rank=-1, logDir='runs/', modelDir='', prevModelDir='', sync_bn=False)
AUTO_RESUME: False
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
...
...

load model to device
begin to load data
building database...
100%|████████████████████████████████████████████████████████| 70000/70000 [00:13<00:00, 5352.88it/s]
database build finish
building database...
100%|████████████████████████████████████████████████████████| 10000/10000 [00:01<00:00, 5334.48it/s]
database build finish
load data finished
anchors loaded successfully
tensor([[[0.3750, 1.1250],
         [0.6250, 1.3750],
         [0.5000, 2.5000]],

        [[0.4375, 1.1250],
         [0.3750, 2.4375],
         [0.7500, 1.9375]],

        [[0.5938, 1.5625],
         [1.1875, 2.5312],
         [2.1250, 4.9062]]])
=> start training...
Exception in thread Thread-3:
Traceback (most recent call last):
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/threading.py", line 926, in _bootstrap_inner
    self.run()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/prefetch_generator/__init__.py", line 80, in run
    for item in self.generator:
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 435, in __next__
    data = self._next_data()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1085, in _next_data
    return self._process_data(data)
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 1111, in _process_data
    data.reraise()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/_utils.py", line 428, in reraise
    raise self.exc_type(msg)
cv2.error: Caught error in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/_utils/worker.py", line 198, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/torch/utils/data/_utils/fetch.py", line 44, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/hotify/YOLOP/lib/dataset/AutoDriveDataset.py", line 100, in __getitem__
    img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
cv2.error: OpenCV(4.5.3) /tmp/pip-req-build-l1r0y34w/opencv/modules/imgproc/src/color.cpp:182: error: (-215:Assertion failed) !_src.empty() in function 'cvtColor'



^CTraceback (most recent call last):
  File "tools/train.py", line 395, in <module>
    main()
  File "tools/train.py", line 323, in main
    epoch, num_batch, num_warmup, writer_dict, logger, device, rank)
  File "/home/hotify/YOLOP/lib/core/function.py", line 51, in train
    for i, (input, target, paths, shapes) in enumerate(train_loader):
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/prefetch_generator/__init__.py", line 92, in __next__
    return self.next()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/site-packages/prefetch_generator/__init__.py", line 85, in next
    next_item = self.queue.get()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/queue.py", line 170, in get
    self.not_empty.wait()
  File "/home/hotify/anaconda3/envs/yolop/lib/python3.7/threading.py", line 296, in wait
    waiter.acquire()
KeyboardInterrupt
^C

This resulted from OpenCV trying to read an empty file. To further confirm this, I edited AutoDriveDataset.py and tried to print the image shape just before cvtColor()

    def __getitem__(self, idx):
        """
        Get input and groud-truth from database & add data augmentation on input

        Inputs:
        -idx: the index of image in self.db(database)(list)
        self.db(list) [a,b,c,...]
        a: (dictionary){'image':, 'information':}

        Returns:
        -image: transformed image, first passed the data augmentation in __getitem__ function(type:numpy), then apply self.transform
        -target: ground truth(det_gt,seg_gt)

        function maybe useful
        cv2.imread
        cv2.cvtColor(data, cv2.COLOR_BGR2RGB)
        cv2.warpAffine
        """
        data = self.db[idx]
        img = cv2.imread(data["image"], cv2.IMREAD_COLOR | cv2.IMREAD_IGNORE_ORIENTATION)
        print(f'log: {img.shape}') # My edit
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
...

This verifies the claim. Could this be originating from PyTorch?

Kindly help me in this. Thanks

Integrating models to Hugging Face Hub

Hi there!

YOLOP is very interesting! I see you currently save your model checkpoint in this repo. Would you be interested in sharing the model in the Hugging Face Hub? (we can even set up a Hust Visual Learning Team organization for your team to have all the models in a single place)

The Hub offers free hosting of over 20K models, and it would make your work more accessible and visible to the rest of the ML community. Some of the benefits of sharing your models would be:

  • versioning
  • commit history and diffs
  • repos provide useful metadata about their tasks, languages, metrics, etc
  • we could add a widget for users to try the model directly in the browser

Creating the repos and adding new models should be a relatively straightforward process if you've used Git before. This is a step-by-step guide explaining the process in case you're interested. Please let us know if you would be interested and if you have any questions.

Happy to hear your thoughts,
Omar and the Hugging Face team

Additional Multitask head

Hi, thanks for open-sourcing this wonderful work. I have the following queries:

  1. Can we add an additional head, say object_features, that classifies attributes of the detected objects (e.g. for a car: type, brand, color) as an extra branch to obtain fine-grained details?
  2. In the BDD100K dataset there are other labels for scene classification, such as day/night and weather; can we add another branch that outputs these results?

For the above tasks, how feasible would it be to extend the current work? Thanks in advance.

--conf-thres and --iou-thres

Hello, when I run detection on images, no matter what values I set --conf-thres and --iou-thres to, the detection result stays the same. Why is that?
The command is as follows:
python tools/demo.py --source inference/videos/1.mp4 --iou-thres 0.99 --conf-thres 0.999 --device 0

Detection category

Hello, if I want to output the object category on the detection image, what should I modify?

AssertionError: Invalid type

When I run python tools/demo.py --source ./inference/images/street.jpg
it appeared:
Traceback (most recent call last):
File "tools/demo.py", line 21, in
from lib.config import cfg
File "/home/zzj/Projects/YOLOP/lib/config/init.py", line 1, in
from .default import _C as cfg
File "/home/zzj/Projects/YOLOP/lib/config/default.py", line 38, in
_C.LOSS.MULTI_HEAD_LAMBDA = None
File "/home/zzj/anaconda3/envs/yolov5/lib/python3.8/site-packages/yacs/config.py", line 155, in setattr
_assert_with_logging(
File "/home/zzj/anaconda3/envs/yolov5/lib/python3.8/site-packages/yacs/config.py", line 521, in _assert_with_logging
assert cond, msg
AssertionError: Invalid type <class 'NoneType'> for key MULTI_HEAD_LAMBDA; valid types = {<class 'list'>, <class 'tuple'>, <class 'int'>, <class 'bool'>, <class 'str'>, <class 'float'>}

I changed the image name; the new name is 'street'.
I really could not solve this by myself. Could the author please take a look at where the problem is and how it should be solved?

Fine-tuning

Hello, if I want to detect people, can I fine-tune on the basis of your weights? For example, after labeling images of people, can we use End-to-end.pth to continue training the model?

detect my video

python tools/demo.py --source inference/videos2
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1280 but corresponding boolean dimension is 640

python tools/demo.py --source inference/videos2 --img-size 1280
RuntimeError: Input and output sizes should be greater than 0, but got input (H: 1280, W: 720) output (H: 0, W: 0)
Why does this happen, and how can I fix it?

References Error

In v4 of your paper published on arXiv (https://arxiv.org/abs/2108.11250), reference [7] should probably be "MultiNet: Real-time Joint Semantic Reasoning for Autonomous Driving" rather than R. Chandra and P. Bahl, "MultiNet: Connecting to multiple IEEE 802.11 networks using a single wireless card," in IEEE INFOCOM 2004, vol. 2, IEEE, 2004, pp. 882–893.
Thank you!

About use GPU

I have set the device to 0 instead of cpu, but the speed is only 5 it/s (5 fps). Is this normal?
video information:
1920*1080
1855kbps
25fps
device information:
i5 9400
NVIDIA Geforce RTX2060
Memory Dual channel 16GB

Lane detection Method

Hi, thanks for your great work.
It seems that the lane detection part of your work is quite lightweight.
Was it based on any existing method?
Thanks for your sharing :)

Camera demo has issues with both C920 and ZED

Setup: Jetson Xavier AGX with JetPack 4.5.1, PyTorch 1.8.0, torchvision 0.9.0, etc.
YOLOP works with demo.py on images and mp4 video.
"python tools/demo.py --source 0" stops working with the C920; the full output follows.
jetson@xavier-agx:~/YOLOP$ python3 tools/demo.py --source 0
['/home/jetson/YOLOP/tools', '/usr/lib/python36.zip', '/usr/lib/python3.6', '/usr/lib/python3.6/lib-dynload', '/home/jetson/.local/lib/python3.6/site-packages', '/home/jetson/.local/lib/python3.6/site-packages/torchvision-0.9.0-py3.6-linux-aarch64.egg', '/home/jetson/.local/lib/python3.6/site-packages/Pillow-8.3.1-py3.6-linux-aarch64.egg', '/home/jetson/.local/lib/python3.6/site-packages/scipy-1.4.1-py3.6-linux-aarch64.egg', '/usr/local/lib/python3.6/dist-packages', '/usr/lib/python3/dist-packages', '/usr/lib/python3.6/dist-packages', '/home/jetson/YOLOP']
=> creating runs/BddDataset/_2021-08-31-15-38
Using torch 1.8.0 CPU

[ WARN:0] global /home/nvidia/host/build_opencv/nv_opencv/modules/videoio/src/cap_gstreamer.cpp (933) open OpenCV | GStreamer warning: Cannot query video position: status=0, value=-1, duration=-1
1/1: 0... success (inf frames 2304x1536 at 2.00 FPS)

0%| | 0/1 [00:00<?, ?it/s]/home/jetson/.local/lib/python3.6/site-packages/torch/nn/functional.py:3455: UserWarning: Default upsampling behavior when mode=bilinear is changed to align_corners=False since 0.4.0. Please specify align_corners=True if the old behavior is desired. See the documentation of nn.Upsample for details.
"See the documentation of nn.Upsample for details.".format(mode)
0%| | 0/1 [00:05<?, ?it/s]
Traceback (most recent call last):
File "tools/demo.py", line 174, in
detect(cfg,opt)
File "tools/demo.py", line 127, in detect
img_det = show_seg_result(img_det, (da_seg_mask, ll_seg_mask), _, _, is_demo=True)
File "/home/jetson/YOLOP/lib/utils/plot.py", line 57, in show_seg_result
img[color_mask != 0] = img[color_mask != 0] * 0.5 + color_seg[color_mask != 0] * 0.5
IndexError: boolean index did not match indexed array along dimension 0; dimension is 1536 but corresponding boolean dimension is 1284

Extremely low P value while training on BDD100K

Hello,
Thank you for the great work on this project. While training on BDD100K, the Precision value for object detection I get is extremely low.

Driving area Segment: Acc(0.967)    IOU (0.832)    mIOU(0.897)
Lane line Segment: Acc(0.626)    IOU (0.250)  mIOU(0.617)
Detect: P(0.049)  R(0.885)  mAP@0.5(0.728)  mAP@0.5:0.95(0.391)

Am I doing something wrong? The other values match those given in the README, but Precision is not listed there, so I don't have a reference.

onnx export problem

Hello,

Thank you for your great work. I was trying to export your trained model to onnx using torch.onnx.export function. Yet I received the following error.
RuntimeError: Only tuples, lists and Variables supported as JIT inputs/outputs. Dictionaries and strings are also accepted but their usage is not recommended. But got unsupported type builtin_function_or_method

Is your model compatible with conversion to ONNX?

And, in your activations.py script I observed that you replaced hardsigmoid with hardtanh.

class Hardswish(nn.Module):  # export-friendly version of nn.Hardswish()
    @staticmethod
    def forward(x):
        # return x * F.hardsigmoid(x)  # for torchscript and CoreML
        return x * F.hardtanh(x + 3, 0., 6.) / 6.  # for torchscript, CoreML and ONNX

Yet when I export the model to torch.jit using trace, there still seem to be aten::hardswish functions in the traced model. What am I missing?

Thanks in Advance, Kind Regards

End-End Training

The default.py training configuration says (Line 102 - 112):

# if training 3 tasks end-to-end, set all parameters as True
# Alternating optimization
_C.TRAIN.SEG_ONLY = False           # Only train two segmentation branchs
_C.TRAIN.DET_ONLY = False           # Only train detection branch
_C.TRAIN.ENC_SEG_ONLY = False       # Only train encoder and two segmentation branchs
_C.TRAIN.ENC_DET_ONLY = False       # Only train encoder and detection branch

# Single task 
_C.TRAIN.DRIVABLE_ONLY = False      # Only train da_segmentation task
_C.TRAIN.LANE_ONLY = False          # Only train ll_segmentation task
_C.TRAIN.DET_ONLY = False          # Only train detection task

But the ReadMe says:

If you want try alternating optimization or train model for single task, please modify the corresponding configuration in ./lib/config/default.py to True. (As following, all configurations is False, which means training multiple tasks end to end).

# Alternating optimization
_C.TRAIN.SEG_ONLY = False           # Only train two segmentation branchs
_C.TRAIN.DET_ONLY = False           # Only train detection branch
_C.TRAIN.ENC_SEG_ONLY = False       # Only train encoder and two segmentation branchs
_C.TRAIN.ENC_DET_ONLY = False       # Only train encoder and detection branch

# Single task 
_C.TRAIN.DRIVABLE_ONLY = False      # Only train da_segmentation task
_C.TRAIN.LANE_ONLY = False          # Only train ll_segmentation task
_C.TRAIN.DET_ONLY = False          # Only train detection task

Therefore, the comments in the Python file say to set these parameters as True for End-End training, but the ReadMe instructs setting them as False for End-End training.

I am trying to reproduce your results using an RTX 2080Ti. I read your paper many times but did not see a mention of the training configuration you used for your experiment. The batch size was set to 24, but I can only fit a batch size of 12 at most with my 11 GB of video memory.

Currently, I've been training for several days; the lane segmentation and drivable area already look pretty good, but the object detection head is still far too confident and overproduces predictions.

I ran out of disk space at several points over the course of this experiment and had to resume training from the latest checkpoint a couple of times, which is why there are gaps in the chart.

No Smoothing
train_loss

Smoothing
train_loss (1)

My parameters and dataset were identical to those in the default.py training configuration. I am wondering whether this slow convergence and oscillation in the loss is normal given the diversity and difficulty of the dataset or if it is a result of poorly selected hyperparameters such as LR, or is caused by my smaller batch size.

What was the value of your loss at the end of the 240 epochs?

I am sorry to ask so many questions, but it's frustrating to lose days of training time on my GPU by restarting training. If you could guide a fellow student in the right direction, I will be sure to help improve your repository.

Thank you,
Alex

quadratic fitting

"The visualization of lane detection result has been post processed by quadratic fitting",
Can you share the code of quadratic fitting?

How to train on a custom dataset?

I don't understand how to generate the annotation format for training on my own dataset. Do you have any documentation to help me? Thanks!

Generate .wts file for TensorRT

Hi,

Thanks for your great work.
Looks like gen_wts.py is not working with End-to-end.pth. Could you please provide more details on how to generate the .wts file?
I would appreciate it very much if you could also show the steps to generate the .engine file.

Regards

cannot build engine

When I run the project on Jetson Xavier, it shows the problem below. Could you please tell me how to solve this?

(py36) nvidia@nvidia:~/Project/YOLOP-new/toolkits/deploy/build$ ./yolop
Building engine...
Loading weights: yolop.wts
[10/08/2021-10:47:47] [E] [TRT] Parameter check failed at: ../builder/Network.cpp::addScale::482, condition: shift.count > 0 ? (shift.values != nullptr) : (shift.values == nullptr)
Segmentation fault (core dumped)

How to detect multiple objects using YOLOP?

First of all, thanks for your great work! The YOLOP model currently seems able to detect only cars. If I would like to detect more object classes, what parameters should I modify? I have already tried modifying model.nc in train.py to 41, changing single_cls to False in bdd.py, and uncommenting the bdd_labels dict in convert.py, but I still get the following error:

Traceback (most recent call last):
File "tools/train.py", line 406, in
main()
File "tools/train.py", line 333, in main
train(cfg, train_loader, model, criterion, optimizer, scaler,
File "/home/roy/Github/YOLOP/lib/core/function.py", line 77, in train
total_loss, head_losses = criterion(outputs, target, shapes,model)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
return forward_call(*input, **kwargs)
File "/home/roy/Github/YOLOP/lib/core/loss.py", line 50, in forward
total_loss, head_losses = self._forward_impl(head_fields, head_targets, shapes, model)
File "/home/roy/Github/YOLOP/lib/core/loss.py", line 96, in _forward_impl
iou = bbox_iou(pbox.T, tbox[i], x1y1x2y2=False, CIoU=True) # iou(prediction, target)
File "/home/roy/Github/YOLOP/lib/core/general.py", line 38, in bbox_iou
print(box1[0] - box1[2] / 2)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor.py", line 249, in repr
return torch._tensor_str._str(self)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor_str.py", line 415, in _str
return _str_intern(self)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor_str.py", line 390, in _str_intern
tensor_str = _tensor_str(self, indent)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor_str.py", line 251, in _tensor_str
formatter = _Formatter(get_summarized_data(self) if summarize else self)
File "/home/roy/anaconda3/envs/pytorch/lib/python3.8/site-packages/torch/_tensor_str.py", line 90, in init
nonzero_finite_vals = torch.masked_select(tensor_view, torch.isfinite(tensor_view) & tensor_view.ne(0))
RuntimeError: CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.

Please help me out!

Error about deployment on Jetson agx xavier

Hi,
I want to deploy the project on a Jetson AGX Xavier. When I run cmake on toolkits/deploy, it reports "Unknown CMake command 'coda_add_library'". Can you tell me how to solve this or give me some guidance for compiling? Thanks a lot!

How is the BDD100K lane line segmentation mask generated?

For example, the JSON information contains "L" and "C" markers; how should "C" be used?
{
"category": "lane/single white",
"id": 8,
"attributes": {
"direction": "parallel",
"style": "dashed"
},
"poly2d": [
[
641.873374,
463.734289,
"C"
],
[
571.238349,
494.44517,
"C"
],
[
585.058244,
506.729521,
"C"
],
[
766.252437,
692.530346,
"L"
]
]
},

Are the files in the training dataset all PNG images?

I downloaded the labels for the drivable area segmentation task given in the README; they are PNG images. I created the dataset following the directory structure recommended in the README, but running train.py raises an error (screenshot attached).
The problem seems to be the json.load(f) on line 39 of bdd.py.
Why is a JSON file loaded here? Isn't the dataset entirely images? Please advise.

Clarity regarding training data

I find much clarity lacking in the training process. Additional info on the training dataset format is missing in toolkits/label_conversion/README.md. I understand that it will be updated sometime soon.

The docs specify the training data to be formatted as:

# The id represent the correspondence relation
├─dataset root
│ ├─images/ id.jpg
│ ├─det_annotations/ id.json
│ ├─da_seg_annotations/ id.png
│ ├─ll_seg_annotations/ id.png

But the dataset downloaded from the bdd100k site has the following structure.

.
└── segmentation
    ├── __MACOSX
    │   └── test
    ├── test
    │   ├── __MACOSX
    │   │   └── test
    │   └── test
    │       └── raw_images
    ├── train
    │   ├── __MACOSX
    │   │   └── train
    │   └── train
    │       ├── class_color
    │       ├── class_id
    │       ├── instance_color
    │       ├── instance_id
    │       └── raw_images
    └── val
        ├── __MACOSX
        │   └── val
        └── val
            ├── class_color
            ├── class_id
            ├── instance_color
            ├── instance_id
            └── raw_images
  1. It's unclear which of instance_color, class_id and instance_id correspond to det_annotations, da_seg_annotations and ll_seg_annotations; all of them are masks. I don't intend to use the object detection part, so the JSON conversion shouldn't be strictly necessary for now.

  2. The lib/config/default.py contains params such as

_C.DATASET.DATAROOT = '/home/zwt/bdd/bdd100k/images/100k'       # the path of images folder
_C.DATASET.LABELROOT = '/home/zwt/bdd/bdd100k/labels/100k'      # the path of det_annotations folder
_C.DATASET.MASKROOT = '/home/zwt/bdd/bdd_seg_gt'                # the path of da_seg_annotations folder
_C.DATASET.LANEROOT = '/home/zwt/bdd/bdd_lane_gt'

It would be better if more information could be provided about these paths so that they can be generalised.

Can't pickle generator objects

Hello, I'm trying to train the YOLOP model.
When I run python tools/train.py, a "can't pickle generator objects" error and a "Ran out of input" error occur.
Please give me some advice.

Understanding Resolution / Network Input

Following [6], we resize images in BDD100k dataset from
1280×720×3 to 640×384×3.

But in default.py training configuration:

_C.MODEL.IMAGE_SIZE = [640, 640] # width * height, ex: 192 * 256

Is this because the YOLO network usually resizes the image along the longer side and pads the rest?

cudawarping.hpp cannot be found

When I compile the project on Jetson Xavier, it reports "No such file: cudawarping.hpp". Does this mean opencv_contrib should be installed?
