pcan's People

Contributors: fyu, lkeab

pcan's Issues

No such file or directory

I trained the model after following GET_STARTED.md for preparation. However, the following problem arose:
FileNotFoundError: [Errno 2] No such file or directory: 'data/bdd/images/10k/train/fee92217-63b3f87f.jpg'

I checked the JSON file and the dataset, and found that certain image names appear in the JSON file but the corresponding .jpg files cannot be found in the folder. Please help me!
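
A minimal cross-check sketch, assuming the standard COCO-style layout of the converted annotation files (an 'images' list whose entries carry a 'file_name'); the annotation path below is a placeholder, while the image root comes from the error above. It lists every annotated image that is absent on disk, so you can either re-download those files or filter them out of the JSON before training.

import json
import os

ANN_FILE = 'data/bdd/labels/seg_track_train_cocoformat.json'  # placeholder path
IMG_ROOT = 'data/bdd/images/10k/train'                        # from the error above

with open(ANN_FILE) as f:
    anns = json.load(f)

# Collect annotation entries whose image file does not exist on disk.
missing = [img['file_name'] for img in anns['images']
           if not os.path.exists(os.path.join(IMG_ROOT, img['file_name']))]
print(f'{len(missing)} of {len(anns["images"])} referenced images are missing')
print('first few:', missing[:5])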

Why does this command cause the computer to crash?

python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 2

or

python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/ --nproc 1

or

python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_pcan_results_val.pkl --task seg_track --bdd-dir converted_results/

Then the mouse and keyboard stop responding.
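
A hedged pre-flight check, on the assumption that the freeze is memory exhaustion: to_bdd100k.py decodes per-frame masks for the whole result file, and on a machine with limited RAM that can lock up the desktop before the OOM killer reacts. Comparing the result file size with free memory (psutil is an extra dependency here) gives a rough warning sign; if memory is tight, keep --nproc at 1 and close other programs first.

import os

import psutil  # assumed installed: pip package 'psutil'

res_path = 'eval_pcan_results_val.pkl'
print('results file: %.1f MB' % (os.path.getsize(res_path) / 1e6))
print('available RAM: %.2f GB' % (psutil.virtual_memory().available / 1e9))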

AssertionError

I encountered an issue while training your PCAN network, and I would like to ask your advice on a solution. When I trained with my own dataset and labels, an assertion on the comparison of results failed once training reached the second-epoch evaluation. I couldn't find the reason; could you take the time to help me solve it? Here is the complete error message.
Python: 3.7.12 | packaged by conda-forge | (default, Oct 26 2021, 05:35:01) [MSC v.1916 64 bit (AMD64)]
CUDA available: True
CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7
NVCC: Not Available
GPU 0: NVIDIA GeForce RTX 4070 Ti
PyTorch: 1.13.1+cu117
PyTorch compiling details: PyTorch built with:

  • C++ Version: 199711
  • MSVC 192829337
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
  • OpenMP 2019
  • LAPACK is enabled (usually provided by MKL)
  • CPU capability usage: AVX2
  • CUDA Runtime 11.7
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.5
  • Magma 2.5.4
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.7, CUDNN_VERSION=8.5.0, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.13.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,

TorchVision: 0.14.1+cu117
OpenCV: 4.8.0
MMCV: 1.7.1
pcan: 0.1.0+0650969

2023-09-21 15:59:57,977 - pcan - INFO - Distributed training: False
2023-09-21 15:59:58,199 - pcan - INFO - Config:
model = dict(
    type='QuasiDenseFasterRCNN',
    pretrained='torchvision://resnet50',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='QuasiDenseRoIHead',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=8,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
        track_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        track_head=dict(
            type='QuasiDenseEmbedHead',
            num_convs=4,
            num_fcs=1,
            embed_channels=256,
            norm_cfg=dict(type='GN', num_groups=32),
            loss_track=dict(type='MultiPosCrossEntropyLoss', loss_weight=0.25),
            loss_track_aux=dict(
                type='L2Loss',
                neg_pos_ub=3,
                pos_margin=0,
                neg_margin=0.3,
                hard_mining=True,
                loss_weight=1.0))),
    tracker=dict(
        type='QuasiDenseEmbedTracker',
        init_score_thr=0.7,
        obj_score_thr=0.3,
        match_score_thr=0.5,
        memo_tracklet_frames=10,
        memo_backdrop_frames=1,
        memo_momentum=0.8,
        nms_conf_thr=0.5,
        nms_backdrop_iou_thr=0.3,
        nms_class_iou_thr=0.7,
        with_cats=True,
        match_metric='bisoftmax'),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=-1,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_across_levels=False,
            nms_pre=2000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False),
        embed=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='CombinedSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=3,
                add_gt_as_proposals=True,
                pos_sampler=dict(type='InstanceBalancedPosSampler'),
                neg_sampler=dict(
                    type='IoUBalancedNegSampler',
                    floor_thr=-1,
                    floor_fraction=0,
                    num_bins=3)))),
    test_cfg=dict(
        rpn=dict(
            nms_across_levels=False,
            nms_pre=1000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.5,
            nms=dict(type='nms', iou_threshold=0.5),
            max_per_img=100)))
dataset_type = 'BDDVideoDataset'
data_root = '../data/guandao/'
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadMultiImagesFromFile'),
    dict(type='SeqLoadAnnotations', with_bbox=True, with_ins_id=True),
    dict(type='SeqResize', img_scale=(480, 640), keep_ratio=True),
    dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
    dict(
        type='SeqNormalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='SeqPad', size_divisor=32),
    dict(type='SeqDefaultFormatBundle'),
    dict(
        type='SeqCollect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices'],
        ref_prefix='ref')
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(480, 640),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='VideoCollect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=2,
    workers_per_gpu=2,
    train=[
        dict(
            type='BDDVideoDataset',
            ann_file='../data/guandao/labels/labels_j/train_food.json',
            img_prefix='../data/guandao/images/JPEGImages_j/imagetrainj',
            key_img_sampler=dict(interval=1),
            ref_img_sampler=dict(num_ref_imgs=1, scope=3, method='uniform'),
            pipeline=[
                dict(type='LoadMultiImagesFromFile'),
                dict(
                    type='SeqLoadAnnotations',
                    with_bbox=True,
                    with_ins_id=True),
                dict(type='SeqResize', img_scale=(480, 640), keep_ratio=True),
                dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
                dict(
                    type='SeqNormalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='SeqPad', size_divisor=32),
                dict(type='SeqDefaultFormatBundle'),
                dict(
                    type='SeqCollect',
                    keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices'],
                    ref_prefix='ref')
            ]),
        dict(
            type='BDDVideoDataset',
            load_as_video=False,
            ann_file='../data/guandao/labels/labels_j/train_food.json',
            img_prefix='../data/guandao/images/JPEGImages_j/imagetrainj',
            pipeline=[
                dict(type='LoadMultiImagesFromFile'),
                dict(
                    type='SeqLoadAnnotations',
                    with_bbox=True,
                    with_ins_id=True),
                dict(type='SeqResize', img_scale=(480, 640), keep_ratio=True),
                dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
                dict(
                    type='SeqNormalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='SeqPad', size_divisor=32),
                dict(type='SeqDefaultFormatBundle'),
                dict(
                    type='SeqCollect',
                    keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices'],
                    ref_prefix='ref')
            ])
    ],
    val=dict(
        type='BDDVideoDataset',
        ann_file='../data/guandao/labels/labels_j/val_food.json',
        img_prefix='../data/guandao/images/JPEGImages_j/imagevalj',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(480, 640),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='VideoCollect', keys=['img'])
                ])
        ]),
    test=dict(
        type='BDDVideoDataset',
        ann_file='../data/guandao/labels/labels_j/test_food.json',
        img_prefix='../data/guandao/images/JPEGImages_j/imagetestj',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(480, 640),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='VideoCollect', keys=['img'])
                ])
        ]))
optimizer = dict(type='SGD', lr=0.04, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=1000,
    warmup_ratio=0.001,
    step=[8, 11])
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
evaluation = dict(metric=['bbox', 'track'], interval=2)
work_dir = './work_dirs\qdtrack-frcnn_r50_fpn_12e_bdd100k_evalseg'
gpu_ids = range(0, 1)

2023-09-21 15:59:58,427 - mmdet - INFO - load model from: torchvision://resnet50
2023-09-21 15:59:58,427 - mmdet - INFO - load checkpoint from torchvision path: torchvision://resnet50
2023-09-21 15:59:58,532 - mmdet - WARNING - The model and loaded state dict do not match exactly

unexpected key in source state_dict: fc.weight, fc.bias

loading annotations into memory...
Done (t=0.01s)
creating index...
C:\Users\Admin\Desktop\pcan-main\pcan\datasets\parsers\coco_api.py:21: UserWarning: mmpycocotools is deprecated. Please install official pycocotools by "pip install pycocotools"
UserWarning)
index created!
loading annotations into memory...
Done (t=0.01s)
creating index...
index created!
2023-09-21 15:59:59,230 - pcan - INFO - Start running, host: Admin@PS2023QYZLHGNB, work_dir: C:\Users\Admin\Desktop\pcan-main\tools\work_dirs\qdtrack-frcnn_r50_fpn_12e_bdd100k_evalseg
2023-09-21 15:59:59,230 - pcan - INFO - Hooks will be executed in the following order:
before_run:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_train_epoch:
(VERY_HIGH ) StepLrUpdaterHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_train_iter:
(VERY_HIGH ) StepLrUpdaterHook
(LOW ) IterTimerHook

after_train_iter:
(ABOVE_NORMAL) OptimizerHook
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

after_train_epoch:
(NORMAL ) CheckpointHook
(NORMAL ) EvalHook
(VERY_LOW ) TextLoggerHook

before_val_epoch:
(LOW ) IterTimerHook
(VERY_LOW ) TextLoggerHook

before_val_iter:
(LOW ) IterTimerHook

after_val_iter:
(LOW ) IterTimerHook

after_val_epoch:
(VERY_LOW ) TextLoggerHook

after_run:
(VERY_LOW ) TextLoggerHook

2023-09-21 15:59:59,230 - pcan - INFO - workflow: [('train', 1)], max: 12 epochs
2023-09-21 15:59:59,230 - pcan - INFO - Checkpoints will be saved to C:\Users\Admin\Desktop\pcan-main\tools\work_dirs\qdtrack-frcnn_r50_fpn_12e_bdd100k_evalseg by HardDiskBackend.
loading annotations into memory...
Done (t=0.00s)
creating index...
index created!
C:\Anaconda\envs\pcan\lib\site-packages\mmdet\models\dense_heads\rpn_head.py:180: UserWarning: In rpn_proposal or test_cfg, nms_thr has been moved to a dict named nms as iou_threshold, max_num has been renamed as max_per_img, name of original arguments and the way to specify iou_threshold of NMS will be deprecated.
'In rpn_proposal or test_cfg, '
2023-09-21 16:00:16,268 - pcan - INFO - Epoch [1][50/617] lr: 1.998e-03, eta: 0:41:44, time: 0.341, data_time: 0.158, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 90.5784, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:23,313 - pcan - INFO - Epoch [1][100/617] lr: 3.996e-03, eta: 0:29:17, time: 0.141, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 88.5912, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:30,334 - pcan - INFO - Epoch [1][150/617] lr: 5.994e-03, eta: 0:25:03, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 89.7586, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:37,322 - pcan - INFO - Epoch [1][200/617] lr: 7.992e-03, eta: 0:22:51, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 92.1037, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:44,367 - pcan - INFO - Epoch [1][250/617] lr: 9.990e-03, eta: 0:21:31, time: 0.141, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 86.1733, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:51,404 - pcan - INFO - Epoch [1][300/617] lr: 1.199e-02, eta: 0:20:35, time: 0.141, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 89.6032, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:00:58,340 - pcan - INFO - Epoch [1][350/617] lr: 1.399e-02, eta: 0:19:51, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 90.3079, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:05,295 - pcan - INFO - Epoch [1][400/617] lr: 1.598e-02, eta: 0:19:16, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 88.3841, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:12,252 - pcan - INFO - Epoch [1][450/617] lr: 1.798e-02, eta: 0:18:48, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 91.9174, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:19,277 - pcan - INFO - Epoch [1][500/617] lr: 1.998e-02, eta: 0:18:25, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 89.9358, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:26,273 - pcan - INFO - Epoch [1][550/617] lr: 2.198e-02, eta: 0:18:04, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 87.2205, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:33,275 - pcan - INFO - Epoch [1][600/617] lr: 2.398e-02, eta: 0:17:46, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 90.9714, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:35,962 - pcan - INFO - Saving checkpoint at 1 epochs
2023-09-21 16:01:52,579 - pcan - INFO - Epoch [2][50/617] lr: 2.665e-02, eta: 0:18:20, time: 0.298, data_time: 0.156, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 90.8333, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:01:59,578 - pcan - INFO - Epoch [2][100/617] lr: 2.865e-02, eta: 0:18:01, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 88.6841, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:06,571 - pcan - INFO - Epoch [2][150/617] lr: 3.065e-02, eta: 0:17:43, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 93.1286, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:13,572 - pcan - INFO - Epoch [2][200/617] lr: 3.265e-02, eta: 0:17:27, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 86.5786, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:20,596 - pcan - INFO - Epoch [2][250/617] lr: 3.465e-02, eta: 0:17:12, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 88.9562, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:27,592 - pcan - INFO - Epoch [2][300/617] lr: 3.664e-02, eta: 0:16:58, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 89.3254, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:34,558 - pcan - INFO - Epoch [2][350/617] lr: 3.864e-02, eta: 0:16:44, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 91.2230, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:41,542 - pcan - INFO - Epoch [2][400/617] lr: 4.000e-02, eta: 0:16:31, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 91.4397, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:48,506 - pcan - INFO - Epoch [2][450/617] lr: 4.000e-02, eta: 0:16:19, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 89.0595, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:02:55,531 - pcan - INFO - Epoch [2][500/617] lr: 4.000e-02, eta: 0:16:07, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 86.6459, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:03:02,480 - pcan - INFO - Epoch [2][550/617] lr: 4.000e-02, eta: 0:15:55, time: 0.139, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 87.5785, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:03:09,487 - pcan - INFO - Epoch [2][600/617] lr: 4.000e-02, eta: 0:15:44, time: 0.140, data_time: 0.003, memory: 2301, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 91.5429, loss_bbox: nan, loss_track: nan, loss_track_aux: nan, loss: nan
2023-09-21 16:03:12,165 - pcan - INFO - Saving checkpoint at 2 epochs
completed: 0, elapsed: 0s
Evaluating BDD Results...
Traceback (most recent call last):
  File "C:/Users/Admin/Desktop/pcan-main/tools/train.py", line 170, in <module>
    main()
  File "C:/Users/Admin/Desktop/pcan-main/tools/train.py", line 166, in main
    meta=meta)
  File "C:\Users\Admin\Desktop\pcan-main\pcan\apis\train.py", line 123, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "C:\Anaconda\envs\pcan\lib\site-packages\mmcv\runner\epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "C:\Anaconda\envs\pcan\lib\site-packages\mmcv\runner\epoch_based_runner.py", line 58, in train
    self.call_hook('after_train_epoch')
  File "C:\Anaconda\envs\pcan\lib\site-packages\mmcv\runner\base_runner.py", line 317, in call_hook
    getattr(hook, fn_name)(self)
  File "C:\Users\Admin\Desktop\pcan-main\pcan\core\evaluation\eval_hooks.py", line 14, in after_train_epoch
    self.evaluate(runner, results)
  File "C:\Anaconda\envs\pcan\lib\site-packages\mmdet\core\evaluation\eval_hooks.py", line 177, in evaluate
    results, logger=runner.logger, **self.eval_kwargs)
  File "C:\Users\Admin\Desktop\pcan-main\pcan\datasets\coco_video_dataset.py", line 319, in evaluate
    class_average=mot_class_average)
  File "C:\Users\Admin\Desktop\pcan-main\pcan\core\evaluation\mot.py", line 190, in eval_mot
    assert len(all_results) == len(anns['images'])
AssertionError
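
A quick sanity check for this assertion, sketched under the assumption that the val annotation file is the one named in the config above: eval_mot requires exactly one result entry per image listed in the val JSON, so comparing the two counts shows whether the evaluator received results built from a different (or truncated) file.

import json

# Path taken from the config printed above.
with open('../data/guandao/labels/labels_j/val_food.json') as f:
    anns = json.load(f)
print('images in val annotations:', len(anns['images']))
# Compare this count with len(all_results) at pcan/core/evaluation/mot.py:190;
# a mismatch means the dataset config and the evaluated results disagree.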

OSError: ./ckpts/segtrack-fixed-new.pth is not a checkpoint file

I followed your instructions to download the initial model weights and put them under the ckpts folder, and I only set the configuration file configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py. But when I run train.py, I get the following error:

Traceback (most recent call last):
  File "/home/dlz/PCAN/tools/train.py", line 175, in <module>
    main()
  File "/home/dlz/PCAN/tools/train.py", line 171, in main
    meta=meta)
  File "/home/dlz/PCAN/pcan/apis/train.py", line 122, in train_model
    runner.load_checkpoint(cfg.load_from)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/runner/base_runner.py", line 313, in load_checkpoint
    self.logger)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 522, in load_checkpoint
    checkpoint = _load_checkpoint(filename, map_location, logger)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 466, in _load_checkpoint
    return CheckpointLoader.load_checkpoint(filename, map_location, logger)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 243, in load_checkpoint
    return checkpoint_loader(filename, map_location)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/runner/checkpoint.py", line 259, in load_from_local
    raise IOError(f'{filename} is not a checkpoint file')
OSError: ./ckpts/segtrack-fixed-new.pth is not a checkpoint file

Is there a problem with the given file? Do you have a suggested solution? Thanks a lot for your answer!
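
A minimal diagnostic sketch, based on how mmcv's local checkpoint loader behaves: it raises this exact OSError when the given path does not resolve to a file, so a relative path like ./ckpts/... only works if train.py is launched from the directory containing ckpts. The snippet checks the path and, if present, that the file actually loads (a file of only a few hundred bytes would suggest a bad download, e.g. a Git LFS pointer saved in place of the real weights).

import os

import torch

CKPT = './ckpts/segtrack-fixed-new.pth'
print('cwd:', os.getcwd())
print('exists:', os.path.isfile(CKPT))
if os.path.isfile(CKPT):
    print('size: %.1f MB' % (os.path.getsize(CKPT) / 1e6))
    state = torch.load(CKPT, map_location='cpu')  # raises if the file is corrupt
    print('top-level keys:', list(state)[:5])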

Only using detector

Hi,
Firstly, thanks a lot for the great work.

I have performed the steps mentioned in GET_STARTED.md and everything is working fine.

I actually want to show that tracking also improves detection results. For this, is it possible to first use only the detector and compute the evaluation metrics, and then run the detector together with the tracker to see whether the detection results improve?

Any help would be appreciated. I really need to run this experiment, as it is the basis for my research internship, so I am looking for any kind of suggestions.

Best Regards,
Rehman

cannot import name 'NPROC' from 'scalabel.common.parallel'

Hey there!
While running the script to convert the annotations from BDD100K format to COCO, the following exception is raised: cannot import name 'NPROC' from 'scalabel.common.parallel'. What is a possible solution to the problem?
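
A hedged workaround sketch: NPROC moved between scalabel releases, so this import fails when the installed scalabel does not match the version the conversion script was written against. Installing the scalabel revision that pcan pins is the clean fix; the fallback below (worker count chosen arbitrarily) only patches the symptom.

try:
    from scalabel.common.parallel import NPROC
except ImportError:
    # Assumed fallback: NPROC is just a default process count in scalabel.
    import multiprocessing
    NPROC = min(4, multiprocessing.cpu_count())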

The time it takes to train the model?

In your paper I found: "Our model is trained with initial learning rate 0.0025 on 4 GPUs using SGD, and executes with a speed of 15.0 FPS on ResNet-50." Could you tell me the type of GPU and the time it takes to train the PCAN model on the BDD100K segmentation tracking dataset, both when using the initial model weights from the BDD100K MOT tracking set and when not using them? Thanks.

An error encountered while running test code

1. When I run the following test command, I get the error below:

python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth \
    --out work_dirs/resnest/result.pkl --format-only --show

Error:
Traceback (most recent call last):
  File "tools/test.py", line 164, in <module>
    main()
  File "tools/test.py", line 153, in main
    dataset.format_results(outputs, **kwargs)
  File "/home/lin/anaconda3/envs/pcan/lib/python3.6/site-packages/mmdet/datasets/coco.py", line 350, in format_results
    assert isinstance(results, list), 'results must be a list'
AssertionError: results must be a list
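
A small inspection sketch, assuming the outputs were saved with --out as above: mmdet's CocoDataset.format_results asserts a plain list, while pcan's test script can return a dict of per-task results (bbox/segm/track), which would trip exactly this assertion. Checking the saved pickle shows which case applies here.

import mmcv  # mmcv.load reads .pkl files

outputs = mmcv.load('work_dirs/resnest/result.pkl')
if isinstance(outputs, dict):
    print('dict with keys:', list(outputs.keys()))
else:
    print('list with', len(outputs), 'entries')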

My configuration file is as follows:
model = dict(
    type='EMQuasiDenseMaskRCNNRefine',
    pretrained=None,
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch'),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        num_outs=5),
    rpn_head=dict(
        type='RPNHead',
        in_channels=256,
        feat_channels=256,
        anchor_generator=dict(
            type='AnchorGenerator',
            scales=[8],
            ratios=[0.5, 1.0, 2.0],
            strides=[4, 8, 16, 32, 64]),
        bbox_coder=dict(
            type='DeltaXYWHBBoxCoder',
            target_means=[0.0, 0.0, 0.0, 0.0],
            target_stds=[1.0, 1.0, 1.0, 1.0]),
        loss_cls=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(
            type='SmoothL1Loss', beta=0.1111111111111111, loss_weight=1.0)),
    roi_head=dict(
        type='QuasiDenseSegRoIHeadRefine',
        bbox_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        bbox_head=dict(
            type='Shared2FCBBoxHead',
            in_channels=256,
            fc_out_channels=1024,
            roi_feat_size=7,
            num_classes=8,
            bbox_coder=dict(
                type='DeltaXYWHBBoxCoder',
                target_means=[0.0, 0.0, 0.0, 0.0],
                target_stds=[0.1, 0.1, 0.2, 0.2]),
            reg_class_agnostic=False,
            loss_cls=dict(
                type='CrossEntropyLoss', use_sigmoid=False, loss_weight=1.0),
            loss_bbox=dict(type='SmoothL1Loss', beta=1.0, loss_weight=1.0)),
        track_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=7, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        track_head=dict(
            type='QuasiDenseEmbedHead',
            num_convs=4,
            num_fcs=1,
            embed_channels=256,
            norm_cfg=dict(type='GN', num_groups=32),
            loss_track=dict(type='MultiPosCrossEntropyLoss', loss_weight=0.25),
            loss_track_aux=dict(
                type='L2Loss',
                neg_pos_ub=3,
                pos_margin=0,
                neg_margin=0.3,
                hard_mining=True,
                loss_weight=1.0)),
        mask_roi_extractor=dict(
            type='SingleRoIExtractor',
            roi_layer=dict(type='RoIAlign', output_size=14, sampling_ratio=0),
            out_channels=256,
            featmap_strides=[4, 8, 16, 32]),
        mask_head=dict(
            type='FCNMaskHeadPlus',
            num_convs=4,
            in_channels=256,
            conv_out_channels=256,
            num_classes=8,
            loss_mask=dict(
                type='CrossEntropyLoss', use_mask=True, loss_weight=1.0)),
        double_train=False,
        refine_head=dict(
            type='EMMatchHeadPlus',
            num_convs=4,
            in_channels=256,
            conv_kernel_size=3,
            conv_out_channels=256,
            upsample_method='deconv',
            upsample_ratio=2,
            num_classes=8,
            pos_proto_num=10,
            neg_proto_num=10,
            stage_num=6,
            conv_cfg=None,
            norm_cfg=None,
            mask_thr_binary=0.5,
            match_score_thr=0.5,
            with_mask_ref=False,
            with_mask_key=True,
            with_dilation=False,
            loss_mask=dict(
                type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0))),
    tracker=dict(
        type='QuasiDenseSegFeatEmbedTracker',
        init_score_thr=0.7,
        obj_score_thr=0.3,
        match_score_thr=0.5,
        memo_tracklet_frames=10,
        memo_backdrop_frames=1,
        memo_momentum=0.8,
        nms_conf_thr=0.5,
        nms_backdrop_iou_thr=0.3,
        nms_class_iou_thr=0.7,
        with_cats=True,
        match_metric='bisoftmax'),
    train_cfg=dict(
        rpn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.3,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=-1,
                add_gt_as_proposals=False),
            allowed_border=-1,
            pos_weight=-1,
            debug=False),
        rpn_proposal=dict(
            nms_across_levels=False,
            nms_pre=2000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.5,
                neg_iou_thr=0.5,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='RandomSampler',
                num=512,
                pos_fraction=0.25,
                neg_pos_ub=-1,
                add_gt_as_proposals=True),
            pos_weight=-1,
            debug=False,
            mask_size=28),
        embed=dict(
            assigner=dict(
                type='MaxIoUAssigner',
                pos_iou_thr=0.7,
                neg_iou_thr=0.3,
                min_pos_iou=0.5,
                match_low_quality=False,
                ignore_iof_thr=-1),
            sampler=dict(
                type='CombinedSampler',
                num=256,
                pos_fraction=0.5,
                neg_pos_ub=3,
                add_gt_as_proposals=True,
                pos_sampler=dict(type='InstanceBalancedPosSampler'),
                neg_sampler=dict(
                    type='IoUBalancedNegSampler',
                    floor_thr=-1,
                    floor_fraction=0,
                    num_bins=3)))),
    test_cfg=dict(
        rpn=dict(
            nms_across_levels=False,
            nms_pre=1000,
            nms_post=1000,
            max_num=1000,
            nms_thr=0.7,
            min_bbox_size=0),
        rcnn=dict(
            score_thr=0.5,
            nms=dict(type='nms', iou_threshold=0.5),
            max_per_img=100,
            mask_thr_binary=0.5)),
    fixed=True)
dataset_type = 'BDDVideoDataset'
data_root = ''
img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadMultiImagesFromFile'),
    dict(
        type='SeqLoadAnnotations',
        with_bbox=True,
        with_ins_id=True,
        with_mask=True),
    dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True),
    dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
    dict(
        type='SeqNormalize',
        mean=[123.675, 116.28, 103.53],
        std=[58.395, 57.12, 57.375],
        to_rgb=True),
    dict(type='SeqPad', size_divisor=32),
    dict(type='SeqDefaultFormatBundle'),
    dict(
        type='SeqCollect',
        keys=['img', 'gt_bboxes', 'gt_labels', 'gt_match_indices', 'gt_masks'],
        ref_prefix='ref')
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1296, 720),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(
                type='Normalize',
                mean=[123.675, 116.28, 103.53],
                std=[58.395, 57.12, 57.375],
                to_rgb=True),
            dict(type='Pad', size_divisor=32),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='VideoCollect', keys=['img'])
        ])
]
data = dict(
    samples_per_gpu=8,
    workers_per_gpu=2,
    train=[
        dict(
            type='BDDVideoDataset',
            ann_file='/media/lin/文件/bdd/labels/seg_track_train_cocoformat.json',
            img_prefix='/media/lin/文件/bdd/images/seg_track_20/train',
            key_img_sampler=dict(interval=1),
            ref_img_sampler=dict(num_ref_imgs=1, scope=3, method='uniform'),
            pipeline=[
                dict(type='LoadMultiImagesFromFile'),
                dict(
                    type='SeqLoadAnnotations',
                    with_bbox=True,
                    with_ins_id=True,
                    with_mask=True),
                dict(type='SeqResize', img_scale=(1296, 720), keep_ratio=True),
                dict(type='SeqRandomFlip', share_params=True, flip_ratio=0.5),
                dict(
                    type='SeqNormalize',
                    mean=[123.675, 116.28, 103.53],
                    std=[58.395, 57.12, 57.375],
                    to_rgb=True),
                dict(type='SeqPad', size_divisor=32),
                dict(type='SeqDefaultFormatBundle'),
                dict(
                    type='SeqCollect',
                    keys=[
                        'img', 'gt_bboxes', 'gt_labels', 'gt_match_indices',
                        'gt_masks'
                    ],
                    ref_prefix='ref')
            ])
    ],
    val=dict(
        type='BDDVideoDataset',
        ann_file='/media/lin/文件/bdd/labels/seg_track_val_cocoformat.json',
        img_prefix='/media/lin/文件/bdd/images/seg_track_20/val',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1296, 720),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='VideoCollect', keys=['img'])
                ])
        ]),
    test=dict(
        type='BDDVideoDataset',
        ann_file='/media/lin/文件/bdd/labels/seg_track_test_cocoformat.json',
        img_prefix='/media/lin/文件/bdd/images/seg_track_20/test',
        pipeline=[
            dict(type='LoadImageFromFile'),
            dict(
                type='MultiScaleFlipAug',
                img_scale=(1296, 720),
                flip=False,
                transforms=[
                    dict(type='Resize', keep_ratio=True),
                    dict(type='RandomFlip'),
                    dict(
                        type='Normalize',
                        mean=[123.675, 116.28, 103.53],
                        std=[58.395, 57.12, 57.375],
                        to_rgb=True),
                    dict(type='Pad', size_divisor=32),
                    dict(type='ImageToTensor', keys=['img']),
                    dict(type='VideoCollect', keys=['img'])
                ])
        ]))
optimizer = dict(type='SGD', lr=0.0025, momentum=0.9, weight_decay=0.0001)
optimizer_config = dict(grad_clip=None)
lr_config = dict(
    policy='step',
    warmup='linear',
    warmup_iters=1000,
    warmup_ratio=0.001,
    step=[8, 11])
checkpoint_config = dict(interval=1)
log_config = dict(interval=50, hooks=[dict(type='TextLoggerHook')])
total_epochs = 12
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = './ckpts/segtrack-fixed-new.pth'
resume_from = None
workflow = [('train', 1)]
evaluation = dict(metric=['bbox', 'segm', 'segtrack'], interval=12)
work_dir = './work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan'
gpu_ids = range(0, 1)

2. When I run the following test command, I get the error below:

python tools/test.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py work_dirs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan/latest.pth \
    --eval bbox segm segtrack

Error:
Traceback (most recent call last):
  File "tools/test.py", line 164, in <module>
    main()
  File "tools/test.py", line 160, in main
    print(dataset.evaluate(outputs, **eval_kwargs))
  File "/home/lin/Desktop/pcan/pcan/datasets/coco_video_dataset.py", line 317, in evaluate
    class_average=mot_class_average)
  File "/home/lin/Desktop/pcan/pcan/core/evaluation/mots.py", line 30, in eval_mots
    preprocessResult(all_results, anns, cats_mapping)
  File "/home/lin/Desktop/pcan/pcan/core/evaluation/mot.py", line 48, in preprocessResult
    for i, bbox in enumerate(anns['annotations']):  # enumerate: i is the index, bbox is the content
KeyError: 'annotations'
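
A quick check, assuming the test annotation file from the config above is being used: the MOTS evaluation needs ground-truth 'annotations', and the BDD100K test split ships without labels, so evaluating against seg_track_test_cocoformat.json fails exactly like this. Inspecting the JSON's top-level keys confirms it; if 'annotations' is absent, evaluate on the val split instead.

import json

# Path taken from the test split of the config above.
with open('/media/lin/文件/bdd/labels/seg_track_test_cocoformat.json') as f:
    anns = json.load(f)
print(sorted(anns.keys()))  # no 'annotations' key -> this split has no labels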

3. When I changed the backbone from ResNet to ResNeSt for training, without modifying the other parts, the validation accuracy after training was all 0.

After changing the backbone network, what other places need to be changed?

Visualization Script

Thank you for the great work!

I actually have a custom dataset on which I would like to create MOTS visualizations. Additionally, I would like to perform performance evaluations (for the MOTS task) on my dataset.

Can you let me know if a script is available to directly create visualizations on a set of images? Also, can you provide a brief description of how custom datasets need to be prepared to run the MOTS scripts? Thanks in advance!

Welcome update to OpenMMLab 2.0

I am Vansin, the technical operator of OpenMMLab. In September of last year, we announced the release of OpenMMLab 2.0 at the World Artificial Intelligence Conference in Shanghai. We invite you to upgrade your algorithm library to OpenMMLab 2.0 using MMEngine, which can be used for both research and commercial purposes. If you have any questions, please feel free to join us on the OpenMMLab Discord at https://discord.gg/amFNsyUBvm or add me on WeChat (van-sin) and I will invite you to the OpenMMLab WeChat group.

Here are the OpenMMLab 2.0 repos branches:

Repo               OpenMMLab 1.0 branch   OpenMMLab 2.0 branch
MMEngine           -                      0.x
MMCV               1.x                    2.x
MMDetection        0.x, 1.x, 2.x          3.x
MMAction2          0.x                    1.x
MMClassification   0.x                    1.x
MMSegmentation     0.x                    1.x
MMDetection3D      0.x                    1.x
MMEditing          0.x                    1.x
MMPose             0.x                    1.x
MMDeploy           0.x                    1.x
MMTracking         0.x                    1.x
MMOCR              0.x                    1.x
MMRazor            0.x                    1.x
MMSelfSup          0.x                    1.x
MMRotate           1.x                    1.x
MMYOLO             -                      0.x

Attention: please create a new virtual environment for OpenMMLab 2.0.

Training on custom dataset

What do I need to do if I want to train on a custom dataset? I use CVAT to annotate in MOTS format, and I don't know in which format the annotation file should be exported.
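
A hedged sketch of the COCO-style video annotation layout that pcan's BDDVideoDataset reads (the *_cocoformat.json files mentioned elsewhere in these issues follow it). The field names are inferred from that format and the concrete values are placeholders, so treat this as a starting point rather than the official spec; whatever CVAT exports would then need converting into this shape.

# Skeleton of a COCO-style video annotation file (values are placeholders).
annotation_skeleton = {
    'categories': [{'id': 1, 'name': 'pedestrian'}],
    'videos': [{'id': 1, 'name': 'video0'}],
    'images': [{
        'id': 1,
        'video_id': 1,
        'frame_id': 0,                   # index of the frame within its video
        'file_name': 'video0/frame0000001.jpg',
        'width': 1280,
        'height': 720,
    }],
    'annotations': [{
        'id': 1,
        'image_id': 1,
        'category_id': 1,
        'instance_id': 5,                # identity that persists across frames
        'bbox': [100, 100, 50, 80],      # x, y, w, h
        'segmentation': [[100, 100, 150, 100, 150, 180, 100, 180]],  # polygon (MOTS)
        'area': 4000,
        'iscrowd': 0,
    }],
}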

KeyError: 'frames'

My purpose is to compute the evaluation metrics from the paper. I found instructions and code for computing Multi-Object Tracking and Segmentation metrics on the BDD100K website (https://doc.bdd100k.com/evaluate.html#multi-object-tracking-and-segmentation-segmentation-tracking). When I run the following command to evaluate the algorithm with the public annotations,

python -m bdd100k.bdd100k.eval.run -t seg_track -g /home/dlz/pcan/data/bdd/labels/seg_track_20/seg_track_val_cocoformat.json -r /home/dlz/pcan/paintedimg.zip

I got an error:
Traceback (most recent call last):
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/dlz/anaconda3/envs/pcan/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/dlz/PCAN/bdd100k/bdd100k/eval/run.py", line 319, in <module>
    run()
  File "/home/dlz/PCAN/bdd100k/bdd100k/eval/run.py", line 292, in run
    args.gt, args.result, bdd100k_config, args.nproc
  File "/home/dlz/PCAN/bdd100k/bdd100k/eval/run.py", line 226, in _load_frames
    gt_frames = bdd100k_to_scalabel(load(gt_base, nproc).frames, config)
  File "/home/dlz/PCAN/scalabel/scalabel/label/io.py", line 90, in load
    ret_cfg = process_file(inputs)
  File "/home/dlz/PCAN/scalabel/scalabel/label/io.py", line 77, in process_file
    raw_frames.extend(content["frames"])
KeyError: 'frames'

The ground-truth data I passed in is the seg_track_val_cocoformat.json file, and I found that the raw_frames and config parameters required by the code are not present in that file. Does the ground-truth JSON file need further processing? How can I solve this problem? Thanks for your answer!
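
A small format check, sketched on the assumption that the file is COCO-style: scalabel's load() expects Scalabel/BDD100K-format JSON with a top-level "frames" list, while the *_cocoformat.json files use "images"/"annotations"/"categories", which is exactly why content["frames"] raises KeyError. Inspecting the top-level keys tells you which format you have.

import json

with open('/home/dlz/pcan/data/bdd/labels/seg_track_20/'
          'seg_track_val_cocoformat.json') as f:
    data = json.load(f)
print(sorted(data.keys()))
# 'frames' present            -> Scalabel/BDD100K format, usable by bdd100k.eval
# 'images'/'annotations' seen -> COCO format; pass the original BDD100K labels
#                                to the evaluator instead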

Confusion on the ablation study in the paper

Hi, I really appreciate your great job on PCAN. I'm now reading the ablation study part of the paper, and I feel really confused about Table 3 and Table 4:

[screenshots: Table 3, Table 4]

My confusions are:

1. Does the "varying temporal memory length" in Table 3 mean the "memo_tracklet_frames" in the tracker, or something else?

[screenshot: memo_tracklet_frames]

2. Which part of the code does the "multi-layer prototypical feature fusion" in Table 4 correspond to? Does it mean the "memo_banks" as shown, or something else?

[screenshot: memo_bank]

I'll definitely appreciate it if somebody can give me a hand!

How should this error be resolved?

creating index...
index created!
2022-04-17 21:40:15,020 - pcan - INFO - Start running, host: lin@lin, work_dir: /home/lin/Desktop/pcan/work_dirs/4.17
2022-04-17 21:40:15,020 - pcan - INFO - workflow: [('train', 1)], max: 12 epochs
/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/dense_heads/rpn_head.py:180: UserWarning: In rpn_proposal or test_cfg, nms_thr has been moved to a dict named nms as iou_threshold, max_num has been renamed as max_per_img, name of original arguments and the way to specify iou_threshold of NMS will be deprecated.
'In rpn_proposal or test_cfg, '
Traceback (most recent call last):
  File "tools/train.py", line 168, in <module>
    main()
  File "tools/train.py", line 164, in main
    meta=meta)
  File "/home/lin/Desktop/pcan/pcan/apis/train.py", line 123, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
    self.run_iter(data_batch, train_mode=True)
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
    **kwargs)
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/mmdet/models/detectors/base.py", line 247, in train_step
    losses = self(**data)
  File "/home/lin/.conda/envs/pcan/lib/python3.6/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 45, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/home/lin/Desktop/pcan/pcan/models/mot/quasi_dense_pcan.py", line 86, in forward_train
    ref_gt_bboxes_ignore, ref_gt_masks, **kwargs)
  File "/home/lin/Desktop/pcan/pcan/models/roi_heads/quasi_dense_roi_head.py", line 173, in forward_train
    if mask_results['loss_dice'] is not None:
KeyError: 'loss_dice'
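
A hedged sketch of a defensive fix at the failing line, assuming the configured mask head simply never produces a dice loss: looking the key up with dict.get() skips the branch instead of raising. Shown here on a dummy dict; in pcan/models/roi_heads/quasi_dense_roi_head.py the same pattern would replace the direct indexing at line 173. Note this only masks the symptom, so a mismatch between the detector type in the config and the mask head that actually computes loss_dice is worth checking first.

mask_results = {'loss_mask': 0.5}  # dummy stand-in for the real head output
losses = {}
if mask_results.get('loss_dice') is not None:  # .get() avoids the KeyError
    losses['loss_dice'] = mask_results['loss_dice']
print(losses)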

Will the demo used for testing be updated later?

I tried to adapt video_demo.py from the mmdetection demos to your network, and it reports the following problem:
Traceback (most recent call last):
  File "demo/video_demo.py", line 60, in <module>
    main()
  File "demo/video_demo.py", line 35, in main
    model = init_detector(args.config, args.checkpoint, device=args.device)
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector
    model = build_detector(config.model, test_cfg=config.get('test_cfg'))
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector
    return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build
    return build_from_cfg(cfg, registry, default_args)
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'QuasiDenseMaskRCNN is not in the detector registry'
(pcan) zhy@king:~/pcan$ python test_video.py demo/demo.mp4 configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py checkpoints/pcan_pretrained_model.pth --show
configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py
Traceback (most recent call last):
  File "test_video.py", line 61, in <module>
    main()
  File "test_video.py", line 36, in main
    model = init_detector(args.config, args.checkpoint, device=args.device)
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/apis/inference.py", line 39, in init_detector
    model = build_detector(config.model, test_cfg=config.get('test_cfg'))
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 77, in build_detector
    return build(cfg, DETECTORS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmdet/models/builder.py", line 34, in build
    return build_from_cfg(cfg, registry, default_args)
  File "/home/zhy/anaconda3/envs/pcan/lib/python3.7/site-packages/mmcv/utils/registry.py", line 172, in build_from_cfg
    f'{obj_type} is not in the {registry.name} registry')
KeyError: 'QuasiDenseMaskRCNN is not in the detector registry'
Searching related issues suggests the problem lies in the setup.py file.
I hope you can help resolve my confusion. Thanks!
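
A minimal sketch of the usual cause, assuming pcan is installed in the active environment: mmdet's registry only contains detector classes whose defining modules have been imported, and the stock mmdet demo scripts never import pcan, so QuasiDenseMaskRCNN never gets registered. Importing the pcan model package (module path assumed here) before calling init_detector fixes the lookup.

# Hypothetical adaptation of video_demo.py's model setup.
import pcan.models  # noqa: F401  (side effect: registers pcan's detectors)
from mmdet.apis import init_detector

model = init_detector(
    'configs/segtrack-frcnn_r50_fpn_12e_bdd10k.py',
    'checkpoints/pcan_pretrained_model.pth',
    device='cuda:0')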

No BDD100K format jsons generated after running convert_to_bdd.sh

No BDD100K-format JSONs (for the MOTS challenge submission) are generated after running convert_to_bdd.sh.
After running convert_to_bdd.sh, PNG masks are generated in the pcan-main/converted_results/seg_track directory, but I cannot find any JSON files.

My convert_to_bdd.sh:
python ./tools/to_bdd100k.py configs/segtrack-frcnn_r50_fpn_12e_bdd10k_fixed_pcan.py --res eval_result_pcan_test.pkl \
    --task 'seg_track' --bdd-dir converted_results

Installation error

The requirements do not specify the exact package versions needed for installation, and I'm facing lots of errors while installing them. It would be great if pinned versions could be provided as well.

unexpected results on our own street surveillance data

Hi, thanks for your great work!
I tried the pretrained models on the BDD100K seg_track val set, and the results are the same as the demos.
But on our own videos (static cameras, 2560x1440), not enough objects are detected, and some cars are only partially bounded.

[example frames: 00000145, 00000000]

Is it just because the BDD training data is different from our own data?
Do you have other pretrained models suitable for this kind of city street surveillance video?
By the way, I don't know what the difference is between pcan_pretrained_model.pth and segtrack-fixed-new.pth, but they seem to produce similar results. Thanks!

cannot reproduce the results in README

[screenshot: bdd_results]
As shown in this image, after running tools/test.py from your code with the config and pretrained weights given in the README, I cannot reproduce the "Scores-val" results from the README. Did you also get this result on seg_track_val_cocoformat.json (which can be downloaded from the BDD100K website)? I am really confused about that.

Inference on custom dataset

Hey there, thank you for the great work!
Is there an inference script with which we can run the pre-trained model on a custom video (without ground truth) and get the output for the MOTS task (preferably instance-id masks for each frame)? I tried using the demo script from MMTracking, but there were dependency conflicts between the MMCV and MMDet versions required by PCAN.

problems encountered while trying to train the model

When I use the bash script, the following error is encountered:

ModuleNotFoundError: No module named 'mmcv._ext'

I followed the official instructions and installed mmcv-full==1.2.7; however, there seems to be some problem with this version. When I update mmcv to a higher version, I am told that the higher version is incompatible. Or perhaps there are compatibility issues between MMCV and my torch or CUDA versions (see the version-check sketch after the pip list below).

Here is my pip list:
addict 2.4.0
aliyun-python-sdk-core 2.14.0
aliyun-python-sdk-kms 2.16.2
annotated-types 0.6.0
bdd100k 1.0.1
boto3 1.34.44
botocore 1.34.44
certifi 2024.2.2
cffi 1.16.0
charset-normalizer 3.3.2
click 8.1.7
colorama 0.4.6
contourpy 1.1.1
crcmod 1.7
cryptography 42.0.3
cycler 0.12.1
Cython 0.29.33
filelock 3.13.1
fonttools 4.49.0
fsspec 2024.2.0
gmplot 1.4.1
idna 3.6
imageio 2.34.0
importlib-metadata 7.0.1
importlib-resources 6.1.1
Jinja2 3.1.3
jmespath 0.10.0
joblib 1.3.2
kiwisolver 1.4.5
lazy_loader 0.3
Markdown 3.5.2
markdown-it-py 3.0.0
MarkupSafe 2.1.5
matplotlib 3.7.5
mdurl 0.1.2
mmcv-full 1.2.7
mmdet 2.10.0
mmengine 0.10.3
mmpycocotools 12.0.3
model-index 0.1.11
motmetrics 1.4.0
mpmath 1.3.0
nanoid 2.0.0
networkx 3.1
numpy 1.24.4
opencv-python 4.9.0.80
opendatalab 0.0.10
openmim 0.3.9
openxlab 0.0.34
ordered-set 4.1.0
oss2 2.17.0
packaging 23.2
pandas 2.0.3
pcan 0.1.0+650969 /home/seulab/seulab_ssd/dcc/pcan-main
pillow 10.2.0
pip 23.3.1
platformdirs 4.2.0
plyfile 1.0.3
psutil 5.9.8
pycocotools 2.0.7
pycparser 2.21
pycryptodome 3.20.0
pydantic 2.6.1
pydantic_core 2.16.2
Pygments 2.17.2
pyparsing 3.1.1
python-dateutil 2.8.2
pytz 2023.4
PyWavelets 1.4.1
PyYAML 6.0.1
requests 2.28.2
rich 13.4.2
s3transfer 0.10.0
scalabel 0.3.1
scikit-image 0.21.0
scipy 1.10.1
setuptools 60.2.0
six 1.16.0
sympy 1.12
tabulate 0.9.0
termcolor 2.4.0
terminaltables 3.1.10
tifffile 2023.7.10
toml 0.10.2
tomli 2.0.1
torch 2.1.0+cu121
torchvision 0.16.0+cu121
tqdm 4.65.2
triton 2.1.0
typing_extensions 4.9.0
tzdata 2024.1
urllib3 1.26.18
wheel 0.41.2
xmltodict 0.13.0
yapf 0.40.2
zipp 3.17.0
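
A hedged version-check sketch, motivated by the list above: mmcv-full 1.2.7 predates torch 2.1 by years, and its compiled CUDA ops (mmcv._ext) only load against the torch/CUDA pair they were built for. Printing the installed versions and importing one op makes the mismatch explicit; since pcan expects the mmcv 1.x series, the usual fix is an older torch rather than a newer mmcv.

import torch
import mmcv

print('torch:', torch.__version__, '| built for CUDA:', torch.version.cuda)
print('mmcv-full:', mmcv.__version__)
# This import is what actually fails with "No module named 'mmcv._ext'"
# when the prebuilt ops do not match the installed torch/CUDA.
from mmcv.ops import nms  # noqa: F401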

TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'

Thanks for your excellent work. While I was running train.py, I ran into an error:

Traceback (most recent call last):
  File "D:\ProgramData\Anaconda3\envs\PCAN\lib\site-packages\mmcv\utils\registry.py", line 179, in build_from_cfg
    return obj_cls(**args)
TypeError: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'

During handling of the above exception, another exception occurred:

    raise type(e)(f'{obj_cls.__name__}: {e}')
TypeError: EMQuasiDenseFasterRCNN: __init__() missing 3 required positional arguments: 'channels', 'proto_num', and 'stage_num'

What is a possible solution to this problem?
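
A hedged config sketch: the error says the EMQuasiDenseFasterRCNN constructor needs three EM attention hyper-parameters that a plain QuasiDenseFasterRCNN config does not set, so they have to be added at the top level of the model dict. The values below are illustrative assumptions (loosely mirroring the pos_proto_num/stage_num style settings in the segtrack config earlier on this page), not the authors' canonical choices.

model = dict(
    type='EMQuasiDenseFasterRCNN',
    channels=256,   # assumed: matches the FPN out_channels
    proto_num=10,   # assumed: number of EM prototypes
    stage_num=3,    # assumed: number of EM iteration stages
    # ... keep backbone/neck/rpn_head/roi_head/tracker from the base config
)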

Pretrained model

Dear author, this error occurred when I used the model you provided:

_pickle.UnpicklingError: A load persistent id instruction was encountered,
but no persistent_load function was specified.

How should I solve it? What other models can I use?

inference/demo api for pcan

Hello, I appreciate your great work! I'm a freshman on MOTS tasks. I'm wondering whether there is any inference.py (or demo.py) with which I can apply the pretrained weights to a camera stream and get output for the MOTS task (similar to the issue opened by kartikgupta2607 several days ago)?

TensorRT export

Looks like a good project to explore. You should export it to TensorRT for maximum performance.
