Comments (2)
The config file I used is:
1. data
dataset_type = 'WIDERFaceDataset'
data_root = 'data/WIDERFace/'
img_norm_cfg = dict(
mean=[123.675, 116.28, 103.53], std=[1, 1, 1], to_rgb=True)
size_divisor = 32
data = dict(
samples_per_gpu=4,
workers_per_gpu=2,
train=dict(
typename=dataset_type,
ann_file=data_root + 'WIDER_train/train.txt',
img_prefix=data_root + 'WIDER_train/',
min_size=1,
offset=0,
pipeline=[
dict(typename='LoadImageFromFile', to_float32=True),
dict(typename='LoadAnnotations', with_bbox=True),
dict(typename='RandomSquareCrop',
crop_choice=[0.3, 0.45, 0.6, 0.8, 1.0]),
dict(
typename='PhotoMetricDistortion',
brightness_delta=32,
contrast_range=(0.5, 1.5),
saturation_range=(0.5, 1.5),
hue_delta=18),
dict(typename='RandomFlip', flip_ratio=0.5),
dict(typename='Resize', img_scale=(640, 640), keep_ratio=False),
dict(typename='Normalize', **img_norm_cfg),
dict(typename='DefaultFormatBundle'),
dict(typename='Collect', keys=['img', 'gt_bboxes',
'gt_labels', 'gt_bboxes_ignore']),
]),
val=dict(
typename=dataset_type,
ann_file=data_root + 'WIDER_val/val.txt',
img_prefix=data_root + 'WIDER_val/',
min_size=1,
offset=0,
pipeline=[
dict(typename='LoadImageFromFile'),
dict(
typename='MultiScaleFlipAug',
img_scale=(1100, 1650),
flip=False,
transforms=[
dict(typename='Resize', keep_ratio=True),
dict(typename='RandomFlip', flip_ratio=0.0),
dict(typename='Normalize', **img_norm_cfg),
dict(typename='Pad', size_divisor=32, pad_val=0),
dict(typename='ImageToTensor', keys=['img']),
dict(typename='Collect', keys=['img'])
])
]),
)
2. model
num_classes = 1
strides = [4, 8, 16, 32, 64, 128]
use_sigmoid = True
scales_per_octave = 3
ratios = [1.3]
num_anchors = scales_per_octave * len(ratios)
model = dict(
typename='SingleStageDetector',
backbone=dict(
typename='ResNet',
depth=50,
num_stages=4,
out_indices=(0, 1, 2, 3),
norm_cfg=dict(typename='BN'),
norm_eval=False,
dcn=None,
style='pytorch'),
neck=[
dict(
typename='FPN',
in_channels=[256, 512, 1024, 2048],
out_channels=256,
start_level=0,
add_extra_convs='on_input',
num_outs=6,
norm_cfg=dict(typename='BN'),
upsample_cfg=dict(mode='bilinear')),
dict(
typename='Inception',
in_channel=256,
num_levels=6,
norm_cfg=dict(typename='BN'),
share=True)
],
head=dict(
typename='IoUAwareRetinaHead',
num_classes=num_classes,
num_anchors=num_anchors,
in_channels=256,
stacked_convs=4,
feat_channels=256,
norm_cfg=dict(typename='BN'),
use_sigmoid=use_sigmoid))
3. engines
meshgrid = dict(
typename='BBoxAnchorMeshGrid',
strides=strides,
base_anchor=dict(
typename='BBoxBaseAnchor',
octave_base_scale=2**(4 / 3),
scales_per_octave=scales_per_octave,
ratios=ratios,
base_sizes=strides))
bbox_coder = dict(
typename='DeltaXYWHBBoxCoder',
target_means=[.0, .0, .0, .0],
target_stds=[0.1, 0.1, 0.2, 0.2])
train_engine = dict(
typename='TrainEngine',
model=model,
criterion=dict(
typename='IoUBBoxAnchorCriterion',
num_classes=num_classes,
meshgrid=meshgrid,
bbox_coder=bbox_coder,
loss_cls=dict(
typename='FocalLoss',
use_sigmoid=use_sigmoid,
gamma=2.0,
alpha=0.25,
loss_weight=1.0),
reg_decoded_bbox=True,
loss_bbox=dict(typename='DIoULoss', loss_weight=2.0),
loss_iou=dict(
typename='CrossEntropyLoss',
use_sigmoid=True,
loss_weight=1.0),
train_cfg=dict(
assigner=dict(
typename='MaxIoUAssigner',
pos_iou_thr=0.35,
neg_iou_thr=0.35,
min_pos_iou=0.35,
ignore_iof_thr=-1,
gpu_assign_thr=100),
allowed_border=-1,
pos_weight=-1,
debug=False)),
optimizer=dict(typename='SGD', lr=3.75e-3, momentum=0.9, weight_decay=5e-4)) # 3 GPUS
3.2 val engine
val_engine = dict(
typename='ValEngine',
model=model,
meshgrid=meshgrid,
converter=dict(
typename='IoUBBoxAnchorConverter',
num_classes=num_classes,
bbox_coder=bbox_coder,
nms_pre=-1,
use_sigmoid=use_sigmoid),
num_classes=num_classes,
test_cfg=dict(
min_bbox_size=0,
score_thr=0.01,
nms=dict(typename='lb_nms', iou_thr=0.45),
max_per_img=-1),
use_sigmoid=use_sigmoid,
eval_metric=None)
hooks = [
dict(typename='OptimizerHook'),
dict(
typename='CosineRestartLrSchedulerHook',
periods=[30] * 21,
restart_weights=[1] * 21,
warmup='linear',
warmup_iters=500,
warmup_ratio=1e-1,
min_lr_ratio=1e-2),
dict(typename='EvalHook'),
dict(typename='SnapshotHook', interval=1, save_optim_flag=False, max_keep_ckpts=3),
dict(typename='LoggerHook', interval=100)
]
5. work modes
modes = ['train']#, 'val']
max_epochs = 630
6. checkpoint
weights = dict(
filepath='torchvision://resnet50',
prefix='backbone')
7. misc
seed = 1234
dist_params = dict(backend='nccl')
log_level = 'INFO'
from vedadet.
请问最后解决了吗
from vedadet.
Related Issues (20)
- How to get config of tinaface?
- Why not code directly based on mmdet?
- for the people error occured about "pip install -v -e ." HOT 4
- pip install error on windows
- how to change batch_size in vedadet? HOT 1
- AttributeError: 'ConfigDict' object has no attribute 'val_engine'
- what is the format of input data HOT 1
- Raspberry Pi
- Missing THC/THC.h HOT 3
- TinaFace for CocoDataset
- TTA
- There is no val.xml in WiderFace Anno.
- Cannot set parameter in SnapshotHook HOT 1
- Where to config save place for the finetuned model ?
- How to train with custom dataset by using the pretrained model? HOT 1
- ONNX Export HOT 3
- Running inference: ModuleNotFoundError: No module named 'vedacore' HOT 1
- Inference error using pretrained TinaFace model on my png file
- can't install mmpycocotools in docker HOT 3
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from vedadet.