
Sequence_Digit_Recognization

Flowchart

Status

Overall Process

    • Auto-rotate
    • Detection cropping
    • Recognition
    • Connect all the sections
    • Auxiliary methods (see the Recognition part in TODO)
  • 2021/02/25 results (before -> after):
    • Rotation head: 98~100 %
    • YOLO mAP: 96.8 -> 100 %
    • Upper part: 92.77 -> 96.39 %
    • Lower part: 92.77 -> 98.80 %
    • Combination: 84.34 -> 90.36~92 %

Information

Note

  • data.csv: contains the overall annotations (a crop sketch follows this list)
    • columns: |ID|file_name|GT_1|GT_2|xmin|ymin|xmax|ymax|ymax_2|mode|
    • GT_1: ground truth of the upper region
    • GT_2: ground truth of the lower region
    • mode: 0/1/2 = training/validation/testing
    (xmin,ymin) --------------------
                |   upper region   |
                -------------------- (xmax,ymax_2)
                |   lower region   |
                -------------------- (xmax,ymax)
    
  • MultiplicativeNoise, RandomBrightness, and GaussNoise augmentations hurt LPRnet accuracy
  • Rewriting the LPRnet decoder helps a lot!
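
A minimal sketch of reading data.csv and cropping the two regions described above (pandas and Pillow assumed; this code is illustrative, not from the repo):

    import pandas as pd
    from PIL import Image

    df = pd.read_csv("data.csv")
    row = df.iloc[0]
    img = Image.open(row["file_name"])

    # Upper region spans ymin..ymax_2, lower region spans ymax_2..ymax.
    upper = img.crop((row["xmin"], row["ymin"], row["xmax"], row["ymax_2"]))
    lower = img.crop((row["xmin"], row["ymax_2"], row["xmax"], row["ymax"]))

    # The mode column selects the split: 0 = training, 1 = validation, 2 = testing.
    train_df = df[df["mode"] == 0]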

How to Train

  • rotation model: train_rotation.py
    • data in 'data/20201229/EXT/resize/new_image'
  • YOLO: yolov4/train.py
    • data in 'yolov4/VOCdevkit/VOC2007'
  • lprnet: train_lprnet.py
    • data in 'data/20201229/EXT/resize/new_image'
    • uses data.csv to crop the images (see the dataset sketch below)
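
A minimal dataset sketch of the cropping step mentioned above (the class name and file layout are assumptions; the actual logic lives in train_lprnet.py):

    import pandas as pd
    from PIL import Image
    from torch.utils.data import Dataset

    class CropDataset(Dataset):
        """Hypothetical sketch: one cropped region plus its ground-truth string."""

        def __init__(self, csv_path, image_dir, mode=0, region="upper"):
            df = pd.read_csv(csv_path)
            self.df = df[df["mode"] == mode].reset_index(drop=True)
            self.image_dir = image_dir
            self.region = region

        def __len__(self):
            return len(self.df)

        def __getitem__(self, idx):
            row = self.df.iloc[idx]
            img = Image.open(f"{self.image_dir}/{row['file_name']}")
            if self.region == "upper":
                box = (row["xmin"], row["ymin"], row["xmax"], row["ymax_2"])
                label = row["GT_1"]
            else:
                box = (row["xmin"], row["ymax_2"], row["xmax"], row["ymax"])
                label = row["GT_2"]
            return img.crop(box), str(label)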

How to Run Inference

Rotation Model

  • Single image: check the rotation part in output_pipeline.py (a minimal sketch follows)
  • Evaluate the performance: check test_rotation.py
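
A minimal single-image sketch (the model class name and preprocessing are assumptions; the real code is in output_pipeline.py, and the weight path comes from the Finished section below):

    import torch
    from torchvision import transforms
    from PIL import Image

    from model import RotationNet  # class name is an assumption

    net = RotationNet()
    net.load_state_dict(torch.load(
        "weights/rotation/acc_100.00_loss_0.747.pth", map_location="cpu"))
    net.eval()

    img = Image.open("some_image.bmp")
    x = transforms.ToTensor()(img).unsqueeze(0)  # input size/normalization assumed

    with torch.no_grad():
        angle_class = net(x).argmax(dim=1).item()  # predicted rotation bucket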

YOLO

  • Qualitative result (bboxes drawn on images): check yolov4/predict.py
  • Quantitative result (mAP, F1, diagrams, etc.); a driver sketch follows this list:
    1. Modify the model path in yolov4/yolo.py (point it to your own training result)
    2. Run yolov4/get_dr_txt.py
    3. Run yolov4/get_gt_txt.py
    4. Run yolov4/get_map.py
    5. Results will be saved in yolov4/results
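
The three scripts can be chained with a small driver; a sketch (run from the repo root; not part of the repo itself):

    import subprocess

    # Dump detection results, dump ground truth, then compute mAP/F1.
    for script in ("yolov4/get_dr_txt.py", "yolov4/get_gt_txt.py", "yolov4/get_map.py"):
        subprocess.run(["python", script], check=True)
    # Metrics and diagrams end up in yolov4/results.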

LPRnet

  • Inference and image output are in test_lprnet.py
  • Drawing diagrams on the validation set is also in test_lprnet.py
  • Video output of the LPRnet result is in utils.py (a decoding sketch follows this list)
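
LPRnet-style models are typically decoded with CTC: take the argmax per time step, collapse repeats, and drop blanks. A minimal greedy-decoder sketch (the alphabet and blank index are assumptions, not the repo's rewritten decoder):

    import torch

    def greedy_ctc_decode(logits, alphabet, blank=0):
        """logits: (T, num_classes) scores; returns the decoded string."""
        best = logits.argmax(dim=1).tolist()
        chars, prev = [], blank
        for idx in best:
            if idx != blank and idx != prev:  # skip blanks and repeats
                chars.append(alphabet[idx])
            prev = idx
        return "".join(chars)

    # '-' stands in for the CTC blank; the characters match the GT strings in data.csv.
    alphabet = ["-"] + list("0123456789MV")
    print(greedy_ctc_decode(torch.randn(20, len(alphabet)), alphabet))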

TODO

Improvements

    • Use cutmix augmentation (postponed)
    • Shuffle the number sequence
    • Test-time augmentation
    • Use the pattern-removed images

Other methods

Finished

    • Change the image resolution
    • Enlarge the annotations a bit in the x direction
    • Change to a cosine scheduler
    • Use a different decoder
    • Move to the upper region
    • Split the data into two sets:
      1. one with 9 digits
      2. one with 3 digits + anything else
    • Rotation
      • the rotation model is in 'model.py'
      • the weight is in 'weights/rotation/acc_100.00_loss_0.747.pth'
      • results:
        • val : 98.21 % (56 images)
        • test: 100 % (55 images)
    • YOLO
      • modify the xml files
      • check the code in EDA.ipynb (in "new annotation for yolo")
      • feed the new annotations into YOLO


MTCNN

  ((V) = done, (-) = pending)

    • Combine the two regions
  • Add the src file to increase the features (-)
    • Write the metric
  • Refine the training (-)
  • Use the DFT (V)
    • Apply it to the data and check the gain (-)
  • Run k-means on the bounding boxes (-)

MTCNN Workflow

  1. Might need to rotate the images first (in preprocessing.py)
  2. data_check
  3. Add in the annotation data (column notes are also added)
  4. dataframe_creation
  5. The result below will be saved in yourDefineName_EXT_clear_2data_mode_resize.csv:

        ID  file_name                 GT_1    GT_2   xmin  ymin  xmax  ymax  ymax_2  mode
     0  0   20201229080315_0EXT.bmp   389708  102MV  237   756   1267  1336  1046    0
     1  1   20201229080446_0EXT.bmp   389708  207V   257   553   1318  1143  848     1
     ...

  6. pnet_traindata generates the PNet training data
  7. assemble_split assembles the image list
  8. Train PNet with train_pnet.py
  9. onet_traindata
  10. assemble_split again
  11. train_onet.py
  12. MTCNN_evaluation.py gives qualitative results
  13. mtcnn_metric.py gives quantitative results

Rewrite the data-generation scripts

  • 'MTCNN/data_preprocessing/gen_Pnet_train_data.py' (V)
  • 'MTCNN/data_preprocessing/gen_Onet_train_data.py' (V)
  • 'MTCNN/data_preprocessing/assemble_Pnet_imglist.py' (V)
  • 'MTCNN/data_preprocessing/assemble_Onet_imglist.py' (V)

Training

  • 'MTCNN/train/Train_Pnet.py' (V)
  • 'MTCNN/train/Train_Onet.py' (V)

Evaluation

  • mtcnn_visual.py (V)
  • mtcnn_metric.py (V)
