
mmocr's Introduction

English | 简体中文

Latest Updates

The default branch is now main, and the code on it has been upgraded to v1.0.0. The old main branch code (v0.6.3) now lives on the 0.x branch. If you have been using the main branch and encounter upgrade issues, please read the Migration Guide and the notes on Branches.

v1.0.0 was released on 2023-04-06. Major updates from 1.0.0rc6 include:

  1. Support for SCUT-CTW1500, SynthText, and MJSynth datasets in Dataset Preparer
  2. Updated FAQ and documentation
  3. Deprecation of file_client_args in favor of backend_args
  4. Added a new MMOCR tutorial notebook

To learn more about the updates in MMOCR 1.0, please refer to What's New in MMOCR 1.x, or read the Changelog for more details!

Introduction

MMOCR is an open-source toolbox based on PyTorch and MMDetection for text detection, text recognition, and corresponding downstream tasks including key information extraction. It is part of the OpenMMLab project.

The main branch works with PyTorch 1.6+.

Major Features

  • Comprehensive Pipeline

    The toolbox supports not only text detection and text recognition, but also their downstream tasks such as key information extraction.

  • Multiple Models

    The toolbox supports a wide variety of state-of-the-art models for text detection, text recognition and key information extraction.

  • Modular Design

    The modular design of MMOCR enables users to define their own optimizers, data preprocessors, and model components such as backbones, necks, heads, and losses. Please refer to Overview for how to construct a customized model; an illustrative config sketch follows this feature list.

  • Numerous Utilities

    The toolbox provides a comprehensive set of utilities that help users assess model performance. It includes visualizers for images, ground truths, and predicted bounding boxes; a validation tool for evaluating checkpoints during training; and data converters that demonstrate how to convert your own data into the annotation formats supported by the toolbox.
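
As an illustration of the modular design, an MMOCR model is assembled from registry-based config dicts, so swapping a component only means registering a new class and pointing the config at it. The sketch below is hypothetical and abbreviated: the type names and fields are placeholders rather than a real MMOCR config, so consult the files under configs/ for working examples.

# Hypothetical, abbreviated config sketch in the OpenMMLab dict-config style.
# The type names below are placeholders for illustration only.
model = dict(
    type='MyTextDetector',                    # a detector class registered by you
    backbone=dict(type='MyBackbone'),         # custom backbone
    neck=dict(type='MyNeck'),                 # e.g. a feature pyramid
    det_head=dict(
        type='MyDetHead',                     # custom head
        module_loss=dict(type='MyLoss'),      # custom loss
        postprocessor=dict(type='MyPostprocessor')))  # custom postprocessing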

Installation

MMOCR depends on PyTorch, MMEngine, MMCV and MMDetection. Below are quick steps for installation. Please refer to the Install Guide for more detailed instructions.

conda create -n open-mmlab python=3.8 pytorch=1.10 cudatoolkit=11.3 torchvision -c pytorch -y
conda activate open-mmlab
pip3 install openmim
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr
mim install -e .

Get Started

Please see Quick Run for the basic usage of MMOCR.
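
For a quick sanity check after installation, MMOCR 1.x ships an end-to-end inferencer. Below is a minimal sketch; the chosen model names and the demo image path are examples, and the available visualization/saving options are documented in the user guide.

# Minimal inference sketch for MMOCR 1.x; model names and the image path are examples.
from mmocr.apis import MMOCRInferencer

# Build a detection + recognition pipeline; weights are downloaded automatically.
infer = MMOCRInferencer(det='DBNet', rec='CRNN')

# Run on a sample image and inspect the structured predictions.
result = infer('demo/demo_text_ocr.jpg')
print(result['predictions'])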

Supported algorithms:

BackBone
Text Detection
Text Recognition
Key Information Extraction
Text Spotting

Please refer to model_zoo for more details.

Projects

Here are some implementations of SOTA models and solutions built on MMOCR, which are supported and maintained by community users. These projects demonstrate best practices for research and product development based on MMOCR. We welcome and appreciate all contributions to the OpenMMLab ecosystem.

Contributing

We appreciate all contributions to improve MMOCR. Please refer to CONTRIBUTING.md for the contributing guidelines.

Acknowledgement

MMOCR is an open-source project contributed to by researchers and engineers from various colleges and companies. We appreciate all the contributors who implement their methods or add new features, as well as users who give valuable feedback. We hope the toolbox and benchmark can serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop new OCR methods.

Citation

If you find this project useful in your research, please consider citing:

@article{mmocr2021,
    title={MMOCR:  A Comprehensive Toolbox for Text Detection, Recognition and Understanding},
    author={Kuang, Zhanghui and Sun, Hongbin and Li, Zhizhong and Yue, Xiaoyu and Lin, Tsui Hin and Chen, Jianyong and Wei, Huaqiang and Zhu, Yiqin and Gao, Tong and Zhang, Wenwei and Chen, Kai and Zhang, Wayne and Lin, Dahua},
    journal= {arXiv preprint arXiv:2108.06543},
    year={2021}
}

License

This project is released under the Apache 2.0 license.

OpenMMLab Family

  • MMEngine: OpenMMLab foundational library for training deep learning models.
  • MMCV: OpenMMLab foundational library for computer vision.
  • MIM: MIM installs OpenMMLab packages.
  • MMClassification: OpenMMLab image classification toolbox and benchmark.
  • MMDetection: OpenMMLab detection toolbox and benchmark.
  • MMDetection3D: OpenMMLab's next-generation platform for general 3D object detection.
  • MMRotate: OpenMMLab rotated object detection toolbox and benchmark.
  • MMSegmentation: OpenMMLab semantic segmentation toolbox and benchmark.
  • MMOCR: OpenMMLab text detection, recognition, and understanding toolbox.
  • MMPose: OpenMMLab pose estimation toolbox and benchmark.
  • MMHuman3D: OpenMMLab 3D human parametric model toolbox and benchmark.
  • MMSelfSup: OpenMMLab self-supervised learning toolbox and benchmark.
  • MMRazor: OpenMMLab model compression toolbox and benchmark.
  • MMFewShot: OpenMMLab fewshot learning toolbox and benchmark.
  • MMAction2: OpenMMLab's next-generation action understanding toolbox and benchmark.
  • MMTracking: OpenMMLab video perception toolbox and benchmark.
  • MMFlow: OpenMMLab optical flow toolbox and benchmark.
  • MMEditing: OpenMMLab image and video editing toolbox.
  • MMGeneration: OpenMMLab image and video generative models toolbox.
  • MMDeploy: OpenMMLab model deployment framework.

Welcome to the OpenMMLab community

Scan the QR code below to follow the OpenMMLab team's Zhihu official account and join the OpenMMLab team's QQ group, join the official WeChat communication group by adding us on WeChat, or join our Slack.

In the OpenMMLab community, we will:

  • 📢 share the latest core technologies of AI frameworks
  • 💻 explain the source code of common PyTorch modules
  • 📰 post news related to OpenMMLab releases
  • 🚀 introduce cutting-edge algorithms developed by OpenMMLab
  • 🏃 provide more efficient answers and feedback
  • 🔥 provide a platform for communication with developers from all walks of life

The OpenMMLab community looks forward to your participation! 👬

mmocr's People

Contributors

2793145003, allentdan, beyondyourself, cuhk-hbsun, doem97, frankstorming, gaotongxiao, garvan2021, harold-lkk, hegelim, holycrap96, hqwei, hugotong6425, innerlee, jeffreykuang, jorie-peng, jyshee, kevinnunu, maxbachmann, mountchicken, protossdragoon, quincylin1, rangilyu, samayala22, sbugallo, vansin, willpat1213, xinke-wang, yuexy, zyq-scut


mmocr's Issues

Discussion for April Plan

Here is a list of candidate topics that we could focus on in April. We will discuss and select a portion of them in an upcoming meeting. Feel free to add to or comment on the list :)

Installation

Documentation

  • Tutorials on fine-tuning/training on users' data
  • Chinese documentation @quincylin1

Demo

Auxiliary Tools

  • Log analysis
  • Model complexity
  • onnx
  • config printer
  • model publishing

Benchmark

  • benchmarks with paddleocr, chineseocr, easyocr, chineseocr_lite

Multilingual

  • General discussion

Datasets

  • Improve preparation scripts & documents

More Algorithms

Build on windows 10 error

I solved some errors when building it on Windows 10, but I cannot get past the following error:

mmocr/models/textdet/postprocess/pan.cpp(45): error C2131: expression did not evaluate to a constant
mmocr/models/textdet/postprocess/pan.cpp(45): note: failure was caused by reading a variable beyond its lifetime
mmocr/models/textdet/postprocess/pan.cpp(45): note: see usage of 'label_num'
mmocr/models/textdet/postprocess/pan.cpp(86): error C2075: 'kernel_cv': initialization requires a brace-enclosed initializer list
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\bin\HostX86\x64\cl.exe' failed with exit status 2

Error report: Textsnake for generating ground truth targets.

When training TextSnake on my own data, an error randomly comes out of

while current_line_len >= length_cumsum[current_edge_ind + 1]:

Same issue here, princewang1994/TextSnake.pytorch#30

line 205, in resample_line
    while current_line_len >= length_cumsum[current_edge_ind + 1]:
IndexError: index 2 is out of bounds for axis 0 with size 2

I'm trying to figure out, thanks a lot if you can help with that.
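
Not an official fix, but one workaround sketch while the root cause is investigated: guard the loop index so it never steps past the end of length_cumsum (variable names follow the traceback above; whether this hides a deeper issue with degenerate polylines is worth checking).

# Hypothetical guard for the resampling loop in resample_line.
while (current_edge_ind + 1 < len(length_cumsum)
       and current_line_len >= length_cumsum[current_edge_ind + 1]):
    current_edge_ind += 1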

Iteration Plan April-May 2021

This captures the work we have planned for the next month, namely from mid-April to mid-May (5/15). We will improve the user experience, including installation, documentation, and demos. In addition, more algorithms will be supported. Details are:

Installation

Documentation

Demo

Benchmark

More Algorithms


Deferred

If you have ideas on what would be interesting to implement next, feel free to reply below or request features here. Happy Research!

How to train KIE Model with custom dataset?

I see that there are currently 26 classes in wildreceipt/class_list.txt.
I am looking to train the model with only 4 classes, and the dataset is prepared in the same structure as wildreceipt/.

How do I train the model with this custom dataset? In the documentation I only found instructions for training detection and recognition models.
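
Not an official answer, but a rough sketch of what is usually involved, with the file format and config field names assumed from the wildreceipt layout rather than taken from the documentation: list the 4 classes in your own class_list.txt and make the SDMGR head predict that many classes.

# class_list.txt (assumed "<index> <class name>" format, as in wildreceipt), e.g.:
#   0 Ignore
#   1 CompanyName
#   2 Date
#   3 Total
#
# Hypothetical config override; the head key (bbox_head vs. kie_head) and the
# num_classes field vary across MMOCR versions, so verify them against the
# SDMGR config you start from.
_base_ = ['./sdmgr_unet16_60e_wildreceipt.py']   # example base config path

model = dict(bbox_head=dict(num_classes=4))      # match the entries in class_list.txt

data_root = 'data/my_receipts/'                  # dataset prepared like wildreceipt/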

Simplify the installation procedure

Describe the feature

Motivation
mmocr installation needs to compile C++ and CUDA code.
We'd better move the C++/CUDA functions or ops to mmcv.

  1. assign_pixels in panet
  2. estimate_text_confidence in panet
  3. get_pixel_num in panet
  4. pse in psenet
  5. rroi align

Related resources
Nil

Additional context
Nil

If you would like to implement the feature and create a PR, please leave a comment here and that would be much appreciated.

Check warnings

When testing, there are some warnings from wrapper.py:

(screenshot of the warnings omitted)

Please double-check that the code is rigorous.

Webcam demo script is not working properly

Checklist

  1. I have searched related issues but cannot get the expected help: Yes
  2. The bug has not been fixed in the latest version: Yes

Describe the bug

The current model_inference function expects a model and a path to an image as inputs, but the webcam demo script tries to call it with a model and a numpy array (the return value of cv2.VideoCapture.read()).

This raises an assertion error due to the type mismatch (np.ndarray vs str).

Reproduction

  1. What command or script did you run?
python demo/webcam_demo.py
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

No.

  3. What dataset did you use?

Environment

  1. Please run python mmocr/utils/collect_env.py to collect necessary environment information and paste it here.

sys.platform: linux
Python: 3.7.10 | packaged by conda-forge | (default, Feb 19 2021, 16:07:37) [GCC 9.3.0]
CUDA available: True
GPU 0: GeForce GTX 1050 Ti
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.4 Product Build 20200917 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0a0+82fd1c8
OpenCV: 4.5.1
MMCV: 1.2.7
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: not available
MMOCR: 0.1.0+344cc9a

  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch: conda

Error traceback

Use load_from_local loader
Press "Esc", "q" or "Q" to exit.
Traceback (most recent call last):
  File "demo/webcam_demo.py", line 52, in <module>
    main()
  File "demo/webcam_demo.py", line 41, in main
    result = model_inference(model, img)
  File "/home/sbugallo/Projects/mmocr/mmocr/apis/inference.py", line 18, in model_inference
    assert isinstance(img, str)
AssertionError

Bug fix

The inference method should accept the following types as input image(s), like in MMDetection: str/ndarray, list[str/ndarray], or tuple[str/ndarray].
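
A rough sketch of the suggested behaviour (not the actual MMOCR implementation): load the image with mmcv when a path is given and pass arrays through unchanged, recursing over lists/tuples.

import mmcv
import numpy as np

def to_image(img):
    # Hypothetical helper: accept either a file path or an already-decoded array.
    if isinstance(img, str):
        return mmcv.imread(img)   # read from disk
    if isinstance(img, np.ndarray):
        return img                # e.g. a frame from cv2.VideoCapture.read()
    raise TypeError(f'img must be str or np.ndarray, got {type(img)}')

def model_inference_any(model, imgs):
    # Sketch of a wrapper that tolerates str/ndarray and list/tuple inputs;
    # the actual pipeline construction is elided.
    if isinstance(imgs, (list, tuple)):
        return [model_inference_any(model, im) for im in imgs]
    img = to_image(imgs)
    ...  # build the data dict / test pipeline and run the model as model_inference does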

pip install mmdet==2.9.0 error!

pip install mmpycocotools                                                                                                                                                                                                                                   
Collecting mmpycocotools                                                                                                                                                                                                                                      
  Using cached mmpycocotools-12.0.3.tar.gz (23 kB)                                                                                                                                                                                                            
Requirement already satisfied: setuptools>=18.0 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from mmpycocotools) (52.0.0.post20210125)                                                                                                      
Requirement already satisfied: cython>=0.27.3 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from mmpycocotools) (0.29.23)                                                                                                                    
Requirement already satisfied: matplotlib>=2.1.0 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from mmpycocotools) (3.4.1)                                                                                                                   
Requirement already satisfied: numpy>=1.16 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (1.19.2)                                                                                                     
Requirement already satisfied: cycler>=0.10 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (0.10.0)                                                                                                    
Requirement already satisfied: pyparsing>=2.2.1 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (2.4.7)                                                                                                 
Requirement already satisfied: python-dateutil>=2.7 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (2.8.1)                                                                                             
Requirement already satisfied: pillow>=6.2.0 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (8.2.0)                                                                                                    
Requirement already satisfied: kiwisolver>=1.0.1 in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from matplotlib>=2.1.0->mmpycocotools) (1.3.1)                                                                                                
Requirement already satisfied: six in /home/.conda/envs/mmdet/lib/python3.7/site-packages (from cycler>=0.10->matplotlib>=2.1.0->mmpycocotools) (1.15.0)                                                                                               
Building wheels for collected packages: mmpycocotools                                                                                                                                                                                                         
  Building wheel for mmpycocotools (setup.py) ... error                                                                                                                                                                                                       
  ERROR: Command errored out with exit status 1:                                                                                                                                                                                                              
   command: /home/.conda/envs/mmdet/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"'; __file__='"'"'/tmp/pip-install-8hm7nhkz/mmpycocotool
s_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-0tghkhjj      
       cwd: /tmp/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/                                                                                                                                                                         
  Complete output (24 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/__init__.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/mask.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/coco.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/cocoeval.py -> build/lib.linux-x86_64-3.7/pycocotools
  running build_ext
  cythoning pycocotools/_mask.pyx to pycocotools/_mask.c
  /home/.conda/envs/mmdet/lib/python3.7/site-packages/Cython/Compiler/Main.py:369: FutureWarning: Cython directive 'language_level' not set, using 2 for now (Py2). This will change in a later release! File: /tmp/pip-install-8hm7nhkz/mmpycocotools$09abb22fd16c43d99bf834c6e81dbdef/pycocotools/_mask.pyx
    tree = Parsing.p_module(s, pxd, full_module_name)
  building 'pycocotools._mask' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/common
  creating build/temp.linux-x86_64-3.7/pycocotools
  gcc -pthread -B /home/.conda/envs/mmdet/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/.conda/envs/mmdet/lib/python3.7/site-packages/numpy/core/include -Icommon -I/home/haor$n/.conda/envs/mmdet/include/python3.7m -c common/maskApi.c -o build/temp.linux-x86_64-3.7/common/maskApi.o
  common/maskApi.c:8:10: fatal error: math.h: No such file or directory
   #include <math.h>
            ^~~~~~~~
  compilation terminated.
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
ERROR: Failed building wheel for mmpycocotools
  Running setup.py clean for mmpycocotools
Failed to build mmpycocotools
Installing collected packages: mmpycocotools
    Running setup.py install for mmpycocotools ... error
    ERROR: Command errored out with exit status 1:
     command: /home/.conda/envs/mmdet/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"'; __file__='"'"'/tmp/pip-install-8hm7nhkz/mmpycocoto
ols_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install --record /tmp/pip-record-9e4fvtcn/
install-record.txt --single-version-externally-managed --compile --install-headers /home/.conda/envs/mmdet/include/python3.7m/mmpycocotools
         cwd: /tmp/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/
    Complete output (22 lines):
    running install
    running build
    running build_py
    creating build
    creating build/lib.linux-x86_64-3.7
    creating build/lib.linux-x86_64-3.7/pycocotools
    copying pycocotools/__init__.py -> build/lib.linux-x86_64-3.7/pycocotools
    copying pycocotools/mask.py -> build/lib.linux-x86_64-3.7/pycocotools
    copying pycocotools/coco.py -> build/lib.linux-x86_64-3.7/pycocotools
    copying pycocotools/cocoeval.py -> build/lib.linux-x86_64-3.7/pycocotools
    running build_ext
    skipping 'pycocotools/_mask.c' Cython extension (up-to-date)
    building 'pycocotools._mask' extension
    creating build/temp.linux-x86_64-3.7
    creating build/temp.linux-x86_64-3.7/common
    creating build/temp.linux-x86_64-3.7/pycocotools
    gcc -pthread -B /home/.conda/envs/mmdet/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/.conda/envs/mmdet/lib/python3.7/site-packages/numpy/core/include -Icommon -I/home/hao
ran/.conda/envs/mmdet/include/python3.7m -c common/maskApi.c -o build/temp.linux-x86_64-3.7/common/maskApi.o
    common/maskApi.c:8:10: fatal error: math.h: No such file or directory
     #include <math.h>
              ^~~~~~~~
    compilation terminated.
    error: command 'gcc' failed with exit status 1
    ----------------------------------------
ERROR: Command errored out with exit status 1: /home/.conda/envs/mmdet/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"'; __file__='"'"'/tm
p/pip-install-8hm7nhkz/mmpycocotools_09abb22fd16c43d99bf834c6e81dbdef/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' install -
-record /tmp/pip-record-9e4fvtcn/install-record.txt --single-version-externally-managed --compile --install-headers /home/.conda/envs/mmdet/include/python3.7m/mmpycocotools Check the logs for full command output.

How can this problem be solved? The error occurred when installing pycocotools and mmdet.

The hmean of DBnet in ICDAR15 is lower than which in paper

I ran the script and got the H-mean of DBNet on ICDAR15. My H-mean is 0.8343, while it is 0.873 in the original paper (the model in the paper is DB-ResNet-50 (1152)).

Are the experiment settings different between them?

The config I used is "dbnet_r50dcnv2_fpnc_1200e_icdar2015", and I have already downloaded the SynthText pretrained model.

ERROR: Failed building wheel for mmpycocotools

(open-mmlab) home@home-lnx:~$ pip install mmdet==2.9.0
Collecting mmdet==2.9.0
  Downloading mmdet-2.9.0-py3-none-any.whl (536 kB)
     |████████████████████████████████| 536 kB 267 kB/s 
Requirement already satisfied: six in ./anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.9.0) (1.15.0)
Collecting mmpycocotools
  Downloading mmpycocotools-12.0.3.tar.gz (23 kB)
Requirement already satisfied: numpy in ./anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmdet==2.9.0) (1.19.2)
Collecting matplotlib
  Using cached matplotlib-3.4.1-cp37-cp37m-manylinux1_x86_64.whl (10.3 MB)
Collecting terminaltables
  Downloading terminaltables-3.1.0.tar.gz (12 kB)
Collecting pyparsing>=2.2.1
  Using cached pyparsing-2.4.7-py2.py3-none-any.whl (67 kB)
Collecting python-dateutil>=2.7
  Using cached python_dateutil-2.8.1-py2.py3-none-any.whl (227 kB)
Requirement already satisfied: pillow>=6.2.0 in ./anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from matplotlib->mmdet==2.9.0) (8.2.0)
Collecting kiwisolver>=1.0.1
  Using cached kiwisolver-1.3.1-cp37-cp37m-manylinux1_x86_64.whl (1.1 MB)
Collecting cycler>=0.10
  Using cached cycler-0.10.0-py2.py3-none-any.whl (6.5 kB)
Requirement already satisfied: setuptools>=18.0 in ./anaconda3/envs/open-mmlab/lib/python3.7/site-packages (from mmpycocotools->mmdet==2.9.0) (52.0.0.post20210125)
Collecting cython>=0.27.3
  Downloading Cython-0.29.23-cp37-cp37m-manylinux1_x86_64.whl (2.0 MB)
     |████████████████████████████████| 2.0 MB 338 kB/s 
Building wheels for collected packages: mmpycocotools, terminaltables
  Building wheel for mmpycocotools (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: /home/home/anaconda3/envs/open-mmlab/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-0mks2l7w/mmpycocotools_8f3c4e7f03764bc4a1bf38714941bf23/setup.py'"'"'; __file__='"'"'/tmp/pip-install-0mks2l7w/mmpycocotools_8f3c4e7f03764bc4a1bf38714941bf23/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-unv6s5d5
       cwd: /tmp/pip-install-0mks2l7w/mmpycocotools_8f3c4e7f03764bc4a1bf38714941bf23/
  Complete output (63 lines):
  running bdist_wheel
  running build
  running build_py
  creating build
  creating build/lib.linux-x86_64-3.7
  creating build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/mask.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/__init__.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/cocoeval.py -> build/lib.linux-x86_64-3.7/pycocotools
  copying pycocotools/coco.py -> build/lib.linux-x86_64-3.7/pycocotools
  running build_ext
  building 'pycocotools._mask' extension
  creating build/temp.linux-x86_64-3.7
  creating build/temp.linux-x86_64-3.7/common
  creating build/temp.linux-x86_64-3.7/pycocotools
  gcc -pthread -B /home/home/anaconda3/envs/open-mmlab/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/numpy/core/include -Icommon -I/home/home/anaconda3/envs/open-mmlab/include/python3.7m -c common/maskApi.c -o build/temp.linux-x86_64-3.7/common/maskApi.o
  common/maskApi.c: In function ‘rleDecode’:
  common/maskApi.c:46:7: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
         for( k=0; k<R[i].cnts[j]; k++ ) *(M++)=v; v=!v; }}
         ^~~
  common/maskApi.c:46:49: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘for’
         for( k=0; k<R[i].cnts[j]; k++ ) *(M++)=v; v=!v; }}
                                                   ^
  common/maskApi.c: In function ‘rleFrPoly’:
  common/maskApi.c:166:3: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
     for(j=0; j<k; j++) x[j]=(int)(scale*xy[j*2+0]+.5); x[k]=x[0];
     ^~~
  common/maskApi.c:166:54: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘for’
     for(j=0; j<k; j++) x[j]=(int)(scale*xy[j*2+0]+.5); x[k]=x[0];
                                                        ^
  common/maskApi.c:167:3: warning: this ‘for’ clause does not guard... [-Wmisleading-indentation]
     for(j=0; j<k; j++) y[j]=(int)(scale*xy[j*2+1]+.5); y[k]=y[0];
     ^~~
  common/maskApi.c:167:54: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘for’
     for(j=0; j<k; j++) y[j]=(int)(scale*xy[j*2+1]+.5); y[k]=y[0];
                                                        ^
  common/maskApi.c: In function ‘rleToString’:
  common/maskApi.c:212:7: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
         if(more) c |= 0x20; c+=48; s[p++]=c;
         ^~
  common/maskApi.c:212:27: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
         if(more) c |= 0x20; c+=48; s[p++]=c;
                             ^
  common/maskApi.c: In function ‘rleFrString’:
  common/maskApi.c:220:3: warning: this ‘while’ clause does not guard... [-Wmisleading-indentation]
     while( s[m] ) m++; cnts=malloc(sizeof(uint)*m); m=0;
     ^~~~~
  common/maskApi.c:220:22: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘while’
     while( s[m] ) m++; cnts=malloc(sizeof(uint)*m); m=0;
                        ^~~~
  common/maskApi.c:228:5: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
       if(m>2) x+=(long) cnts[m-2]; cnts[m++]=(uint) x;
       ^~
  common/maskApi.c:228:34: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
       if(m>2) x+=(long) cnts[m-2]; cnts[m++]=(uint) x;
                                    ^~~~
  common/maskApi.c: In function ‘rleToBbox’:
  common/maskApi.c:141:31: warning: ‘xp’ may be used uninitialized in this function [-Wmaybe-uninitialized]
         if(j%2==0) xp=x; else if(xp<x) { ys=0; ye=h-1; }
                                 ^
  gcc -pthread -B /home/home/anaconda3/envs/open-mmlab/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/numpy/core/include -Icommon -I/home/home/anaconda3/envs/open-mmlab/include/python3.7m -c pycocotools/_mask.c -o build/temp.linux-x86_64-3.7/pycocotools/_mask.o
  gcc: error: pycocotools/_mask.c: No such file or directory
  error: command 'gcc' failed with exit status 1
  ----------------------------------------
  ERROR: Failed building wheel for mmpycocotools
  Running setup.py clean for mmpycocotools
  Building wheel for terminaltables (setup.py) ... done
  Created wheel for terminaltables: filename=terminaltables-3.1.0-py3-none-any.whl size=15355 sha256=49a8c9af3f2788e6d73d9390bcfea6fb71d0c6d2b6da6db619bf77cfdd85e91a
  Stored in directory: /home/home/.cache/pip/wheels/ba/ad/c8/2d98360791161cd3db6daf6b5e730f34021fc9367d5879f497
Successfully built terminaltables
Failed to build mmpycocotools
Installing collected packages: python-dateutil, pyparsing, kiwisolver, cycler, matplotlib, cython, terminaltables, mmpycocotools, mmdet
    Running setup.py install for mmpycocotools ... done
  DEPRECATION: mmpycocotools was installed using the legacy 'setup.py install' method, because a wheel could not be built for it. A possible replacement is to fix the wheel build issue reported above. You can find discussion regarding this at https://github.com/pypa/pip/issues/8368.
Successfully installed cycler-0.10.0 cython-0.29.23 kiwisolver-1.3.1 matplotlib-3.4.1 mmdet-2.9.0 mmpycocotools-12.0.3 pyparsing-2.4.7 python-dateutil-2.8.1 terminaltables-3.1.0

DBNet, ICDAR2015, dist_test, get h-mean 0.00

The demo part works fine with the pretrained weights, but when I try to run evaluation, the results turn out to be wrong.

./tools/dist_test.sh configs/textdet/dbnet/dbnet_r50dcnv2_fpnc_1200e_icdar2015.py checkpoints/dbnet_r50dcnv2_fpnc_sbn_2e_synthtext_20210325-aa96e477.pth 8 --eval hmean-iou

(screenshot of the evaluation output omitted)

tools/data/textdet/icdar_converter.py is really slow.

Thanks for your great work.
Here I came across an efficiency problem.
I'm trying to use the tools to convert my own dataset into COCO format for text detection. It contains millions of images.
After playing with it for a while, I found that this line is extremely slow.

drop_orientation(f) if is_not_png(f) else f for f in imgs_list

Do you have any ideas to make it faster? On my machine, the default backend for mmcv.imread is "cv2", but for 3500 images it takes a really long time (more than 10 minutes) to read them. I think it would help to read the images in parallel. Do you have any plan to update it? Thanks.
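
Not an official plan, but a sketch of one way to parallelise the slow per-image check with a process pool; drop_orientation and is_not_png are the existing helpers referenced above and are assumed to be importable.

from multiprocessing import Pool

def normalize_img(path):
    # Apply the existing helpers to a single image path.
    return drop_orientation(path) if is_not_png(path) else path

# Process the images with several workers instead of one by one; tune the
# worker count to the machine's cores and disk bandwidth.
with Pool(processes=8) as pool:
    imgs_list = pool.map(normalize_img, imgs_list, chunksize=64)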

Error when running the test script of the key information extraction network SDMGR

Hello, when running the key information extraction test script with the command tools/kie_test_imgs.sh configs/kie/sdmgr/sdmgr_unet16_60e_wildreceipt.py checkpoints/sdmgr_unet16_60e_wildreceipt_20210405-16a47642.pth checkpoints/, the dataset and checkpoint paths are as follows:

(ocr) zhoazj@zhoazj-ThinkPad-P15v-Gen-1:~/Desktop/codes/projects/github/mmocr$ tree data/ -L 2
data/
├── wildreceipt
│   ├── class_list.txt
│   ├── dict.txt
│   ├── image_files
│   ├── test.txt
│   └── train.txt
└── wildreceipt.tar

2 directories, 5 files
(ocr) zhoazj@zhoazj-ThinkPad-P15v-Gen-1:~/Desktop/codes/projects/github/mmocr$ tree checkpoints/
checkpoints/
├── _2021-04-18_22-01-30
├── crnn_academic-a723a1c5.pth
├── dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth
└── sdmgr_unet16_60e_wildreceipt_20210405-16a47642.pth

1 directory, 3 files

Then the following error message appears:

Use load_from_local loader
[                                                  ] 0/472, elapsed: 0s, ETA:Traceback (most recent call last):
  File "tools/kie_test_imgs.py", line 108, in <module>
    main()
  File "tools/kie_test_imgs.py", line 104, in main
    test(model, data_loader, args.show, args.show_dir)
  File "tools/kie_test_imgs.py", line 23, in test
    result = model(return_loss=False, rescale=True, **data)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 40, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
    return old_func(*args, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 183, in forward
    return self.forward_test(img, img_metas, **kwargs)
  File "/home/zhoazj/Desktop/codes/projects/github/mmocr/mmocr/models/kie/extractors/sdmgr.py", line 82, in forward_test
    x = self.extract_feat(img, gt_bboxes)
  File "/home/zhoazj/Desktop/codes/projects/github/mmocr/mmocr/models/kie/extractors/sdmgr.py", line 93, in extract_feat
    x = super().extract_feat(img)[-1]
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/mmdet/models/detectors/single_stage.py", line 54, in extract_feat
    x = self.backbone(img)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/Desktop/codes/projects/github/mmocr/mmocr/models/common/backbones/unet.py", line 480, in forward
    x = enc(x)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/Desktop/codes/projects/github/mmocr/mmocr/models/common/backbones/unet.py", line 181, in forward
    out = self.convs(x)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/container.py", line 100, in forward
    input = module(input)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/mmcv/cnn/bricks/conv_module.py", line 198, in forward
    x = self.conv(x)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 349, in forward
    return self._conv_forward(input, self.weight)
  File "/home/zhoazj/anaconda3/envs/ocr/lib/python3.8/site-packages/torch/nn/modules/conv.py", line 345, in _conv_forward
    return F.conv2d(input, weight, self.bias, self.stride,
RuntimeError: Expected 4-dimensional input for 4-dimensional weight [16, 3, 3, 3], but got 5-dimensional input of size [1, 1, 3, 512, 512] instead

Is there something wrong with my setup? Any help would be appreciated, thanks.

Why not produce a pipeline for text detection and channel recognition

Describe the feature

Motivation
A clear and concise description of the motivation of the feature.
Ex1. It is inconvenient when [....].
Ex2. There is a recent paper [....], which is very helpful for [....].

Related resources
If there is an official code release or third-party implementations, please also provide the information here, which would be very helpful.

Additional context
Add any other context or screenshots about the feature request here.
If you would like to implement the feature and create a PR, please leave a comment here and that would be much appreciated.

Help needed regarding Text recognition.

Can someone help me with how to perform text recognition?
I am using (for example) the following image.
test1

And getting this result.
test

How can I get the desired output? Any kind of help would be appreciated. Thanks.

Stuck when predicting using Pan

It gets stuck when predicting with PANet on this image and doesn't produce a prediction:

python demo/image_demo.py demo/2.jpg configs/textdet/panet/panet_r18_fpem_ffm_600e_ctw1500.py ./panet_r18_fpem_ffm_sbn_600e_ctw1500_20210219-3b3a9aa3.pth demo/output.jpg

(the test image demo/2.jpg is attached to the original issue)

torch.jit.trace error: hope to trace dbnet_r18 model

Hi,
I would like to torch.jit.trace the dbnet_r18 model, but I meet the error below:

mmocr# python demo/image_demo.py demo/demo_text_det.jpg configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py  ./py_DBNet_r18/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth  demo/demo_text_det_pred.jpg
Traceback (most recent call last):
  File "demo/image_demo.py", line 52, in <module>
    main()
  File "demo/image_demo.py", line 36, in main
    script_fun = torch.jit.trace(model, tmp_in)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 875, in trace
    check_tolerance, _force_outplace, _module_class)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1021, in trace_module
    module = make_module(mod, _module_class, _compilation_unit)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 720, in make_module
    return _module_class(mod, _compilation_unit=_compilation_unit)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1884, in __init__
    tmp_module._modules[name] = make_module(submodule, TracedModule, _compilation_unit=None)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 720, in make_module
    return _module_class(mod, _compilation_unit=_compilation_unit)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1884, in __init__
    tmp_module._modules[name] = make_module(submodule, TracedModule, _compilation_unit=None)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 720, in make_module
    return _module_class(mod, _compilation_unit=_compilation_unit)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 1884, in __init__
    tmp_module._modules[name] = make_module(submodule, TracedModule, _compilation_unit=None)
  File "/opt/conda/lib/python3.7/site-packages/torch/jit/__init__.py", line 703, in make_module
    elif torch._jit_internal.module_has_exports(mod):
  File "/opt/conda/lib/python3.7/site-packages/torch/_jit_internal.py", line 438, in module_has_exports
    item = getattr(mod, name)
  File "/opt/conda/lib/python3.7/site-packages/torch/nn/modules/module.py", line 594, in __getattr__
    type(self).__name__, name))
AttributeError: 'ConvModule' object has no attribute 'norm'

I have modified the image_demo.py as below :

from argparse import ArgumentParser

import mmcv

from mmdet.apis import init_detector
from mmocr.apis.inference import model_inference
from mmocr.datasets import build_dataset  # noqa: F401
from mmocr.models import build_detector  # noqa: F401
import torch

def main():
    parser = ArgumentParser()
    parser.add_argument('img', help='Image file.')
    parser.add_argument('config', help='Config file.')
    parser.add_argument('checkpoint', help='Checkpoint file.')
    parser.add_argument('save_path', help='Path to save visualized image.')
    parser.add_argument(
        '--device', default='cpu', help='Device used for inference.')
    parser.add_argument(
        '--imshow',
        action='store_true',
        help='Whether show image with OpenCV.')
    args = parser.parse_args()

    # build the model from a config file and a checkpoint file
    model = init_detector(args.config, args.checkpoint, device=args.device)
    if model.cfg.data.test['type'] == 'ConcatDataset':
        model.cfg.data.test.pipeline = model.cfg.data.test['datasets'][
            0].pipeline

    device = torch.device('cpu')
    chkpt=torch.load("./py_DBNet_r18/dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth", map_location=device)
    model.load_state_dict(chkpt['state_dict'], strict=False)

    tmp_in = torch.rand(64,3,7,7)
    script_fun = torch.jit.trace(model, tmp_in)
    script_fun.save("./py_DBNet_r18/dbnet_r18_fpnc_trace.pt")

    # test a single image
    result =  model_inference(model, args.img)
    print(f'result: {result}')

    # show the results
    img = model.show_result(args.img, result, out_file=None, show=False)

    mmcv.imwrite(img, args.save_path)
    if args.imshow:
        mmcv.imshow(img, 'predicted results')


if __name__ == '__main__':
    main()

libGL.so.1 not found

After installing the official docker image, importing cv2 fails with:
ImportError: libGL.so.1: cannot open shared object file: No such file or directory
Please fix it.

numpy.ndarray size changed in latest master

When running detection using the latest master branch, installed in a fresh environment following the latest install.md:

(mmdet) home@home-lnx:~/programs/mmocr$ python demo/image_demo.py demo/demo_text_det.jpg ./configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py ./dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth demo/demo_text_det_pred.jpg
Traceback (most recent call last):
  File "demo/image_demo.py", line 5, in <module>
    from mmdet.apis import init_detector
  File "/home/home/programs/mmdetection/mmdet/apis/__init__.py", line 1, in <module>
    from .inference import (async_inference_detector, inference_detector,
  File "/home/home/programs/mmdetection/mmdet/apis/inference.py", line 10, in <module>
    from mmdet.core import get_classes
  File "/home/home/programs/mmdetection/mmdet/core/__init__.py", line 5, in <module>
    from .mask import *  # noqa: F401, F403
  File "/home/home/programs/mmdetection/mmdet/core/mask/__init__.py", line 2, in <module>
    from .structures import BaseInstanceMasks, BitmapMasks, PolygonMasks
  File "/home/home/programs/mmdetection/mmdet/core/mask/structures.py", line 6, in <module>
    import pycocotools.mask as maskUtils
  File "/home/home/anaconda3/envs/mmdet/lib/python3.7/site-packages/pycocotools/mask.py", line 3, in <module>
    import pycocotools._mask as _mask
  File "pycocotools/_mask.pyx", line 1, in init pycocotools._mask
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 88 from C header, got 80 from PyObject

Can't download icdar_2013 train_label.txt

I got error messages when trying to download train_label.txt of the icdar_2013 dataset.

<Error>
<Code>AccessDenied</Code>
<Message>
You have no right to access this object because of bucket acl.
</Message>
<RequestId>6070101B4EBCCF4B4201DC83</RequestId>
<HostId>download.openmmlab.com</HostId>
</Error>

I tried several times on different devices and the problem remains. Is this a network issue on my side, or is something wrong with the website? Thanks!

pytest failed when building mmocr with the "full script" (mmcv 1.3.0 installed): all tests failed

pytest reported the failure "mmcv should be >=1.2.4 <=1.3.0", but the full script installs mmcv-full 1.3.0.
I fixed it by adding pip install mmcv==1.2.6 to the "full script":

conda create -n open-mmlab python=3.7 -y
conda activate open-mmlab

# install latest pytorch prebuilt with the default prebuilt CUDA version (usually the latest)
conda install pytorch==1.5.0 torchvision==0.6.0 cudatoolkit=10.1 -c pytorch

# install the latest mmcv-full
pip install mmcv-full==1.2.6

# install mmdetection
pip install mmdet==2.9.0

# install mmocr
git clone https://github.com/open-mmlab/mmocr.git
cd mmocr

pip install -r requirements.txt
pip install -v -e .  # or "python setup.py build_ext --inplace"
export PYTHONPATH=$(pwd):$PYTHONPATH

Production Deployment

Hi there,
Production deployment means optimizing the models for inference-only environments.
The goal is to allow ordinary users without coding knowledge to detect and recognize documents on CPU-only and ARM-based devices such as the Raspberry Pi.

Therefore, quantization and exporting the models to TensorRT and ONNX are important; a generic export sketch follows the links below.
A few examples:
https://github.com/Media-Smart/volksdep
https://github.com/Media-Smart/vedastr
https://github.com/PaddlePaddle/PaddleOCR/tree/develop/deploy
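
A generic starting point for the ONNX part is PyTorch's built-in exporter. The sketch below uses a stand-in module and a representative input shape; exporting a real text detector usually requires keeping the postprocessing outside the traced graph.

import torch
import torch.nn as nn

# Stand-in module for the sketch; replace it with the detector you want to export.
model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())
model.eval()

dummy_input = torch.randn(1, 3, 736, 1280)   # representative detector input shape

torch.onnx.export(
    model,
    dummy_input,
    'textdet.onnx',
    opset_version=11,
    input_names=['input'],
    output_names=['output'],
    dynamic_axes={'input': {0: 'batch', 2: 'height', 3: 'width'}},
)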

KeyError: 'meta'

Getting KeyError: 'meta' when trying to predict using a PSENet model:

(open-mmlab) home@home-lnx:~/programs/mmocr$ python demo/image_demo.py demo/1.jpg configs/textdet/psenet/psenet_r50_fpnf_600e_icdar2015.py ./psenet_r50_fpnf_600e_icdar2015_pretrain-eefd8fe6.pth demo/output.jpg
Traceback (most recent call last):
  File "demo/image_demo.py", line 44, in <module>
    main()
  File "demo/image_demo.py", line 26, in main
    model = init_detector(args.config, args.checkpoint, device=args.device)
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmdet/apis/inference.py", line 43, in init_detector
    if 'CLASSES' in checkpoint['meta']:
KeyError: 'meta'

[Error]: failed to install skimage

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. pip install skimage
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmocr/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

TypeError: list indices must be integers or slices, not torch.device

when trying to predict:

(open-mmlab) home@home-lnx:~/programs/mmocr$ python demo/image_demo.py demo/1.jpg configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py ./dbnet_r18_fpnc_sbn_1200e_icdar2015_20210329-ba3ab597.pth demo/output.jpg
/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmdet/datasets/utils.py:66: UserWarning: "ImageToTensor" pipeline is replaced by "DefaultFormatBundle" for batch inference. It is recommended to manually replace it in the test data pipeline in your config file.
  'data pipeline in your config file.', UserWarning)
Traceback (most recent call last):
  File "demo/image_demo.py", line 44, in <module>
    main()
  File "demo/image_demo.py", line 32, in main
    result = model_inference(model, args.img)
  File "/home/home/programs/mmocr/mmocr/apis/inference.py", line 53, in model_inference
    data = scatter(data, [device])[0]
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/scatter_gather.py", line 44, in scatter
    return scatter_map(inputs)
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/scatter_gather.py", line 34, in scatter_map
    out = list(map(type(obj), zip(*map(scatter_map, obj.items()))))
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/scatter_gather.py", line 29, in scatter_map
    return list(zip(*map(scatter_map, obj)))
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/scatter_gather.py", line 31, in scatter_map
    out = list(map(list, zip(*map(scatter_map, obj))))
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/scatter_gather.py", line 27, in scatter_map
    return Scatter.forward(target_gpus, obj.data)
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/_functions.py", line 72, in forward
    streams = [_get_stream(device) for device in target_gpus]
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/parallel/_functions.py", line 72, in <listcomp>
    streams = [_get_stream(device) for device in target_gpus]
  File "/home/home/anaconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/parallel/_functions.py", line 115, in _get_stream
    if _streams[device] is None:
TypeError: list indices must be integers or slices, not torch.device

unittest for textsnake

Describe the feature

Motivation

  1. Add a unit test for TextSnake
  2. Update the readme if possible.

New language model support

Till now, we have released English models trained on academic datasets. We are planning to support recognition models for more languages.

If you want us to support a new language, please provide the following files:

  1. A char_list.txt file, which lists all the characters used in the new language.
  2. A dict_list.txt file, which lists as many words of the new language as possible.
  3. Documents or website links that help us learn the basics of the new language.

You can also vote for a new language request with 👍 or be against with 👎. (Remember that developers are busy and cannot respond to all language requests, so vote for your most favorable one!)

docker file lacks ffmpeg

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. The bug has not been fixed in the latest version.

Describe the bug
The mmocr docker file can't pass "pytest" because libgl.so.1 is not found.

Reproduction

  1. What command or script did you run?
docker build -t docker/ .
pytest
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    No, I fixed it by adding "RUN apt-get install ffmpeg" to the dockerfile.

Bug fix

RUN apt-get update && apt-get install -y git ninja-build libglib2.0-0 libsm6 libxrender-dev libxext6 ffmpeg\
    && apt-get clean \
    && rm -rf /var/lib/apt/lists/*

Error while running SDMGR

Reproduction

python demo/image_demo.py demo/0.jpeg configs/kie/sdmgr/sdmgr_unet16_60e_wildreceipt.py model_weights/tr_weights/sdmgr_unet16_60e_wildreceipt_20210405-16a47642.pth demo/image_recog_results/SDMGR/0_sdmgr_vt.jpg

No modifications were made to the config.

Environment

sys.platform: linux
Python: 3.7.10 (default, Feb 26 2021, 18:47:35) [GCC 7.3.0]
CUDA available: True
GPU 0: Tesla K80
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.0, V10.0.130
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.3
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0a0+82fd1c8
OpenCV: 4.5.1
MMCV: 1.2.6
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
MMOCR: 0.1.0+5244984

All installation steps were followed as per the documentation.

Error traceback

/home/ubuntu/anaconda3/envs/mmocr/lib/python3.7/site-packages/mmdet/apis/inference.py:47: UserWarning: Class names are not saved in the checkpoint's meta data, use COCO classes by default.
  warnings.warn('Class names are not saved in the checkpoint\'s '
Traceback (most recent call last):
  File "demo/image_demo.py", line 44, in <module>
    main()
  File "demo/image_demo.py", line 32, in main
    result = model_inference(model, args.img)
  File "/home/ubuntu/Desktop/mmocr/mmocr/mmocr/apis/inference.py", line 25, in model_inference
    data = test_pipeline(data)
  File "/home/ubuntu/anaconda3/envs/mmocr/lib/python3.7/site-packages/mmdet/datasets/pipelines/compose.py", line 40, in __call__
    data = t(data)
  File "/home/ubuntu/anaconda3/envs/mmocr/lib/python3.7/site-packages/mmdet/datasets/pipelines/loading.py", line 365, in __call__
    results = self._load_bboxes(results)
  File "/home/ubuntu/anaconda3/envs/mmocr/lib/python3.7/site-packages/mmdet/datasets/pipelines/loading.py", line 240, in _load_bboxes
    ann_info = results['ann_info']
KeyError: 'ann_info'

question about making SDMG-R label

Hello, first of all, thanks to your team, mmocr is amazing! Now I want to train the SDMG-R model on my own data, but I don't know how to label the case where "key : value" appears in one bbox. I found that in your dataset one bbox contains only one type of label, so how should I label that case?

Improve the image demo visualization

Describe the feature

Motivation
The show_result function can already save the image result, as in

img = model.show_result(args.img, result, out_file=None, show=False)

so why save the image via another line,

mmcv.imwrite(img, args.save_path)

?

Suggest removing lines 38-40.

Suggest renaming save_path to out_file.

Related resources
Nil

Additional context
Require the show_result function of all models to support saving and showing image files.

Roadmap of MMOCR

We keep this issue open to collect feature requests from users and hear your voice. Our monthly release plan is also available here.

You can either:

  1. Suggest a new feature by leaving a comment.

  2. Vote for a feature request with 👍 or be against with 👎. (Remember that developers are busy and cannot respond to all feature requests, so vote for your most favorable one!)

  3. Tell us that you would like to help implement one of the features in the list or review the PRs. (This is the greatest thing to hear!)
