Fudan OCR

Fudan OCR is a platform for OCR researchers that integrates several OCR modules and models. Users can train the existing models on this platform or implement their own models with lightweight code. The platform supports a range of OCR tasks, such as word detection, word recognition, and text super-resolution.

Introduction to Each Module

FudanOCR is divided into several parts. The following is a brief introduction to each module. For details, see the README file under each directory.

  • alphabet: Alphabet module. Use it to create an alphabet object.
  • config: Config module. Provides configuration file templates; fill in new configuration files as needed.
  • data: Data module. Contains functions for obtaining datasets, building data loaders, and preprocessing data.
  • engine: Engine module (important!). Use this module to initialize a new training environment, then define a subclass of Trainer to train your model.
  • logger: Log module. Handles logging operations.
  • model: Model module. Contains several existing detection and recognition models, as well as Fudan's previous OCR technical reports.
  • component: Component module. Stores small components for building models.
  • utils: Utility module. Stores useful helper functions.
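
The Usage section reads configuration values with attribute-style access such as opt.BASE.MODEL. The sketch below shows one way such access can be built from a nested dictionary. It is only an illustration: the helper to_namespace and the values in raw_config are hypothetical, and the real config module may load and expose its options differently.

```python
from types import SimpleNamespace


def to_namespace(value):
    """Recursively convert nested dicts into attribute-style namespaces."""
    if isinstance(value, dict):
        return SimpleNamespace(**{k: to_namespace(v) for k, v in value.items()})
    return value


# Hypothetical values. Only the key names BASE.MODEL and FUNCTION.FINETUNE
# appear in this README; everything else here is assumed for illustration.
raw_config = {
    "BASE": {"MODEL": "MORAN"},
    "FUNCTION": {"FINETUNE": False},
}

opt = to_namespace(raw_config)
print(opt.BASE.MODEL)         # MORAN
print(opt.FUNCTION.FINETUNE)  # False
```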

Dependencies

This section is still under construction. If you use an existing model in /model, see /model/document to configure the appropriate environment.

If you want to use the framework to implement your own model, we recommend Python 3.6+ and torch>=1.2.0, plus the other packages listed in requirements.txt :)
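
A quick way to check the recommended versions before training is sketched below. The helpers parse_version and meets_minimum are hypothetical and not part of this project. They assume simple dotted version strings, so treat the result as a convenience check rather than a full version parser.

```python
import sys


def parse_version(text):
    """Turn '1.2.0' or '1.13.1+cu117' into a tuple of ints such as (1, 13, 1)."""
    parts = []
    for piece in text.split("+")[0].split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Return True if the installed version is at least the minimum."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so that '1.2' compares equal to '1.2.0'.
    a = a + (0,) * (width - len(a))
    b = b + (0,) * (width - len(b))
    return a >= b


print("Python 3.6+:", sys.version_info >= (3, 6))

try:
    import torch
    print("torch>=1.2.0:", meets_minimum(torch.__version__, "1.2.0"))
except ImportError:
    print("torch is not installed")
```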

Usage

  • 1 Import related packages in main.py
# -*- coding:utf-8 -*-
from engine.trainer import Trainer
from engine.env import Env
from data.build import build_dataloader
# The following import can be omitted. See /model/modelDict for more details.
from model.recognition_model.MORAN_V2.models.moran import newMORAN
  • 2 Define a subclass of Trainer, then override its methods as needed.
from engine.trainer import Trainer
class XXNET_Trainer(Trainer):
    def __init__(self, modelObject, opt, train_loader, val_loader):
        Trainer.__init__(self, modelObject, opt, train_loader, val_loader)

    def pretreatment(self, data):
        '''You need to override this method'''
        pass

    def posttreatment(self, modelResult, pretreatmentData, originData, test=False):
        '''You need to override this method'''
        pass

    def finetune(self):
        '''You need to override this method if you set opt.FUNCTION.FINETUNE to True'''
        pass
  • 3 Create a new training environment.
env = Env()
opt = env.getOpt()
# Use opt to read parameters from the config file, for example:
# print(opt.BASE.MODEL)
  • 4 Get a dataloader from the data module.
train_loader, test_loader = build_dataloader(env.opt)
  • 5 Train your model. (If you have registered your model in /model/modelDict.py, you can use model = env.model for convenience.)
model = newMORAN
newTrainer = XXNET_Trainer(modelObject=model, opt=env.opt, train_loader=train_loader, val_loader=test_loader)
newTrainer.train()
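
To make the role of the overridable methods in step 2 more concrete, here is a self-contained toy sketch of the hook pattern. It does not use the real engine.trainer.Trainer. The SketchTrainer class, the order in which it calls the hooks, and the toy model and data are all assumptions made for illustration, since this README does not show the actual training loop.

```python
class SketchTrainer:
    """Hypothetical stand-in for engine.trainer.Trainer; not the real class."""

    def __init__(self, modelObject, opt, train_loader, val_loader):
        self.model = modelObject
        self.opt = opt
        self.train_loader = train_loader
        self.val_loader = val_loader

    def pretreatment(self, data):
        raise NotImplementedError

    def posttreatment(self, modelResult, pretreatmentData, originData, test=False):
        raise NotImplementedError

    def train(self):
        """Assumed loop: prepare each batch, run the model, post-process."""
        losses = []
        for batch in self.train_loader:
            prepared = self.pretreatment(batch)
            result = self.model(prepared)
            losses.append(self.posttreatment(result, prepared, batch))
        return losses


class Toy_Trainer(SketchTrainer):
    def pretreatment(self, data):
        # Keep only the model input from an (input, target) pair.
        inputs, _ = data
        return inputs

    def posttreatment(self, modelResult, pretreatmentData, originData, test=False):
        # Compare the model output with the target as a simple absolute error.
        _, target = originData
        return abs(modelResult - target)


def toy_model(x):
    """Toy 'model' that doubles its input."""
    return 2 * x


toy_loader = [(1, 2), (2, 5), (3, 6)]  # (input, target) pairs
trainer = Toy_Trainer(toy_model, opt=None, train_loader=toy_loader, val_loader=toy_loader)
losses = trainer.train()
print(losses)  # [0, 1, 0]
```

In this sketch, the subclass only decides how a batch is prepared and how the model output is turned into a loss; the loop itself stays in the base class.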

R & D team

This project was developed by students of Fudan University. The leader is Jingye Chen, and the other members of this team are Xiaocong Wang, Siyu Miao, Huafeng Shi, and Peiyao Zhang. The supervisors of this team are Bin Li and Xiangyang Xue.

😄 We are very grateful to everyone who helped us with this project. If you have new ideas or suggestions for this project, pull requests are welcome :)
