Comments (5)

UserWangZz commented on July 18, 2024

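Question 1. A dataset of 500 images is rather small for a recognition task.
Question 2. Fine-tuning a pretrained model on pinyin data is feasible. For the situation described in the question, consider increasing the amount of data. The upcoming PPChatOCR v3 will bring a new mixed-data training mechanism that can, to some extent, preserve the model's accuracy on general datasets while keeping workable accuracy on domain-specific datasets.
Question 3. Based on the information provided, it looks like the recognition dictionary may not match.
Question 4. In the sample image, the grid of the composition paper may have interfered with the detection model, leading to poor recognition. It is also possible that detection was fine but the vertical grid lines lowered the recognition model's accuracy, so the result ended up being filtered out.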
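
The "filtered out" case in Question 4 can be pictured with a small sketch. It assumes that recognition results are (text, score) pairs and that low-confidence results are dropped by a score threshold; the function name, the sample results, and the 0.5 threshold are illustrative assumptions, not values taken from this thread.

```python
def filter_by_score(results, min_score=0.5):
    """Keep only (text, score) pairs whose confidence reaches min_score."""
    return [(text, score) for text, score in results if score >= min_score]


# Hypothetical recognition outputs: the second line is imagined to be
# degraded by grid lines, so its confidence falls below the threshold.
results = [("hello", 0.93), ("wor1d", 0.31)]
kept = filter_by_score(results)
print(kept)
```

If a line is detected but missing from the final output, comparing its raw score against the threshold is one way to tell this case apart from a detection failure.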

from paddleocr.

UserWangZz commented on July 18, 2024

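@UserWangZz Understood, the dataset needs to be larger. And now that PPChatOCR v3 has been released, I'll give it a try.

The dictionary I used is ppocr_keys_v1.txt, to which I added a few tone-marked letters that were not in it. So that changed the indices, and that is why the model's predictions were not as expected? Does that mean extra dictionary characters can only be appended at the end?

Yes. The model was trained against this dictionary, so if the indices change, the predictions will be wrong.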
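
A minimal sketch of why the position of a new character matters. It assumes a CTC-style decoder in which index 0 is the blank token and each dictionary line maps to the next index; the four-character dictionary is a hypothetical stand-in for ppocr_keys_v1.txt.

```python
# Hypothetical mini-dictionary standing in for ppocr_keys_v1.txt
# (one character per line in the real file).
base_chars = ["a", "b", "c", "d"]


def build_index(chars):
    """Map each character to a class index; index 0 is assumed to be the CTC blank."""
    return {ch: i for i, ch in enumerate(chars, start=1)}


def shifted(old_chars, new_chars):
    """Return the existing characters whose index differs between two dictionaries."""
    old, new = build_index(old_chars), build_index(new_chars)
    return sorted(ch for ch in old if old[ch] != new[ch])


inserted = base_chars[:2] + ["ā"] + base_chars[2:]  # new character in the middle
appended = base_chars + ["ā"]                       # new character at the end

print(shifted(base_chars, inserted))  # characters after the insertion point move
print(shifted(base_chars, appended))  # no existing character moves
```

Appending keeps every existing character on the index the pretrained weights expect. It does not by itself give the new character a trained output.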

UserWangZz commented on July 18, 2024

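@UserWangZz I tested beizhu.png above with v3 and the unmodified ppocr_keys_v1.txt, and it is now recognized correctly! [image]

But as soon as I add custom characters to ppocr_keys_v1.txt, the recognition results become very strange. Is this normal? (The characters were already appended at the end.) [image] [image]

This happens because the enlarged dictionary no longer matches the dimension of the model's final FC layer. A simple fine-tune of the model should fix it. If you are able to, you can try freezing the model parameters and updating only the FC layer.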
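
The mismatch can be sketched in plain Python. This assumes a CTC recognition head whose output width is the dictionary size plus one blank class, plus one more when a space character is enabled. The dictionary sizes and parameter names below are hypothetical, and the `head.` prefix is only an illustration of how the final layer's parameters might be selected.

```python
def num_classes(dict_size, use_space_char=False):
    """Output width assumed for a CTC head: characters + blank (+ optional space)."""
    return dict_size + 1 + (1 if use_space_char else 0)


def head_matches(trained_dict_size, current_dict_size, use_space_char=False):
    """True when the trained head width still fits the current dictionary."""
    return (num_classes(trained_dict_size, use_space_char)
            == num_classes(current_dict_size, use_space_char))


def trainable_names(param_names, head_prefix="head."):
    """Pick the parameters to keep trainable; everything else would be frozen."""
    return [name for name in param_names if name.startswith(head_prefix)]


# Hypothetical sizes: a dictionary of 100 characters extended by 3 new ones.
print(head_matches(100, 100))  # same dictionary, head still fits
print(head_matches(100, 103))  # extended dictionary, head is too narrow

# Hypothetical parameter names for a recognition model.
names = ["backbone.conv1.weight", "neck.rnn.weight", "head.fc.weight", "head.fc.bias"]
print(trainable_names(names))
```

In Paddle, freezing is typically done by setting `stop_gradient = True` on the parameters that should not be updated. Which parameters belong to the final layer depends on the model configuration, so the prefix above would need to be checked against the actual model.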

robotJie commented on July 18, 2024

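@UserWangZz Understood, the dataset needs to be larger. And now that PPChatOCR v3 has been released, I'll give it a try.

The dictionary I used is ppocr_keys_v1.txt, to which I added a few tone-marked letters that were not in it. So that changed the indices, and that is why the model's predictions were not as expected?
Does that mean extra dictionary characters can only be appended at the end?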

robotJie commented on July 18, 2024

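@UserWangZz I tested beizhu.png above with v3 and the unmodified ppocr_keys_v1.txt, and it is now recognized correctly!
[image]

But as soon as I add custom characters to ppocr_keys_v1.txt, the recognition results become very strange. Is this normal? (The characters were already appended at the end.)
[image]
[image]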
