
Comments (8)

Antonoko commented on May 18, 2024 2

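todo:
[x] Add a policy option to disable downscaling during screen recording, so that users with high-DPI displays can record video files at full resolution;
[ ] Expose an OCR interface, add more OCR options, allow users to plug in custom OCR engines, and add and refine a benchmark comparison tool;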

from windrecorder.

ASC8384 commented on May 18, 2024 2

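A friend in the industry (who works on OCR) recommended PaddleOCR and EasyOCR.
In my tests, chineseOCRlite is far more accurate than the Windows built-in OCR.
GPU inference might also be worth considering.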


Antonoko commented on May 18, 2024 2

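> [image]
> I tried calling WeChat's built-in OCR in place of Windows.Media.Ocr.Cli, and the accuracy improved dramatically. I haven't tested performance yet, and I haven't added a button in the front end. I just pushed the update to a branch of my fork (https://github.com/B1lli/Windrecorder/tree/dev ). If @Antonoko is willing, it could be added to the front-end page as an optional alternative OCR.

Great! Configuration for a custom OCR interface should land by around version 0.2.0, and this OCR method can be included as one of the options~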


linchuanXu commented on May 18, 2024 1

https://sspai.com/prime/story/rewind-diy

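This person is also reproducing Rewind. The OCR approach they use is:

Recognizing text and compressing screenshot size: OCRmyPDF
sspai (少数派) previously published an article on how to use OCRmyPDF to search for text in scanned PDFs. This article follows the usage described there; the only additional option is --optimize 3. According to the documentation, this applies fairly aggressive lossy compression to images, which suits archiving screenshots where "legible is good enough".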
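
As a rough illustration of the invocation described above, here is a minimal sketch that assembles an OCRmyPDF command line with `--optimize 3`. The helper names `build_ocrmypdf_command` and `run_ocrmypdf`, the file names, and the `chi_sim+eng` language choice are assumptions for illustration and do not come from the linked article; the language packs must be installed in Tesseract for `-l` to work.

```python
import shutil
import subprocess


def build_ocrmypdf_command(src, dst, optimize=3, languages=("chi_sim", "eng")):
    """Return the argument list for one OCRmyPDF run.

    optimize  -> --optimize (level 3 is the setting quoted above)
    languages -> -l, joined with "+" (an assumption, not from the article)
    """
    if optimize not in (0, 1, 2, 3):
        raise ValueError("optimize must be 0, 1, 2 or 3")
    return [
        "ocrmypdf",
        "--optimize", str(optimize),
        "-l", "+".join(languages),
        str(src),
        str(dst),
    ]


def run_ocrmypdf(src, dst, **kwargs):
    """Run OCRmyPDF if it is on PATH; return True only when it exits with 0."""
    if shutil.which("ocrmypdf") is None:
        return False  # OCRmyPDF is not installed, nothing was run
    cmd = build_ocrmypdf_command(src, dst, **kwargs)
    return subprocess.run(cmd).returncode == 0
```

Keeping the command construction separate from the execution makes it possible to inspect or log the exact arguments before anything is run.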


linchuanXu commented on May 18, 2024 1

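I changed my local code to call chineseOCRlite and deleted all the existing OCR data from the database, and the results are much better! It is worth considering for small or blurry text.

When using chineseOCRlite, adding the following at line 25 of crnn.py avoids a flood of ONNX warnings: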
rt.set_default_logger_severity(3)


linchuanXu commented on May 18, 2024 1

https://cnocr.readthedocs.io/zh/latest/models/

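I took a look at cnocr. It is very flexible: CPU, GPU, and model are all configurable, and the results are good. Setting up the environment is a hassle, though.

Exposing an interface would still be the best option.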
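
To make the idea of an exposed interface concrete, here is one possible shape for it: a small registry in which every OCR backend is just a callable from an image path to text. This is only a sketch under assumptions. The names `register_engine`, `available_engines`, and `recognize` are hypothetical and are not Windrecorder's actual API, and `dummy_engine` is a placeholder standing in for a real backend such as cnocr or chineseOCRlite.

```python
from typing import Callable, Dict, List

# An engine is any callable that takes an image path and returns the text found.
OcrEngine = Callable[[str], str]

_ENGINES: Dict[str, OcrEngine] = {}


def register_engine(name: str):
    """Decorator that registers an OCR backend under a unique name."""
    def decorator(func: OcrEngine) -> OcrEngine:
        if name in _ENGINES:
            raise ValueError(f"engine already registered: {name}")
        _ENGINES[name] = func
        return func
    return decorator


def available_engines() -> List[str]:
    """Names that a settings page could offer as choices."""
    return sorted(_ENGINES)


def recognize(image_path: str, engine: str) -> str:
    """Dispatch one image to the engine selected by name."""
    if engine not in _ENGINES:
        raise ValueError(f"unknown OCR engine: {engine}")
    return _ENGINES[engine](image_path)


@register_engine("dummy")
def dummy_engine(image_path: str) -> str:
    # Placeholder backend: a real one would load the image and run a model.
    return f"text from {image_path}"
```

With this layout, a user-supplied engine only has to provide one function and one decorator line, and heavy dependencies stay inside the backend that needs them.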


Antonoko commented on May 18, 2024

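The main reason chineseOCRlite is currently disabled is its poor efficiency (it consumes more compute and is also slower), while its accuracy on the same input images is close to that of the system's built-in OCR. On the same 15-minute video slice, OCR with chineseOCRlite took about 8 minutes, whereas Windows.Media.Ocr.Cli took just under 3 minutes.

The low accuracy is probably because the recording resolution is fairly low, which in turn lowers the accuracy of OCR on those frames. See this discussion:
#9 (comment)

(My screen scaling is set fairly high, so I hadn't really noticed the accuracy problem... In the next version we will add an option to turn off the resolution-downscaling strategy 🤯. Recording at the original resolution should noticeably improve OCR accuracy.)

We will take a look at OCRmyPDF too! In the future we may also add options such as PaddleOCR and benchmark them so users can choose 🤔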
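
A comparison like the one above (elapsed time and accuracy on the same input) could be scripted roughly as follows. This is a minimal sketch, not the project's benchmark tool: the `benchmark` function is hypothetical, engines are assumed to be plain callables from an image to text, and accuracy is approximated by character-level similarity from `difflib`, which is an assumption about how one might score results.

```python
import time
from difflib import SequenceMatcher


def benchmark(engines, samples):
    """Time each engine on the same samples and score its output.

    engines: dict mapping a name to a callable(image) -> text
    samples: list of (image, expected_text) pairs
    Returns a dict mapping each name to {"seconds": ..., "similarity": ...}.
    """
    results = {}
    for name, engine in engines.items():
        start = time.perf_counter()
        outputs = [engine(image) for image, _ in samples]
        elapsed = time.perf_counter() - start

        # Character-level similarity in [0, 1] against the reference text.
        scores = [
            SequenceMatcher(None, output, expected).ratio()
            for output, (_, expected) in zip(outputs, samples)
        ]
        results[name] = {
            "seconds": elapsed,
            "similarity": sum(scores) / len(scores) if scores else 0.0,
        }
    return results
```

Running every engine on identical samples keeps the timing and accuracy figures comparable, which matters when the trade-off is between a faster built-in OCR and a slower third-party one.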


B1lli commented on May 18, 2024

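[image]
I tried calling WeChat's built-in OCR in place of Windows.Media.Ocr.Cli, and the accuracy improved dramatically. I haven't tested performance yet, and I haven't added a button in the front end. I just pushed the update to a branch of my fork (https://github.com/B1lli/Windrecorder/tree/dev ). If @Antonoko is willing, it could be added to the front-end page as an optional alternative OCR.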

