Comments (9)

GICodeWarrior avatar GICodeWarrior commented on September 25, 2024

Thanks! I will test this out.

What character set does this cover?

Can you share the scripts / process you used to generate these?

from fir.

Kubuxu avatar Kubuxu commented on September 25, 2024

I used https://github.com/Shreeshrii/tess5train-fonts/
It requires Tesseract 5 binaries and a few workarounds to get running.
The command I ran:
bash finetune_font.sh eng Latin eng engJost FineTune ' "Jost* Medium" "Jost*" "Jost* Light" ' ' "Jost* Medium" "Jost*" "Jost* Light" ' 0 9999 2 | tee data/logs/engJost.log

It covers https://github.com/Shreeshrii/tess5train-fonts/blob/main/data/langdata/engImpact-eval.training_text, so I think it covers the majority of ASCII and some of the most common Unicode characters.
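
One rough way to answer the character set question is to count which characters actually appear in a training text. The sketch below is only an illustration: `charset_summary` is a hypothetical helper, not part of tess5train-fonts or fir, and the sample string stands in for the real training text file.

```python
import string


def charset_summary(text: str) -> dict:
    """Summarize which non-whitespace characters appear in a training text.

    Hypothetical helper for illustration; not part of any project named here.
    """
    chars = {c for c in text if not c.isspace()}
    # Printable ASCII without whitespace: digits, letters, punctuation.
    ascii_printable = set(string.printable) - set(string.whitespace)
    return {
        "ascii_covered": sorted(chars & ascii_printable),
        "ascii_missing": sorted(ascii_printable - chars),
        "non_ascii": sorted(c for c in chars if ord(c) > 127),
    }


# Made-up sample; in practice, read the training text file instead.
sample = "Storage Depot 27ABC-I01 costs €5"
summary = charset_summary(sample)
print("ASCII covered:", "".join(summary["ascii_covered"]))
print("Non-ASCII:", "".join(summary["non_ascii"]))
```

Running this over the actual `.training_text` file would show exactly which ASCII characters are missing rather than relying on an estimate.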

GICodeWarrior avatar GICodeWarrior commented on September 25, 2024

Thanks!

Do you have some example screenshots where fir has returned incorrect results? I'd like to add them to my test cases.

For the training data, it would be cool to generate some based on text from the Foxhole translations files. It would be great to recognize the stockpile types in each different language.

Also, Foxhole uses Renner primarily. I believe Jost was a previous font and is similar.

Kubuxu avatar Kubuxu commented on September 25, 2024

I don't have any screenshots for fir returning wrong results. I haven't worked with fir much (yet).
I was planning to use Tesseract for scanning stockpile logs, but it was messing up, even with perfectly clean text. The stockpile logs turned out to be limited in length, which makes them useless for my purpose (logistics tracking, consumption tracking, and forecasting).

This is how I stumbled onto fir through a dev in FMAT and an exchange officer between SPUD and 27th.

I can run the fine-tuning on Renner. If you have a list of strings you want it to detect better, send it my way. I can fine-tune it on that in addition.

Kubuxu avatar Kubuxu commented on September 25, 2024

I have one example where the stockpile name detection (thus Tesseract) makes a small mistake. It adds a space between a T and the following 1. Example screenshot:
War-Win64-Shipping_2023-03-22_13-58-43
Using the engJost-integer model parses it correctly. It is a minor issue because it reliably inserts that additional space.
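
Because the extra space is inserted reliably, it could also be stripped in post-processing instead of retraining. The following is a minimal sketch under that assumption; `collapse_letter_digit_gap` is a hypothetical helper, not something fir provides, and the sample names are made up.

```python
import re

# Spaces that sit between an uppercase letter and a following digit.
_LETTER_SPACE_DIGIT = re.compile(r"(?<=[A-Z]) +(?=\d)")


def collapse_letter_digit_gap(name: str) -> str:
    """Remove spaces between an uppercase letter and the digit after it.

    Hypothetical post-processing step; assumes no legitimate name
    contains such a space.
    """
    return _LETTER_SPACE_DIGIT.sub("", name)


print(collapse_letter_digit_gap("ALT 1"))
```

The trade-off is that a stockpile legitimately named with a space before a digit would be altered too, so a fine-tuned model is the cleaner fix.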

Kubuxu avatar Kubuxu commented on September 25, 2024

Also, Renner was renamed to Jost: https://www.dafont.com/renner.font

GICodeWarrior avatar GICodeWarrior commented on September 25, 2024

Thanks, this is all helpful. Stockpile name recognition accuracy has been a recurring issue.

This is something I'd like to work on, but it will be a few weeks before I have time to dig in.

Kubuxu avatar Kubuxu commented on September 25, 2024

I have run a better fine-tune for the 27th's stockpile naming (plus ASCII translations of storage depot and seaport). The result is here:
engJost-final3.traineddata.gz
IDK how well it will work with the Sundial's naming scheme. Ours is 27[A-Z]{3,4}-[IOBAEF]\d\d
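
For anyone who wants to sanity-check OCR output against that scheme, here is a small sketch. `is_valid_stockpile_name` is a hypothetical helper and the sample names are made up; only the pattern itself comes from the scheme quoted above.

```python
import re

# Pattern as quoted above; re.ASCII keeps \d limited to 0-9.
STOCKPILE_NAME = re.compile(r"27[A-Z]{3,4}-[IOBAEF]\d\d", re.ASCII)


def is_valid_stockpile_name(name: str) -> bool:
    """Return True if the whole string matches the naming scheme."""
    return STOCKPILE_NAME.fullmatch(name) is not None


# Made-up examples: the second one has a stray space from OCR.
for candidate in ["27ABC-I01", "27ABC -I01", "27ABCD-F99"]:
    print(candidate, is_valid_stockpile_name(candidate))
```

Using `fullmatch` rather than `search` means a stray space or extra character anywhere in the name counts as a failure, which is the kind of OCR error discussed in this thread.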

GICodeWarrior avatar GICodeWarrior commented on September 25, 2024

This has been integrated and solves all currently known stockpile name recognition errors.

Thanks again!
