
Comments (3)

devongovett commented on August 21, 2024

Thanks for reporting! The problem is that I am currently delimiting words by space characters. Is there such a delimiter for Chinese text, or how does line wrapping work there? I'm not really knowledgeable enough about languages like this to do a correct implementation, so any help is appreciated! :)

from pdfkit.

jacksctsai commented on August 21, 2024

I'm no professional when it comes to publishing, but some experience with word processors might be helpful.
Chinese text has thousands of characters, and each character carries its own meaning. So it is basically safe to wrap before any Chinese character that would otherwise overflow the line width. There are better ways to handle it, though (for example, taking Chinese punctuation marks into account so that a sentence stays complete).
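
To make the basic rule concrete, here is a minimal sketch of wrapping before the character that would overflow. It is not PDFKit code: `wrapByCharacter` and the `measure` callback are hypothetical names, and the fixed width of 10 units per character is only an assumption for the example.

```javascript
// Greedy wrap: start a new line before any character that would overflow.
// `measure` is a hypothetical callback returning the width of one character.
function wrapByCharacter(text, maxWidth, measure) {
  const lines = [];
  let line = "";
  let lineWidth = 0;
  // Iterate by code point so characters outside the BMP stay intact.
  for (const ch of text) {
    const w = measure(ch);
    // Only break if the line already holds something, so an over-wide
    // single character still gets a line of its own.
    if (line !== "" && lineWidth + w > maxWidth) {
      lines.push(line);
      line = "";
      lineWidth = 0;
    }
    line += ch;
    lineWidth += w;
  }
  if (line !== "") lines.push(line);
  return lines;
}

// Example: assume every character is 10 units wide and the line holds 35.
const wrappedLines = wrapByCharacter("中文排版需要自動換行", 35, () => 10);
```

This ignores punctuation and mixed Latin text entirely, so it only covers the simple all-Chinese paragraph case.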

Actually, I tried a revised regular expression (WORD_RE) that follows this basic wrapping rule. Unfortunately, it didn't work: it seems the width of each character cannot be calculated correctly.
Since I'm not very familiar with fonts (I believe the width is closely tied to the font definition), I just hard-coded a static width for whenever it failed to get the width of a specific character (that is, from @charWidths). That eventually works for simple paragraphs. For more complicated cases that mix in English characters, punctuation marks, etc., other problems arise.
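
Roughly, the fallback idea looks like the sketch below. This is an illustration, not the actual change: `knownWidths` stands in for a table like @charWidths, the width values are made up, and the one-em fallback is an assumption that happens to suit many CJK fonts but is not guaranteed.

```javascript
// Hypothetical per-character advance widths in 1/1000 em units.
// The numbers are illustrative, not taken from a real font.
const knownWidths = new Map([
  ["A", 600],
  ["i", 250],
  [" ", 250],
]);

// Assumed fallback: treat any unmeasured character as one full em wide.
const FALLBACK_WIDTH = 1000;

// Look up one character, falling back to the static width when it is missing.
function charWidth(ch, widths = knownWidths, fallback = FALLBACK_WIDTH) {
  const w = widths.get(ch);
  return w === undefined ? fallback : w;
}

// Sum the widths of all characters and scale to the given font size.
function stringWidth(text, fontSize) {
  let total = 0;
  for (const ch of text) total += charWidth(ch);
  return (total * fontSize) / 1000;
}
```

A static fallback like this is why mixed text goes wrong: any Latin letter or punctuation mark missing from the table would be measured as a full em too.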


devongovett commented on August 21, 2024

Whoa, this is an old issue! Glad to finally get back to it.

I just spent the day implementing the Unicode Line Breaking Algorithm as a separate module, and integrated it into PDFKit. It replaces the existing line wrapper and should solve a huge number of issues. The regular expression based word matching from before was not good. It was overzealous and caused the most bugs of anything in PDFKit. It also didn't work at all for languages like Chinese which don't have spaces between words.

The new wrapping algorithm supports all of this. It should fix this issue. Let me know if you see any problems with it!
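
For anyone curious what "break opportunities" means here, below is a heavily simplified sketch. It is not the module's code and not the full Unicode Line Breaking Algorithm (UAX #14): it only imitates a handful of its rules, and `breakOpportunities`, the character ranges, and the punctuation lists are all assumptions chosen for illustration.

```javascript
// Simplified stand-in for a few UAX #14 rules; NOT the full algorithm.
const CJK = /[\u3400-\u4DBF\u4E00-\u9FFF]/;        // subset of CJK ideograph ranges
const NO_BREAK_BEFORE = /[，。、！？；：）」』]/;  // closing punctuation
const NO_BREAK_AFTER = /[（「『]/;                 // opening punctuation

// Returns the UTF-16 offsets at which a new line may start.
function breakOpportunities(text) {
  const chars = Array.from(text); // split by code point
  const breaks = [];
  let index = 0;
  for (let i = 0; i < chars.length - 1; i++) {
    const prev = chars[i];
    const next = chars[i + 1];
    index += prev.length;
    if (next === " ") continue; // never break before a space
    if (NO_BREAK_BEFORE.test(next) || NO_BREAK_AFTER.test(prev)) continue;
    // Allow a break after a space, or on either side of an ideograph.
    if (prev === " " || CJK.test(prev) || CJK.test(next)) breaks.push(index);
  }
  return breaks;
}

const latinBreaks = breakOpportunities("hello world"); // [6]
const cjkBreaks = breakOpportunities("你好，世界");     // [1, 3, 4]
```

Note how the comma in the second example stays attached to the character before it, which is the punctuation handling raised earlier in the thread. A line wrapper would then measure the text between these offsets and pick the last one that still fits.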

