As the current version (0.1.5), it is able to put asian (zh-tw in my case) language ch

Support for Asian languages (pdfkit issue, 3 comments, closed)

foliojs commented on August 21, 2024

Support for asian languages

from pdfkit.

Comments (3)

devongovett commented on August 21, 2024

Thanks for reporting! The problem is that I currently delimit words by space characters. Is there an equivalent delimiter for Chinese text, or how does line wrapping work there? I'm not knowledgeable enough about languages like this to do a correct implementation, so any help is appreciated! :)
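To illustrate the problem described above, here is a minimal sketch of space-delimited word splitting. The helper name `splitOnSpaces` is hypothetical and is not PDFKit's actual code; it only shows why a space-based splitter finds no wrap points in Chinese text.

```javascript
// Hypothetical helper (not PDFKit's API): treat spaces as the only word delimiter.
function splitOnSpaces(text) {
  return text.split(" ").filter((word) => word.length > 0);
}

const englishWords = splitOnSpaces("The quick brown fox");
const chineseWords = splitOnSpaces("這是一段沒有空格的中文文字");

console.log(englishWords.length); // 4 wrap candidates
console.log(chineseWords.length); // 1: the whole sentence is a single unbreakable "word"
```

With only one candidate, a wrapper built on this split has nowhere to break the Chinese sentence, so the line overflows.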


jacksctsai commented on August 21, 2024

I'm no professional in publishing, but some experience with word processors might be helpful.
Chinese text has thousands of characters, and each character carries its own meaning. So it is basically safe to wrap before any Chinese character that would otherwise run past the line width. There are better ways to handle it, though (for example, taking Chinese punctuation marks into account so that sentences stay intact).

Actually, I tried a revised regular expression (WORD_RE) that follows this basic wrapping rule. Unfortunately, it didn't work: it seems the width of each character cannot be calculated correctly.
Since I'm not very familiar with fonts (I believe the width depends heavily on the font definition), I just hard-coded a static width for cases where the width of a specific character could not be found (that is, in @charWidths). That eventually works for simple paragraphs. In more complicated cases that mix in English characters, punctuation marks, etc., other problems arise.
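The workaround described above could look roughly like the following sketch. Everything here is an assumption for illustration: the `knownWidths` table, the `FALLBACK_WIDTH` value, and the function names are hypothetical, and real widths would come from the font's metrics rather than a hand-written map.

```javascript
// Hypothetical width table in arbitrary units; stands in for real font metrics.
const knownWidths = new Map([["a", 5], ["b", 5], ["c", 5], [" ", 3]]);
const FALLBACK_WIDTH = 10; // assumed static width for characters missing from the table

function widthOf(ch) {
  return knownWidths.has(ch) ? knownWidths.get(ch) : FALLBACK_WIDTH;
}

// Greedy wrap that permits a break before any character.
function wrapByCharacter(text, maxWidth) {
  const lines = [];
  let line = "";
  let lineWidth = 0;
  for (const ch of text) { // for...of iterates by code point, not UTF-16 unit
    const w = widthOf(ch);
    if (line !== "" && lineWidth + w > maxWidth) {
      lines.push(line); // current character would overflow, so start a new line
      line = "";
      lineWidth = 0;
    }
    line += ch;
    lineWidth += w;
  }
  if (line !== "") lines.push(line);
  return lines;
}

const wrappedLines = wrapByCharacter("這是一段中文文字", 30);
console.log(wrappedLines); // [ '這是一', '段中文', '文字' ]
```

Breaking before any character is only reasonable for pure Chinese text. Applied to mixed content it would split English words mid-word and could leave punctuation at the start of a line, which matches the problems mentioned for more complicated cases.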


devongovett commented on August 21, 2024

Whoa, this is an old issue! Glad to finally get back to it.

I just spent the day implementing the Unicode Line Breaking Algorithm as a separate module, and integrated it into PDFKit. It replaces the existing line wrapper and should solve a huge number of issues. The regular-expression-based word matching from before was not good. It was overzealous and caused more bugs than anything else in PDFKit. It also didn't work at all for languages like Chinese, which don't have spaces between words.

The new wrapping algorithm supports all of this. It should fix this issue. Let me know if you see any problems with it!
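For readers unfamiliar with the idea of break opportunities, here is a heavily simplified sketch. It is not the Unicode Line Breaking Algorithm and not the module mentioned above: it hard-codes only a handful of rules (break after spaces, around Han ideographs, never before closing punctuation, never after opening punctuation), and the function name and punctuation sets are assumptions chosen for illustration.

```javascript
// Simplified illustration only; the real algorithm uses many more character classes and rules.
const isIdeograph = (ch) => /\p{Script=Han}/u.test(ch);
const CLOSING = new Set(["，", "。", "、", "！", "？", "」", "）"]); // never break before these
const OPENING = new Set(["「", "（"]); // never break after these

// Returns code-point indices i where a line break is allowed before character i.
function breakOpportunities(text) {
  const chars = Array.from(text);
  const positions = [];
  for (let i = 1; i < chars.length; i++) {
    const prev = chars[i - 1];
    const next = chars[i];
    if (CLOSING.has(next) || OPENING.has(prev)) continue;
    if (next === " ") continue; // keep spaces attached to the preceding text
    if (prev === " " || CLOSING.has(prev) || isIdeograph(prev) || isIdeograph(next)) {
      positions.push(i);
    }
  }
  return positions;
}

const opportunities = breakOpportunities("PDF 支援中文。OK");
console.log(opportunities); // [ 4, 5, 6, 7, 9 ]
```

In this example no break is offered inside "PDF" or "OK", and none before the full stop "。", so the punctuation stays with the sentence it closes. A wrapper would then measure the text between consecutive opportunities and break at the last one that still fits the line.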

