[reportlab-users] MediaWiki's "Download as PDF" feature uses ReportLab but has a problem

Yao Ziyuan yaoziyuan at gmail.com
Tue Jan 10 12:23:59 EST 2012


On Wed, Jan 11, 2012 at 1:10 AM, Andy Robinson <andy at reportlab.com> wrote:

> On 10 January 2012 16:59, Yao Ziyuan <yaoziyuan at gmail.com> wrote:
>> I'm not familiar with Python. But I have a simple way for ReportLab to
>> process CJK line-wrapping transparently:
>>
>> Before everything, for every CJK character found in the text, insert a
>> U+200B ("zero-width space") after it. This will logically make every
>> CJK character a possible line-wrapping point.
>>
>> Then, recognize U+200B as a kind of whitespace in ReportLab's non-CJK
>> line-wrapping code.
>>
>
> That's clever!  Thank you for this. I'll trust you that this works for
> Chinese, which unfortunately I don't speak/read/write.
>
> For Japanese, which I do know quite well, NOT every character is a
> good wrap point, and there are quite sophisticated rules about
> characters which should not begin or end a line.  Our present
> algorithm is really a "Japanese wrapping", not "CJK".
>
> The right answer is still probably a unicode-based algorithm for all
> languages.  I wish I had more time to work on it.


OK, I just found these links, which give useful background on CJK word wrapping:
http://en.wikipedia.org/wiki/Word_wrap#Word_wrapping_in_text_containing_Chinese.2C_Japanese.2C_and_Korean
http://en.wikipedia.org/wiki/Line_breaking_rules_in_East_Asian_language

However, these links mention that no word processor really takes all
of these sophisticated rules into account. So instead of pursuing
perfection, ReportLab can simply stick to the most basic rule: allow a
line break either after whitespace or after a CJK character. If
ReportLab does want to implement all of the sophisticated rules, I
suggest reusing an existing open-source Unicode line-breaking library
instead of reinventing the wheel.
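
To make the basic rule concrete, here is a rough sketch in Python. It is
not ReportLab code and does not use any ReportLab API: the function
names, the character ranges treated as "CJK", and measuring width by
character count (rather than by font metrics) are all my own
assumptions. It also ignores the Japanese rules Andy mentions about
characters that must not begin or end a line.

```python
import re

ZWSP = "\u200b"  # U+200B, zero-width space

# Assumption: a deliberately rough definition of "CJK character"
# (kana, CJK Unified Ideographs incl. Extension A, Hangul syllables).
CJK_CHAR = re.compile("[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7a3]")

# A token is an unbreakable run plus the break character that ends it,
# or a lone break character (whitespace or U+200B).
TOKEN = re.compile("[^\\s\u200b]+[\\s\u200b]?|[\\s\u200b]")


def mark_break_points(text):
    """Insert U+200B after every CJK character (step 1 of the proposal)."""
    return CJK_CHAR.sub(lambda m: m.group(0) + ZWSP, text)


def wrap(text, width):
    """Greedy wrap that may break after whitespace or after U+200B.

    Hypothetical helper: 'width' is a character count, whereas a real
    layout engine would measure rendered string widths.
    """
    lines, current = [], ""
    for tok in TOKEN.findall(mark_break_points(text)):
        piece = tok.replace(ZWSP, "")  # U+200B occupies no space
        if current and len(current) + len(piece.rstrip()) > width:
            lines.append(current.rstrip())
            current = ""
        # Avoid starting a new line with leftover whitespace.
        current += piece if current else piece.lstrip()
    if current.rstrip():
        lines.append(current.rstrip())
    return lines


if __name__ == "__main__":
    for line in wrap("ReportLab can wrap 中文排版测试 like this", 12):
        print(line)
```

An unbreakable run longer than the width (a long Latin word, say) is
simply left on a line of its own in this sketch; a real implementation
would have to decide how to handle that case.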


> - Andy



