[reportlab-users] Intermittent failure to find font error

Robin Becker robin at reportlab.com
Mon Aug 9 06:45:17 EDT 2010


On 06/08/2010 16:51, Alex Buck wrote:
..........

> What I assume to be happening is that when Apache restarts and receives an
> initial report view request, it will load everything into the primary python
> sub interpreter. (Another assumption being that secondary interpreter process
> memory is copied from the primary sub interpreter). Apache will create a sub
> interpreter for every process we have [2]. Furthermore, each interpreter
> should live for the whole life of the process [3]. There is a library deep
> within reportlab which calls external C code (_rl_accel.c) to try and
> accelerate functions like finding fonts (*relevant C code). These functions
> cache a reference to the loaded font cache in the python interpreter process.
> If the cache pointer is assigned after a bunch of initial requests it will be
> assigned with a reference from a secondary interpreter instead of the primary
> interpreter. Therefore additional requests could mix code and data from
> different interpreters, resulting in the IOError "not accessible in
> restricted mode". [4]


I'm not sure I understand exactly what's going on here, but presumably
mod_python/wsgi etc. prefer to create sub interpreters rather than fork. That
could create a problem since, as you found out, we have no real way of storing
per-interpreter globals. That was the reason why we eventually ended up using
fastcgi, as it can decouple apache from the production python, and it seems as
though the forking fastcgi + django play very well together. Of course there
are other issues with fastcgi, e.g. timeouts, but we find they're not of such
great importance to us.



> The quick fix is to disable the C extensions within reportlab by moving the
> shared object (_rl_accel.so). Reportlab will degrade gracefully and use
> python code in place of these acceleration extensions.......]


Glad this fix works.


> I have included the snippet of the code which I believe is the cause of the
> error. Could the reportlab developers please look into this issue and fix
> the acceleration libraries so as to not share python objects from different
> interpreters?


I'm sure you're correct about this particular problem. However, I'm not certain
that the problems you observe will be so easily fixable.


First off, there are several other globals in that particular extension:

    Encodings
    defaultEncoding
    _SWRecover

I think all three of these are removable. I cannot find any actual usages of
_SWRecover, but someone may differ. Encodings and defaultEncoding are all about
maintaining a global list of fonts/encodings for speeding up font location. For
obvious reasons that would need to go, as it will suffer from the same problems
you have observed, i.e. interpreters might differ as to what information has
been passed.

_notdefFont
_notdefChar

these are lookup speedups for python module variables and could be eliminated in
the C code, but since they are similarly defined in Python (ie as module level
globals) I'm not sure why they couldn't suffer from the same issues.

The getFontU problem can easily be resolved by just not importing it in
pdfmetrics, if that's a solution, or we could change to using a more
straightforward translation of

    def _py_getFont(fontName):
        try:
            return _fonts[fontName]
        except KeyError:
            return findFontAndRegister(fontName)

in practice I don't think we actually get much speed up here. I find

getFont('Helvetica') takes 0.191 usec
_py_getFont('Helvetica') takes 0.22 usec


whereas the stringWidths really get accelerated

C:\code\reportlab>\Python\lib\timeit.py -s"from reportlab.pdfbase.pdfmetrics
import getFont;f=getFont('Helvetica')" "f.stringWidth('hello world!',10)"
100000 loops, best of 3: 7.87 usec per loop

C:\code\reportlab>\Python\lib\timeit.py -s"from reportlab.pdfbase.pdfmetrics
import getFont;f=getFont('Helvetica')" "f._py_stringWidth('hello world!',10)"
100000 loops, best of 3: 14.1 usec per loop

However, the stringWidth code will have its own problems as it keeps a cache of
pointers to fonts etc. (using the Encodings stuff).

I'm not sure what the best plan is here.

Up to now we have just said that reportlab is not thread safe, and now
apparently we know it's not multi-interpreter safe.

........

> -------------------------------------------------------------------------------------------
> [1] http://two.pairlist.net/pipermail/reportlab-users/2009-January/007928.html
> [2] http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Application_Global_Variables
> [3] http://code.google.com/p/modwsgi/wiki/ProcessesAndThreading#Python_Sub_Interpreters
> [4] http://code.google.com/p/modwsgi/wiki/ApplicationIssues#Multiple_Python_Sub_Interpreters
>
> *relevant C code
> -------------------------------------------------------------------------------------------

>
> src/rl_addons/rl_accel/_rl_accel.c:1035
>
> [1] static PyObject *_pdfmetrics_fonts = NULL; /*the fontName to font map from pdfmetrics*/
>     static PyObject *_pdfmetrics_ffar = NULL; /**findFontAndRegister from pdfmetrics**/
>     static PyObject *getFontU(PyObject *module, PyObject *args, PyObject *kwds)
>     {
>         PyObject *fontName=NULL, *_o1=NULL, *_o2=NULL, *res=NULL;
>         static char *argnames[] = {"fontName",NULL};
>         if(!PyArg_ParseTupleAndKeywords(args, kwds, "O", argnames, &fontName)) return NULL;
> [2]     if(!_pdfmetrics_fonts){
>             res = PyImport_ImportModule("reportlab.pdfbase.pdfmetrics");
>             if(!res) ERROR_EXIT();
> [3]         _o1 = _GetAttrString(res,"_fonts");
>             if(!_o1) ERROR_EXIT();
>             _o2 = _GetAttrString(res,"findFontAndRegister");
>             if(!_o2) ERROR_EXIT();
>             _pdfmetrics_fonts = _o1;
>             _pdfmetrics_ffar = _o2;
>             Py_DECREF(res); _o1 = _o2 = res = NULL;
>         }
> [4]     if((res = PyObject_GetItem(_pdfmetrics_fonts,fontName))) return res;
>         if(!PyErr_ExceptionMatches(PyExc_KeyError)) ERROR_EXIT();
>         ...
> [5]     res = PyObject_CallObject(_pdfmetrics_ffar,_o1);
>         Py_DECREF(_o1); /**NB this should decremnent fontName as well**/
>         return res;
>         ....

>
> [1] Declares global pointer to _fonts cache
> [2] If that pointer is NULL load the pdfmetrics namespace
> [3] Get the font cache from pdfmetrics from the current interpreter
> [4] Check to see if pdf metrics exists in cache and return result. However
>     pdfmetrics could belong to any sub-interpreter.
> [5] If [4] fails then attempt the bruteforce search where IOError happens
>     passing in _fonts cache from another sub-interpreter.

.........
--
Robin Becker