Hi,

I've been getting segfaults while trying to write a SAX parser for HTML. libxml2.htmlCreatePushParser appears to work correctly, but the first call to libxml2.htmlParseChunk segfaults because, in C land, ctxt->input (and possibly ctxt itself) has been trashed somewhere.

I got a little lost trying to debug it, as Python seems to be doing some weird threading stuff (and I'm very tired now because of it :-/), so I thought I'd post it here and see if anyone else can find it whilst I sleep ;)

The attached tarball contains some stuff to reproduce it, although it happened with every HTML or XHTML file I could find, and with both Python 1.5 and Python 2.2. Try "python xml.py" (should work) and "python html.py" (should segfault with signal 11).

I'm using Red Hat Linux 7.3 with Daniel's latest libxml2 RPMs from gnome.org (rpm -q says python-1.5.2-38, python2-2.2-16, libxml2-2.4.21-1 and libxml2-python-2.4.21-1).

Cheers,
Gary

[ gary inauspicious org ][ GnuPG 85A8F78B ][ http://inauspicious.org/ ]
Attachment:
xml-py-segv.tar.gz
Description: GNU Zip compressed data