[xml] libxml newbie question on htmlParseChunk function
- From: Van H Tran <tvhoang1980 yahoo com>
- To: xml gnome org
- Subject: [xml] libxml newbie question on htmlParseChunk function
- Date: Fri, 2 Jun 2006 01:07:00 -0700 (PDT)
Hi all,
This is my very first post to this mailing list :)
I'm trying to unhtmlize some text using the SAX
model.
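Here is how I initialize the parser:
#include <stdio.h>
#include <string.h>
#include <glib.h>
#include <libxml/HTMLparser.h>

static void *buffer; /* user data for the callbacks, set up elsewhere */

/* SAX characters callback. string is not NUL-terminated, so print
   exactly length bytes. */
static void unhtmlizeHandleCharacters(void *user_data, const xmlChar *string,
                                      int length)
{
    fprintf(stderr, "string = %.*s\n", length, (const gchar *)string);
    /* process string here... */
}

void unhtmlize(const char *text)
{
    htmlSAXHandler *sax_p = g_new0(htmlSAXHandler, 1);
    htmlParserCtxtPtr ctxt;

    sax_p->characters = unhtmlizeHandleCharacters;
    /* The whole text is handed over as the initial chunk. */
    ctxt = htmlCreatePushParserCtxt(sax_p, buffer, text, (int)strlen(text),
                                    "", XML_CHAR_ENCODING_UTF8);
    /* A zero-length chunk with terminate = 1 ends the document. */
    htmlParseChunk(ctxt, text, 0, 1);
    htmlFreeParserCtxt(ctxt);
    g_free(sax_p); /* the context keeps its own copy of the handler */
}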
Interestingly, this works with 'normal' text.
However, if
text = "abc < xyz"
then the debug output from unhtmlizeHandleCharacters shows that
it only receives "abc " as the string; everything from the
'<' character onwards is omitted.
So unhtmlize("abc < xyz") gives "abc " as the
result.
How can I overcome this? Any reply is much appreciated.
Thanks in advance,
TranVan Hoang
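
One possible workaround, sketched under the assumption that the parser treats the bare '<' as the start of a tag: escape such characters as &lt; before the text is pushed to the parser. The helper name escape_bare_lt below is hypothetical and not part of libxml2. It only touches '<', and it leaves alone any '<' followed by an ASCII letter, '/', '!' or '?', since those may begin real markup.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 function): returns a newly allocated
 * copy of text in which every '<' that cannot begin a tag, closing tag,
 * comment or processing instruction is replaced by "&lt;".
 * The caller frees the result. Returns NULL if allocation fails. */
char *escape_bare_lt(const char *text)
{
    assert(text != NULL);

    /* Worst case: every byte is '<' and grows to the 4 bytes of "&lt;". */
    size_t len = strlen(text);
    char *out = malloc(len * 4 + 1);
    if (out == NULL)
        return NULL;

    char *dst = out;
    for (const char *src = text; *src != '\0'; src++) {
        /* src[1] is always readable: at worst it is the terminator. */
        char next = src[1];
        int starts_markup = (next >= 'a' && next <= 'z') ||
                            (next >= 'A' && next <= 'Z') ||
                            next == '/' || next == '!' || next == '?';
        if (*src == '<' && !starts_markup) {
            memcpy(dst, "&lt;", 4);
            dst += 4;
        } else {
            *dst++ = *src;
        }
    }
    *dst = '\0';
    return out;
}
```

The escaped copy, rather than the original text, would then be passed to htmlCreatePushParserCtxt. The expectation is that the parser decodes &lt; back to '<' and delivers it through the characters callback, possibly split over several calls; that behaviour has not been verified here.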