Hello everyone,

As my colleague pointed out in December (http://mail.gnome.org/archives/xml/2009-December/msg00036.html ; although not very clearly), there are real-world HTML pages that overflow the stack. We're using libxml through nokogiri ( http://nokogiri.org/ ; it's a Ruby library). E.g.:

    >> Nokogiri::HTML::SAX::Parser.new(Nokogiri::XML::SAX::Document.new).parse_memory("<b>"*100_000)
    #=> SystemStackError: stack level too deep

In the patch I change htmlParseElement to return immediately and let the caller, htmlParseContent, do the job.

Note that htmlParseElement is not a static function, and I changed its behavior! I googled around (http://google.com/codesearch?q=htmlParseElement&hl=en&btnG=Search+Code) and I don't see anyone actually using it. But if this is an issue, I can make htmlParseElement call the internal (static) version of htmlParseElement and then htmlParseContent until the level matches. I'd rather see htmlParseElement converted to static, though.

I also attach weirdness.patch, which deletes duplicate definitions and sets nameMax to 0 if memory allocation fails.

Good day, everyone :)
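For readers who want the general idea without opening the patch, here is a minimal Ruby sketch of the difference between recursing once per nested element and tracking open elements in an explicit stack. It is a toy tag scanner, not libxml or nokogiri code: the names tokens, parse_recursive and parse_iterative are hypothetical, and it only handles bare tags such as <b> and </b>.

```ruby
# Toy illustration only: this is not libxml2 or nokogiri code.
# tokens, parse_recursive and parse_iterative are hypothetical names.

# Turn "<b><i></i>" into [[:open, "b"], [:open, "i"], [:close, "i"]].
def tokens(html)
  html.scan(/<(\/?)(\w+)>/).map do |slash, name|
    [slash.empty? ? :open : :close, name]
  end
end

# Recursive shape: every nested element costs one call-stack frame,
# so the nesting depth of the input is limited by the stack size.
# Returns [next position, maximum depth seen].
def parse_recursive(toks, pos = 0, depth = 0)
  max = depth
  while pos < toks.size
    kind, _name = toks[pos]
    if kind == :open
      pos, sub = parse_recursive(toks, pos + 1, depth + 1)
      max = sub if sub > max
    else
      return [pos + 1, max] # a close tag ends the current element
    end
  end
  [pos, max]
end

# Iterative shape: a single loop, with open elements kept in a
# heap-allocated array instead of on the call stack.
# Returns the maximum depth seen.
def parse_iterative(toks)
  open_names = []
  max = 0
  toks.each do |kind, name|
    if kind == :open
      open_names.push(name)
      max = open_names.size if open_names.size > max
    else
      open_names.pop # a close tag ends the innermost open element
    end
  end
  max
end
```

Both versions agree on small inputs. On something like tokens("<b>" * 100_000) the iterative one only grows an array, while the recursive one would be expected to fail with SystemStackError under a default Ruby stack size.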
Attachment:
non-recursive-html-parser.patch
Description: Binary data
Attachment:
weirdness.patch
Description: Binary data