Last modified: 2007-05-15 17:53:14 UTC
[Fatal Error] :43:3: The element type "br" must be terminated by the matching end-tag "</br>".
c:\x\arstaticwiki\ar\!\!\!\صورة~!!!!ユニセフ0195.JPG_c267.html
org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
[Error] :133:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :133:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :133:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :133:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :102:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :114:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
[Error] :114:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :119:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :162:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Fatal Error] :44:3: The element type "br" must be terminated by the matching end-tag "</br>".
c:\x\arstaticwiki\ar\(\2\6\صورة~(2691)_Tel_Aviv.jpg_40ef.html
org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
[Error] :172:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
[Error] :82:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :172:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
[Error] :114:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Fatal Error] :43:3: The element type "br" must be terminated by the matching end-tag "</br>".
c:\x\arstaticwiki\ar\-\3\4\صورة~-34_sibirien_sviatoinos_bucht.JPG.JPG_07d9.html
org.xml.sax.SAXParseException: The element type "br" must be terminated by the matching end-tag "</br>".
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
[Error] :114:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :84:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :187:51: Attribute value "p-المشاركة و المساعدة" of type ID must be an NCName when namespaces are enabled.
[Error] :6:8: The content of element type "head" is incomplete, it must match "((script|style|meta|link|object|isindex)*,((title,(script|style|meta|link|object|isindex)*,(base,(script|style|meta|link|object|isindex)*)?)|(base,(script|style|meta|link|object|isindex)*,(title,(script|style|meta|link|object|isindex)*))))".
1) Check with current version
2) Provide sample input to produce this
3) Compare with existing bug entries for ID issues
Hmm. I'm just consuming what's coming out of the public Wikipedia site, which, I presume, doesn't run quite the current version. So, much as I'd like to be a good citizen here, I'm not sure how to proceed. Let me ask this question: is the claim that the current version would prevent the unclosed br tags? Those are the big problem for me. If that's the claim, I might be able to try an experiment to see if there is still a hole allowing people to create them.
Please provide URLs to the pages you're checking, then.
We don't validate any IDs whatsoever, including those produced by the interface; that issue is known. The invalid IDs produced there would be typical for those involving the portal of an Arabic-alphabet site. See bug 4515. It should be completely impossible for Wikipedia, which has HTML Tidy enabled, to have unclosed <br> tags. I can't see any on ar.wikipedia's Main Page or at [[ar:تل أبيب]] (the Tel Aviv article that you appear to have been using).
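To illustrate why the parser rejects the ID quoted in the log: an NCName may not contain whitespace or a colon. The following is a minimal sketch that only approximates the NCName rule (it is not the full XML grammar, and the function name and the second sample ID are made up for illustration).

```python
def looks_like_ncname(value: str) -> bool:
    """Rough approximation of the XML NCName rule, not the full grammar.

    Assumption: the first character must be a letter or underscore, and
    the rest may be letters, digits, '-', '.', or '_'. Whitespace and ':'
    are therefore rejected. Combining marks, which real NCNames allow,
    are not handled here.
    """
    if not value:
        return False
    first = value[0]
    if not (first.isalpha() or first == "_"):
        return False
    return all(ch.isalnum() or ch in "-._" for ch in value[1:])


# The ID from the error log contains spaces, so it fails the check.
print(looks_like_ncname("p-المشاركة و المساعدة"))  # False
# A hypothetical ID without spaces passes.
print(looks_like_ncname("p-navigation"))  # True
```

The non-ASCII letters are not the problem in this approximation; the spaces are.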
I'm working from the most recent AR static dump (April). Is it likely that the quality of the tidy processing has gone up materially since then? I'll attach a file ... I've yet to succeed in finding a live page to match one of the filenames. The page I've got here isn't the straight Tel Aviv page; it's some special JPG-rights-explaining page.
Created attachment 3641: File with an unclosed br.
I see the problem. [[ar:MediaWiki:Sharedupload]] is at fault. Probably we should run its output through Tidy or Sanitizer or something (does Sanitizer fix unclosed <br>s?), if that's not too slow. As a site-specific workaround, you can ask a sysop there to edit the message to begin with <br style="clear:both" /> instead of <br style="clear:both">, or sed your files to kill that string, but this should probably be fixed in the function itself.
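The client-side workaround could look something like the sketch below, which rewrites unclosed <br> tags into self-closing ones rather than deleting them. This is an assumption-laden illustration, not anything MediaWiki does: the function name and regular expression are made up here, and the pattern does not cope with attribute values that contain ">".

```python
import re

# Matches <br> or <br ...attributes...> that is not already self-closed.
# \b keeps tags such as <break> from matching; (?<!/) rejects "<br />".
UNCLOSED_BR = re.compile(r"<br\b([^<>]*?)(?<!/)>", re.IGNORECASE)


def close_br_tags(html: str) -> str:
    """Return html with every unclosed <br ...> rewritten as <br ... />."""
    return UNCLOSED_BR.sub(r"<br\1 />", html)


print(close_br_tags('<br style="clear:both">'))
# prints: <br style="clear:both" />
```

Already self-closed tags are left untouched, so the function is safe to run over a file more than once.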
Thank you for tracking this down from my less than informative breadcrumbs. If these are relatively uninteresting pages, I can switch on XML parsing and ignore pages that flunk due to this problem.
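Skipping the pages that fail to parse might be sketched as follows. The function name and sample page names are hypothetical, and real dump pages would probably also need their DOCTYPE and HTML entities handled, which this sketch ignores.

```python
import xml.etree.ElementTree as ET


def well_formed_pages(pages):
    """Yield (name, root) for each page that parses as XML; skip the rest.

    `pages` is an iterable of (name, text) pairs.
    """
    for name, text in pages:
        try:
            root = ET.fromstring(text)
        except ET.ParseError:
            # Not well-formed (for example an unclosed <br>): ignore it.
            continue
        yield name, root


sample = [
    ("ok.html", "<html><body><br /></body></html>"),
    ("bad.html", "<html><body><br></body></html>"),
]
print([name for name, _ in well_formed_pages(sample)])  # ['ok.html']
```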
See the configuration settings for tidy usage; we have it disabled for UI messages for performance reasons.