Last modified: 2010-10-30 13:33:08 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T13710, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 11710 - anchors with name= but no id=
Status: RESOLVED WORKSFORME
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: 1.12.x
Hardware: All All
Importance: Lowest trivial
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://zh.wikipedia.org/
Depends on:
Blocks:
Reported: 2007-10-19 21:23 UTC by Dan Jacobson
Modified: 2010-10-30 13:33 UTC (History)

Description Dan Jacobson 2007-10-19 21:23:24 UTC
Why do ZH pages have name= links without the additional id=?
EN doesn't have that problem.

$ cat linktest
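# For each URL, print lines that carry name= but no id= (skipping the keywords meta tag)
while read -r page
      do echo "$page"; lynx -source "$page" | grep 'name=' | grep -Ev 'id=|keywords'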
done <<EOF
http://zh.wikipedia.org/wiki/Wikipedia_talk:%E8%81%9A%E4%BC%9A/2007%E8%87%BA%E7%81%A3%E7%A7%8B%E8%81%9A
http://en.wikipedia.org/wiki/Main_Page
http://en.wikipedia.org/wiki/User:Jidanni/Sandbox
http://zh.wikipedia.org/
EOF
$ sh linktest
http://zh.wikipedia.org/wiki/Wikipedia_talk:%E8%81%9A%E4%BC%9A/2007%E8%87%BA%E7%81%A3%E7%A7%8B%E8%81%9A
<p><a name=".E6.99.82.E9.96.93.E8.88.87.E5.9C.B0.E9.BB.9E"></a></p>
<p><a name=".E5.A0.B1.E5.90.8D"></a></p>
<p><a name=".E8.A8.8E.E8.AB.96"></a></p>
http://en.wikipedia.org/wiki/Main_Page
http://en.wikipedia.org/wiki/User:Jidanni/Sandbox
http://zh.wikipedia.org/
<th><a name=".E7.89.B9.E8.89.B2.E6.A2.9D.E7.9B.AE"></a>
<th><a name=".E6.96.B0.E9.97.BB.E5.8A.A8.E6.80.81"></a>
<th><a name=".E4.BC.98.E8.89.AF.E6.9D.A1.E7.9B.AE"></a>
<th><a name=".E6.AF.8F.E6.97.A5.E5.9B.BE.E7.89.87"></a>
<th><a name=".E4.BD.A0.E7.9F.A5.E9.81.93.E5.90.97.EF.BC.9F"></a>
<th><a name=".E5.8E.86.E5.8F.B2.E4.B8.8A.E7.9A.84.E4.BB.8A.E5.A4.A9"></a>
<th><a name=".E7.89.B9.E8.89.B2.E5.86.85.E5.AE.B9"></a>
<th><a name=".E5.AD.A3.E8.8A.82.E8.AF.9D.E9.A2.98"></a>
<th><a name=".E5.A7.8A.E5.A6.B9.E8.A8.88.E7.95.AB"></a>
Comment 1 Brion Vibber 2007-12-03 21:50:13 UTC
This is most likely due to zealous validation of id attributes. Those which begin with [A-Za-z] pass through, while those beginning with [.] are dropped. That may or may not be valid. (The "." is introduced as a variant of URL percent-encoding of characters which can't appear literally, modified to pass the strict, though rarely-enforced, rules about the contents of id attributes.)
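
The "." encoding described above can be illustrated with a minimal Python sketch. It is not MediaWiki's actual code: the function name dot_encode and the space-to-underscore step are assumptions made for illustration. It percent-encodes the UTF-8 bytes of a heading and then swaps each "%" for ".", which reproduces the anchor names seen in the report.

```python
from urllib.parse import quote


def dot_encode(heading: str) -> str:
    """Percent-encode a heading's UTF-8 bytes, then swap '%' for '.'.

    Hypothetical helper for illustration only.
    """
    # Assumption: spaces are turned into underscores before encoding.
    percent_encoded = quote(heading.replace(" ", "_"), safe="")
    return percent_encoded.replace("%", ".")


print(dot_encode("報名"))      # .E5.A0.B1.E5.90.8D
print(dot_encode("See also"))  # See_also
```

A heading that starts with a non-ASCII character therefore always yields a value starting with ".", while a plain ASCII heading keeps its leading letter.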

Per compatibility guidelines in XHTML 1.0 spec:

"Note that the collection of legal values in XML 1.0 Section 2.3, production 5 is much larger than that permitted to be used in the ID and NAME types defined in HTML 4. When defining fragment identifiers to be backward-compatible, only strings matching the pattern [A-Za-z][A-Za-z0-9:_.-]* should be used. See Section 6.2 of [HTML4] for more information."
[http://www.w3.org/TR/xhtml1/#guidelines]

This strict limitation is actually in the HTML 4.01 spec, as far as I can see, and would apply to both NAME and ID attributes...
[http://www.w3.org/TR/html4/types.html#h-6.2]
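
As a rough check, the backward-compatible pattern quoted above can be applied directly as a regular expression. The sketch below uses a hypothetical helper name, is_html4_name, and tests a few candidate values, including one of the dot-encoded names from the report.

```python
import re

# Pattern quoted in the XHTML 1.0 compatibility guidelines.
HTML4_NAME = re.compile(r"[A-Za-z][A-Za-z0-9:_.-]*")


def is_html4_name(value: str) -> bool:
    """Return True if the whole value matches the HTML 4 ID/NAME pattern."""
    return HTML4_NAME.fullmatch(value) is not None


for candidate in ("See_also", ".E5.A0.B1.E5.90.8D", "2007"):
    print(candidate, is_html4_name(candidate))
# See_also True
# .E5.A0.B1.E5.90.8D False
# 2007 False
```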

XHTML 1.0, as XML 1.0, allows a larger set of possibilities:
[5] Name ::= (Letter | '_' | ':') (NameChar)*
[http://www.w3.org/TR/REC-xml/#sec-common-syn]

But both appear to technically disallow the initial "."...
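
To see why the initial "." fails under the XML rule as well, here is a small approximate check of the first character only. The helper name has_xml_name_start is hypothetical, and Python's str.isalpha() is used as a stand-in for the XML 1.0 "Letter" production, which it does not match exactly.

```python
def has_xml_name_start(value: str) -> bool:
    """Approximate test of the first character against XML 1.0 production [5].

    Assumption: str.isalpha() approximates the XML "Letter" class.
    """
    return bool(value) and (value[0].isalpha() or value[0] in "_:")


for candidate in ("報名", "_note", ".E5.A0.B1.E5.90.8D", "2007"):
    print(candidate, has_xml_name_start(candidate))
# 報名 True
# _note True
# .E5.A0.B1.E5.90.8D False
# 2007 False
```

Under this approximation a raw Unicode heading would be an acceptable start, but the dot-encoded form and a heading beginning with a digit would not.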

Probably the fragment id normalization needs to produce something a bit different for those which don't have an initial ASCII letter... alternatively we could do some compatibility testing and see about using the wider Unicode-friendly XML selection... which still would have issues with digits and punctuation as the first character.
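
One way the first option might look is sketched below, purely as an illustration and not as a proposed patch: a hypothetical helper, ensure_letter_start, prepends a fixed ASCII letter when the normalized id does not already begin with one. The prefix "x" is an arbitrary assumption.

```python
import re


def ensure_letter_start(fragment_id: str, prefix: str = "x") -> str:
    """Prepend an ASCII letter if the id does not already start with one.

    Hypothetical normalization step; the default prefix is arbitrary.
    """
    if re.match(r"[A-Za-z]", fragment_id):
        return fragment_id
    return prefix + fragment_id


print(ensure_letter_start(".E5.A0.B1.E5.90.8D"))  # x.E5.A0.B1.E5.90.8D
print(ensure_letter_start("See_also"))            # See_also
print(ensure_letter_start("2007"))                # x2007
```

A scheme like this would change existing fragment links and could collide with a heading that literally starts with the prefix, so it would need the compatibility testing mentioned above.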
Comment 2 Derk-Jan Hartman 2010-10-30 13:33:08 UTC
It seems that, at the moment, these headers all have ids (starting with "."). The name attribute was removed a while ago.
