Last modified: 2012-10-23 17:14:46 UTC
I know this is an issue with Comp Sci rather than with Wikipedia, but I think it would be helpful if when clicking a footnote, the fragment identifier (what comes after # in the URL) would match up with the number footnote I clicked. For example, If I were to read the text: "DST is said to save electricity by reducing the need for artificial evening lighting,[4]" and were to click the raised 4, I would be taken to "http://en.wikipedia.org/wiki/Article#_ref-3", rather than the more logically "http...Article#_ref-4", as I clicked the 4th reference. I know that in Binary and Hexidecimal, the first number in the series is 0 (hence when I click "ref 1", I'm taken to #_ref-0), but I was wondering if it were possible to start the numbering at 1 to better match a layperson's expected result. Forgive me if this has been addressed in another bug report, but I didn't see anything.
This would be great except for the bit about breaking all external links to Wikipedia footnotes. I don't expect there to be terribly many of those, however, so that might be an okay price to fix even a little annoyance like this.
I can't imagine there would be many people linking to _footnotes_. Why not just link to the actual page (or copy it over)? I highly doubt it would break many links, but I'm willing to hear evidence to the contrary.
*** Bug 14986 has been marked as a duplicate of this bug. ***
While technically this change would break links, it wouldn't break them very much -- you'd just have to scroll down a line to the next note -- and it would make things much less confusing.
Wouldn't the links break also if someone inserted another footnote before the referenced? So I'd say go ahead with the change.
Created attachment 5117 [details] Patch This patch changes the id of the <ref> and <references> tags to always have the same number as the on-screen labels (only anonymous references). This patch also makes the group name part of the id (except default group). This is needed to avoid conflicts between groups.
Created attachment 5120 [details] Patch including ParserTests and a bugfix This patch includes updated citeParserTests.txt This patch also fixes a bug with non-ASCII group names in the previous patch.
Sorry I had a bunch of stuff mostly done, so your patch won't apply to what's there now, but I did read through most of it. The advantage of the current system is that all the references are numbered sequentially across groups, so all of the IDs can be unique without doing any kind of weird encoding. It looks like your patch changes it so that the IDs are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref name=a_b>. But most of your work seems sound, so this may be fixable as simply as a more clever encoding, i.e. concat the group & name, and specify the length of the group in the ID, so if the former is something like cite_note-1-a_b, the latter could be cite_note-0-a_b
Created attachment 5162 [details] Updated patch (In reply to comment #8) > The advantage of the current system is that all the references are numbered > sequentially across groups, so all of the IDs can be unique without doing any > kind of weird encoding. It looks like your patch changes it so that the IDs > are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref > name=a_b>. > > But most of your work seems sound, so this may be fixable as simply as a more > clever encoding, i.e. concat the group & name, and specify the length of the > group in the ID, so if the former is something like cite_note-1-a_b, the latter > could be cite_note-0-a_b I have changed the code to always include the group - name separator. This way id conflicts are avoided.
(In reply to comment #9) > Created an attachment (id=5162) [details] > Updated patch > > (In reply to comment #8) > > The advantage of the current system is that all the references are numbered > > sequentially across groups, so all of the IDs can be unique without doing any > > kind of weird encoding. It looks like your patch changes it so that the IDs > > are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref > > name=a_b>. > > > > But most of your work seems sound, so this may be fixable as simply as a more > > clever encoding, i.e. concat the group & name, and specify the length of the > > group in the ID, so if the former is something like cite_note-1-a_b, the latter > > could be cite_note-0-a_b > > I have changed the code to always include the group - name separator. This way > id conflicts are avoided. > That would still create a conflict if both conflicting references have a group, i.e. <ref group=a_b name=c/> and <ref group=a name=b_c/>. Once you concat arbitrary strings, there's no way to tell how to split them without knowing the length, so I'd recommend storing the length of one of the strings, either before or after the combined string.
Created attachment 5163 [details] New patch Ids now include group name length.
This looks pretty icky to me... <ref> #cite_note-0-_1 <ref group="blah"> #cite_note-4-blah_1 Those lengths are pretty awful looking and pretty unpredictable. Given that named refs are still assigned unique sequence numbers within their groups, what's the point of including the name in the fragment ID? It seems to me that it's an unnecessary exposure of internal data -- the purpose is to allow two links to the same footnote without having to hardcode the number, which may change, in the source text. Since the number was also in the link, putting the name in the link provides no static benefit for external linkers; its only purpose seems to be to complicate things by being ambiguous with the groups here. :) My recommendation is to simply drop the ref name from the fragment ID -- use group and number only. Then there's nothing that needs to be disambiguated.
Created attachment 5164 [details] Yet Another Patch (In reply to comment #12) > My recommendation is to simply drop the ref name from the fragment ID -- use > group and number only. Then there's nothing that needs to be disambiguated. Ok, no problem. Patch updated.
(In reply to comment #13) > Created an attachment (id=5164) [details] > Yet Another Patch > > (In reply to comment #12) > > My recommendation is to simply drop the ref name from the fragment ID -- use > > group and number only. Then there's nothing that needs to be disambiguated. > > Ok, no problem. Patch updated. > That seems to have resolved the conflicts. You may have made more changes to the messages than are necessary; i.e. removed cite_error_ref_invalid_parameters which is fully translated, and added two replacements that are not -- the one with no group probably won't get used, so this would cause the error message to change from the local language to English for no good reason. The other message you added duplicates an existing message, so that may be unnecessary.
(In reply to comment #14) > You may have made more changes to the messages than are necessary; i.e. removed > cite_error_ref_invalid_parameters which is fully translated, and added two > replacements that are not -- the one with no group probably won't get used, so > this would cause the error message to change from the local language to English > for no good reason. I removed *cite_error_ref_too_many_keys* because the name, the English text and at least the Dutch and German translation are wrong. This error isn't triggered when you specify two or more names, only the last name is passed to the extension by the parser. This error is only triggered when you specify invalid parameters. Therefore I created the messages 'cite_error_ref_invalid_parameters' and 'cite_error_ref_invalid_parameters_group', just like the existing 'cite_error_references_invalid_parameters' and 'cite_error_references_invalid_parameters_group'. 'cite_error_ref_invalid_parameters' may not be used, but $wgAllowCiteGroups is still existing and can be used. > The other message you added duplicates an existing > message, so that may be unnecessary. The contents of 'cite_reference_link_key_with_num' and the new 'cite_reference_link_key_with_group' are the same, but there functions are different.
(In reply to comment #15) > (In reply to comment #14) > > You may have made more changes to the messages than are necessary; i.e. removed > > cite_error_ref_invalid_parameters which is fully translated, and added two > > replacements that are not -- the one with no group probably won't get used, so > > this would cause the error message to change from the local language to English > > for no good reason. > > I removed *cite_error_ref_too_many_keys* because the name, the English text and > at least the Dutch and German translation are wrong. This error isn't triggered > when you specify two or more names, only the last name is passed to the > extension by the parser. This error is only triggered when you specify invalid > parameters. > > Therefore I created the messages 'cite_error_ref_invalid_parameters' and > 'cite_error_ref_invalid_parameters_group', just like the existing > 'cite_error_references_invalid_parameters' and > 'cite_error_references_invalid_parameters_group'. > > 'cite_error_ref_invalid_parameters' may not be used, but $wgAllowCiteGroups is > still existing and can be used. > The name may not be accurate anymore, but it would take some coordination to change. It looks like there are still a little over 60 references to the old message after your patch is applied, and (I think) it would need to be coordinated with the localization efforts. I like the old text better; since it's not so verbose, and won't need to be changed whenever a new option is added. > > The other message you added duplicates an existing > > message, so that may be unnecessary. > > The contents of 'cite_reference_link_key_with_num' and the new > 'cite_reference_link_key_with_group' are the same, but there functions are > different. > It seems unnecessary to me, but that wouldn't get localized anyway, so no big deal.
Marking with need-review because this patch still awaits review by Brion or another active MediaWiki or Cite developer.
Jelte, now that we have moved to Git as a source control system, you're welcome to get developer access https://www.mediawiki.org/wiki/Developer_access to put this & future patches directly into Git. That will get them reviewed faster, too!
Eran, would you do me the favor of checking on this patch and seeing whether it would work with Cite in trunk, and if so, moving it into Git? Thanks.
Gerrit change #28460 Merged.