Last modified: 2012-10-23 17:14:46 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T12537, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 10537 - Cite anchors should be numbered starting at 1 to correspond with on-screen labels
Cite anchors should be numbered starting at 1 to correspond with on-screen la...
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Cite (Other open bugs)
unspecified
All All
: Low enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
: patch, patch-need-review
: 14986 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2007-07-11 08:39 UTC by mysekurity
Modified: 2012-10-23 17:14 UTC (History)
10 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Patch (16.55 KB, patch)
2008-08-04 20:29 UTC, Jelte (WebBoy)
Details
Patch including ParserTests and a bugfix (27.92 KB, patch)
2008-08-05 20:55 UTC, Jelte (WebBoy)
Details
Updated patch (28.43 KB, patch)
2008-08-11 14:21 UTC, Jelte (WebBoy)
Details
New patch (28.65 KB, patch)
2008-08-11 15:57 UTC, Jelte (WebBoy)
Details
Yet Another Patch (28.58 KB, patch)
2008-08-11 21:24 UTC, Jelte (WebBoy)
Details

Description mysekurity 2007-07-11 08:39:00 UTC
I know this is an issue with Comp Sci rather than with Wikipedia, but I think it would be helpful if when clicking a footnote, the fragment identifier (what comes after # in the URL) would match up with the number footnote I clicked.

For example, If I were to read the text: "DST is said to save electricity by reducing the need for artificial evening lighting,[4]" and were to click the raised 4, I would be taken to "http://en.wikipedia.org/wiki/Article#_ref-3", rather than the more logically "http...Article#_ref-4", as I clicked the 4th reference.

I know that in Binary and Hexidecimal, the first number in the series is 0 (hence when I click "ref 1", I'm taken to #_ref-0), but I was wondering if it were possible to start the numbering at 1 to better match a layperson's expected result.

Forgive me if this has been addressed in another bug report, but I didn't see anything.
Comment 1 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-07-11 18:09:33 UTC
This would be great except for the bit about breaking all external links to Wikipedia footnotes.  I don't expect there to be terribly many of those, however, so that might be an okay price to fix even a little annoyance like this.
Comment 2 mysekurity 2007-07-11 22:29:58 UTC
I can't imagine there would be many people linking to _footnotes_. Why not just link to the actual page (or copy it over)? I highly doubt it would break many links, but I'm willing to hear evidence to the contrary. 
Comment 3 Brion Vibber 2008-08-01 18:14:20 UTC
*** Bug 14986 has been marked as a duplicate of this bug. ***
Comment 4 Brion Vibber 2008-08-01 18:15:24 UTC
While technically this change would break links, it wouldn't break them very much -- you'd just have to scroll down a line to the next note -- and it would make things much less confusing.
Comment 5 Tim Landscheidt 2008-08-02 12:59:48 UTC
Wouldn't the links break also if someone inserted another footnote before the referenced? So I'd say go ahead with the change.
Comment 6 Jelte (WebBoy) 2008-08-04 20:29:09 UTC
Created attachment 5117 [details]
Patch

This patch changes the id of the <ref> and <references> tags to always have the same number as the on-screen labels (only anonymous references).

This patch also makes the group name part of the id (except default group). This is needed to avoid conflicts between groups.
Comment 7 Jelte (WebBoy) 2008-08-05 20:55:14 UTC
Created attachment 5120 [details]
Patch including ParserTests and a bugfix

This patch includes updated citeParserTests.txt

This patch also fixes a bug with non-ASCII group names in the previous patch.
Comment 8 Steve Sanbeg 2008-08-06 19:15:48 UTC
Sorry I had a bunch of stuff mostly done, so your patch won't apply to what's there now, but I did read through most of it.

The advantage of the current system is that all the references are numbered sequentially across groups, so all of the IDs can be unique without doing any kind of weird encoding.  It looks like your patch changes it so that the IDs are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref name=a_b>.  

But most of your work seems sound, so this may be fixable as simply as a more clever encoding, i.e. concat the group & name, and specify the length of the group in the ID, so if the former is something like cite_note-1-a_b, the latter could be cite_note-0-a_b

Comment 9 Jelte (WebBoy) 2008-08-11 14:21:51 UTC
Created attachment 5162 [details]
Updated patch

(In reply to comment #8)
> The advantage of the current system is that all the references are numbered
> sequentially across groups, so all of the IDs can be unique without doing any
> kind of weird encoding.  It looks like your patch changes it so that the IDs
> are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref
> name=a_b>.  
> 
> But most of your work seems sound, so this may be fixable as simply as a more
> clever encoding, i.e. concat the group & name, and specify the length of the
> group in the ID, so if the former is something like cite_note-1-a_b, the latter
> could be cite_note-0-a_b

I have changed the code to always include the group - name separator. This way id conflicts are avoided.
Comment 10 Steve Sanbeg 2008-08-11 14:54:42 UTC
(In reply to comment #9)
> Created an attachment (id=5162) [details]
> Updated patch
> 
> (In reply to comment #8)
> > The advantage of the current system is that all the references are numbered
> > sequentially across groups, so all of the IDs can be unique without doing any
> > kind of weird encoding.  It looks like your patch changes it so that the IDs
> > are no longer unique; i.e. <ref group=a name=b> would get the same ID as <ref
> > name=a_b>.  
> > 
> > But most of your work seems sound, so this may be fixable as simply as a more
> > clever encoding, i.e. concat the group & name, and specify the length of the
> > group in the ID, so if the former is something like cite_note-1-a_b, the latter
> > could be cite_note-0-a_b
> 
> I have changed the code to always include the group - name separator. This way
> id conflicts are avoided.
> 

That would still create a conflict if both conflicting references have a group, i.e. 
<ref group=a_b name=c/> and <ref group=a name=b_c/>.  Once you concat arbitrary strings, there's 
no way to tell how to split them without knowing the length, so I'd recommend storing the length 
of one of the strings, either before or after the combined string.
Comment 11 Jelte (WebBoy) 2008-08-11 15:57:40 UTC
Created attachment 5163 [details]
New patch

Ids now include group name length.
Comment 12 Brion Vibber 2008-08-11 20:58:35 UTC
This looks pretty icky to me...

<ref>
  #cite_note-0-_1

<ref group="blah">
  #cite_note-4-blah_1


Those lengths are pretty awful looking and pretty unpredictable.

Given that named refs are still assigned unique sequence numbers within their groups, what's the point of including the name in the fragment ID? It seems to me that it's an unnecessary exposure of internal data -- the purpose is to allow two links to the same footnote without having to hardcode the number, which may change, in the source text.

Since the number was also in the link, putting the name in the link provides no static benefit for external linkers; its only purpose seems to be to complicate things by being ambiguous with the groups here. :)

My recommendation is to simply drop the ref name from the fragment ID -- use group and number only. Then there's nothing that needs to be disambiguated.
Comment 13 Jelte (WebBoy) 2008-08-11 21:24:41 UTC
Created attachment 5164 [details]
Yet Another Patch

(In reply to comment #12)
> My recommendation is to simply drop the ref name from the fragment ID -- use
> group and number only. Then there's nothing that needs to be disambiguated.

Ok, no problem. Patch updated.
Comment 14 Steve Sanbeg 2008-08-15 20:14:20 UTC
(In reply to comment #13)
> Created an attachment (id=5164) [details]
> Yet Another Patch
> 
> (In reply to comment #12)
> > My recommendation is to simply drop the ref name from the fragment ID -- use
> > group and number only. Then there's nothing that needs to be disambiguated.
> 
> Ok, no problem. Patch updated.
> 

That seems to have resolved the conflicts.

You may have made more changes to the messages than are necessary; i.e. removed cite_error_ref_invalid_parameters which is fully translated, and added two replacements that are not -- the one with no group probably won't get used, so this would cause the error message to change from the local language to English for no good reason.  The other message you added duplicates an existing message, so that may be unnecessary.

Comment 15 Jelte (WebBoy) 2008-08-15 20:45:16 UTC
(In reply to comment #14)
> You may have made more changes to the messages than are necessary; i.e. removed
> cite_error_ref_invalid_parameters which is fully translated, and added two
> replacements that are not -- the one with no group probably won't get used, so
> this would cause the error message to change from the local language to English
> for no good reason.

I removed *cite_error_ref_too_many_keys* because the name, the English text and at least the Dutch and German translation are wrong. This error isn't triggered when you specify two or more names, only the last name is passed to the extension by the parser. This error is only triggered when you specify invalid parameters.

Therefore I created the messages 'cite_error_ref_invalid_parameters' and 'cite_error_ref_invalid_parameters_group', just like the existing 'cite_error_references_invalid_parameters' and 'cite_error_references_invalid_parameters_group'.

'cite_error_ref_invalid_parameters' may not be used, but $wgAllowCiteGroups is still existing and can be used.

> The other message you added duplicates an existing
> message, so that may be unnecessary.

The contents of 'cite_reference_link_key_with_num' and the new 'cite_reference_link_key_with_group' are the same, but there functions are different.
Comment 16 Steve Sanbeg 2008-09-12 21:51:14 UTC
(In reply to comment #15)
> (In reply to comment #14)
> > You may have made more changes to the messages than are necessary; i.e. removed
> > cite_error_ref_invalid_parameters which is fully translated, and added two
> > replacements that are not -- the one with no group probably won't get used, so
> > this would cause the error message to change from the local language to English
> > for no good reason.
> 
> I removed *cite_error_ref_too_many_keys* because the name, the English text and
> at least the Dutch and German translation are wrong. This error isn't triggered
> when you specify two or more names, only the last name is passed to the
> extension by the parser. This error is only triggered when you specify invalid
> parameters.
> 
> Therefore I created the messages 'cite_error_ref_invalid_parameters' and
> 'cite_error_ref_invalid_parameters_group', just like the existing
> 'cite_error_references_invalid_parameters' and
> 'cite_error_references_invalid_parameters_group'.
> 
> 'cite_error_ref_invalid_parameters' may not be used, but $wgAllowCiteGroups is
> still existing and can be used.
> 

The name may not be accurate anymore, but it would take some coordination to change.  It looks like there are still a little over 60 references to the old message after your patch is applied, and (I think) it would need to be coordinated with the localization efforts.  I like the old text better; since it's not so verbose, and won't need to be changed whenever a new option is added.

> > The other message you added duplicates an existing
> > message, so that may be unnecessary.
> 
> The contents of 'cite_reference_link_key_with_num' and the new
> 'cite_reference_link_key_with_group' are the same, but there functions are
> different.
> 

It seems unnecessary to me, but that wouldn't get localized anyway, so no big deal.
Comment 17 Sumana Harihareswara 2011-11-14 16:49:28 UTC
Marking with need-review because this patch still awaits review by Brion or another active MediaWiki or Cite developer.
Comment 18 Sumana Harihareswara 2012-05-16 19:39:08 UTC
Jelte, now that we have moved to Git as a source control system, you're welcome to get developer access https://www.mediawiki.org/wiki/Developer_access to put this & future patches directly into Git.  That will get them reviewed faster, too!
Comment 19 Sumana Harihareswara 2012-10-10 20:13:53 UTC
Eran, would you do me the favor of checking on this patch and seeing whether it would work with Cite in trunk, and if so, moving it into Git?  Thanks.
Comment 20 Ori Livneh 2012-10-23 17:14:46 UTC
Gerrit change #28460 

Merged.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links