Last modified: 2014-11-04 22:52:49 UTC

Bug 4582 - Provide preference-based autoformatting for unlinked dates
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 21 votes
Target Milestone: ---
Assigned To: Andrew Garrett
URL: http://en.wikipedia.org/wiki/Wikipedi...
Duplicates: 4817 8267 8404
Depends on:
Blocks:

Reported: 2006-01-12 21:20 UTC by David E. Siegel
Modified: 2014-11-04 22:52 UTC
CC List: 30 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
diff -u for includes/DateFormatter.php (4.07 KB, patch)
2008-09-03 19:25 UTC, billclark
Simpler patch, just eliminates links (988 bytes, patch)
2008-09-05 18:54 UTC, billclark
Eliminates links, leaves date format untouched (2.13 KB, patch)
2008-09-10 23:15 UTC, billclark
unlinks dates, preserves markup, uses Accept-Language for defaults (or DMY) (1.84 KB, patch)
2008-11-04 19:21 UTC, billclark

Description David E. Siegel 2006-01-12 21:20:09 UTC
As recently discussed on the EN village pump (proposals), please create syntax for 
marking dates to be rendered subject to date preferences that does not overload the 
[[]] link syntax, and does not create a link. This could help avoid the perennial 
debate on overlinking of dates.
User:DESiegel
Comment 1 Brion Vibber 2006-01-12 22:45:10 UTC
IIRC this was done in part to encourage linking of dates so they can be used as useful 
metadata.
Comment 2 Ævar Arnfjörð Bjarmason 2006-01-13 22:01:48 UTC
REOPENING from RESOLVED INVALID
Comment 3 Ævar Arnfjörð Bjarmason 2006-01-13 22:02:10 UTC
RESOLVED WONTFIX
Comment 4 David E. Siegel 2006-01-14 17:52:41 UTC
Please note that the current English Wikipedia Manual of Style 
specifically recommends NOT linking dates UNLESS a link is 
needed to activate date preferences, and that some editors are 
removing such links en masse as overlinking. There is considerable 
belief that date-links are not, in general, useful metadata, and 
that they are harmful overlinks. Please reconsider.
Comment 5 Brion Vibber 2006-02-01 02:31:43 UTC
*** Bug 4817 has been marked as a duplicate of this bug. ***
Comment 6 David E. Siegel 2006-02-02 17:08:11 UTC
See the fourth section of <http://en.wikipedia.org/w/index.php?
title=Wikipedia:Village_pump_%28proposals%29&oldid=37860218> (a perma-link) for 
on-wiki discussion.
Comment 7 Jason Spiro 2006-02-02 18:01:22 UTC
I resurrected the discussion at:

http://en.wikipedia.org/wiki/Wikipedia_talk:Date_debate
Comment 8 David E. Siegel 2006-02-02 20:09:51 UTC
and see http://en.wikipedia.org/wiki/Wikipedia:Date_debate
Comment 9 Pcb21 2006-02-03 10:30:19 UTC
Please could the developers reconsider reopening this bug id. 

I am not sure that Brion's recollection is correct on this matter. The [[Day
Month]] and [[Year]] links are not particularly helpful metadata in themselves.
(Having >100,000 incoming links to [[2000]] for example is not that helpful) -
we are linking merely to activate preferences rather than because they are useful links
(sometimes they are, sometimes they are not - depends on the article in question). 

The coupling is not natural and we should not, if possible, enforce the
editorial standards by technical means. If we could change MediaWiki so that
preferences always work, then editorial decisions about when to link could be
left to the editors rather than being "forced into linking" by the software.

Hope that makes sense,

Pete (en:User:Pcb21) 
Comment 10 Rob Church 2006-02-03 11:32:51 UTC
[a] Agree that linking tons of dates in a page is irritating and
counterproductive. [b] Agree that formatting dates according to preferences is
useful; helpful, even.

Reopening as requested.
Comment 11 William Allen Simpson 2006-02-25 17:50:13 UTC
The problem is that some (fairly few outspoken) folks have actually stated that
linked dates are almost never "relevant" (for example, de-linking "1947" in the
article "Israel").  If the date isn't relevant, then it shouldn't be in the
article!  

Linking (many or most) dates is not irritating to me.

I propose the feature continue to work, and that (many or most) dates continue
to be linked.  The software itself follows the practice that any date relevant
enough to be specified is relevant to the article.  

I'm much more interested in an era preference, and would prefer that developers
spend limited time on that enhancement.
Comment 12 Rob Church 2006-02-25 20:56:13 UTC
An idea that came up within recent discussions was that dates could continue to
be linked, but that a stack would be maintained as links were processed and
formatted for display. Dates would continue to be shown according to
preferences, but duplicate links within the same section wouldn't be rendered.
This might be another idea to consider.
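
A minimal sketch of the idea above, not the MediaWiki parser: it uses a set rather than a stack, treats "== Heading ==" lines as section boundaries, and only recognises simple day-month and year link targets. It also leaves out the preference formatting itself. The function name unlink_repeats and all three regular expressions are assumptions made for illustration.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Hypothetical patterns: simple [[target]] links and "== Heading ==" lines.
LINK = re.compile(r"\[\[([^\[\]|]+)\]\]")
HEADING = re.compile(r"^==+.*==+\s*$")
# Assumed date targets: "12 January", "January 12", or a bare year.
DATE_TARGET = re.compile(
    rf"^(?:\d{{1,2}} (?:{MONTHS})|(?:{MONTHS}) \d{{1,2}}|\d{{1,4}})$"
)


def unlink_repeats(wikitext: str) -> str:
    """Keep the first link to each date in a section; render repeats as plain text."""
    seen = set()

    def repl(match):
        target = match.group(1)
        if not DATE_TARGET.match(target):
            return match.group(0)  # not a date link, leave it alone
        if target in seen:
            return target  # already linked in this section, drop the brackets
        seen.add(target)
        return match.group(0)

    out = []
    for line in wikitext.splitlines():
        if HEADING.match(line):
            seen.clear()  # a new section starts with a clean slate
        out.append(LINK.sub(repl, line))
    return "\n".join(out)
```

Non-date links pass through unchanged, so editorial linking of ordinary articles is not affected by this sketch.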
Comment 13 Rich Farmbrough 2006-03-16 13:02:44 UTC
Linking dates does three good things:
* Allows formatted preferences to work
* Creates links to articles about the dates
* Provides information that an entity is a date

It also does several bad things
* Makes pages very busy
* Provides unneeded links
* Creates confusion over the second parenthetical comma of the dd month, yyyy, style
* Encourages linking of bare years, months and days of the week.
* Creates conflict.....

The options are (ISTM)
* Status quo.  The community can and will deal with these problems (except the
parenthetical comma).
* Auto preference.  Fraught with difficulty, but probably do-able.  (E.G, the
Long March 2 rocket... ).
* Preference markup. <<13 May 1955>>.
* * Version/phase a. simply does what the existing preference linking does.
* * b. Deal with ranges. 
* * c. Open ended preference system dealing with BC/BCE, English varieties, 12
hour vs 24 hour clock, units, etc..

Comment 14 Tony Souter 2006-03-20 01:13:32 UTC
Creating syntax for marking dates to be rendered subject to date preferences and that at the same time does not create a link would go a long way towards resolving a lot of 
unnecessary conflict on WP.
Comment 15 Phil Boswell 2006-03-22 16:00:11 UTC
It would be necessary to ensure that dates formatted with the new syntax could 
*also* be linked when appropriate.
Comment 16 SPUI 2006-03-25 09:58:13 UTC
"The 2000 March of a Million Men was attended by only two people." How would it
cope with that?
Comment 17 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-04-09 02:12:27 UTC
<nowiki>.  Inelegant?  Possibly.  But less so, probably, than requiring every
date to be linked.  Regardless, I'm not yet convinced that this is worth much
developer time in any case.
Comment 18 Steve Bennett 2006-04-15 12:32:41 UTC
Suggestion: continue to make linked dates format according to preferences, and
allow a non-linking format for just reformatting. (I confess I don't understand
the reformatting that currently goes on...presumably 19 January can redisplay as
"January 19th" or "19/1" for some people?)

Something like:

[[19 January]] [[2000]] -> Behaves as current
[[:19 January]] [[:2000]] -> no links shown, only reformatting.

I propose the additional colon somewhat in the vein of [[:image]] and
[[:category]]...but better could surely be found. Hell, a space: [[ 19 January]]
(it couldn't link, but it could reformat).
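
A rough sketch of how the colon variant suggested above might behave, assuming only two preferences ("dmy" and "mdy") and only day-month pairs. Years are left out because they need no reordering. The render function, the regular expression, and the piped-link output are hypothetical and are not taken from DateFormatter.php.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Matches [[19 January]] or [[:19 January]]; the optional colon is the
# hypothetical "format but do not link" marker proposed in this comment.
DAY_MONTH = re.compile(rf"\[\[(:?)(\d{{1,2}}) ({MONTHS})\]\]")


def render(wikitext: str, pref: str = "dmy") -> str:
    """Reorder day and month per `pref`; drop the link when colon-prefixed."""

    def repl(match):
        colon, day, month = match.groups()
        text = f"{day} {month}" if pref == "dmy" else f"{month} {day}"
        if colon:
            return text  # reformatted, but plain text
        # Without the colon the date stays a link; the target keeps the
        # source spelling and only the visible label is reordered.
        return f"[[{day} {month}|{text}]]"

    return DAY_MONTH.sub(repl, wikitext)
```

For example, render("[[:19 January]]", "mdy") yields plain "January 19", while render("[[19 January]]", "mdy") keeps a link with the reordered label.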
Comment 19 Gennaro Prota 2006-05-14 00:20:33 UTC
I fully agree that date formatting and hyperlinking are completely distinct
concepts; so one effect should not be obtained through the other.

I propose however to "reverse the default": all dates should be reformatted by
the software, *unless* a special tag is used to prevent that (wanting user
preferences to work is the most common case; and this way they will work with no
clutter of the article source text). Of course usage of the disabling special
tag should be for good reasons. And well, that special tag(s) happens to already
exist: <nowiki></nowiki>, though a shorter alternative, I guess, would be
welcome syntactic sugar to most Wikipedians.
Comment 20 David E. Siegel 2006-05-14 23:38:38 UTC
One problem with this is that dates in direct quotations should not be 
reformatted. Thus anything inside a blockquote tag should also be exempt. 
Possibly so should anything inside "double quotes ".
Comment 21 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-05-16 02:33:10 UTC
(In reply to comment #19)
> I propose however to "reverse the default": all dates should be reformatted by
> the software, *unless* a special tag is used to prevent that (wanting user
> preferences to work is the most common case; and this way they will work with no
> clutter of the article source text).

I would like this idea, personally, except that which formatting to display
would be a nightmare.  If we could do a city-by-IP lookup, that would be
satisfactory except for the nontrivial server lag that would be incurred if the
page had to have differently-rendered versions cached for different anons, and
look up which of the world's 4,294,967,296 valid IP addresses it is in some kind
of megabytes-long table.

(In reply to comment #20)
> One problem with this is that dates in direct quotations should not be 
> reformatted. Thus anything inside a blockquote tag should also be exempt. 
> Possibly so should anything inside "double quotes ".

Would cause way too much difficulty if a mark is omitted, or if the mark is
being used to refer to itself ([[w:Quotation mark]], anyone?).  And it would be
totally worthless for any wiki that didn't use that quotation mechanism (e.g.,
used single quotes, or some other character entirely, or no character delimiter
at all).  Basically, it's a Very Bad Thing to have ordinary characters act
mainly as ordinary characters but secretly also as function characters.  Similar
arguments apply to <blockquote>, which most people aren't going to expect to
have the effect you mention even if it *is* markup.

I think <nowiki> is probably best.
Comment 22 Mats Gustafsson 2006-05-17 19:17:02 UTC
I am happy with either the solution proposed by the original posting (and elaborated by comment #18) or the cleaner inverted solution of comment #19. Either 
approach is a major step forward from the present annoying convention of routine date-linking, which detracts from the readability of Wikipedia articles. Remember 
that the vast majority of Wikipedia users only read and never edit, so the appearance of the actual article is much more important than the syntax of the source text.

Simetrical (comment #21), I think you misunderstood comment #19. Conversion would of course still be based on user preferences (if any), not on the IP number.
- ~~~~
Comment 23 Gennaro Prota 2006-05-18 14:23:58 UTC
> ------- Additional Comments From Mats Gustafsson  2006-05-17 19:17 UTC
>
> I am happy with either the solution proposed by the original posting (and
> elaborated by comment #18) or the cleaner inverted solution of comment #19.

Thanks. About the <nowiki> tag, though, I changed my mind: I think an alternative
syntax would be much better, for two reasons. Firstly, "nowiki" means "disable
wiki markup interpretation", which is something logically different from "ignore
locales". Secondly, there could be (admittedly rare) cases where one doesn't want to
reformat the date but still link it or part thereof; in this case <nowiki>16
January [[2006]]</nowiki> would not have the desired effect. So, if it
doesn't cause PHP implementation problems (I know pretty much nothing about it)
I would propose one of these:

a) \16 January [[2006]]     (one leading backslash)
b) \16 January [[2006]]\    (maybe easier to parse)


> [...]
> 
> Simetrical (comment #21), I think you misunderstood comment #19. Conversion
> would of course still be based on user preferences (if any), not on the IP
> number.
> - ~~~~

Yes :-) One has to log in for this to work (which isn't different from now).
Comment 24 David E. Siegel 2006-05-18 20:28:02 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > One problem with this is that dates in direct quotations should not be
> > reformatted. Thus anything inside a blockquote tag should also be exempt.
> > Possibly so should anything inside "double quotes ".
>
> Would cause way too much difficulty if a mark is omitted, or if the mark is
> being used to refer to itself ([[w:Quotation mark]], anyone?).  And it would be
> totally worthless for any wiki that didn't use that quotation mechanism (e.g.,
> used single quotes, or some other character entirely, or no character delimiter
> at all).  Basically, it's a Very Bad Thing to have ordinary characters act
> mainly as ordinary characters but secretly also as function characters.  Similar
> arguments apply to <blockquote>, which most people aren't going to expect to
> have the effect you mention even if it *is* markup.
>
> I think <nowiki> is probably best.

The problem is that there are lots and lots of EXISTING dates in quotes. A mechanism 
which would reformat them automatically and silently, and which would require that 
every such page be edited in order to preserve the integrity of quoted material, is IMO 
simply unacceptable. If your arguments are sound (and they do have considerable force) 
it merely shows that the "reverse" method of formatting all dates except those marked 
not to be formatted has an unacceptable result on established wikis.

Comment 25 Gennaro Prota 2006-05-19 12:00:05 UTC
(In reply to comment #24)

Come on, are you saying there are implementation problems with this? Dates in
quotes or whatever quotation mechanism specified at the configuration level
would not undergo localisation. Borderline cases would be fixed by those who
edit the article, as soon as they notice the problem. If you want to
nitpick there's the issue that some editors use quotes in place of double
primes, such as in

 5' 6" (5 feet, 6 inches)

and similarly for angle amplitudes. If this is the stance towards the problem I
think we'll never fix it.
Comment 26 David E. Siegel 2006-05-19 21:26:24 UTC
(In reply to comment #25)
> (In reply to comment #24)
>
> Come on, are you saying there are implementation problems with this? Dates in
> quotes or whatever quotation mechanism specified at the configuration level
> would not undergo localisation. Borderline cases would be fixed by those who
> edit the article, as soon as they notice the problem. If you want to
> nitpick there's the issue that some editors use quotes in place of double
> primes, such as in
>
>  5' 6" (5 feet, 6 inches)
>
> and similarly for angle amplitudes. If this is the stance towards the problem I
> think we'll never fix it.


What I am mostly saying is that I think an explicit syntax to mandate localization is 
better than an assumption that all dates will be localized unless there is markup to 
avoid localization (as suggested in #19 above). I am further saying that IF automated 
localization of all dates not marked in some way to display as written is used, then 
mechanisms which usually indicate explicit quotes (such as paired double-quotes, or the 
blockquote tag) should be considered as a form of markup preventing localization. When 
I made these points above, comment #21 suggested that the use of any such quoting 
mechanism to disable localization is a bad idea, and I was responding to that.

In short my views are:
1) A mechanism to specify localization of dates other than the overloading of linking 
is highly desirable;
2) Dates in explicitly quoted content should never be localized (even if they are 
linked);
3) If new syntax to specify localization of dates is created (such as <date>Date 
String</date> or <<<Date String>>> or the like), this would solve both 1) and 2) but 
existing dates would need to be converted to use the new markup;
4) An alternative is to localize ALL dates, except where markup indicates not to; 
this could be the <nowiki> tag, or a new similar tag, such as <nodateloc>;
5) If such "automatic localization" is applied to existing wikis with large amounts of 
content, such as Wikipedia, dates in direct quotes will be localized unless all the 
more common methods of indicating a quote are also taken to disable date localization;
6) It may well be desirable to link a date that is not to be localized. This could 
not be done if dates are localized by default AND the <nowiki> tag is used to stop 
such localization, as it also stops linking;
7) I therefore conclude that localization by default is less desirable than new 
localization syntax, but IF it is implemented then new syntax to turn this feature 
off, separate from the nowiki tag, is required; AND *if* localization by default is 
implemented, then paired double quotes, blockquote tags, and other common methods of 
indicating a direct quote MUST turn off this feature unless additional markup is used 
to explicitly indicate localization should proceed.
Comment 27 Matthew W. Jackson 2006-06-09 00:48:08 UTC
First of all, I do agree that over-linking to dates is a problem on Wikipedia. 
However, that is not the primary reason why I think the ability to format a date
to the user's locale without linking is important for MediaWiki.

Consider a Wiki other than Wikipedia that simply has no use for pages for each
year and date.  A smaller specialized wiki might want a date to display properly
but have no need for pages about "1994".  Without creating a page, I believe
that linked dates will display in red.

So, no matter what side of the issue you take on Wikipedia, this should be added
to MediaWiki.  Whether or not Wikipedia style guidelines recommend its usage is
irrelevant.

I can almost see a syntax being created to support many types of "literals,"
that format based on locale, such as numbers, measurements (units), etc.  But
dates are the most important due to all the ambiguity problems.
Comment 28 Rich Farmbrough 2006-07-14 22:35:22 UTC
Matthew is right, red links are a problem both on the smaller wikipedias (e.g.
ang) and for iso formatted dates even on en.
Comment 29 Ligulem 2006-09-14 18:54:49 UTC
How about doing <date>2006-09-14</date> in wiki-text (the date value in ISO)?
It would be displayed as per the wiki-defined default format (example: "14
September, 2006" for en.wikipedia) for *all* users (including anons), with
selectable override settings for users with logins.

And how about doing [[<date>2006-09-14</date>]] for the linked case?

Could this be implemented as an extension?
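
To make the proposal concrete, here is a hedged sketch of the expansion step only, assuming a per-wiki default format and an optional per-user override. The format names ("dmy", "mdy", "iso"), the expand_dates function, and the exact output strings are invented for illustration; a real extension would hook into the parser rather than use a regular expression over raw text.

```python
import re
from datetime import date

# Month names are spelled out here so the output does not depend on the
# process locale (strftime("%B") would).
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Hypothetical format table; the keys are assumptions, not MediaWiki options.
FORMATS = {
    "dmy": lambda d: f"{d.day} {MONTHS[d.month - 1]} {d.year}",
    "mdy": lambda d: f"{MONTHS[d.month - 1]} {d.day}, {d.year}",
    "iso": lambda d: d.isoformat(),
}

DATE_TAG = re.compile(r"<date>(\d{4}-\d{2}-\d{2})</date>")


def expand_dates(wikitext, wiki_default="dmy", user_pref=None):
    """Replace <date>YYYY-MM-DD</date> with text in the effective format."""
    fmt = FORMATS[user_pref or wiki_default]  # a logged-in user's choice wins

    def repl(match):
        try:
            parsed = date.fromisoformat(match.group(1))
        except ValueError:
            return match.group(0)  # e.g. 2006-02-30: leave the markup visible
        return fmt(parsed)

    return DATE_TAG.sub(repl, wikitext)
```

Anonymous readers get the wiki default because user_pref is None for them; the linked case, [[<date>...</date>]], is not handled in this sketch.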
Comment 30 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-09-14 21:30:18 UTC
It could be very easily implemented as an extension.  The syntax, however, is
very awkward to type, and it would be harder to get it to catch on.  The nice
thing about the links is that people notice them and tend to link them
themselves (erroneously extending the idea to lone years, but that's a separate
issue).  That's why most of the solutions have focused on automatic stuff.
Comment 31 Matthew W. Jackson 2006-09-14 21:44:24 UTC
I suggest everybody take a look at linked dates on Commons*.  The year is blue
and the month/day is red.  It'd be much nicer if the entire date was just black.

And I don't think this syntax would be awkward: <<2006-09-14>> or <<September
14, 2006>>.  We already have "{{" and "[[", so "<<" is a natural progression. 
I'm also not suggesting we remove the current link support, as sometimes you
want a date to be linked.

Another solution would be a new magic word and use this syntax:
{{date:2006-09-14}}.  The ParserFunctions extension, in combination with
templates, could also be made to fill this gap, and I've already requested such
support be added.  But that seems like a short-term solution to me.

I'd really prefer a unified syntax that doesn't need an extension.  The angle
bracket syntax seems perfect, as it could open the door to different types of
user preferences down the road (units, coordinates, etc.).

I couldn't care less if Wikipedia adopts non-linked dates (although it seems like
there's a good chance they would).  Seeing as Wikipedia has articles for every
day and every year, there's no big problem with the current solution.  But
MediaWiki != Wikipedia.

* see
http://commons.wikimedia.org/wiki/Image:Freebirds_Austin_Tech_Ridge_Libby.jpeg#Summary
for an example
Comment 32 Ligulem 2006-09-14 22:08:38 UTC
(In reply to comment #30)
> ..  The syntax, however, is very awkward to type, and it would be harder to
> get it to catch on.

The syntax <date>2006-09-14</date> could be layered under a template
({{date|2006-09-14}}) for the template likers. At least it wouldn't be as
invasive for the parser as <<2006-09-14>>. Also <date>2006-09-14</date> is more
extensible with more tags. We already have <ref>..</ref> from cite.php. And the
current date usage would still keep working.
Per the linking: A variant could be to say that <date>2006-09-14</date> is
always linked and <uldate>2006-09-14</uldate> is not (or something - just using
two different tags).
Comment 33 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-12-15 18:35:11 UTC
*** Bug 8267 has been marked as a duplicate of this bug. ***
Comment 34 Rich Farmbrough 2006-12-15 21:38:58 UTC
Note that Bug 8247 has unanimous support from at least 70 editors at WP:MosNum.
Comment 35 Tony Souter 2006-12-16 00:54:32 UTC
This request is signed by 70 Wikipedians at http://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style_%
28dates_and_numbers%29#List_of_supporters

Please create an additional syntax for autoformatting dates that does not make hyperlinks to 
date pages. The current syntax conflates the two independent functions of autoformatting and 
linking. The current syntax is simple; it would be an advantage if the additional syntax were 
also simple.

The new syntax is conceived not as a replacement but as an alternative, retaining (1) the 
option to link to a chronological article where useful, and (2) the validity of the huge number 
of date-links already marked up in the project.

There are significant advantages to allowing autoformatted dates to be black rather than blue, 
where there is consensus to do so in an article. Specifically, reducing the density of blued-out 
links will:

(1) improve the readability of the text;

(2) improve the aesthetic appearance of the text;

(3) remove low-value chronological links that may lead readers to pages that are irrelevant to 
an article;

(4) increase the prominence of high-value links;

(5) reduce the spill-over effect, in which editors feel they should link centuries, decades, and 
bare years, months and days of the week; and

(6) reduce conflict on Wikipedia.

Some signatories have suggested specific mark-ups, such as <<date>>, but on balance it is 
considered best that the engineers use their expertise to choose the most appropriate mark-
up.
Comment 36 Tony Souter 2006-12-16 00:56:57 UTC
Sorry, the link didn't come through properly above. Here it is:

http://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style_%28dates_and_numbers%29#List_of_supporters

Tony
Comment 37 William Allen Simpson 2006-12-16 12:12:46 UTC
Farmbrough, in addition to posting an incorrect bug number, 
also removed my email from the CC list.  Dirty trick! 

Of course, he's been repeatedly sanctioned for removing 
date links over the past year.  This partisan hackery 
should cease immediately!
Comment 38 Rob Church 2006-12-16 17:20:26 UTC
All right, let's not have those arguments here.
Comment 39 Tony Souter 2006-12-19 14:39:20 UTC
Dear developers

I have a responsibility to report back by the end of the month to the other 69 Wikipedians who have signed up to Comment #35 above. Please note that Comment #35 is not a 
mere restatement of the original comment, which appears to have been understood to request a change in the existing date-linking syntax. 

On the contrary, Comment #35 asks for the bug to be fixed through the creation of a ''parallel'' syntax to the existing date-linking syntax, thus avoiding problems with retro-
compatibility and users who wish to link dates.

We realise that the developers give their time and expertise ''gratis'', and are grateful for this; however, we would appreciate a substantive response to the comment. 
Comment 40 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-12-20 00:54:17 UTC
Lack of substantive response means that none of us feels like writing the code,
generally.  If any of the 70 Wikipedians who were willing to spend a few seconds
to sign a petition are willing and able to spend a few days or more writing the
appropriate code and revising it in response to criticism, it might (but might
not, admittedly) get approved.  I'm not willing to commit (no pun intended) to
review the patch, but another developer might be.

I would be against using double brackets for even more things that aren't links,
personally.  At least right now they actually denote links.  I'd be in favor of
either auto-formatting by default, switched off by <nowiki>; or use of a <date>
tag (which could be implemented much more cleanly, via extension hook).  As
noted above.
Comment 41 Rob Church 2006-12-20 17:07:55 UTC
I'm working on a quickie proof-of-concept for this using a parser hook and
hacked-up copy of DateFormatter.php. Whether or not I'll get anywhere is up in
the air, but this one's got my interest for now, so we'll see...
Comment 42 Aryeh Gregor (not reading bugmail, please e-mail directly) 2006-12-27 22:53:51 UTC
*** Bug 8404 has been marked as a duplicate of this bug. ***
Comment 43 Omegatron 2006-12-27 22:57:42 UTC
(In reply to comment #40)
> I would be against using double brackets for even more things that aren't links,
> personally.  At least right now they actually denote links.  I'd be in favor of
> either auto-formatting by default, switched off by <nowiki>

Agreed.  Using nowiki for the special cases is preferable.
Comment 44 Omegatron 2006-12-27 22:58:58 UTC
(In reply to comment #20)
> One problem with this is that dates in direct quotations should not be 
> reformatted.

Not necessarily.  It's better to just be explicit and use nowiki tags where needed.
Comment 45 Rob Church 2007-01-06 01:39:17 UTC
Just to follow up...most of the work for the DateParser class is now done, and
the FormatDates extension has been checked into Subversion. I need to find some
time to write an exhaustive set of test cases for it, and run some profiling
checks on it.

The extension, for those who weren't following, introduces a <date> tag.
Comment 46 Rich Farmbrough 2007-01-08 22:43:39 UTC
Very, very cool.
Comment 47 Omegatron 2007-01-08 22:54:21 UTC
(In reply to comment #45)
> The extension, for those who weren't following, introduces a <date> tag.

So every single date in an article (except direct quotations) is now going to
look like <date>2006-01-08</date> instead of [[2006-01-08]]?
Comment 48 Rob Church 2007-01-08 23:31:50 UTC
(In reply to comment #47)
> So every single date in an article (except direct quotations) is now going to
> look like <date>2006-01-08</date> instead of [[2006-01-08]]?

Existing text is left alone. Users have to use the date tags to wrap dates which
ought to be refactored.
Comment 49 Phil Boswell 2007-01-09 15:13:18 UTC
It's looking excellent.

The only minor wrinkle is that it can't cope with years in the format [[2006 in 
music|2006]]: it'll do the day+month component but you have to leave the year out of 
it. This should not present any particular problem unless your preference is set such 
that the order would be changed, in which case it won't (do you follow?). Since the 
year would perforce be linked in any case (duh!), you would have to be the most 
hardened anti-date nutter to object to the day/month being likewise.

HTH HAND
Comment 50 Rich Farmbrough 2007-10-15 12:47:24 UTC
What's the next step?  I can write some test cases if that's a problem. ~~~
Comment 51 Omegatron 2007-10-15 14:27:53 UTC
(In reply to comment #48)
> Existing text is left alone. Users have to use the date tags to wrap dates which
> ought to be refactored.

...which will be almost all of them.  The only dates you wouldn't want auto-formatted would be direct quotes and such, no?

If this hasn't been written yet, I just want to strongly re-assert my opinion that this should auto-detect dates and apply formatting to *ALL* of them, with <nowiki> tags used to escape from this behavior for the few exceptions that will exist.

(Dates have a limited vocabulary, which you will have to parse to reformat them anyway, and freely-licensed recognition code has probably already been written.  Ambiguous cases can just be left as they were typed, encouraging editors to change them into a format the parser recognizes with certainty.)

This is a wiki, remember?  The markup is supposed to be simple and clean, and the software is supposed to make sane assumptions about what the editor wants.  I know this has been severely eroded by Cite.php, but that doesn't mean we should just abandon the wiki concept altogether and <put>HTML-esque</parser> <hooks around="every"/> word.

For instance, there are a few instances where someone wants an asterisk at the beginning of the line without it becoming a list.  Should we give up and say "It's impossible to think of all the different exceptions!", and go back to HTML, forcing users to type <ul><li> around each item they *do* want listed (which is most of them)?  No.  We should use the asterisk to mean a list by default, and use nowiki tags for the rare exceptions, keeping the markup simple and easy to use for the vast majority of cases.
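
The auto-detection proposal above could be sketched roughly as follows, assuming that only two fully spelled-out forms count as "recognised with certainty" and that everything else, including numeric forms like 01/02/2006, is left exactly as typed. The autoformat function and its patterns are illustrative assumptions, not existing MediaWiki code.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Only unambiguous spellings: "14 September 2006" and "September 14, 2006".
DMY = re.compile(rf"\b(\d{{1,2}}) ({MONTHS}) (\d{{4}})\b")
MDY = re.compile(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{4}})\b")
# The capturing group makes re.split keep the <nowiki> spans in the result.
NOWIKI = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)


def autoformat(wikitext: str, pref: str = "dmy") -> str:
    """Reformat recognised dates outside <nowiki>; leave the rest untouched."""

    def fmt(day, month, year):
        if pref == "dmy":
            return f"{day} {month} {year}"
        return f"{month} {day}, {year}"

    parts = NOWIKI.split(wikitext)  # odd indexes hold the <nowiki> spans
    for i in range(0, len(parts), 2):
        text = DMY.sub(lambda m: fmt(m.group(1), m.group(2), m.group(3)), parts[i])
        text = MDY.sub(lambda m: fmt(m.group(2), m.group(1), m.group(3)), text)
        parts[i] = text
    return "".join(parts)
```

Because a full day, month, and four-digit year are required, the sentence from comment 16 ("The 2000 March of a Million Men ...") is not touched by this sketch, though a looser pattern would need more care.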
Comment 52 Tony Souter 2007-10-15 15:57:29 UTC
I'm very pleased for this matter to be re-opened and addressed. 

A thought occurred to me. Why is it that the dates displayed after our signatures are autoformatted yet avoid the bright blue of links, while it seems to be a technical hurdle to overcome the colour issue in the unfortunately coupled link/autoformatting function? Would it not be possible, as a quick fix, to import whatever programming is used to autoformat those signature dates without linking them into the normal autoformatting system?
Comment 53 Rob Church 2007-10-15 16:01:17 UTC
Signature dates are formatted once, and injected upon page save.

I have no further interest in doing more than tweaking and polishing this extension at the present time; I won't be rewriting it to take into account all free-standing dates, although I agree that the lack of support for this makes the extension almost useless, and was a significant design oversight.
Comment 54 cypsy 2007-10-17 21:53:49 UTC
To recap the issues (and I hope I'm getting this right):
1. Find a way to cause [[YYYY-MM-DD]] to be formatted without it also being a link
2. Generate a "pretty" format by default (for readers who are not logged in and hence don't have a date preference)

Addressing these:
1. The fundamental complaint here is that the blue links clutter up the page. 
Can't the [[YYYY-MM-DD]] strings be simply <span>ned in a class that would make the "links" *appear* as plain text?

2. The objection that there is no standard date format, while valid, does not preclude that WP arbitrarily choose one.
Although I'm European, I'm not entirely stupid :) and I am not baffled by a string that reads "October 17, 2007". Just as I am not going to be gasping for air when I read an article that uses "color" instead of "colour" or "favor" instead of "favour". No sensible editor has a problem with the format of the timestamps in signatures, so I see no reason why that can't be "acceptable" in general too.

These stop gap measures do not necessarily have to become permanent, but they can work for now, and would be an awful sight better than the present status quo. If #1 is implemented with a template, any short term solutions could even be "upgraded" by a bot when <date>2006-01-08</date> or [[Date:2007-10-17]] or {{#strftime:}} or an article-wide {{DATEFORMAT:}} magic word or whatever is finally implemented.

Both the stop gaps I've suggested above are (I think) implementable with templates.

My two cents...
C.
Comment 55 Omegatron 2007-10-17 22:51:24 UTC
(In reply to comment #54)
> 1. The fundamental complaint here is that the blue links clutter up the page. 
> Can't the [[YYYY-MM-DD]] strings be simply <span>ned in a class that would make
> the "links" *appear* as plain text?

Why do we use link syntax for this formatting feature in the first place?  Why would we continue to do this even after recognizing that it is not semantically correct?

> 2. The objection that there is no standard date format, while valid, does not
> preclude WP from arbitrarily choosing one.

It could guess the local format based on the geolocation of the IP requesting the page?
Comment 56 Rob Church 2007-10-17 22:56:33 UTC
(In reply to comment #55)
> Why do we use link syntax for this formatting feature in the first place?  Why
> would we continue to do this even after recognizing that it is not semantically
> correct?

* Someone made a bad decision once, which no doubt had some sane rationale at the time?
* No-one's changed the status quo at this time?

> It could guess the local format based on the geolocation of the IP requesting
> the page?

This might well introduce unacceptable caching dependencies, leading to cache fragmentation.
Comment 57 Omegatron 2007-10-17 23:12:05 UTC
(In reply to comment #56)
> This might well introduce unacceptable caching dependencies, leading to cache
> fragmentation.

I don't know what that means, but wouldn't it just have to cache two versions of each page, for unregistered users? 

Comment 58 cypsy 2007-10-23 04:45:48 UTC
(In reply to comment #57)
>I don't know what that means, but wouldn't it just have to cache two versions
>of each page, for unregistered users? 

That's not the kind of caching Rob is (presumably) referring to. Because the date format depends on "who" is viewing the page, this is one of the things that the server (i.e. cache) takes care of in "real time", that is, on its way from disk to the network. The dependency/caching issue in this case would be the IP table upon which geolocation would necessarily be based. This table - which is anything but small - is what would have to be synchronized across the caches.

In any case, using geolocation just to determine an "appropriate" date format is probably not going to happen...
1. it would be inappropriate for this purpose. For example, an AOL user would get a US date no matter where he/she was located.
2. it would be extremely stressful for the servers, way in excess of any perceived advantage.
3. the WP board would (I hope) never approve of *any* criticized practice, let alone for the sake of a darn date format.
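
The "two versions of each page" question from comment #57 can be made concrete with a minimal sketch. It assumes a cache whose key includes the date format served to the reader; the key layout, page titles, and format names are hypothetical and are not how Wikimedia's caches are actually keyed.

```python
from itertools import product


def cache_key(page_title: str, date_format: str) -> str:
    # Hypothetical key layout: one cached copy per (page, date format) pair.
    return f"{page_title}|datefmt={date_format}"


pages = ["Main_Page", "Apollo_11", "ISO_8601"]
formats = ["mdy", "dmy"]

keys = {cache_key(page, fmt) for page, fmt in product(pages, formats)}
print(len(keys))  # 6: every extra format multiplies the cached copies
```

Each additional dimension a page can vary on multiplies the number of stored copies in the same way, which is the fragmentation concern raised in comment #56.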
Comment 59 cypsy 2007-10-23 05:22:05 UTC
Rob: couldn't FormattableDate() take $wgLanguageCode and $wgAmericanDates into consideration when selecting a "default" format?
i.e. 
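  default: 
       // MediaWiki language codes are lowercase: 'en', 'ja' (Japanese), 'ko' (Korean)
       if ($wgLanguageCode !== 'en' || $wgAmericanDates == false) {
           // year-first for Japanese and Korean wikis
           if ($wgLanguageCode == 'ja' || $wgLanguageCode == 'ko')
               return 'Y F j';
           return 'F j, Y';
       }
       // otherwise leave the ISO form as typed
       return 'Y-m-d';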

or some such? That would at least take care of the "unpretty" issue for all the non-en wikis.
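
The same idea can be written as a small, testable function. This is a minimal sketch in Python rather than MediaWiki's PHP; the two parameters stand in for $wgLanguageCode and $wgAmericanDates, and the mapping from language to format is an illustrative assumption, not what FormattableDate() or MediaWiki actually does.

```python
def default_date_format(language_code: str, american_dates: bool) -> str:
    """Return a PHP date()-style format string for readers with no preference.

    The parameters are stand-ins for the settings named in the comment
    above; the mapping itself is an assumption made for illustration.
    """
    code = language_code.lower()
    if code in ("ja", "ko"):
        return "Y F j"   # year first
    if code == "en":
        return "F j, Y" if american_dates else "j F Y"
    return "Y-m-d"       # unknown language: leave the ISO form untouched


print(default_date_format("en", True))   # F j, Y
print(default_date_format("ja", False))  # Y F j
```

Keeping the choice in one function means a wiki could swap in its own table of defaults without touching the code that renders the date.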



Comment 60 Omegatron 2007-10-23 05:24:23 UTC
(In reply to comment #58)
> In any case, using geolocation just to determine an "appropriate" date format
> is probably not going to happen...
> 1. it would be inappropriate for this purpose. For example, an AOL user would
> get a US date no matter where he/she was located.
> 2. it would be extremely stressful for the servers, way in excess of any
> perceived advantage.
> 3. the WP board would (I hope) never approve of *any* criticized practice,
> let alone for the sake of a darn date format.

'k.  Just an idea.  :-) 

Comment 61 Brion Vibber 2007-12-03 21:30:02 UTC
I'm just going to make an executive decision and declare that having things that may or may not be dates in general body text should *not* be reformatted willy-nilly. Special magic links at least are already specially set off, but fiddling with general text can be a recipe for trouble. We already get enough complaints about things like the RFC links that I'd hate to have some-but-not-all-dates-and-also-things-that-resemble-dates being changed around.
Comment 62 Stephen Turner 2007-12-03 21:48:38 UTC
I completely agree with that, but I'd still like to see a syntax for formatting dates without linking them.
Comment 63 Omegatron 2007-12-03 22:33:42 UTC
(In reply to comment #61)
> I'm just going to make an executive decision

:-(

> We already get enough
> complaints about things like the RFC links

Really?

> that I'd hate to have
> some-but-not-all-dates-and-also-things-that-resemble-dates being changed
> around.

You're making a decision based on a fear of what *might* go wrong if implemented as poorly as possible, instead of just thinking about it and realizing it's not very difficult to do right.

It would only format things that are unambiguously dates, erring on the side of caution.  Ambiguous things like 01-01-01 would not be auto-formatted, enabling the editor to change them into an alternative format that *is* auto-formatted if that's the desired meaning.  Otherwise it will just be formatted as originally typed.  How is that a problem?  Editors can always change it to a standard format, like they do with unclosed tags or mistypings.

And, as always, <nowiki> tags would be used for the rare special cases where you don't want auto-formatting, like documentation about the date formatting itself and direct quotations.
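
A minimal sketch of the conservative behaviour described above, assuming that only four-digit-year ISO 8601 dates count as unambiguous and that <nowiki> spans are passed through untouched. The function name, the two style names, and the choice to leave the nowiki tags in the output are assumptions made for illustration; this is not MediaWiki parser code.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Capturing group so re.split keeps the <nowiki> spans in the result.
NOWIKI = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)
# YYYY-MM-DD that is not glued to other word characters or hyphens.
ISO_DATE = re.compile(r"(?<![\w-])(\d{4})-(\d{2})-(\d{2})(?![\w-])")


def _format(match: re.Match, style: str) -> str:
    year, month, day = (int(g) for g in match.groups())
    try:
        date(year, month, day)   # rejects impossible dates such as 2007-13-45
    except ValueError:
        return match.group(0)    # leave it exactly as typed
    if style == "dmy":
        return f"{day} {MONTHS[month - 1]} {year}"
    return f"{MONTHS[month - 1]} {day}, {year}"


def autoformat(text: str, style: str = "mdy") -> str:
    out = []
    for part in NOWIKI.split(text):
        if part.startswith("<nowiki>"):
            out.append(part)     # escaped by the editor: do not touch
        else:
            out.append(ISO_DATE.sub(lambda m: _format(m, style), part))
    return "".join(out)


print(autoformat("Born 2007-12-31, code 01-01-01.", "dmy"))
# Born 31 December 2007, code 01-01-01.
```

The ambiguous 01-01-01 never matches because the pattern requires a four-digit year, so it is displayed as originally typed.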

In the vast majority of cases, you want date formatting, so, since this is a wiki, the best solution is to automatically format general text.  This is a wiki, right?  I mean, if it's not anymore, just come out and say so.
Comment 64 Tony Souter 2007-12-04 00:48:08 UTC
Brion, many users have become fed up with the amount of time this essential reform process is taking. At MOSNUM, what had been a firm prescription to use date autoformatting has been watered down because of the multitude of problems with the system. If we can't even fix the entanglement with the linking system, which is the most complained-about issue, then I suspect there will be a push at MOSNUM to water it down further. If it goes on much longer, I, personally, will be inclined to encourage all nominators of featured article candidates to ditch the system completely.

That would be a pity.

Brion, I don't see evidence that you're offering much help at all. It's just the "wontfix" button. It's very frustrating. Can you suggest any other avenues?
Comment 65 Brion Vibber 2007-12-04 21:43:52 UTC
Comment #62: There is a parser function for date and time formatting, isn't there?

Comment #64: My personal recommendation would be to remove all date autoformatting and let a sane manual of style recommend the fairly standard international English form, eg '4 December 2008'. Of course that's too simple and obvious for Wikipedia. ;)
Comment 66 Omegatron 2007-12-04 21:51:34 UTC
(In reply to comment #65)
> Comment #64: My personal recommendation would be to remove all date
> autoformatting and let a sane manual of style recommend the fairly standard
> international English form, eg '4 December 2008'. Of course that's too simple
> and obvious for Wikipedia. ;)

I suspect MediaWiki is used on a lot of sites that don't follow the English Wikipedia's Manual of Style.
Comment 67 David E. Siegel 2007-12-04 22:23:40 UTC
(In reply to comment #61)
> I'm just going to make an executive decision and declare that having things
> that may or may not be dates in general body text should *not* be reformatted
> willy-nilly. Special magic links at least are already specially set off, but
> fiddling with general text can be a recipe for trouble. We already get enough
> complaints about things like the RFC links that I'd hate to have
> some-but-not-all-dates-and-also-things-that-resemble-dates being changed
> around.

I am inclined to agree that AUTOformat code is undesirable. Quite aside from the issue of false positives (which, as Omegatron says in comment #63, can probably be avoided), there are still the issues of links in direct quotes, and links which for other reasons should not be reformatted. It is not enough to say that nowiki tags would prevent the auto-formatting, both because quotes are often subject to wiki formatting (for indentation, among other reasons) and because of the large mass of existing text which should not be auto-formatted. If automatic date formatting is done, it should at a minimum include a disabling syntax specific to this feature, different from the general nowiki tag. Better, IMO, would be taking all indications that tend to introduce direct quotes and also treating them as disabling syntax; better still is probably not having any such feature.

HOWEVER, I still DO think that an *explicit* syntax for invoking date formatting, not tied to a link, would be a major plus. Whether or not there is a practical way to specify a default for users who are not logged in, such a feature would IMO have significant benefits. If a useful default can be specified, that would be even better. This would be a major help in Wikipedia, where dates are often overlinked to get formatting. It would be even better on smaller wikis that use the software, which are unlikely to have any use for pages for every possible day and year, and so must have redlinks if date formatting is to be used.

I urge the developers to re-open the original request for an explicit non-link date formatting syntax (such as <<DateHere>> or <date>DateHere</date>, but the precise format really is a much less important issue), quite apart from the issue of general auto-formatting for existing plain-text dates, and not simply mark this as "won't fix" because of disputes over the auto-formatting aspect.
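
A minimal sketch of what such an explicit, non-link syntax could do, using the <date>DateHere</date> form mentioned above. The function name, the style names, and the site_default parameter (the fallback for readers who are not logged in) are hypothetical; nothing here reflects an actual MediaWiki extension.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Group 1 is the raw text between the tags; groups 2-4 are year, month, day.
DATE_TAG = re.compile(r"<date>((\d{4})-(\d{2})-(\d{2}))</date>")

STYLES = {
    "mdy": lambda y, m, d: f"{MONTHS[m - 1]} {d}, {y}",
    "dmy": lambda y, m, d: f"{d} {MONTHS[m - 1]} {y}",
    "iso": lambda y, m, d: f"{y:04d}-{m:02d}-{d:02d}",
}


def render_dates(wikitext, user_pref=None, site_default="dmy"):
    """Render only explicitly tagged dates; plain-text dates are untouched."""
    style = STYLES[user_pref or site_default]

    def replace(match):
        year, month, day = (int(g) for g in match.groups()[1:])
        if not (1 <= month <= 12 and 1 <= day <= 31):
            return match.group(1)   # not a plausible date: show it as typed
        return style(year, month, day)

    return DATE_TAG.sub(replace, wikitext)


print(render_dates("Launched <date>2006-01-08</date>."))
# Launched 8 January 2006.
print(render_dates("Launched <date>2006-01-08</date>.", user_pref="mdy"))
# Launched January 8, 2006.
```

Because only tagged dates are touched, quoted text and existing articles render exactly as before until an editor opts in.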
Comment 68 David E. Siegel 2007-12-04 22:58:43 UTC
I should be clear -- such a new syntax would be in *addition* to the current link-based syntax, not in total replacement of it. Whether it eventually becomes favored is another issue.
Comment 69 Omegatron 2007-12-04 23:58:44 UTC
(In reply to comment #67)
> both because quotes are
> often subject to wiki formatting (for indentation, among other reasons)

Nowiki tags don't work inside formatting tags?

> and
> because of the large mass of existing text which should not be auto-formatted.

Examples?  This wouldn't change any source code of articles, you realize.  It just formats them for display based on user preferences (which will probably be "off" by default for non-logged in users, as it currently is).

Do you mean text inside direct quotes?  How many instances like this do you estimate there are, even in the entire text of the English Wikipedia?  It's not going to autoformat "four score and seven years ago".
Comment 70 David E. Siegel 2007-12-05 18:59:05 UTC
(In reply to comment #69)
> (In reply to comment #67)
> > both because quotes are
> > often subject to wiki formatting (for indentation, among other reasons)
> Nowiki tags don't work inside formatting tags?

True, it would be possible to have nowiki tags tightly bound to the actual date strings, and have any formatting codes outside such tags. However, if a special syntax is used to disable date localization, and ONLY date localization, it could be automatically joined to blockquote formatting with a template. I think that such special nodateloc tags would be cleaner and clearer, and would permit a date that should be linked, but should not be reformatted/localized (this is probably rare, but would occur upon occasion). 

> > and
> > because of the large mass of existing text which should not be auto-formatted.
> Examples?  This wouldn't change any source code of articles, you realize.  It
> just formats them for display based on user preferences (which will probably be
> "off" by default for non-logged in users, as it currently is).
> Do you mean text inside direct quotes?  How many instances like this do you
> estimate there are, even in the entire text of the English Wikipedia?  It's not
> going to autoformat "four score and seven years ago".

I would estimate, admittedly without real evidence, that there are tens of thousands of date strings in directly quoted content in the en Wikipedia alone, and hundreds of date strings where the format is being mentioned, not just used, and so should not be reformatted either. There are also the many date strings on talk pages created by signatures, and by templates which act like signatures. I think that none of these should be reformatted, although that might be debated. There are millions of these, at least.

I also think that if auto-formatting is to be implemented, and to be of significant value, there probably should be a non-empty default date preference set for non-logged-in users, either a single one for any given wiki, which becomes that wiki's standard, or one somehow based on the IP address. I understand that the latter may add excessive bandwidth and cache problems, at least for large wikis with many users, such as the major Wikipedias. Such wikis would probably opt for a single default style for non-logged-in users. But my views remain unchanged even if no such default preference is to be implemented.

My views on auto-formatting are, in short:

1) Changing the effective display of many pages after the fact is at least possibly unwise.
2) Changing displayed formats in existing directly quoted content is completely unacceptable.
3) Requiring a search of all existing text for directly quoted date strings and other instances of date strings that should not be reformatted is, at least, a significant burden on editors of wikis with large existing content, such as the major Wikipedias.
4) Therefore, if auto-formatting is implemented, tags such as blockquote and paired double quotes (or perhaps paired single quotes) should disable auto-formatting. Which tags serve this function perhaps should be configurable on a wiki-by-wiki basis.
5) If auto-formatting of date strings is to be implemented, considerable care must be taken to avoid and test for false positives, and there should be a commitment by the responsible developers to fix the autoformat code if instances of false positives are found.

My views on some new way to format date strings to match preferences are:
1) The overloading of link syntax for this purpose is, at best, awkward and poor design.
2) The link-based syntax encourages overlinking. While there may be debates on when dates are valuable links, and when they are not, almost everyone will agree that multiple occurrences of date strings, all linked, in close textual proximity to each other, when the dates are identical or close in time, is overlinking. (I personally think the vast majority of dates should be unlinked, but that is an opinion that is controversial.)
3) The link-based syntax is quite awkward for some date formats, particularly the "Month dd, YYYY" format, as it leads to a link to a page about a particular calendar date unconnected with a year, which is unlikely to be useful metadata, and requires the editor to create two links for what is logically a single entity.
4) On smaller wikis, the date-based syntax causes redlinks which are even more distracting and confusing for the reader than bluelinks, and invite the creation of pages for specific dates which may be of no value to a wiki more specialized than a Wikipedia instance.
5) The ability to specify date reformatting using a non-link-based syntax would avoid these problems. So would auto-formatting all date strings.
6) A non-link-based date syntax might encourage the use of the non ambiguous ISO date format, particularly if a default date preference is assigned to non-logged-in users.
7) Therefore, either a new non-link-based syntax forcing localization, or auto-formatting for the same purpose, should be implemented. The first choice is simpler, and will cause fewer problems, but will require editorial effort before it makes a difference anywhere. The second will have wide effects at once, but at least some of those will be undesirable, and require editorial efforts to fix.
Comment 71 Tony Souter 2007-12-08 04:34:46 UTC
Could I remind participants that some 80 Wikipedians signed the original petition to have the system fixed (eight more have signed up since, and further promotion of the existence of this page would be very likely to generate a lot more signatures).

Tony1
Comment 73 Brion Vibber 2007-12-18 02:27:25 UTC
Let's keep it open for now and see what happens...
Comment 74 Voyagerfan5761 / dgw 2007-12-31 02:42:51 UTC
Speaking from the perspective of a non-Wikipedia MediaWiki user (my own sandbox wiki), tying the date formatting to links is annoying in that it generates tons of unnecessary links that are probably not relevant to the project. This has been covered above.

Also covered above is the possibility of a parser hook, which Rob wrote as an extension. I don't think that's a particularly viable option, because XHTML-style tags are often inconvenient to type. I myself am OK with them, as I write a lot of HTML-style code, but I expect most users of MediaWiki don't do so, and would find using the < and > keys for dates all the time cumbersome.

<<DateHere>>-type constructs, also as mentioned above, use similar syntax to the current links, and would add no additional article size (not that size is a major concern in this case). It does use the < and > keys again, which as I've stated, could be inconvenient for "normal" users.

Now, I've tried to come up with some ideas for alternatives, not the least of which is to simply use regular expressions to look for common date formats and reformat them, excluding quoted text (probably). I don't know any off the top of my head, but I'm sure there are established regexes to match common date formats; and really, why should the software have to deal with possible date formats like 27 Dec 90 - that would be crazy to try to implement reliably.

Another idea is to use something like d2007-12-31d (dDecember 31, 2007d) to denote dates. How often is it that a string exactly matching the format of a common date arrangement is wrapped in the 'd' character? Or perhaps we could use ?2007-12-31? or ?d2007-12-31?d (the latter allowing for other preferences like ?m21 lbs.?m for measurements). They make the source a little harder to read, but at least the ones with question marks are set off a bit.

So those are my ideas for now. Anyone have feedback?
Comment 75 Tony Souter 2007-12-31 04:02:08 UTC
Hey Voyager, thanks for your suggestions. << and >>, or the single pair, would be no problem for Wiki users. As for the string that starts and ends with "d", it's fine by me, but others more familiar with the technical language will comment if they see a problem.

It would be great to get this thing off the ground. We desperately need to resolve the bug so that we can all move on from the tension at WP about the autoblotching of full dates.

Tony
Comment 76 Omegatron 2007-12-31 04:11:21 UTC
(In reply to comment #74)
> Also covered above is the possibility of a parser hook, which Rob wrote as an
> extension. I don't think that's a particularly viable option, because
> XHTML-style tags are often inconvenient to type. I myself am OK with them, as I
> write a lot of HTML-style code, but I expect most users of MediaWiki don't do
> so, and would find using the < and > keys for dates all the time cumbersome.

Agreed.
 
> Now, I've tried to come up with some ideas for alternatives, not the least of
> which is to simply use regular expressions to look for common date formats and
> reformat them, excluding quoted text (probably). I don't know any off the top
> of my head, but I'm sure there are established regexes to match common date
> formats; and really, why should the software have to deal with possible date
> formats like 27 Dec 90 - that would be crazy to try to implement reliably.
> 
> Another idea is to use something like d2007-12-31d (dDecember 31, 2007d) to
> denote dates.

Or just say that anything in ISO 8601 format should be automatically formatted.  How often do those occur in quotations or text that shouldn't be auto-formatted?
Comment 77 Voyagerfan5761 / dgw 2007-12-31 04:17:34 UTC
I guess I visited this bug because I've been mildly frustrated by this omission (can I call it that?) in MediaWiki. I'd like auto-formatted dates on my wiki, but I don't want to make articles for every day and year. On Wikipedia it's at least within the scope of the project; within my local wiki, it's just note-taking and (eventually) a project wiki.

If others don't think the < and > would be inconvenient to type, then by all means we should go ahead and implement markup like the <<Date Here>> syntax from above. Thinking over my suggestions for character delimiters, only the ?x "tags" seem like they'd be really appropriate; just having letters would probably be confusing.

Actually, what if they were to have syntax like d?2007-12-31?d, or d?2007-12-31?, or even d?2007-12-31?d ? (Whoa, that's confusing placement on that last one!) I think symmetry is an integral part of "now it's on, now it's off" markup like HTML and wikicode (the second option above notwithstanding).

On a side note, I've changed the bug summary to better reflect what we actually want, which is simply a way to autoformat dates without linking.

UPDATE before post: I just got an email about Omegatron's comment (comment #76), and that's not a bad idea. ISO 8601 is a good way to detect dates. There would have to be some escaping of template parameters so things like {{cite web}} don't get broken by the autoformatting (the template takes care of rendering on output), but that's actually a ''very'' good idea. If we want linked dates, we simply add the same [[brackets]] around them as we do now; unlinked ISO 8601 dates get formatted as well, and anything else is displayed as typed unless linked (in which case the current parsing would apply). Also, the signature timestamps should be changed to use the ISO 8601 format (e.g. output 23:34, 2007-12-31) if this method is adopted.
Comment 78 Omegatron 2007-12-31 04:26:42 UTC
(In reply to comment #77)
> UPDATE before post: I just got an email about Omegatron's comment (comment
> #76), and that's not a bad idea. ISO 8601 is a good way to detect dates. There
> would have to be some escaping of template parameters so things like {{cite
> web}} don't get broken by the autoformatting (the template takes care of
> rendering on output)

No.  The template would be updated to let the autoformatting do its job.  It already recommends to use ISO 8601 in the template parameters.

I've already been researching this, though, and *proper* ISO 8601 might only be useful from the years 1582 to 9999?  After 9999 you are supposed to write it as +12345-01-01, and before 1582 you are supposed to convert the date to the proleptic Gregorian calendar.  So 0826-03-16 is actually 12 March 826.  :-D

http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style_(dates_and_numbers)#Calendars
http://www.tondering.dk/claus/cal/node4.html
http://www.w3.org/International/questions/qa-date-format
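
The 0826-03-16 example can be checked with a short sketch that converts a proleptic Gregorian date to the Julian calendar by way of the Julian Day Number. The function names are made up for this illustration, and the integer formulas are the commonly published ones for years in the Common Era, not anything taken from MediaWiki.

```python
def gregorian_to_jdn(year: int, month: int, day: int) -> int:
    """Julian Day Number of a (proleptic) Gregorian calendar date."""
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


def jdn_to_julian(jdn: int) -> tuple:
    """Julian calendar (year, month, day) for a Julian Day Number."""
    c = jdn + 32082
    d = (4 * c + 3) // 1461
    e = c - (1461 * d) // 4
    m = (5 * e + 2) // 153
    day = e - (153 * m + 2) // 5 + 1
    month = m + 3 - 12 * (m // 10)
    year = d - 4800 + m // 10
    return (year, month, day)


def gregorian_to_julian(year: int, month: int, day: int) -> tuple:
    return jdn_to_julian(gregorian_to_jdn(year, month, day))


print(gregorian_to_julian(826, 3, 16))    # (826, 3, 12)
print(gregorian_to_julian(1582, 10, 15))  # (1582, 10, 5)
```

In the ninth century the two calendars differ by four days, which is why a strictly ISO-conforming 0826-03-16 corresponds to 12 March 826 in the Julian calendar that sources of that period actually used.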
Comment 79 Voyagerfan5761 / dgw 2007-12-31 04:41:36 UTC
(In reply to comment #78)
> (In reply to comment #77)
> > UPDATE before post: I just got an email about Omegatron's comment (comment
> > #76), and that's not a bad idea. ISO 8601 is a good way to detect dates. There
> > would have to be some escaping of template parameters so things like {{cite
> > web}} don't get broken by the autoformatting (the template takes care of
> > rendering on output)
> 
> No.  The template would be updated to let the autoformatting do its job.  It
> already recommends to use ISO 8601 in the template parameters.
> 
> I've already been researching this, though, and *proper* ISO 8601 might only be
> useful from the years 1582 to 9999?  After 9999 you are supposed to write it as
> +12345-01-01, and before 1582 you are supposed to convert the date to the
> proleptic Gregorian calendar.  So 0826-03-16 is actually 12 March 826.  :-D
> 
> http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style_(dates_and_numbers)#Calendars
> http://www.tondering.dk/claus/cal/node4.html
> http://www.w3.org/International/questions/qa-date-format
> 

It just seems that the template would take e.g. accessdate and do [[2007-12-31]], letting the autoformatting on links apply normally as it does now, actually. My misspeak (?; that doesn't sound right).

Regarding the date formatting before 1582: Ew. I don't want to think about that, honestly. Of course, somebody has to think about it, which is why we pay our devs the big bucks (oh wait... :D).

If we autoformat dates, the software is going to have to handle them somehow. So dates before 1582 can be assumed to be proleptic or they can be assumed to be Julian. But not both, so far as I know. If MediaWiki does a simple comparison on the year and branches based on what year is used, performing optionally an additional conversion from Julian to proleptic Gregorian dates based on user preferences (that'd have to be added if we choose this option - sorry Brion), then dates that are already in the proleptic Gregorian calendar would be converted incorrectly.

Probably the best way to handle it is to just format the date, without any calendar conversion, and let the surrounding wikitext specify what calendar the date obeys. Perhaps it's not as easily read by machines, if MediaWiki eventually supports the scraping of metadata like dates and measurements from wiki articles, but it's probably easier to code and maintain (and for backwards compatibility and ease of use, for that matter).

I think that's "proper" ISO 8601, since actual proper ISO 8601 (sans quotes) is defined as exactly what you say it is, with a plus before years with more than four digits. But we have 8,000 years to worry about plussed dates. (Unless someone's planning on writing future history?)
Comment 80 Tony Souter 2007-12-31 04:51:47 UTC
Really, the safest thing to do is not change the raw date that is input. Let's keep away from ISO, please, which is not favoured by Wikipedia. Can't we just substitute exactly the double square brackets currently in use with the arrows? So:

 <<September 27>>, <<1980>> 

or 

<<27 September>> <<1980>>

depending on whether US or international formatting is used in the article? That means no one will have to learn new ways of inputting the autoformatting code. This is critical if it's to gain wide support. Let's not complicate matters.

Tony
Comment 81 Omegatron 2007-12-31 06:02:59 UTC
(In reply to comment #79)
> It just seems that the template would take e.g. accessdate and do
> [[2007-12-31]], letting the autoformatting on links apply normally as it does
> now, actually. My misspeak (?; that doesn't sound right).

Yes.

> Probably the best way to handle it is to just format the date, without any
> calendar conversion, and let the surrounding wikitext specify what calendar the
> date obeys.

Yes.

> Perhaps it's not as easily read by machines, if MediaWiki
> eventually supports the scraping of metadata

Yeah, I guess that would be the only concern.  But if it's scraping dates, it can be smart enough to not scrape things incorrectly before a certain date.

So 0826-03-16 would behave just like [[0826-03-16]] does now, and display as "March 16, 826".  Easy and simple.

(I just checked, and [[826-03-16]] doesn't work, by the way.)

> (Unless someone's planning on writing future history?)

You mean like this?

http://en.wikipedia.org/wiki/11th_millennium_and_beyond#Simultaneous_occurrence_of_solar_eclipse_and_transit

(In reply to comment #80)
> Really, the safest thing to do is not change the raw date that is input. Let's
> keep away from ISO, please, which is not favoured by Wikipedia.

Not favoured by *you*, you mean.  ISO 8601 is the preferred format for citation templates and tables on Wikipedia.  Or do you mean in article prose?  You do realize what we're talking about, right?  Using ISO format as a form of "source code", which is then formatted on article display in your preferred written-out format.

> Can't we just
> substitute exactly the double square brackets currently in use with the arrows?
> So:
> 
>  <<September 27>>, <<1980>> 

Hard to type, and confusingly similar to HTML, just like the <date> tags.  Not a good solution and not very wiki-like.
Comment 82 Tony Souter 2007-12-31 07:23:37 UTC
Not favoured by *me*? Well, I'm referring only to this, in WP's MOS:

"ISO 8601 dates (1976-05-13) are uncommon in English prose and are generally not used in Wikipedia. However, they may be useful in long lists and tables for conciseness and ease of comparison."

That's all. Now I see what you're talking about, though. That's fine if you mean to avoid the current autodud system, you just input an ISO date: no problem.

The double arrows, IMV, are not at all similar to single arrows unless you're almost blind and refuse to wear spectacles. I don't find the arrows hard to type, and don't see why others would. 

However, I don't care what symbols are used, as long as they're short and simple and easy. Please go ahead.
Comment 83 Voyagerfan5761 / dgw 2007-12-31 14:19:25 UTC
(In reply to comment #81)
> > (Unless someone's planning on writing future history?)
> 
> You mean like this?
> 
> http://en.wikipedia.org/wiki/11th_millennium_and_beyond#Simultaneous_occurrence_of_solar_eclipse_and_transit

Ugh, I never saw that article. Well, perhaps we ''should'' press for an extra (\+\d{1,3})? in whatever regex ends up in the code to detect these dates after all. Can we leave it at seven-digit years for now? Of course now someone will pop up with an article that mentions a date in the year 802701000... :P
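
A minimal sketch of a detection pattern with that optional prefix, assuming the expanded form is written as a plus sign, one to three extra digits, then the usual four-digit year (so at most seven digits). The lookarounds and the function name are additions for this illustration; the pattern does not validate month or day ranges.

```python
import re

# Optional "+" and 1-3 extra year digits, then YYYY-MM-DD. The lookarounds
# stop the pattern from matching inside a longer run of digits or hyphens.
ISO_EXTENDED = re.compile(
    r"(?<![\w+-])(?:\+\d{1,3})?\d{4}-\d{2}-\d{2}(?![\w-])"
)


def find_dates(text: str) -> list:
    return ISO_EXTENDED.findall(text)


sample = "Eclipse on +12345-01-01, filed 2007-12-31, not +12345678-01-01."
print(find_dates(sample))  # ['+12345-01-01', '2007-12-31']
```

An eight-digit year is skipped entirely rather than partially matched, which keeps the detector on the cautious side.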

> 
> Not favoured by *you*, you mean.  ISO 8601 is the preferred format for citation
> templates and tables on Wikipedia.  Or do you mean in article prose?  You do
> realize what we're talking about, right?  Using ISO format as a form of "source
> code", which is then formatted on article display in your preferred written-out
> format.
> 
> > Can't we just
> > substitute exactly the double square brackets currently in use with the arrows?
> > So:
> > 
> >  <<September 27>>, <<1980>> 
> 
> Hard to type, and confusingly similar to HTML, just like the <date> tags.  Not
> a good solution and not very wiki-like.
> 

For both of the above, exactly my feelings. Thanks for putting them into words for me; I was trying and failing to think of how to respond last night.
Comment 84 Voyagerfan5761 / dgw 2007-12-31 14:31:02 UTC
(In reply to comment #82)
> Not favoured by *me*? Well, I'm referring only to this, in WP's MOS:
> 
> "ISO 8601 dates (1976-05-13) are uncommon in English prose and are generally
> not used in Wikipedia. However, they may be useful in long lists and tables for
> conciseness and ease of comparison."
> 
> That's all. Now I see what you're talking about, though. That's fine if you
> mean to avoid the current autodud system, you just input an ISO date: no
> problem.
> 
> The double arrows, IMV, are not at all similar to single arrows unless you're
> almost blind and refuse to wear spectacles. I don't find the arrows hard to
> type, and don't see why others would. 
> 
> However, I don't care what symbols are used, as long as they're short and
> simple and easy. Please go ahead.
> 

Graah, I was going to do a double reply just like Omegatron but forgot to click the second link. I feel so bad for double posting. But I won't dwell on it.

I don't think we should add more markup codes unless there's a reason to do so. <date> tags would be awfully long, and <<2007-12-31>> looks funny (to me; other reasons below). There is the capability to detect dates automatically during parsing if we insist on a single format, like ISO 8601, that is not separated by spaces. The whole regex could (and probably should) be wrapped in \b(regex for date)\b to help avoid false matches. Er, or am I wrong and hyphens-are-the-same as spaces where \b is concerned?
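
On the \b question, a small Python sketch (Python's re module treats \b much as PCRE does here): a hyphen is a non-word character, so \b sees it the same way it sees a space. The STRICT pattern is a hypothetical tightening, not something proposed in this thread.

```python
import re

# \b matches between a word character (letter, digit, underscore)
# and anything else, so a hyphen counts as a boundary just like a space.
LOOSE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Hypothetical stricter variant: also refuse a hyphen or word
# character directly before or after the date.
STRICT = re.compile(r"(?<![\w-])\d{4}-\d{2}-\d{2}(?![\w-])")

def find_dates(pattern, text):
    """Return every substring of text that the given pattern matches."""
    return pattern.findall(text)
```

With LOOSE, a string such as "555-2007-12-31-9" still yields a match on the embedded date, while STRICT rejects it; both accept a date followed by ordinary punctuation.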

For the <<double arrows>>, I personally find that they make the wikitext harder to read on the edges, since the symbols and the numbers on each end seem to blend. Perhaps I have a funny monitor or something, but that's my reasoning (in addition to the "looks funny" mentioned above).

Really, if all the dates in articles were converted to ISO 8601 in the source code, it would be one step closer to semantic MediaWiki, even if there's no markup around them, because they'd all be consistent. I can see a new Help: page or two if this is implemented to get newbies accustomed to  ISO dates...
Comment 85 Omegatron 2007-12-31 14:49:40 UTC
(In reply to comment #82)
> That's all. Now I see what you're talking about, though. That's fine if you
> mean to avoid the current autodud system, you just input an ISO date: no
> problem.

Yes, exactly.  And for the rare cases like tables where ISO format is the desired display, just use nowiki tags.

(In reply to comment #83)
> Ugh, I never saw that article. Well, perhaps we ''should'' press for an extra
> (\+\d{1,3})? in whatever regex ends up in the code to detect these dates after
> all.

Might as well.

> Can we leave it at seven-digit years for now? Of course now someone will
> pop up with an article that mentions a date in the year 802701000... :P

Don't worry *too* much about it.  The current system only works on ISO dates up to December 99th, 9999, after all.  :-)
Comment 86 Stephen Turner 2007-12-31 19:45:23 UTC
Although I am a strong proponent of separating date syntax from linking syntax, I also strongly disagree with the idea that dates, even just ISO 8601 dates, should automatically be magical. For a start, we shouldn't go adding magic to all the dates that are already in the encyclopaedia. Much more importantly, I believe Brion has already vetoed this approach in comment 61 above.
Comment 87 Voyagerfan5761 / dgw 2007-12-31 20:06:13 UTC
(In reply to comment #86)
> Although I am a strong proponent of separating date syntax from linking syntax,
> I also strongly disagree with the idea that dates, even just ISO 8601 dates,
> should automatically be magical. For a start, we shouldn't go adding magic to
> all the dates that are already in the encyclopaedia. Much more importantly, I
> believe Brion has already vetoed this approach in comment 61 above.
> 

Most of the dates that are in the encyclopedia aren't in ISO 8601 format. The ones that are in the text already tend to be formatted as "October 23, 2007" or "15 April, 1952"; stuff like that. We're talking about ''only'' auto-formatting dates in the format yyyy-mm-dd; nothing else. Brion's comment doesn't apply, as far as I can tell, because what besides a date is formatted like an ISO 8601 date string?
Comment 88 Omegatron 2007-12-31 20:21:30 UTC
(In reply to comment #87)
> Most of the dates that are in the encyclopedia aren't in ISO 8601 format.

As Tony pointed out, these are already recommended against in article text, so it's not like there's going to be many false positives there.  The only cases I know of where they are in use and should *not* be autoformatted is tables (which are relatively rare and can be nowikied).

Can anyone think of any other examples of potential conflicts?   Certainly there will be a few weird cases, but I don't know why people are portraying this as the end of the world.  It's just the formatting of dates.  Changes to templates or the software break stuff much more dramatically than this all the time, and editors have it fixed in a short period of time.
Comment 89 Voyagerfan5761 / dgw 2007-12-31 20:25:44 UTC
(In reply to comment #88)
> Can anyone think of any other examples of potential conflicts?   Certainly
> there will be a few weird cases, but I don't know why people are portraying
> this as the end of the world.  It's just the formatting of dates.  Changes to
> templates or the software break stuff much more dramatically than this all the
> time, and editors have it fixed in a short period of time.
> 

Somehow I'm reminded of the UNIQ bug from November... Even that wasn't as big a deal as all that. We made lists, got editors involved in fixing the pages, and life went on. The same thing will happen if the new date formatting breaks something. It'll get found and corrected.

I'm not sure about whether the formatter should be disabled inside tables or not. Since that's the most likely place for intended ISO dates, it could be practical, but on the other hand it's possible that someone will want auto-formatting dates in a table. Probably best to have it opt-out (e.g. <nowiki> or even something like <nofmt>) no matter where if it's going to do the plain date thing.
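
A rough Python sketch of the opt-out idea, assuming bare ISO dates are reformatted everywhere except inside <nowiki> spans. The function names and the fixed day-month-year output are hypothetical, and the real parser would strip the nowiki tags later; this sketch leaves them in place.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# The capturing group makes re.split keep the nowiki spans in the result.
NOWIKI = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)
ISO_DATE = re.compile(r"\b(\d{4})-(0[1-9]|1[0-2])-(0[1-9]|[12]\d|3[01])\b")

def format_dmy(m):
    """Turn a matched yyyy-mm-dd date into 'd Month yyyy'."""
    return f"{int(m.group(3))} {MONTHS[int(m.group(2)) - 1]} {int(m.group(1))}"

def autoformat(wikitext):
    """Reformat bare ISO dates, leaving <nowiki> spans untouched."""
    out = []
    for i, part in enumerate(NOWIKI.split(wikitext)):
        # Odd-indexed parts are the captured nowiki spans.
        out.append(part if i % 2 else ISO_DATE.sub(format_dmy, part))
    return "".join(out)
```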
Comment 90 Omegatron 2007-12-31 20:28:36 UTC
(In reply to comment #89)
> Probably best to have it opt-out (e.g.

<nowiki>2007-12-31</nowiki>

I don't see why this is seen as such a big deal.
Comment 91 Stephen Turner 2007-12-31 20:41:59 UTC
It's a deal (I won't say a big deal, but a deal nonetheless) because there are already some ISO dates in the encyclopaedia that shouldn't be magicked, and so it will cause complaints.

I also think it's just as easy to train new users to write (something like) "<date>January 1, 2008</date>" or "<<January 1, 2008>>" as "2008-01-01". Neither is the way they'd naturally write it or read it, after all.

Also, if ISO dates become magical, we'd have to choose a default date preference for unregistered and new users. That's not necessarily a bad idea in itself, but it's a change, and people may have views about which date preference to choose.

To me it makes a lot more sense that if an editor wants magic to happen, they should ask for magic. Magic shouldn't happen behind their backs because it will tend to have unexpected and undesired consequences occasionally -- not in the majority of cases, but enough to be noticed. This is the way that pretty much all the wiki syntax works -- you have to ask for it.
Comment 92 Voyagerfan5761 / dgw 2007-12-31 21:02:30 UTC
(In reply to comment #91)
> To me it makes a lot more sense that if an editor wants magic to happen, they
> should ask for magic. Magic shouldn't happen behind their backs because it will
> tend to have unexpected and undesired consequences occasionally -- not in the
> majority of cases, but enough to be noticed. This is the way that pretty much
> all the wiki syntax works -- you have to ask for it.
> 

Magic like the automatic <pre> tags around lines that start with a space? A lot of people don't ask for that; I've seen a lot of cases where there's been an unintentional space at the beginning of a line which makes the whole paragraph scroll off to the right. And don't we have a Preview button to help avoid mistakes stemming from unintentional markup interpretation?
Comment 93 Omegatron 2008-01-01 00:09:20 UTC
(In reply to comment #91)
> To me it makes a lot more sense that if an editor wants magic to happen, they
> should ask for magic.

And we should remove the auto-linking of ISBN numbers, because there are a few cases where they should not link to the Book sources page.  See http://en.wikipedia.org/wiki/International_Standard_Book_Number#Check_digit_in_ISBN-10 for example.

Oh, and we should also remove the auto-linking of URLs, because sometimes people want them to just be written out without actually becoming a link.  See http://en.wikipedia.org/wiki/URL_normalization#Normalization_process for some examples.

And we should remove the auto-linking of RFC numbers, since there are some cases where this conflicts and people don't want an actual link.  See ... well, I can't find any examples.  But I'm sure there are a few cases where people don't want them to be linked.

We certainly don't want to confuse users by linking these things every time, when they should only be links 99.9% of the time.  Users should have to explicitly spell out what they want with special tags, since we certainly don't want to make assumptions about what they want.  Let's call it an "a" tag, for "Anchor", and then we'll have the link text inside the tags, with a URL which we'll label "href" for "Hypertext REFerence", ...

This way we'll know exactly when users want links to be created, and they won't have to be bothered with those bothersome "nowiki" tags in the handful of cases that they don't.  It will be a great improvement to the usability of our wiki software.<signature block/><date><today's date/></date>
Comment 94 Tony Souter 2008-01-01 00:57:06 UTC
Automatically rendering ISO-formatted dates as autoformatted would need to be turned off. Autoformatting doesn't work with date ranges (3–7 January 2010, the way MOS says to format such a range), nor slashed dates (night of 3/4 July). I guess you'd just write these ranged/slashed items in raw form without any auto.
Comment 95 cypsy 2008-01-01 21:16:39 UTC
(In reply to comment #66, w.r.t. some normalized value like '4 December 2008')
> I suspect Mediawiki is used on a lot of sites that don't follow the English
> Wikipedia's Manual of Style.

True, but all it needs to do is use the same config setting that determines the date style for talk/history. See $wgLanguageCode and $wgAmericanDates and comment #59. 
NB: This would just be the *default*, i.e. active only if there is no user-setting to override it.

(In reply to comment #80)
> Can't we just substitute exactly the double square brackets currently in use with the arrows?
> So: <<September 27>>, <<1980>> 

There is no need to institute a new syntax. What is wanted is for the dates to be formatted but not linked (or alternatively, the link to not be an obvious link), which has very little to do with the fact that it appears in square brackets.

To that end, it does not make any difference whether the value is put in square brackets or in some altogether novel format like angle brackets, <date> or d? or whatever else. All these do what square brackets do already, which is to "flag" it as needing special handling. Square brackets work fine. No need to change it. The actual generation/formatting/color of a link (or not) is something that the server does anyway.

It is /trunk/extensions/FormatDates/DateParser.php and/or /trunk/extensions/FormatDates/FormattableDate.php that need fixing first. Specifically, the fact that PREF_ISO and PREF_NONE are treated as synonymous in FormattableDate.determineFormat(). As such, the purpose of PREF_NONE is lost, which might under normal circumstances be used to switch to a sensible default. 

The solution is to either fix determineFormat() or to have DateParser.convertPref() intelligently select an actual default format. i.e. it would not return self::PREF_NONE if no preference is set, but instead pick a suitable alternative (e.g. based on $wgLanguageCode and/or $wgAmericanDates).
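
The second option could look roughly like the following Python sketch. The constants and the convert_pref signature are stand-ins invented for illustration and are not copied from the FormatDates extension; the fallback to day-month-year for everything other than American English is likewise an assumption.

```python
# Hypothetical stand-ins for the extension's preference constants.
PREF_NONE, PREF_DMY, PREF_MDY, PREF_ISO = "none", "dmy", "mdy", "iso"

def convert_pref(user_pref, language_code="en", american_dates=False):
    """Pick a concrete format instead of ever returning PREF_NONE."""
    if user_pref and user_pref != PREF_NONE:
        return user_pref  # an explicit user setting always wins
    if language_code == "en" and american_dates:
        return PREF_MDY   # site is configured for American-style dates
    return PREF_DMY       # assumed site-wide default otherwise
```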
Comment 96 Omegatron 2008-01-01 21:44:06 UTC
(In reply to comment #94)
> Autoformatting doesn't work with date ranges (3–7 January 2010,
> the way MOS says to format such a range), nor slashed dates (night of 3/4
> July). I guess you'd just write these ranged/slashed items in raw form without
> any auto.

ISO 8601 specifies periods of time, too; written with a slash between them, so we could auto-format that in a smart way as well ("1938-10-17/2007-11-30" becomes "October 17, 1938 – November 30, 2007" according to my preference, while "2010-01-03/07" becomes "3–7 January 2010" according to yours).  But it wouldn't be useful in cases where you want to add birth locations, "circa", and so on.
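
A Python sketch of how the two examples above could be handled, assuming an abbreviated end date inherits its missing leading components from the start date and that same-month ranges are collapsed. The function names and the "dmy"/"mdy" preference labels are hypothetical.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
EN_DASH = "\u2013"

def parse_range(text):
    """Split 'start/end'; a short end reuses the start's year and month."""
    start_s, end_s = text.split("/")
    start = [int(p) for p in start_s.split("-")]
    end_parts = [int(p) for p in end_s.split("-")]
    end = start[:3 - len(end_parts)] + end_parts
    return tuple(start), tuple(end)

def format_range(text, pref):
    """Format an ISO-style range in 'dmy' or 'mdy' order."""
    (y1, m1, d1), (y2, m2, d2) = parse_range(text)
    if (y1, m1) == (y2, m2):
        # Same month and year: collapse to a day range.
        if pref == "dmy":
            return f"{d1}{EN_DASH}{d2} {MONTHS[m1 - 1]} {y1}"
        return f"{MONTHS[m1 - 1]} {d1}{EN_DASH}{d2}, {y1}"
    if pref == "dmy":
        return f"{d1} {MONTHS[m1 - 1]} {y1} {EN_DASH} {d2} {MONTHS[m2 - 1]} {y2}"
    return f"{MONTHS[m1 - 1]} {d1}, {y1} {EN_DASH} {MONTHS[m2 - 1]} {d2}, {y2}"
```

Ranges that cross a month but not a year are written out in full here; whether they should be abbreviated is a style question this sketch does not try to settle.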
Comment 97 Tony Souter 2008-01-02 00:56:33 UTC
Well, I don't mind this ISO solution; nor the arrows substituted for square brackets. Just as long as the result is not blue/underlined as for links: that's the feature that upsets the greatest number of users.

Not quite sure I understand cyp@abwesend.de's statement that there's no need for new syntax.

At some stage, we need to establish a time-line and/or bring to a head what is to be done. Sorry to be bossy (and I'm a total tech-idiot at that); but we've postponed the push to have the autoblotch made explicitly optional on WP (which would be the start of the end of it, I'm afraid) because of this new activity here.

At the same time, we're very appreciative of the technical skills and knowledge brought to bear here, pro bono. 
Comment 98 Voyagerfan5761 / dgw 2008-01-02 01:00:34 UTC
(In reply to comment #97)
> Well, I don't mind this ISO solution; nor the arrows substituted for square
> brackets. Just as long as the result is not blue/underlined as for links:
> that's the feature that upsets the greatest number of users.
> 

The linking itself bothers me. It's not such an issue on Wikipedia, where every year and day of the year has an article; but on smaller wikis (as I've mentioned before), having links just to format the dates clutters up the Special:Wantedpages list. And the blue is annoying, yes. If it were just the formatting that was irritating, we could simply push for a special class to be added to auto-formatted links and fix the appearance in CSS. The result should not merely be not blue; it should be not linked.
Comment 99 cypsy 2008-01-11 02:35:01 UTC
(In reply to comment #97)
> Not quite sure I understand cyp@abwesend.de's statement that there's no need
> for new syntax.

This bug report actually conflates three issues. The reasons why they haven't been dealt with 
are enumerated at the end of this comment.

The three issues (and for #2 and #3 the potential solutions) are:

1. The description of this bug report is 'Provide preference-based autoformatting for unlinked dates'.
   In other words, the person _wants_dates_to_be_identified_by_pattern_, and not 
   by [[ ]] or <date></date> or whatever.

2. Articles are already being overloaded with linked dates. That is, people (me included) want to 
   inhibit the /existing/ [[dates]] from appearing as links.

   To describe the "solution", I have to first describe the problem, which is this:
   The '[[wikilink]]' syntax "tells" the server that what is between the brackets needs to be 
   dealt with in a "special" way. What the server then does with it is something like this: 
       a) call DateFormatter to reformat all instances of [[dateish-string]] into [[date links]]
       b) convert all instances of [[internal links]] (which now includes the reformatted 
          [[dateish-string]]s) into regular html hyperlinks.
   Thus, if one didn't want dates to be link-ified, then all it would take is for the elements
   in the $this->targets[] array in DateFormatter to not be [[wikilink]] format.
   For example,   $this->targets[self::DMY] = '[[F j|j F]] [[Y]]'
   would read     $this->targets[self::DMY] = 'j F Y'
   See http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/includes/DateFormatter.php 
   which is called from Parser.php in the same directory.

   NB: the DateFormatter() is only called when $wgUseDynamicDates is set to 'true'. I don't think
   this option is enabled by default. 

3. People (me included) have also asked that the "default" be more human-friendly than ISO 8601.

   The easiest way to do this would be to change the way DateFormatter.reformat()
   deals with "no preferred date format" (i.e. $preference is set to, or defaults to self::NONE);
   For example, it could default to the same date format as used on talk pages.
   There are of course more complex solutions possible, including:
   a) passing a new localsettings parameter to the DateFormatter.reformat() which would be the 
      override value when $preference is/becomes self::NONE.
   b) inferring a default date format from the server's language code ($wgLanguageCode) and 
      $wgAmericanDates when the language code is English.
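
To make point #2 concrete, here is a small Python imitation of the target templates. It assumes the letters F, j and Y stand for month name, day without leading zero and year, as in PHP's date(); the render function is hypothetical and only mimics what DateFormatter does with its targets.

```python
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# The DMY target as quoted in point #2, and its unlinked replacement.
LINKED_DMY = "[[F j|j F]] [[Y]]"
UNLINKED_DMY = "j F Y"

def render(template, year, month, day):
    """Fill F/j/Y placeholders; every other character is copied as-is."""
    subs = {"F": MONTHS[month - 1], "j": str(day), "Y": str(year)}
    return "".join(subs.get(ch, ch) for ch in template)
```

With the linked template the output is still wikilink markup that the next parser pass turns into hyperlinks; with the unlinked one it is plain text, which is the whole difference.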

> At some stage, we need to establish a time-line and/or bring to a head what is
> to be done. 

Point #1 should never be instituted, and is (presumably) the reason why this bug was once 
         closed as WONTFIX.
Point #2 can't be implemented on the server side because then dates that *really* need 
         linking wouldn't be linked either!
Point #3 if I'm reading the source right, this could be implemented without major issues, 
         but nobody really cares. ;)

Point #2 can really only be dealt with by a site-specific formal ruling to inhibit 
unnecessary date linking and (optionally) disabling $wgUseDynamicDates for .en. Turning 
existing overlinking into normal strings can be delegated to a bot.

Comment 100 Omegatron 2008-01-11 03:34:31 UTC
(In reply to comment #99)
> 1. The description of this bug report is 'Provide preference-based
> autoformatting for unlinked dates'.
>    In other words, the person _wants_dates_to_be_identified_by_pattern_, and
> not 
>    by [[ ]] or <date></date> or whatever.

No.  The original request, and the thing that is most hated by everyone, is the fact that localization depends on *linking* the dates.  The link syntax is overloaded.  That's what this bug is supposed to get rid of.

Finding dates by pattern is the best way to do this.  (Instead of adding yet another wiki syntax that will have to be wrapped around 99% of the dates in the entire encyclopedia.)  But apparently that's too hard or something.

So the compromise method is to pick just *one* pattern and localize dates that fall into that pattern.  ISO is already recommended against in text, is designed to be machine-readable, and is unambiguous, so it should be used as the pattern to match.  Dates written in ISO in the article source will be localized to the user's preferred format, while regular written-out dates (used in quotations or whatever) will be untouched.

I've written up a better description here:

http://en.wikipedia.org/wiki/User:Omegatron/Date_formatting

(Also see talk for discussion about guessing unregistered users' preferred format based on their browser's headers.)
Comment 101 Stephen Turner 2008-01-11 08:01:47 UTC
Omegatron, you don't seem to have realised that ISO 8601 is also, at least as far as editors are concerned, "another wiki syntax that will have to be wrapped around 99% of the dates in the entire encyclopedia". All dates would have to be changed to either (i) dates in natural language with new markup round them or (ii) ISO format; and written in the corresponding way in future. I argued before that (i) is preferable because text transformations should be requested explicitly, but I also suspect more editors would prefer to write in style (i).

cyp, the original request says "please create syntax for marking dates to be rendered subject to date preferences that does not overload the [[]] link syntax, and does not create a link", which does not correspond to any of your #1, #2 and #3. It requests new markup for dates, and I still think this is the best approach.
Comment 102 Voyagerfan5761 / dgw 2008-01-11 08:08:24 UTC
Stephen, I'm one of those people who would prefer to write in style (ii). It's more easily internationalized that way. ISO 8601 is an international standard, after all. More markup is just more markup and makes things harder to type. The software is capable of regex matching of dates, and ISO 8601 is the most foolproof way to implement it. I'm against adding additional tags, but I don't like typing tags that the software I'm using (MediaWiki or otherwise) can insert for itself.
Comment 103 Tony Souter 2008-01-11 14:40:27 UTC
I'm really pleased to see these ideas being debated. However, as a normal, non-technical user, I'm thoroughly confused (what is "localising", for example?). 

As the original organiser of the 86-signature petition of Wikipedians asking for linking and autoformatting to be decoupled—or for ''something'' to be done to stop the blue—may I ask for the short-list of potential strategies/solutions to be stated as plainly and simply as possible? 

That is, something like this. 

Option 1: 
*what you key in
*how it will be displayed
*what is required technically/administratively to achieve it
*advantages
*disadvantages

 Option 2:
 ....

etc.
Comment 104 Tony Souter 2008-01-12 07:06:32 UTC
By the way, if you haven't seen it, here's the original petition of 88 Wikipedians from December 06. I'm sure it would multiply hyperbolically if restarted.

http://en.wikipedia.org/wiki/Wikipedia_talk:Manual_of_Style_%28dates_and_numbers%29/Archive_D1#A_new_parallel_syntax_for_autoformatting_dates

Tony
Comment 105 cypsy 2008-01-15 03:41:41 UTC
@Omegatron, Stephen Turner: I evidently misunderstood what it was that the person who filed this bug wanted, which is (if I understand it now), formatting of date-like thingies without these also being links.

One simple solution:
The parser functions already have a #time function that allows dates to be reformatted. What's missing is the means for it to obtain the user's date preference. Either...
1. add a {{DATEPREF}} magic word, which would be #switch-adjusted for a call to #time
2. adjust #time, so that calling it with a null first argument causes it to use user preferences.
In either case, wiki installations could then have a {{date|dateish-value}} template that wraps the back-end and that then displays a dateish-value as a formatted date. Option #1 is preferable because it's more flexible.
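
As an illustration of option #1 only, this Python sketch shows the wikitext a {{date}} template might expand to once a preference value is available. {{DATEPREF}} does not exist; the preference labels and the fallback format are assumptions, while the #time call and its format letters follow the existing parser function.

```python
# Hypothetical mapping from a {{DATEPREF}} value to a #time format string.
TIME_FORMATS = {"dmy": "j F Y", "mdy": "F j, Y", "iso": "Y-m-d"}

def date_template(value, datepref=""):
    """Return the #time call that a {{date|value}} wrapper could emit."""
    # Unknown or empty preference falls back to an assumed default.
    fmt = TIME_FORMATS.get(datepref, "j F Y")
    return "{{#time: " + fmt + "|" + value + "}}"
```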

Getting real:
* the vast majority of people actually *reading* articles are anonymous users for whom dates are not reformatted. 
* ISO dates (where such a function would actually be quite useful) do not typically appear in article space; dates like "January 14, 2008" are far more common. The ISO dates also remain unformatted for anon readers.
* is '14 January 2008' really more readable than 'January 14, 2008'?
* it does not solve the over-the-top zeal with which dates are being linked.
Comment 106 Stephen Turner 2008-01-15 08:14:51 UTC
Thanks for your reply, cyp. I'd just like to comment on your last two bullet points.

'14 January 2008' and 'January 14, 2008' are equally readable, but if we follow that argument to its conclusion, why have date preferences at all? Actually, abolishing date preferences is maybe another solution to the problem, but while we have them we need a way to invoke them without also linking.

On zeal: I think it does solve it, or at least permits it to be solved. There are several overlapping factors here. (1) I think there has been a cultural shift over the years to link less. My understanding is that before my time there was a general assumption that every year number should be linked. Now, although there are differences of opinion, more editors are inclined to regard those links as useless clutter. (2) Users are encouraged in the Manual of Style to link full dates (day-month or day-month-year) to make date preferences work. The proposed change in this bug would make that unnecessary. No doubt it would take some time for existing users to be educated and for new users not to see so many examples and copy them, but it would happen. (3) It would also help with overlinking bare years, or months and years, because many new and intermediate users don't understand that there is no autoformatting gain to linking these and assume that they all have to be linked too.
Comment 107 Omegatron 2008-01-15 14:41:44 UTC
(In reply to comment #105)
> In either case, wiki installations could then have a {{date|dateish-value}}
> template that wraps the back-end

So it would be like this?

"'''George Washington''' ({{date|February 22, 1732}} – {{date|December 14, 1799}}) was a central, critical figure in the founding of the [[United States]].  He was born on {{date|February 22, 1732}} ({{date|February 11, 1731}}, [[Old Style and New Style dates|O.S.]])..."

How would you link the dates that *should* be linked?  Would it handle ranges?

> * ISO dates (where such a function would actually be quite useful) do not
> typically appear in article space; dates like "January 14, 2008" are far more
> common. The ISO dates also remain unformatted for anon readers.

Default formatting for unregistered users would be "14 January 2008", and would be surrounded by a css class so that it could be further manipulated by javascript if desired (something in Common.js).  Then the servers only need to cache one version of each page.  See the discussions:

http://en.wikipedia.org/wiki/User_talk:Omegatron/Date_formatting#Good_idea
Comment 108 cypsy 2008-01-16 03:06:50 UTC
(In reply to comment #106)
> if we follow that argument to its conclusion, why have date preferences at all? 

To misquote Tennyson,... not ours to reason why, ours but to do and sigh. :)
Some scheme is necessary to determine how dates are to be formatted. I see the user-pref as an extension of that, albeit an incomplete one since the default is not in a natural language.

> On zeal: I think it does solve it, or at least permits it to be solved. 

Good point. Yes, an alternative would mitigate the present need to link.

Comment 109 cypsy 2008-01-16 03:09:53 UTC
(In reply to comment #107)
> So it would be like this?
> 
> "'''George Washington''' ({{date|February 22, 1732}} – {{date|December 14,
> 1799}}) was a central 

yes. In much the same way as one today would write
  [[February 22]], [[1732]] – [[December 14]], [[1799]]
or 
  [[1732-02-22]] – [[1799-12-14]] 
or whatever, with the result being the same, but sans linkage.

(NB: #time presently depends on PHP's own date() function, which can't deal with dates before 1970 or after 2038. This can be solved in much the same way that DateFormatter.php deals with the problem.)

> How would you link the dates that *should* be linked?  

The alternative wouldn't affect existing $wgUseDynamicDates functionality, so [[ ]] syntax would still work for users with a date preference.

Ideally, the alternative {{date}} would (in contrast to dynamic dates) also have a human-friendly default, so it may be useful to have an alternate "with links" template (say {{ldate}}) that would always generate a human-friendly result even though $wgUseDynamicDates would not. It's just a matter of semantics, e.g. {{#time: F j, Y|2008-01-15}} versus {{#time: [[F j]], [[Y]]|2008-01-15}}

Also, with the presently enabled features on the server side, there is no functionality on the local installations for a template to determine whether there is a year present or not, and thus the basic {{date}} would not be able to deal with dates that have no year value; "{{date|January 15}}" would return a wrong result. Thus, there would have to be other templates to deal with such values anyway. Again, it's just a matter of semantics, e.g. some front-end template calling a back-end core template with {{date/core|January 15|1985|linkage=no|canonical=no}} or whatever.

> Would it handle ranges?

No. And it shouldn't. One tool, one job.
 
> Default formatting for unregistered users would be "14 January 2008", and would
> be surrounded by a css class so that it could be further manipulated by
> javascript if desired (something in Common.js).  

That's a good idea. {{date|whatever}} would then just read 
 <span name="wpAutoDate">{{{1|{{CURRENTMONTHNAME}} {{CURRENTDAY2}}, {{CURRENTYEAR}}}}}</span>
Javascript would do the rest. 

This could even be implemented right now, with the preferred date format eventually being passed in a 'var wgUserDateformat' (as is already done for wgUserName, wgContentLanguage etc).

There are a few pitfalls that need to be navigated around, but the idea is good. It will need be very carefully written in order to not be rejected as mis-usable. It could also end up being fairly long. But the advantages would definitely outweigh the disadvantages.
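
The rewriting step itself would be JavaScript in the browser; the Python sketch below only imitates its logic. It assumes the span is marked up exactly as in the snippet above and contains a "Month D, YYYY" date, and the apply_pref function and preference labels are hypothetical.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# Matches only spans marked as in the snippet above, holding "Month D, YYYY".
SPAN = re.compile(
    r'(<span name="wpAutoDate">)(' + "|".join(MONTHS) +
    r') (\d{1,2}), (\d{1,4})(</span>)'
)

def apply_pref(html, pref):
    """Rewrite marked dates to the reader's preferred order."""
    def swap(m):
        open_tag, month, day, year, close_tag = m.groups()
        if pref == "dmy":
            text = f"{int(day)} {month} {year}"
        else:
            text = f"{month} {int(day)}, {year}"
        return open_tag + text + close_tag
    return SPAN.sub(swap, html)
```

Anything inside the span that does not parse as a date is left alone, which is one way to keep the template from being mis-usable.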
Comment 110 Tony Souter 2008-01-16 10:51:09 UTC
In reply to Stephen Turner: yep, abolishing date autoformatting (or at least making it optional) is a trigger waiting to be pulled at MOSNUM. The two main formats are equally readable, and frankly, I'd much rather read the one I don't use—in normal black text—than a blue splotch. 

But if the kind folks here can decide on a way to fix the bug, we'd all be fine about keeping it. 

Tony
Comment 111 Omegatron 2008-01-16 14:49:55 UTC
(In reply to comment #109)
> there is a year present or not, and thus the basic {{date}} would not be able
> to deal with dates that have no year value; "{{date|January 15}}"

So there would be a syntax for full dates, a different one for month-day dates, a different one for human-friendly defaults, and a completely different one for linked dates?

> > Would it handle ranges?
> 
> No. And it shouldn't. One tool, one job.

Date ranges should be formatted, too.  "5–7 January 1979" vs "January 5–7, 1979".  Would there be another template for ranges?

What advantages would this have over the ISO proposal?

"'''George Washington''' (1732-02-22/1799-12-14) was a central, critical figure in the founding of the [[United
States]].  He was born on 1732-02-22 (1731-02-11, [[Old Style and New Style dates|O.S.]])..."

> > Default formatting for unregistered users would be "14 January 2008", and would
> > be surrounded by a css class so that it could be further manipulated by
> > javascript if desired (something in Common.js).  
> 
> That's a good idea.

Ok.  (Moved to http://en.wikipedia.org/wiki/User:Omegatron/Date_formatting#Proposal_for_defaults)
Comment 112 cypsy 2008-01-16 18:34:36 UTC
> (In reply to comment #109)
> So there would be a syntax for full dates, a different one for month-day dates,
> a different one for human-friendly defaults, and a completely different one for
> linked dates?

Without any extra mustard that regular [[ ]] wouldn't give you either, you'd need only 1 template. For linkage functionality, which is not what this bug is asking for, you'd need another (and/or an additional parameter for the first). Irrespective of the number, the result is always human friendly.

As I said before, it's only a matter of semantics. The number of functions you want is directly proportional to the complexity and inversely proportional to the ease of use. 

> Date ranges should be formatted, too.  "5–7 January 1979" vs "January 5–7,
> 1979".

Whats wrong with '5 January—7 January 1979'? Will you want '30 January—1 February 1979' and '30 December 1978—1 January 1979' to be handled too? What about '10 BC—10 AD'? How about ranges with "to"? Or "about", "mid", "early", "late" qualifiers?

> Would there be another template for ranges?

That depends on the degree of complexity that you are willing to impose on the editor, and the threshold of acceptance for the increased complexity of the code. *Technically* there is no need for more than one template for *everything*.

Don't buy a cow if you're expecting eggs or wool.
Comment 113 Voyagerfan5761 / dgw 2008-01-16 18:42:00 UTC
Regarding some of the recent discussion, I would oppose any changes that would make the core feature complex enough to require a wrapper template...
Comment 114 Ross Patterson 2008-01-19 19:45:56 UTC
(In reply to comment #110)
> In reply to Stephen Turner: yep, abolishing date autoformatting (or at least
> making it optional) is a trigger waiting to be pulled at MOSNUM.

Lest anyone read Tony's statement as "MOSNUM will soon be anti-date-formatting", it's actually a matter of quite a bit of argument, and there's currently no obvious end result.  It may happen, but it also may not.

en:User:RossPatterson
Comment 115 Tony Souter 2008-01-20 11:50:16 UTC
That's true; so let's do something about it here and everyone will be relieved.

Tony


Comment 116 Tony Souter 2008-01-23 08:35:08 UTC
People at WP have been asking for a synopsis of what has been decided on this page, and in one case specifically for "some conclusion on what syntax/code/markup has been decided, and who if anyone has written code to implement it". 

I wonder whether we're in a position to provide this information; I lack the technical skills to do so. 

en:User:Tony1
Comment 117 Omegatron 2008-01-23 13:07:35 UTC
(In reply to comment #116)
> People at WP have been asking for a synopsis of what has been decided on this
> page, and in one case specifically for "some conclusion on what
> syntax/code/markup has been decided, and who if anyone has written code to
> implement it".

I don't think anything has been decided.  Where are they asking?  They can read the bug report and comment on it themselves, can't they?
Comment 118 Tony Souter 2008-01-23 13:47:28 UTC
The impression formed by the most recent correspondent, on my talk page, is that "Discussion is all over the place, and I'm looking for a synopsis". Like other Wikipedians, this correspondent appears to be keen for progress here; he probably wonders where to start to contribute.

I wonder whether the discussion has raised several possible solutions that can be summarised—along with the perceived advantages and disadvantages of each—by someone who has the skills. Then we might set about deciding on a definitive option, and how it might be pursued. 

I'd be pleased to see the easiest technical solution at the moment—anything to remove the blue splotches.
Comment 119 Omegatron 2008-01-23 13:57:30 UTC
(In reply to comment #118)
> I'd be pleased to see the easiest technical solution at the moment—anything
> to remove the blue splotches.

Long-term usefulness is more important than urgently getting rid of something that has been there for years anyway.  Let's think carefully about what we *really* want it to do.  Implementing the dynamic date feature without predicting the long-term consequences of linking every date was what caused this problem in the first place, no?  The people clamoring to get rid of these ugly blue splotches ASAP need to relax; a technical solution will be implemented eventually.  They can help decide *which* technical solution to go forward with instead of arguing about style guidelines that will be rendered moot by it anyway.
Comment 120 Tony Souter 2008-01-23 14:14:54 UTC
Thanks very much, but I won't "relax", as you put it, until something is done about it. This issue has been here for more than a year now—I don't call that "clamouring" or "ASAP". How long this sad function has been inflicted on us has no relevance at all. If it had been implemented last week, the need to address its deficiencies would be just as great. The lack of impetus on this page is what concerns me, and others at WP.
Comment 121 Ryan Kaldari 2008-01-28 16:05:20 UTC
We have a problem on the English Wikipedia. Template Citation pulls in accessdates without wikilinking them, while Cite News, Cite Web, etc, wikilink the date when it is pulled in. Thus editors have to remember to use unlinked ISO dates for Cite web, Cite news, etc. but have to remember to use a wikilinked regular date for Citation. The fact that these two sets of templates aren't consistent is ridiculous, but neither is willing to change as it would break thousands of articles if either changes to the other convention. I tried to get Citation to switch to using the unlinked ISO convention (that gets converted by the template), but the editors there said there was no reason to force the change since this bug is supposed to be fixed soon (in a matter of weeks they claim). Reading through the comments here, I'm only more confused as to the status of this bug. What is the status, and when is the expected implementation date, if any?
Comment 122 Omegatron 2008-01-28 16:17:23 UTC
(In reply to comment #121)
> Reading through the comments here, I'm only more confused as
> to the status of this bug. What is the status, and when is the expected
> implementation date, if any?

Who said a few weeks?  Who's going to write it?  Can't you switch to the unlinked ISO date to make all the templates consistent and then have a bot fix them?
Comment 123 Voyagerfan5761 / dgw 2008-01-28 16:40:07 UTC
(In reply to comment #122)
> (In reply to comment #121)
> > Reading through the comments here, I'm only more confused as
> > to the status of this bug. What is the status, and when is the expected
> > implementation date, if any?
> 
> Who said a few weeks?  Who's going to write it?  Can't you switch to the
> unlinked ISO date to make all the templates consistent and then have a bot fix
> them?
> 

I don't know how many pages use {{citation}} -- I've only come across {{cite stuff}} templates in the wild myself. Perhaps the number is small enough to recruit a small number of AWB users and have them do a mass fix run right after the template is changed.
Comment 124 Ryan Kaldari 2008-01-28 17:24:36 UTC
Fullstop says he's working on it so maybe he has something up his sleeves. In the meantime, I've changed the Citation template to format the accessdate with the date template. It's not an ideal solution, but I guess it will work until we can get a bot lined up to do a mass-conversion.
Comment 125 cypsy 2008-02-15 15:38:27 UTC
A Javascript-based solution for this issue is at
   http://en.wikipedia.org/wiki/User:Fullstop/autodate.js

en.wiki users can add
   importScript('User:Fullstop/autodate.js');
to their monobook.js. 

Dates tagged with {{date|...some date...}} will be automagically converted. 
To tag dates manually, use <span class="wpAutoDate">...some date...</span>

Documentation is in http://en.wikipedia.org/wiki/User:Fullstop/autodate.js itself.

Have fun. :)

ps: Let me know of any interest in a test-case generator written to stress autodate.
Comment 126 Voyagerfan5761 / dgw 2008-02-15 17:08:47 UTC
That script looks interesting, but it doesn't solve the problem for unregistered users or users who don't install the script. Even if it's installed in the site JS, it still leaves out non-JavaScript browsers. I'm considering putting it in my monobook.js, but I don't think it's more than a workaround for this bug.
Comment 127 cypsy 2008-02-15 23:43:20 UTC
(In reply to comment #126)
> That script looks interesting, but it doesn't solve the problem for
> unregistered users or users who don't install the script. Even if it's
> installed in the site JS, it still leaves out non-JavaScript browsers. 

1. Amazing but true, it's not a solution for world hunger either.
2. It's /meant/ to run as site JS. How could anyone even think otherwise?

And at the risk of disillusioning you further, client-side scripting is also the only solution for whatever it is you think this bug is about. 

That is because -- with the exception of minimal phase3 stuff (i.e. [...]/[[...]] translation, which includes DateFormatter.php stuff) -- articles are not rewritten on the way from the cache to the client. No templates, no custom <time></time>, whatever. All that was processed at the time of last save; it's all cached and is all end-user independent.

Autodate.js is also the only solution that (were it in site JS) would work correctly for unregistered users. Actually, as long as the script is _not_ site JS (and as long as the server does not emit a wgUserDateFormat variable), the script also cannot infer a registered user's date pref. Meaning: at present (as long as it's not site JS), the script treats a registered user in the same way it would an unregistered one. Of course, I have a solution for that too, but I'm not going to implement it; because the whole point is that the script is /meant/ to be running as site JS. (oh duh!)

But all that aside, if anyone thinks there is a perfect solution to this problem, perhaps even with real working code and not just empty talk, then for heaven's sake let's hear/see it. Until then, how about some /practical/ feedback? Something actually *useful* would be nice.
Comment 128 Omegatron 2008-02-16 00:15:19 UTC
(In reply to comment #127)

> Autodate.js is also the only solution that (were it in site JS) would work
> correctly for unregistered users. Actually, as long as the script is _not_ site
> JS (and as long as the server does not emit a wgUserDateFormat variable), the
> script also cannot infer a registered user's date pref. Meaning: at present (as
> long as its not site JS), the script treats an registered user in the same way
> it would an unregistered one. Of course, I have a solution for that too, but
> I'm not going to implement it; because the whole point is that the script is
> /meant/ to be running as site JS. (oh duh!)

Did you read http://en.wikipedia.org/wiki/User:Omegatron/Date_formatting#Proposal_for_defaults ?
Comment 129 cypsy 2008-02-16 10:07:58 UTC
> Did you read
> http://en.wikipedia.org/wiki/User:Omegatron/Date_formatting#Proposal_for_defaults
> ?

What exactly is it that you're trying to say? Yes, I've read #Proposal_for_defaults. But what does that have to do with the text that you're quoting? 

As you might infer from the autodate documentation, the script effectively does what is described in that proposal. As such, you could consider autodate to be the proof-of-concept that your proposal is (fundamentally) sound.

Incidentally, the {{date|...}} template (which adds the <span class=...> tags that autodate.js sees) pre-formats dates as 'dmy' (aka "RFC 2822") dates. That is then what a user who does not have JS enabled (or is not running the script) will see.
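
As a rough illustration of the mechanism described above, the following Python sketch rewrites dates inside `<span class="wpAutoDate">` markers. It is not the code of autodate.js (which is JavaScript and is not reproduced here); the function name `apply_date_pref` and the restriction to plain "day month year" content are assumptions for illustration only.

```python
import re

# Matches only the simple pre-formatted 'dmy' fallback, e.g. "5 January 1979".
TAGGED_DATE = re.compile(
    r'<span class="wpAutoDate">(\d{1,2}) ([A-Z][a-z]+) (\d{4})</span>')


def apply_date_pref(html, pref):
    """Rewrite tagged dates to month-first order when pref is 'mdy'.

    With any other preference the markup is returned untouched, which is
    also what a reader without the script would see: the 'dmy' fallback.
    """
    if pref != "mdy":
        return html

    def swap(match):
        day, month, year = match.groups()
        return '<span class="wpAutoDate">%s %s, %s</span>' % (month, day, year)

    return TAGGED_DATE.sub(swap, html)
```

The point of the design is that the fallback text is already a readable date, so nothing breaks when the script does not run.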
Comment 130 Jonathan Schilling 2008-07-10 03:05:03 UTC
I feel strongly that defaults for un-logged-in users can be based upon their
browser's language and locale settings.  That's what those settings are there for!
I've worked on I18N/L10N projects that did this, and it's the standard way that
a lot of e.g. Java server-side webpage software is set up to operate.

This change could be made *now*, independently of decoupling date autoformatting
from the link syntax.  It would let a lot of unregistered users see dates, 
especially in Wikipedia citations, in their native format rather than ISO format.
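
A minimal sketch of that idea, in Python rather than the PHP MediaWiki is written in: parse the browser's Accept-Language header and look up a default date order. The mapping table `REGION_DEFAULTS` and both function names are hypothetical; a real deployment would need a far more complete locale table, and the 'dmy' fallback is simply an assumption.

```python
def parse_accept_language(header):
    """Return the language tags in the header, highest q-value first."""
    ranked = []
    for index, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        quality = 1.0  # a tag without a q parameter has full weight
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        # Sort by descending quality, then by original position.
        ranked.append((-quality, index, tag.strip().lower()))
    return [tag for _, _, tag in sorted(ranked)]


# Illustrative mapping only; not taken from MediaWiki.
REGION_DEFAULTS = {
    "en-us": "mdy",
    "en-gb": "dmy",
    "en-au": "dmy",
    "ja": "ymd",
    "hu": "ymd",
}


def default_date_format(header, fallback="dmy"):
    """Pick a date order for a not-logged-in reader from Accept-Language."""
    for tag in parse_accept_language(header):
        if tag in REGION_DEFAULTS:
            return REGION_DEFAULTS[tag]
    return fallback
```

For example, `default_date_format("en-US,en;q=0.8")` yields 'mdy', while an empty or unrecognised header falls back to 'dmy'.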
Comment 131 Tony Souter 2008-07-10 09:08:39 UTC
Well, the horse has bolted, I think. Over the past year, date autoformatting has gone from being mandatory to being no longer encouraged at Wikipedia's MOSNUM style guide. The problem with Jonathan's suggestion is that there are multiple issues: the fact that it's a self-indulgent in-house-only facility (our readers put up with the strange bright-bluing of dates and wonder why, since they see the raw format, often inconsistent within articles); and of course it's STILL tangled up with the linking mechanism, despite our huge petition more than a year ago.

I wouldn't hold my breath for any technical changes given our experience, and I firmly believe that we're better off without autoformatting. Heck, we've been managing very well with different varieties of English spelling, as long as consistent within each article: who CARES whether day or month comes first in a date? It's piffle.

The only thing to happen now is for the practice of not autoformatting to filter through properly to WPians.

Tony
Comment 132 Jonathan Schilling 2008-07-10 12:15:03 UTC
I'm not arguing for autoformatting, I don't mind seeing it go either.

But the fact remains that there are zillions of autoformatted dates
currently in WP articles.  Until they all disappear, a simple technical
change can render them more reasonably to not-logged-in readers.  This is
especially the case for dates in cite template usages, which will likely
be the last to be un-autoformatted and where not-logged-in readers currently 
see most of them in raw, unfriendly technical ISO format.
Comment 133 billclark 2008-09-03 19:25:13 UTC
Created attachment 5281 [details]
diff -u for includes/DateFormatter.php

This patch implements auto-formatting for non-wikilinked dates.
Comment 134 Tony Souter 2008-09-04 01:44:11 UTC
Thanks, but I have to rain on that parade now. First, as most correspondents here probably know, date autoformatting (DA) is now deprecated by the Manual of Style. One of the key reasons is that we now realise it is inherently bad for editors not to see the dates that our IP readers see. A large-scale audit of dates in general Wikipedia articles (in which DA is removed, ''inter alia'') is revealing autoformatting to be a major source of date-mess in our articles; this applies even to WP's most popular articles. It's easy to see why we have been performing so poorly in this respect: a British editor comes along to the article on J. F. Kennedy, for example, and adds a few dates in international format, forgetting that the international format s/he is viewing in the article is not the real, underlying format. So we have three or four international dates among many US-formatted dates, and this in an article that couldn't be more American if you tried. Because most editors haven't yet switched their prefs to "no preference" (something all serious contributors should do forthwith), these inconsistencies remain, viewed by our readers.

Here's the breakdown (in parentheses) from the auditing of a sample of 71 such articles (with my prior off-the-cuff estimations in square brackets, out of interest): 
#a small proportion of dates in the "other" format (36.6% of the 71 audited articles) [40%];
#a messy mixing of the formats, where the correct format can be determined via MOSNUM's rules (8.5%) [10%];
#a messy mixing of the formats, where the correct format needs further input by local editors (I don't see it as my job to go back to the first date that appeared in the article—I typically leave a note at the talk page, asking them to buzz me if they need further assistance) (8.5%) [5%];
#the completely wrong date format (e.g., US dates for an Irish rock group) (5%) [fewer than 5%];
#all correct (40.8%) [~ 40%]. 

Within the first four categories, six articles contained faulty dates ("th", weird order or syntax, etc).

It's a no-brainer that keeping our dates under good management is going to improve markedly as we move on from the DA period, apart from all of the other benefits of removing DA. So as much as I admire the motivation behind Bill's creation of a patch, using it will simply retain a major impediment to the good management of dates in WP articles. 

Tony
Comment 135 Omegatron 2008-09-04 02:17:45 UTC
(In reply to comment #134)
> Thanks, but I have to rain on that parade now. First, as most correspondents
> here probably know, date autoformatting (DA) is now deprecated by the Manual of
> Style.

You mean linking dates.  The software should autoformat dates without links, so that they are always consistent for viewers, whether logged in or not.
Comment 136 S. McCandlish 2008-09-04 06:03:00 UTC
As long as it results in people moving away from linking dates for no *article*-meaningful reason, then great.  I can't believe it's taken this long to get there.
Comment 137 Ryan Kaldari 2008-09-04 15:44:39 UTC
I think you've convinced me Tony. Is there a bug for removing Auto-formatting entirely?
Comment 138 Tony Souter 2008-09-04 16:12:58 UTC
By all means, Ryan. The instructions for installation and usage are here:

http://en.wikipedia.org/wiki/User_talk:Tony1#More_on_date_formatting

User Lightmouse, the author, has a page for feedback, a wishlist, and a register for any technical issues you uncover, here:

http://en.wikipedia.org/wiki/User:Lightmouse/wishlist

Tony
Comment 139 Gerry Ashton 2008-09-04 17:39:23 UTC
1. Bill Clark's patch should be much better documented before it is given serious consideration.

2. I think Tony Souter misunderstood Ryan Kaldari's question. I think Ryan asked if there is a request for a mediawiki change that would entirely eliminate Auto-formatting, so that if the software encountered [[4 September]] [[2008]] it would treat it as a link to the article 4 September, a link to the article 2008, and nothing more. The concept that the order of the elements might be changed would be removed from the software. The Date Format box on the Date and time tab of User preferences would disappear, or would only apply to system-generated timestamps such as the time edits were performed. Right?
Comment 140 Tony Souter 2008-09-04 17:45:59 UTC
You mean as though all WPians had chosen "no preference" in their date-format prefs ... That's what we advise they do already. I can see no point in that proposal, if it's really what Ryan meant. 
Comment 141 billclark 2008-09-04 19:37:14 UTC
If it's decided upon that autoformatting should be turned off (and there's no consensus for this on WP) then it's a one-line change to the configuration file in MediaWiki.  If official policy is to effectively disable autoformatting, by either unlinking nearly all dates or by having all registered users specify "No preference" then it would be far better to disable the function entirely in the parser.  I don't understand the point about documentation; I didn't add any new code, just some extra variable definitions that are just as obvious in their meaning as the other, similarly undocumented ones already there.

NOTE: This patch should NOT be applied without additional discussion on WP.  If and when it is applied, it will cause all unlinked dates to be autoformatted, unless they are surrounded by a <nowiki> tag.  Dates that have been intentionally left unlinked (such as those in quotes, etc.) will need to have the <nowiki> tag wrapped around them, prior to this patch being applied.
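
The behaviour described in the note can be sketched as follows. This is a hedged Python illustration, not the attached patch (which modifies includes/DateFormatter.php and is not reproduced here); the function name `autoformat_unlinked`, the two date patterns, and the 'dmy'/'mdy' preference strings are assumptions.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
DMY = re.compile(r"\b(\d{1,2}) (%s) (\d{4})\b" % MONTHS)
MDY = re.compile(r"\b(%s) (\d{1,2}), (\d{4})\b" % MONTHS)
# The capturing group makes re.split keep the nowiki spans in the result.
NOWIKI = re.compile(r"(<nowiki>.*?</nowiki>)", re.DOTALL)


def autoformat_unlinked(text, pref):
    """Reformat plain-text dates to pref, leaving <nowiki> spans alone."""
    def convert(chunk):
        if pref == "mdy":
            return DMY.sub(r"\2 \1, \3", chunk)
        if pref == "dmy":
            return MDY.sub(r"\2 \1 \3", chunk)
        return chunk  # no preference: leave the text as typed

    parts = NOWIKI.split(text)
    # After splitting, odd indices hold the protected <nowiki> spans.
    return "".join(part if index % 2 else convert(part)
                   for index, part in enumerate(parts))
```

The sketch shows why quoted dates would need protecting first: anything outside a `<nowiki>` span that looks like a date gets rewritten.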
Comment 142 cypsy 2008-09-05 02:32:57 UTC
@Jonathan Schilling:

I have working Javascript that will set a datepref cookie whose value is inferred from browser locale/time settings. The cookie's values are the same as those DateFormatter.php uses, i.e. 'mdy'/'dmy'/'ymd'/'ISO 8601'.  

But it's a chicken-and-egg/bureaucratic problem. If DateFormatter.php doesn't support such a cookie, then wikis won't implement such a Javascript function. And vice-versa.
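
For the server-side half of that chicken-and-egg problem, a minimal sketch in Python (standing in for what DateFormatter.php would have to do in PHP) could read and validate such a cookie. The cookie name `datepref` follows the description above, but the function name and the fallback behaviour are assumptions; no such support is known to exist in MediaWiki.

```python
from http.cookies import SimpleCookie

# The four values mentioned above as the ones DateFormatter.php uses.
VALID_PREFS = ("mdy", "dmy", "ymd", "ISO 8601")


def date_pref_from_cookie(cookie_header, cookie_name="datepref", fallback=None):
    """Return the date preference carried in a Cookie header, if valid."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    morsel = jar.get(cookie_name)
    # Ignore anything that is not one of the known preference values.
    if morsel is not None and morsel.value in VALID_PREFS:
        return morsel.value
    return fallback
```

Validating against a fixed list matters because the cookie is set client-side and cannot be trusted.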

@Bill Clark: 

as you already seem to be aware ;), causing unlinked dates to be autoformatted is not such a GoodThing. But you can avoid that with a twist to the tale:  make the *existing* [[ ]] syntax reformat dates *without* links. And make [[: ]] reformat dates *with* links (':' because dates are in main namespace). Yes, it perverts [[ ]] syntax a bit, but date formatting is perverted to begin with and I don't suppose it would cause any serious disruption.


Comment 143 Tony Souter 2008-09-05 02:47:43 UTC
Guys, I appreciate your technical expertise, don't get me wrong; two and a half years ago, I'd have been delighted to have received your input. But by now the WP community has moved on, having realised the disadvantages in trying to solve a non-problem with hi-tech, no matter how "simple" these proposed programming solutions might seem. 

As much as this must seem like "looking a gift horse in the mouth", I have to say that the horse has bolted and is off to better grasslands. No one here has presented an argument that month or date first really matters in the first place. WPians have reacted positively, in many cases enthusiastically, to the removal of DA, and now we'd like to get on with WYKIWYG (What You Key in Is What You Get). I think they wouldn't react positively to another round of tech solutions. For what?

If the move to get rid of DA had happened overnight, it might be different. But it's steadily evolved over the past two years and more, from mandatory to optional and now to deprecated. During this period, it has been debated extensively (what you see on this page is a mere iceberg tip).

Thanks for your efforts, but they don't seem to be at all relevant now. Issue sorted. 

Tony
Comment 144 cypsy 2008-09-05 07:38:37 UTC
Tony, you seem to think that all the people tracking this bug are DA fans. You need to remember that this is a *bug* report, and the problem is [[autoformatting]], which is either enabled or disabled, and having it enabled is not a bug per se. So really all anyone can do is ask to get rid of its most painful aspects. That way, those /other fellows/ who like DA can have their damned DA, and everyone else can preserve their sanity. 

So, (for me at least) it's an issue of finding an option to give the DA fans so that they don't get on my nerves. This bug report was (as I see it) another way to make DAs irrelevant, and I'd like to think that many of the people think like me and voted for deprecation. :)

But you can come down from your high horse even if I'm wrong. :) Deprecating DAs hasn't really changed anything. [[autoformatting]] is still enabled. A bazillion articles still have ADs by the score. Mix-and-match still abounds. Everyone still has to deal with the bluelink blizzard. IPs still get to see gobbledygook.

It's great that a resolution has been found on the MOS front. But the battle to get rid of DAs altogether has not yet been won. Until then, don't rest on your laurels, and don't underestimate the resistance. Until/unless you have an alternative to appease the DA fans /and/ a fix (alas, tech!) for existing DAs, there is no way you (or I or anyone else) will find consensus at WP:VPR/WP:VPT to turn off autoformatting.
Comment 145 billclark 2008-09-05 15:26:19 UTC
Count me as one of the DA fans.  I much prefer it to the alternative, which would be a return to the bad old days of editors arguing over date formats.  There's some work going on with the [[Wikipedia:WikiProject Dates|Dates WikiProject]] to compile statistics on just how many articles have mix-and-match formatting, linked dates, unlinked dates, etc.  From there we can hopefully make some progress on cleanup -- which I agree needs to be done, whether or not we keep DA.

Tony, I wish you'd stop with your anti-tech comments.  Within hours of your pointing out this bug to me, I provided a working patch.  Then I created a WikiProject to deal with the cleanup issue.  I also enlisted the aid of the [[Wikipedia:WikiProject Database Analysis|Database Analysis WikiProject]] to generate some useful statistics (though in retrospect some of those people seem to be a bit nuts... though that's just more evidence of the lack of consensus on this issue, I suppose.)  I'm putting some serious effort into dealing with this issue and reaching some kind of workable solution that satisfies everybody, not just the anti-DA crowd.  When you make disparaging remarks about the quality of the programming work done by me or anyone else, you're alienating people.

Also, this has been an issue for a lot longer than two years.  I remember arguments over date formatting back in 2003, and even earlier.  You are the one who is coming in at the "tail end" and missing out on all of the historical context.  The reluctance of the Wikimedia sysadmins to touch this issue stems from their experiences being burnt by this in the past.

I think we should focus our efforts on cleaning up the problems that everybody agrees are problems -- the mix-and-match articles, and those for which the default format is inappropriate.  That can be done without unlinking, and without pissing people off.  Once that's done, we can revisit the issue of DA.  Does that sound workable?

We should probably continue discussion of this on the talk page for the Dates WikiProject, since it has now gone well beyond the scope of this bug report.
Comment 146 Gerry Ashton 2008-09-05 17:08:56 UTC
Requirement:
Considering that http://meta.wikimedia.org/wiki/Help:Date_formatting_and_linking seems to be an official statement that the all-numeric date/time format for the wikimedia software is ISO 8601, and

Considering that section 4.1.2.1 of ISO 8601:2000 provides 

   calendar year is, unless specified otherwise, represented
   by four digits. Calendar years are numbered in
   ascending order according to the Gregorian calendar
   by values in the range [0000] to [9999]. Values in
   the range [0000] through [1582] shall only be used
   by mutual agreement of the partners in information
   interchange.

Considering that the partners in information interchange for Wikipedias in various languages are the editors and the readers, and it is not feasible to reach a mutual agreement amongst the readers and editors to use years with 5 or more digits, or to use years in the range [0000] through [1582],

Therefore, let y be the year. The autoformatting software shall only act upon dates that meet the requirement 1583 <= y <= 9999 and no others.
Comment 147 billclark 2008-09-05 17:22:35 UTC
"The autoformatting software shall only act upon dates that meet the requirement 1583<= y <= 9999 an none others."

No, that's too confusing.  Just remove references to ISO 8601 in the documentation, since the vast majority of readers don't understand the implications of Gregorian/Julian calendar choice anyway.  Formatting dates as YYYY-MM-DD has a long tradition independent of ISO, and most readers assume "1008-03-15" and "15 March, 1008" are the same date.  Prohibiting the first format isn't going to make things any better, because most people will still assume that "15 March, 1008" occurred exactly 1000 years prior to "15 March, 2008" anyway.  Despite what ISO 8601 says, most readers do not assume that the choice of date format has any implication with regards to the calendar being used.
Comment 148 Gerry Ashton 2008-09-05 17:43:13 UTC
I should have narrowed my last sentence above to say:

Therefore, let y be the year. The autoformatting software shall not transform a date into or out of the ISO 8601 format unless 1583 <= y <= 9999.
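
The narrowed requirement can be expressed as a simple guard. The sketch below is in Python and is not MediaWiki code; the function name `iso_to_dmy_guarded` and the choice of 'day month year' output are assumptions for illustration.

```python
import re

ISO_DATE = re.compile(r"^(\d{4})-(\d{2})-(\d{2})$")
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]


def iso_to_dmy_guarded(text):
    """Convert YYYY-MM-DD to 'D Month YYYY' only when 1583 <= year <= 9999.

    Anything else (other years, malformed input, bad month numbers) is
    returned unchanged, as the proposed requirement asks.
    """
    match = ISO_DATE.match(text)
    if not match:
        return text
    year, month, day = (int(field) for field in match.groups())
    if not 1583 <= year <= 9999 or not 1 <= month <= 12:
        return text
    return "%d %s %d" % (day, MONTH_NAMES[month - 1], year)
```

Under this rule "2008-09-05" becomes "5 September 2008", while "1500-01-01" passes through untouched.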

I disbelieve that there is a long tradition of the format YYYY-MM-DD independent of ISO.

What is the proper forum to seek binding arbitration?
Comment 149 cypsy 2008-09-05 18:41:08 UTC
@Gerry, nothing would be accomplished by such a restriction. There is no actual date manipulation behind the scenes, so a date that says 1500-01-01 will still come out January 1, 1500. That is to say, the choice of calendar/prolepsis lies with the person specifying the 1500-01-01 to begin with.

Also, ISO mandates that all year values must be at least 4 digits long, so effectively "0009-10-11" is just as useful/useless as "October 11, 9".
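
To illustrate what "no actual date manipulation" means, here is a purely textual rearrangement sketched in Python. It is an assumption-laden stand-in for the behaviour described, not the DateFormatter.php code; the function name `iso_to_mdy_textual` is hypothetical.

```python
MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]


def iso_to_mdy_textual(iso_text):
    """Rearrange the fields of a YYYY-MM-DD string into 'Month D, Y'.

    The fields are only reordered and stripped of leading zeros; no
    calendar library is consulted, so no Julian/Gregorian conversion
    can take place.
    """
    year, month, day = iso_text.split("-")
    return "%s %d, %d" % (MONTH_NAMES[int(month) - 1], int(day), int(year))
```

Because the day, month and year numbers pass straight through, whatever calendar the editor had in mind when typing the date is preserved.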

@Bill, re your patch, see comment #61.
Comment 150 billclark 2008-09-05 18:54:52 UTC
Created attachment 5292 [details]
Simpler patch, just eliminates links

This is a simpler patch that preserves the normal functioning of autoformatting (dates must be linked to be autoformatted) but renders the dates WITHOUT LINKS.  If you want to link to a date, you can use the [[:11 September]] style (similar to how categories work) although that will not autoformat.
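
The intended rendering can be sketched roughly as below. This is a Python illustration of the behaviour described, not the attached patch itself; the function name `render_dates`, the regular expressions, and the simplified `/wiki/` link output are all assumptions.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
LINKED_DMY = re.compile(r"\[\[(\d{1,2}) (%s)\]\],? \[\[(\d{4})\]\]" % MONTHS)
LINKED_MDY = re.compile(r"\[\[(%s) (\d{1,2})\]\],? \[\[(\d{4})\]\]" % MONTHS)
COLON_LINK = re.compile(r"\[\[:([^\]|]+)\]\]")


def render_dates(wikitext, pref):
    """Autoformat linked dates as plain text; keep [[:...]] as real links.

    pref 'mdy' gives month-first output; anything else is rendered 'dmy'
    in this sketch.
    """
    def fmt(day, month, year):
        if pref == "mdy":
            return "%s %s, %s" % (month, day, year)
        return "%s %s %s" % (day, month, year)

    text = LINKED_DMY.sub(
        lambda m: fmt(m.group(1), m.group(2), m.group(3)), wikitext)
    text = LINKED_MDY.sub(
        lambda m: fmt(m.group(2), m.group(1), m.group(3)), text)
    # A leading colon asks for an ordinary link, with no reformatting.
    return COLON_LINK.sub(
        lambda m: '<a href="/wiki/%s">%s</a>' % (
            m.group(1).replace(" ", "_"), m.group(1)),
        text)
```

So `[[11 September]] [[2001]]` comes out as unlinked text in the reader's format, while `[[:11 September]]` stays a link in the format it was typed in.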
Comment 151 S. McCandlish 2008-09-05 19:07:54 UTC
Bill, that patch doesn't solve any of the relevant problems and will probably just compound them.  There is no point in having an autoformatting function that only works some of the time, and having link syntax overloaded to NOT LINK is going to confuse the heck out of a lot of editors.  The functions need to be completely decoupled. The fact that a very large number of dates will remain linked and not be autoformatted for some time is just something we'll have to live with for the short term, and bots can clean up most of it.  If someone wants to autoformat a date, there needs to be another syntax entirely for doing so, that can be used with intentional linking of dates in the rare cases where this is actually a good idea; the specifics don't matter too much to anyone, on either side, I don't think.
Comment 152 billclark 2008-09-05 19:19:08 UTC
The patch solves the problem of overlinking.  The syntax is exactly the same as with categories, and should be no more confusing than adding a page to a category vs. linking to a category.  It preserves the requirement that dates be specially marked up in order to be autoformatted.  The only drawback to the patch is that it won't preserve links to dates that were intentionally linked but not specially formatted (either with a ":" or a "|") but since some editors are currently mass-unlinking articles without regard to the intentions of the original editors anyway, this patch is actually less disruptive than those activities.  I agree that a special syntax for dates would be preferable, but that would require changes to articles whereas this patch solves the biggest problem most people have with date autoformatting (overlinking) while requiring absolutely no changes to article text, except in those cases where a date was intentionally linked (which critics of date autoformatting are already disregarding anyway.)
Comment 153 Gerry Ashton 2008-09-05 19:24:09 UTC
Cypsy: That which cannot be done correctly should not be done at all. If the software cannot operate differently depending on the value of the year, any input date in the ISO 8601 format should be output with no changes, and the possibility of displaying dates in the ISO 8601 format should be removed altogether. This would leave the system to only reformat dates that contain a spelled-out month.
Comment 154 S. McCandlish 2008-09-05 19:31:51 UTC
Comment to Bill: No denigration of the tech work is intended on my part.  It is also cool to have a WikiProject for cleaning up the date mess at en.wikipedia.  WRT you and Tony:  That date issues have arisen as early as 2003 doesn't make the concerns raised by later editors invalid, and characterizing Tony as basically a noob on the issue doesn't fly, since MOS has been discussing this, and begging for a fix, for years, with Tony as a major participant.  So, rather than grouching at each other (Tony, this means you too!), focusing on fixing the problem at its source seems likely to be much more productive. :-)  That said, I don't have any objection to the new patch, on further thought, though I think it doesn't go far enough, and that having no connection between the link and autoformatting functions, by using separate syntaxes, should remain a goal.
Comment 155 billclark 2008-09-10 23:15:51 UTC
Created attachment 5311 [details]
Eliminates links, leaves date format untouched

This patch will cause linked dates to be rendered without the link, and in the same format as they appear in the wikicode (although it will add missing commas where appropriate) -- in other words, this patch completely nullifies date autoformatting.  It is only intended for testing purposes, pending community approval of a proposal to temporarily disable date autoformatting to gauge editor response.
Comment 156 Tony Souter 2008-09-11 02:08:57 UTC
Yes, and it will leave us with the maintenance problem: when WP's editors don't see the raw formatting in their display mode, they don't fix errors, they introduce NEW errors, and wrong global choices of formatting are left that way for years. The whole idea was hare-brained in the beginning.
Comment 157 Richard Morris 2008-10-23 23:44:23 UTC
Is there any action on this bug? en.wikipedia is in quite a state of limbo at the moment. Some bots are going round delinking dates, and changing them to ISO 8601 format which renders poorly for IP users. No one seems to really know what we should be doing and it all seems to hinge on resolution of this bug. 
See http://en.wikipedia.org/wiki/Template_talk:Citation#.22Wiki-magic.22.3F, http://en.wikipedia.org/wiki/Wikipedia_talk:Citation_templates#De-linking_dates
Comment 158 Tony Souter 2008-10-24 02:03:32 UTC
To declare that the English WP is in "a state of limbo" skews the real state of affairs. It is well-known on this page that, finally, date autoformatting was strongly discouraged at the dates and numbers style guide MOSNUM, back in August. Since then, not surprisingly, there has been much removal of DA by scripts, bots and manually by editors, in what is a widely accepted change for the better.

There are some six disadvantages in the use of DA, and if anyone is interested, I can link them to a list and explanation. WP has moved on from what was its foolish adoption of a toy that would never work for readers—only for editors at WP. It has led to very poor management of date formatting, since editors can't see what their readers see (in display mode).

It is untrue to say that the removal of DA is rendering dates in ISO. ISO dates are not supposed to be used in running prose (but are, contrary to long-established rules), but some editors are keen to retain them as the display of a few of the citation templates in the reference sections at the bottom of articles. There was absolutely no reason to link ISO dates, and they are steadily being delinked.

I suggest that this page be closed, since a solution is no longer relevant.

Tony
Comment 159 Stephen Turner 2008-10-24 07:13:28 UTC
I disagree. I know you (Tony) have now got your preferred solution after years of campaigning -- and I've come round to the view that it's the best solution given the current abilities of the code -- but I still think that it would be valuable to be able to autoformat dates without linking them, and conversely.
Comment 160 Tony Souter 2008-10-24 08:04:20 UTC
So it's *your* preferred solution, too ... and that of a huge number of others—in fact, all but a tiny minority of hard-line WPians who haven't yet put a good case. I need to dispel this implication that it's *my* solution, as though I'm an autocrat.

The issue is, in fact, much wider than the simple decoupling of linking and date-autoformatting functions as proposed here; among the other pressing reasons is the in-house nature of DA (our readers don't see it, so it's rather indulgent), and the fact that it prevents WPians from maintaining date formats properly (they're in a bad state).

So I'm unsure which part of my post above you "disagree" with. I don't wish to start a long debate about it here, though.
Comment 161 Stephen Turner 2008-10-24 10:03:45 UTC
It's not my preferred solution. It's my preferred policy given the current code. I still think the code leads to a suboptimal policy.
Comment 162 S. McCandlish 2008-10-24 13:17:51 UTC
Concur with Tony: This is not "his" issue, it is the issue of a large number of editors, including the majority of the MOS regulars. The "sea of blue" problem caused by the overloading of linking and autoformatting of dates is one whose resolution seems to have very few critics (while a handful of them are exceedingly vocal about it, being loud and incessant doesn't a better argument make).

I can also see Stephen's point that there could be some use for a date autoformatting function that did not operator-overload the linking function, but want to reiterate Tony's caution that use of it on Wikipedia by editors will lead to date messes in articles - if editors are not seeing what the non-logged-in reader sees, then errors and inconsistencies are guaranteed.  That is, it may be useful for MediaWiki, as a software package, to offer this functionality, but it is generally NOT useful to have it available at Wikipedia or other Wikimedia Foundation sites, which have high standards of "product" quality for the end-user reader.  Just because a function or plug-in for MW exists does not mean that it has to be used at WP.
Comment 163 billclark 2008-10-24 16:20:36 UTC
Oh thank heavens for this forum, where we're not constrained by the phony politeness of Wikipedia itself, or people crying "personal attack! personal attack!" whenever they get the chance.

Tony, you're an idiot who clearly doesn't understand the first thing about technology.  You should just leave Wikipedia for good, and stop annoying people.  At the very least you should drop yourself from replies on this ticket, since you've made it clear you have no interest whatsoever in a solution to the problem outlined here.  Bypassing autoformatting is not the same as fixing it, so your asshole-ish actions of mass delinking aren't actually a "solution" at all.  Go away.

(Please just direct complaints about my venting at Tony directly to me, rather than this list.)

Now, for those here that actually give two shits about autoformatting:  What features would we like to see implemented?  I'm more than happy to develop a patch to implement a solution, if we can all just agree on what it should be.  That is -- and always has been -- the real problem:  Agreeing on the specification.

I recommend we just completely ignore the MOS-nuts who have some sort of vendetta against date links (I think everyone knows who those people are) and proceed with fleshing out a good specification.  This list is probably a better place to do it, because of the ease of communication in this less-restrained forum.  Once a workable solution has been developed, THEN we can propose it at VP or wherever and get approval of a wider range of editors, and then propose it to the core developers.

After having spent some time thinking about an improved autoformatting feature, I have the following suggestions:

* Use a completely new markup, something like [[[August 10, 2005]]].  My reasoning is that we could treat dates as a special case of links, so that single brackets go with external links, double brackets go with internal links (and categories), and triple brackets go with autoformatted links -- which we could eventually use for other things such as units of measure or even spelling variations (how nice would it be to have [[[colour]]] display variable spelling of the word for different readers?)

* Allow preference settings (and possibly the default for non-logged-in-users) to render autoformatted dates WITHOUT links, to prevent complaints about the "sea of blue" for people that are bothered by such things.  Users that prefer to have dates linked could also set that preference, so links would autoformat AND appear as links for those users.

* Add support for auto-generated timelines.  A timeline would work like a category, so that an article page for a particular date would automatically be populated with links to articles that reference that date, complete with anchor tags to take you to the exact point in the article where the date is mentioned.  The tricky part would be figuring out how the text of the timeline page should appear (since presumably we'd want more than just a link, so something like a cross between a category page and a list page.)

* Use the browser's locale settings to provide default date formatting for non-logged-in users.  This will guarantee that autoformatted dates will appear in a consistent format throughout the site for ALL users, rather than defaulting to their raw format as they do now (which introduces inconsistency.)
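
(Illustrative note: a minimal sketch of the first two suggestions above, written in Python rather than MediaWiki's PHP. It assumes the proposed triple-bracket markup and a two-valued preference, "mdy" or "dmy"; the function name `render_dates` and the preference values are hypothetical and not part of any patch attached to this bug. Marked-up dates are rendered in the reader's preferred order with no link, and anything that is not a recognised date is left untouched.)

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_RE = "|".join(MONTHS)

# Matches [[[August 10, 2005]]] or [[[10 August 2005]]] (hypothetical markup).
TRIPLE = re.compile(
    r"\[\[\[(?:(?P<m1>" + MONTH_RE + r") (?P<d1>\d{1,2}), (?P<y1>\d{1,4})"
    r"|(?P<d2>\d{1,2}) (?P<m2>" + MONTH_RE + r") (?P<y2>\d{1,4}))\]\]\]"
)


def render_dates(wikitext, pref="dmy"):
    """Replace triple-bracketed dates with plain, unlinked text in the preferred order."""
    def repl(match):
        month = match.group("m1") or match.group("m2")
        day = int(match.group("d1") or match.group("d2"))
        year = match.group("y1") or match.group("y2")
        if pref == "mdy":
            return f"{month} {day}, {year}"
        return f"{day} {month} {year}"
    return TRIPLE.sub(repl, wikitext)
```

Because only text inside the triple brackets is rewritten, a date that is not marked up, such as one inside a quotation, would pass through unchanged.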

I think the only way we'll get widespread support for a rewrite of the autoformatting feature is to really improve it, which is why I'm suggesting completely new features like the auto-timelines.  It'll make it more time-consuming to develop this way, but if we can make it really really useful to people then it should be worth it.

-Bill Clark
Comment 164 Gerry Ashton 2008-10-24 17:18:46 UTC
The first requirement is that it be written by someone other than Bill Clark,
because his response to input he does not like is to castigate the person who
provided input rather than think about the input.

The second requirement is to avoid confounding the concepts of a date that
is mentioned in an article with a date that is relevant in an article. For
example, "Professor Jones found a carved stone on January 1, 1970*, which says
that Clamipiter died on February 14, 82". Clearly a timeline about Clamipiter should
include February 14, 82 Julian calendar, and exclude January 1, 1970, Gregorian
calendar. For a huge pool of dates that have little or no relevance to the topic
of an article, see the citations in any article.

*All dates within this comment are in the Gregorian calendar for dates on
or after 15 October 1582, and the Julian calendar before that date.

-Gerry Ashton
Comment 165 S. McCandlish 2008-10-24 17:43:12 UTC
The proposed features sound good to me (as does linking being off by default). I think that the autoformatting should also be off by default (at least on WP; I don't care what new installations of MW do) for the reason already identified, that editors with this feature turned on are generally blind to date formatting inconsistencies in the articles they are editing, unless they are paying a whole lot of attention, which is rarely the case.
Comment 166 S. McCandlish 2008-10-24 17:45:54 UTC
(Re: Gerry's 2 technical issues, there's a third one: Dates that appear in quotations. These should never be autoformatted, as this will alter the quoted material (in most cases) before it reaches the reader.)
Comment 167 Gerry Ashton 2008-10-24 18:01:29 UTC
Fourth technical issue: autoformatted dates should provide a pleasing appearance
when there is an era indicator. An example if an off-the-cuff solution were
attempted is given below.

Wikitext before processing:
"According to the Wikipedia article "Julian calendar", which was last edited
[[[23 October 2008]]], AD 8 was a leap year; therefore 29 February AD 8 existed."

The second date is not marked up, because an off-the-cuff solution does not provide
for era indication. When this is presented to someone who prefers American dates it
will read:

"According to the Wikipedia article "Julian calendar", which was last edited
October 23, 2008, AD 8 was a leap year; therefore 29 February AD 8 existed."

This of course creates a style clash.
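
(Illustrative note: a rough Python sketch of what era-aware handling might look like, so that the second date in the example could be marked up too. The optional "AD" prefix and "BC" suffix, the triple-bracket markup, and the choice to keep the era next to the year in the American order are all assumptions made for illustration; nothing here comes from an existing patch.)

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Day-month-year dates with an optional era marker (hypothetical markup).
ERA_DATE = re.compile(
    r"\[\[\[(?P<day>\d{1,2}) (?P<month>" + MONTHS + r") "
    r"(?P<ad>AD )?(?P<year>\d{1,4})(?P<bc> BC)?\]\]\]"
)


def render(wikitext, fmt="mdy"):
    """Reorder marked-up dates while keeping any era marker attached to the year."""
    def repl(match):
        year = f"{match.group('ad') or ''}{match.group('year')}{match.group('bc') or ''}"
        day, month = int(match.group("day")), match.group("month")
        if fmt == "mdy":
            return f"{month} {day}, {year}"
        return f"{day} {month} {year}"
    return ERA_DATE.sub(repl, wikitext)
```

With something along these lines both dates in the sentence could carry markup and would come out in the same order for a given reader.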
Comment 168 billclark 2008-10-24 18:56:58 UTC
> I think that the autoformatting should also be off by default
> (at least on WP; I don't care what new installations of MW do)
> for the reason already identified, that editors with this feature
> turned on are generally blind to date formatting inconsistencies
> in the articles they are editing, unless they are paying a whole
> lot of attention, which is rarely the case.

No, it should be ON by default, because that eliminates the entire problem
with inconsistencies.  All dates (other than those in quotes) will be
marked up with the new syntax, so they will all be displayed in the same
format.  For users that haven't set a preference (or aren't logged in)
some other method should be used to determine the format to use, such as
the locale setting from their browser.

The current problem with inconsistency stems from the fact that
autoformatting is turned off for non-logged-in users, and so they see the
raw date format, which might be inconsistent.  If autoformatting was
automatically turned on by default for ALL users, there would never be any
inconsistency in display.

-Bill Clark
Comment 169 billclark 2008-10-24 19:06:05 UTC
(In reply to comment #166)
> (Re: Gerry's 2 technical issues, there's a third one: Dates that appear in
> quotations. These should never be autoformatted, as this will alter the quoted
> material (in most cases) before it reaches the reader.
> 

There's no way to identify these automatically, since there are any number of ways that people include quotations.

It's also a completely unnecessary concern; Dates in quotes simply shouldn't be marked up, and if they're not marked up (i.e. put in triple-brackets) then they won't be autoformatted.

The same would apply for anything we want to use the triple-brackets for in the future, such as spelling variations or unit conversions.  A quote containing the word "colour" should always read "colour" and never "color" so it should never be marked up as [[[colour]]] in the wikitext.

So that's really an editor issue, not a technical one.  And it's already covered.

-Bill Clark
Comment 170 billclark 2008-10-24 19:10:38 UTC
(In reply to comment #164)
> The first requirement is that it be written by someone other than Bill Clark,
> because his response to input he does not like is to castigate the person who
> provided input rather than think about the input.

<<plonk!>>

I understand your concerns (and Tony's) and if you actually read anything I wrote, you'd realize that.  You're just an idiot, and THAT'S why I castigate you.  Fortunately I can be free to speak my mind about what a dumb-ass you are, here.

Now you too should go away, since your points are trivial to take into account and your constant harping on Gregorian vs. Julian calendar is really tiring.  Yes, we understand.  Yes, era and calendar -awareness will be built into a new system.  Now shut up and let us work.

-Bill Clark
Comment 171 S. McCandlish 2008-10-24 19:15:00 UTC
Bill:

* Quotations: Okay, that makes sense. I think I simply misunderstood you the
first time around as suggesting that locale info be used to auto-format dates,
regardless of markup, and that clearly doesn't seem to be the proposal.

* Default to ON, use locale info for non-logged-in users: If/when there is a
solution for non-logged-in users, then I will concur with you completely, so
long as it really is across-the-board -- something consistent needs to happen
for non-logged-in users for whom no locale settings can be determined.
Comment 172 billclark 2008-10-24 19:32:42 UTC
(In reply to comment #171)
> * Quotations: Okay, that makes sense. I think I simply misunderstood you the
> first time around as suggesting that locale info be used to auto-format dates,
> regardless of markup, and that clearly doesn't seem to be the proposal.

No problem.  I probably wasn't being clear enough, because there was a PREVIOUS patch I submitted that WOULD have caused "naked" dates (i.e. not marked up at all) to be autoformatted.  But you're correct that in the latest proposal only marked-up dates would be autoformatted.

> * Default to ON, use locale info for non-logged-in users: If/when there is a
> solution for non-logged-in users, then I will concur with you completely, so
> long as it really is across-the-board -- something consistent needs to happen
> for non-logged-in users for whom no locale settings can be determined.

All (major) browsers send locale information, so that should never be an issue.  For the tiny fraction of readers that use some exotic browser, I think defaulting to international format should be fine.  (NOTE: I actually don't know of a single browser that doesn't send locale information, but I'm leaving open the possibility that some exists somewhere.)

One source of complaint might be that not all readers will have the "correct" locale setting in their browser.  For instance, it's possible that somebody in the UK might have a browser that's mistakenly set for en-US for some reason, and in that case they would see American-style date formats when they wouldn't be expecting it.

However, I think that using the locale setting is preferable to using IP addresses (the other serious alternative) because:

a) Users have control over their locale settings, whereas they have little to no control over their IP address (this is an issue for Europeans living in America or vice-versa, since they'd otherwise have no way of overriding an IP-derived default format.)

b) It's much less work on the servers to check the locale setting (actually the "Accept-language" header) than to do lookups on IP addresses -- although we could do ROUGH mapping of IP addresses fairly efficiently, if it came down to it.
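
(Illustrative note: a minimal Python sketch of choosing a default format from the Accept-Language header, as described above. The function names and the "mdy"/"dmy" values are hypothetical, and the rule that only "en-us" maps to the American order is an assumption taken from this discussion, with international format as the fallback when no header is sent.)

```python
def preferred_language(header):
    """Return the highest-weighted language tag from an Accept-Language value, or None."""
    best_tag, best_q = None, -1.0
    for part in header.split(","):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        q = 1.0  # a tag without an explicit q value has weight 1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > best_q:
            best_tag, best_q = tag.strip().lower(), q
    return best_tag


def default_date_format(header):
    """Pick a default date order for a reader who has set no preference."""
    tag = preferred_language(header or "")
    return "mdy" if tag == "en-us" else "dmy"
```

The lookup is a string comparison per request, which is the efficiency argument made in point (b); a reader can change the result by changing the browser's language setting.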

-Bill Clark
Comment 173 S. McCandlish 2008-10-24 19:51:05 UTC
> For the tiny fraction of readers that use some exotic browser, I think
> defaulting to international format should be fine.

Right.

> One source of complaint might be that not all readers will have the "correct"
> locale setting in their browser.

Not our problem; we cannot account for every possible operator error. Using the locale info should work just fine in most cases, and it's not like the few rare cases of incorrect settings will generate something unintelligible.

> However, I think that using the locale setting is preferable to using IP
> addresses (the other serious alternative)

Agreed.
Comment 174 billclark 2008-10-24 20:02:07 UTC
Okay, since Tony has now started reposting my comments from here on Wikipedia itself, I'm no longer going to write anything here.  If you want to email me directly to discuss this, please feel free to do so (billclark@berkeley.edu) although at this point I think there is enough information (including stuff from the ticket history) to start working on a first pass.  Thanks for all the useful feedback (especially S. McCandlish.)  -Bill
Comment 175 cypsy 2008-10-24 20:06:52 UTC
(edit collision)

An alternative suggestion:

Using patch #3 (comment #155) as a base...

#1:  
   Instead of 
      $this->mTarget = $i;
   simply do 
      $this->mTarget = '<span class="someclass">' . $i . '</span>';
   And so leave the post-processing to the wiki (client) side, 
   e.g. for meta-data handling or Javascript handling or whatever.
   (JS is already available btw, it would only need to be made site-wide).

#2: 
   For ISO dates /and only for those/, convert the date to something 
   human-friendly. In the absence of a user pref for guidance, look for 
   "magic" to decide what to convert to. "Magic" being (for example) a 
   new Wikimedia magic word, or inclusion in a category, or some (dummy) 
   template that is being transcluded.
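
(Illustrative note: a small Python sketch of the two ideas above, not the PHP of the actual patch. `wrap_date` shows the span wrapping from #1, and `humanize_iso` shows the ISO-only conversion from #2. The class name "autodate", the function names, and the "mdy"/"dmy" values are hypothetical; how the format would be chosen from "magic" on the page is left out.)

```python
import re
from html import escape

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

ISO_DATE = re.compile(r"\b(\d{4})-(\d{2})-(\d{2})\b")


def wrap_date(text, css_class="autodate"):
    """Wrap a date in a span so client-side code can post-process it."""
    return f'<span class="{escape(css_class)}">{escape(text)}</span>'


def humanize_iso(text, fmt="dmy"):
    """Rewrite YYYY-MM-DD dates, and only those, into a human-friendly form."""
    def repl(match):
        year, month, day = (int(g) for g in match.groups())
        if not 1 <= month <= 12 or not 1 <= day <= 31:
            return match.group(0)  # leave anything that is not a plausible date
        name = MONTHS[month - 1]
        return f"{name} {day}, {year}" if fmt == "mdy" else f"{day} {name} {year}"
    return ISO_DATE.sub(repl, text)
```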

The timeline is a really cool idea. It would also emphasize the (ir)relevance 
of a linked date.

@S. McCandlish: A user's locale settings are not known to servers unless
these were to be inferred using Javascript and then transmitted by cookie. 
But if using Javascript to infer locale then we might as well do the whole 
date rewriting stuff in Javascript too. 
Comment 176 cypsy 2008-10-24 20:18:36 UTC
Re: browser locale being transmitted. I forgot about Accept-Language.

What's transmitted for US usage is "en-us" plus some other stuff:
  Accept-Language: en-us,en;q=0.5
"en-us" would be enough to rule out "24 October 2008" usage. 
So everyone else gets "24 October 2008". Indeed easy.


Comment 177 David E. Siegel 2008-10-24 20:50:40 UTC
(In reply to comment #163)

<snip>

> What
> features would we like to see implemented?  I'm more than happy to develop a
> patch to implement a solution, if we can all just agree on what it should be. 
> That is -- and always has been -- the real problem:  Agreeing on the
> specification.
As the original poster of this bug, what I would like, and have always wanted, is syntax that would autoformat a date, WITHOUT linking it. If the editor wants it linked, they can put in a link, and enclose the new syntax in the piped portion. If a link is not wanted, none need be coded.

Autoformatting and linking are two quite separate operations. Why retain the association between them that has been the source of much trouble?

> I recommend we just completely ignore the MOS-nuts who have some sort of
> vendetta against date links (I think everyone knows who those people are) and
> proceed with fleshing out a good specification.  

I don't think I am an MOS nut, but I do think that date links are wildly overused on Wikipedia. I also think that many installations of MediaWiki software exist that are not Wikipedia, and do not routinely have articles on dates. On such wikis, links to dates are almost never useful; when inserted for the sake of autoformatting, they are red-links.

<snip>

> I have the following suggestions:
> * Use a completely new markup, something like [[[August 10, 2005]]].  My
> reasoning is that we could treat dates as a special case of links, so that
> single brackets go with external links, double brackets go with internal links
> (and categories), and triple brackets go with autoformatted links -- which we
> could eventually use for other things such as units of measure or even spelling
> variations (how nice would it be to have [[[colour]]] display variable spelling
> of the word for different readers?)
My preference would be to use <<August 10, 2005>> or even <date>August 10, 2005</date>, to make it clear that this syntax is in no way connected with linking.

> * Allow preference settings (and possibly the default for non-logged-in-users)
> to render autoformatted dates WITHOUT links, to prevent complaints about the
> "sea of blue" for people that are bothered by such things.  Users that prefer
> to have dates linked could also set that preference, so links would autoformat
> AND appear as links for those users.
This is where I most strongly disagree: it preserves the confusing idea that every autoformatted date is in some sense a link, even if the link display is suppressed for some (most) users. Why do this? Simply separate the functions, and an editor can specify either autoformatting or a link, or both, as may suit the needs of the particular article and the wiki on which it exists.

> * Add support for auto-generated timelines.  A timeline would work like a
> category, so that an article page for a particular date would automatically be
> populated with links to articles that reference that date, complete with anchor
> tags to take you to the exact point in the article where the date is mentioned.
>  The tricky part would be figuring out how the text of the timeline page should
> appear (since presumably we'd want more than just a link, so something like a
> cross between a category page and a list page.)
This is a much larger scope than the simple provision of autoformatting without linking, although it might be useful in its own right.

> * Use the browser's locale settings to provide default date formatting for
> non-logged-in users.  This will guarantee that autoformatted dates will appear
> in a consistent format throughout the site for ALL users, rather than
> defaulting to their raw format as they do now (which introduces inconsistency.)
This I agree with.
-David E. Siegel
Comment 178 S. McCandlish 2008-10-24 21:06:48 UTC
There ARE a few other places that use US formatting, though.
Comment 179 S. McCandlish 2008-10-24 21:13:14 UTC
> My preference would be to use <<August 10, 2005>> or even <date>August 10,
> 2005</date>, to make it clear that this syntax is in no way connected with
> linking.

The <date>...</date> idea makes sense, in that it is self-explanatory, but may be too long-winded for people to adopt it. The <<...>> idea might work better.
Comment 180 cypsy 2008-10-24 23:40:23 UTC
(In reply to comment #178)
> There ARE a few other places that use US formatting, though.
Yes there are, but can they be identified with 'Accept-Language'?

(In reply to comment #179)
a) <<...>> or <date>...</date> do not fix the problems with the existing 
   syntax. The articles would still be awash in seas of blue. 
b) the existing syntax is very well entrenched. Indeed, it is probably 
   too well entrenched to dislodge. That is, editors are probably going to 
   continue to use the old syntax just out of sheer habit.
c) although the overloading of square brackets to reformat dates is really 
   just syntactic sugar, it has the advantage of flattening the 
   learning curve. There is no additional syntax to learn.
d) our (old hand) inclination to identify square brackets with linkage first 
   and date parsing second is a conceit. There is no real /reason/ to 
   envision things in that order. After all, we don't "link" to images or 
   "link" to categories either. When we /want/ to link to images or categories 
   we put a ':' before the name. With that we change the default behavior of 
   whatever it is in [[ ]]. There is no reason why this cannot also be true 
   for dates.

Now note the order of clauses in the following phrase: "syntax that would autoformat a date, WITHOUT linking it". For that we don't need to invent altogether new syntax. We merely need to /negate/ the implied link. Bill's second patch (theoretically described in comment #150) did precisely that: 

[[:date]] for dates that should be linked, and [[date]] for dates that shouldn't be linked.

Such a thing would require no changes to well established editing habits, nor would it require sending out a bot to rewrite articles. We already use ':' syntax when we want the default behavior of [[ ]] to change for images and categories. This transfers easily to dates.
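
(Illustrative note: a minimal Python sketch of the colon convention described above, where [[date]] is autoformatted without a link and [[:date]] is autoformatted and linked. It only handles "Month day" dates, the /wiki/ URL prefix and the function name are hypothetical, and it is not the code of Bill's patch.)

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# [[October 24]] or [[:October 24]]; other double-bracket links are ignored.
DATE_LINK = re.compile(
    r"\[\[(?P<colon>:?)(?P<month>" + MONTHS + r") (?P<day>\d{1,2})\]\]"
)


def render(wikitext, fmt="mdy"):
    """Autoformat bracketed dates; emit a link only when the colon is present."""
    def repl(match):
        month, day = match.group("month"), int(match.group("day"))
        shown = f"{month} {day}" if fmt == "mdy" else f"{day} {month}"
        if match.group("colon"):
            return f'<a href="/wiki/{month}_{day}">{shown}</a>'
        return shown
    return DATE_LINK.sub(repl, wikitext)
```

Under this scheme existing [[October 24]] markup would lose its link as soon as the change was deployed, without any edit to the article text.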

Comment 181 David E. Siegel 2008-10-25 01:50:40 UTC
(In reply to comment #180)
> (In reply to comment #178)
> > There ARE a few other places that use US formatting, though.
> Yes there are, but can they be identified with 'Accept-Language'?
> (In reply to comment #179)
> a) <<...>> or <date>...</date> do not fix the problems with the existing 
>    syntax. The articles would still be awash in seas of blue. 
This is true, but I gather that existing articles are already being delinked en masse.

> b) the existing syntax is very well entrenched. Indeed, it is probably 
>    too well entrenched to dislodge. That is, editors are probably going to 
>    continue to use the old syntax just out of sheer habit.
More deeply entrenched habits have been changed on Wikipedia in the past, although this one will be a problem. But it will be a problem whatever is done -- editors with entrenched habits will complain loudly if existing syntax suddenly works differently.

> c) although the overloading of square brackets to reformat dates is really 
>    just syntactic sugar, it has the advantage of flattening the 
>    learning curve. There is no additional syntax to learn.
Frankly, I think it will be harder to learn to use the colon. That is too subtle; people will miss it. People often fail to use it when needed, or use it when not needed, in the existing bracket syntax.
 
> d) our (old hand) inclination to identify square brackets with linkage first 
>    and date parsing second is a conceit. There is no real /reason/ to 
>    envision things in that order. After all, we don't "link" to images or 
>    "link" to categories either. When we /want/ to link to images or categories 
>    we put a ':' before the name. With that we change the default behavior of 
>    whatever it is in [[ ]]. There is no reason why this cannot also be true 
>    for dates.
The bracket syntax for both images and categories is pretty much invariably described as a "link", albeit one with special properties. Whenever there is overloaded syntax, there is a choice of how to view it. People using C++ don't have to think of "+" as addition first and concatenation second, but they do, and will. Linking (in the basic sense) is FAR more common than date autoformatting, and until now date autoformatting always linked also. We can play Humpty Dumpty ("my words mean whatever I say they mean, neither more nor less") but people won't listen. Bracket syntax will be thought of as a link, like it or not. And anyway, WHY use the same syntax for two quite different functions, whatever you call it?

> Now note the order of clauses in the following phrase: "syntax that would
> autoformat a date, WITHOUT linking it". For that we don't need to invent
> altogether new syntax. We merely need to /negate/ the implied link. Bill's
> second patch (theoretically described in comment #150) did precisely that: 
> [[:date]] for dates that should be linked, and [[date]] for dates that
> shouldn't be linked.
Yes, that would accomplish that goal. When I first listed this bug, several years ago, I did not think that changing the meaning of existing syntax by fiat and all at once would be acceptable, and even today I'm not convinced that it is a good idea. But it would accomplish my stated goal.
 
> Such a thing would require no changes to well established editing habits, nor
> would it require sending out a bot to rewrite articles. We already use ':'
> syntax when we want the default behavior of [[ ]] to change for images and
> categories. This transfers easily to dates.

I suspect it would not in fact be easy. Moreover, I suspect the effect of changing the rendering of existing markup, on many many pages, would be so strongly resisted that the change would never be implemented. I am generally opposed to date-linking, but I find the idea dubious. At least bot-changes are recorded in the history, and can be individually reverted.
-David E. Siegel
Comment 182 S. McCandlish 2008-10-25 16:52:28 UTC
> our (old hand) inclination to identify square brackets with linkage first 
> and date parsing second is a conceit. There is no real /reason/ to 
> envision things in that order. After all, we don't "link" to images or 
> "link" to categories either. When we /want/ to link to images or categories 
> we put a ':' before the name. With that we change the default behavior of 
> whatever it is in [[ ]]. There is no reason why this cannot also be true 
> for dates.


Works for me.  Someone brought that up earlier, and I'd forgotten.  This would be intuitive and probably easier to implement than an entirely new syntax.  I also really like the idea that it would instantly de-link dates by default the moment Wikipedia installed it.

>  Linking (in the basic sense) is FAR more common than
>  date autoformatting

So?  The very commonness of pointlessly linking dates (the "sea of blue" problem) is very much at the heart of the matter.

>  When I first listed this bug, several years
>  ago, I did not think that changing the meaning of existing syntax by fiat and
>  all at once would be acceptable, and even today I'm not convinced that it is a
>  good idea. But it would accomplish my stated goal.

The only way to find out is to try it.  I would think that this would be vastly preferable to massively unlinking all dates (there are some bots and individuals doing this, but given literally millions of linked dates no one need panic; they will not just suddenly all disappear).

> I suspect it would not in fact be easy. Moreover, I suspect the effect of
> changing the rendering of existing markup, on many many pages, would be so
> strongly resisted that the change would never be implemented.

I don't follow you.  There's already a general consensus that (visible) date linking should mostly stop; there doesn't seem to be any reason to fear any more of a backlash against this particular way of implementing that than against any other method (such as removing [[ and ]] around dates), or against the general proposal to stop linking dates generally.

> At least bot-changes are
> recorded in the history, and can be individually reverted

The [[:DATE]] proposal doesn't have anything to do with editing and history; it's a presentational matter.  What you're arguing seems to be akin to arguing that no changes to the sitewide .css files would be accepted by the community without a massive RFC or VP debate, but (aside from the fact that the debate in this case has already long since occurred more generally) this proves not to be the case at all. The CSS has been radically changed many times without outcry, despite major display alterations in WP's output.  See the histories of the two major CSS files, or for that matter just look at http://nostalgia.wikipedia.org/, if you don't believe me.
Comment 183 Richard Morris 2008-10-25 21:01:22 UTC
Following up on the <date> idea: looking at the HTML5 spec, there is a time element: http://www.whatwg.org/specs/web-apps/current-work/multipage/text-level-semantics.html#the-time-element. This has a syntax like <time datetime="2006-09-23">a Saturday</time>. There may be some advantage in using this, which would make a future transition to HTML5 easier.
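
(Illustrative note: a tiny Python sketch of emitting the time element in the form quoted above, with a machine-readable datetime attribute and free display text. The helper name `time_element` is hypothetical. Python's own `date` type only covers years 1 through 9999, so this sketch makes no attempt at BC dates.)

```python
from datetime import date
from html import escape


def time_element(d, label):
    """Build a <time> element whose attribute is ISO 8601 and whose text is free-form."""
    return f'<time datetime="{d.isoformat()}">{escape(label)}</time>'
```

The display text could then be reformatted per reader while the attribute stays fixed, for example `time_element(date(2006, 9, 23), "23 September 2006")`.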

S. McCandlish's argument that editors should see what IP users see is compelling.
Comment 184 Gerry Ashton 2008-10-26 00:04:53 UTC
The interesting document brought to our attention by Richard Morris has this to say about the purpose of the time element:

"The primary use cases for these elements are for marking up publication dates e.g. in blog entries, and for marking event dates in hCalendar markup. Thus the DOM APIs are likely to be used as ways to generate interactive calendar widgets or some such."

I don't claim to fully understand the spec, but it seems to say the actual value of the time element is the number of milliseconds since 1970-01-01 00:00 UTC. This is likely to lead to it being poorly suited to describe historical events. In particular, dates are Gregorian, and there is no provision to represent BC years. Also, years must be less than or equal to 9999. The suggested use for the element might mean that the people working on that element would not be interested in making it robust for all historical dates.

Since our use would require some kind of extension, we would probably be better off avoiding creating a tag with the same name but different meaning. 
Comment 185 Masem 2008-10-26 15:32:52 UTC
One of the problems we are still left with by simply switching the syntax of date autoformatting without linking is what non-logged in users see, which is a huge crux of the argument.  Since this has been determined to be something that needs to be set per-page, defaulting to international format if a page is not specifically tied to a country that uses the US format, that has to be part of a working end solution.

To me, a simple option altogether is to provide some means of setting a specified date format on a per-article basis, and then a magic word that determines the date format based on the per-page setting and user pref.  From this, templates can handle the rest, including whether to link or not. (Editors have to be mindful of Julian vs Gregorian calendars, and BC/AD as well, when entering dates; there are certain defaults we can make via templates for this, and we can make sure resulting displays show any flags for such.)  Not knowing the code well enough to figure out how to do this, the simplest approach seems to be to 
  1) Assume international format as default per-page.
  2) Allow the magic word __DF_US__ (Date Format US) to override this.   A __DF_INT__ can be added as well for completeness, but again, we start with the assumption this is default.  Presumably the tag would need to be a non-display element at the start of the article in the same way the TOC is done.
  3) The magic word DATEFORMAT returns either "US" or "INT" as to be interpreted by a template.  (If a magic word can't be expanded before template instructions, this could also be a #if-like template function).  This magic word/function returns the US date format if:
     - The user is logged in and has US set.
     - The user is not logged in, or has set no preference, and the per-page setting is set to US.
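
A minimal sketch of the decision rule in step 3, written in Python purely for illustration (it is not MediaWiki code). The function name `date_format` and its two parameters are hypothetical; `user_pref` is assumed to be `None` for anonymous users and for logged-in users who have set no preference, and `page_setting` is assumed to be `"US"` only when the page carries the proposed __DF_US__ flag.

```python
from typing import Optional

US = "US"
INT = "INT"


def date_format(user_pref: Optional[str], page_setting: Optional[str]) -> str:
    """Return "US" or "INT" following the rule proposed in step 3.

    user_pref    -- "US", "INT", or None (anonymous, or no preference set)
    page_setting -- "US" if the page carries the hypothetical __DF_US__
                    flag, otherwise None or "INT"
    """
    # An explicit user preference always wins over the page setting.
    if user_pref in (US, INT):
        return user_pref
    # Otherwise fall back to the per-page setting, defaulting to
    # international format when the page does not ask for US format.
    return US if page_setting == US else INT
```

Under these assumptions an anonymous reader of a page flagged as US gets "US", while a logged-in reader who prefers international format gets "INT" on the same page.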

Comment 186 cypsy 2008-10-29 17:10:36 UTC
(In reply to comment #185)
> One of the problems we are still left with by simply switching the syntax of
> date autoformatting without linking is what non-logged in users see, which is a
> huge crux of the argument.  

That has been settled; everyone will see whatever 'Accept-Language' suggests they should see. 

> Since this has been determined to be something that needs to be set per-page, 

A) Dateformatting has not in fact been "determined to be something that needs to be set per-page". This is a server-related bug report of a specific Mediawiki extension. This is not en.wikipedia.org. This is also not anything to do with magic words. Date-formatting has been user-oriented as long as server-side date-formatting has existed, and there has been no discussion to change that model, which is also way beyond the scope of this bug report.

B) Don't put the cart before the horse. If you want that magic word to be supported, you will first have to ensure that that magic word exists. To do that you will need to file a different bug report, or submit a patch for approval etc. *Once* the magic word exists, adding support for it will not require major changes to DateFormatter.

Incidentally, what you are asking for is not going to happen without some thoughtful consideration (elsewhere!) of its ramifications; after all, it will constitute the very first editor-defined content-related magic word, and as such represents a brand-new facility. While there is some precedent for such a thing, CATEGORYSORT and DISPLAYTITLE are not content related. Some thought should also be given to how the facility could be extended/used in the future, e.g. for ENGVAR (think 'Content-Language') and era, and how bots and browsers may make use of it.

Also, content-oriented magic is by definition incompatible with user-oriented settings (which date formatting is), and the order of precedence (or getting rid of the user-oriented setting) will either need executive decision or will need to be discussed in open forum (elsewhere!). Good luck.

Your idea is sound, but the feature you want does not exist yet, and so can't be used here (yet). (BTW: double-underscore magic words are preprocessor commands, not environment variables).
Comment 187 David E. Siegel 2008-10-29 22:20:25 UTC
(In reply to comment #182)
> > our (old hand) inclination to identify square brackets with linkage first 
> > and date parsing second is a conceit. There is no real /reason/ to 
> > envision things in that order. After all, we don't "link" to images or 
> > "link" to categories either. When we /want/ to link to images or categories 
> > we put a ':' before the name. With that we change the default behavior of 
> > whatever it is in [[ ]]. There is no reason why this cannot also be true 
> > for dates.
> Works for me.  Someone brought that up earlier, and I'd forgotten.  This would
> be intuitive and probably easier to implement than an entirely new syntax.  I
> also really like the idea that it would instantly de-link dates by default the
> moment Wikipedia installed it.
There is a difference, IMO, between delinking done as an edit, even by a bot, that can be reverted, and all-at-once effective delinking across an entire site. Also, this is a change to the software, which would affect more sites than Wikipedia. Not all of those sites will necessarily share the views of the Wikipedia community.

> >  Linking (in the basic sense) is FAR more common than is
> date-autoformatting
> So?  The very commonness of pointlessly linking dates (the "sea of blue"
> problem) is very much at the heart of the matter.

My point is that most editors consider the [[ syntax to basically mean "link", and that the idea that it actually means any one of several things, depending on context, is not taken in. Even the category and image uses are spoken of as links. Changing the software so that [[date]] is no longer a link, but [[:date]] is, seems perverse to me, and likely to be confusing.

> >  When I first listed this bug, several years
> >  ago, I did not think that changing the meaning of existing syntax by fiat and
> >  all at once would be acceptable, and even today I'm not convinced that it is a
> >  good idea. But it would accomplish my stated goal.
> The only way to find out is to try it.  I would think that this would be vastly
> preferable to massively unlinking all dates (there are some bots and
> individuals doing this, but given literally millions of linked dates no one
> need panic; they will not just suddenly all disappear).
Actually I think that the bot method is preferable -- it allows individuals watching particular articles to see the change and relink where appropriate, and no one denies that date links are proper in some (if perhaps few) cases. With a sudden site-wide change, the burden is shifted to those wishing to maintain the status quo, which seems mistaken to me.

> > I suspect it would not in fact be easy. Moreover, I suspect the effect of
> > changing the rendition of existing markup, on many many pages, would be so
> > strongly resisted that the change would never be implemented.
> I don't follow you.  There's already a general consensus that (visible) date
> linking should mostly stop; there doesn't seem to be any reason to fear any
> more of a backlash against this particular way of implementing that than
> against any other method (such as removing [[ and ]] around dates), or against
> the general proposal to stop linking dates generally.
> > At least bot-changes are
> > recorded in the history, and can be individually reverted
> The [[:DATE]] proposal doesn't have anything to do with editing and history;
> it's a presentational matter.  What you're arguing seems to be akin to arguing
> that no changes to the sitewide .css files would be accepted by the community
> without a massive RFC or VP debate, but (aside from the fact that the debate in
> this case has already long since occurred more generally) this proves not to
> be the case at all. The CSS has been radically changed many times without
> outcry, despite major display alterations in WP's output.  See the histories of
> the two major CSS files, or for that matter just look at
> http://nostalgia.wikipedia.org/, if you don't believe me.

Have any such changes in "presentation" changed things so that what was a link before was not a link afterwards? If not, then I don't think they are relevant. There is IMO a difference in kind, not just in degree, between changing color or text style or even layout, and changing from linked to non-linked.

Moreover, this change, if made, will be MORE than site-wide. No change to site-level CSS files changes things for every wiki using the MediaWiki software, but this change would do that.

-DES


Comment 188 S. McCandlish 2008-10-30 00:30:34 UTC
Keep it simple.  The plan as currently outlined makes sense.  There is no "shift in burden"; there simply *won't be* a burden any longer.  Cannot agree that "'[[..]]' means 'link'"; it *usually* means "link", but it also means "inline an image" or "add a category", and Wikipedians and other users of MW understand this just fine.  Adding another variant isn't going to cause any heads to asplode.  The great benefit of doing it the [[:...]] way is that the bots that are ticking people off can just STOP, as can manual de-linking sprees, and in any (rare) case where a date should be linked all it takes is the addition of a single character to make it happen.  Enough kvetching; let's just do this and move forward.  PS: There's nothing "sudden" about any of this; the matter's been debated (at WP; other wikis, WMF or not, have not seemed to care one way or the other, judging by the lack of their commentary here) for *years* and come out strongly in favor of de-linking of dates.  
Comment 189 billclark 2008-10-30 15:09:52 UTC
One advantage to using a new markup (e.g. [[[30 October, 2008]]] or <<30 October, 2008>>) as opposed to the existing markup ([[30 October, 2008]]) is that a new markup would improve the efficiency of the parser function with regards to matching date formats.  That in turn would allow us to greatly expand the number of patterns that can be matched, so that dates in a variety of other formats could be recognized (e.g. 10/30/2008, October 30th 2008, Oct. 30, etc.)

As currently implemented, the date parsing code is first run against the entire page text, and checks the contents of each [[foo]] item to see if "foo" matches a known date format.  That means that each wikilink in a page will be checked against each and every date format, which is inefficient.  If a new -- UNIQUE -- markup is used, then ONLY dates will match the markup pattern, and so the "inside" portion can be matched against pretty much as many different date format patterns as we like, without a significant  performance hit.

-Bill Clark
Comment 190 cypsy 2008-10-30 19:58:32 UTC
Bill,...

A) using a new markup may (/may/, see below) be an advantage vis-a-vis the current scheme, but *not* using a new markup is also *not* a disadvantage vis-a-vis the current scheme. 

B) if I'm reading your "allow us to greatly expand the number of patterns" correctly, you envision that [[October 30th 2008]], [[Oct. 30]] etc would be individually regex matched, i.e. using a separate rule for each possible variation. But there are only six primitive regexes, no more than DateFormatter's present set of eight. So, if your version stuck to primitives instead of trying every possible combination with a regex, there would be no more matching than DateFormatter is already doing.

C) DateFormatter does not have to continue to be the monolithic beast that it is now. The process can (/should/!) be split to run in two halves. The first half would run pre-cache (e.g. at ParserFirstCallInit) and would pre-process the multitude of human-esque date formats to a "normalized" (simplified) form that can be quickly machine-processed later. The second half, running ex-cache (like DateFormatter now), would then convert the simplified form into whatever 'Accept-Language'/user-pref/etc says it should be. How dates are tagged/preprocessed internally (for faster processing later) is left to the developer's discretion; editors don't see it or need to care about it.

Comment 191 billclark 2008-10-30 20:22:10 UTC
(In reply to comment #190)

> B) if I'm reading your "allow us to greatly expand the number of patterns"
> correctly, you envision that [[October 30th 2008]], [[Oct. 30]] etc would be
> individually regex matched, i.e. using a separate rule for each possible
> variation. But there are only six primitive regexes, no more than
> DateFormatter's present set of eight. So, if your version stuck to primitives
> instead of trying every possible combination with a regex, there would be no
> more matching than DateFormatter is already doing.

I don't follow this at all.  Could you explain it in some more detail?

> C) DateFormatter does not have to continue to be the monolithic beast that it
> is now. The process can (/should/!) be split to run in two halves. The first
> half would run pre-cache (e.g. at    ParserFirstCallInit) and would pre-process
> the multitude of human-esque date formats to a "normalized" (simplified) form
> that can be quickly machine-processed later. The second half, running ex-cache
> (like DateFormatter now), would then convert the simplified form into whatever
> 'Accept-Language'/user-pref/etc says it should be. How dates are
> tagged/preprocessed internally (for faster processing later) is left to the
> developer's discretion; editors don't see it or need to care about it.

I'm not sure I follow this, either.  Where would the normalized dates be stored?  If I'm understanding this correctly, you're saying that some normalization of the date format should take place when a new edit is submitted, stored (where?) and then when the cache is rebuilt following the first page request, the new DateFormatter code can operate on the normalized dates.  That would seem to either require a new place to store the normalized page text or else we would be replacing the submitted article text with the normalized version, neither of which I think is a great idea.  Could you clarify this point as well?

-Bill Clark
Comment 192 cypsy 2008-10-31 05:01:05 UTC
I'm not a php guy, so what follows is only my take from what I can glean by reading the code.

Explaining B)

DateFormatter presently loops through 8 regexes, each being a particular date-format pattern. So, e.g. [d M Y] has one pattern, and [d M] has another pattern and so on. The entire article is searched through eight times, once for each regex.

In that scheme, if you also wanted to handle [31st October 2008] and [31 Oct. 2008] and [31st Oct. 2008] and [31 Oct 2008] and [31st Oct,] etc ad nauseam, you would need a new regex for each of them, and the number of passes through the article rises accordingly.

But you could also find a "primitive" that matches all of the above (see link in the email I sent a few days ago). In such a regex punctuation would be ignored, the test for month names would be (October|Oct) (with /i of course), the day-number test would be (\\d{1,2})(st|nd|rd|th), and the year + era would also be parenthesized as optional.
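
As a rough illustration of such a day-first "primitive", here is a sketch in Python rather than PHP. It is not taken from DateFormatter; the names `MONTHS`, `DMY` and `parse_day_first` are invented for this example, and the exact set of separators and era labels accepted is an assumption.

```python
import re

# Full names first, then common abbreviations (an illustrative list).
MONTHS = (
    "January|February|March|April|May|June|July|August|"
    "September|October|November|December|"
    "Jan|Feb|Mar|Apr|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
)

# One pattern covering "31st October 2008", "31 Oct. 2008", "31st Oct," etc.
DMY = re.compile(
    r"(\d{1,2})(?:st|nd|rd|th)?"   # day number with optional ordinal suffix
    r"[\s.,]*"                     # punctuation and spacing are ignored
    rf"({MONTHS})"                 # month name or abbreviation
    r"[\s.,]*"
    r"(\d{1,4})?"                  # optional year
    r"(?:\s*(BCE|BC|CE|AD))?",     # optional era
    re.IGNORECASE,
)


def parse_day_first(text):
    """Return (day, month_abbrev, year_or_None, era_or_None), or None."""
    m = DMY.fullmatch(text.strip())
    if m is None:
        return None
    day, month, year, era = m.groups()
    return (
        int(day),
        month[:3].title(),
        int(year) if year else None,
        era.upper() if era else None,
    )
```

A single pattern like this handles all the day-first variants listed above, so the number of passes no longer grows with each extra spelling; a month-first primitive would be built the same way.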

When you make the regex as encompassing as possible, the regex handler does not have to return to the calling routine as often, which cuts out one pass through the article for every regex that you save. While the regex itself gets slower, this is not as significant as extra runs through the php. After all, the regex handler is native. 

Secondly, and irrespective of which syntax is chosen, running through the article N times is wasteful. phase3/includes/Parser.php does this much faster by exploding all "links" to an array, string-manipulating them there, and then merging them back. Incidentally, since parser.php is processing each and every [[ ]] anyway you'd be handed each "link" on the silver platter if you could get the devs to provide a call/hook from that level. I'm guessing but the end of the inner loop in replaceInternalLinks2() seems like a good place.

Explaining C)

Currently, when an editor presses save, the article is saved in two "states". One state in the editable form which editors see, and the other is the form that /eventually/ makes its way to the client as html. The second of these two forms is immediately run through a preprocessing phase, which includes most (all?) template expansion and many mediawiki extensions such as cite.php. This is relatively slow, and this preprocessed form is what is cached.

Later, when a client requests an article, the preprocessed version is retrieved from cache and processing is finalized before finally becoming the html as you see it. This "final" stage is where DateFormatter currently sits (after all, it needs to be userpref aware, and so needs to update content on the fly) and also where "links" are expanded to become "red links", "blue links", image links, category links, "Special:" namespace links, "Media:" links, ISBN links, PUBMED links, inline images, etc etc etc.

So, the date normalization would be done at about the same time as template expansion et al, which is in the state that eventually becomes html. It's not in the state that would subsequently be edited.
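
To make the two-halves idea concrete, here is a small Python sketch (not MediaWiki code). The internal marker format, the function names `normalize` and `render`, and the restriction to two full-date spellings are all assumptions made for illustration.

```python
import re

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]
MONTH_INDEX = {name.lower(): i + 1 for i, name in enumerate(MONTHS)}
_NAMES = "|".join(MONTHS)

# What editors write: [[30 October 2008]] or [[October 30, 2008]].
WIKI_DATE = re.compile(
    rf"\[\[(?:(\d{{1,2}}) ({_NAMES})|({_NAMES}) (\d{{1,2}})),? (\d{{4}})\]\]",
    re.IGNORECASE,
)
# Hypothetical internal marker stored in the cached text; editors never see it.
MARKER = re.compile(r"\x7fdate:(\d{4})-(\d{2})-(\d{2})\x7f")


def normalize(wikitext):
    """First half (pre-cache): rewrite human date spellings to one marker."""
    def repl(m):
        day = m.group(1) or m.group(4)
        month = m.group(2) or m.group(3)
        return "\x7fdate:%04d-%02d-%02d\x7f" % (
            int(m.group(5)), MONTH_INDEX[month.lower()], int(day))
    return WIKI_DATE.sub(repl, wikitext)


def render(cached, fmt):
    """Second half (post-cache): expand markers for one reader's format."""
    def repl(m):
        year, month, day = (int(g) for g in m.groups())
        name = MONTHS[month - 1]
        return f"{name} {day}, {year}" if fmt == "mdy" else f"{day} {name} {year}"
    return MARKER.sub(repl, cached)
```

The slow, format-tolerant matching happens once in `normalize`; `render` only has to find one fixed marker shape on each request, which is the cheap part that depends on the reader.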

Comment 193 billclark 2008-10-31 16:47:12 UTC
(In reply to comment #192)
> Secondly, and irrespective of which syntax is chosen, running through the
> article N times is wasteful. phase3/includes/Parser.php does this much faster
> by exploding all "links" to an array, string-manipulating them there, and then
> merging them back. Incidentally, since parser.php is processing each and every
> [[ ]] anyway you'd be handed each "link" on the silver platter if you could get
> the devs to provide a call/hook from that level. I'm guessing but the end of
> the inner loop in replaceInternalLinks2() seems like a good place.

Ah, I understand your point now.  Thanks for the clarification.

I think I was getting ahead of myself in thinking about this, and was (mistakenly) thinking that the DateFormatter was already extracting [[ ]] "links" prior to matching against the date regexes.  Revisiting the code, I see it's exactly as you describe.

Now, the reason I was suggesting we use a completely new markup is because I'd like to make DateFormatter work essentially the same way as Parser does -- by first extracting all of the date-marked-up strings into an array, then iterating through that.  That way, the first pass of DateFormatter would only need to use one regex (to identify the strings with the new date markup) and then we could apply either a large number of simple regexes or a smaller number of more complex regexes to ''only'' the strings that were extracted in the first pass.  I believe that would be far more efficient than the current scheme, and would allow us to recognize a much larger number of formats (including AD/CE and BC/BCE indicators, as well as indicators as to which calendar is being used, etc.) AND to do some additional processing based on the date itself to handle the ISO-related exceptions that Gerry has repeatedly mentioned.
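
A sketch of that two-pass scheme in Python, for illustration only. The `<<...>>` delimiters are one of the hypothetical markups floated earlier in this thread, and the pattern list, `classify` and `extract_dates` are invented names, not part of DateFormatter.

```python
import re

# First pass: a single cheap regex that only finds the date markup itself.
DATE_SPAN = re.compile(r"<<(.+?)>>")

# Second pass: any number of format patterns, applied only to the short
# extracted strings and never to the whole page.
PATTERNS = [
    ("dmy", re.compile(r"(\d{1,2}) ([A-Z][a-z]+),? (\d{4})")),
    ("mdy", re.compile(r"([A-Z][a-z]+) (\d{1,2}),? (\d{4})")),
    ("numeric_mdy", re.compile(r"(\d{1,2})/(\d{1,2})/(\d{4})")),
]


def classify(span):
    """Return the name of the first pattern that matches the whole span."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(span):
            return name
    return None


def extract_dates(wikitext):
    """Collect (span, format_name) pairs for every marked-up date."""
    return [(span, classify(span)) for span in DATE_SPAN.findall(wikitext)]
```

A page with no marked-up dates costs exactly one regex scan, and ordinary [[wikilinks]] are never tested against the date patterns at all.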

> Currently, when an editor presses save, the article is saved in two "states".
> One state in the editable form which editors see, and the other is the form
> that /eventually/ makes its way to the client as html. The second of these two
> forms is immediately run through a preprocessing phase, which includes most
> (all?) template expansion and many mediawiki extensions such as cite.php. This
> is relatively slow, and this preprocessed form is what is cached.

Ah, I haven't looked much at the cache-related code, mostly because I turn it off in all of my own MediaWiki installations.  Your claimed lack of PHP expertise notwithstanding, you seem to have a very good grasp of what's going on in those parts of the code, so I'll trust that your explanation is accurate.

However, even with the improvements due to caching, I still think it would be better to use a new markup and to make DateFormatter work more like Parser, since it would mean that articles would only need to be matched against one -- very simple -- regex in most cases, and then the extracted date strings could be matched in a second pass (and thus any articles that had zero dates marked up in the new format would skip the second pass entirely and only need to be matched against a single regex.)

If we DO introduce a new markup syntax, I'd like to leave open the possibility of using it for things other than dates as well.. unit conversions and spelling variations are just two other cases where I could see this type of functionality being useful.  For the same reasons that I think date formats should be something that's ultimately up to the reader, I think spelling variations and unit preferences should also be reader-specific.  Since that represents an entirely new class of functionality, it's another reason I'm leaning toward a new markup syntax.

-Bill Clark
Comment 194 Rich Farmbrough 2008-10-31 22:14:49 UTC
A little note.

While we are discussing ways to display different formats the world moves on, in particular the US does.  Whilst the majority of US usage seems still to be month-first - "April 10" - the proportion of day first - "10 April" - is increasing, or so I read in a well-known on-line encyclopaedia.  The US military use day-first, and the US courts to some extent (I know not how extensively).  For this reason I feel that tying date format to reader locale is dubious at best (as is referring to the date format as "US"). If, however, it were possible to know the system date format it might be more apposite. 
Comment 195 billclark 2008-10-31 23:01:01 UTC
(In reply to comment #194)
> I feel that tying date format to reader locale is dubious at best
> (as is referring to the date format as "US"). If, however, it were
> possible to know the system date format it might be more apposite. 

There might be ways of getting such information using Javascript, but it would be very platform-dependent and is bordering on invasion of privacy issues, since it's not information that's normally transmitted with web requests (as opposed to the Accept-Language, which is.)

I'm also not sure if it's really worth the trouble.  Remember, using Accept-Language is just a ''guess'' at the date format the reader expects, just as using "strong national ties" for the article subject was simply a ''guess'' at the majority readership (and thus an even worse guess as to the expected date format.)  As long as the majority of readers that have "en-US" set as their Accept-Language expect "Month Day, Year" format, then we're fine in using that as the default for those users (and FAR better off than we are now.)

The best solution for format issues is for a reader to register and set a format preference themselves.  The only problem with that currently is that setting that preference then hides inconsistencies in format from the editor who sets a preference -- but if there were no inconsistencies in format (because we use some reasonable default instead of "no default") then that would no longer be an issue, and we could then ''encourage'' people to set a date format preference.

Also, just as an aside, I happen to greatly prefer dates in "big-endian" format (i.e. YYYY-MM-DD format, which I DO NOT consider to be ISO, despite what Wikipedia labels it) and find that format both easier to read and more logical.  I'm mentioning this mostly because a lot of people here and on Wikipedia seem to take it as a given that big-endian format is "unreadable" and that nobody wants it.  I'm also far from alone, since a LOT of programmer-types prefer that format (mostly because alphanumeric sort order then corresponds to chronological order) so we shouldn't make too many assumptions about the popularity of various date formats and how it relates to the depth of feeling that people have on the matter.
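
The sort-order point can be checked with a few lines of Python. This is only an illustration: the helper name `text_sort_is_chronological` is made up, and the check assumes four-digit years.

```python
from datetime import date

events = [date(2008, 11, 4), date(2006, 1, 12), date(1939, 9, 3)]


def text_sort_is_chronological(dates, fmt):
    """True if sorting the formatted strings also sorts the dates."""
    by_text = sorted(dates, key=lambda d: d.strftime(fmt))
    return by_text == sorted(dates)


big_endian_ok = text_sort_is_chronological(events, "%Y-%m-%d")
day_first_ok = text_sort_is_chronological(events, "%d/%m/%Y")
```

For these three dates the big-endian strings sort chronologically, while the day-first strings put 04/11/2008 ahead of 12/01/2006.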

-Bill Clark
Comment 196 Stephen Turner 2008-11-01 08:36:33 UTC
(In reply to comment #195)
> 
> Also, just as an aside, I happen to greatly prefer dates in "big-endian" format
> (i.e. YYYY-MM-DD format, which I DO NOT consider to be ISO, despite what
> Wikipedia labels it) and find that format both easier to read and more logical.
>  I'm mentioning this mostly because a lot of people here and on Wikipedia seem
> to take it as a given that big-endian format is "unreadable" and that nobody
> wants it.
>

I think the "given" is a weaker one: that it shouldn't be presented to users who haven't explicitly chosen that format, certainly in normal prose. So, given the current code, no-one should write "the Second World War started on [[1939-09-03]]", because unregistered and new users will see the date in that format: worse, regular users won't even realise that that's what most people are seeing.
Comment 197 Gerry Ashton 2008-11-01 14:27:04 UTC
(In reply to comment #196)
> (In reply to comment #195)
> I think the "given" is a weaker one: that it shouldn't be presented to users
> who haven't explicitly chosen that format, certainly in normal prose...

I think that the YYYY-MM-DD format shouldn't be presented
to readers who haven't explicitly chosen that format (in normal
prose), and further, the readers should be explicitly warned
at the time of making the choice
that dates in that format do not conform to ISO 8601, and
that they are a simple substitution and reordering of the date
that was originally written by the editor who wrote it, without
consideration of what calendar it might be in. This would
require a change to the preference menu where the format
is selected.

I also think there should be different preference choices for
articles vs. system information, because I'd like to see
system information (e.g. the items in my watch list) in
the YYYY-MM-DD format but I don't want to see it in articles.
Comment 198 cypsy 2008-11-03 16:25:01 UTC
Returning to the subject at hand...

Assuming that there is any interest in preserving the potentially valuable markup, there are only two issues that /need/ to be taken care of *RIGHT NOW*. 
1) The seas of blue, 
2) the YYYY-MM-DDs. 
These are the two basic issues addressed by MOSNUM's deprecation of date linking.
* The first issue merely requires standard [[date]] syntax to not be a link. 
* The second issue merely requires dates to be converted for anons too.

If/When/How an alternate date-formatting markup is implemented does not have anything to do with [[date]] syntax and accordingly is not a solution for the "seas of blue". The "seas of blue" problem has to be resolved one way or another, and given that we already have bots and users running around stripping dates of their markup, the "seas of blue" problem has to have priority over everything else. 

The 'accept-language' method is -- _for_the_moment_ -- sufficient to guess what an anon's preferred date format might be. All the other preference detection is non-crucial "nice-to-have" stuff that neither affects a majority of the readership, nor will it be "missed" since it wasn't there to begin with. And all of the other preference detection depends on whether wikis will provide something that can be detected to begin with. It is accordingly pointless to discuss it here and now. Nice-to-have thingies should be filed elsewhere as "Requests for enhancement" after discussion at the village pumps. It was precisely this sort of brainless "exception ABC needs solution XYZ" narcissism that led to there being no sensible default for anons.

A patch is available, and although incomplete because it doesn't resolve the default-for-anons problem, it is a first step, and eminently commit-able. The fatal problem then is that devs with commit privileges don't care to commit it. AND THEY WON'T as long as people bicker about _future_enhancements_ as if such features would solve the bugs. The only thing that these requests-for-enhancement do is indicate that a solution is not in sight.

If there is no interest in preserving the potentially valuable markup, then there is no need to discuss /anything/ date-formatting related. The [[date]]s will vanish eventually, and MOSNUM has deprecated any further markup. So this bug report will eventually become obsolete, and with it all the blocking requests-for-enhancements. The stalling over added functionality allows date-formatting to destroy itself.

If Bill Clark would upload a patch that does what his [[:]] patch theoretically did, then the REASON FOR THIS BUG REPORT is solved. This bug could then be reassigned to Tim (or another dev) and marked _FIXED_. MOSNUM's other technical issue is the date-format for anons, and if this patch could infer a sensible default then that problem would be solved too. Aside from these two issues there is nothing else that "must" be taken care of _now_.
Comment 199 billclark 2008-11-03 16:49:59 UTC
(In reply to comment #198)

> If Bill Clark would upload a patch that does what his [[:]] patch theoretically
> did, then the REASON FOR THIS BUG REPORT is solved. This bug could then be
> reassigned to Tim (or another dev) and marked _FIXED_. MOSNUM's other technical
> issue is the date-format for anons, and if this patch could infer a sensible
> default then that problem would be solved too. Aside from these two issues
> there is nothing else that "must" be taken care of _now_.

Good point.

This is a busy week for me at my real life job, but I can probably rework the patch by the weekend.  The only complicated part (I think) is tying it in with the cache stuff, since I think the anon default would need to be worked into the user preferences so the cache checking code knows which version of the page to retrieve.  At least, I ''think'' that's how the cache code works (one pre-processed version of the page is cached for each user preference option) but I'll poke through the code to be sure, when I get some time.  I may be able to do at least that much today.

It may actually be simpler (and less efficient) than that, if it's not tied into the cache checking code, in which case I'll be able to provide an updated patch within the next few days.

-Bill Clark
Comment 200 cypsy 2008-11-04 05:32:13 UTC
(In reply to comment #199)

> It may actually be simpler (and less efficient) than that, if it's not tied
> into the cache checking code, in which case I'll be able to provide an updated
> patch within the next few days.

DateFormatter is indeed not tied to the cache. Which is why there is lots of room for making it more efficient ;)

There is only one version in the cache, and links (of all kinds) are not pre-processed into <A>s. This expansion occurs in the very last stage of wiki->html conversion, and needs to be done post-cache so that links are sensitive to state (user and dependencies). I.e., a user's [[Special:]] links at the top-right of each page need to be correct, redlinks/purplelinks/bluelinks state needs to be marked up with the appropriate css classes, and so on.
Comment 201 billclark 2008-11-04 19:21:03 UTC
Created attachment 5502 [details]
unlinks dates, preserves markup, uses Accept-Language for defaults (or DMY)
Comment 202 billclark 2008-11-04 19:25:39 UTC
I've uploaded a modified version of my second patch.

It uses the current date markup ([[ ]]) and will reformat dates according to user preferences, but WITHOUT making them into links.  Dates that should be linked can use the [[:]] format and they will be linked and left in their raw format (same as now.)  Users with "No preference" or anon users will either see marked-up dates in MDY format (if the string "en-US" appears in the "Accept-Language" header sent by their browser) or DMY format (aka "International format") otherwise.
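
The default selection described above amounts to something like the following Python sketch. It is not the patch itself (the patch is PHP, see the attachment); the function name `default_date_format` is hypothetical, and treating the header case-insensitively is an assumption added here.

```python
from typing import Optional


def default_date_format(accept_language: Optional[str]) -> str:
    """Pick a default for anons and "No preference" users.

    Returns "mdy" when the string "en-US" appears anywhere in the
    Accept-Language header, and "dmy" (international format) otherwise.
    """
    # Assumption: compare case-insensitively, since browsers may send
    # "en-us" as well as "en-US".
    if accept_language and "en-us" in accept_language.lower():
        return "mdy"
    return "dmy"
```

Note that this is a plain substring test, so q-values are ignored: a header that lists en-US last with a low weight still selects MDY.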
Comment 203 billclark 2008-11-04 19:35:38 UTC
(In reply to comment #202)
> Users with "No preference" or anon users 

Hmm.. it just occurred to me that maybe these two should be handled differently?  The most recent patch will basically force all properly marked-up dates to appear in a consistent format for all users.  But perhaps logged-in users who have specified "No preference" might actually want to see the (possibly inconsistent) raw format, for some reason?

It's currently recommended practice to have "No preference", because anon users see dates in raw format.. but since (with this patch) anon users will see either MDY or DMY format consistently, should "No preference" users see the same thing as anons, or should "No preference" continue to mean "raw" format?

Comment 204 Stephen Turner 2008-11-04 19:45:19 UTC
Doesn't "no preference" really mean that the new editor hasn't found the preferences page yet? In which case, I think they should be treated the same as anons.

There could alternatively be another option to distinguish the two cases: "guess from my locale" (default) vs "no transformation". (Better names needed!).
Comment 205 cypsy 2008-11-06 00:11:40 UTC
Well done Bill (technical observation/suggestion follows by e-mail)

Apropos 'Accept-Language' for "No preference" or anon users: The rationale behind the currently recommended practice of having "No preference" is to see dates as anons see them. Since anons are now going to see something other than YYYY-MM-DD, having "No preference" should do the same.

Stephen's "no transformation" is a good idea for an /additional/ option, but getting that added to the prefs page is going to be another ball of wax altogether. Bill could of course preempt that addition by adding that functionality to DateFormatter now, but I think it would be a bit premature to change the list of 'const'ants right now. 
Comment 206 cypsy 2008-11-06 00:27:10 UTC
Bill, I'm not sure if this matters, but there is a discrepancy in the dates of the version that you are diff'ing against.
  Your diff is against some version from October 23
  The version on svn (revision 36353) is dated June 16
Comment 207 Gerry Ashton 2008-11-06 00:43:09 UTC
Concerning whether a logged in user with no preferences should see what an
anonymous reader sees, or should see an untransformed date, making no
preferences equal the same as anonymous reader removes function, and
makes it impossible to see untransformed dates. Function should not
be removed, especially because a significant number of editors would
rather see harmony between the style of the article and the style of the
dates, rather than one fixed choice.

Further, the present preference selection menu is inadequate, and for
best chance of acceptance of this change, it should be changed as
follows:

-An explicit warning that dates presented in the format that looks
like ISO 8601 do not always conform to that standard

-Replace "No preference" with two items, "Default" and "No transformation"

-Separate the choice for the realm where the date autoformatting presently
occurs (articles, talk pages, Wikipedia space, user pages, help pages, etc.)
with the choice for system messages (such as what appears on watchlists
and history pages). This would provide the advantage that system messages
in the YYYY-MM-DD format include seconds, while other formats do not, so
separating the choices would allow people to see seconds in system messages
and at the same time article dates that include spelled-out months. (Not
that people usually want to see seconds, but it may be useful on occasion.)
Comment 208 Voyagerfan5761 / dgw 2008-11-06 01:14:20 UTC
I would like to second Gerry's suggestions from c207, with a question about ISO 8601 dates. As I understand it, the issue arises with dates in the Julian calendar. Since dates on talk pages, system messages, and such shouldn't be anything but Gregorian, would the "explicit warning" only apply to the option for content-namespace dates?
Comment 209 Gerry Ashton 2008-11-06 02:04:14 UTC
I agree with Voyagerfan5761 that system generated times, which will neither have problems with being in a non-Gregorian calendar, nor have a problem with years greater than 9999, will conform to ISO 8601. It might be more expedient to just say dates in that format don't conform to the standard, rather than trying to explain when the standard is obeyed and when it isn't. People will understand 21st century dates even if we make no claim about conforming to the standard.
Comment 210 cypsy 2008-11-06 05:06:17 UTC
In response to Comment #207, Comment #208, Comment #209:

> "making no preferences equal the same as anonymous reader removes  function". 

What function? 
Doing nothing (not doing anything) is not a function. Consequently, 
doing anything (not doing nothing) isn't a loss of function.

> "impossible to see untransformed dates"

The purpose of linking dates is to see them transformed. There is no purpose to linking dates that are not to be transformed.
The condition is reversible; the condition that makes them transformable can be removed to make them untransformable. Therein lies the solution for "impossible to see untransformed dates".
Also, it seems that date transformation no longer occurs in previews, so there you go.

> An explicit warning that dates presented in the format that looks like ISO 8601 do not always conform to that standard

This is not the right place to complain about the perceived abuse of the term "ISO". Regardless of the potential for abuse, the term "ISO" is a handy moniker for "YYYY-MM-DD". Come up with something else and then go preach it to the world (after solving some more concrete crises perhaps). DateFormatter doesn't care what you call it. DateFormatter does not have fireside chats, let alone chit-chat about cerebral abstractions.

> Julian calendar

DateFormatter does not care a hoot about the validity of a date or what calendar is being followed. For all it cares, there are a bazillion months in the year, 99 days in every month, and every 42 years is a leap year. Accordingly, February 92 is just as (in)valid as February 29. Garbage in, garbage out. This approach has not been a problem in the past, and there is no point to making it a problem now.
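
A toy illustration of the "garbage in, garbage out" point, in Python rather than PHP. The pattern and the function name `to_dmy` are made up for this sketch and are not DateFormatter's actual regexes; the only thing taken from the comment above is that the day number is reordered without being checked.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Matches "[[February 29]]"-style markup. The day is any one or two
# digits; nothing checks whether that day exists in that month.
MONTH_DAY_LINK = re.compile(r"\[\[(" + MONTHS + r") (\d{1,2})\]\]")

def to_dmy(wikitext):
    """Reorder marked-up "Month D" dates to "D Month", unvalidated."""
    return MONTH_DAY_LINK.sub(r"\2 \1", wikitext)
```

Here `to_dmy("[[February 29]]")` gives "29 February" and `to_dmy("[[February 92]]")` gives "92 February" just as happily.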

> Replace "No preference" with two items, "Default" and "No transformation"

Adding a new date pref option is far beyond the scope of this bug report. It's not even DateFormatter's business to generate the prefs page. It also has ramifications far beyond DateFormatter. If you want such changes, you need to file another bug report.
Comment 211 S. McCandlish 2008-11-06 06:00:46 UTC
> The most recent patch will basically force all properly marked-up
dates to appear in a consistent format for all users.  But perhaps logged-in
users who have specified "No preference" might actually want to see the
(possibly inconsistent) raw format, for some reason?

Logged-in "no preference" users should continue to see the raw format, if possible, as this helps us to identify inconsistencies and fix them (inconsistencies of that level of geekiness are probably only going to be noticed during featured article reviews, but still).  We also need to keep in mind that WP content can be repurposed in multiple ways, including database dump and import into other wikis that may have different settings (i.e., might not be doing autoformatting tricks), so ultimately consistency is non-trivial.

> The rationale
behind currently recommended practice to have "No preference" is to see dates
as anons see them. Since anons are now going to see something other than
YYYY-MM-DD, then having "No preference" should do the same.

Not really. Some have stated it that way, but it is a misstatement.  The rationale is actually to *see what is really there* so that consistency problems are easy to spot and resolve. Before this patch, that happened to equate to seeing what anons see, but the relationship is one of coincidental correlation, not causation.

> There could alternatively be another option to distinguish the two cases:
"guess from my locale" (default) vs "no transformation". (Better names
needed!).

That compromise works for me, other than I would reverse the default. If someone has an account, they are most probably an editor, and (MOS collectively feels, anyway) that they *should* be seeing the real deal, for the reasons given above.

> Function should not
be removed, especially because a significant number of editors would
rather see harmony between the style of the article and the style of the
dates, rather than one fixed choice.

That, too.  And this is another reason to add the new choice, and invert the suggested "guess" default.

> Therein lies the solution for "impossible
to see untransformed dates".

Cypsy, I don't think you grokked the argument that was being made; I think I may have explained it better here.  The really short version is: When I (logged in) go to an article on, say, a thoroughly American topic, like [[Rudoloph Wanderone, Jr.]], and see an int'l.-formatted date, as an editor I know immediately that this article has at least one inconsistency in it and needs proofreading (and being an active and compulsive editor, I will be very likely to do that proofread and cleanup).

> Further, the present preference selection menu is inadequate, and for
best chance of acceptance of this change, it should be changed [...]

I tend to agree, but reiterate cypsy's logical (if forceful) comment that *this is not what this bug report is about*; that's a new feature request.  Even some of what we're talking about here technically consists of new feature requests but can arguably be included here because without that "cleanup" the simpler changes that would resolve this bug in theory wouldn't do so in spirit or in functional practice. But this does not apply, that I can see, to these add'l. feature requests.  KISS principle, then: Only do what we have to, to resolve this bug, then open a new feature request ticket for expanded date formatting options (and warnings, and so forth).

All that said, I agree with everything else so far.  The [[:]] trick will do what MOS needs, and the HTTP headers trick will also solve the YYYY-MM-DD problem for anons.  Huzzah!
Comment 212 Stephen Turner 2008-11-06 08:22:51 UTC
(In reply to comment #211)
> 
> > The rationale
> > behind currently recommended practice to have "No preference" is to see dates
> > as anons see them. Since anons are now going to see something other than
> > YYYY-MM-DD, then having "No preference" should do the same.
> 
> Not really. Some have stated it that way, but it is a misstatement.  The
> rationale is actually to *see what is really there* so that consistency
> problems are easy to spot and resolve. Before this patch, that happened to
> equate to seeing what anons see, but the relationship is one of coincidental
> correlation, not causation.
> 

I tend to agree with the grandparent post not the parent post here. The most important reason for an editor to see the raw dates is because that's what most readers see. If most readers no longer see that, the reason is removed. Maybe there are other reasons -- such as worrying about reuses that don't do date formatting -- but they seem minor to me, and I'm far more concerned about editors who've never gone into their preferences page and chosen their favourite settings. As a geek, I usually do that straight away in a new program, but as a software author I've discovered that lots of people don't. My main worry is people who create an account, and suddenly all the dates -- which worked when they were an anon -- go wrong.
Comment 213 cypsy 2008-11-06 11:41:36 UTC
> Cypsy, I don't think you grokked the argument that was being made; 

> The really short version is: When I (logged in) go to an article on, 
> say, a thoroughly American topic, like [[Rudoloph Wanderone, Jr.]], 
> and see an int'l.-formatted date, as an editor I know immediately 
> that this article has at least one inconsistency in it and needs
> proofreading

Let me reiterate to see if I understand it right this time:

An editor-with-"no preference"-pref (i.e. no transform) visits an article with a strong US tie. That being a "US article", the raw wiki dates should all have been written as [[M d, Y]] already and will so appear in that form for editor-with-"no preference"-pref. But an inconsistency, say, a [[YYYY-MM-DD]] date, will stick out.

Is that correct? If so, does it make any difference if raw wiki dates were not already [[M d, Y]]? After all, the output -- regardless of how the raw wiki dates are written -- would still be uniform, right? 

As I understand it, the consistency/national ties issue is moot when all dates in an article are linked. A visitor to [[Rudolf Wanderone]] with intl pref (or defaulting to intl pref) will see all linked dates as intl dates, no matter how these are formatted in the raw text. 
Similarly, a visitor with US pref (or defaulting to US pref) will see all linked dates as US dates, no matter how these are formatted in the raw text. 
Since the output will always match the pref/autopref of the visitor, it wouldn't really matter anymore how dates in the raw wiki text are formatted. Right?

Secondly, normalizing the dates to an internally consistent format must be a truly onerous task. 
Surely this can be automated?
Comment 214 billclark 2008-11-06 17:19:41 UTC
(In reply to comment #213)
> Since the output will always match the pref/autopref of the visitor, it
> wouldn't really matter anymore how dates in the raw wiki text are formatted.
> Right?

It wouldn't matter for WP, but as others have pointed out, it could matter for mirror sites and for people downloading database dumps.

> Secondly, normalizing the dates to an internally consistent format must be a
> truly onerous task. 
> Surely this can be automated?

It can only be automated if the dates are marked-up.  It's significantly more difficult to identify dates that aren't surrounded by markup (mostly because of variations of date ''ranges'' that contain substrings that look like stand-alone dates) but that's not the worst part.  The worst part is that there's no way for a script to know if a non-marked-up date string is within a ''quotation'' and should thus NOT be normalized, because there is no consistent way of identifying quotations (some articles use quote marks or templates, but some just indent the text or change the size/style or make it italic.)

So yes, it's possible to automatically correct dates that are marked up with [[ ]] (or any other syntax) but any dates that have already been de-linked (or were never linked in the first place) will need to be checked manually, to make sure they're not within a quotation.
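
A small sketch of the date-range problem, in Python. Both patterns and both function names are hypothetical and far simpler than anything a real bot would use; the point is only that the marked-up case is unambiguous while the naive unmarked case grabs a substring of a range.

```python
import re

MONTHS = ("January|February|March|April|May|June|July|"
          "August|September|October|November|December")

# Marked-up dates: the [[ ]] delimiters say exactly where a date is.
MARKED = re.compile(r"\[\[(" + MONTHS + r") (\d{1,2})\]\]")
# Naive attempt at unmarked dates: any "Month D" run in the text.
UNMARKED = re.compile(r"\b(" + MONTHS + r") (\d{1,2})\b")

def normalize_marked(text):
    """Rewrite [[Month D]] as [[D Month]], keeping the markup."""
    return MARKED.sub(r"[[\2 \1]]", text)

def normalize_unmarked(text):
    """Rewrite any bare "Month D" as "D Month" (naive, see below)."""
    return UNMARKED.sub(r"\2 \1", text)
```

With the input "held January 12-15, 2008", `normalize_marked` leaves the text alone, but `normalize_unmarked` turns it into "held 12 January-15, 2008", because "January 12" looks like a stand-alone date. Neither function can tell whether the text sits inside a quotation.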

-Bill Clark 

Comment 215 Gerry Ashton 2008-11-07 13:58:50 UTC
If "no preferences" is to be reinterpreted as "format according to my browser language setting" and the preference menu is to remain untouched, this bug should be rejected. It is not acceptable to prevent signed-in users from indicating that they wish to see marked-up dates as they were entered in the article (with the sole exception of inserting a comma between the day and year in a case like [[January 1]] [[2008]], due to the historical behavior of date autoformatting).

Comment 216 S. McCandlish 2008-11-07 22:55:54 UTC
> If "no preferences" is to be reinterpreted as "format according to my browser
language setting" and the preference menu is to remain untouched

That's what I've been arguing against. Editors SHOULD see the date as it is entered, unless they elected to see a specific date format, otherwise they are not aware of article inconsistencies.

> this bug should be rejected.

No need to be hyperbolic.

> It is not acceptable to prevent signed-in users from indicating
that they wish to see marked-up dates as they were entered in the article

Right. But if we just make that the default behavior, there's no need for further alteration of the prefs menu.

If people do insist on browser lang. settings being the default, then, yes, that menu will need changing, but only in that case, unless I'm misunderstanding something.
Comment 217 cypsy 2008-11-08 07:48:49 UTC
Ok, assuming that Bill will return to old "no preferences" behavior for logged-in users, is the patch acceptable to everyone? 
Comment 218 Stephen Turner 2008-11-08 08:50:40 UTC
(In reply to comment #217)
> Ok, assuming that Bill will return to old "no preferences" behavior for
> logged-in users, is the patch acceptable to everyone? 
> 

I wouldn't veto it, even if I had such a thing as a veto! But I do still have some worries about the case I raised in comment #212:

> My
> main worry is people who create an account, and suddenly all the dates -- which
> worked when they were an anon -- go wrong.
> 

Is there anything we can do to address that?
Comment 219 cypsy 2008-11-08 09:42:20 UTC
(In reply to comment #218)
> > My
> > main worry is people who create an account, and suddenly all the dates -- which
> > worked when they were an anon -- go wrong.
> > 
> 
> Is there anything we can do to address that?

Not if the logged-in default is "no transform" (the alternative is to disable autodates for anons, which I don't think anyone considers an "option").

Since we're back on that subject ([*sigh* deeply here])...

> > Since the output will always match the pref/autopref of the visitor, it
> > wouldn't really matter anymore how dates in the raw wiki text are formatted.
> > Right?

> It wouldn't matter for WP, but as others have pointed out, it could matter for
> mirror sites and for people downloading database dumps.

OK, let's automate that then (see next point). Given (what must be) thousands of pages with inconsistent date formatting, this should be handled by bot even if the logged-in default is "no transform".

> > Secondly, normalizing the dates to an internally consistent format must be a
> > truly onerous task. Surely this can be automated?

> It can only be automated if the dates are marked-up. 

Um. Those are the only kind of dates under discussion. If they weren't marked up, they would not be subject to transformation. When they are not subject to transformation, they are also not influenced by whether the logged-in default was "no transform" or "from locale".

Comment 220 Philippe Verdy 2008-11-13 23:57:26 UTC
See Bug 16337, which points to a discussion article and proposal on Meta for a new, very versatile ParserFunction whose style will adopt the syntax of ParserFunctions and of template parameters (it may look like an extension of template parameters, except that they are modifiable and don't necessarily have to be passed on template invocation, because the block of context variables is passed implicitly).

The discussion on Meta relates to existing problems on multilingual projects like Commons and Wiktionary (or even Meta itself). It can be used to manage multiple sort keys in Wiktionary, instead of just a single default one (the alternative being to pass extra values as parameters to all subtemplates used in a page within some context of use, like the current language code and the current sort key for that language, to use during categorization of pages using utility templates like the grammatical type templates or translation templates on Wiktionary).

[[Meta:GlobalContextVariables_Extension]]

Comment 221 Le Chat 2008-11-20 10:53:27 UTC
Can I vote against this bug? There are so many useful things developers could be doing to improve the software, I don't see any reason to be wasting time on such a pointless issue. No-one needs dates to be formatted in a particular way - it's a non-problem. If text is to be presented to readers in different ways depending on some option or IP range, then start with different spellings like color/colour, or synonyms like football/soccer. However we don't do that, and we don't do metadata either - just let editors write dates as they want them to appear, linked or unlinked according to principles decided by their particular project, and everyone (once they've got over the recent changes in appearance) will be happy. (Kotniski from en.WP) 
Comment 222 Alex Z. 2008-11-25 15:43:28 UTC
Note that the way Wikimedia's caching works, anons will only be able to see one style of date links without significantly reducing the effectiveness of caching. Wikimedia uses a [[Squid cache]], which means that for anonymous users, the process to get a page from the server works something like:
1. Is the page in the cache?
  a. If so, and the cache hasn't expired, return the cached version
  b. Otherwise, continue
2. Start up MediaWiki to get the page
  a. Is the page in the parser cache? If so, return that version
  b. If not, get the text from the database and reparse

For anon users to have multiple options for date formatting, the number of cache misses would likely rise significantly, especially at first, and the size of the cache could potentially be multiplied by the number of formatting options.
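
As a rough model of that last point, in Python: the class `PageCache` and the function `render_page` are invented for this sketch and say nothing about how Squid or the parser cache are actually implemented. It only shows what happens to the number of entries and misses once the date format becomes part of the cache key.

```python
class PageCache:
    """Toy front-end cache keyed on (title, date_format)."""

    def __init__(self):
        self.store = {}
        self.misses = 0

    def get(self, title, date_format, render):
        key = (title, date_format)
        if key not in self.store:
            # Cache miss: fall through to the expensive render step.
            self.misses += 1
            self.store[key] = render(title, date_format)
        return self.store[key]

def render_page(title, date_format):
    """Stand-in for starting up MediaWiki and parsing the page."""
    return "<html>%s with %s dates</html>" % (title, date_format)
```

Four anon requests for one page with a single default format cost one miss and one stored copy. The same four requests split between "dmy" and "mdy" cost two misses and two stored copies, and so on for each additional format.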
Comment 223 billclark 2008-11-25 22:23:40 UTC
(In reply to comment #222)
> Note that the way Wikimedia's caching works, anons will only be able
> to see one style of date links without significantly reducing the
> effectiveness of caching. 

Yep.  That's pretty much a show-stopper, too.  At best we'd be able to "fool" the squid cache into treating the MDY and DMY versions of a page as two different pages (with which one to display determined by the HTTP headers sent by the browser) but that would still cut the cache efficiency in half, which is too much of a performance hit for too little gain.

I'm giving up on this bug and recommending that it be closed.

A javascript-based approach would still work, but that's well beyond the scope of this request and a new one should be opened if people want to pursue that route.

This also means that anyone who wants to keep Date Autoformatting should start reverting the work of Tony et al. with their date unlinking, since it doesn't look like a technical fix is in the works after all, and the only reason some of us have been holding off on the reverts is because we'd been hoping for a better (technical) solution.

-Bill
Comment 224 Masem 2008-11-25 22:29:43 UTC
FWIW, in reply to Bill's comment on closing the bug, I will point out a current RFC where it was asked if editors feel Date Autoformatting is desirable, see http://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style_(dates_and_numbers)/Date_Linking_RFC#Is_some_method_of_date_autoformatting_desirable.3F

The answer seems to be leaning to "no", so closing the bug and giving up on fixing it would seem to be agreeable.
Comment 225 Omegatron 2008-11-25 22:34:32 UTC
(In reply to comment #223)
> Yep.  That's pretty much a show-stopper, too.  At best we'd be able to "fool"
> the squid cache into treating the MDY and DMY versions of a page as two
> different pages (with which to display determined by the HTTP headers sent by
> the browser) but that would still cut the cache efficiency in half, which is
> too much of a performance hit for too little gain.

We've been through this already.  The default format for all unregistered users should be "28 May 1996" as per RFC 2822.  Then you only need to cache one version of the page.

But the dates should be given a class name like <span class="localizedate">, so that third-party tools can recognize from the HTML which dates should be left verbatim and which can be localized.

This offloads the formatting for anons onto JavaScript, which can be written at any later date and doesn't have anything to do with MediaWiki itself.
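
A sketch of how a third-party tool could use that class name, in Python as a stand-in for whatever script ends up doing it (the JavaScript version would follow the same idea). The class name "localizedate" is the one suggested above; the pattern and the function name `localize_to_mdy` are assumptions for illustration only.

```python
import re

# Only dates wrapped in the marker span are touched. The cached page
# carries the single default format, e.g. "28 May 1996".
MARKED_DATE = re.compile(
    r'<span class="localizedate">(\d{1,2}) ([A-Z][a-z]+) (\d{4})</span>'
)

def localize_to_mdy(html):
    """Rewrite marked "D Month YYYY" dates as "Month D, YYYY"."""
    return MARKED_DATE.sub(r"\2 \1, \3", html)
```

Given 'Born <span class="localizedate">28 May 1996</span>; the letter was dated "28 May 1996".', only the first date is rewritten, to "May 28, 1996". The date inside the quotation has no span and is left verbatim.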
Comment 226 S. McCandlish 2008-11-25 22:57:13 UTC
(In reply to comment #225)
> We've been through this already.  The default format for all unregistered users
> should be "28 May 1996" as per RFC 2822.  Then you only need to cache one
> version of the page.
> 
> But the dates should be given a class name like <span class="localizedate">... 
> ...and doesn't have anything to do with Mediawiki itself.

Exactly. This should not be a WONTFIX candidate at all!  I don't know why every time I come back to this page, the KISS principle has gone out the window and people are trying to complicate the matter. This could have been fixed with patches already posted quite some time ago.

Comment 227 Alex Z. 2008-11-25 23:03:07 UTC
So this request is to just change the default date preference formats on the English Wikipedia ($wgDefaultUserOptions), and have the parser wrap them in a span?
Comment 228 S. McCandlish 2008-11-26 03:55:08 UTC
(In reply to comment #227)
> So this request is to just change the default date preference formats on the
> English Wikipedia ($wgDefaultUserOptions), and have the parser wrap them in a
> span?
> 

No. I think you need to read up on a lot more of the discussion. The defaults are a minor side point. This bug is to make [[date]] stop autoformatting, and have [[:date]] do that instead.
Comment 229 Alex Z. 2008-11-26 06:04:37 UTC
Okay, nobody's going to do this if they have to read through 2 years worth of comments to figure out what's going on. When I first saw this, the plan seemed to be to modify the software to give some sort of date formatting for anon users.

This bug has more than 3 times as many comments as bug 57 - the bug for single-user-login!

Please, someone just summarize what the request here actually is, then there is a chance that it might get resolved sometime this century.
Comment 230 p858snake 2008-11-26 06:24:38 UTC
Why not have a special syntax for date formatting, << >>, which would be autoformatted to a certain style (ISO 8601 maybe?) for users who aren't logged in or don't have a date format set in their preferences, and otherwise formatted according to the user's preference?

For example this is how it might be written in WikiCode: "The show premiered on <<2006-02-17>> and has since produced 50 episodes," and this is how some date formats might display it:
1. The show premiered on 17/02/2006 and has since produced 50 episodes,
2. The show premiered on 17 February 2006 and has since produced 50 episodes,
3. The show premiered on 02/17/2006 and has since produced 50 episodes
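
A minimal sketch of how that could be rendered, in Python rather than MediaWiki's PHP. The << >> syntax is only the proposal above, and the function name `render_dates` and the style names are hypothetical.

```python
import re

MONTH_NAMES = ["January", "February", "March", "April", "May", "June",
               "July", "August", "September", "October", "November",
               "December"]

# Proposed markup: <<YYYY-MM-DD>>
DATE_MARK = re.compile(r"<<(\d{4})-(\d{2})-(\d{2})>>")

def render_dates(wikitext, style):
    """Replace each <<YYYY-MM-DD>> with the date in the given style."""
    def repl(match):
        year, month, day = match.group(1), match.group(2), match.group(3)
        if not 1 <= int(month) <= 12:
            return match.group(0)  # leave nonsense months untouched
        if style == "dmy_numeric":
            return "%s/%s/%s" % (day, month, year)
        if style == "dmy_long":
            return "%d %s %s" % (int(day), MONTH_NAMES[int(month) - 1], year)
        if style == "mdy_numeric":
            return "%s/%s/%s" % (month, day, year)
        # Fallback for anons or unset preferences: plain YYYY-MM-DD.
        return "%s-%s-%s" % (year, month, day)
    return DATE_MARK.sub(repl, wikitext)
```

The three styles reproduce examples 1 to 3 above, and any other style value falls back to 2006-02-17.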
Comment 231 Le Chat 2008-11-26 07:57:16 UTC
<rant>Why not just NOT DO THIS? What is the problem it's trying to solve? I'm just amazed that a colour-of-the-bikeshed discussion about date formats should grow to such grotesque proportions. Can anyone explain why this (whatever "this" currently is) is even being considered? (Or, for that matter, why the original date autoformatting function was even considered?) Developers don't have time for many of the genuinely useful feature requests or minor bugs that are being reported here all the time; why should they be wasting their time over this? </rant>
Comment 232 Stephen Turner 2008-11-26 08:41:17 UTC
The request is to separate the syntax for date autoformatting from the syntax for linking. At the moment they both use the same syntax, so it's impossible to have one without the other.

I agree that that sentence doesn't form a plan of action, and the reason this bug is so long is that people have different ideas about how to solve it. But there is a genuine user problem here.
Comment 233 Le Chat 2008-11-26 08:58:54 UTC
It's possible to have linking without autoformatting (using colons or pipes).

If the only problem is that it's not possible to have autoformatting without linking, then that should be fairly trivial to fix (just rewrite whatever code does the autoformatting so that it's triggered by some formatting feature other than links, like a CSS style). 

Is there anything else?
Comment 234 cypsy 2008-11-26 15:47:02 UTC
(In reply to comment #233)

> then that should be fairly trivial to fix (just rewrite whatever code
> does the autoformatting so that it's triggered by some formatting feature other
> than links, like a CSS style). 

just add the necessary <span class="x"> to Bill's patch.

> Is there anything else?

Nope. As for "Developers don't have time": Bill has already done the work. It's /done/. Devs just need to pick it up. Someone needs to give tstarling a wake-up call; it's his baby.

Mr Z-man's comment re squid caching for anon users is not a reason why date formatting cannot default to a human-readable format (instead of self::NONE, which -- because of "ISO" date usage -- is a disaster).
Comment 235 billclark 2008-11-26 16:49:05 UTC
(In reply to comment #234)

> Mr Z-man's comment re squid caching for anon users is not a reason why date
> formatting cannot default to a human-readable format (instead of self::NONE,
> which -- because of "ISO" date usage -- is a disaster).

No, but it IS a reason why date formats for anons can't be different based on the browser settings indicated by the Accept-Language header.  All anons would need to see the same date format, which is fine by me (just make the default the "international" format of DMY and screw any jingoist Americans who are confused or offended by that, or let them register and set a preference) but that has already been soundly rejected by the community, multiple times.

To clear a few things up, this ticket is NOT asking to allow for linked dates that bypass autoformatting, because that's already a standard feature (just use the [[:]] format of link) and it's not asking to change the meaning of [[:]] formatted dates so that they autoformat but don't link (which would reverse the current intent of linked dates and thoroughly confuse people.)

Unless people can agree that ALL anons should see the same date format (presumably DMY) or somebody provides a Javascript-based solution that will do date formatting client-side, this ticket is not able to be resolved.  It should be closed and a new one started that takes into account everything that's been discussed (and rejected) here so far, and asks for a more workable outcome.

...or if you really want to have the sysadmins (why does everybody call them "developers" when WE are just as much "the developers" as they are?) apply one of the patches already submitted, that will basically have the effect of disabling date autoformatting and autolinking (leaving [[:]] dates untouched in terms of format and link) for all users.  I guess that's better than manually unlinking dates (since it leaves the markup in place for possible future use) but not by much.

-Bill
Comment 236 Alex Z. 2008-11-26 17:00:01 UTC
(In reply to comment #234)
> (In reply to comment #233)
> 
> > then that should be fairly trivial to fix (just rewrite whatever code
> > does the autoformatting so that it's triggered by some formatting feature other
> > than links, like a CSS style). 
> 
> just add the necessary <span class="x"> to Bill's patch.
> 

Which patch? There are four. And CSS classes shouldn't trigger the parser to do things; that's just incorrect on multiple levels. CSS classes should be used for CSS styling or to help JavaScript find the right objects, and that's pretty much it. It needs to be some sort of syntax that isn't already reserved for something else (like CSS styling or normal links). That needs to be decided before anything is done.

> > Is there anything else?
> 
> Nope. As for "Developers don't have time": Bill has already done the work. Its
> /done/. Devs just need to pick it up. Someone needs to give tstarling a wake-up
> call; its his baby.

People aren't just going to grab a patch and commit it. Especially when the last patch seems to be entirely different from what's being requested in these past several comments.

> Mr Z-man's comment re squid caching for anon users is not a reason why date
> formatting cannot default to a human-readable format (instead of self::NONE,
> which -- because of "ISO" date usage -- is a disaster).
> 

I never said it was. It's the reason why there can only be one default. Defaulting to a different version depending on what the browser language is won't work for anons. The actual default can be changed, as long as there's only one default.
Comment 237 billclark 2008-11-26 17:23:01 UTC
(In reply to comment #236)
> Which patch?

To clarify what they are:

"diff -u for includes/DateFormatter.php"

This patch satisfies the ORIGINAL request for this ticket, which was to autoformat dates even when they're not marked up with link syntax.  This was a BAD IDEA and this patch should NOT be applied under any circumstances (in fact I think it makes mistakes with some date ranges, because of the difficulty in identifying dates that lack markup.)

"Simpler patch, just eliminates links"

This patch could be applied if desired, because it works the same as the current autoformatting except that dates aren't linked, just formatted.  Note that there is a vocal group on MOSNUM that is opposed to ANY date autoformatting, so this patch will probably not be acceptable to them.

"Eliminates links, leaves date format untouched"

This patch is the most likely to be applied, because it essentially nullifies date autoformatting.  Dates that are marked up with regular link syntax ([[]] but NOT [[:]]) are left untouched -- no formatting change, no link.  This patch does automatically what people like Tony, Lightmouse, et al. are doing manually.  Applying this patch would be the same as manually unlinking every [[]] linked date on WP, except that it would leave the markup in place for possible future use.

"unlinks dates, preserves markup, uses Accept-Language for defaults (or DMY)"

This patch should not be applied, because the squid cache makes it unworkable.

-Bill
Comment 238 Philippe Verdy 2008-11-26 20:05:48 UTC
I strongly support the removal of the request. Autoformatting dates (or numbers, currency amounts, percentages, zip codes and phone numbers if there are some...) according to user preferences at the server level is a very bad idea.

Autolinking dates is a separate issue that is not to be done in MediaWiki but based on community decision on each Wiki project, and implemented in utility templates if really needed, and used according to local usage policies.

Removing autolinked dates could however be performed on the server, as it will not be dependent on user preferences. However, this should be done by a PHP configuration hook to enforce a local policy decision. Existing templates trying to bypass this limitation would then stop working, or a complex utility template would have to be designed to tweak and complicate the syntax so much that the PHP hook would not detect it; at the same time, the usage of such a utility template could be monitored.

My view about it is just that the default dates generated for example in user signatures with tildes, should JUST be surrounded by something indicating this is a date and giving its effective numeric value in abbreviated ISO format (in Zulu timezone). This can be done just by generating (example on French Wikis):

<span class="date" id="date-20081126T184500">mercredi 26 novembre 2008, 19:45:00 CET</span>

instead of just the date and time formatted with the SINGLE DEFAULT format that has been set according to the local Wiki localisation settings (and not according to user preferences). This will be useful, for example, when someone who can't read Arabic or Farsi goes there to discuss something and has to reply in some local discussions: even if the discussion is held in English, the default signature will appear formatted in a way that is impossible to read for most non-readers of Arabic. A gadget selection and user preference settings should help those users see the dates in a way they can read, formatted with times accurate for their local timezone (instead of the timezone of the server).

With this type of code, logged-in users could still use a gadget to transform those dates and interpret the signature dates (notably in discussion pages) according to the locale currently selected in their preferences (the transform would NOT be performed by the server but locally by the user's browser, using the JavaScript loaded from the gadget). Nothing should be done for dates inserted in articles.
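
A rough sketch of the idea, written in Python only for illustration (a real gadget would be JavaScript in the browser). The function names are hypothetical, and the id scheme (date-YYYYMMDDTHHMMSS, in UTC) is assumed from the span example earlier in this comment; nothing here is existing MediaWiki code.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed id scheme from the span example: date-YYYYMMDDTHHMMSS, always in UTC.
SPAN_RE = re.compile(r'<span class="date" id="date-(\d{8}T\d{6})">(.*?)</span>')


def wrap_signature_date(moment, default_text):
    """Server side: keep the wiki's single default text, add a UTC stamp."""
    stamp = moment.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%S")
    return f'<span class="date" id="date-{stamp}">{default_text}</span>'


def localize_dates(html, utc_offset_hours):
    """Client side (stand-in for the gadget): re-render each stamped date."""
    tz = timezone(timedelta(hours=utc_offset_hours))

    def render(match):
        stamp = match.group(1)
        utc = datetime.strptime(stamp, "%Y%m%dT%H%M%S").replace(tzinfo=timezone.utc)
        local_text = utc.astimezone(tz).strftime("%Y-%m-%d %H:%M")
        # The stamp is kept, so the original instant is never lost.
        return f'<span class="date" id="date-{stamp}">{local_text}</span>'

    return SPAN_RE.sub(render, html)
```

The server output is identical for every reader, so it stays cacheable; only the second step depends on who is reading.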

This would just concern the way the MediaWiki parser transforms four-tildes and five-tildes into formatted current dates...

Nothing is needed for all other dates, which articles will specify directly using a single locale setting. Users editing articles to insert dates manually, or designing templates that compute and display dates on the Wiki, should use only a format suitable for the localization of the current wiki project (but if they want, they may insert the same markup as the one used by MediaWiki for the substitution of tildes). Note that this won't work completely for such dates (with the markup above) occurring in section titles: only the section title will be converted by the JavaScript gadget running in the user's browser, but not the formatted date visible in the TOC. I don't think this is a problem.

Comment 239 Philippe Verdy 2008-11-26 20:12:28 UTC
Also I don't support the unneeded special wiki syntax for formatting dates. This is not needed at all!

Writing <<2008-11-26>> is just a shortcut for generating a single format that you can also generate using existing templates like {{DateFormat|2008|11|26}}. It is as readable as the proposed syntax, and not surprising for editors if the template is given a meaningful name... And what the template generates, and the kind of additional HTML markup it can provide to help a user gadget interpret and reformat the date in the user's browser, is fully under the control of the template, without any addition to the Wiki syntax.
Comment 240 billclark 2008-11-26 20:21:17 UTC
(In reply to comment #239)
> Also I don't support the unneeded special wiki syntax for formatting dates.
> This is not needed at all !

Using an alternate markup syntax was suggested when we were discussing server-side date processing to be done in DateFormatter.php, but that's not a viable approach any longer.

So you're correct that the currently proposed solution (using Javascript to do client-side date formatting) can make do with a template.  That's very different than what was being discussed before, and in the server-side approach, a new syntax would have been better than using a template.  That's why it was being proposed, although it's no longer under consideration.

-Bill
Comment 241 Stephen Turner 2008-11-26 22:14:36 UTC
(In reply to comment #238)
> I strongly support the removal of the request. Autoformatting dates (or
> numbers, currency amounts, percentages, zip codes and phone numbers if there
> are some...) according to user preferences at the server level is a very bad
> idea.
> 

I have a lot of sympathy for this point of view. However, we already have date autoformatting. It can be argued that that was a mistake. The particular implementation was certainly a mistake (IMO). So the question now is how to make it suck less, especially given the articles that already use it.
Comment 242 Philippe Verdy 2008-11-26 23:00:01 UTC
That's simple: if you implemented user-specific date formatting, keep the syntax you have adopted for Wiki, but make it generate a generic format that will now be handled with JavaScript (and, well, ignore anonymous users who have no preferences in a user account...).

Anyway, I have made some other suggestions on MediaWiki to help reduce the load on the Wikimedia Squid caches and PHP servers. Basically, it's time to think about creating a locally deployable proxy version of the MediaWiki software, bundling the MediaWiki PHP code, the PHP engine and the web server, and also working as the proxy that a local browser (or a new dedicated MediaWiki browser with extra editing tools) could connect to instead of connecting directly to the online service. I suggested that this type of deployment could also be part of a CD/DVD distribution and would eliminate the need to preview before sending edits.

I also spoke about the possibility of the locally deployed proxy having its own local cache, communicating with the online server only with pages in raw format (much less load on the Wikimedia PHP servers, because there is much less work to do in PHP for formatting pages: just managing the history, controlling access rights, and informing connected users about incoming messages).

The locally deployed proxy would be able to load pages from a locally installed database (from a dump stored on a CD/DVD or downloaded onto a hard disk) and filter out the edits made by the local user(s), who could manage a queue of updates to send. In a private, corporate or school environment, it would be possible to support supervision of pages sent or edited from the local network through the proxy, which would store the list of locally edited pages, if local users are not directly granted the right to commit their pages online without permission or basic monitoring and history.

I also spoke about the fact that this would allow Wikipedia and similar collaborative projects to better control the level of vandalism made mostly anonymously from school/university networks. There would be no more need to ban the entire school or university if it has firewalled the Wikimedia servers but routed them through a MediaWiki proxy managing the supervision: the school or corporate network would be able to submit updates through public online accounts managed by supervisors who are contactable and can act locally on vandalism or breach of copyright by one of their supervised local users (who may be connected to the local proxy using a strong and verifiable identity that does not need to be transmitted online, so it can also improve their online privacy).

Finally, the whole set, a dedicated MediaWiki browser application and a local proxy, could become the preferred way to work with Wikipedia: everything would be integrated, including the management of the submission queue and local supervision of the queue by the users themselves, who would manage the priority of their queues. As it would embed a local cache, it would generate much less traffic with the online server.

Wikimedia's Squid proxies would still be used, but only to cache the much smaller raw pages instead of the large HTML pages. Wikimedia's own MediaWiki servers would also see much less traffic and could satisfy more users with a limited local cache of pre-generated HTML pages (without the user-specific additions like messages), because all locally deployed proxies would only communicate with them in raw format.

So that's a way to control, for the longer term, the explosion of traffic and of the server costs to manage this traffic. You would not necessarily need the same explosion in the number of Squid servers needed to cache the full HTML pages. They would still be needed to cache the images because, by default, the image thumbnails should be generated and cached centrally to avoid always sending them in full resolution to every user of a proxy version of MediaWiki.

Well, all this goes too far away from the current problem of date formatting. I realize that the fact you have implemented it is causing problems with more traffic generated, but it should be easily reverted before its use becomes widespread, or online performance will suffer. I suggest the simple solution of generating a single HTML code for your existing syntax, and then letting a JavaScript gadget perform the actual formatting. Your Squid servers will appreciate it!


Comment 243 S. McCandlish 2008-11-27 01:31:34 UTC
Re:
> No, but it IS a reason why date formats for anons can't be different based on
> the browser settings indicated by the Accept-Language header.  All anons would
> need to see the same date format, which is fine by me (just make the default
> the "international" format of DMY and screw any jingoist Americans who are
> confused or offended by that, or let them register and set a preference) but
> that has already been soundly rejected by the community, multiple times.

I'm not so sure.  I'm unaware of anyone actually involved in the discussion, here or at w:en:WP:MOSNUM, upset by the idea any longer.  It clearly would have the benefit of consistency over the current practice, which is willy-nilly and confusing even to editors, much less readers.  PS: I am an American.  Non-Yank date formatting doesn't bother me at all, and anyone upset to apoplexy about it needs to find a new hobby or something.  ANY change to the way MW works generally, and more specifically how it works at WP, will ruffle a few people's feathers, but it chills out quickly and life goes on.
Comment 244 cypsy 2008-11-27 21:46:47 UTC
(In reply to comment #237)
> (In reply to comment #236)
> > Which patch?

#1 is out of the question, as Bill points out.
#2, #3 and #4 all prevent linked dates from being links. So they all fix the "seas of blue" problem.

#4 also fixes the "ISO cruft for anons/noprefset" problem. The objection that "the squid cache makes [#4] unworkable" is not the whole story. Squid caching only applies to anons, so #4 still works fine for the noprefset registered users. And even *if* the patch were not modified to deal with anons some other way (e.g. comment #243), then anons would get to see whatever it was that the last anon who caused a fetch from MW saw. Which A) is still a whole sight better than seeing ISO cruft, and B) is a pretty good weighted-random choice of which dateformat to display for anons.
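
To make the "last anon who caused a fetch" behaviour concrete, here is a toy model in Python. It assumes the front-end cache is keyed by URL only, and that patch #4 would map en-US to MDY and everything else to DMY; both are assumptions made for the sake of illustration, not a description of the actual Squid setup or of the patch's exact rules.

```python
# Toy front-end cache keyed by URL only (assumption, see above).
cache = {}


def render(url, accept_language):
    """Stand-in for patch #4: pick the default date order from Accept-Language."""
    order = "MDY" if accept_language.lower().startswith("en-us") else "DMY"
    return f"{url} rendered with {order} dates"


def fetch_as_anon(url, accept_language):
    """Anonymous request: served from the cache when possible, rendered otherwise."""
    if url not in cache:
        cache[url] = render(url, accept_language)
    return cache[url]


first = fetch_as_anon("/wiki/Foo", "en-US")   # cache miss, rendered as MDY
second = fetch_as_anon("/wiki/Foo", "en-GB")  # cache hit, still MDY
```

The second reader gets whatever the first reader's browser asked for, which is the effect described above: not ISO cruft, but not their own format either.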

The bottom line is that *something* needs to be done at DateFormatter too, otherwise the "seas of blue" will continue to exist, and people will continue to go around removing date links, which is inherently destructive. Although there is a JavaScript solution available, it's not a viable solution unless DateFormatter adds the appropriate markup /and/ the admins make the JavaScript site-wide, which is unlikely to happen because -- as we see from the interminable "discussions" elsewhere -- common sense is in short supply. 

Objecting is all very well and good, but #3 and #4 are a GoodThing, and if there are problems with them, then these can be fixed or compensated for. Squid et al. are not the show-stoppers that they are being treated as.

For those joining the discussion late (e.g. Mr. Z Man and Philippe Verdy): the title of this bug only tells half the story. This bug is in effect the search for a technical solution to the perennial discussions at en:wp, which are provoked by A) the eyesore caused by the linked dates also appearing as links; B) the fact that only very few dates actually need to be links; C) the inconsistencies that appear in an article if not every date is linked; D) the cruft that anons and editors without a datepref get to see, E) the (preservation of) meta information present in [[date]]s but not in plain text. 

Patch #3 addresses issues A, B, C, E. Patch #4: A, B, C, D, E.

(comment #236)
> People aren't just going to grab a patch and commit it. Especially when the
> last patch seems to be entirely different from what's being requested in these
> past several comments.

This comment borders on stupidity. All the issues noted above are already mentioned in the top 10 comments on this page, which are from Jan-March *2006*. That these issues have not been addressed (and in fact the reason why this bug was once closed for equally ignorant and stupid reasons) is due solely to dev apathy and their inability to read (evident again in a recent email). One need only follow Tony Souter's (en:User:Tony1) comments on this page -- from engaged and supportive (comment #14) to disillusioned (comment #158) -- to recognize how destructive dev negligence has been. 

Comment 245 Alex Z. 2008-11-28 04:43:19 UTC
(In reply to comment #244)
> (In reply to comment #237)
> > (In reply to comment #236)
> > > Which patch?
> 
> #1 is out of the question, as Bill points out.
> #2, #3 and #4 all prevent linked dates from being links. So they all fix the
> "seas of blue" problem.
> 
> #4 also fixes the "ISO cruft for anons/noprefset" problem. The objection that
> "the squid cache makes [#4] unworkable" is not the whole story. Squid caching
> only applies to anons, so #4 still works fine for the noprefset registered
> users. And even *if* the patch were not modified to deal with anons some other
> way (eg comment #243), then anons would get to see whatever it was that the
> last anon who caused a fetch from MW saw. Which A) is still a whole sight
> better than seeing ISO cruft, B) a pretty good weighted-random choice of which
> dateformat to display for anons.

As I said, the default can be changed rather easily. It can even be changed for en.wikipedia without a change to the software. But having anons see either of 2 date formats depending on the location of the last person who caused a cache update, is just ugly.

> The bottom line is that *something* needs to be done at DateFormatter too,
> otherwise the "seas of blue" will continue to exist, and people will continue
> to go around removing date links, which is inherently destructive. Although
> there is a Javascript solution available, its not a viable solution unless
> DateFormatter adds the appropriate markup /and/ the admins make the Javascript
> site-wide, which is unlikely to happen because -- as we see from the
> interminable "discussions" elsewhere -- common-sense is in short supply. 

I really don't like the idea of having normal wikilink syntax do something, but not actually make a link. It adds even more complexities to wikitext and inconsistencies in parser output. This would probably need approval from Tim/Brion and would probably need a config option to enable it. 

> Objecting all very well and good, but #3 and #4 are a GoodThing, and if there
> are problems with them, then these can be fixed or compensated for. Squid et al
> are not the show-stoppers that they being treated as.

4 will not work correctly with squid caching. I'm not going to introduce code into MediaWiki that I know will be partially broken for Wikimedia.

> For those joining the discussion late (e.g. Mr. Z Man and Philippe Verdy): the
> title of this bug only tells half the story. This bug is in effect the search
> for a technical solution to the perennial discussions at en:wp, which are
> provoked by A) the eyesore caused by the linked dates also appearing as links;
> B) the fact that only very few dates actually need to be links; C) the
> inconsistencies that appear in an article if not every date is linked; D) the
> cruft that anons and editors without a datepref get to see, E) the
> (preservation of) meta information present in [[date]]s but not in plain text. 
> 
> Patch #3 addresses issues A, B, C, E. Patch #4: A, B, C, D, E.

A is basically a matter of personal opinion. 
B is a local style issue. Just because the English Wikipedia decided it doesn't think dates should be linked anymore, doesn't mean we should force that on every user of MediaWiki.
As for C, the patches that simply remove the link wouldn't really address it, since the autoformatting would still happen.
Changing the default preference will also mostly fix D (it won't affect existing accounts) without pointless inconsistencies for anon users.

> (comment #236)
> > People aren't just going to grab a patch and commit it. Especially when the
> > last patch seems to be entirely different from what's being requested in these
> > past several comments.
> 
> This comment borders on stupidity. All the issues noted above are already
> mentioned in the top 10 comments on this page, which are from Jan-March *2006*.
> That these issues have not been addressed (and in fact the reason why this bug
> was once closed for equally ignorant and stupid reasons) is due solely to dev
> apathy and their inability to read (evident again in a recent email). One need
> only follow Tony Souter's (en:User:Tony1) comments on this page -- from engaged
> and supportive (comment #14) to disillusioned (comment #158) -- to recognize
> how destructive dev negligence has been. 
> 

Insulting the people trying to help isn't really going to get them to help. There are hundreds of other open bugs I can work on where I won't be called stupid, ignorant, apathetic, and negligent when I try to figure out WHAT PEOPLE ACTUALLY WANT NOW. Not what people wanted 2 years ago. Not what people wanted last month. Not things that have been established not to work. This is basically what I've seen in the few days I've been commenting here:

*We want different formatting depending on browser language
**That won't work
*Well we didn't want that anyway. Just a change to the default and create an easy way to change them with JS
**Okay, that sounds easy enough
*No, we want [[date]] to not autoformat and have [[:date]] do it instead
**Okay
*No, we want a completely new syntax for date links
**That'll be a little harder
*No, we want to wrap dates in a span if we want them to be autoformatted
**That's just ugly
*Well what we actually want is for [[date]] to not create a link
**And that's where we're basically at now.

The actual request seems to vary significantly depending on who's making the comment.
Comment 246 S. McCandlish 2008-11-28 06:29:51 UTC
> Just because the English Wikipedia decided it doesn't
> think dates should be linked anymore, doesn't mean we should force that on
> every user of MediaWiki.

It's really the opposite, though.  Someone, back when, decided it would be a good idea to overload link syntax with date autoformatting, and this was pushed on users of WP and every other MW installation. Today, virtually everyone who has thought about it agrees that it was a bad idea.

> Insulting the people trying to help isn't really going to get them to help.

I absolutely agree, and want to point out that not everyone involved here is being negative. I appreciate the time and analysis you've put into this, and fully understand that you cannot just go implement something without understanding its ramifications, nor implement "something A" when "something B" and "something C" have also been proposed.

> I really don't like the idea of having normal wikilink syntax do something, but
> not actually make a link.

Strikes me as a non-issue, since [[image]] and [[category]] already do non-linky things.  I think it is more accurate to think of [[something]] as "take an action" code.  That "action" is most commonly creating a wikilink, but for a very long time has obviously also had alternative meanings depending upon context. This wouldn't be any different, and having [[date]] not create a link but do autoformatting instead, and [[:date]] create a link, is actually more, not less, consistent with this [[something]] model.  In very, very few cases is there any need for a date to actually be linked; probably far fewer cases than there are cases (e.g. on "Wikipedia:"-namespace pages where images and categories are being discussed) of categories and images needing to be wikilinked to instead of being inlined or being added as a category to a page, respectively.
Comment 247 Radon210 2008-11-30 03:42:48 UTC
How about this setup:
*All dates with [[]] will be autoformatted but not linked
*All dates with [[:]] will be autoformatted and linked
*Autoformatting is set to a default global configuration option so that the display of dates is consistent
*If it's desirable, a magic word can be set up to change the default on a page by page basis (I would put this at the lowest priority)
That's my 2 cents here.  The first 3 should solve many of the outstanding problems with date formatting, and the 4th would let certain pages that have a good reason be formatted differently from the default. Obviously 3 and 4 would not override user preferences.
Comment 248 Alex Z. 2008-11-30 03:52:38 UTC
(In reply to comment #247)
> *All dates with [[:]] will be autoformatted and linked

Isn't that sort of the opposite of current behavior? I thought currently [[:date]] is used for linking without autoformatting. If we change it so that it does autoformatting it will potentially break existing uses, or at least be confusing.
Comment 249 Philippe Verdy 2008-11-30 10:38:31 UTC
Remove ALL this autoformatting mess from the [[]] syntax.

For me, [[2000-12-31]] ONLY creates a link to an article named "2000-12-31" (with no attempt to interpret it as a date, and NO attempt to reformat it, as that breaks data tables where such values are NOT dates).

For me [[:2000-12-31]] is ALSO creating a link to the same article name, but explicitly in the default namespace (and here also, it MUST not attempt to reformat it).

Those that want autoformatting (according to the default locale set on the server for the WHOLE Wiki project only, but NOT according to user preferences) should have their own syntax, but if needed they can still use a template to do that if there's no syntax support, like {{Date|2000|12|31}}.

If a syntax like <<2000-01-01>> is supported for autoformatting, it should NOT create any link: links should have to be specified by users themselves like this: [[2000-12-31|<<2000-12-31>>]], or using a template that performs both, like {{DateLink|2000|12|31}}, which can be defined as

[[{{{prefix|}}}{{{1}}}-{{{2}}}-{{{3}}}{{{suffix|}}}|{{{prefix|}}}<!--
--><span class="formated date" id="date-ymd-{{{1}}}-{{{2}}}-{{{3}}}"><!--
--><<{{{1}}}-{{{2}}}-{{{3}}}>><!--
--></span><!--
-->{{{suffix|}}}]]

Such example template above performs three things:
- formatting the date for the DEFAULT display according to server-side Wiki project defaults (3rd line);
- surrounding the DEFAULT display format by enough HTML to allow user-level preferences on the CLIENT side (via a JavaScript gadget) (2nd and 4th lines);
- linking to an article (whose prefix and suffix part of the name can be given, see 1st and 5th line), but whose format used in the target article name is predictable and will not change even if the server's default format changes).


Now, don't automatically implement or remove date linking for the [[ymd]] or [[:ymd]] syntax!
It is not the role of MediaWiki to take such an uninformed decision; it is the role of the community to decide, and only as a general rule, where there may be exceptions for some famous dates, or within some namespaces or categories related to content management and maintenance.

Comment 250 Radon210 2008-11-30 14:35:13 UTC
@MrZman:
Yeah that would pretty much work correctly because autoformatted links can be created via [[[[:]]]]
@Philippe:
This is not just about the English Wikipedia, many other websites use Mediawiki and this feature may be useful to them
Comment 251 Philippe Verdy 2008-11-30 16:16:08 UTC
I did not speak specifically about English Wikipedia (in fact I did not know, or had not noticed, that you had implemented autoformatting there). I hope it was not also deployed on French Wikipedia, because I doubt that this "autoformatting" will work correctly there and that you really know how French dates are formatted!

For me, autoformatting dates is silly and this has to be at least fully disabled as needed, i.e. everywhere its "guess" (about what is a date or not or if this formatting is desired) is completely wrong, because the syntax it recognizes is completely equivalent to the syntax used for normal links. Then if you need a special keyword in the page to disable this guess and so to disable autoformatting in links, you've done a wrong job: in fact the syntax chosen for autoformatting was badly chosen.

Suggestion for the choice of the syntax (that DOES NOT create a link): {{#2000-12-31}}. No conflict there because it uses the prefix for parser functions, and no parser function can have a name starting with a digit... But I wonder why you absolutely want a short syntax for dates specifically, but not for other features where a shorter syntax would be much more useful.

The need for autoformatting would mean that the writer of an article does not know how to format a date in the SINGLE main language of the target Wiki (or page or section, if a page can be marked specifically as using another target language, something that is possible with the "xml:lang=" attribute name, which is also already recognized by MediaWiki as the "lang=" attribute name of HTML elements). If that writer does not know that, what is the meaning of the rest of the page, besides this isolated date? Can't the editor directly format the date himself (including in templates that would like to format computed dates)?

On the same front, trying to remove links generated by the [[]] syntax should not depend on the presence or not of the leading ":" because it is also part of the syntax to separate the root (anonymous) namespace space before the article name.

Yeah! Two hacks introduced recently in English Wikipedia (and badly documented), two errors, and now you're thinking about compatibility elsewhere? No. You should revert this "support", as it is completely UNNEEDED and generates more problems than it is supposed to solve.

Comment 252 Alex Z. 2008-11-30 22:21:02 UTC
(In reply to comment #250)
> @MrZman:
> Yeah that would pretty much work correctly because autoformatted links can be
> created via [[[[:]]]]

What? That doesn't make any sense at all. Ideally, if anything is done as a result of this bug, partially reversing current behavior won't be part of it. AFAIK, [[:date]] is currently used to link a date *without formatting it.* If all of a sudden they start autoformatting, it's going to break every existing use.

> I did not speak specifically about English Wikipedia (in fact I did not know
> (or noticed) that you had implemented autoformatting there. I hope it was not
> deployed too on French Wikipedia because I doubt that this "autoformatting"
> will work correctly there and that you really know how French dates are
> formatted !

Date autoformatting in MediaWiki has been around since 2003. There are 5 different options and the default is no autoformatting.
 
> Suggestion for the choice of the syntax (that DOES NOT create a link):
> {{#2000-12-31}}. No conflict there because it uses the prefix for parser
> functions, and no parser function can have a name starting by a digit... But I
> wonder why you absolutely want a short syntax for dates specifically but not
> for other features that may want a shorter syntax that would be much more
> useful.

I agree a parser function would be better than creating a whole new syntax, or breaking existing syntax to do it.

> The need for autoformatting would mean that the writer of an article does not
> know how to format a date in the SINGLE main language of the target Wiki (or
> page or section, if a page can be marked specifically as using another target
> language, something that is possible with the "xml:lang=" attribute name, which
> is also already recognized by MediaWiki as the "lang=" atttribute name of HTML
> elements). If that writer does not know that, what is the meaning of the rest
> of the page, beside this isolate date? Can't the editor directly format the
> date himself (including in templates that would like to format computed dates)?

No, it means people may not want the normal style of dates used on the wiki. Using the English Wikipedia as the obvious example, British users may want the "day month year" format, American users the "month day, year" format. Others may want the ISO 8601 format.
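
As a minimal illustration of what a per-user preference changes, here is a Python sketch. The preference keys ("dmy", "mdy", "iso") and the function name are made up for this example and are not the option names MediaWiki actually uses.

```python
from datetime import date

MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_date(d, preference):
    """Render one date according to a (hypothetical) user preference key."""
    month = MONTHS[d.month - 1]
    if preference == "dmy":
        return f"{d.day} {month} {d.year}"
    if preference == "mdy":
        return f"{month} {d.day}, {d.year}"
    if preference == "iso":
        return d.isoformat()
    raise ValueError(f"unknown preference: {preference}")


d = date(2000, 12, 31)
print(format_date(d, "dmy"))  # 31 December 2000
print(format_date(d, "mdy"))  # December 31, 2000
print(format_date(d, "iso"))  # 2000-12-31
```

The same stored date yields three different strings, which is why the output can't be shared between readers with different preferences.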

> Yeah! Two hacks introduced recently in English Wikipedia (and badly
> documented), two errors, and now you're thinking about compatiblity elsewhere ?
> No. you shoulmd revert this "support", as it is completely UNNEEDED and
> generates more problems than what it is supposed to solve.
> 

What? As I said, date autoformatting has been around for 5 years.
Comment 253 Minh Nguyễn 2008-11-30 22:27:08 UTC
(In reply to comment #252)
> Date autoformatting in MediaWiki has been around since 2003. There are 5
> different options and the default is no autoformatting.

If I'm not mistaken, the autoformatting is only supported for the English localization. In other locales, the date format option in [[Special:Preferences]] only affects MediaWiki's interface, not the wiki's content.
Comment 254 Radon210 2008-11-30 23:32:57 UTC
@MrZman that was the point: [[:]] only creates a link, and if it's wrapped in [[]] (the link syntax, that is) it's both linked and autoformatted.
Comment 255 S. McCandlish 2008-12-01 00:08:50 UTC
> Yeah! Two hacks introduced recently in English Wikipedia (and badly
> documented), two errors, and now you're thinking about compatiblity elsewhere ?
> No. you shoulmd revert this "support", as it is completely UNNEEDED and
> generates more problems than what it is supposed to solve.

Exactly. Calling date linking+autoformatting as currently implemented a "feature" is disingenuous.  The ranty earlier message about it being oh-so-important to preserve all of this and introduce some new <<date>> syntax (or {{#date}}, I think someone else suggested along the same lines) seems to me to be missing the point and unnecessarily complicating the issue.
Comment 256 Alex Z. 2008-12-01 20:52:23 UTC
We don't really seem to be getting anywhere here; I think at the least we need to summarize the options. Here's a list of the ones I've noted; some can be combined with others:

1. Remove date autoformatting entirely (rather than removing the option entirely from code, this could most easily be accomplished by setting $wgUseDynamicDates to false for enwiki)
2. Keep the current syntax, remove the links: [[date]] would continue to autoformat, but it wouldn't create a link, [[:date]] would continue to work as it currently does
3. Remove autoformatting and linking, "[[date]]" would be parsed as "date" - no link, no formatting
4. Create a new syntax for date autoformatting (a parser function, some new syntax, etc), [[date]] would work the same as any other normal link, [[:date]] would continue to work as it currently does
5. Wrap dates in a <span> for client-side customization with JavaScript
6. Change the default date format preference from "no preference" to something more sensible

Option 1 would obviously be the simplest, though a combination of options 4, 5, and 6 would also remove most of the problems I think. I really don't care for option 2, and option 3 is pretty terrible IMO.
Comment 257 S. McCandlish 2008-12-01 22:25:06 UTC
That list seems to be missing the main one proposed, for which a patch has already been written:

7. [[date]] would autoformat but NOT link, just as [[image]] and [[category]] do something special and don't make a link, while [[:date]] would link but not autoformat (just as [[:image]] and [[:category]] create links but don't do anything special).  If someone wants to link AND autoformat, which frankly would be kind of weird, they can use template code, examples of which have already been posted.

I think it is important to keep in mind that the main impetus behind this bug report is the "sea of blue" problem.  I think few of us would cry if the autoformatting just plain stopped, but I don't think many have been actually asking for that, and resolving the sea-of-blue problem doesn't require that, nor require new features, like a new <<date>> or {{#date}} or {{DATE:date}} syntax.

Imagining some wiki where linking to actual articles on every conceivable date would make sense AND where autoformatting is important (I cannot think of any practical application of this, but let's pretend), this can just be handled with templates.  Enwiki is awash in useless, reader-distracting date links (I want to point out that the frequent kvetching against Tony1 about that here is a red herring; many, many users have been removing such date links), and as the most-used wiki in the world, by several orders of magnitude, not to mention the Wikimedia flagship and thus the source of funding for MediaWiki to even have this site to argue about bugs, I don't feel we can just throw up our hands and not resolve this, in an enwiki-favorable manner.

My vote is to just do #7, and keep this as simple as possible, so that it actually proceeds instead of sitting here as a languishing bone of contention for another 2 years.
Comment 258 Radon210 2008-12-01 22:42:59 UTC
I like a combination of 2 and 6, but 4 can be used in place of 2.  It would probably solve most outstanding problems and I'd say 6 is a must do
Comment 259 Alex Z. 2008-12-01 23:52:39 UTC
(In reply to comment #257)
> That list seems to be missing the main one proposed, for which a patch has
> already been written:
> 
> 7. [[date]] would autoformat but NOT link, just as [[image]] and [[category]]
> do something special and don't make a link, while [[:date]] would link but not
> autoformat (just as [[:image]] and [[:category]] create links but don't do
> anything special).  If someone wants to link AND autoformat, which frankly
> would be kind of weird, they can use template code, examples of which have
> already been posted.

That would be #2, "Keep the current syntax, remove the links: [[date]] would continue to autoformat, but it wouldn't create a link"

Comment 260 Voyagerfan5761 / dgw 2008-12-02 00:30:37 UTC
(In reply to comment #259)
> (In reply to comment #257)
> > That list seems to be missing the main one proposed, for which a patch has
> > already been written:
> > 
> > 7. [[date]] would autoformat but NOT link, just as [[image]] and [[category]]
> > do something special and don't make a link, while [[:date]] would link but not
> > autoformat (just as [[:image]] and [[:category]] create links but don't do
> > anything special).  If someone wants to link AND autoformat, which frankly
> > would be kind of weird, they can use template code, examples of which have
> > already been posted.
> 
> That would be #2, "Keep the current syntax, remove the links: [[date]] would
> continue to autoformat, but it wouldn't create a link"
> 

So if #7 is the same as #2, and #7 already has a patch written for it, #2 already has a patch written. Since #(7|2) is also "the main one proposed", is there any reason the patch shouldn't be applied to allow this bug to be finally closed for good?
Comment 261 Alex Z. 2008-12-02 00:48:09 UTC
(In reply to comment #260)
> (In reply to comment #259)
> > (In reply to comment #257)
> > > That list seems to be missing the main one proposed, for which a patch has
> > > already been written:
> > > 
> > > 7. [[date]] would autoformat but NOT link, just as [[image]] and [[category]]
> > > do something special and don't make a link, while [[:date]] would link but not
> > > autoformat (just as [[:image]] and [[:category]] create links but don't do
> > > anything special).  If someone wants to link AND autoformat, which frankly
> > > would be kind of weird, they can use template code, examples of which have
> > > already been posted.
> > 
> > That would be #2, "Keep the current syntax, remove the links: [[date]] would
> > continue to autoformat, but it wouldn't create a link"
> > 
> 
> So if #7 is the same as #2, and #7 already has a patch written for it, #2
> already has a patch written. Since #(7|2) is also "the main one proposed", is
> there any reason the patch shouldn't be applied to allow this bug to be finally
> closed for good?
> 

Because A) it doesn't solve all the problems (crappy default being the big one). B) It's not clear if that's what people actually want, hence my summary of the multitude of options. C) Many of the options are easy; Option 1, for example, doesn't need a patch at all, it's a matter of removing 1 line from the Wikimedia config files. Just because option 2 happens to have had code written for it first doesn't mean it's the best option. D) I'm not entirely convinced it's a good idea to use link syntax for a non-link. Category and Image links don't create normal links, but they do create links. E) I haven't tested it yet.
Comment 262 Gerry Ashton 2008-12-02 03:34:13 UTC
VoyagerFan5761 asks why not do #7. Discussion on the English Wikipedia "Manual of Style (dates and numbers)" seems to be building a consensus to not do autoformatting, especially if non-logged-in readers do not benefit. If eventually autoformatting is no longer done, and gradually all autoformatting markup is removed, then someone who wants to link to an article about a date for some reason will have to use the odd syntax [[:February 30]], and since there will be few examples of autoformatting markup around, the reason for this odd syntax will not be apparent. This does not compare to the unusual syntax for images, because the current form of image linking does not seem to be going away.
Comment 263 S. McCandlish 2008-12-02 10:51:01 UTC
Not to be overly argumentative, but re: #2/#7:

2 and 7 weren't identical, since 2 did not address [[:date]], only [[date]], but whatever.

> it doesn't solve all the problems (crappy default being the big one)

Refresh my memory; what's the crappy default?  If it's that crappy, then just fix that, too, and we're done, right? I've seen [D]D Month YYYY suggested as the proper default, with little if any objection, including from Americans (like myself); we may not be globally popular folks right now, but we aren't retarded, and understand that format just fine.

> B) it's not clear if that's what people actually want, hence my summary of the
> multitude of options.

From my reading, it's the most-sought solution, and it would comport with MOSNUM's consensus, which took years to arrive at, meanwhile I don't think I've seen a credible, non-hypothetical reason not to go that direction, only "what-ifs" like "maybe there's a wiki somewhere where having all dates linked and autoformatted is useful".

> C) Many of the options are easy, Option 1 for example
> doesn't need a patch at all, it's a matter of removing 1 line from the Wikimedia
> config files. Just because option 2 happens to have had code written for it
> first doesn't mean it's the best option.

The opposite is also true. I never suggested that the other options are impossible, only that we are in a good position to implement the one that seems to have the most favor, and have been for some time.

> D) I'm not entirely convinced it's a good idea to use link syntax for a non-link.

Moot point, for all practical purposes, given both images and categories, which use "link syntax" (better thought of as "do something" syntax) to show a picture and put an article in a category, respectively.  #2/#7 is just "do something, different in this case". (More accurately, it's "stop doing two things - linking and autoformatting - the latter of which is something different, and only do the one different thing".)

> Category and Image links don't create normal links, but they do create links.

That's picking nits; they would not actually have to - one could easily create a [[foo:...]] that did not create any form of link but did something else entirely, such as applying a CSS class or whatever.  And [[date]] formatting could conceivably create some kind of "non-normal link", in some manner on the page somewhere, such as an entry in a (presently nonexistent) article timeline sidebar feature, or whatever.  (For this reason, I faintly agree that dates should remain marked up with [[date]] formatting, rather than have the [[ and ]] removed by bots or by individuals with too much time on their hands.)

> E) I haven't tested it yet.

True of all options, no?

All that said, I'll ultimately be satisfied with any solution that ends the sea of useless blue date links, really.
Comment 264 S. McCandlish 2008-12-02 10:53:41 UTC
> Discussion on the English Wikipedia "Manual
> of Style (dates and numbers)" seems to be building a consensus to not do
> autoformatting, especially if non-logged-in readers do not benefit. If
> eventually autoformatting is no longer done, and gradually all autoformatting
> markup is removed, then someone who wants to link to an article about a date
> for some reason will have to use the odd syntax [[:February 30]], and since
> there will be few examples of autoformatting markup around, the reason for this
> odd syntax will not be apparent.

That's a cart-before-the-horse issue, really.  MOSNUM is castigating autoformatting because it creates a sea of blue.  If THAT problem is solved, then there is no particular reason to oppose autoformatting, and as others have pointed out here there are other potential uses for autoformatting more generally (color/colour, etc.), if it doesn't overload the wikilinking function.
Comment 265 Radon210 2008-12-02 22:54:13 UTC
(In reply to comment #264)
> > Discussion on the English Wikipedia "Manual
> of Style (dates and numbers)" seems to be building a consensus to not do
> autoformatting, especially if non-logged-in readers do not benefit. If
> eventually autoformatting is no longer done, and gradually all autoformatting
> markup is removed, then someone who wants to link to an article about a date
> for some reason will have to use the odd syntax [[:February 30]], and since
> there will be few examples of autoformatting markup around, the reason for this
> odd syntax will not be apparent.
> 
> That's a cart-before-the-horse issue, really.  MOSNUM is castigating
> autoformatting because it creates a sea of blue.  If THAT problem is solved,
> then there is no particular reason to oppose autoformatting, and as others have
> pointed out here there are other potential uses for autoformatting more
> generally (color/colour, etc.), if it doesn't overload the wikilinking
> function.
> 

2 things need to be noted.
1) The English Wikipedia is not the only website using MediaWiki
2) There is also probably going to be a configurable default set
Comment 266 Le Chat 2008-12-03 08:56:38 UTC
Seems to me that all the devs need to do is to provide a magic word for user locale (value dependent on user preference or browser information); it would then be up to individual wikiprojects to decide whether to use it and how. If they did decide to use it it would be easy to make parser-function-based templates to do date autoformatting, spelling autoformatting and whatever else, without the need for any more software enhancements. (I don't know how the caching problem would be solved, but presumably in whatever way it's currently solved with date autoformats.)
Comment 267 cypsy 2008-12-03 19:11:40 UTC
(In reply to comment #266)
> Seems to me that all the devs need to do is to provide a magic word for user
> locale 

Notwithstanding that magic words cannot be made subject to the user, this bug is evidence that per-user content is a terrible idea. Besides, it is /articles/ that are locale-specific (cf. ENGVAR), not users.

(In reply to comment #256)
> [list of options]

If MediaWiki were Wikipedia-specific, option #2 would be fine. But MediaWiki is used by other sites too, who may be expecting date formatting to work as it has done so far. Among the listed options, the only realistic option is #1, since that does not affect anything but en.wiki. But the resultant "seas of red" will of course not be an improvement over "seas of blue".

(In reply to comment #264)
> That's a cart-before-the-horse issue, really.  MOSNUM is castigating
> autoformatting because it creates a sea of blue.

True, but it is not /just/ that the seas of blue exist, but that there is no alternative and -- to judge from this bugzilla ticket -- no hope that an alternative will be forthcoming. Once editors/bots begin mass de-linking dates (and the RFCs indicate that this is going to happen) there will be no horse to put the cart before. 

Comment 268 billclark 2008-12-03 20:26:46 UTC
(In reply to comment #267)
> Once editors/bots begin mass de-linking dates (and the RFCs indicate
> that this is going to happen) there will be no horse to put the cart
> before. 

It's not just "going to happen" -- it's already happening.  That's what got me involved in this bug in the first place, since Tony (et al.) were refusing to hold off on delinking, and used the "lack of response" on this bug as an excuse.  By their own estimates, they've already delinked tens of thousands of dates in thousands of articles.  The RFCs are an attempt to get them to stop, or to at least get some GENUINE consensus support for their actions.

I'd prefer to keep date autoformatting and actually LIKE the date links (I don't agree that "overlinking" is EVER an issue.. I'd prefer to see MORE links in articles, not fewer) but more importantly I want to see the markup left in place so that we have the option of doing something more useful with it in the future.  With the markup, it would be trivial to have a bot replace the [[]] with a template or new syntax or whatever... but without the markup, all of that would need to be done manually, to confirm that we're not dealing with a date in a quote, or odd formatting edge cases like "On March 1, 500 people were killed" which could either mean that 500 people were killed on March 1 or that some unspecified number of people were killed on March 1, 500 A.D.
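
The "March 1, 500" ambiguity is easy to reproduce. Here is a minimal Python sketch, assuming a simplified month-day-year pattern (the `MDY` pattern and `find_dates` helper are hypothetical and are not what DateFormatter uses). It shows a naive scanner of unmarked text reading the sentence as a full date:

```python
import re

# Hypothetical, simplified month-day-year pattern for illustration only;
# it is not the pattern MediaWiki's DateFormatter uses.
MONTHS = ("January|February|March|April|May|June|July|August|"
          "September|October|November|December")
MDY = re.compile(rf"\b({MONTHS}) (\d{{1,2}}), (\d{{1,4}})\b")

def find_dates(text):
    """Return (month, day, year) tuples for every apparent date in text."""
    return [m.groups() for m in MDY.finditer(text)]

sentence = "On March 1, 500 people were killed"
print(find_dates(sentence))  # [('March', '1', '500')]
```

With explicit markup around the intended date, a bot never has to guess whether "500" is a year or a head count.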

If the consensus is that date autoformatting should be done away with and dates should not (usually) be linked, then I'd prefer to have that enforced by applying the third patch, since it would leave all the markup in place (and we could then proceed to start working on an improved date autoformatting for some future release).  It would also accomplish the goal of Tony and his cohorts right away, without them having to edit untold number of articles.  Everybody wins.

-Bill
Comment 269 cypsy 2008-12-03 22:34:16 UTC
> ... used the "lack of response" on this bug as an excuse.  

It's a valid point. Cf. comment #35 of 16 December 2006.

Lack of response is also a response. It's not an unjustifiable position either; after all, "if X is causing a problem then X should be removed." Since X is local to en.wiki, it's up to en.wiki to remove it.

> Everybody wins.

It's a well-known fact that everybody wins with patch #3. There is, however, no indication that it will in fact be implemented. Indeed, the opposite is true.
Comment 270 S. McCandlish 2008-12-05 09:23:41 UTC
I can resignedly go along with Bill's idea (third patch), though it doesn't seem optimum to me.  I can't agree with the idea that the sea of blue isn't a real problem and we need more and more links. Overlinking is a well-studied problem in usability circles, and has been since the mid-1990s, with a lot of data backing up the concerns. It's not a WP issue, but a general one with regard to hypertext media as a class.
Comment 271 Le Chat 2008-12-05 15:00:51 UTC
Everyone wins with the third patch? Forgive me if I've misunderstood, but the third patch means not linking dates that are enclosed normally in [[...]], right? I can't see how ordinary editors can be said to "win" by being prevented from making links to dates in the normal way they make links to everything else; nor is it the developers' job to forcibly delink *all* existing dates (which would be the consequence of this patch - again unless I've misunderstood).

Seems the optimum solution is to give up all futile attempts at social engineering and simply turn off the current autoformatting, allow dates to be linked or not just like anything else, and (when ready, if at all) introduce a simple new syntax (if not "magic word" then something else, doesn't really matter, obviously it is feasible since the current autoformatting does it) for producing alternative text output depending on user preference (/locale). Then the projects could decide whether to use this syntax for dates, for alternative spellings, alternative meanings of words or whatever. Or (preferably) not use it at all.
Comment 272 billclark 2008-12-05 15:51:28 UTC
(In reply to comment #270)
> I can resignedly go along with Bill's idea (third patch), though it doesn't
> seem optimum to me.  I can't agree with the idea that the sea of blue isn't a
> real problem and we need more and more links. Overlinking is a well-studied
> problem in usability circles, and has been since the mid-1990s, with a lot of
> data backing up the concerns. It's not a WP issue, but a general one with
> regard to hypertext media as a class.

Just in case there's any confusion, the third patch DOES eliminate links on [[]] formatted dates (leaving [[:]] or "piped" [[|]] dates alone) so it solves the "sea of blue" problem.  My personal preference would have been to keep the links, but I wrote the patch to do what (I think) other people want, not what I want.

As for usability studies, I'd take them with a grain of salt.  I had a boss that was fond of pointing to usability studies to "prove" that serif fonts were easier to read than sans-serif fonts (he came from the newspaper industry and insisted on serif fonts on all our websites.)  Then, about ten years ago, the numbers started to shift -- and nowadays, sans-serif fonts are preferred something like 2-to-1 over serif fonts.  What changed?  More people started reading text on computers, which are more likely to use sans-serif fonts.

Usability studies measure what people are *familiar* *with*, not what is "inherently" easier to use.  As people gain more exposure to hyperlinked text, their "tolerance" for hyperlinks will increase.  Plus, there are always people in any usability study that deviate from the norm.  My point in saying that *I* don't mind the hyperlinks is that the choice of whether or not to link a date should be one left up to the *reader* in most cases.  Some people don't like an abundance of links, and they should have the option of turning off the links.  Some people like having links, and they should have the option of turning them on.

-Bill
Comment 273 billclark 2008-12-05 16:07:13 UTC
(In reply to comment #271)
> I can't see how ordinary editors can be said to "win" by being prevented
> from making links to dates in the normal way they make links to everything
> else; nor is it the developers' job to forcibly delink *all* existing dates
> (which would be the consequence of this patch - again unless I've
> misunderstood).

Dates that use the [[:]] or [[|]] style of markup will still be linked but not autoformatted, just as they are now.  It's only the dates with [[]] markup that will appear unlinked (and in "raw" format) on the page.

Presumably some of the [[]] dates are ones that really *should* remain linked, and editors will need to change those to [[:]] or [[|]] format to re-enable the links after (if) the patch is applied.

It's important to note that there are a number of editors that are manually removing ALL of the [[]] markup around links, RIGHT NOW.  They've already removed thousands, perhaps tens of thousands, of such links.  In doing so they're also eliminating valuable markup that could be put to better use, or (if consensus dictates) re-enable date linking/autoformatting.  The proposed (third) patch accomplishes the same goal as these editors -- eliminate date linking and autoformatting for [[]] dates while leaving [[:]] and [[|]] ones untouched -- but in a way that happens immediately and which is completely reversible.

So instead of writing "everybody wins" I should have written "the people currently removing link syntax win, and the people (including me) who want to develop an improved parser function that depends on date markup don't lose quite as badly as they otherwise would."  That better?

-Bill
Comment 274 Le Chat 2008-12-05 16:28:16 UTC
Well, other losers seem to be (1) the editor population at large (who would have to learn a different linking syntax specially for dates, and - if your preservation programme is to have any value - be expected to use a counterintuitive syntax even for unlinked dates); (2) those who wish to retain existing linked dates in many articles (such as the chronological articles themselves). I'm personally all in favour of the delinking, but I don't see a need to do it by brute force in a way that leaves a pointless syntactical complication for editors frozen in for the foreseeable future.
Comment 275 billclark 2008-12-05 17:20:54 UTC
(In reply to comment #274)
> who would have to learn a different linking syntax specially for dates

There's *already* a different linking syntax for dates, and it's the same as for links to categories and for images (and inter-wiki links for that matter.)  The patch doesn't change that.

If I wanted to link to a specific date *currently* I would still need to use the [[:]] or [[|]] syntax to do it, otherwise my link would automatically be broken into a link to the day-month combination and a different link to the year.

So the patch has nothing to do with the weird date linking syntax.

> those who wish to retain existing linked dates in many articles
> (such as the chronological articles themselves).

If those dates aren't already marked up using either the [[:]] or [[|]] syntax, then editors like Tony and Lightmouse are going to delink them (if they haven't done so already.)  They're using a JavaScript tool that does the delinking automatically and (if the number of complaints on their talk pages is any indication) they aren't being too careful about which dates they're delinking.

> I'm personally all in favour of the delinking, but I don't see a
> need to do it by brute force in a way that leaves a pointless syntactical
> complication for editors frozen in for the foreseeable future. 

I don't see the need to do it by "brute force" (which applies more to what Tony and others are doing rather than the proposed patch) in a way that makes the decision (reached without proper consensus) hard to reverse.

Bear in mind that the patch is meant as a temporary measure.  Applying it would allow us to do *instantly* what Tony and others will take months or years to do by hand, and to gauge reader response right away.  If readers complain about the change, it's trivial to remove the patch and revert the instructions on MOSNUM to how they were for the past few years (i.e. encourage date linking for autoformatting.)  If readers applaud the change, then the manual delinking could still occur (or a bot could take over) to clean up the now-obsolete linking syntax, and THEN the patch could be removed.

Either way, the patch will eventually no longer be needed, either because an improved date autoformatting function has been created or because readers show that they really do want date linking and/or autoformatting.

-Bill
Comment 276 billclark 2008-12-05 17:24:02 UTC
(In reply to comment #275)
> Either way, the patch will eventually no longer be needed, either because an
> improved date autoformatting function has been created or because readers show
> that they really do want date linking and/or autoformatting.

Sorry.. that last part should read: "or because readers show that they really DON'T want date linking and/or autoformatting."

-Bill
Comment 277 S. McCandlish 2008-12-05 18:07:28 UTC
Bill says:
> Presumably some of the [[]] dates are ones that really *should* remain linked,
> and editors will need to change those to [[:]] or [[|]] format to re-enable the
> links after (if) the patch is applied.

Right. There are so few dates that actually should be linked that this is a really trivial matter. 
Comment 278 Andrew Garrett 2009-03-10 01:13:50 UTC
Fixed, r48249.
Comment 279 Brion Vibber 2009-03-18 23:02:56 UTC
Since this tweaks around markup, it really needs some parser test cases.
Comment 280 MacGyverMagic 2009-03-31 10:36:46 UTC
It's a nice addition for templates, but not suitable for use in regular articles because it uses unnecessarily complicated formatting. The perceived problem with regular linking of dates is that it caused irrelevant links in whatlinkshere, but its format is highly superior. Can't we just choose how to format a date in our preferences without the requirement for any extra coding at all?
Comment 281 Andrew Garrett 2009-04-07 00:50:54 UTC
Parser test cases were done ages ago.
Comment 282 wclark 2009-04-07 01:35:01 UTC
This wikitext:

{{#formatdate:January 15}}

Produces this page text:

January 1, 5

Surely that's not correct.
Comment 283 Andrew Garrett 2009-04-11 15:20:55 UTC
(In reply to comment #282)
> This wikitext:
> 
> {{#formatdate:January 15}}
> 
> Produces this page text:
> 
> January 1, 5
> 
> Surely that's not correct.


Quoted from my talk page, on this matter:

If you're interested in fixing the problem with yearless dates, it's an interesting one. When you strip out the [[ ]] syntax, you end up leaving " *,? *" as the only thing separating the day from the year, and since that regex matches the empty string, the parser function thinks the first digit of a two-digit day is the day, and the second digit is the year (or the other way around, depending on what the "raw" format is.) Fixing it is non-trivial because while the simple and obvious fix is to use " *,? +" (or " *,?  *" for the non-Perl-compatible regex) that will introduce annoying edge cases where the comma is misplaced (handled correctly by the standard autoformatting) or where the year is on a new line in the wikitext. I'm not sure if those edge cases are worth worrying about though. --UC_Bill (talk) 18:30, 10 March 2009 (UTC)

    Maybe "( ,)+" would be better.. except that will allow [[15 January]],,,,,,,,[[2009]] to be matched (which may or may not be a problem) and would require some corresponding changes to the "keys" array in DateFormatter to tell it to ignore the new match register. --UC_Bill (talk) 18:41, 10 March 2009 (UTC)
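
The mis-parse described above can be reproduced outside MediaWiki. Here is a minimal Python sketch, assuming simplified stand-ins for the day and year sub-patterns once the [[ ]] brackets are gone (the names `MONTH`, `DAY`, `YEAR`, `LOOSE_MDY` and `parse` are hypothetical, not DateFormatter's own):

```python
import re

# Simplified stand-ins for the sub-patterns after the [[ ]] brackets are
# stripped; the names and exact patterns are assumptions for illustration.
MONTH = r"(January)"
DAY = r"(\d{1,2})"
YEAR = r"(\d{1,4})"

# The separator " *,? *" can match the empty string.
LOOSE_MDY = re.compile(rf"^{MONTH} {DAY} *,? *{YEAR}$")

def parse(text):
    """Return (month, day, year) if text looks like a full date, else None."""
    m = LOOSE_MDY.match(text)
    return m.groups() if m else None

print(parse("January 15"))        # ('January', '1', '5')
print(parse("January 15, 2009"))  # ('January', '15', '2009')
```

The day pattern first takes both digits. With no digit left for the year, it backtracks to one digit, and the empty separator lets the second digit become the "year".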
Comment 284 Philippe Verdy 2009-04-11 16:33:02 UTC
Why not simply: "( *, *| +)" ?
This seems a trivial change that explicitly wants a single comma with optional surrounding spaces, or at least one space.
Comment 285 Philippe Verdy 2009-04-11 16:46:36 UTC
In other words:
--- trunk/phase3/includes/parser/DateFormatter.php	(revision 48249)
+++ trunk/phase3/includes/parser/DateFormatter.php	(revision ?????)
@@ -50,3 +50,3 @@
 
 		# Real regular expressions
-		$this->regexes[self::DMY] = "/{$this->prxDM} *,? *{$this->prxY}{$this->regexTrail}";
+		$this->regexes[self::DMY] = "/{$this->prxDM}( *, *| +){$this->prxY}{$this->regexTrail}";
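
To illustrate the effect of the "( *, *| +)" separator, here is a minimal Python sketch. It uses a simplified month-day-year pattern rather than the DMY regex in the diff, and `STRICT_MDY` and `parse_strict` are hypothetical names:

```python
import re

# Simplified stand-ins; names and patterns are assumptions for illustration.
MONTH = r"(January)"
DAY = r"(\d{1,2})"
YEAR = r"(\d{1,4})"

# Separator must be a comma with optional spaces around it, or one or more
# spaces; it can no longer match the empty string.
STRICT_MDY = re.compile(rf"^{MONTH} {DAY}( *, *| +){YEAR}$")

def parse_strict(text):
    """Return all captured groups if text is a full date, else None."""
    m = STRICT_MDY.match(text)
    return m.groups() if m else None

print(parse_strict("January 15"))        # None
print(parse_strict("January 15, 2009"))  # ('January', '15', ', ', '2009')
print(parse_strict("January 15 2009"))   # ('January', '15', ' ', '2009')
```

Note that the parenthesised separator is itself a capture group, so the year moves from the third group to the fourth.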
Comment 286 Andrew Garrett 2009-04-24 04:01:44 UTC
Fixed, with a parser test case in r49794.
Comment 287 wclark 2009-05-08 01:51:09 UTC
The most recent changes break existing autoformatting for linked dates (not those in {{#formatdate}}) -- badly.  The problem is that by putting an additional () grouping into the regexes, the characters of $this->keys no longer correspond to the correct groups.

This will fix it:

Index: includes/parser/DateFormatter.php
===================================================================
--- includes/parser/DateFormatter.php   (revision 50326)
+++ includes/parser/DateFormatter.php   (working copy)
@@ -59,10 +59,10 @@

        # Extraction keys
        # See the comments in replace() for the meaning of the letters
-       $this->keys[self::DMY] = 'jFY';
+       $this->keys[self::DMY] = 'jF Y';
        $this->keys[self::YDM] = 'Y jF';
-       $this->keys[self::MDY] = 'FjY';
-       $this->keys[self::YMD] = 'Y Fj';
+       $this->keys[self::MDY] = 'Fj Y';
+       $this->keys[self::YMD] = 'Y  Fj';
        $this->keys[self::DM] = 'jF';
        $this->keys[self::MD] = 'Fj';
        $this->keys[self::ISO1] = 'ymd'; # y means ISO year
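
The group shift can be sketched in Python. This assumes each character of a key string names the capture group at the same position and that a space means "ignore this group"; `PATTERN` and `extract` are hypothetical simplifications, not DateFormatter code:

```python
import re

# Month-day-year pattern with the extra parenthesised separator group.
PATTERN = re.compile(r"^(January) (\d{1,2})( *, *| +)(\d{1,4})$")

def extract(match, keys):
    """Map key letters to captured text, skipping groups keyed by a space."""
    fields = {}
    for index, key in enumerate(keys):
        if key != " ":
            fields[key] = match.group(index + 1)
    return fields

m = PATTERN.match("January 15, 2009")
print(extract(m, "FjY"))   # {'F': 'January', 'j': '15', 'Y': ', '}
print(extract(m, "Fj Y"))  # {'F': 'January', 'j': '15', 'Y': '2009'}
```

With the old three-letter key the "year" slot receives the separator text. Padding the key with a space realigns it with the fourth group.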

HOWEVER, I think it would be better to simply revert ALL of the changes associated with this bug, and to close the bug as "WONTFIX" since the recent ArbCom-sponsored poll determined that the existing autoformatting is to be eliminated anyway.
Comment 288 Rich Farmbrough 2009-05-23 13:29:16 UTC
I agree with Bill and am closing the bug.  See https://bugzilla.wikimedia.org/show_bug.cgi?id=18479 for the request to turn off autoformatting on en:WP.
Comment 289 Andrew Garrett 2009-05-24 09:12:10 UTC
I don't care whether English Wikipedia wants to use it or not. There is useful functionality in MediaWiki that needs to be fixed. 

Please don't decide on my behalf that I'm not going to work on a feature unless your name is Brion Vibber.
Comment 290 David E. Siegel 2009-05-25 22:01:52 UTC
(In reply to comment #287)
...
> HOWEVER, I think it would be better to simply revert ALL of the changes
> associated with this bug, and to close the bug as "WONTFIX" since the recent
> ArbCom-sponsored poll determined that the existing autoformatting is to be
> eliminated anyway.

MediaWiki is used on sites other than en.wp. Even if the poll indicates that this functionality will never be used on the en Wikipedia, it may well be useful on other MediaWiki sites.

Of course, for many such sites, the original request of new syntax to accomplish formatting *without* linking would be preferable, and I think simpler to code, since detecting existing date formats would not be an issue.
