Last modified: 2013-01-17 16:36:26 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T16404, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 14404 - {{int:X}} respects user-defined interface language, breaking link tables etc. (aka {{USERIFCODE}} strikes back)
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Parser
Version: unspecified
Hardware: All All
Importance: Low normal with 3 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://test.wikipedia.org/wiki/Bug_2085
Keywords: patch
Duplicates: 19638
Depends on:
Blocks: 2085 28424 16608

Reported: 2008-06-04 12:59 UTC by Mormegil
Modified: 2013-01-17 16:36 UTC (History)

Attachments
Change to use the parser's function language instead of UI language (730 bytes, patch)
2008-11-25 23:47 UTC, Brion Vibber

Description Mormegil 2008-06-04 12:59:48 UTC
The int: core parser function (e.g. {{int:Talkpage}}) retrieves the text using _the user interface language_ (_not_ the contents language).

But that means all reasons why Bug 2085 is marked WONTFIX (caching, link table corruption), are already here! (See the linked URL.)

I guess the proper fix would be to change CoreParserFunctions::intFunction to add one “true” argument to the wfMsgReal call.
Comment 1 Alexandre Emsenhuber [IAlex] 2008-06-09 17:58:19 UTC
fixed in r36093.
Comment 2 Brion Vibber 2008-06-11 02:52:34 UTC
Reverted in r36185 -- caused regression to parser cache consistency.

User-specific options such as stub threshold were still applying in the parser, but not taken into account in the parser hash key. As a result, the caches were corrupt, saving different options into the anonymous-default options cache.
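
The kind of corruption described here can be illustrated with a toy cache. This is only a minimal Python sketch under stated assumptions, not MediaWiki code: `render`, `cache_key_buggy`, `cache_key_fixed`, `get` and the `stub_threshold` option dictionary are all hypothetical names chosen for the illustration.

```python
# Toy model of a parser cache whose key omits an option that affects output.
# All names here are illustrative; none of them are MediaWiki APIs.

def render(page, options):
    # The rendered output depends on the viewer's stub threshold.
    return f"{page} rendered with stub_threshold={options['stub_threshold']}"

def cache_key_buggy(page, options):
    # The option that changes the output is NOT part of the key.
    return page

def cache_key_fixed(page, options):
    # Every option that influences rendering is part of the key.
    return (page, options["stub_threshold"])

def get(cache, key_func, page, options):
    key = key_func(page, options)
    if key not in cache:
        cache[key] = render(page, options)
    return cache[key]

logged_in = {"stub_threshold": 500}
anonymous = {"stub_threshold": 0}

# A logged-in user fills the cache first, then an anonymous reader arrives.
buggy = {}
get(buggy, cache_key_buggy, "Main Page", logged_in)
anon_view_buggy = get(buggy, cache_key_buggy, "Main Page", anonymous)

fixed = {}
get(fixed, cache_key_fixed, "Main Page", logged_in)
anon_view_fixed = get(fixed, cache_key_fixed, "Main Page", anonymous)
```

With the buggy key, the anonymous reader is served the rendering that was produced with the logged-in user's options; with the fixed key, each set of options gets its own entry.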
Comment 3 Rocket000 2008-08-06 04:32:54 UTC
No! Don't fix this! I always assumed this was the desired behavior. We have multilingual templates on Commons that make great use of this. See [[commons:Template:Edit-int]] and [[commons:Template:See also]]. We also utilize translated messages from our upload form, such as [[commons:MediaWiki:UploadFormSourceLabel]]. I was planning to do this for our {{Information}} template: [[commons:Template:Information (Internationalised)]]. Now I learn this great feature is a bug! I thought {{MediaWiki:... was for when you wanted the actual contents regardless of language.

...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if this is.
Comment 4 Mormegil 2008-08-06 08:52:06 UTC
(In reply to comment #3)
> No! Don't fix this! I always assumed this was the desired behavior.
> […]
> ...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if
> this is.

Did you read Bug 2085? And that it is marked WONTFIX and why? In the current MediaWiki architecture, it would be (IMHO) terribly difficult to make parser’s behavior dependent on the current viewer’s preferred language.

You could implement simple preference-dependent translation using JavaScript, nothing too difficult in that.

(Personal sidenote: do you realize that voting for this bug means you want it to be _fixed_, i.e. the {{int:X}} hack to be removed?)
Comment 5 Rocket000 2008-08-06 09:10:05 UTC
Unvoted. :) I originally misunderstood what this was... I was thinking it was more of a sequel to Bug 2085. It's suggesting that a magic word is possible because "all reasons why Bug 2085 is marked WONTFIX (caching, link table corruption), are already here!" I'm not sure what would be terribly difficult about it. I mean, we already can achieve the same with int:, it's just that it might get "fixed" now. :/
Comment 6 Casey Brown 2008-08-06 14:19:32 UTC
(In reply to comment #3)
> No! Don't fix this! I always assumed this was the desired behavior. We have
> multilingual templates on Commons that make great use of this. See
> [[commons:Template:Edit-int]] and [[commons:Template:See also]]. We also
> utilize translated messages from our upload form, such as
> [[commons:MediaWiki:UploadFormSourceLabel]]. I was planning to do this for our
> {{Information}} template: [[commons:Template:Information (Internationalised)]].
> Now I learn this great feature is a bug! I thought {{MediaWiki:... was for when
> you wanted the actual contents regardless of language.
> 
> ...if only we had a {{USERLANGUAGE}} variable. And it's entirely possible if
> this is.
> 

I agree with Rocket here, this is used already in a bunch of places.  Try requesting a different function for the content language.
Comment 7 Mormegil 2008-08-06 18:39:22 UTC
(In reply to comment #5)
> I'm not sure what would be terribly difficult
> about it. I mean, we already can achieve the same with int:, it's just that it
> might get "fixed" now. :/

The fact that {{int:}} seems to work, is just a bug, causing database integrity violation. The requested behavior cannot be easily implemented into MediaWiki _in a correct way_. That is the reason Bug 2085 has been closed as WONTFIX, i.e. the feature request has been rejected, such functionality is _not_ going to get into MediaWiki. Comments by Brion Vibber at Bug 2085 explain why. In short (see also the attached URL for an example): If you write [[{{int:History short}}]], you are creating a page that links to either [[History]], or [[Historie]], or [[Historique]], or … etc., according to the user’s preferred language. OK, you might think, what’s the problem? The problem is twofold, but the most difficult part to solve is link tables: MediaWiki needs to know which page links to which (e.g. because of Special:Whatlinkshere etc.). But it is unable to decide in this case – the linked page depends on the user who views the page! And, if you check the linked example, you can see the broken behavior – only one of the linked Whatlinkshere pages lists the example page (and which one is “random”, it depends on the language of the user who saved the last edit).
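
The link table problem can be reproduced in a toy model. The following Python sketch is an illustration only: the `MESSAGES` table, the language codes, the `history_short` key and the function names are assumptions made for the example and do not correspond to MediaWiki internals.

```python
# Toy model: the stored link table reflects whichever language the last
# saver used, while each reader sees a link resolved in their own language.

MESSAGES = {
    "en": {"history_short": "History"},
    "cs": {"history_short": "Historie"},
    "fr": {"history_short": "Historique"},
}

def parse_links(message_keys, lang):
    """Resolve each [[{{int:key}}]] link using the given language."""
    return {MESSAGES[lang][key] for key in message_keys}

def save_page(link_table, page, message_keys, saver_lang):
    # Links are recorded once, at save time, from a single parse.
    link_table[page] = parse_links(message_keys, saver_lang)

def what_links_here(link_table, target):
    return sorted(p for p, targets in link_table.items() if target in targets)

link_table = {}
save_page(link_table, "Bug 2085", ["history_short"], saver_lang="cs")

# A reader with French as interface language sees a link to another page.
seen_by_french_reader = parse_links(["history_short"], "fr")
```

Only the target matching the saver's language has the page in its backlinks, although readers can be shown links to any of the three targets.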

I am not saying this problem is completely impossible to solve, just that it would be IMHO really difficult to do (implementation-wise, and probably even performance-wise after that).

(In reply to comment #6)
> I agree with Rocket here, this is used already in a bunch of places.  Try
> requesting a different function for the content language.
 
You misunderstand the situation. It’s not like I’m asking for a new feature. I am just reporting this behavior, which is a bug, not a feature (see above and Bug 2085 for explanation why).
Comment 8 Siebrand Mazeland 2008-08-16 23:08:47 UTC
Assigning to Nikerabbit.
Comment 9 Tisza Gergő 2008-09-11 19:26:09 UTC
(In reply to comment #7)
Would using the content language for the link table, and the interface language for the actual display be acceptable?
Comment 10 Alexandre Emsenhuber [IAlex] 2008-09-11 19:32:28 UTC
(In reply to comment #9)
> Would using the content language for the link table, and the interface
> language for the actual display be acceptable?
> 

Already tried, see comment 1 and comment 2 :)
Comment 11 Brion Vibber 2008-11-25 23:47:56 UTC
Created attachment 5541 [details]
Change to use the parser's function language instead of UI language

Replaces the wfMsgGetKey() and wfMsgReplaceArgs() calls with a call to wfMsgExt() using $parser->getFunctionLang().

This _ought_ to cause no changes in behavior in UI message usage, but would render page content material using the site content language. This does what was originally planned on this bug, but... people *do* like to put little UI thingies in their pages, and it is useful, so I don't want to break it just yet.

This change, or something like it, is also needed in order for {{int:}} to do what's expected in UI messages being pulled for something that's not the *general* UI language (eg, not $wgLang). To work around this in CentralNotice I'm currently temporarily overriding $wgLang while doing message renders, and this kind of sucks.

Maybe what we want is some way to mark a page or a part of a page as being in a different language, either a specific one or the selected UI language, so that parser functions (including PLURAL and GRAMMAR as well as int) can use the appropriate language for their individual bits of content.
Comment 12 Alno 2008-11-29 23:52:47 UTC
(In reply to comment #7)
> (In reply to comment #5)
> 
> The fact that {{int:}} seems to work, is just a bug, causing database integrity
> violation. 
> (...)
> In short (see also the attached URL for an example): If you write [[{{int:History
> short}}]], you are creating a page that links to either [[History]], or
> [[Historie]], or [[Historique]], or … etc., according to the user’s
> preferred language. OK, you might think, what’s the problem? The problem is
> twofold, but the most difficult part to solve is link tables: MediaWiki needs
> to know which page links to which (e.g. because of Special:Whatlinkshere etc.).
> But it is unable to decide in this case – the linked page depends on the user
> who views the page! And, if you check the linked example, you can see the
> broken behavior – only one of the linked Whatlinkshere pages lists the
> example page (and which one is “random”, it depends on the language of the
> user who saved the last edit).

I'd suggest that in such case, MediaWiki should always store every existing page that would actually be seen by users.

For instance, when encountering [[{{int:History (short)}}]], and having at the same time actual pages at [[History]], [[Historique]] and [[Historie]] and nothing for it in any other language, we'd store as backlinks to the current page the three said ones (as if there were three real links). Actually, these *are* virtually real links: any user could really get to one of these pages, so making a link to all of them makes sense.

This would also be independent of the user's actual settings, and wouldn't store too many backlinks for the vast majority of the pages.

I hope I didn't write something completely stupid! :)
Comment 13 Brion Vibber 2008-11-30 00:03:21 UTC
(In reply to comment #12)
> I'd suggest that in such case, MediaWiki should always store every existing
> page that would actually be seen by users.
> 
> For instance, when encountering [[{{int:History (short)}}]], and having at the
> same time actual pages at [[History]], [[Historique]] and [[Historie]] and
> nothing for it in any other language, we'd store as backlinks to the current
> page the three said ones (as if there were three real links). Actually, these
> *are* virtually real links: any user could really get to one of these pages, so
> making a link to all of them makes sense. 

Well, this could require parsing every page that used {{int:}} several hundred times every time it's saved. Yeouch! :)
Comment 14 Alno 2008-11-30 16:33:28 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > I'd suggest that in such case, MediaWiki should always store every existing
> > page that would actually be seen by users.
> > 
> > For instance, when encountering [[{{int:History (short)}}]], and having at the
> > same time actual pages at [[History]], [[Historique]] and [[Historie]] and
> > nothing for it in any other language, we'd store as backlinks to the current
> > page the three said ones (as if there were three real links). Actually, these
> > *are* virtually real links: any user could really get to one of these pages, so
> > making a link to all of them makes sense. 
> 
> Well, this could require parsing every page that used {{int:}} several hundred
> times every time it's saved. Yeouch! :)

I see... You mean something like:

FOREACH language l DO
  set int: to l
  parse to get links
  FOREACH link li DO
    IF exists(li) THEN
      cache li
    ENDIF
  ENDFOR
ENDFOR

=> this would lead to parsing the page N(int:) times, then checking whether N(int:)*N(links) pages exist

I'd see something like:

parse to get links, without resolving their 'int:' part
FOREACH link l DO
  FOREACH language lang DO
    set int: to lang
    set possible_link to the resolution of l with current value of int:
    IF exists(possible_link) THEN
      cache possible_link
    ENDIF
  ENDFOR
ENDFOR

=> then you would parse the page only once, leaving the hundreds of existence tests to be made separately.

I understand this would be still quite long... :(
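
For illustration, the second variant above could look roughly like this in Python. It is a sketch under stated assumptions: the `messages` table, the `history_short` key, the language codes and the function name `candidate_backlinks` are hypothetical, and it handles only direct [[{{int:...}}]] links, not int: calls nested inside template names.

```python
# Sketch of the "parse once" variant: the page is parsed a single time with
# its int: links left unresolved, then each link is tried in every language.

def candidate_backlinks(unresolved_links, messages, existing_pages):
    """Return every existing page that some viewer could be linked to.

    unresolved_links: message keys found in [[{{int:key}}]] links
    messages:         {language code: {message key: page title}}
    existing_pages:   set of page titles that exist on the wiki
    """
    found = set()
    for key in unresolved_links:          # FOREACH link l
        for table in messages.values():   # FOREACH language lang
            title = table.get(key)
            if title is not None and title in existing_pages:
                found.add(title)          # cache possible_link
    return found

messages = {
    "en": {"history_short": "History"},
    "cs": {"history_short": "Historie"},
    "fr": {"history_short": "Historique"},
    "xx": {"history_short": "NoSuchPage"},  # made-up language, page missing
}
existing = {"History", "Historie", "Historique"}
backlinks = candidate_backlinks(["history_short"], messages, existing)
```

The cost is one parse plus N(languages)*N(links) existence checks, which matches the estimate given in the pseudocode.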
Comment 15 Splarka 2008-11-30 23:26:14 UTC
(In reply to comment #14)
> => this would lead to parsing the page N(int:) times, then checking whether
> N(int:)*N(links) pages exist

Ahh, but this would recurse too. Imagine if {{ {{int:foo}} }} called potentially up to N templates, each of which had another {{int:}}. Two deep would be N^2
Comment 16 Ilmari Karonen 2008-12-10 07:57:03 UTC
Note that {{int:}} isn't the only case where page content can end up depending on the user's interface language, although it's probably the most visible one.  A number of parser error and warning messages, for example, are also embedded in page content using the user's interface language.

I suspect the only practical resolution, if we want to retain the current behavior of {{int:}}, would be to only update the link tables when the interface language matches the content language.  If the page is saved or purged by a user with a different interface language, we'd have to reparse the page in the background using the content language in order to correctly refresh the links.

Yes, this would leave the possibility of having links visible in different interface languages that are not recorded in the database, but that may be acceptable: after all, there are plenty of other cases where links are not recorded either.  The biggest problem is that updates to templates that are only used on "localized" versions of pages may not propagate fully -- but that can be worked around either by manual purging or simply by not doing that in the first place.
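
A rough Python sketch of this idea follows. It is not MediaWiki code: `CONTENT_LANG`, `MESSAGES`, `resolve_links` and `save_page` are hypothetical names, and the extra parse is done inline here, whereas the proposal would run it in the background.

```python
# Sketch: the link table is always filled from a content-language parse,
# whatever interface language the saving user has selected.

CONTENT_LANG = "en"  # assumed site content language

MESSAGES = {
    "en": {"history_short": "History"},
    "fr": {"history_short": "Historique"},
}

def resolve_links(message_keys, lang):
    return {MESSAGES[lang][key] for key in message_keys}

def save_page(link_table, page, message_keys, user_lang):
    # Parse for display in the saving user's interface language.
    displayed = resolve_links(message_keys, user_lang)
    if user_lang == CONTENT_LANG:
        # One parse serves both display and link tables.
        recorded, parses = displayed, 1
    else:
        # Extra parse in the content language, used only for link tables.
        recorded, parses = resolve_links(message_keys, CONTENT_LANG), 2
    link_table[page] = recorded
    return displayed, parses

link_table = {}
displayed, parses = save_page(link_table, "Example", ["history_short"], "fr")
```

The recorded links no longer depend on who saved the page, at the price of a second parse whenever the two languages differ.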

(Incidentally, {{int:}} suffers from a similar problem anyway: I don't believe transcluding an interface message using {{int:}} creates a templatelinks entry, so changes to the transcluded interface message won't propagate automatically.  In practice, we just live with that limitation.)
Comment 17 Mormegil 2008-12-10 10:20:08 UTC
(In reply to comment #16)
> (Incidentally, {{int:}} suffers from a similar problem anyway: I don't believe
> transcluding an interface message using {{int:}} creates a templatelinks entry,
> so changes to the transcluded interface message won't propagate automatically. 
> In practice, we just live with that limitation.)

Note that not only does {{int:}} not create a templatelinks entry; creating one would not be enough anyway. Most of the interface messages are not pages, but come directly from the PHP files. And when you update those files (e.g. during a MediaWiki upgrade), you don’t know whether, or what, you have changed (you would need to rerender all pages using any message).
Comment 18 Ilmari Karonen 2009-04-15 17:57:59 UTC
There's a similar link table inconsistency issue with time-based parser functions or magic words, which I've filed separately as bug 18478.
Comment 19 Dan Jacobson 2009-04-23 03:04:14 UTC
*** Bug 17629 has been marked as a duplicate of this bug. ***
Comment 20 Chad H. 2009-07-15 19:28:47 UTC
*** Bug 19638 has been marked as a duplicate of this bug. ***
Comment 21 Philippe Verdy 2010-07-17 07:16:14 UTC
I don't know why {{int:}} would corrupt the cache. In fact the cache just has to remember the language for which it generated the page (i.e. the value of the "uselang=" query parameter, or by default the language code inferred from the "Accept-Language:" header in the HTTP query).

Yes this means that pages may be cached multiple times, but only if they are visited by different users using different preferences for their language. All users will see a coherent page in their own language, the cache of prerendered pages will remain a FIFO and, instead of indexing just on "{{FULLPAGENAME}}", it will index on "{{USERLANGUAGE}}:{{FULLPAGENAME}}".

Note that pages in the cache should also have a short lifetime, if they use any one of the builtin magics that access the current time:
- if {{CURRENTMONTH}} or {{CURRENTYEAR}} is used, the lifetime should not exceed the current month or year on the server (but anyway, any page in the cache would probably be flushed before, to make room for other cached pages, or just because the server was upgraded)
- if {{CURRENTDAY}} is used, the lifetime in cache should not exceed the current day on the server ;
- if {{CURRENTHOUR}} is used, the lifetime in cache should not exceed the current hour on the server ;
- if {{CURRENTMINUTE}} is used, the lifetime in cache should not exceed the current minute on the server ;
- if {{CURRENTSECOND}} is used, the lifetime in cache theoretically should not exceed the current second on the server, but a minimum lifetime of pages in the cache may still be increased, to avoid too much work on frequently accessed pages
- if {{#time:}} is used and takes as default parameters the current time on the server (instead of being specified as constants in additional parameters), the same logic should be applied by detecting which date element was used in the format string (take the shortest element to reduce the lifetime while scanning the format string)
- if some other magic keywords that return server statistics (such as number of pages, number of edits...) are used in a page, these statistics should have a reasonable lifetime.

This means that builtin functions and magic keywords must be able to decrease the default lifetime of pages (but not be able to increase it), according to their semantics, and only after they have evaluated their parameters: their return value is not just a string, but a structure containing the parsed text and the maximum lifetime.

All builtin functions (as well as the template expander) will see what to do with their parameters: if a parameter is not used (for example because of a #if: or #switch: that skips some parameters), the builtin function will not reduce the lifetime of the output string it is creating. In other words, the lifetimes of each template parameter, of each builtin function parameter, and of the output of any of them are independent.

For example, the #ifeq parser function evaluating:


  {{#ifeq:a|b| The current second is: {{CURRENTSECOND}}. |}}

will still return the maximum lifetime, even if one of its parameters has a short lifetime (because its actual value is not used in the output of #ifeq).

Conversely, with:

  {{#ifeq:{{CURRENTSECOND}}|00| I'm up to the minute! |}}

the result will always be dependent on the value of {{CURRENTSECOND}}, because the 1st and 2nd parameters of #ifeq: always need to be evaluated. This means that the lifetime of #ifeq: is first initialized as the minimum of the lifetimes of its first 2 parameters being compared (because they are always evaluated), before determining whether the conditional 3rd or 4th parameter will be evaluated and returned: the #ifeq: builtin will then reduce (but not increase) the initialized value according to the lifetime computed and returned separately by either the 3rd or 4th parameter.

The same logic should be applied to the lifetime of parameters of conditional builtin functions like: #ifexpr:, #ifeq:  #switch:, when computing the lifetime of their returned value : unused parameters should have their lifetime simply ignored, and the returned value will be the minimum lifetime of ONLY the parameters they effectively use in their returned texts.

In all cases, once you have computed the maximum lifetime by taking the minimum of all these values above, check that this lifetime is not below a tuning parameter for the minimum lifetime of validity of ALL pages on the server cache (you may tune it to one minute for example, or less if the server can support it : this may depend on the server or project on which MediaWiki is installed, and on the policy needed for the global page caches used directly in the server, or within front proxy servers). And let the MediaWiki rendering server instruct the page cache about this lifetime of prerendered pages that will be stored there.
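
The lifetime rules described in this comment can be sketched as follows. This is a minimal Python illustration of the proposal, not a description of how MediaWiki works: `Result`, `literal`, `current_second`, `concat`, `ifeq`, `page_lifetime` and the two constants (with their values) are assumptions made for the example.

```python
from dataclasses import dataclass

MAX_LIFETIME = 30 * 24 * 3600  # assumed default lifetime, in seconds
MIN_LIFETIME = 60              # assumed per-project floor, in seconds

@dataclass
class Result:
    text: str
    lifetime: int = MAX_LIFETIME

# Each node is a zero-argument callable, so unused branches are never run.
def literal(text):
    return lambda: Result(text)

def current_second(now_second):
    # Stand-in for {{CURRENTSECOND}}: valid for one second only.
    return lambda: Result(f"{now_second:02d}", lifetime=1)

def concat(*parts):
    def run():
        results = [part() for part in parts]
        return Result("".join(r.text for r in results),
                      min(r.lifetime for r in results))
    return run

def ifeq(a, b, then, otherwise):
    def run():
        left, right = a(), b()  # the compared parameters are always used
        equal = left.text.strip() == right.text.strip()
        branch = (then if equal else otherwise)()  # only the taken branch
        return Result(branch.text,
                      min(left.lifetime, right.lifetime, branch.lifetime))
    return run

def page_lifetime(result):
    # The final lifetime is never allowed below the project-wide minimum.
    return max(result.lifetime, MIN_LIFETIME)

# {{#ifeq:a|b| The current second is: {{CURRENTSECOND}}. |}}
unused = ifeq(literal("a"), literal("b"),
              concat(literal("The current second is: "),
                     current_second(7), literal(".")),
              literal(""))()

# {{#ifeq:{{CURRENTSECOND}}|00| I'm up to the minute! |}}
used = ifeq(current_second(0), literal("00"),
            literal("I'm up to the minute!"), literal(""))()
```

In the first case the short-lived branch is never evaluated, so the result keeps the maximum lifetime; in the second the compared parameter forces a one-second lifetime, which the project floor then raises to `MIN_LIFETIME`.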
Comment 22 Philippe Verdy 2010-07-17 07:34:10 UTC
> Ahh, but this would recurse too. Imagine if {{ {{int:foo}} }} called
> potentially up to N templates, each of which had another {{int:}}. Two deep
> would be N^2

Actually no! The UI language is always the same while rendering a page. So all dependencies are computed within the restricted set of the constant UI language. This will never multiply the number of rendered pages to maintain in the cache, but will only store different versions (and different lists of backlinks for each UI language) ONLY when that UI language is used.

What does this mean? Backlinks are all dropped from a page when it is edited, but as it is saved, it is always within the context of a specific UI language. If later a visitor comes that wants another UI language, it won't be present in the cache and the page will be regenerated.

You can still minimize the impact of #int: and of {{USERLANGUAGE}}: when evaluating and rendering pages, detect if one of them is used (use the same algorithm used for computing lifetimes of pages): first start evaluating as if the page was generated within a locale-neutral root language. Then if one of these #int: or {{USERLANGUAGE}} is being evaluated, set the language code in the result structure (that contains the generated text, the lifetime, and the UI language code).

After evaluating the whole page, you immediately see if the result is dependent on a UI language, and if so, you'll index the generated page in the cache as
  {{UILANGUAGE}}:{{FULLPAGENAME}}
and you'll drop:
  :{{FULLPAGENAME}}
from the cache.

Otherwise you'll index it as
  :{{FULLPAGENAME}}
and you'll drop all pages in the cache that match:
  *:{{FULLPAGENAME}}

With such an algorithm, you can significantly reduce the workload because a lot of pages or templates do not depend on the UI language.
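
The indexing scheme above can be sketched in Python as follows. This is only an illustration of the proposal: the cache is a plain dictionary keyed by (language, page), the empty string stands for the language-neutral ":{{FULLPAGENAME}}" entry, and `store` and `lookup` are hypothetical names.

```python
# Sketch of the proposed cache indexing by UI language.

def store(cache, page, html, ui_lang, depends_on_ui_lang):
    if depends_on_ui_lang:
        # Index under the UI language and drop the neutral entry.
        cache.pop(("", page), None)
        cache[(ui_lang, page)] = html
    else:
        # Index as neutral and drop every per-language entry for the page.
        for key in [k for k in cache if k[1] == page]:
            del cache[key]
        cache[("", page)] = html

def lookup(cache, page, ui_lang):
    # Prefer the version built for this UI language, else the neutral one.
    if (ui_lang, page) in cache:
        return cache[(ui_lang, page)]
    return cache.get(("", page))

cache = {}
store(cache, "Help:Contents", "neutral html", "en", depends_on_ui_lang=False)
neutral_hit = lookup(cache, "Help:Contents", "fr")

# The page is edited and now uses {{int:}}; a French user requests it.
store(cache, "Help:Contents", "french html", "fr", depends_on_ui_lang=True)
french_hit = lookup(cache, "Help:Contents", "fr")
german_miss = lookup(cache, "Help:Contents", "de")
```

A miss such as `german_miss` is what triggers on-demand rendering for a language that has not been requested yet.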

And you absolutely don't need to regenerate at the same time all the pages for all supported UI languages: generate them only on the fly, as they are effectively demanded by users (the first version that will be indexed will be the version built for the UI language used by the page editor saving it, but only when it is requested through a standard GET request after saving it and being redirected to it).

Additional backlinks (to templates or page names that depend on the UI language) can be added as well on the fly very long after the page has been saved, but of course, you could also opt for rendering the page immediately and saving the backlinks for the page as rendered in the default project language (my opinion is that it would complicate things for no benefit, and would increase the response time for the user saving a page while their UI language is not the default UI language of the project).
Comment 23 Niklas Laxström 2010-07-17 10:45:01 UTC
It's not about caching, it works already pretty well (at least if you exclude cache fragmentation), but it's about the link tables in the database which should not change depending on the user language.
Comment 24 Roan Kattouw 2010-07-17 13:55:11 UTC
(In reply to comment #21)
> I don't know why {{int:}} would corrupt the cache. In fact the cache just has
> to remember the language for which it generated the page (i.e. the value of the
> "uselang=" query parameter, or by default the language code inferred from the
> "Accept-Language:" header in the HTTP query).
> 
> Yes this means that pages may be cached multiple times, but only if they are
> visited by different users using different preferences for their language. All
> users will see a coherent page in their own language, the cache of prerendered
> pages will remain a FIFO and, instead of indexing just on "{{FULLPAGENAME}}",
> it will index on "{{USERLANGUAGE}}:{{FULLPAGENAME}}".
> 
We already do this for various parameters.

> Note that pages in the cache should also have a short lifetime, if they use any
> one of the builtin magics that access the current time:
> - if {{CURRENTMONTH}} or {{CURRENTYEAR}} is used, the lifetime should not
> exceed the current month or year on the server (but anyway, any page in the
> cache would probably be flushed before, to make room for other cached pages,
> or just because the server was upgraded)
> - if {{CURRENTDAY}} is used, the lifetime in cache should not exceed the
> current day on the server ;
> - if {{CURRENTHOUR}} is used, the lifetime in cache should not exceed the
> current hour on the server ;
> - if {{CURRENTMINUTE}} is used, the lifetime in cache should not exceed the
> current minute on the server ;
> - if {{CURRENTSECOND}} is used, the lifetime in cache theoretically should not
> exceed the current second on the server, but a minimum lifetime of pages in the
> cache may still be increased, to avoid too much work on frequently accessed
> pages
When any of these time-dependent magic words is used, the page is only cached for one hour. This was implemented ages ago.

> This means that builtin functions and magic keywords must be able to decrease
> the default lifetime of pages (but not be able to increase it), according to
> their semantics, and only after they have evaluated their parameters: their
> return value is not just a string, but a structure containing the parsed text
> and the maximum lifetime.
> 
They already are able to do so.
Comment 25 Platonides 2010-07-17 21:39:44 UTC
(In reply to comment #23)
> It's not about caching, (...) but it's about the link tables in the database 
> which should not change depending on the user language.

In fact, I think we fixed it by doing a second parse in the content language. So... RESOLVED FIXED?
Comment 26 Niklas Laxström 2010-07-17 21:43:56 UTC
(In reply to comment #25)
> In fact, I think we fixed it by doing a second parse in the content language.
> So... RESOLVED FIXED?

In which commit?
Comment 27 Platonides 2010-07-18 13:35:17 UTC
Sorry, it's not fixed.
Comment 28 Philippe Verdy 2010-07-19 17:03:58 UTC
(In reply to comment #23)
> It's not about caching, it works already pretty well (at least if you exclude cache
> fragmentation), but it's about the link tables in the database which should not
> change depending on the user language.

Yes but I addressed this already in the last paragraph of comment #22 (speaking about "backlinks").

And forcing all pages that use magic time-based keywords to use only a 1-hour lifetime is not the best option, when ONLY the day (or week, month, year) precision is used.

In addition, you still don't consider when a function or template parameter that depends on time (or server statistics like number of pages in categories or namespaces) will actually be used to generate the result.

Reread what I wrote about the parameters of #if/#ifexpr/#ifeq/#switch, where only the first parameters and the effective conditional result is important for the cacheability of the result which could be made much longer if a conditional output parameter is not used. And this may be applied as well within the evaluation of #expr/#ifexpr expressions containing the "a ? b : c" ternary operator (only the lifetime of "a", and of EITHER "b" OR "c" should restrict the cache lifetime of the result):

It is best to effectively track the lifetime of builtin functions and templates in order to get consistent results, but still a maximal cacheability of pages because it can save lots of resources on the servers (one hour is not enough in most cases when it could be even a full year or month, for heavily visited pages that are NOT modified, such as project and portal pages).

The choice of one hour seems quite arbitrary (even if it's good only as a PROJECT-SPECIFIC policy for the minimum lifetime to consider for the final rendered page).

- some projects will still want to accept 1-second lifetime for a few very active pages such as a few pages of discussions (or within very specific namespaces with restricted modification policies such as "mediawiki:", indirectly referenced by "{{int:}}" and that may also include server-wide notices), or pages giving status information about the server,

- and some projects will even consider that some pages should **never** be cached and should be rendered again each time they are requested, when they contain time-dependent or statistics-dependent information (the "mediawiki:" namespace is such a candidate namespace whose cacheability should be tracked as precisely as possible, but there are a few other "special:" pages that may benefit from a more precise cacheability).
Comment 29 Roan Kattouw 2010-07-19 19:40:53 UTC
(In reply to comment #28)
> (In reply to comment #23)
> > It's not about caching, it works already pretty well (at least if you exclude cache
> > fragmentation), but it's about the link tables in the database which should not
> > change depending on the user language.
> 
> Yes but I addressed this already in the last paragraph of comment #22 (speaking
> about "backlinks").
> 
> And forcing all pages that use magic time-based keywords to use only a 1-hour
> lifetime is not the best option, when ONLY the day (or week, month, year)
> precision is used.
> 
> In addition, you still don't consider when a function or template parameter
> that depends on time (or server statistics like number of pages in categories
> or namespaces) will actually be used to generate the result.
> 
> Reread what I wrote about the parameters of #if/#ifexpr/#ifeq/#switch, where
> only the first parameters and the effective conditional result is important for
> the cacheability of the result which could be made much longer if a conditional
> output parameter is not used. And this may be applied as well within the
> evaluation of #expr/#ifexpr expressions containing the "a ? b : c" ternary
> operator (only the lifetime of "a", and of EITHER "b" OR "c" should restrict
> the cache lifetime of the result):
> 
> It is best to effectively track the lifetime of builtin functions and templates
> in order to get consistent results, but still a maximal cacheability of pages
> because it can save lots of resources on the servers (one hour is not enough in
> most cases when it could be even a full year or month, for heavily visited
> pages that are NOT modified, such as project and portal pages).
> 
I'm not exactly sure how smart the mechanisms we currently have are, that is, whether they recognize a {{CURRENTMONTH}} in a branch that isn't taken.

> - some projects will still want to accept 1-second lifetime for a few very
> active pages such as a few pages of discussions (or within very specific
> namespaces with restricted modification policies such as "mediawiki:",
> indirectly referenced by "{{int:}}" and that may also include server-wide
> notices), or pages giving status information about the server,
> 
> - and some projects will even consider that some pages should **never** be
> cached and should be rendered again each time they are requested, when they
> contain time-dependent or statistics-dependent information (the "mediawiki:"
> namespace is such a candidate namespace whose cacheability should be tracked
> as precisely as possible, but there are a few other "special:" pages that may
> benefit from a more precise cacheability).
While some projects may indeed want uncacheable or 1-second lifetime (there's hardly a difference between these two) pages, I'm pretty sure the servers wouldn't like that very much. At Wikimedia, we err on the side of caching over correctness in quite a few situations.
Comment 30 Philippe Verdy 2010-07-19 20:02:51 UTC
> While some projects may indeed want uncacheable or 1-second lifetime (there's
> hardly a difference between these two) pages, I'm pretty sure the servers
> wouldn't like that very much. At Wikimedia, we err on the side of caching over
> correctness in quite a few situations.

You're right. But it's a matter of project-specific policy about their local use of caches for prerendered pages.

The policy will just have the effect of raising the computed final lifetime to a sustainable level for the most frequently visited pages, while some limited subsets of pages (most probably within specific namespaces with stronger modification policies) will need to be tracked with finer precision.

Note that heavily used and modified discussion pages could have a high precision as long as they are modifiable. Later they will be archived, or they may be edited in separate subpages (for example, one for each day), so that a container page will still avoid transcluding older pages or archived pages that will have a long lifetime (because they will no longer depend on the use of magic keywords like {{CURRENTSECOND}} or {{PAGESINCAT:1}}).

Those page archives, even if they remain modifiable, would be moved to a namespace where they have a longer cacheability, or where they will be frozen (by administratively blocking all later modifications), so that they won't be impacted by their smaller lifetime in caches.

Special statistics pages on the server, for example, can perfectly well have a very short lifetime, as their layout is easily fixed and these pages are not directly modifiable (so there would be no risk that an included template would cause the page to be rendered again each time the template is modified). Their HTML or wiki code will be built from stable PHP server scripts, which can't be modified without administrative access to the server, using special admin tools that will specifically flush their cached rendering, if it is stored.

If these special pages (with a low lifetime and constantly updated dynamically) are ever transcluded within user-modifiable pages, the policy applicable to these user pages will still raise the lifetime back to the minimum acceptable lifetime. So there will be no problem at all, even if those user pages do not reflect the most instantaneous state of these special pages.

But when the renderer rebuilds these user pages from their wiki source, it will get access to the instant state of these special pages, and will cache it for a longer time than what you would get by visiting these special pages directly. In fact, direct access to these short-lived special pages could even be denied to normal unprivileged users, even if these pages may be transcluded, or the server may choose to expose only their cached version to these users.
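
The clamping described in the two paragraphs above can be sketched in a few lines of Python. This is only an illustration under assumed names: effective_lifetime, the per-component lifetimes and the one-hour policy minimum are hypothetical, not existing MediaWiki settings.

```python
def effective_lifetime(component_lifetimes, policy_minimum):
    """Cache lifetime (in seconds) of a page built from several components.

    The page is limited by its shortest-lived component, but the
    project-specific policy raises that value to a minimum it can sustain.
    """
    # A page with no time-dependent component could be cached indefinitely.
    computed = min(component_lifetimes, default=float("inf"))
    return max(computed, policy_minimum)


# A user page transcluding a statistics special page (1 second) and an
# ordinary template (1 day), on a project whose assumed policy minimum
# for user pages is one hour.
user_page = effective_lifetime([1, 86400], policy_minimum=3600)

# The statistics page itself, under an assumed policy allowing 1 second.
stats_page = effective_lifetime([1], policy_minimum=1)
```

With these assumed numbers the user page is cached for an hour even though one of its components changes every second, while the special page keeps its short lifetime when viewed directly.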

The same can be applied to stable (patrolled) versions of pages, which could benefit a lot from longer lifetimes in the cache, as they live within a stable and unmodifiable update id which does not need to reflect any current state of the server.

Some wiki projects only work with stable versioned pages, and edits are only visible to selected patrolling users (who can validate a version), or only to the users who performed these updates (so these newer updates don't even have to be cached).
Comment 31 Philippe Verdy 2010-07-19 20:15:32 UTC
Another note about patrolled versioned pages: as these pages were committed at a very precisely known time, the dynamic values of magic keywords used in them should also be stored:
- the timestamp of the version is already stored, just use it as the source of time instead of the current time on the server: the view should NEVER change depending on the time when the page is rebuilt (if it was flushed out from the cache, or if the server is restarted because of cache corruption).
- the magic values of other server statistics could be stored as well with the version, in a list of properties attached to the stored and timestamped version, for example the values of each used {{PAGESINCAT:...}} when the page was first submitted.
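
As a rough Python sketch of the two points above: time-based magic words are resolved from the stored version timestamp, and other values from properties saved with the version. StoredRevision, resolve_magic and the sample values are hypothetical names chosen for illustration, not an existing MediaWiki API.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class StoredRevision:
    timestamp: datetime  # when the version was committed
    magic_values: dict = field(default_factory=dict)  # values captured at that time


def resolve_magic(word: str, revision: StoredRevision) -> str:
    # Time-based magic words read the version timestamp, never the server
    # clock, so re-rendering later gives the same output.
    if word == "CURRENTYEAR":
        return str(revision.timestamp.year)
    if word == "CURRENTMONTH":
        return f"{revision.timestamp.month:02d}"
    # Anything else must have been captured when the version was stored.
    return revision.magic_values[word]


rev = StoredRevision(
    timestamp=datetime(2010, 7, 19, 20, 15, 32, tzinfo=timezone.utc),
    magic_values={"PAGESINCAT:Example": "42"},  # made-up sample value
)
month = resolve_magic("CURRENTMONTH", rev)
count = resolve_magic("PAGESINCAT:Example", rev)
```

Because nothing here depends on the current time, a rendering rebuilt after a cache flush would be identical to the first one.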
Comment 32 Niklas Laxström 2010-09-11 16:33:55 UTC
Taking myself off as an assignee. We can't break existing functionality without making lots of people unhappy. There are some other ways of reaching multilingualism (the Translate extension has some solutions), but more is needed.
Comment 33 MC10 2010-11-03 03:12:28 UTC
If we break existing functionality but create another solution that creates the same results, we could have a steward replace templates on wikis (or any user for templates that are not admin-protected) that use the outdated {{int:}} code.
Comment 34 Platonides 2010-12-28 21:50:08 UTC
Fixed in r79122
Comment 35 Tim Starling 2011-03-14 04:32:24 UTC
Reopening. Fix reverted in r83868. The proposed fix causes bug 27891, which is more severe than this one.
Comment 36 p858snake 2011-04-30 00:10:15 UTC
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Comment 37 Platonides 2011-06-24 16:36:59 UTC
Recommitted in r89706
