Last modified: 2014-10-05 23:56:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T14019, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 12019 - ifexist function uses pagelinks table in lieu of better options
ifexist function uses pagelinks table in lieu of better options
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
ParserFunctions (Other open bugs)
unspecified
All All
: Low major with 6 votes (vote)
: ---
Assigned To: Nobody - You can work on this!
: newparser
: 15735 71636 71637 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2007-11-18 10:13 UTC by Chris
Modified: 2014-10-05 23:56 UTC (History)
21 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Chris 2007-11-18 10:13:06 UTC
this problem still exists as of today (2007-11-18).

i've traced it down to the line $parser->mOutput->addLink( $title, $id ); that was added by tstarling in revision 19892 on Mon Feb 12 10:24:23 2007 UTC with the reason "Register a link on #ifexist. Otherwise this breaks cache coherency.."

i can find no logical reasoning for this change. all it is doing is checking if the target exists, and outputting one of two user supplied text blocks. and that is all it should do. it is not making a link to target, nor does it display a link to target anywhere in the scope of this functions code so why does the target need to be added to the link list?

granted, i do not have a complete grasp of the internals of the parser nor the cache systems, but the feedback noise on special:whatlinkshere renders the page useless.
Comment 1 David 2007-11-20 18:29:45 UTC
According to the source code and comments, when the static class method "Title::newFromText()" creates a new Title object, that object is added to a private map so that if multiple links to the same page exist, it doesn't have to go through the process of creating another Title object.  It can just retrieve one from the cache, which saves a bit of time.

I'm guessing that every Title object in the map must also be in the underlying collection accessed by the method "$parser->mOutput->addLink()" otherwise there's processing/page-rendering issues.

The best solution of which I can think is to create a new static method in the Title class whose only purpose is to check whether or not a page exists based on a given text, and have this check done without fiddling with the underlying Title map or creating a Title object:

  Title::exists($text, $defaultNamespace = NS_MAIN)

(Such a method may already exist.  If so, then parser extension needs to be changed so that said method is called instead of "Title::newFromText().")
Comment 2 Thomas Bleher 2008-03-26 18:27:28 UTC
(In reply to comment #0)
> i've traced it down to the line $parser->mOutput->addLink( $title, $id ); that
> was added by tstarling in revision 19892 on Mon Feb 12 10:24:23 2007 UTC with
> the reason "Register a link on #ifexist. Otherwise this breaks cache
> coherency.."
> 
> i can find no logical reasoning for this change. all it is doing is checking if
> the target exists, and outputting one of two user supplied text blocks. and
> that is all it should do. it is not making a link to target, nor does it
> display a link to target anywhere in the scope of this functions code so why
> does the target need to be added to the link list?

I guess it's there so that the page containing the #ifexist check can be purged when the target page gets created.
Comment 3 Tim Starling 2008-09-26 15:44:03 UTC
See bug 10857.
Comment 4 MZMcBride 2008-09-26 15:54:32 UTC
I've generalized the summary for this bug following the resolution of bug 15731.

#ifexist can generate links like [[Category:Foobar]] but it places the rows in the pagelinks table instead of using of a more appropriate table (in this case it would be the categorylinks table).

This pollutes the results of cross namespace link searches and other tools.
Comment 5 Aryeh Gregor (not reading bugmail, please e-mail directly) 2008-09-26 16:06:55 UTC
Comment 2 is correct.  It needs to go in some links table or other so the page can be purged if necessary.  I'm pretty sure that the Title cache referred to in comment 1 is irrelevant.  I guess we could create an ifexistlinks table, but that seems pretty stupid.  Adding entries to pagelinks seems as correct to me as anything.

(In reply to comment #4)
> #ifexist can generate links like [[Category:Foobar]] but it places the rows in
> the pagelinks table instead of using of a more appropriate table (in this case
> it would be the categorylinks table).

Why is the categorylinks table more appropriate?  That would be completely broken, it would show up on category pages just for existence checking.  Likewise templatelinks would cause it to show up in "templates used on this page".  pagelinks is the generic one and should be used for random stuff like this.

> This pollutes the results of cross namespace link searches and other tools.

Putting it in the categorylinks table would effectively categorize the page, which is much less correct.  Putting it in pagelinks doesn't disrupt anything in the software proper, except arguably WhatLinksHere (but only arguably: it does kind of link there).  If people are assuming that pagelinks entries correspond to actual clickable links on the page, yes, this breaks that assumption.

We're both talking about the same thing, right?  {{#ifexist:Category:Foobar|...}} adds a pagelinks entry for [[Category:Foobar]].  {{#ifexist:Somerandompage|[[Category:Foobar]]}} adds a pagelinks entry for [[Somerandompage]], and (maybe) a categorylinks entry for [[Category:Foobar]].  Yes?
Comment 6 Chris 2008-12-28 05:15:56 UTC
>We're both talking about the same thing, right? 
> {{#ifexist:Category:Foobar|...}} adds a pagelinks entry for [[Category:Foobar]].
yes

->exists() on an Article causes this (i assume this is what the #ifexist parser function does too, have not looked)
Comment 7 Splarka 2009-07-19 06:22:42 UTC
*** Bug 15735 has been marked as a duplicate of this bug. ***
Comment 8 Mark A. Hershberger 2011-04-12 16:19:49 UTC
Punting this to the new parser Brion has under development.
Comment 9 Platonides 2011-11-27 16:38:18 UTC
This has little to do with the parser. It's about the way that information is stored.
I suppose we could add an ENUM to pagelinks to state if it's a link check or a ifexists, but pagelinks does seem the right table .
Comment 10 CzechOut 2012-06-22 00:11:29 UTC
This bug is really annoying if your userbase uses Special:WantedPages to guide their editing and you've got templates which liberally use #ifexist.  Has any progress been made in resolving it?
Comment 11 MZMcBride 2013-02-06 09:33:58 UTC
Adding a pl_status column or similar to the pagelinks table seems like the path forward for this bug.

I thought alterations to Wikimedia wikis' pagelinks tables were off-limits due to their size, but Tim assures me that after recent alterations to the revision and categorylinks tables, alterations to pagelinks are doable.
Comment 12 Tyler Romeo 2014-06-19 20:38:56 UTC
I'm wondering, are there any issues with just merging all the link tables? It would allow extensions to use their own custom link types rather than having this issue.
Comment 13 Jan Zerebecki 2014-06-19 21:04:58 UTC
Do not try to merge all link types to fix this bug. This is really a subtype of pagelink instead of just another link type like templatelinks. The previous idea to just add a column to pagelinks is better.

Not sure if other link types also share similarities regaring parser cache invalidation, life cycle, etc. But trying that would make this change much more bigger, complicated and harder to validate. Also such a schema change would not be forward compatible, which means a maintenance window during which no one could edit.
Comment 14 Tyler Romeo 2014-06-19 22:02:28 UTC
(In reply to Jan Zerebecki from comment #13)
> Do not try to merge all link types to fix this bug. This is really a subtype
> of pagelink instead of just another link type like templatelinks. The
> previous idea to just add a column to pagelinks is better.

Well it wouldn't be just to fix this bug. I'm sure there are other extensions that would like to record their own types of links.

> Also such a schema change
> would not be forward compatible, which means a maintenance window during
> which no one could edit.

This is probably not true. I'm sure it's possible to make such a schema change without interrupting availability. I'm just speculating, but assuming that statement-based replication is used, you can bring down a slave server, combine the tables into one, set up views as replacements for the deleted tables, restart the slave, and when it is up to date, make the slave the new master and replicate the schema changes across the remaining servers in the cluster.
Comment 15 Jan Zerebecki 2014-06-20 10:45:30 UTC
Yes you are right, there are ways to do that without interrupting availability. Another possible one is to first create the new table, then change the code to deal with both tables but move data to the new table and later remove the understanding of the old empty tables. However this direction is IMHO more complicated than necessary.

For the new ifexist pagelink type to only be known by the extension, everything that uses pagelinks needs to call back into the extension. Do you know the complete list?
Comment 16 Tyler Romeo 2014-06-20 13:34:34 UTC
Well just to be clear, if we're going to be making any schema changes, we are not just going to randomly add a link type field to the pagelinks table, primarily because core has no use for such field. That was the reason I suggested the combined table approach.

If you want to solve this bug without a schema change, your only other option is to make a separate extension table for ParserFunctions.
Comment 17 Jan Zerebecki 2014-06-21 14:24:03 UTC
This wouldn't be randomly adding a link type field. It would be adding support for pagelinks that do not count as a wanted page to core, even though core would not create such pagelinks on itself.

Creating an extension table for these links poses the same question: Which functionality (core or in Extensions) that currently obtains its information (possibly indirectly) from the pagelinks table would then need hooks in core to call into this new extension table?
Comment 18 Jackmcbarn 2014-10-04 14:02:46 UTC
*** Bug 71636 has been marked as a duplicate of this bug. ***
Comment 19 Brad Jorsch 2014-10-05 20:07:04 UTC
*** Bug 71637 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links