Last modified: 2014-09-02 17:12:29 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T55945, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 53945 - A cheap way to get information from page_props table for all links on page (similar to LinkCache?) needed
Product: MediaWiki
Classification: Unclassified
Component: Database
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: performance
Depends on:
Blocks: 17212 8339
Reported: 2013-09-09 14:51 UTC by Bartosz Dziewoński
Modified: 2014-09-02 17:12 UTC
CC List: 11 users



Description Bartosz Dziewoński 2013-09-09 14:51:16 UTC
A cheap way to get information from the page_props table for all links on a page is needed. Two use cases are the DISPLAYTITLE: magic word (bug 17212) and marking disambiguation pages using the Disambiguator extension (bug 8339).

At a glance it seems like it would be possible to reuse LinkCache + LinkBatch for both purposes, adding one indexed query per batch per prop, which sounds fast enough to me. (We'd also need a hook in both to allow extensions to do this.)
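To make the "one indexed query per batch per prop" idea concrete, here is a minimal sketch in Python with an in-memory SQLite table standing in for page_props. It is not MediaWiki code: the function name fetch_prop and the sample rows are made up for illustration, and only the column names (pp_page, pp_propname, pp_value) follow the real table.

```python
import sqlite3

def fetch_prop(conn, page_ids, propname):
    """Return {page_id: value} for one property across a whole batch of pages."""
    ids = list(page_ids)
    if not ids:
        return {}
    placeholders = ",".join("?" for _ in ids)
    sql = (
        "SELECT pp_page, pp_value FROM page_props "
        f"WHERE pp_propname = ? AND pp_page IN ({placeholders})"
    )
    # One query per (batch, property) pair, served by the primary key below.
    return dict(conn.execute(sql, [propname, *ids]).fetchall())

# Toy stand-in for the page_props table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_props ("
    "pp_page INTEGER, pp_propname TEXT, pp_value TEXT, "
    "PRIMARY KEY (pp_page, pp_propname))"
)
conn.executemany(
    "INSERT INTO page_props VALUES (?, ?, ?)",
    [
        (1, "displaytitle", "iPod"),
        (2, "disambiguation", ""),
        (3, "displaytitle", "eBay"),
    ],
)

batch = [1, 2, 3, 4]  # page IDs of all links collected for one batch
display_titles = fetch_prop(conn, batch, "displaytitle")
disambigs = fetch_prop(conn, batch, "disambiguation")
```

Pages without the property simply do not appear in the result, so callers can treat a missing key as "not set".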

The results could be stored in mGoodLinkFields in LinkCache, with names prefixed with "prop_" or something to distinguish them from the information from the page table that's in there right now. (This might require more hooks or not, depending on how we do them and how powerful we make them - if the hook for LinkCache allowed any queries, we'd need a separate one; if it just allowed stating prop names, it'd do here as well.)
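A rough Python model of the prefixing idea, assuming a per-page dictionary of fields similar in spirit to mGoodLinkFields. The keying by page ID, the field names "length" and "redirect", and the helper merge_props are all assumptions made for this sketch, not the actual LinkCache layout.

```python
PROP_PREFIX = "prop_"  # keeps page_props data apart from page-table fields

def merge_props(good_link_fields, propname, values_by_page):
    """Attach one property's batch result to the existing per-page records."""
    key = PROP_PREFIX + propname
    for page_id, value in values_by_page.items():
        # Only annotate pages the cache already knows about.
        if page_id in good_link_fields:
            good_link_fields[page_id][key] = value

# Hypothetical cache contents: fields that came from the page table.
good_link_fields = {
    1: {"length": 5120, "redirect": 0},
    2: {"length": 300, "redirect": 0},
}

merge_props(good_link_fields, "displaytitle", {1: "iPod"})
merge_props(good_link_fields, "disambiguation", {2: "", 7: ""})
```

With the prefix, a property named "length" would land in "prop_length" and could not overwrite the page-table field of the same name.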

This really seems implementable and sort of easy. Thoughts?

(CC-ing people involved in current patches to those two bugs and resident performance magicians.)
Comment 1 Tim Starling 2013-09-10 00:31:11 UTC
LinkCache is a kind of cache of last resort for parser linking, and could be removed without much performance impact. I think marking of links to disambiguation pages (bug 8339) should be handled in the same way as stub links. LinkHolderArray::replaceInternal() should do a batch query, and the result should be passed to Linker. There is an existing GetLinkColours hook, maybe that could be used.
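As a sketch of how a GetLinkColours-style handler could mark disambiguation links, the following Python models the data flow: a map of page IDs to titles goes in, one batch query runs against page_props, and a CSS class is appended for each match. The real hook is PHP and its exact signature should be checked against the MediaWiki hook documentation; the function name, the "mw-disambig" class and the sample data here are illustrative assumptions.

```python
import sqlite3

def mark_disambig_links(conn, page_id_to_dbkey, colours):
    """Append a CSS class for every linked page that has the 'disambiguation' prop."""
    ids = list(page_id_to_dbkey)
    if not ids:
        return
    placeholders = ",".join("?" for _ in ids)
    rows = conn.execute(
        "SELECT pp_page FROM page_props "
        f"WHERE pp_propname = 'disambiguation' AND pp_page IN ({placeholders})",
        ids,
    )
    for (page_id,) in rows:
        dbkey = page_id_to_dbkey[page_id]
        existing = colours.get(dbkey, "")
        # Keep any class already assigned (for example a stub class).
        colours[dbkey] = (existing + " mw-disambig").strip()

# Toy stand-in for the page_props table (illustrative data only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE page_props ("
    "pp_page INTEGER, pp_propname TEXT, pp_value TEXT, "
    "PRIMARY KEY (pp_page, pp_propname))"
)
conn.executemany(
    "INSERT INTO page_props VALUES (?, ?, ?)",
    [(10, "disambiguation", ""), (11, "disambiguation", ""), (12, "displaytitle", "plain")],
)

page_id_to_dbkey = {10: "Mercury", 11: "Stub_page", 12: "Plain"}
colours = {"Mercury": "", "Stub_page": "stub"}
mark_disambig_links(conn, page_id_to_dbkey, colours)
```

Because the lookup happens once per batch of links during the parse, nothing needs to be kept in a long-lived cache afterwards.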

Ideally, the fixme comment on line 349 of LinkHolderArray.php should be fixed, i.e. clarifying the role of LinkCache and removing it from the data flow path from Parser to Linker.

The issue with storing page_props data in a global-lifetime object like LinkCache is that it would use a lot of memory. LinkCache has no limit on the number of titles it holds, and it is difficult to implement such a limit due to the way it has historically been used. LinkHolderArray, by contrast, has a limited title count and its lifetime is limited to the length of the parse operation.

Similarly, bug 17212 should be done with a batch query in CategoryViewer, similar to the treatment of the category table (cat_subcats etc.). It shouldn't be necessary to modify Linker in that case.
Comment 2 Bartosz Dziewoński 2014-09-02 17:09:59 UTC
Today I learned of the GetLinkColours hook. That's good enough for me, and it's unclear how this bug could be fixed.
