Last modified: 2010-05-04 16:14:40 UTC
I was interested in having alphabetical order mode so I have developed it in the extensions by adding this two three code lines at line 196: <source lang=php> case 'alphabetical': $sOrderMethod = 'alphabetical'; break; </source> and this code at line 340 (343 if you have added the lines above): <source lang=php> else if ('alphabetical' == $sOrderMethod) $sSqlWhere .= ' ORDER BY page_title '; </source> I tested the code and it works well. Do you think it is possible add this code to the stable revision?
Requested feedback from Amgine, and IlyaHaykinson on their wikinews talk pages.
Sorry about the delay. The software indicated from /trunk/extensions/DynamicPageList is not actually DynamicPageList, but is DPL2. DPL2 used a different model, so I'm not sure I can comment with any value. I do not believe MySQL will 100% accurately sort on page_title in UTF8, but that may not be true: I haven't checked. Guessing off the top of my head, there should be no probem with this "patch".
Thank you for your feedback, Amgine. I think we *are* actually talking about the "intersection" extension, as the URL field contains http://www.mediawiki.org/wiki/Extension:Intersection. Does this change your opinion? To the reporter: please provide a proper diff, to avoid any ambiguity.
Sure, I don't see any problem with this change.
Added in r39398. None of the features of this extension are documented on http://www.mediawiki.org/wiki/Extension:Intersection. Please update it.
Reverted in r39617. This patch will pretty obviously fail in combination with various other query options; it adds an ORDER BY to the where clause in the middle of other options. Further, an order by page_title can't be indexed; it probably already can't be indexed, but it never helps. ;) Any comparisons of query plans with and without this option?
Brion: Ilya's concatenation routine properly places the ORDER BY. <curious look> Do you currently index on page_touched and c1.cl_timestamp? those are used by the current version for the current ordering options. Comparison: with this option, DPL can alpha sort list. Without this option, status quo. Siebrand: I don't have the time to do a proper patch including line numbers, but the following may just work: replace: else if ('ordermethod' == $sType) { switch ($sArg) { case 'lastedit': $sOrderMethod = 'lastedit'; break; case 'categoryadd': default: $sOrderMethod = 'categoryadd'; break; } } with: else if ('ordermethod' == $sType) { switch ($sArg) { case 'lastedit': $sOrderMethod = 'lastedit'; break; case 'lastedit': $sOrderMethod = 'alphabetic'; break; case 'categoryadd': default: $sOrderMethod = 'categoryadd'; break; } } replace: if ('lastedit' == $sOrderMethod) $sSqlWhere .= ' ORDER BY page_touched '; else $sSqlWhere .= ' ORDER BY c1.cl_timestamp '; with: if ('lastedit' == $sOrderMethod) $sSqlWhere .= ' ORDER BY page_touched '; elseif ('alphabetic' == $sOrderMethod) $sSqlWhere .= ' ORDER BY page_title '; else $sSqlWhere .= ' ORDER BY c1.cl_timestamp '; [untested, off-the-cuff, looksrighttome, YMMV, etc.]
bah. Forgot to fix the case statement to be: case 'alphabetic':
Created attachment 5229 [details] Diff patch for alphabetical order in DPL
Ramac created a new patch. Now the code is in the right place (brion reverted it because an "else if" was misplaced). I reviewed it and it seems ok. Can you apply it please?
Quoting Brion (comment #6): > Further, an order by page_title can't be indexed; it probably already can't be > indexed, but it never helps. ;) Any comparisons of query plans with and without > this option? In other words, could you post the output of "EXPLAIN SELECT ..." without the ORDER BY (status quo) and with the ORDER BY (with new feature), on some realistic data, so I can be sure I'm not going to cause the servers to melt even more than they otherwise would on DynamicPageList?
It already sorts by last edit, or by category join. This merely adds the option of sorting by title alphabetic instead.
This is a seperate field, so it probably is wise to do an EXPLAIN. It would also be different because it's a different datatype (text field rather than number)
(In reply to comment #12) > It already sorts by last edit, or by category join. This merely adds the option > of sorting by title alphabetic instead. Sorting by a different field can radically alter performance. I don't use DPL and so can't easily test this myself, but if you guys assure me it's correct, and it looks correct, and you give me the performance info I asked for and it's sane, I'd be willing to check it in for you. Try running the relevant queries with EXPLAIN on the toolserver. Or even just give me two sample queries that I can test myself. Because I don't have a lot of spare time to go digging around and figuring out unfamiliar code for an extension I don't use anyway.
Please follow up, or this patch will start gathering dust...
This extension executes only a query. These queries are exectued as now (without patch): Wikitext: <dynamicpagelist> category=Test </dynamicpagelist> Executed query: SELECT page_namespace, page_title, c1.cl_timestamp FROM `mw_page` INNER JOIN `mw_categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY c1.cl_timestamp DESC Wikitext: <dynamicpagelist> category=Test ordermethod=lastedit </dynamicpagelist> Executed query: SELECT page_namespace, page_title, c1.cl_timestamp FROM `mw_page` INNER JOIN `mw_categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY page_touched DESC The new code (with the patch) would execute this query: Wikitext: <dynamicpagelist> category=Test </dynamicpagelist> Executed query: SELECT page_namespace, page_title, c1.cl_timestamp FROM `mw_page` INNER JOIN `mw_categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY page_title DESC Hope this is what you need!
Created attachment 5703 [details] Patch for alphabetical order in DPL
Ramac, if I understood comments #11 and #14 by Aryeh Gregor correctly, he is asking you to run the code with and without the change (i.e. with the current sorting by date and with the new sorting by name) on "realistic data" and report the performance. I would assume 1000 items would be more than enough for such a test.
EXPLAINs on enwikinews_p from toolserver: mysql> EXPLAIN SELECT page_namespace, page_title, c1.cl_timestamp FROM `page` INNER JOIN `categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY c1.cl_timestamp DESC; +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+--------------------------+ | 1 | SIMPLE | categorylinks | ref | cl_from,cl_timestamp,cl_sortkey | cl_timestamp | 257 | const | 1 | Using where; Using index | | 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwikinews.categorylinks.cl_from | 1 | Using where | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+--------------------------+ 2 rows in set (0.03 sec) mysql> EXPLAIN SELECT page_namespace, page_title, c1.cl_timestamp FROM `page` INNER JOIN `categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY page_touched DESC; +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ | 1 | SIMPLE | categorylinks | ref | cl_from,cl_timestamp,cl_sortkey | cl_timestamp | 257 | const | 1 | Using where; Using index; Using temporary; Using filesort | | 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwikinews.categorylinks.cl_from | 1 | Using where | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ 2 rows in set (0.00 sec) mysql> EXPLAIN SELECT page_namespace, page_title, c1.cl_timestamp FROM `page` INNER JOIN `categorylinks` AS c1 ON page_id = c1.cl_from AND c1.cl_to='Test' WHERE 1=1 AND page_is_redirect = 0 ORDER BY page_title DESC; +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ | 1 | SIMPLE | categorylinks | ref | cl_from,cl_timestamp,cl_sortkey | cl_timestamp | 257 | const | 1 | Using where; Using index; Using temporary; Using filesort | | 1 | SIMPLE | page | eq_ref | PRIMARY | PRIMARY | 4 | enwikinews.categorylinks.cl_from | 1 | Using where | +----+-------------+---------------+--------+---------------------------------+--------------+---------+----------------------------------+------+-----------------------------------------------------------+ 2 rows in set (0.00 sec) If the middle query is actually being run, the last one shouldn't be any worse, I suppose. But it still looks bad. I'm very reluctant to commit this, especially since I've never used the extension and am not familiar with its code. Sorry.
Aryeh Gregor: I'm afraid I cannot follow. Why does the last query look bad? (Isn't the result to look at "0.00 sec"? Or is it because of the "Using temporary; Using filesort"?) Regarding your question: Yes, the DynamicPageList extension currently installed on en.wikibooks allows sorting by "lastedit" (that should be "page_touched") and "categoryadd" (should be "c1.cl_timestamp"). (Some background for my interest in this change here: on en.wikibooks it was decided to replace the primary way of access to the books from "bookshelves" to "subject" pages. The idea is to generate these subject pages automatically using dynamic page lists. Of course, presenting basically unsorted lists of books doesn't make a lot of sense; thus, for en.wikibooks it would be crucial to have a way of sorting these dynamic page lists alphabetically.)
It takes 0.00 seconds because it's only EXPLAINing the query, not running it. And yes, the temporary/filesort is the worrisome part.
@Martin: yes, DPL is used for dynamically generated bookshelves; we would like to use this feature also on it.wikibooks. @Aryeh: is there any way to fix it?
Probably somehow.
At pt.wikibooks this change would be very useful too. See also the messages of this thread at wikitech: http://lists.wikimedia.org/pipermail/wikitech-l/2009-September/045080.html (or via gmane at http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/45386/focus=45406) Helder
Marking this FIXED as of r60800.
Is this feature currently avalilable for Wikimedia projects like pt.wikibooks? Helder
(In reply to comment #26) > Is this feature currently avalilable for Wikimedia projects like pt.wikibooks? > > Helder It is currently live at en.wikibooks.
Cool! Now it is working at pt.wb too... (although it the parameter was already in some lists, it was necessary to add "order = ascending" before it works o.o) Helder