Last modified: 2014-11-19 10:19:57 UTC
Suppose category:A redirects to category:B. Would it be feasible to automatically move all articles placed in cat:A into cat:B instead? Alternatively, would it be possible to create a special page that lists all categories that are redirects, so that a bot can do the moving?
I've been pondering for a while how to deal with this. The cleanest solution (automatically reassigning the pages) is impossible unless/until we move category information out of the page content and into a separately editable "metadata" display.
The best alternative I can think of is to follow redirects from category to category when displaying the category page. For instance, if Category:Foo is moved to Category:Bar, and you then view Category:Bar, pages containing "[[Category:Foo]]" should show up in the "articles in this category" list, perhaps with a footnote marker explained below as "Via alternate name ''Category:Foo''" so they can be distinguished for maintenance etc. On the technical front, this would mainly consist of recursively following redirects backwards, like Special:Whatlinkshere does, only using the categorylinks table.
(In reply to comment #0)
> Alternatively, would it be possible to create a special page that lists all
> categories that are redirects, so that a bot can do the moving?
That would certainly be pretty easy, I'd have thought.
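The display-time idea in this comment (walk redirects backwards and read members from categorylinks) can be sketched roughly as follows. This is only an illustration in Python with an in-memory SQLite database, not MediaWiki code: `categorylinks` is reduced to two columns, and the `category_redirect` table and the `members_with_aliases` helper are hypothetical names invented for the example.

```python
import sqlite3

def members_with_aliases(conn, category):
    """Return (page_id, via) pairs for a category page.

    `via` is None for direct members, or the title of the redirecting
    category through which the page was reached (the "footnote marker").
    """
    seen = {category}
    queue = [category]
    rows = []
    while queue:
        title = queue.pop()
        via = None if title == category else title
        for (page_id,) in conn.execute(
            "SELECT cl_from FROM categorylinks WHERE cl_to = ?", (title,)
        ).fetchall():
            rows.append((page_id, via))
        # Follow redirects backwards: which categories redirect to `title`?
        for (alias,) in conn.execute(
            "SELECT from_title FROM category_redirect WHERE to_title = ?", (title,)
        ).fetchall():
            if alias not in seen:  # guards against redirect loops
                seen.add(alias)
                queue.append(alias)
    return sorted(rows, key=lambda r: (r[0], r[1] or ""))

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
CREATE TABLE category_redirect (from_title TEXT, to_title TEXT);  -- hypothetical
INSERT INTO categorylinks VALUES (1, 'Bar'), (2, 'Foo'), (3, 'Fu');
INSERT INTO category_redirect VALUES ('Foo', 'Bar'), ('Fu', 'Foo');
""")
result = members_with_aliases(conn, "Bar")
print(result)
```

Pages 2 and 3 keep their original `cl_to` values, so the listing can say which alternate name each one came in through.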
Out of curiosity - is moving category info into metadata a planned change? Would this work... if you save a page, some lookups are already made for e.g. substing in templates. Would it be feasible to check, when a page is saved, whether it is in any categories, and whether those categories are redirects, and if so, to change them accordingly?
(In reply to comment #2)
> Out of curiosity - is moving category info into metadata a planned change?
Well, category membership is already stored in a special 'categorylinks' table in the database, so it's not unreasonable to consider presenting the current content of this to the user, rather than also storing it as part of the article text. But there are no specific plans to do this any time soon, afaik; it's just a thought that gets floated occasionally.
> Would this work... if you save a page, some lookups are already made for e.g.
> substing in templates. Would it be feasible to check, when a page is saved,
> whether it is in any categories, and whether those categories are redirects, and
> if so, to change them accordingly?
I guess that would be kind of possible, but I'm not keen on the idea of actual text in an article changing without the user's consent, as it were. Hence the need to present this metadata as separate from the main content: if there were a box labelled "Current category memberships", it could be changed either by hand, or by the software, as appropriate. OTOH, having the categories in the article text has the advantage that changing them shows up in the article history, watchlists, etc...
> I guess that would be kind of possible, but I'm not keen on the idea of actual
> text in an article changing without the user's consent, as it were.
It's not dissimilar from doing {{subst:sometemplate}} when sometemplate in fact redirects to another template. This is mainly intended for such things as "American actors" and "United States actors": both categories are the same, but many people don't know that, so they put articles in either or both (which has the undesirable side effect that neither category is in fact complete). Having a true cat redirect would prevent that.
(By the way, wouldn't having categories as metadata allow for cross-secting categories? e.g. list all articles that are in both cat:A and cat:B? That would be terrific)
(In reply to comment #4)
> > I guess that would be kind of possible, but I'm not keen on the idea of actual
> > text in an article changing without the user's consent, as it were.
>
> It's not dissimilar from doing {{subst:sometemplate}} when sometemplate in fact
> redirects to another template.
It's not really the same at all: in that case, the user has specifically requested that their text be converted on save into something else; the only relevance of the redirect is that the content that goes in isn't from exactly the same title they typed. Changing a category in a page on edit, however, might happen without the user even *touching* the list of categories (the category might have become a redirect since the page was last edited), and yet the user will be credited with having made that change to the text. [And what if someone vandalises a category to be a redirect somewhere inappropriate; suddenly, innocent editors appear to be vandalising the pages in that category...]
Generally, this is a kind of voodoo that should be minimised, because users who don't know why it's happening will become confused and frustrated: "I typed X, but when I saved the page it said Y instead; what's going on?"
Don't get me wrong, I absolutely agree that some solution to this would be very useful, as your example demonstrates; I'm just saying that automatically changing the content of an article because something's changed elsewhere is a departure from the current data model. If we're to go down that route, it needs to be in a more bot-like form, which makes an explicit edit to all affected articles with an appropriate summary. Unless, as I say, the list of categories is removed from the article's content and placed in a separate box pulled straight from the database, which can be dynamically updated by both users and system operations.
> (by the way wouldn't having categories as metadata allow for cross-secting
> categories? e.g. list all articles that are in both cat:A and cat:B? That would
> be terrific)
Perhaps I haven't put this very clearly: categories are already *stored* as metadata (there's a separate table in the database that basically stores {page, category} pairs); but currently you can't edit that metadata directly, only by changing the content of the article. Indeed, the <DynamicPageList> extension used on Wikinews (see [[meta:DynamicPageList]]) can already do the kind of lists you're talking about; it's just that it's potentially very db-intensive to allow unbounded lists like this, I think. What would become possible is an interface for editing categories from the other side, as it were: edit the list of "pages in this category", and those pages would change automagically.
I'd also like redirects for categories, but I doubt a smart solution will be possible while categories are part of the article text. Until then a simple special page would help:
[[Special:CategoriesThatAreRedirects]] - lists all categories that are redirects:
SELECT page_title FROM page WHERE page_is_redirect=1 AND page_namespace=14;
So the user can find categories that are redirects, get their articles, change the category tags and delete the category afterwards. But what if the next user creates the deleted category again? So you had better not delete the category that is a redirect, but use another new special page:
[[Special:CategoriesThatAreRedirectsButAreNotEmpty]] - lists all categories that are redirects but have pages in them:
SELECT page_title FROM page WHERE page_is_redirect=1 AND page_namespace=14 AND EXISTS (SELECT * FROM categorylinks WHERE cl_to=page_title);
The second special page is also fast for most Wikipedias I tested (it took 22 seconds at the Toolserver for the first call at enwiki_p, but following calls are faster because of some kind of caching I don't know about). At the moment there are 580 category redirects and 171 of them have pages in them. I'd call them "lost categories", because if a user clicks on one of them in an article he won't find the article in the category he is directed to. There are 1348 article pages in the English Wikipedia that are partly hidden this way:
SELECT cl_from FROM categorylinks WHERE EXISTS (SELECT page_title FROM page WHERE page_title=cl_to AND page_is_redirect=1 AND page_namespace=14 AND EXISTS (SELECT * FROM categorylinks AS C WHERE C.cl_to=page_title));
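For anyone who wants to try the queries from this comment without Toolserver access, here is a small self-contained sketch that runs them against a toy SQLite database. The table and column names come from the comment; the sample rows are invented, and the third query is written as a JOIN with DISTINCT, which is an assumption about the intent rather than the exact query quoted.

```python
import sqlite3

NS_CATEGORY = 14  # namespace number used in the queries above

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER,
                   page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT);
-- invented sample data: Maus and Mysz are redirects, only Maus still has a member
INSERT INTO page VALUES (1, 14, 'Mouse', 0), (2, 14, 'Maus', 1), (3, 14, 'Mysz', 1),
                        (4, 0, 'House_mouse', 0), (5, 0, 'Hausmaus', 0);
INSERT INTO categorylinks VALUES (4, 'Mouse'), (5, 'Maus');
""")

# "CategoriesThatAreRedirects"
redirect_cats = [t for (t,) in conn.execute(
    "SELECT page_title FROM page "
    "WHERE page_is_redirect = 1 AND page_namespace = ? ORDER BY page_title",
    (NS_CATEGORY,))]

# "CategoriesThatAreRedirectsButAreNotEmpty"
lost_cats = [t for (t,) in conn.execute(
    "SELECT page_title FROM page "
    "WHERE page_is_redirect = 1 AND page_namespace = ? "
    "AND EXISTS (SELECT * FROM categorylinks WHERE cl_to = page_title) "
    "ORDER BY page_title",
    (NS_CATEGORY,))]

# Pages that are "partly hidden" because they sit in a redirected category
hidden_pages = [p for (p,) in conn.execute(
    "SELECT DISTINCT cl_from FROM categorylinks "
    "JOIN page ON page_title = cl_to "
    "WHERE page_is_redirect = 1 AND page_namespace = ? ORDER BY cl_from",
    (NS_CATEGORY,))]

print(redirect_cats, lost_cats, hidden_pages)
```

A bot could take the second list as its work queue and the third as the set of pages to edit.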
Marking as dependent on bug 710: Redirect to category page doesn't work. Comment 1 is about "category:A redirects to category:B". The general case (bug 710) is "namespace:pagename redirects to category:B". Best regards, reinhardt [[user:gangleri]]
*** Bug 4879 has been marked as a duplicate of this bug. ***
Having this problem solved for the Commons would be fantastic. Because of this problem we currently have a restriction that all category names should be in English. As you can imagine that hardly does anything to promote the multilingual policy of the Commons and I'm sure it's one of the things that turns off would-be contributors whose native tongue is not English. If this was implemented (as I understand it), we'd be able to have [[:Category:Maus]], [[:Category:Mouse]] and [[:Category:Mysz]], and the effect of putting an image in any of them would be the same in the end.
It seems to me that we don't want to "automatically move all articles". When [[A]] redirects to [[B]] and you link to A in an article, MediaWiki doesn't replace that with [[B|A]]. I think all that's necessary is that if Category:A redirects to Category:B, then every article in Category:A appears in the category listing for Category:B. Implementing this wouldn't require having articles store category links as metadata, since it has nothing to do with ''how'' a particular article got into a category.
It would also help to show a message in the category view, at the list of pages in the category, that lists the categories redirecting to the viewed category. Example:
1. There is Category:Mouse
2. People regularly use Category:Maus instead of Category:Mouse
3. So you create Category:Maus as a redirect to Category:Mouse
4. If you view Category:Mouse you miss all the articles in Category:Maus
5. So a message is shown: "Category:Mouse is also known as Category:Maus. Please move the articles to the redirected category."
The current status is unsatisfying. On Commons 857 categories are redirects and 1263 pages use these categories (452 pages in 677 categories for the English Wikipedia):
SELECT COUNT(*) FROM page WHERE page_is_redirect=1 AND page_namespace=14;
SELECT COUNT(DISTINCT cl_from) FROM categorylinks, page WHERE page_is_redirect=1 AND page_namespace=14 AND cl_to=page_title;
(In reply to comment #11)
> It would also help get a message in category view at the list of pages in this
> category that shows a list of categories that redirect to the viewed category.
> Example:
>
> 1. There is Category:Mouse
> 2. People regularly use Category:Maus instead of Category:Mouse
> 3. So you create Category:Maus as a redirect to Category:Mouse
> 4. If you view Category:Mouse you miss all the articles in Category:Maus
> 5. So a message is shown: "Category:Mouse is also know as Category:Maus. Please
> move the articles to the redirected category."
>
> The current status is unsatisfying. In commons 857 categories are redirects and
> 1263 pages use this categories (452 pages in 677 categories for the English
> Wikipedia):
>
> SELECT COUNT(*) from page where page_is_redirect=1 AND page_namespace=14;
>
> SELECT COUNT(DISTINCT cl_from) from categorylinks, page WHERE
> page_is_redirect=1 AND page_namespace=14 AND cl_to=page_title;
>
There is no need to do that: if we fix this bug, the pages in Category:Maus are automatically shown in Category:Mouse. Your suggestion is just a workaround.
*** Bug 5893 has been marked as a duplicate of this bug. ***
Category redirects are especially useful on the Chinese Wikipedia and Wikinews. As you may know, in Chinese one thing can be written in both Traditional Chinese and Simplified Chinese. These characters represent the same thing, and both forms are correct and should exist. For example, [[Category:災難]] and [[Category:灾难]] are the same (BTW: "災難" and "灾难" mean "disaster"). However, redirecting [[Category:災難]] to [[Category:灾难]] is currently useless. Articles categorized under 災難 still cannot be seen in [[Category:灾难]]. A category redirect only redirects the category *itself*, but not the articles categorized in the category.
*** Bug 6750 has been marked as a duplicate of this bug. ***
Created attachment 2905 [details] Patch
This patch changes Parser::replaceInternalLinks: when parsing a link to a category (e.g. [[Category:A]], *not* [[:Category:A]]), it checks whether the title exists in the "redirect" table (using a slave - I think it's OK), and if so, and the redirect is to another category, it overrides the current title (variable $nt) with the redirected title. This fixes both the display in the page and the DB (table "categorylinks"), as the additions to this table are done using the same parser method. The patch works for me; however, it should still be checked for regressions.
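The parse-time behaviour this patch describes can be illustrated with a toy model. The sketch below is not the attached patch and does not touch Parser::replaceInternalLinks; it is a simplified Python stand-in in which the `redirects` dict plays the role of the "redirect" table and `categories_for_page` is a hypothetical helper that returns the titles that would end up in categorylinks.

```python
import re

NS_MAIN = 0
NS_CATEGORY = 14

# Matches [[Category:Name]], [[Category:Name|sortkey]] and [[:Category:Name]]
CATEGORY_LINK = re.compile(r"\[\[(:?)Category:([^\]|]+)(\|[^\]]*)?\]\]")

def categories_for_page(wikitext, redirects):
    """Return the category titles a page would be filed under.

    `redirects` maps a category title to a (namespace, title) redirect target.
    A redirect is followed only when it points at another category.
    """
    titles = []
    for colon, name, _sortkey in CATEGORY_LINK.findall(wikitext):
        if colon:
            # [[:Category:A]] is an ordinary link, not a categorization
            continue
        name = name.strip()
        target = redirects.get(name)
        if target is not None and target[0] == NS_CATEGORY:
            name = target[1]  # override the title with the redirect target
        if name not in titles:
            titles.append(name)
    return titles

text = "Intro [[Category:A]] see [[:Category:A]] [[Category:C|key]] [[Category:D]]"
redirects = {"A": (NS_CATEGORY, "B"), "D": (NS_MAIN, "Some article")}
result = categories_for_page(text, redirects)
print(result)
```

Note that the wikitext itself is left alone; only the resolved title changes, which is why the original name is no longer recoverable from categorylinks afterwards.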
Created attachment 3015 [details] Patch Two additions: 1. Allowing categories to be moved. 2. Updating categorylinks when moving categories. TODO: Fix new redirect (currently links to [[Category:B]] instead of [[:Category:B]]), update categorylinks when editing a category.
Created attachment 3016 [details] Patch
Fixing redirect page, move page message and links to the page when it becomes a redirect.
Summary of the patch changes:
1. Redirect the categories: if a category is a redirection, make the links to it and the items in the "categorylinks" table refer to the redirected category when parsing the page.
2. Update categorylinks when a category is edited and becomes a redirect.
3. Make category moves possible - remove it from the forbidden namespaces.
4. Update categorylinks when a category is moved.
5. Fix the redirect page left when a category is moved: used a colon to prevent inclusion in the category.
6. Fix the links in pagemovedtext.
Things which still have to be done:
1. Update categorylinks when a redirect category becomes a redirection to another category.
2. Update categorylinks when a redirect category becomes a regular category which is not a redirection.
I think that these require another field in categorylinks, "cl_original_to" (may be null, or maybe the same as "cl_to" if not redirected?), which specifies the original target (which is now a redirection). If it's not added, it's not possible to update categorylinks because it's not *known* which pages are in this category. I don't know whether this field should be added, whether these things should simply not be fixed, or whether there is another way to do it. Any ideas?
As an expediency, somewhat short of full category redirect support, can a change be made so that when an article is saved it is added to the target of a redirected category rather than the redirected category (i.e. if category:A is redirected to category:B, when changing an article to add it to category:A the article is actually added to category:B)? Making this one change, in combination with recat bots like RobotG, would enable category redirects to work nearly perfectly.
Endorse expediency request above with fervency! Moving/renaming would be nice too, but per [http://en.wikipedia.org/wiki/Wikipedia_talk:Categories_for_discussion#Propose_tagging_with_both_and_expanding_use_of_Cat_redirects_overall this] the method of combined soft and hard redirects, put together with the proper linking at the time of saving a page, would pretty well cover normal editing objections to redirecting categories.
CONSIDER: There are multiple ways to phrase the equivalent page classification in English, but note the three-legged stool... almost every language's wiki, no matter what type of project, one way or another connects via interwikis to the English Wikipedia article pages (if only through their own article on the topic in their Wikipedia), the Commons category, and/or the Wikipedia category (which I labor mightily to synchronize as much as possible, so I know them well). How another language's phrasing comes out in English translation is something we native speakers would find awkward more often than not, AND VICE VERSA, so on the Commons, redirects of categories are tolerated specifically to cover such 'ambiguities', including redirects of foreign category names to the "Official" English name (unlike most namings on the Commons, principal category names are English by fiat... articles, images, etc. can all be in other languages. So the importance should be obvious, I hope!).
So too do we native speakers have our choice of how to state a category name... (e.g. Countries in Europe vs. Countries of Europe); the halls of the en.Wikipedia (and I suspect all others!) CFD discussions are ankle deep in blood from some of those debates! That in part exists because of different schemes in related categorisation (Geography, Maps, and History all intersect Countries... so come complications, or a need to alias, however belatedly! <g>). Which frankly is a needless waste of time, were it easy to alias a name, and have that name be the one 'online' per this proposal.
In one sense these naming issues are trivialities, but they are important trivialities, as recollection and modes of phrase formulation are inherently personal things involving the way each of us thinks. So there is a natural factionalism, as some think like me and some like him, and even when compromises occur, they involve a lot of work for someone... which is hopefully the guys not thinking like me! <BSEG>
Bottom line, aliasing would prevent and eliminate a lot of wasted man-hours spent renaming things. The computers should be doing that drudgery, not we humans, save for you developers... if you do it once, your effort pays back over and over for the many. In sum, this matter has an inherently higher priority than the convenience of one editor; it affects a great many editors across all nationalities! I guess I'm saying this has had far too low a priority heretofore. It seems simple, so unimportant, but categories are fundamental to organising the projects, hence the effects are vastly magnified. In one swell foop, all the nit-picky (local language) name choices can be trivialized and one uniform name can emerge in each locale, yet still not only retain but actually enhance interconnectivity between sisters and within a given project.
So please do expedite both a determination on Rick's interim proposal, and a full implementation allowing name moves and the like. Just cutting down the debate will free many man-hours each day on CFD, so delay is, frankly, costly, and costly worldwide at that. Best regards // ~~~~
> 2. Update categorylinks when a category is edited and becomes a redirect.
> ...
> 4. Update categorylinks when a category is moved.
This might be an issue. I don't know if an UPDATE of, for the current worst case, a couple hundred thousand rows is acceptable. An alternative would be to check the redirect status at display time rather than on update, as we do for pages: retrieve all pages that are in category X or anything that redirects to it. Of course that means faster UPDATE and slower SELECT, which is generally the reverse of what's wanted. We should ask Domas or Tim or someone what's best, I guess.
> Things which still have to be done:
> 1. Update categorylinks when a redirect category becomes a redirection to
> another category.
> 2. Update categorylinks when a redirect category becomes a regular category
> which is not a redirection.
> I think that these require another field in categorylinks, "cl_original_to"
> (may be null, or maybe same to "cl_to" if not redirected?), which specifies the
> original target (which is now a redirection). If it's not added, it's not
> possible to update categorylinks because it's not *known* which pages are in
> this category. I don't know if this field should be added, these things should
> not be fixed, or there is another way to do it. Any ideas?
Short of reparsing every page in the category, this probably does require an extra field, yes. Other schema updates might be necessary to make the updates for large categories efficient. I think that whatever happens will be more efficient than bots loading tons of pages and forcing them to be reparsed, though. :)
*** Bug 10236 has been marked as a duplicate of this bug. ***
Created attachment 3823 [details] Include redirected members in category view
The patch submitted earlier kind of scares me. Consider the following scenario:
1. Page is categorized in Category:A
2. Category:A becomes a redirect to Category:B
3. Page is updated accordingly
4. Category:A becomes a redirect to Category:C
5. Page is NOT updated accordingly, since it is treated as a member of Category:B.
Someone suggested a new DB field to counter this, but that isn't necessary. The attached patch fixes this bug in a simpler way, without the problem described above. When you view Category:B, the code will check if any other categories redirect to B. If Category:A redirects to Category:B, both A and B's members will show up when you view Category:B. As usual, double redirects won't work, i.e. if A redirects to B and B redirects to C, Category:C will show B and C's members, but not A's.
The attached patch makes moving categories easy: just remove NS_CATEGORY from the forbidden namespaces. Since everything is handled transparently through redirects (just like we do with normal pages), no problems should ensue.
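The five-step scenario and the view-time alternative can be played through with a small model. This is a sketch using plain Python dicts rather than the attached patch; `stored_links` stands in for categorylinks, `redirects` for the redirect table, and both helper functions are hypothetical names introduced only for this illustration.

```python
def direct_members(stored_links, category):
    """Pages whose stored link names `category` exactly (plain lookup)."""
    return sorted(p for p, cats in stored_links.items() if category in cats)

def view_members(stored_links, redirects, category):
    """View-time resolution: direct members plus members of categories that
    redirect straight to `category` (one hop only, so no double redirects)."""
    wanted = {category} | {src for src, dst in redirects.items() if dst == category}
    return sorted(p for p, cats in stored_links.items() if wanted & set(cats))

# Save-time rewriting (the earlier patch): steps 1-3 leave "B" stored for the page.
rewritten = {"Page": ["B"]}
# Step 4: A now redirects to C, but nothing tells us the page came in via A.
stale_in_b = direct_members(rewritten, "B")
missing_from_c = direct_members(rewritten, "C")

# View-time resolution: the link the editor typed ("A") is stored untouched.
untouched = {"Page": ["A"]}
seen_in_c = view_members(untouched, {"A": "C"}, "C")
seen_in_b = view_members(untouched, {"A": "C"}, "B")

# Double redirect A -> B -> C: C shows B's members but not A's.
double = view_members(untouched, {"A": "B", "B": "C"}, "C")

print(stale_in_b, missing_from_c, seen_in_c, seen_in_b, double)
```

Under the save-time model the page stays stuck in B after step 4, while the view-time model follows the redirect wherever it currently points.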
A solution was proposed using the redirects table in bug 8685 ...
Created attachment 3824 [details] Include redirected members in API list=categorymembers This patch does the same thing as my previous patch, with the difference that this one fixes listing category members in the MediaWiki API as opposed to the category page itself.
*** Bug 8685 has been marked as a duplicate of this bug. ***
I'm hardly an SQL expert, I'm afraid, but any particular reason you added an extra query rather than joining? I doubt it makes much difference, though, performance-wise. I'm a bit alarmed that the change has to be made separately for the API, rather than both calling a general-purpose public method of something, but I guess that's a separate issue. I'll take a look at this and hopefully commit something today or tomorrow. Although I notice a few bugs assigned to me that I've totally forgotten about, so let's hope this doesn't become one. :D
(In reply to comment #24)
> A solution was proposed using the redirects table in bug 8685 ...
(In reply to comment #27)
> I'm hardly an SQL expert, I'm afraid, but any particular reason you added an
> extra query rather than joining? I doubt it makes much difference, though,
> performance-wise.
The JOIN suggested in bug 8685 didn't work (it selected the wrong data from the page table), and since I'm not particularly good at writing complex SQL queries either, I decided to do it this way. I think my way may actually be faster, since the latter query is still a regular category lookup which is indexed. A complex JOIN statement wouldn't have the indexing benefit (correct me if I'm wrong).
> I'm a bit alarmed that the change has to be made separately for the API, rather
> than both calling a general-purpose public method of something, but I guess
> that's a separate issue.
This is partly due to the fact that the API provides much more information and filtering options than you'll ever need in a regular page. Also, the current code mixes DB code with UI code, which makes a lot of functions unusable for the API. Article.php and EditPage.php are good examples.
> I'll take a look at this and hopefully commit something today or tomorrow.
> Although I notice a few bugs assigned to me that I've totally forgotten about,
> so let's hope this doesn't become one. :D
We're all humans, we all need breaks ;) take your time.
Checking EXPLAIN shows that the query will use a filesort due to replacement of simple equality with a check for IN. I got the same trying the one-query join technique, adjusted to give correct results. Domas will probably kill me if I add a gratuitous filesort to every category, so I (we) will have to ask him why it's filesorting and how to stop it. (By the way, more easily fixed but also significant, your check for rd_title='title' alone can't use the redirect table's indexes, because the index is on (rd_namespace, rd_title). You need to add 'rd_namespace' => NS_CATEGORY to the conditions for that query to be efficient.)
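The point about the (rd_namespace, rd_title) index can be tried out locally. The sketch below uses SQLite's EXPLAIN QUERY PLAN as a stand-in for MySQL's EXPLAIN, so it only illustrates the general leftmost-prefix rule for composite indexes; the exact plans MySQL produces may differ. The table is a cut-down, assumed version of the redirect table and the index name `rd_ns_title` is made up.

```python
import sqlite3

NS_CATEGORY = 14

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE redirect (rd_from INTEGER PRIMARY KEY,
                       rd_namespace INTEGER NOT NULL,
                       rd_title TEXT NOT NULL);
CREATE INDEX rd_ns_title ON redirect (rd_namespace, rd_title);
""")

def plan(sql, params):
    """Return the human-readable detail strings from EXPLAIN QUERY PLAN."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the detail text is the last column

# Title only: the leading index column is unconstrained, so no index lookup.
title_only = plan(
    "SELECT rd_from FROM redirect WHERE rd_title = ?", ("Bar",))

# Namespace and title together: the composite index can be searched directly.
ns_and_title = plan(
    "SELECT rd_from FROM redirect WHERE rd_namespace = ? AND rd_title = ?",
    (NS_CATEGORY, "Bar"))

print(title_only)
print(ns_and_title)
```

In SQLite the first plan is reported as a scan and the second as a search on the index, which mirrors the advice to add the rd_namespace condition.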
(In reply to comment #29)
> (By the way, more easily fixed but also significant, your check for
> rd_title='title' alone can't use the redirect table's indexes, because the
> index is on (rd_namespace, rd_title). You need to add 'rd_namespace' =>
> NS_CATEGORY to the conditions for that query to be efficient.)
By all means do so. I know just enough about MySQL to get by, and have no idea how all those optimizations work.
(In reply to comment #29)
> Checking EXPLAIN shows that the query will use a filesort due to replacement of
> simple equality with a check for IN. I got the same trying the one-query join
> technique, adjusted to give correct results. Domas will probably kill me if I
> add a gratuitous filesort to every category, so I (we) will have to ask him why
> it's filesorting and how to stop it.
I don't really understand any of that (for instance, my query doesn't use IN AFAIK), but I understand it's a performance problem. How can that be solved?
Your code contains an IN because you have 'cl_to' => $titles as a condition, with $titles an array, and that translates to (cl_to IN ($titles)). I don't know how it can be solved, try asking Domas or someone.
After discussion with Domas, it seems that any attempt to check for redirects in the current schema will *probably* cause a filesort, or at least all the ones suggested did. We probably need a new field, cl_real_to or something, that has the redirect pre-resolved. When adding a category to a page, the actual target would be put in cl_to as now; then if it's a redirect, the redirect target would be put in cl_real_to, otherwise that would be a copy of cl_to (or it would be NULL, depending on which works better). Then cl_real_to would be used for displaying category pages in place of cl_to. Whenever a category is changed to a redirect, or the target of a category redirect is changed, categorylinks would be updated appropriately. River pointed out that if cl_real_to is an id instead of a title, it will persist across renames of the category. But Rob pointed out that that only works if the category has an associated page. River then suggested a cat_id, which may or may not be going too far for this exercise. We can always stick updates for cl_real_to in the job queue, basically mimicking the current bot-update situation.
(In reply to comment #33) I think cl_real_to is the way to go. Queries would be indexed, you'd have something like WHERE cl_to='title' OR cl_real_to='title';. Updating cllinks would be the simple (but potentially massive) query UPDATE categorylinks SET cl_real_to='redirtarget' WHERE cl_to='redirname';.
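A rough sketch of how the cl_real_to idea might behave, using SQLite from Python. It follows the variant from the earlier comment in which cl_real_to is a copy of cl_to when there is no redirect, so listing needs only one indexed equality; that choice, the index names, and the helper functions `add_link`, `set_redirect` and `members` are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT, cl_real_to TEXT);
CREATE INDEX cl_to_idx ON categorylinks (cl_to);
CREATE INDEX cl_real_to_idx ON categorylinks (cl_real_to);
""")

def add_link(conn, page_id, typed, redirects):
    """Store the title as typed in cl_to and the pre-resolved target in cl_real_to."""
    conn.execute("INSERT INTO categorylinks VALUES (?, ?, ?)",
                 (page_id, typed, redirects.get(typed, typed)))

def set_redirect(conn, source, target):
    """Category `source` now redirects to `target`: one UPDATE keyed on cl_to."""
    conn.execute("UPDATE categorylinks SET cl_real_to = ? WHERE cl_to = ?",
                 (target, source))

def members(conn, category):
    """List a category page's members using the resolved column only."""
    return [p for (p,) in conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_real_to = ? ORDER BY cl_from",
        (category,))]

add_link(conn, 1, "Bar", {})
add_link(conn, 2, "Foo", {})
before = members(conn, "Bar")

set_redirect(conn, "Foo", "Bar")      # Foo becomes a redirect to Bar
after_redirect = members(conn, "Bar")

set_redirect(conn, "Foo", "Baz")      # the redirect is later retargeted
after_retarget = (members(conn, "Bar"), members(conn, "Baz"))

print(before, after_redirect, after_retarget)
```

Because cl_to keeps what the editor typed, retargeting the redirect does not lose track of which pages belong to Foo.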
*** Bug 15742 has been marked as a duplicate of this bug. ***
Sorry, the technical stuff is over my head, but can someone explain what the chances are of this being fixed? What effect do the patches referred to have? Can we expect members of redirected categories to show up on the target category page?
Just came here from http://en.wikipedia.org/wiki/Wikipedia_talk:Categories_for_discussion#Category_redirects and wanted to add a vote for this bug. Disclaimer: It's been a long time since I was last here, and I didn't find a "vote" feature, as the Mozilla db has. Also, I didn't spend much time trying to understand the discussion of this and related bugs, and I don't understand the difference from bug 710, which is supposedly fixed.
(In reply to comment #37)
> I didn't find a "vote"
> feature, as the Mozilla db has.
Bottom right, "Vote for this bug" (Ctrl-F for "vote" would have found it).
> I don't understand the difference
> to bug 710, which is supposedly fixed.
Bug 710 is about redirects to a category working when you navigate directly to the page, the same way they work for other pages. Prior to that bug's resolution, I guess "#REDIRECT [[:Category:XYZ]]" would do nothing, or be buggy or something (it was before my time). This is about redirects actually including one category's contents in another.
May be fixed in r46706.
This seems not to be working (at least, I just tried it on English Wikipedia and it didn't work - I don't know if it's supposed to be live there yet).
Wikipedia is still on r46424, see Special:Version.
I'm glad this is going to be solved soon, but there is the problem of potential exploitation by vandals. I've filed a new bug (bug 17461) to address this.
Don't like to sound impatient, but when can we expect this fix to actually come live?
OK, it is live, thanks. There still seems to be a slight problem, though, in that you can't get a list of members of the redirected category specifically. I've raised this in a new bug (bug:17571).
It has been decided that the change will be made in the next full MediaWiki release (http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/RELEASE-NOTES?view=markup), so just be patient. ☺
Reverted, see CodeReview r46706.
Why has this potentially helpful change been reverted? It seemed to be working well; the only problem was bug 17571, which surely can't be difficult to fix. We know that the tables don't get updated straight away when a category is changed to/from a redirect, and I presume we wouldn't want them to. Bots would handle emptying existing categories when they get redirected, exactly as they do now.
(In reply to comment #47)
> Why has this potentially helpful change been reverted? It seemed to be working
> well; the only problem was bug 17571, which surely can't be difficult to fix.
> We know that the tables don't get updated straight away when a category is
> changed to/from a redirect, and I presume we wouldn't want them to. Bots would
> handle emptying existing categories when they get redirected, exactly as they
> do now.
The tables indeed were not updated straight away; in fact they were not updated at all, ever. You'd have to have a bot go through and edit every page in the category, every time the redirect status or redirect target changed. It's possible to do these updates immediately, with negligible performance loss, and to retire the bots. But it would be much more difficult to implement that feature if the categorylinks table was significantly polluted with spurious links from r46706.
I don't think it was ever envisaged that the tables would be updated automatically (I didn't think that was desirable anyway, since inappropriate redirects of large categories, and subsequent reversions, would cause lots of extra processing, of the sort that doesn't seem to happen when e.g. templates with categories get updated). But if you say it can be done, then we'll wait in eager anticipation...
(In reply to comment #49)
> I don't think it was ever envisaged that the tables would be updated
> automatically (I didn't think that was desirable anyway, since inappropriate
> redirects of large categories, and subsequent reversions, would cause lots of
> extra processing, of the sort that doesn't seem to happen when e.g. templates
> with categories get updated). But if you say it can be done, then we'll wait in
> eager anticipation...
>
Templates with categories don't cause immediate updates because those updates are put in the job queue and executed later. Presumably, updating for category redirect changes would also use the job queue.
Templates with categories don't cause immediate updates because those updates require reparsing of large numbers of pages. Category redirects don't, I don't see any reason why they should need the job queue. Except for really giant categories, maybe, where you'd want to batch the updates to not lag the slaves.
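Batching the update for a giant category, as suggested here, might look something like the following. This is a hedged sketch against SQLite rather than MySQL with replication: `move_links_in_batches` is a hypothetical helper, the batch size is arbitrary, and the spot where real code would wait for replicas to catch up is only marked with a comment. It also overwrites cl_to directly, so undoing the redirect later would need either a reparse or an extra column.

```python
import sqlite3

def move_links_in_batches(conn, old_title, new_title, batch_size=1000):
    """Repoint categorylinks rows from old_title to new_title in small chunks.

    Returns (rows_moved, batches_run).
    """
    if old_title == new_title:
        return 0, 0  # nothing to do, and the loop below would never finish
    moved = batches = 0
    while True:
        cur = conn.execute(
            """UPDATE categorylinks SET cl_to = ?
               WHERE rowid IN (SELECT rowid FROM categorylinks
                               WHERE cl_to = ? LIMIT ?)""",
            (new_title, old_title, batch_size),
        )
        conn.commit()  # keep each transaction short
        if cur.rowcount == 0:
            break
        moved += cur.rowcount
        batches += 1
        # A real deployment would pause here until replication lag is acceptable.
    return moved, batches

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categorylinks (cl_from INTEGER, cl_to TEXT)")
conn.executemany("INSERT INTO categorylinks VALUES (?, 'A')",
                 [(i,) for i in range(2500)])

moved, batches = move_links_in_batches(conn, "A", "B")
remaining = conn.execute(
    "SELECT COUNT(*) FROM categorylinks WHERE cl_to = 'A'").fetchone()[0]
print(moved, batches, remaining)
```

For small categories the loop runs once, so the common case stays a single cheap UPDATE.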
Making a "normal" category a redirect can be done straight away, but unredirecting a category requires reparsing all category members.
Or adding an extra column to categorylinks. That seems like a better idea, unless un-redirecting is expected to be very rare.
That's probably the way to go. What would be that column?
cl_to_original or such, an unredirected variant of cl_to. Then if a redirect chain changes, you could do UPDATE categorylinks SET cl_to='New_redirect_target' WHERE cl_to_original IN ('Original_category1', 'Original_category2');. You'd want an index on cl_to_original, of course, so this is a pretty heavyweight addition to the table.
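A small sketch of how the extra column might behave, using Python's sqlite3 module in place of MySQL. The table is reduced to three columns, and the helper names set_effective_target and members, as well as the index name, are invented for the illustration; only cl_to, cl_to_original and the shape of the UPDATE come from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE categorylinks (
        cl_from        INTEGER NOT NULL,  -- page id
        cl_to          TEXT NOT NULL,     -- effective category (redirects resolved)
        cl_to_original TEXT NOT NULL      -- category as written in the page text
    );
    CREATE INDEX cl_to_original_idx ON categorylinks (cl_to_original);
""")
# Pages 1 and 2 contain [[Category:Foo]], page 3 contains [[Category:Bar]].
conn.executemany("INSERT INTO categorylinks VALUES (?, ?, ?)",
                 [(1, "Foo", "Foo"), (2, "Foo", "Foo"), (3, "Bar", "Bar")])

def set_effective_target(conn, originals, target):
    """Point every row written against one of `originals` at `target`."""
    marks = ",".join("?" * len(originals))
    conn.execute(
        f"UPDATE categorylinks SET cl_to = ? WHERE cl_to_original IN ({marks})",
        [target, *originals])

def members(conn, category):
    """Page ids currently listed under `category`."""
    return [row[0] for row in conn.execute(
        "SELECT cl_from FROM categorylinks WHERE cl_to = ? ORDER BY cl_from",
        (category,))]

# Category:Foo becomes a redirect to Category:Bar.
set_effective_target(conn, ["Foo"], "Bar")
after_redirect = members(conn, "Bar")

# Category:Foo is turned back into a normal category: no reparse needed,
# because cl_to_original still remembers what each page actually said.
set_effective_target(conn, ["Foo"], "Foo")
after_unredirect = (members(conn, "Foo"), members(conn, "Bar"))
```

The point of the second call is that un-redirecting becomes another single UPDATE rather than a reparse of every member page.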
I think that the best solution is you place [[A]] into [[Category:Foo]] and Foo redirects to Bar so you see [[A]] in [[Category:Bar]] and clicking on the catlink to Foo leads you to Bar. For commons they can have "co-categories" where a member of one co-category is visible in all other co-categories. This can be done by having all the categories have [[;Category:Fu]] [[;Category:Fuz]] [[;Category:Faz]] [[;Category:(...)]]
Hello, is anyone still working on this? Any progress lately? It all seemed to be going so well at one point...
Sorry my following observation is probably noted above, but I didn't check. On [[Page A]] put "[[Category:C1]]". Now on [[Category:C1]] put "#REDIRECT [[Category:C2]]". Note how Page A is not listed on Category:C2. Instead the only way to hunt down Page A in the categories is to visit Category:1&redirect=no !
(In reply to comment #58) > visit Category:1&redirect=no ! I meant Category:C1&redirect=no. The redirect=no part is not something the average user will know to try, so the category entry is effectively lost in this sense.
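To make the Page A / C1 / C2 case concrete, here is a rough Python model of the problem and of the "follow redirects backwards when displaying the category" idea from earlier in this report. The data structures and the function names categories_resolving_to and members_via_redirects are assumptions made for the sketch, not MediaWiki code.

```python
def categories_resolving_to(target, redirects):
    """Return `target` plus every category whose redirect chain ends at it.

    `redirects` maps a redirected category to its destination. Tracking the
    categories already found keeps this from looping on circular redirects.
    """
    found = {target}
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for source, dest in redirects.items():
            if dest == current and source not in found:
                found.add(source)
                frontier.append(source)
    return found

def members_via_redirects(target, redirects, categorylinks):
    """List (page, category named in the page text) pairs shown under `target`."""
    names = categories_resolving_to(target, redirects)
    return sorted((page, cat) for page, cat in categorylinks if cat in names)

redirects = {"C1": "C2"}                      # Category:C1 is #REDIRECT [[Category:C2]]
categorylinks = [("Page A", "C1"), ("Page B", "C2")]

# Current behaviour: only direct members are listed, so Page A is missing.
direct = [page for page, cat in categorylinks if cat == "C2"]

# Following redirects backwards brings Page A back, with its original
# category kept alongside so it could be footnoted for maintenance.
listing = members_via_redirects("C2", redirects, categorylinks)
```

Here direct contains only Page B, while listing contains both pages together with the category each one actually names.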
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
*** Bug 32262 has been marked as a duplicate of this bug. ***
Adding the keywords that seem right -- if the patches still need reviewing, please change "reviewed" to "need-review".
This feature request is being proposed at http://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Automatic_category_redirects and I'm considering whether to add it or not to https://www.mediawiki.org/wiki/Summer_of_Code_2013#Project_ideas Question: Is there a potential mentor willing to help potential students interested in this project? Is there a reasonable support from the MediaWiki core maintainers to incorporate this feature if it's developed and meets the quality criteria? Without these qualifications in place we can't even consider the proposal for GSOC 2013.
(In reply to comment #63) > Question: > > Is there a potential mentor willing to help potential students interested in > this project? Yes me :) > > Is there a reasonable support from the MediaWiki core maintainers to > incorporate this feature if it's developed and meets the quality criteria? > I think so. Would require schema changes which is the only bit that could potentially be sticky.
Ok, you're in: https://www.mediawiki.org/wiki/Summer_of_Code_2013#Automatic_category_redirects Thank you and good luck!
The more you know... The current query for getting category members is:

  SELECT ... FROM `page`
  INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey) ON ((cl_from = page_id))
  LEFT JOIN `category` ON ((cat_title = page_title) AND page_namespace = '14')
  WHERE cl_to = 'Test' AND cl_type = 'page'
  ORDER BY cl_sortkey LIMIT 201

And, true enough, if you change the cl_to check from a comparison to an IN operator, it triggers a filesort. *However*, if you instead move the contents of the WHERE clause into the INNER JOIN condition, then the filesort disappears. The resulting query is:

  SELECT ... FROM `page`
  INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey) ON ((cl_from = page_id) AND (cl_to IN ('Test')) AND (cl_type = 'page'))
  LEFT JOIN `category` ON ((cat_title = page_title) AND page_namespace = '14')
  ORDER BY cl_sortkey LIMIT 201

Now I'm not too much of an expert on databases, but theoretically this should produce the exact same results (since it's an INNER JOIN) but still be efficient (because the cl_sortkey index includes the cl_from and cl_to columns). This would eliminate the need for any new columns and whatnot.
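The "same results, since it's an INNER JOIN" part can be checked on toy data. The sketch below uses Python's sqlite3 module with made-up rows, drops the MySQL-specific FORCE INDEX hint and the LEFT JOIN on category, and says nothing about how MySQL's planner treats either form; it only shows that the two placements of the filter return the same rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE categorylinks (
        cl_from INTEGER, cl_to TEXT, cl_type TEXT, cl_sortkey TEXT);
    INSERT INTO page VALUES (1, 'Apple'), (2, 'Banana'), (3, 'Cherry');
    INSERT INTO categorylinks VALUES
        (1, 'Test',  'page', 'APPLE'),
        (2, 'Other', 'page', 'BANANA'),
        (3, 'Test',  'page', 'CHERRY'),
        (3, 'Test2', 'file', 'CHERRY');
""")

# Filter in the WHERE clause, as in the current query.
filter_in_where = conn.execute("""
    SELECT page_title FROM page
    INNER JOIN categorylinks ON cl_from = page_id
    WHERE cl_to IN ('Test') AND cl_type = 'page'
    ORDER BY cl_sortkey LIMIT 201
""").fetchall()

# Same filter moved into the INNER JOIN condition.
filter_in_join = conn.execute("""
    SELECT page_title FROM page
    INNER JOIN categorylinks
        ON cl_from = page_id AND cl_to IN ('Test') AND cl_type = 'page'
    ORDER BY cl_sortkey LIMIT 201
""").fetchall()
```

For an inner join a row survives only if the whole ON condition and the WHERE clause are both true, so moving a predicate between them does not change the result set. The same would not hold for the LEFT JOIN on category.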
Just a note to say that Liangent has applied to GSoC with a proposal related to this report. Good luck! https://www.mediawiki.org/wiki/User:Liangent/cat-redir
Re comment 66: If I have more than a single category in the IN condition when doing that, I get a filesort:

mysql> describe SELECT /* CategoryViewer::doCategoryQuery Bawolff */
    page_id,page_title,page_namespace,page_len,page_is_redirect,cl_sortkey,cat_id,cat_title,cat_subcats,cat_pages,cat_files,cl_sortkey_prefix,cl_collation
    FROM `page`
    INNER JOIN `categorylinks` FORCE INDEX (cl_sortkey) ON ((cl_from = page_id) AND cl_to in ('Foo', 'se') and cl_type = 'page')
    LEFT JOIN `category` ON ((cat_title = page_title) AND page_namespace = '14')
    ORDER BY cl_sortkey LIMIT 2\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: categorylinks
         type: range
possible_keys: cl_sortkey
          key: cl_sortkey
      key_len: 258
          ref: NULL
         rows: 559
        Extra: Using where; Using filesort
*************************** 2. row ***************************
           id: 1
  select_type: SIMPLE
        table: page
         type: eq_ref
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: wikidb.categorylinks.cl_from
         rows: 1
        Extra:
*************************** 3. row ***************************
           id: 1
  select_type: SIMPLE
        table: category
         type: eq_ref
possible_keys: cat_title
          key: cat_title
      key_len: 257
          ref: wikidb.page.page_title
         rows: 1
        Extra:
3 rows in set (0.00 sec)
Hmm, damn databases.
Success! So the issue is that the cl_sortkey index on categorylinks puts the cl_to column before the cl_sortkey column, so when you add the "cl_to IN ...", it can no longer use the index to sort by cl_sortkey (from the ORDER BY clause). After adding the following index:

  ALTER TABLE `categorylinks` ADD UNIQUE `cl_newsort` ( `cl_type`, `cl_sortkey`, `cl_to`, `cl_from` )

And then running the following query:

  EXPLAIN EXTENDED SELECT `cl_from` FROM `categorylinks`
  INNER JOIN `page` ON `page_id` = `cl_from`
  LEFT JOIN `category` ON `cat_title` = `page_title` AND `page_namespace` = 14
  WHERE `cl_type` = 'page' AND `cl_to` IN ( 'Foo', 'Test' )
  ORDER BY cl_sortkey

I finally got no more filesort. (I was even able to get rid of the FORCE INDEX usage.) If somebody could please check this and make sure I'm still sane, and that MySQL isn't just inventing things to trick my mind, that'd be great.
I haven't tested this, but I would guess that unless it is doing something very fancy with merging indices, this would cause very large scans of the categorylinks table (since it wouldn't be able to skip to only the results in the relevant category). A filesort isn't the only way that a db query can be inefficient.
(In reply to comment #71) > I haven't tested this, but I would guess that unless it is doing something > very > fancy with merging indices, this would cause very large scans of the > categorylinks table. (Since it wouldn't be able to skip to only results in > the > relevant category). filesort isn't the only way that a db query can be > inefficient. Hmm, you're right. Now that I realize it, this would require scanning the entire cl_sortkey index (I think).
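The trade-off between the two column orders can be illustrated with plain sorted lists standing in for the indexes. This is a toy model in Python, not a description of MySQL internals; the sample rows and the helper names range_scan and full_scan are invented for the sketch.

```python
from bisect import bisect_left, bisect_right

# (cl_to, cl_sortkey, cl_from) rows; the values are made up.
rows = [
    ("Foo", "B", 2), ("Foo", "D", 4),
    ("Test", "A", 1), ("Test", "C", 3),
    ("Zed", "E", 5),
]

# Model of the existing index: ordered by cl_to first, then cl_sortkey.
by_category = sorted(rows, key=lambda r: (r[0], r[1]))
# Model of the proposed index: ordered by cl_sortkey first, then cl_to.
by_sortkey = sorted(rows, key=lambda r: (r[1], r[0]))

def range_scan(index, categories):
    """Read the contiguous slice for each category; return sortkeys as read."""
    keys = [r[0] for r in index]
    out = []
    for cat in categories:
        lo, hi = bisect_left(keys, cat), bisect_right(keys, cat)
        out.extend(r[1] for r in index[lo:hi])
    return out

def full_scan(index, categories):
    """Walk the whole index in order, keeping rows in the wanted categories."""
    hits = [r[1] for r in index if r[0] in categories]
    return hits, len(index)  # len(index) = number of entries examined

one = range_scan(by_category, ["Foo"])            # already in sortkey order
two = range_scan(by_category, ["Foo", "Test"])    # needs a sort afterwards
hits, examined = full_scan(by_sortkey, {"Foo", "Test"})
```

With one category the slice comes out already ordered by sortkey. With two categories the concatenated slices are not ordered, which corresponds to the filesort. The sortkey-first ordering returns rows in order without a sort, but every entry has to be examined to find the ones in the wanted categories.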
Just a note to say that Andre Saboia has submitted a GSoC proposal related to this report: https://www.mediawiki.org/wiki/User:Anboia/Automatic_category_redirects
Related URL: https://gerrit.wikimedia.org/r/65176 (Gerrit Change I29a629a514f9568d0ee4d967c516dfd599dc11ba)
Tyler: The patch received a -1, do you plan to rework it?
If I ever have free time again (read: probably not for a while), I offer to help Tyler address some of the issues with the patch.
(In reply to comment #76) > If I ever have free time again (read: probably not for a while), I offer to > help Tyler address some of the issues with the patch. That would be great. I don't have much free time myself, although once I do I'll definitely work on it.
(In reply to comment #77) > (In reply to comment #76) > > If I ever have free time again (read: probably not for a while), I offer to > > help Tyler address some of the issues with the patch. > > That would be great. I don't have much free time myself, although once I do > I'll definitely work on it. Yeah, somebody should make a graph of the number of commits to MediaWiki by volunteers vs. when the school semester starts.
I'm delisting this project from https://www.mediawiki.org/wiki/Mentorship_programs/Possible_projects#Automatic_category_redirects since it looks like you are almost there.
Remove milestone 1.22 - Given that this has somewhat stalled due to lack of time on the part of interested parties, seems unlikely it could possibly make it to 1.22.