Last modified: 2011-03-02 00:47:33 UTC
While trying to fix 9237, it turns out that interwiki redirects are stored in the database without their interwiki prefix. While 9237 has a patch to remove interwiki redirects entirely, it would make more sense to keep them. Looking at the database, it appears the prefix is stripped entirely from the title. After adding a redirect from [[Test]] to [[fr:Platypus]] on my local install, the query return this: Rd_from = 58 Rd_namespace = 0 Rd_title = Platypus Adding this column (and making sure it's populated correctly, I'm not 100% sure where this is done), allows bug 9237 to be fixed very easily and also opens the door to something such as [[Special:InterwikiRedirects]].
Seems like I'm not the only one who wants this. In my view, the redirect table should cover *all types* of redirects, so we can actually use it to redirect people, rather than parsing the article text like we do now.
Created attachment 5076 [details] Basic implementation and population of rd_interwiki column In this patch, I have implemented the rd_interwiki column for the redirect table. Changes to Article class to make sure it is set/updated as needed. Nothing makes use of this yet, but it's working properly. Could probably use a maintenance script to back-fill all current redirects.
Patch looks good, but what's the whole "username of blocker" and "global blocking" thing about? I don't see how that ties in to rd_interwiki. Also, the comment in question doesn't seem to describe (at all) what the rd_interwiki field is about. Futhermore, why do you declare it as CHAR (32) rather than VARCHAR (32)?
(In reply to comment #3) > Patch looks good, but what's the whole "username of blocker" and "global > blocking" thing about? I don't see how that ties in to rd_interwiki. Also, the > comment in question doesn't seem to describe (at all) what the rd_interwiki > field is about. Futhermore, why do you declare it as CHAR (32) rather than > VARCHAR (32)? > The blocking comment was an oversight from a copy+paste of a different sql patch. Sorry about that. I chose char(32) instead of varchar(32) as it's what the interwiki table uses for the iw_prefix column.
I'd recommend using the varchar form; we should update the interwiki table if it's still using char. My inclination is also to let the field be NULL for non-interwikis, rather than forcing it to empty string. Would it be desirable to also include a field for target fragments? (eg [[Dest#section]]) Currently to get the total destination link you'd have to load up the source text and parse it.
Created attachment 5111 [details] Updated patch Updated patch to incorporate feedback: 1) Using varchar instead of char 2) Use null as default instead of empty string 3) Remove unrelated comment from sql patch 4) Add rd_fragment column to populate with section links Redirecting [[Test]] to [[wikipedia:Platypus#Virus]] then querying the redirect table returns the following: rd_from = 2 rd_namespace = 0 rd_title = Platypus rd_interwiki = wikipedia rd_fragment = Virus Changing values for the iw prefix, target page, fragment, et all update columns as expected.
Looking good. With rd_interwiki *and* rd_fragment, we can finally use the redirect table to resolve redirects rather than parsing the page contents. I attempted to do this in my fix fro bug 10931, but that failed for interwiki and fragment redirects.
Created attachment 5125 [details] Proposed patch Expanded on Demon's patch. Basically, it automatically sets rd_interwiki and rd_fragment for redirects that don't have them yet when they're viewed, and (at last!) implements redirect resolution using the redirect table rather than the page text. I've tested it and it works for me (even fragment+interwiki redirects). In detail: * Get and insert/update rd_interwiki and rd_fragment where needed * Change Article::getRedirectTarget() to cover all cases and to update the row if rd_interwiki or rd_fragment is NULL (i.e. if it's a leftover from the old schema without those fields) * Rewrite Article::followRedirect() to incorporate most of the code from followRedirectText() (which has undergone a minor cleanliness rewrite) and to use getRedirectTarget() * Remove followRedirectText() as it's unneeded now * Add $interwiki parameter to Title::makeTitle(), makeTitleSafe() and makeName() * Document and update Title::getRedirectsHere()
Looks good to me. I was also doing some testing last night (already spoke to Roan about this), and the refreshLinks maintenance script will work just fine for back-filling any non-existent redirect fields. Eventually the code goes through Article::updateRedirectOn(), which is part of the overall updates here, so everything should populate as needed.
How about committing something and closing the issue?
Waiting for Brion to coordinate the schema changes, afaik.
Created attachment 5828 [details] Update previous patch Same patch, works exactly as intended. Updated to apply cleanly to HEAD.
Put both field additions into the same patch. They can share an add_field call. Change the rd_ns_title index, it will need to have rd_interwiki at the start of it. - array('rd_namespace', 'rd_title'), + array( 'rd_namespace', 'rd_title', 'rd_interwiki', 'rd_fragment' ), Use "*", it's shorter - 'rd_title' => $retval->getDBKey() + 'rd_title' => $retval->getDBKey(), + 'rd_title' => $retval->getDBKey(), You probably only need one of these. public function insertRedirect() { - $retval = Title::newFromRedirect( $this->getContent() ); + $retval = Title::newFromRedirectRecurse( $this->getContent() ); This appears to be wrong and broken. There are two choices: have the redirect table point to the final destination and use a separate tracking table to update the table when the intermediate redirects change, or have the redirect table point to the immediate destination and follow the chain on fetch. This appears to be like the first option, except missing the complex update logic.
This was done in r69181, r51583.