Last modified: 2014-09-23 19:47:53 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T33863, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 31863 - Fix use of DB schema so RenameUser is trivial
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: Renameuser
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 2 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 25377
Blocks: 23135 26816 14862
Reported: 2011-10-21 18:56 UTC by Rob Lanphier
Modified: 2014-09-23 19:47 UTC (History)

Description Rob Lanphier 2011-10-21 18:56:28 UTC
Right now, RenameUser relies on lobbing jobs into the job queue.  However, the job queue is not designed to handle tasks in a reliable, ordered manner.

RenameUser is complicated because of the way we typically pull the user name from the database.  In several places we read the denormalized user_text columns of other tables (revision, archive, etc.) rather than the name itself.  If we actually go to the source (user_name field in the user table), then renaming would be a much cheaper and more robust operation.

Examples of this cleanup are r100286 and r100300.  Aaron has done some of this work, but would like help.
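
To illustrate the point of the description, here is a minimal sketch using Python's sqlite3 module. The schema is a toy with only three columns per the discussion (rev_user, rev_user_text, user_name); it is an assumption made for illustration, not the real MediaWiki schema or the code from r100286/r100300. It shows that when readers join against the user table, a rename is a single-row UPDATE and the stale denormalized copies no longer matter.

```python
import sqlite3

# Toy schema (assumption): only the columns needed for the illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, user_name TEXT NOT NULL);
    CREATE TABLE revision (
        rev_id INTEGER PRIMARY KEY,
        rev_user INTEGER NOT NULL,      -- 0 for edits without a local account
        rev_user_text TEXT NOT NULL     -- denormalized copy of the name
    );
    INSERT INTO user VALUES (1, 'OldName');
    INSERT INTO revision VALUES
        (10, 1, 'OldName'), (11, 1, 'OldName'), (12, 0, '192.0.2.1');
""")


def names_via_join(conn):
    """Return {rev_id: name}, preferring user.user_name over rev_user_text."""
    rows = conn.execute("""
        SELECT rev_id, COALESCE(user_name, rev_user_text)
        FROM revision
        LEFT JOIN user ON rev_user != 0 AND user_id = rev_user
        ORDER BY rev_id
    """)
    return dict(rows.fetchall())


# The rename touches exactly one row in the user table.
conn.execute("UPDATE user SET user_name = 'NewName' WHERE user_id = 1")
renamed = names_via_join(conn)
```

After the UPDATE, `renamed` maps revisions 10 and 11 to the new name even though their rev_user_text values were never rewritten.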
Comment 1 Brion Vibber 2011-10-21 22:07:38 UTC
This may benefit from a tweak to internal APIs.

Revision::getUserText() / Revision::getRawUserText() currently pull from the rev_user_text field (unless it got overridden by a magic coalescy thingy in the row). This means that anything pulling its own queries may be missing the current names, as it'll be stuck with rev_user_text.

If joined columns from 'user' are available when initializing the Revision object from a row, then we should use that directly; but if not, we could do an on-demand lookup via the rev_user_id if it's non-zero (local user reference), or keep the rev_user_text if it's zero (usually IP, sometimes named non-local import markers).

With that in place, the worst case scenario should be that some batch queries might be missing the join and end up doing some more row-by-row lookups (they'll probably already be doing lots of those for user/talk page existence checks, so don't worry!)... but they'll show the correct results.

Might also think about a Revision::getUserObj() or something that would hand back a fully-ready User object, rather than having to cart around (id, text) pairs all the time.
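
The three-way fallback described in this comment could be sketched roughly as follows. This is Python rather than MediaWiki's PHP, and everything in it is hypothetical: the function name, the dict-like row, the key names (the user ID column is written rev_user here) and the lookup_name callback are assumptions for illustration, not the Revision class API.

```python
from typing import Callable, Mapping, Optional


def resolve_user_text(row: Mapping[str, object],
                      lookup_name: Callable[[int], Optional[str]]) -> str:
    """Pick the display name for a revision row.

    `row` is a hypothetical dict-like database row; `lookup_name` is a
    hypothetical callback that fetches user_name by ID (one query per call).
    """
    # 1. The query already joined the user table: use that value directly.
    joined = row.get("user_name")
    if joined is not None:
        return str(joined)
    # 2. Local account but no join: look the current name up on demand.
    user_id = int(row.get("rev_user") or 0)
    if user_id != 0:
        name = lookup_name(user_id)
        if name is not None:
            return name
    # 3. ID is zero (usually an IP, sometimes an import marker), or the
    #    lookup found nothing (an assumption of this sketch): keep stored text.
    return str(row["rev_user_text"])
```

Step 2 is the row-by-row cost mentioned above: callers that forget the join still get correct names, just with one extra lookup per row.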
Comment 2 Aaron Schulz 2011-10-21 22:09:52 UTC
(In reply to comment #1)
> If joined columns from 'user' are available when initializing the Revision
> object from a row, then we should use that directly; but if not, we could do an
> on-demand lookup via the rev_user_id if it's non-zero (local user reference),
> or keep the rev_user_text if it's zero (usually IP, sometimes named non-local
> import markers).

Note that the "magic coalescy thingy" was replaced with just checking user_name already ;)
Comment 3 Aaron Schulz 2011-10-21 23:29:51 UTC
(In reply to comment #1)
> With that in place, the worst case scenario should be that some batch queries
> might be missing the join and end up doing some more row-by-row lookups
> (they'll probably already be doing lots of those for user/talk page existence
> checks, so don't worry!)... but they'll show the correct results.
> 

Basically done in r100475.
Comment 4 Sumana Harihareswara 2011-12-01 21:52:24 UTC
Adding Yuvi to this bug since he said he'd take a look at this.
Comment 5 Aaron Schulz 2011-12-18 21:59:45 UTC
Still lots of places that need JOINs or, preferably, batch lookups.
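
For reference, a batch lookup in this sense might look like the sketch below: collect the user IDs from a page of rows, then resolve them with one IN query instead of one query per row. It uses Python's sqlite3 module against a toy user table; the function name and schema are assumptions for illustration, not existing MediaWiki code.

```python
import sqlite3
from typing import Dict, Iterable


def batch_user_names(conn: sqlite3.Connection,
                     user_ids: Iterable[int]) -> Dict[int, str]:
    """Fetch user_name for many user IDs with a single query."""
    # Deduplicate and drop 0, which marks rows without a local account.
    ids = sorted({uid for uid in user_ids if uid != 0})
    if not ids:
        return {}
    placeholders = ",".join("?" * len(ids))
    rows = conn.execute(
        f"SELECT user_id, user_name FROM user WHERE user_id IN ({placeholders})",
        ids,
    )
    return dict(rows.fetchall())
```

A caller would pass the rev_user values of all rows it is about to render and then read names out of the returned dict, falling back to the stored text for IDs that are zero or missing.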
Comment 6 Chris Steipp 2013-01-02 16:54:42 UTC
Do we have a list of these anywhere? We need to do renames in the very near future, and this would make it much easier.
Comment 7 Krinkle 2014-03-13 05:06:12 UTC
The places where MediaWiki core currently looks at a user_text column outside the user table are listed here:

* https://github.com/wikimedia/mediawiki-extensions-Renameuser/blob/REL1_22/RenameuserSQL.php#L67-L90
* https://github.com/wikimedia/mediawiki-extensions-Renameuser/blob/REL1_22/renameUserCleanup.php#L149-L155
* https://github.com/wikimedia/mediawiki-extensions-Renameuser/blob/REL1_22/RenameUserJob.php#L55-L92


As of writing:

* revision      . rev_user_text
* archive       . ar_user_text
* logging       . log_user_text
* image         . img_user_text
* oldimage      . oi_user_text
* filearchive   . fa_user_text
* recentchanges . rc_user_text
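
To make the fan-out concrete, the list above can be written as a small mapping, with one UPDATE generated per table. This is only a sketch in Python: the helper name is hypothetical, and matching rows on the old name is a simplification chosen for illustration, not necessarily how the linked Renameuser files build their queries.

```python
# Table -> denormalized name column, copied from the list above.
USER_TEXT_COLUMNS = {
    "revision": "rev_user_text",
    "archive": "ar_user_text",
    "logging": "log_user_text",
    "image": "img_user_text",
    "oldimage": "oi_user_text",
    "filearchive": "fa_user_text",
    "recentchanges": "rc_user_text",
}


def rename_statements(old_name, new_name):
    """Yield (sql, params) pairs, one UPDATE per denormalized column."""
    for table, column in USER_TEXT_COLUMNS.items():
        sql = f"UPDATE {table} SET {column} = ? WHERE {column} = ?"
        yield sql, (new_name, old_name)


statements = list(rename_statements("OldName", "NewName"))
```

Every entry in the mapping is one more table a rename has to rewrite, and on large tables that is the work that currently ends up in the job queue.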
Comment 8 Sam Reed (reedy) 2014-07-09 23:12:10 UTC
Extensions (especially WMF used ones) need auditing for this too...
