Last modified: 2012-12-08 08:56:07 UTC

Wikimedia Bugzilla is closed!

Wikimedia has migrated from Bugzilla to Phabricator. Bug reports should be created and updated in Wikimedia Phabricator instead. Please create an account in Phabricator and add your Bugzilla email address to it.
Wikimedia Bugzilla is read-only. If you try to edit or create any bug report in Bugzilla you will be shown an intentional error message.
In order to access the Phabricator task corresponding to a Bugzilla report, just remove "static-" from its URL.
You could still run searches in Bugzilla or access your list of votes but bug reports will obviously not be up-to-date in Bugzilla.
Bug 34508 - [Regression] IRC string output for log messages no longer compatible
[Regression] IRC string output for log messages no longer compatible
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Recent changes (Other open bugs)
1.19
All All
: High critical (vote)
: 1.19.0 release
Assigned To: Krinkle
: code-update-regression
Depends on:
Blocks: 16599 31217 42851
  Show dependency treegraph
 
Reported: 2012-02-19 06:53 UTC by Seahorse
Modified: 2012-12-08 08:56 UTC (History)
11 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
bug34508.1.patch (2.37 KB, patch)
2012-02-21 02:34 UTC, Rob Lanphier
Details
bug34508.2.patch (2.22 KB, patch)
2012-02-21 05:39 UTC, Rob Lanphier
Details
bug34508.3.patch (4.53 KB, patch)
2012-02-21 07:45 UTC, Rob Lanphier
Details

Description Seahorse 2012-02-19 06:53:49 UTC
A change in 1.19 to the irc.wikimedia.org recent changes feed has broken multiple bots and tools that rely on it. It appears that the change was made to log reports, but I am not sure exactly what was changed in the format. The Countervandalism network bots are failing to report log actions, and StewardBot is not properly reporting user right changes. I have heard of other tools being affected by this too. If this can be looked into, that would be great. Thanks.
Comment 1 p858snake 2012-02-19 07:05:17 UTC
If you could mention what has actually changed (before/after examples would also be helpful) would be great.
Comment 2 p858snake 2012-02-19 07:05:55 UTC
I'm going to guess this is due to the rewrite of how we manage our logging system on the MW end as well.
Comment 3 Rob Lanphier 2012-02-19 07:21:30 UTC
Can someone point to the web pages/maintainer information for at least one or two of the most important bots in question?
Comment 4 Rob Lanphier 2012-02-19 07:38:24 UTC
/me looks at http://countervandalism.net/wiki/List_of_bots
Comment 5 Krinkle 2012-02-19 07:44:53 UTC
Could be related to the special page name ("Log") being translated. Although that was supposed to be reverted and afaik was.

There is an API to translate special page names but right now bot maintainers don't know about this and it has worked for years.

The the CVN[1] this is breaking all after-log-action hooks they have (including those for reporting of local blocks and preventing cross-wiki vandalism). As well as our hooks for the patrol-action, delete-
action, upload-action (commons upload patrol!) on others.

- Krinkle

[1] https://meta.wikimedia.org/wiki/CVN
Comment 7 Rob Lanphier 2012-02-19 08:05:49 UTC
Krinkle, is r97711 the one you were thinking of?
Comment 8 Niklas Laxström 2012-02-19 09:48:48 UTC
The only difference I see are in the actual log message text, which comes from MediaWiki i18n files: the username is now prepended in the message, and the exact wording of some logs have changed.

If that is the case it's not a simple one line fix, either you revert the whole new logging system (which you can't, since it stores logs in format not compatible with 1.18), or write new code that only emulates the old log format output. Would this be a sign that you could help us to make proper replacement like XMPP?

Anyway, please confirm what is the actual cause, Krinkle already asked before this bug was reported.
Comment 9 Nemo 2012-02-19 09:55:13 UTC
(In reply to comment #5)
> Could be related to the special page name ("Log") being translated. Although
> that was supposed to be reverted and afaik was.

That was bug 30245 and it's reverted in 1.19, so the problem is elsewhere.
I've seen mention of section names and other things which might have changed, can't we possibly have a description of what is actually broken?

----

(I've seen Philippe, robla, Jamesofur, Pathoschild, Bsadowski1 and Tanvir discussing it on IRC, I don't know where they put the result of their investigation.)

This has been told to be the old format:
<Bsadowski1> <rc-pmtpa> [[Special:Log/block]] block  * MikeLynch *  blocked User:116.240.175.168 (anon. only, account creation blocked) with an expiry time of 24 hours: [[WP:Vandalism|Vandalism]]

And this appear to be the new one:
<Nikerabbit> [[Special:Log/block]] block  * Nemo bis *  Nemo bis blocked [[User:QoyuxeLetoho]] with an expiry time of indefinite (account creation disabled, e-mail blocked, cannot edit own  talk page): Spamming links to external sites
Comment 10 Nemo 2012-02-19 10:02:18 UTC
(In reply to comment #8)
> Would this be a sign that you could help us to make proper replacement
> like XMPP?

And this is [[m:Recentchanges via XMPP]] aka bug 17450.
Comment 11 Tim Starling 2012-02-19 23:26:29 UTC
(In reply to comment #8)
> The only difference I see are in the actual log message text, which comes from
> MediaWiki i18n files: the username is now prepended in the message, and the
> exact wording of some logs have changed.
> 
> If that is the case it's not a simple one line fix, either you revert the whole
> new logging system (which you can't, since it stores logs in format not
> compatible with 1.18), or write new code that only emulates the old log format
> output. 

Can't we just revert the messages?
Comment 12 Nemo 2012-02-19 23:30:42 UTC
(In reply to comment #11)
> Can't we just revert the messages?

Niklas said «or write new code that only emulates the old log format output» because apparently the two things (RC output and logging system) are currently bundled.
Comment 13 Tim Starling 2012-02-20 00:29:05 UTC
They're not that closely bundled. The $actionComment parameter to RecentChange::notifyLog() only exists to supply a comment for IRC, this was always the case. So all you have to do to fix it is to set the right thing for $actionComment, or to update the interpretation of it in RecentChange::getIRCLine() to account for the new system.

All of the logentry-* messages in English happen to start with "$1 ". The duplicated username in the IRC log line could be fixed by taking the log message and substituting the $1 with getIRCLine()'s formatted username, something like "\00303$user\003 \0035*\003 \00310". 

Any minor changes in the wording of the log entries can be reverted. As I said on bug 30245, I'm not opposed to localisation changes in the IRC line format, as long as sufficient notice is given to bot authors. That notice hasn't been given.
Comment 14 Nemo 2012-02-20 00:56:59 UTC
I think everyone would be happy with the revert by anyone who knows how to do it (also in trunk), if it kept the new log structure on wiki.
Comment 15 Rob Lanphier 2012-02-20 17:16:09 UTC
Krinkle has started looking into this, and found that log actions weren't making it from irc://irc.wikimedia.org/#meta.wikimedia to irc://irc.freenode.net/#cvn-meta.  At least move-log has changed in 1.19, and probably other actions as well.

This is being discussed on #wikimedia-dev as of this writing.
Comment 16 Krinkle 2012-02-20 17:52:14 UTC
(In reply to comment #7)
> Krinkle, is r97711 the one you were thinking of?

Indeed, but it appears that's not the problem (which would've been weird, as that rev has been reverted).

So after a little investigation just now with the CVNBots and robla and Nikerabbit on IRC, it appears we have good news, bad news and more good news.

Good news:
The format for IRC/UDP message notifications has not been changed. As such the bots are still reporting edits, just not log messages. The log messages are quite important though (for CVN especially: block, delete, upload, move and patrol).

Most bots do (and all bots should) support changes to these i18n messages. And they do. They support this by regularly emptying their cached version of the downloaded MediaWiki interface message used for the various log actions. They extract information from log actions (such as move-origin and move-target in a 'move' log action) like so (sample taken from the C# source of SWMTBot[1], [2], [3])

>    # Project:
>    generateRegex("MediaWiki:1movedto2_redir", 2, ref moveredirRegex, false);
>    # RCReader:
>    Match mrm = ((Project)Program.prjlist[rce.project]).rmoveredirRegex.Match(rce.comment);
>    rce.movedTo = Project.translateNamespace(rce.project, mrm.Groups["item2"].Captures[0].Value);

Bad news:
The messages didn't just change (since that would not break the bots as wikis sometimes change the message locally as well and the bots support i18n updates), but the message-keys themselves and the order and meaning of the $1, $2 parameters changed. And according to Nike this is pretty much unrevertable as it is a fairly crucial part of how the new logging system works.

More good news:
The IRC line is easily overwritten in the logging system. The plan is to build a simple extension that will hook into the code (this hook doesn't exist yet but is easily added) and overwrite the irc-line with the old method (using the old message key and value and the old variable replacements).
Comment 17 Nemo 2012-02-20 21:05:56 UTC
*** Bug 34495 has been marked as a duplicate of this bug. ***
Comment 18 Rob Lanphier 2012-02-21 02:34:53 UTC
Created attachment 10055 [details]
bug34508.1.patch

Nikerabbit posted this suggested framework here:
http://translatewiki.net/static/llogs.txt 

I've extended this with the attached patch, to see if it is a workable approach.  I've handled exactly one case first: move_redir, and I think I've got something that's identical to 1.18 in this case.
Comment 19 Rob Lanphier 2012-02-21 05:39:19 UTC
Created attachment 10056 [details]
bug34508.2.patch

bug34508.2.patch represents another attempt at fixing this problem, after talking with Tim on IRC.  Tim isn't a big fan of only going halfway with splitting out the IRC code into an extension; either move it all out, or figure out how to fix the core code.

So, I took a stab at fixing the core code.  This version is a bit naive about where else LogFormatter::getPlainActionText() is used.  So, this had the unintended side effect of also changing the format that shows up in page history (though not recent changes).

If one wanted to keep going on this approach, it wouldn't be hard to create LogFormatter::getIRCActionText() and adapt this patch accordingly, having LogEvent::publish() call that instead.

Someone is going to have to take one of these approaches the rest of the way to fixing all of the log messages.  I'm indifferent about which one, but we need to get this done prior to pushing to commons.
Comment 20 Rob Lanphier 2012-02-21 07:45:16 UTC
Created attachment 10057 [details]
bug34508.3.patch

This is another stab at the problem, this time with the getIRCActionText() approach described in a previous comment.  I started the process of actually stepping through the messages, but it turned out to be a little harder than I thought.  delete/delete as found in 1.18 needed to be constructed a little bit differently than the 1.19 version.  I got stuck on delete/restore, which looks like it's coming in from a different call path than what I'm expecting.  The other messages may be a bit easier.
Comment 21 Krinkle 2012-02-21 13:50:47 UTC
Just for the record: Path 2 and 3 as it currently is would still break most bots since the parameter numbers are still incompatible with the previous messages. Patch 1 does fix the bots (not tested though, but pretty sure it is).

The following is a list of things that MediaWiki does not provide to bots and as such is hardcoded:
* Log action name (e.g "move")
* Log message edit-summary message-key (e.g. "1movedto2_redir")
* Which variable replacements represent what (e.g. movedTo = $2)

Anything else is variable and the bots support changes to that. So contrary to what I think Tim is thinking, it's totally fine if the log message edit summary reads "TimStarling renamed awesome article [[Foo]] to [[Bar]]" instead of "moved [[Foo]] to [Bar]]".... as long as the i18n message that produces that is still stored at the same message-key and replaces the variables with the values.
Comment 22 Rob Lanphier 2012-02-21 17:24:56 UTC
(In reply to comment #21)
> Just for the record: Path 2 and 3 as it currently is would still break most
> bots since the parameter numbers are still incompatible with the previous
> messages. Patch 1 does fix the bots (not tested though, but pretty sure it is).

I see what you're saying about the old messages.  What I was thinking about doing was some semi-automated approach to taking the old messages and turning them into new messages.

> Anything else is variable and the bots support changes to that. So contrary to
> what I think Tim is thinking, it's totally fine if the log message edit summary
> reads "TimStarling renamed awesome article [[Foo]] to [[Bar]]" instead of
> "moved [[Foo]] to [Bar]]".... as long as the i18n message that produces that is
> still stored at the same message-key and replaces the variables with the
> values.

Well, if we're going to maintain backwards compatibility, let's do it up, especially since we're also talking about using old messages.

I think a hybrid approach of patch #1 and patch #3 might be the way to go here.  I was initially enthusiastic about a new hook, but I've come to realize we don't want to create a new hook in haste that we have to support in perpetuity.  It's better to butcher the core a little bit in 1.19, and then potentially create a proper "external notification" interface down the road that would be implemented by a LegacyLogs extension.
Comment 23 Brion Vibber 2012-02-21 18:45:59 UTC
I'm inclined to agree.

Patch #3 is a pretty direct "back compat hack" which is at least plainish to read.

Would love to have a cleaner extension interface for 1.20, where we can implement legacy IRC logging *and* saner machine interfaces like XMPP with structured data and no tiny line length limit.
Comment 24 Rob Lanphier 2012-02-22 03:07:28 UTC
This bug should be fixed by the set of revs merged into 1.19 as r112073.  There may be individual cases that may not work, still, but the framework is in place now to fix any problems fairly quickly.
Comment 25 Niklas Laxström 2012-02-22 09:19:07 UTC
Thanks guys for doing the dirty job!
Comment 26 Krinkle 2012-02-22 18:54:08 UTC
Summary on http://etherpad.wikimedia.org/IRCBot-Messages:

Still many broken actions.
Comment 27 Krinkle 2012-02-22 18:54:18 UTC
Progress: r112045 fixes action upload.
Comment 28 Krinkle 2012-02-22 18:55:29 UTC
Looks like 'patrol' is also having a big problem. May not be related to this bug, but the line in IRC is horribly broken: 

rc-pmtpa: [[Special:Log/patrol]] patrol * Tiptoety *  marked <!-- ERROR -->revision 3486813 of <a href="/wiki/Meta:Requests_for_adminship/mmovchin_(temp)" title="Meta:Requests for adminship/mmovchin (temp)">Meta:Requests for adminship/mmovchin (temp)</a> patrolled $3

* what's <!-- ERROR -->
* It's inserting HTML instead of a [[wikilink]]
* $3 is undefined
Comment 29 Rob Lanphier 2012-02-23 19:22:58 UTC
Still need to resolve the lingering issues, but bumping down from highest.
Comment 30 Antoine "hashar" Musso (WMF) 2012-02-25 20:43:17 UTC
I have spend sometime today figuring out issue mentioned by Krinkle in comment 28. And I could not reproduce the issue, all the code is fine.

The patrol issue was fixed by Aaron with r112079 and merged to live with r112130.

So it looks like we can finally mark this bug as fixed.
Comment 31 Antoine "hashar" Musso (WMF) 2012-02-25 21:23:22 UTC
The test suite I wrote today was added with r112415.
Comment 32 Rob Lanphier 2012-02-27 21:28:13 UTC
Per http://etherpad.wikimedia.org/IRCBot-Messages , there's a couple not working.  Antoine will investigate.
Comment 33 Krinkle 2012-02-28 03:04:23 UTC
r112561 fixes protect/protect and protect/modify

updated http://etherpad.wikimedia.org/IRCBot-Messages
Comment 34 Antoine "hashar" Musso (WMF) 2012-02-28 14:49:08 UTC
The EtherPad state that delete/restore does not work, the action is bound to 'undeletedmessage'. By looking at the bot source code and our hack in 1.19wmf1, I am not sure what might prevent it from running.

Trying to get some bot user to get a status about delete/restore and the newusers/create2 new users/autocreate actions as well.
Comment 35 Krinkle 2012-02-28 16:45:21 UTC
Hashar and I did a little more research, in no particular order:

* Changed 'delete/restore' from 'fails' to 'pending test', CVNBot doesn't monitor the restore event (apparently they want/need it in the countervandalism feed)
* Of the events on the list, CVNBot does not monitor delete/restore, newusers/create2 and  newusers/autocreate.
* There are no more events on the etherpad list marked as "FAIL", only "Works" and "pending test" (create2, autocreate, restore)
* It turns out 'autocreate' event doesn't get pushed to IRC at all (not to irc.wikimedia.org either), that's a separate bug, not related to this one
* Since our unit tests as passing (including for delete/restore) and we can't find bots monitoring delete/restore to test, marking this bug as fixed.
Comment 36 Antoine "hashar" Musso (WMF) 2012-02-28 16:48:44 UTC
Well done Timo!!

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links