Last modified: 2014-09-22 23:23:07 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T57377, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 55377 - API: Implement rctitles parameter in query=recentchanges
API: Implement rctitles parameter in query=recentchanges
Status: NEW
Product: MediaWiki
Classification: Unclassified
API (Other open bugs)
1.22.0
All All
: Low enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2013-10-06 18:58 UTC by Krinkle
Modified: 2014-09-22 23:23 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Krinkle 2013-10-06 18:58:35 UTC
See also:
* bug 12394 (make it easier to retrieve rcid of a newpage creation)
* bug 15552 (things like Special:RecentChangesLinked and in order to use a generator module with recentchanges)

This would enable:

* Getting rcid of an unpatrolled page creation from the API.
* Use recentchanges with a generator module.
* Use recentchanges with watch list (maybe?).
* Allow tools to mass-patrol all unpatrolled changes to a single page.



(bug 12394 comment #7)
> (bug 12394 comment #6)
> > (bug 12394 comment #5)
> > > However, I see in the current database structure there's a seperate column for
> > > rc_title.
> > > I'm not sure since when this exists, and/or if it was originally utilized, but
> > > when using that in the query (AND WHERE rc_title='Foobar') it'd be like any
> > > other condition currently in the recentchanges API right (same thing for
> > > rc_user, with $this->addWhereFld(); )
> > > 
> > The original implementation did do a WHERE on rc_namespace and rc_title, yes,
> > but that's not the same as doing a WHERE on rc_user because the latter is
> > indexed. Implementing this feature would require adding an index for it
> > (kinda
> > leery of that) and even then it'd have to sort by namespace, then title, then
> > timestamp in order to work efficiently.
> 
> There is a rc_namespace_title index though. But unlike the one for rc_user,
> it
> doesn't have rc_timestamp.
> 
> mediawiki-core@master:/maintenance/tables.sql:
>  INDEX rc_timestamp ON recentchanges (rc_timestamp);
>  INDEX rc_namespace_title ON recentchanges (rc_namespace, rc_title);
>  INDEX rc_cur_id ON recentchanges (rc_cur_id);
>  INDEX new_name_timestamp ON recentchanges
> (rc_new,rc_namespace,rc_timestamp);
>  INDEX rc_ip ON recentchanges (rc_ip);
>  INDEX rc_ns_usertext ON recentchanges (rc_namespace, rc_user_text);
>  INDEX rc_user_text ON recentchanges (rc_user_text, rc_timestamp);
> 
> 
> Would it make sense for rc_namespace_title to have it? I wonder what it is
> used
> for and if those uses would have a problem with the extra rc_timestamp sort.
> The default sort for rc_namespace_title is presumably rc_id which should have
> be very close to the sort order of rc_timestamp.
> 
> The main reason it needs rc_timestamp is not for the sort order, but to be
> able
> to do rcstart and rcend.
Comment 1 Krinkle 2013-10-12 01:26:23 UTC
@Roan: Are you really sure we need more indices for this? What about watchlist.

Watchlist uses the same recentchanges table with a major INNER JOIN on watchlist (as opposed to just a few "WHERE rc_title IN ( ..)").

Watchlist also does the rc_timestamp sort and range.

Of course, it is possible that this just proves the opposite (watchlist is also slow, but we can't remove it, and we can remove rctitles from the API to avoid it from getting worse).

Is that the case?
Comment 2 Brad Jorsch 2013-10-15 16:10:35 UTC
(In reply to comment #0)
> * Use recentchanges with a generator module.

To use it with a generator you'd need to use "titles" rather than "rctitles", and you'd probably also want to move the module to "prop=recentchanges" instead of "list=recentchanges" (breaking BC). At that point, adding a "prop=recentchanges" (which might need a different name) might make more sense.

(In reply to comment #1)
> Of course, it is possible that this just proves the opposite (watchlist is
> also slow, but we can't remove it, and we can remove rctitles from the API
> to avoid it from getting worse).

I think that's a good possibility. The query used for the watchlist filesorts, which is something we try to avoid in the API for performance reasons.

Is this why we have a special database query group just for the watchlist?
Comment 3 Krinkle 2013-10-28 09:43:42 UTC
Also, in addition to Watchlist, there is also RecentchangesLinked which also essentially does this already.
Comment 4 Krinkle 2013-10-28 10:04:59 UTC
(In reply to comment #2)
> (In reply to comment #0)
> > * Use recentchanges with a generator module.
> 
> To use it with a generator you'd need to use "titles" rather than "rctitles",
> and you'd probably also want to move the module to "prop=recentchanges"
> instead of "list=recentchanges"..

Hm.. right, our generator model requires a separate module if the titles filter is optional (e.g. prop=categories requires titles=, so you can use a generator module as input, but prop=categories itself can also be a generator to another module). 

Recentchanges is already a generator module (it can be a provider of titles to another query module), but can't yet take titles= or a generator as input. And since we must keep titles= as an optional param, the filtered version of it has to be a (separate) prop module.
Comment 5 Gerrit Notification Bot 2013-10-30 16:08:37 UTC
Change 92654 had a related patch set uploaded by Krinkle:
[WIP] recentchanges: Implement support for pagesets

https://gerrit.wikimedia.org/r/92654
Comment 6 Krinkle 2013-10-30 18:25:45 UTC
@Brad: Using a prop module might make sense, but the problem there is usability.

From the application's perspective, title is an arbitrary filter, just like 'user', 'namespace' or 'tag'. It would be a pain if one would have to radically change the query just to use the title filter.

However if the indexes support it, we might as well do both. I don't see why we wouldn't implement a 'rctitles'. I'm rephrasing the bug back to how it was. In this case the main purpose (and why I filed the bug) is to be able to filter by title on the recentchanges query.

Another problem is that (afaik) prop wouldn't work on pages that have been renamed or wear a different title now. e.g. it wouldn't be querying rc_title/rc_namespace but page_namespace/page_title/page_id, and rc_namespace/rc_title, which is not going to work as it will miss edits made before and between renames of a page.

We'd have to map it to rc_cur_id inside the prop query.

Anyway, builing a prop module for getting recent changes made to one or more pages is nice, but not what this bug is asking for (plus, it'd need an extra index).
Comment 7 Gerrit Notification Bot 2014-05-28 12:35:53 UTC
Change 92654 abandoned by Krinkle:
[WIP] recentchanges: Implement support for pagesets

Reason:
Meh, no longer cool. Going to use rcstream instead with a live page filter on the client-side.. Can't wait for the new API to happen with the app I'm working on.

https://gerrit.wikimedia.org/r/92654

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links