Last modified: 2009-12-31 01:38:59 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T16345, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 14345 - Feature request: Extend access to the API oldreviewedpages interface
Feature request: Extend access to the API oldreviewedpages interface
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
FlaggedRevs (Other open bugs)
unspecified
All All
: Normal enhancement (vote)
: ---
Assigned To: Aaron Schulz
: patch-need-review
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2008-05-30 13:18 UTC by Guandalug
Modified: 2009-12-31 01:38 UTC (History)
6 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Patch to add three basic API modules for FlaggedRevs (14.61 KB, patch)
2008-09-17 17:13 UTC, P.Copp
Details
correct version of the patch (14.74 KB, patch)
2008-09-17 17:29 UTC, P.Copp
Details
New patch now uses api hooks (14.08 KB, patch)
2008-09-17 23:53 UTC, P.Copp
Details
Patch with hook functions in own class / code tweaks (14.34 KB, patch)
2008-09-18 17:28 UTC, P.Copp
Details
added `flagged_level_text` output field, changed namespace param (14.50 KB, patch)
2008-10-07 11:30 UTC, P.Copp
Details

Description Guandalug 2008-05-30 13:18:46 UTC
I'd like to have access to the "flagged revisions" data via API, not only via Special: - pages. This is especially interesting for Special:OldReviewedPages and the review - status / changes of the pages therein (e.g. to gather those files only 'unreviewed' due to a template change).
Comment 1 Aaron Schulz 2008-05-30 13:20:29 UTC
Pages that are unreviewed due to template changes are not listed there, though that may be desirable.
Comment 2 Guandalug 2008-05-30 13:32:26 UTC
I had pages like those, where the only 'change' was a modified template. Of course, now that I SEARCH for such a page, there's none.... 
Comment 3 Roan Kattouw 2008-08-06 11:00:47 UTC
Changing component to FlaggedRevs. Since it's an extension, it's up to its authors to write an API module for it.
Comment 4 Aaron Schulz 2008-08-09 15:56:13 UTC
I'll wait till there is some sort of specific use for this.
Comment 5 P.Copp 2008-09-17 17:13:36 UTC
Created attachment 5341 [details]
Patch to add three basic API modules for FlaggedRevs

I believe this would be a big help to the bot owners and developers at those projects, that use the FlaggedRevs-extension. So I started and wrote three basic modules for querying the most important information about flagging status of pages:

1. ApiQueryOldreviewedpages / 'list=oldreviewedpages' : Similar to [[Special:OldReviewedPages]], list out the outdated pages, sorted by timestamp

2. ApiQueryFlagged / 'prop=flagged' : Get information about the flagging status of a whole page, such as latest stable revid, if there's a quality revision etc.

3. ApiQueryRevisionsFR / 'prop=revisions&rvprop=flagged' : extends the 'prop=revisions' module to show the flagging information about specific revisions.

I have tested this patch on my local install and had no problems with it. So I would be glad if someone would be willing to review/comment on it.

A simple testcase, that uses all 3 modules is the query:

api.php?action=query&generator=oldreviewedpages&gorlimit=1&prop=revisions|info|flagged&rvprop=ids|user|timestamp|flagged&rvlimit=50
Comment 6 P.Copp 2008-09-17 17:29:03 UTC
Created attachment 5342 [details]
correct version of the patch
Comment 7 Bryan Tong Minh 2008-09-17 17:39:07 UTC
FlaggedRevs has some sane backends I believe which should be used rather than querying the database directly.
Comment 8 P.Copp 2008-09-17 17:53:43 UTC
(In reply to comment #7)
> FlaggedRevs has some sane backends I believe which should be used rather than
> querying the database directly.
> 
I'm not sure, but I guess that would result in one database query per page/revision. Wouldn't that be slower compared to querying all pages/revisions in one batch?
Comment 9 Aaron Schulz 2008-09-17 18:02:50 UTC
Overriding prop=revisions seems to duplicate a lot of code and threaten core changes there.
Comment 10 P.Copp 2008-09-17 18:07:43 UTC
(In reply to comment #9)
> Overriding prop=revisions seems to duplicate a lot of code and threaten core
> changes there.
> 

Well, ApiQueryRevisionsFR uses ApiQueryRevisions as base class, so it does not duplicate the code and should be rather safe with respect to core changes. But you're probably right that it's not a perfect solution. Would it be an option to introduce some hooks into ApiQueryRevisions?
Comment 11 Roan Kattouw 2008-09-17 18:31:18 UTC
(In reply to comment #10)
> (In reply to comment #9)
> > Overriding prop=revisions seems to duplicate a lot of code and threaten core
> > changes there.
> > 
> 
> Well, ApiQueryRevisionsFR uses ApiQueryRevisions as base class, so it does not
> duplicate the code and should be rather safe with respect to core changes. But
> you're probably right that it's not a perfect solution. Would it be an option
> to introduce some hooks into ApiQueryRevisions?
> 
There is a hook in MW 1.14 which allows for adding parameters cleanly. The patch doesn't duplicate code, because it overrides execute() and calls parent::execute() first. Figuring out what the queried titles were from the ApiResult is ugly, there's ApiPageSet for that (which you did use in ApiQueryFlagged).

Come to think of it, we should probably have hacks in each execute() function instead of letting extensions override core modules just to add some functionality: the way things currently work makes it impossible for two extensions to extend the same core module.

As to the rest of the patch:
* Adding fp_pending_since IS NOT NULL is kind of redundant if one of $params['start'] and $params['end'] is set, because in that case addWhereRange() will already have added fp_pending_since < 'foo' (or >) to the WHERE clause
* Why on earth does the flaggedpages have (fp_page_id, fp_rev_id) as a primary key while fp_rev_id is already unique?
* You shouldn't hardcode AND and OR the way you did, Database::makeList() can build all this for you
* The default value for ordir should be older, not newer, for consistency with other modules (and because it's the most common use case)
* ApiQueryOldreviewpages::getAllowedParams() correctly accounts for !$wgFlaggedRevsNamespaces when setting the default value, but not when setting the array of allowed types (if !$wgFlaggedRevsNamespaces, that array should be array(0)). Also, strval() shouldn't be used; namespaces in API parameters should always be numeric
* Instead of using just 0, 1 and 2 for flagged_level and explaining them in the help text, add a flagged_level_text field
* In ApiQueryFlagged, don't call addValue() three times; instead, throw your stuff in an array and addValue() it in the end, like you did in ApiQueryOldreviewedpages
* If you've got an array containing pageIDs, don't call it $titles , that's kind of confusing

I'll improve the patch when I have time.
Comment 12 Aaron Schulz 2008-09-17 18:43:31 UTC
I don't like how ApiQueryRevisionsFR does it's iteration to find what rev IDs to fetch and then does a second query on that. I'd prefer the queries be combined and there be only one such iteration.

A note about 'flaggedpages' is that it's PRIMARY (fp_page_id, fp_rev_id) serves the purpose of both fetching single revision data, helping selection data for pages, and making page lists, all without any secondary look-ups. Having just a rev_id INDEX would still require a (page,rev) index for some queries, which would take up more space, overhead to maintain, and have secondary look-ups.
Comment 13 Aaron Schulz 2008-09-17 18:46:42 UTC
'flaggedrevs' I mean, not 'flaggedpages'
Comment 14 P.Copp 2008-09-17 19:08:04 UTC
(In reply to comment #11)
Thanks for your comments, I try to make an new patch with the issues fixed, soon.

A few remarks:

> Figuring out what the queried titles were from the
> ApiResult is ugly, there's ApiPageSet for that (which you did use in
> ApiQueryFlagged).
ApiPageSet won't work here, b/c it doesn't contain all the revids of the result, if a range of revisions was queried (e.g. with startid/endid).
 
> * The default value for ordir should be older, not newer, for consistency with
> other modules (and because it's the most common use case)
This module is some kind of "the other way round", as usually users want to look at the oldest unreviewed changes. That's why I changed the default value here.

> Also, strval() shouldn't be used; namespaces in API parameters
> should always be numeric
I have a problem here, cause when I don't use strval, it doesn't recognize the namespace 0. (Probably since 0==null). Is there a way to avoid this?
Comment 15 Roan Kattouw 2008-09-17 20:43:01 UTC
(In reply to comment #14)
> (In reply to comment #11)
> Thanks for your comments, I try to make an new patch with the issues fixed,
> soon.
> 
> A few remarks:
> 
> > Figuring out what the queried titles were from the
> > ApiResult is ugly, there's ApiPageSet for that (which you did use in
> > ApiQueryFlagged).
> ApiPageSet won't work here, b/c it doesn't contain all the revids of the
> result, if a range of revisions was queried (e.g. with startid/endid).
Right, forgot about that.

> > * The default value for ordir should be older, not newer, for consistency with
> > other modules (and because it's the most common use case)
> This module is some kind of "the other way round", as usually users want to
> look at the oldest unreviewed changes. That's why I changed the default value
> here.
Well if it makes sense in the context of FlaggedRevs, and Aaron agrees with it, so be it.

> > Also, strval() shouldn't be used; namespaces in API parameters
> > should always be numeric
> I have a problem here, cause when I don't use strval, it doesn't recognize the
> namespace 0. (Probably since 0==null). Is there a way to avoid this?
>
You could avoid this by setting PARAM_TYPE => 'namespace'. That would also avoid a potential bug that would hide stuff when namespaces are removed from $wgFlaggedRevsNamespaces.

Comment 16 Roan Kattouw 2008-09-17 20:46:30 UTC
(In reply to comment #12)
> I don't like how ApiQueryRevisionsFR does it's iteration to find what rev IDs
> to fetch and then does a second query on that. I'd prefer the queries be
> combined and there be only one such iteration.
I don't see two queries. He's iterating over the result of prop=revisions first, then running one query. Combining the queries would be slightly more ideal, but would result in hooks all over the place and very ugly code.

> A note about 'flaggedpages' is that it's PRIMARY (fp_page_id, fp_rev_id) serves
> the purpose of both fetching single revision data, helping selection data for
> pages, and making page lists, all without any secondary look-ups. Having just a
> rev_id INDEX would still require a (page,rev) index for some queries, which
> would take up more space, overhead to maintain, and have secondary look-ups.
> 

Is there an index on just (fp_rev_id) as well then? That would be useful in this case.

(In reply to comment #11)
> Come to think of it, we should probably have hacks
I meant hooks, of course.
> in each execute() function
> instead of letting extensions override core modules just to add some
> functionality: the way things currently work makes it impossible for two
> extensions to extend the same core module.
>
Added them in r40965.
Comment 17 P.Copp 2008-09-17 23:53:47 UTC
Created attachment 5343 [details]
New patch now uses api hooks

I tried to fix most of the issues you mentioned above. Instead of the overriding revision module, now the new hooks are used to add the revisions' flagging data.

It seems to work this way on my install, though there's perhaps one thing that could be fixed:
As the DB functions in ApiBase/ApiQueryBase are protected, I can't use them in the hook function. So I had to make another call to wfGetDB(). Don't know, if that makes any difference?
Comment 18 Aaron Schulz 2008-09-17 23:56:31 UTC
wfGetDB() calls in these cases tend to be very minor of an issue
Comment 19 Roan Kattouw 2008-09-18 12:51:20 UTC
(In reply to comment #17)
> Created an attachment (id=5343) [details]
> New patch now uses api hooks
> 
> I tried to fix most of the issues you mentioned above. Instead of the
> overriding revision module, now the new hooks are used to add the revisions'
> flagging data.
> 
> It seems to work this way on my install, though there's perhaps one thing that
> could be fixed:
> As the DB functions in ApiBase/ApiQueryBase are protected, I can't use them in
> the hook function. So I had to make another call to wfGetDB(). Don't know, if
> that makes any difference?
> 

Like Aaron said, wfGetDB() isn't the end of the world. You can get around this and other stuff by putting your hook function in a class that extends ApiQueryBase. If you do that, you can also use addTables() and friends.

* Isn't if ($pageid !== "_element" && isset($page['revisions'])) kind of redundant? _element elements are never arrays, so isset($page['revisions']) should be enough. Same for isset($rev['revid'])
* You don't need $singlePageid, array_values($revids) works just as well as array_keys() (which you do use)
* In fact, the special case of having one revid shouldn't be a special case at all: IIRC, makeList() knows how to handle one-element arrays correctly
* You probably want to improve readability a little bit by moving up the fr_user=user_id to right after the $fields assignment (although it's not that big a deal) and by commenting on how the $revids array works (it's array(revid => array(pageid, index)), and it doesn't become clear what the index is needed for until about 50 lines later)

Comment 20 P.Copp 2008-09-18 17:28:14 UTC
Created attachment 5345 [details]
Patch with hook functions in own class / code tweaks

> Like Aaron said, wfGetDB() isn't the end of the world. You can get around this
> and other stuff by putting your hook function in a class that extends
> ApiQueryBase. If you do that, you can also use addTables() and friends.
Done. Thanks for the hint :)

> * Isn't if ($pageid !== "_element" && isset($page['revisions'])) kind of
> redundant? _element elements are never arrays, so isset($page['revisions'])
> should be enough. Same for isset($rev['revid'])
Er, if $page is a string, PHP converts $page['revisions'] to $page[0], so isset(...) is always true. I tried to change the line to ´´if (array_key_exists('revisions',(array)$page))'', which seems to work, now.

> * You don't need $singlePageid, array_values($revids) works just as well as
> array_keys() (which you do use)
> * In fact, the special case of having one revid shouldn't be a special case at
> all: IIRC, makeList() knows how to handle one-element arrays correctly
> * You probably want to improve readability a little bit by moving up the
> fr_user=user_id to right after the $fields assignment (although it's not that
> big a deal) and by commenting on how the $revids array works (it's array(revid
> => array(pageid, index)), and it doesn't become clear what the index is needed
> for until about 50 lines later)
Right, I simplified the loop thing a bit, so it should be easier to read, now.
Comment 21 Roan Kattouw 2008-09-18 19:11:08 UTC
(In reply to comment #20)
> Created an attachment (id=5345) [details]
> Patch with hook functions in own class / code tweaks
> 
> > Like Aaron said, wfGetDB() isn't the end of the world. You can get around this
> > and other stuff by putting your hook function in a class that extends
> > ApiQueryBase. If you do that, you can also use addTables() and friends.
> Done. Thanks for the hint :)
Oh wait, actually subclassing ApiQueryBase doesn't help at all here. I don't know what I was smoking when I wrote that. addTables() and friends seem to be public.

Comment 22 Roan Kattouw 2008-09-18 20:48:14 UTC
The patch as it is looks good, aside from two minor things:
* Making FlaggedRevsApiHooks extend ApiQueryBase turns out not to do any good after all (my mistake)
* Using $wgFlaggedRevsNamespaces for ornamespace's PARAM_TYPE doesn't seem like a good idea to me; I think you'd be better off using PARAM_TYPE => 'namespace', because that'll let users query namespaces that used to have FlaggedRevs enabled but don't anymore

If these two issues are fixed, the patch is good enough IMO. Of course Aaron has to greenlight it as well.
Comment 23 P.Copp 2008-10-07 11:30:28 UTC
Created attachment 5398 [details]
added `flagged_level_text` output field, changed namespace param

> * Making FlaggedRevsApiHooks extend ApiQueryBase turns out not to do any good
> after all (my mistake)
I think your first thought here was the right one: as it seems I can only access ApiQueryBase' protected functions, if I call them from within a subclass of it.
> * Using $wgFlaggedRevsNamespaces for ornamespace's PARAM_TYPE doesn't seem like
> a good idea to me; I think you'd be better off using PARAM_TYPE => 'namespace',
> because that'll let users query namespaces that used to have FlaggedRevs
> enabled but don't anymore
Changed it to parameter type 'namespace' and also made it a multivalue parameter, perhaps someone might want to query more than one ns at at time..

I also added 'flagged_level_text' fields in the output as you suggested above.
Comment 24 Roan Kattouw 2008-10-07 13:05:14 UTC
(In reply to comment #23)
> Created an attachment (id=5398) [details]
> added `flagged_level_text` output field, changed namespace param
> 
> > * Making FlaggedRevsApiHooks extend ApiQueryBase turns out not to do any good
> > after all (my mistake)
> I think your first thought here was the right one: as it seems I can only
> access ApiQueryBase' protected functions, if I call them from within a subclass
> of it.

Really? Does removing 'extends ApiQueryBase' from FlaggedRevsApiHooks break stuff? That surprises me.

I'll take a closer look at this patch in a moment, and I'll probably apply it today.
Comment 25 Roan Kattouw 2008-10-07 14:28:12 UTC
Applied in r41809.
Comment 26 P.Copp 2008-10-07 15:50:52 UTC
w00t. Thanks a lot! Hope it won't get reverted too soon ;)
Comment 27 Giftpflanze 2009-11-27 06:09:07 UTC
I’d like also to have access to the ‘size’ and ‘watched’ GET parameters of the special:-page in the flagged revisions API when retrieving oldreviewedpages.
Comment 28 Aaron Schulz 2009-11-28 23:01:56 UTC
Max size added in r59533.
Comment 29 Giftpflanze 2009-11-29 04:07:10 UTC
I forgot filtering by ‘category’. (By the way: Is it possible to make filtering by category recursive (by providing the max recursing depth of course). Would it consume to much resources?)
Comment 30 Aaron Schulz 2009-12-23 05:32:44 UTC
It won't be recursive.
Comment 31 Aaron Schulz 2009-12-31 01:38:59 UTC
Done in r60520

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links