Last modified: 2011-11-15 21:13:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T19784, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 17784 - API Call to do SemanticQueries
API Call to do SemanticQueries
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Semantic MediaWiki (Other open bugs)
unspecified
All All
: Normal enhancement (vote)
: ---
Assigned To: Markus Krötzsch
: patch, patch-need-review
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2009-03-04 12:59 UTC by David Tabachnikov
Modified: 2011-11-15 21:13 UTC (History)
2 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Add "Ask" to the MediaWiki API. (7.89 KB, patch)
2009-03-04 12:59 UTC, David Tabachnikov
Details
Ask API for Semantic MediaWiki, fixing some of Roan's comments. (8.35 KB, patch)
2009-03-05 17:39 UTC, David Tabachnikov
Details

Description David Tabachnikov 2009-03-04 12:59:31 UTC
Created attachment 5885 [details]
Add "Ask" to the MediaWiki API.

There's no way to make semantic queries about the Wiki from the outside. The current patch allows to make SMW queries through the API, (action=query&list=ask), and specify various sorting options. Future additions could be to add various fields to the result (?something).
Comment 1 Roan Kattouw 2009-03-04 15:15:44 UTC
>+	private $prefix, $module, $description;
All three of these are superfluous: $description isn't even used, $prefix is only used once in the constructor and can be eliminated easily, and every usage of $this->module should be replaced by $this->getModuleName().

>+	public function execute() {
>+			$this->run();		
>+	}
>+	
>+	/**
>+	 * Entry point when coming from a generator
>+	 */
>+	public function executeGenerator($resultPageSet) {
>+		$this->resultPageSet = $resultPageSet;
>+		$this->run();
>+	}
>...
>+	private function run() {	
This strikes me as odd, since you could've just renamed run() to execute() here. The way the core API modules do this is with function run($resultPageSet = null), with execute() and executeGenerator() wrapping around them. (Letting execute() take an optional $resultPageSet parameter isn't allowed, since it's reimplemented from ApiBase which declares it with an empty parameter list.)

>+	private function returnCount($res) {
>+		$data = array("count" => $res);
>+		
>+		$result = $this->getResult();
>+		$result->setIndexedTagName($data, 'count');
>+		$result->addValue('query', $this->getModuleName(), $data);		
>+	}
This looks horribly broken; if I understand correctly, this'll try to put an object in the result, which leads to nasty errors and is not what you want. You probably want something like intval($res->getCount()) (dunno how SMW's result objects work exactly), in which case you also don't need the setIndexedTagName() call because $data doesn't have numerical indices.

>+						$data[] = array(
>+							'pageid' => intval($object->getArticleID()),
>+							'ns' => intval($title->getNamespace()),
>+							'title' => $title->getPrefixedText());
>...
>+			$result->addValue('query', $this->getModuleName(), $data);
Since a recent breaking change in the API (r46845), your code should be prepared to handle the case in which $data doesn't fit into the result object, in which case addValue() will return false. This means you have to add the results one by one and check addValue()'s return value, like this:

$vals = array(
	'pageid' => intval($object->getArticleID()),
	'ns' => intval($title->getNamespace()),
	'title' => $title->getPrefixedText());
$fit = $result->addValue(array('query', $this->getModuleName()), null, $vals);
if(!$fit) { set a query-continue and break out of the loop }
...
// And after the loop:
$result->setIndexedTagName_internal(array('query', $this->getModuleName()), 'page');

Note that this code only works with MediaWiki 1.15alpha; 1.14 and earlier versions don't have the setIndexedTagName_internal() method.

>+			$this->resultPageSet->populateFromPageIDs($wikiPageIDs);
Since you're not filling $wikiPageIDs anywhere, generator mode is also completely broken.

>+	private function parseQuery($query) {
>+		global $wgParser;
>+
>+		// Initialize the parser options
>+		$options = new ParserOptions();
>+		
>+		// Preprocess the query, before running
>+		$parsedQuery = $wgParser->transformMsg($query, $options);
>+
>+		// Return the parsed query
>+		return $parsedQuery;
>+	}
This could be shortened drastically to just return $wgParser->transformMsg($query, $options); . You should use $wgParser->preprocess() instead, though, since transformMsg() uses $wgTitle (which is null, which'll probably cause nasty errors).


>+		global $wgRequest, $wgOut, $smwgQEnabled, $smwgQMaxLimit, $wgUser, $smwgQSortingSupport;
All of these globals are unused.

>+		if ($params['_use_parser'] == true) {
Just using if($params['_use_parser']) will do; the API will ensure that $params['_use_parser'] is a boolean.

>+	protected function getLinker($firstcol = false) {
>+		if ( ($firstcol && $this->mLinkFirst) || (!$firstcol && $this->mLinkOthers) ) {
>+			return $this->mLinker;
>+		} else {
>+			return NULL;
>+		}
>+	}
This function isn't called from anywhere and uses unset member variables. I guess it can be removed safely.


>+	public function getAllowedParams() {
>+		return array(
>+			"_query" => array(
I don't get why underscores are used here. If you really want every parameter to look like &ask_foo, you should set your prefix to "ask_" instead of "ask"; parameter names with underscores in them aren't used in the core API, though.

>+			"_sort" => array(
>+				ApiBase :: PARAM_TYPE => "string",
>+			),
You can use ApiBase::PARAM_ISMULTI => true here to allow for multiple values separated with pipes (|), which is how multivalue fields are usually done in the API. The API parameter handling code will ensure that $params['_sort'] is an array rather than a pipe- (or comma-)separated list in that case. Also note that 'string' is the default parameter type, you don't have to set it explicitly.

>+			"_sort_order" => array(
>+				ApiBase :: PARAM_TYPE => "string"	
>+			),
If the only values allowed are ASC and DESC, you'd probably be better off with:
			'_sort_order' => array(
				ApiBase :: PARAM_TYPE => array(
					'asc',
					'desc'
				),
				ApiBase :: PARAM_ISMULTI => true,
				ApiBase :: PARAM_ALLOW_DUPLICATES => true
			),
Which'll do the same thing as I mentioned above, except that only the values 'asc' and 'desc' are accepted and that duplicate values are allowed.

>+			"_limit" => array(
>+				ApiBase :: PARAM_TYPE => "integer",
>+				ApiBase :: PARAM_DFLT  => "20"
>+			),
There is a special limit type that lets you define different maxima for users and sysops/bots and supports limit=max (this one doesn't even *have* a maximum, which is evil). That'd look like this:
			'limit' => array (
				ApiBase :: PARAM_TYPE => 'limit',
				ApiBase :: PARAM_DFLT => 20,
				ApiBase :: PARAM_MIN => 1,
				ApiBase :: PARAM_MAX => ApiBase :: LIMIT_BIG1,
				ApiBase :: PARAM_MAX2 => ApiBase :: LIMIT_BIG2
			),
Note that this uses integers instead of strings for DLFT and MIN (because that makes sense), and uses the constants ApiBase::LIMIT_BIG1 and BIG2 which have the values 500 and 5000 respectively. If SMW needs lower maxima, use those, of course.

>+			"_offset" => array(
>+				ApiBase :: PARAM_TYPE => "integer",
>+				ApiBase :: PARAM_DFLT  => "0"
>+			),
This can be done with '_offset' => 0, as well which'll declare _offset to be an integer defaulting to 0. You'll need the array syntax if you want more advanced stuff like min and max, of course.

>+			"_use_parser" => array(
>+				ApiBase::PARAM_TYPE => "boolean",
>+				ApiBase::PARAM_DFLT => false
>+			),
Similarly, this can be done (and is usually done) with '_use_parser_' => false,

>+			"_count" => array(
>+				ApiBase :: PARAM_TYPE => "boolean"
>+			),
Same here. I don't see why you didn't give this one a default value while _use_parser does have one, but either way the default for a boolean parameter can't be anything else than false.

Also note that all of this stuff (defaults, min/max, pipe separation) doesn't need to be mentioned in the parameter descriptions: the API automatically adds this information.
Comment 2 David Tabachnikov 2009-03-05 17:38:43 UTC
(In reply to comment #1)

The patch I submitted was a refactored version of a custom API call I made for our special use case. 
Some of the problems you talk about come from that. Sorry for that, I shuold've checked it more 
throuhgly before submitting.

> >+	private $prefix, $module, $description;
> All three of these are superfluous: $description isn't even used, $prefix is
> only used once in the constructor and can be eliminated easily, and every usage
> of $this->module should be replaced by $this->getModuleName().
> 
Removed $module and $description, but left $prefix, so our custom API call could override it.

> >+	private function returnCount($res) {
> >+		$data = array("count" => $res);
> >+		
> >+		$result = $this->getResult();
> >+		$result->setIndexedTagName($data, 'count');
> >+		$result->addValue('query', $this->getModuleName(), $data);		
> >+	}
> This looks horribly broken; if I understand correctly, this'll try to put an
> object in the result, which leads to nasty errors and is not what you want. You
> probably want something like intval($res->getCount()) (dunno how SMW's result
> objects work exactly), in which case you also don't need the
> setIndexedTagName() call because $data doesn't have numerical indices.
> 
Actually, I checked that, and what happens, is that when the Semantic MediaWiki 
processor runs in "count" mode, the resutl of getQueryResult is an string with the 
result of the count, not a SMWQueryResult object. Might be broken - but it's broken
inside SMW.

> >+						$data[] = array(
> >+							'pageid' => intval($object->getArticleID()),
> >+							'ns' => intval($title->getNamespace()),
> >+							'title' => $title->getPrefixedText());
> >...
> >+			$result->addValue('query', $this->getModuleName(), $data);
> Since a recent breaking change in the API (r46845), your code should be
> prepared to handle the case in which $data doesn't fit into the result object,
> in which case addValue() will return false. This means you have to add the
> results one by one and check addValue()'s return value, like this:
> 
> $vals = array(
>         'pageid' => intval($object->getArticleID()),
>         'ns' => intval($title->getNamespace()),
>         'title' => $title->getPrefixedText());
> $fit = $result->addValue(array('query', $this->getModuleName()), null, $vals);
> if(!$fit) { set a query-continue and break out of the loop }
> ...
> // And after the loop:
> $result->setIndexedTagName_internal(array('query', $this->getModuleName()),
> 'page');
> 
> Note that this code only works with MediaWiki 1.15alpha; 1.14 and earlier
> versions don't have the setIndexedTagName_internal() method.
> 
I need this code to work in 1.14 and eaerlier (we currently use 1.13.5). I'd really appreciate
any points on how to properly write it so it will work in both 1.13 and 1.15?

It could be really great if you could give me some pointers as to how to do this properly.

> >+	private function parseQuery($query) {
> >+		global $wgParser;
> >+
> >+		// Initialize the parser options
> >+		$options = new ParserOptions();
> >+		
> >+		// Preprocess the query, before running
> >+		$parsedQuery = $wgParser->transformMsg($query, $options);
> >+
> >+		// Return the parsed query
> >+		return $parsedQuery;
> >+	}
> This could be shortened drastically to just return
> $wgParser->transformMsg($query, $options); . You should use
> $wgParser->preprocess() instead, though, since transformMsg() uses $wgTitle
> (which is null, which'll probably cause nasty errors).
> 
From what I've read in the comments, I'm supposed to use transformMsg from the outside of the parser. 
And without ParserOptions, the function doesn't work. Also, preprocess() expects a $title as well.
Also, changed it to be protected - I think it's good that parseQuery is overridable.

> >+	public function getAllowedParams() {
> >+		return array(
> >+			"_query" => array(
> I don't get why underscores are used here. If you really want every parameter
> to look like &ask_foo, you should set your prefix to "ask_" instead of "ask";
> parameter names with underscores in them aren't used in the core API, though.

I changed the prefix instead, indeed. I think asksortorder doesn't look as good 
- but if it's against the naming convention I can change that.
Comment 3 David Tabachnikov 2009-03-05 17:39:56 UTC
Created attachment 5892 [details]
Ask API for Semantic MediaWiki, fixing some of Roan's comments.
Comment 4 Roan Kattouw 2009-03-10 18:32:38 UTC
(In reply to comment #2)
> Actually, I checked that, and what happens, is that when the Semantic MediaWiki 
> processor runs in "count" mode, the resutl of getQueryResult is an string with
> the 
> result of the count, not a SMWQueryResult object. Might be broken - but it's
> broken
> inside SMW.
> 
Right... So much for intuitive behavior :)


> > Note that this code only works with MediaWiki 1.15alpha; 1.14 and earlier
> > versions don't have the setIndexedTagName_internal() method.
> > 
> I need this code to work in 1.14 and eaerlier (we currently use 1.13.5). I'd
> really appreciate
> any points on how to properly write it so it will work in both 1.13 and 1.15?
> 
> It could be really great if you could give me some pointers as to how to do
> this properly.
> 
You probably want to check that with method_exists($this->getResult(), 'setIndexedTagName_internal') . If that returns true, you can (and should) use the 'new' way of adding results.

> > You should use
> > $wgParser->preprocess() instead, though, since transformMsg() uses $wgTitle
> > (which is null, which'll probably cause nasty errors).
> > 
> From what I've read in the comments, I'm supposed to use transformMsg from the
> outside of the parser.
preprocess() can also be called from outside the parser, AFAICT.

> And without ParserOptions, the function doesn't work. Also, preprocess()
> expects a $title as well.
You can provide an empty ParserOptions object and a suitable title object (no idea what's suitable in this context), like this:

global $wgParser;
$title = Title::newFromText("Whatever"); // Or whatever you need to get a suitable Title
return $wgParser->preprocess($query, $title, new ParserOptions

> I changed the prefix instead, indeed. I think asksortorder doesn't look as good 
> - but if it's against the naming convention I can change that.
> 
It's not a big deal, it's just that no other module uses underscores, so it looked kind of weird to me.
Comment 5 Roan Kattouw 2009-03-10 18:51:14 UTC
(In reply to comment #1)
> >+			"_sort" => array(
> >+				ApiBase :: PARAM_TYPE => "string",
> >+			),
> You can use ApiBase::PARAM_ISMULTI => true here to allow for multiple values
> separated with pipes (|), which is how multivalue fields are usually done in
> the API. The API parameter handling code will ensure that $params['_sort'] is
> an array rather than a pipe- (or comma-)separated list in that case. Also note
> that 'string' is the default parameter type, you don't have to set it
> explicitly.
> 
While you did follow this piece of advice...

> >+			"_sort_order" => array(
> >+				ApiBase :: PARAM_TYPE => "string"	
> >+			),
> If the only values allowed are ASC and DESC, you'd probably be better off with:
>                         '_sort_order' => array(
>                                 ApiBase :: PARAM_TYPE => array(
>                                         'asc',
>                                         'desc'
>                                 ),
>                                 ApiBase :: PARAM_ISMULTI => true,
>                                 ApiBase :: PARAM_ALLOW_DUPLICATES => true
>                         ),
> Which'll do the same thing as I mentioned above, except that only the values
> 'asc' and 'desc' are accepted and that duplicate values are allowed.
> 
You didn't follow this one. Did you overlook it or do you have a reason to do this?
Comment 6 Jeroen De Dauw 2011-11-15 21:13:48 UTC
Hey David, sorry no one picked up on this patch. It's useful functionality which I ended up implementing myself a month or two back, without knowing about this patch.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links