Last modified: 2013-08-09 22:44:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T24613, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 22613 - MySQL syntax error in function tableNamesWithUseIndexOrJOIN when further tables are added. MySQL requires parentheses in FROM (table1,table2) if a JOIN follows
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Database
Version: 1.22.0
Hardware: All All
Importance: Normal normal
Target Milestone: 1.22.0 release
Assigned To: Nobody - You can work on this!
URL: http://dev.mysql.com/doc/refman/5.0/e...
Keywords: patch, patch-reviewed, testme
Depends on:
Blocks:

Reported: 2010-02-22 07:12 UTC by T. Gries
Modified: 2013-08-09 22:44 UTC (History)


Attachments
ad-hoc and hackish patch - experimental and for showing a solution for a specific query - not to be committed to SVN (1.32 KB, patch)
2010-02-22 07:13 UTC, T. Gries

Description T. Gries 2010-02-22 07:12:28 UTC
During my work using the hook "SpecialRecentChangesQuery" (for code and a detailed analysis see [1]) I found a reproducible problem which arises only under the following conditions:

- if MySQL >= 5.0.12 AND
- if the hook function for SpecialRecentChangesQuery adds table(s) to $tables[].

Analysis:

The code in [1] modifies the Recent Changes main SQL statement to this:

SELECT  *  FROM `recentchanges` FORCE INDEX (rc_timestamp),`page` LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0'  ORDER BY rc_timestamp DESC LIMIT 50  

This throws the error: Unknown column 'rc_id' in 'on clause' (localhost). It occurs with MySQL >= 5.0.12 due to the new JOIN processing.

This ad-hoc modification works (parentheses added around the table names outside the JOIN):

SELECT  *  FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`) LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0'  ORDER BY rc_timestamp DESC LIMIT 50  

I am attaching an ad-hoc and very hacky, purely experimental patch which corrects the problem in one specific case for [1]. The patch is not meant for SVN submission.

Citing [2]: Beginning with MySQL 5.0.12, natural joins and joins with USING, including outer join variants,  are processed according to the SQL:2003 standard. The goal was to align the syntax and semantics of MySQL with respect to NATURAL JOIN and JOIN ... USING according to SQL:2003. However, these changes in join processing can result in different output columns for some joins. Also, some queries that appeared to work correctly in older versions must be rewritten to comply with the standard.

Citing [3]:
SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);

Previously, the SELECT was legal due to the implicit grouping of t1,t2 as (t1,t2). Now the JOIN takes precedence, so the operands for the ON clause are t2 and t3. Because t1.i1 is not a column in either of the operands, the result is an Unknown column 't1.i1' in 'on clause' error. To allow the join to be processed, group the first two tables explicitly with parentheses so that the operands for the ON clause are (t1,t2) and t3:

SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);

Alternatively, avoid the use of the comma operator and use JOIN instead:
SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);


[1] http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges
[2] MySQL Manual Join Processing Changes in MySQL 5.0.12
http://dev.mysql.com/doc/refman/5.0/en/join.html
[3] Bug #19053 MySQL Unknown column in 'on clause'
http://bugs.mysql.com/bug.php?id=19053
Comment 1 T. Gries 2010-02-22 07:13:41 UTC
Created attachment 7157 [details]
ad-hoc and hackish patch - experimental and for showing a solution for  a specific query - not to be committed to SVN
Comment 2 T. Gries 2010-02-22 07:18:45 UTC
*** CORRECTION ***

SELECT  *  FROM `recentchanges` FORCE INDEX (rc_timestamp),`page` LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY rc_timestamp DESC LIMIT 50 

This throws the error: Unknown column 'rc_id' in 'on clause' (localhost). It occurs with MySQL >= 5.0.12 due to the new JOIN processing.

This ad-hoc modification works (parentheses added around the table names outside the JOIN):

SELECT  *  FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`) LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20100211000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY rc_timestamp DESC LIMIT 50
Comment 3 Andre Klapper 2013-07-24 13:07:05 UTC
T. Gries: In case this is still an issue, willing to put that patch into Gerrit?

Adding "patch-reviewed" as it says "not to commit".
Comment 4 T. Gries 2013-07-24 18:10:38 UTC
(In reply to comment #3)
> T. Gries: In case this is still an issue, willing to put that patch into
> Gerrit?
> 
> Adding "patch-reviewed" as it says "not to commit".

uh, this patch is old, from 2010.

Leave open; perhaps one of the database experts can check my observations, which are explained in detail here.
Comment 5 T. Gries 2013-08-09 20:42:24 UTC
Hi. I investigated and found that the problem still exists.


The reason is that JOIN precedence changed in MySQL 5.0.12 (!), see
- http://bugs.mysql.com/bug.php?id=19053
- http://dev.mysql.com/doc/refman/5.0/en/join.html


Corresponding changes have never been made in $IP/includes/database/db.php or, more precisely, in the MySQL driver.

Suggested solution: 

Add parentheses around the FROM tables, i.e. FROM (recentchanges .., page), in all database statements for MySQL.
Comment 6 T. Gries 2013-08-09 20:47:57 UTC
this does NOT work:

SELECT  rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags  FROM `recentchanges` FORCE INDEX (rc_timestamp),`page` LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY rc_timestamp DESC LIMIT 50  


With the correct parentheses around -- see the FROM () -- it DOES work

SELECT  rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags  FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`) LEFT JOIN `watchlist` ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >= '20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY rc_timestamp DESC LIMIT 50
Comment 7 T. Gries 2013-08-09 20:49:04 UTC
the above is generated by my extension as published in 
http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges#Code

basically:
 
$dir = dirname( __FILE__ );
$wgExtensionMessagesFiles['onlyrecentrecentchanges'] = $dir . '/OnlyRecentRecentChanges.i18n.php';
$wgHooks['SpecialRecentChangesQuery'][] = 'onSpecialRecentChangesQuery';
 
// see http://www.mediawiki.org/wiki/Manual:Hooks/SpecialRecentChangesQuery
function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) $tables[] = 'page';
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
}
Comment 8 T. Gries 2013-08-09 20:51:20 UTC
tl;dr:

Suggested solution: 
===================

Add parentheses around tables FROM (recentchanges .., page) in all database
statements for MySQL at the last stage before committing the query.
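
To make the suggestion concrete, here is a minimal sketch of the idea. It is not MediaWiki's actual tableNamesWithUseIndexOrJOIN(); the function name buildMysqlFromClause() and its two parameters are hypothetical and only illustrate where the parentheses would go. It assumes the comma-joined tables and the JOIN fragments have already been quoted and built elsewhere.

```php
<?php
// Hypothetical helper for illustration only (not part of MediaWiki core).
// $straightTables: already-quoted table expressions that get comma-joined.
// $joinClauses:    already-built "LEFT JOIN ... ON (...)" fragments.
function buildMysqlFromClause( array $straightTables, array $joinClauses ) {
	$straight = implode( ',', $straightTables );

	// Parentheses are only needed when a comma list is followed by a JOIN,
	// because since MySQL 5.0.12 JOIN binds tighter than the comma operator.
	if ( count( $straightTables ) > 1 && count( $joinClauses ) > 0 ) {
		$straight = '(' . $straight . ')';
	}

	return trim( $straight . ' ' . implode( ' ', $joinClauses ) );
}

// Example using the tables from this report.
$from = buildMysqlFromClause(
	array( '`recentchanges` FORCE INDEX (rc_timestamp)', '`page`' ),
	array( 'LEFT JOIN `tag_summary` ON ((ts_rc_id=rc_id))' )
);
echo "FROM $from\n";
```

With a single straight table, or without any JOIN, the sketch leaves the clause unchanged, so queries that already work would not be touched.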
Comment 9 T. Gries 2013-08-09 21:01:33 UTC
Source: http://dev.mysql.com/doc/refman/5.0/en/join.html

http://i.imgur.com/wVjBBqY.png


Previously, the comma operator (,) and JOIN both had the same precedence, so the join expression t1, t2 JOIN t3 was interpreted as ((t1, t2) JOIN t3). Now JOIN has higher precedence, so the expression is interpreted as (t1, (t2 JOIN t3)). This change affects statements that use an ON clause, because that clause can refer only to columns in the operands of the join, and the change in precedence changes interpretation of what those operands are.

Example:

CREATE TABLE t1 (i1 INT, j1 INT);
CREATE TABLE t2 (i2 INT, j2 INT);
CREATE TABLE t3 (i3 INT, j3 INT);
INSERT INTO t1 VALUES(1,1);
INSERT INTO t2 VALUES(1,1);
INSERT INTO t3 VALUES(1,1);
SELECT * FROM t1, t2 JOIN t3 ON (t1.i1 = t3.i3);

Previously, the SELECT was legal due to the implicit grouping of t1,t2 as (t1,t2). Now the JOIN takes precedence, so the operands for the ON clause are t2 and t3. Because t1.i1 is not a column in either of the operands, the result is an Unknown column 't1.i1' in 'on clause' error. 


******** IMPORTANT
To allow the join to be processed, group the first two tables explicitly with parentheses so that the operands for the ON clause are (t1,t2) and t3:

SELECT * FROM (t1, t2) JOIN t3 ON (t1.i1 = t3.i3);
**********


Alternatively, avoid the use of the comma operator and use JOIN instead:

SELECT * FROM t1 JOIN t2 JOIN t3 ON (t1.i1 = t3.i3);

This change also applies to statements that mix the comma operator with INNER JOIN, CROSS JOIN, LEFT JOIN, and RIGHT JOIN, all of which now have higher precedence than the comma operator. 


Source: http://dev.mysql.com/doc/refman/5.0/en/join.html
Comment 10 T. Gries 2013-08-09 22:43:20 UTC
update to comment #6 https://bugzilla.wikimedia.org/show_bug.cgi?id=22613#c6

this does NOT work:

SELECT 
rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
 FROM `recentchanges` FORCE INDEX (rc_timestamp),`page` LEFT JOIN `watchlist`
ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT
JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >=
'20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY
rc_timestamp DESC LIMIT 50  


With the correct parentheses around -- see the FROM () -- it DOES work

SELECT 
rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
 FROM (`recentchanges` FORCE INDEX (rc_timestamp),`page`) LEFT JOIN `watchlist`
ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT
JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >=
'20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY
rc_timestamp DESC LIMIT 50


The following also works. It is an alternative which does not require a core code change, i.e. this version does NOT require additional parentheses (I swapped the order: the additional table `page` is listed as the first FROM table name):


SELECT 
rc_id,rc_timestamp,rc_cur_time,rc_user,rc_user_text,rc_namespace,rc_title,rc_comment,rc_minor,rc_bot,rc_new,rc_cur_id,rc_this_oldid,rc_last_oldid,rc_type,rc_patrolled,rc_ip,rc_old_len,rc_new_len,rc_deleted,rc_logid,rc_log_type,rc_log_action,rc_params,wl_user,wl_notificationtimestamp,ts_tags
 FROM `page`,`recentchanges` FORCE INDEX (rc_timestamp) LEFT JOIN `watchlist`
ON (wl_user = '1' AND (wl_title=rc_title) AND (wl_namespace=rc_namespace)) LEFT
JOIN `tag_summary` ON ((ts_rc_id=rc_id))  WHERE (rc_timestamp >=
'20130802000000') AND rc_bot = '0' AND (page_latest=rc_this_oldid)  ORDER BY
rc_timestamp DESC LIMIT 50


This is done by changing

function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) $tables[] = 'page';
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
}

to

function onSpecialRecentChangesQuery( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
        if ( !in_array( 'page', $tables ) ) array_unshift( $tables, 'page' );
        $conds[] = 'page_latest=rc_this_oldid';
        return true;
}
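
Another possible variant, following the "use JOIN instead of the comma operator" advice from the MySQL manual, is sketched below. It is untested against MediaWiki and rests on an assumption: that an entry in $join_conds of the form array( join type, condition ) makes the query builder emit an explicit JOIN for `page` instead of a comma-separated table. The function name is hypothetical, chosen so it does not clash with the handler above.

```php
<?php
// Hypothetical variant of the hook handler, for illustration only.
function onSpecialRecentChangesQueryUsingJoin( &$conds, &$tables, &$join_conds, $opts, &$query_options = array(), &$select = array() ) {
	if ( !in_array( 'page', $tables ) ) {
		$tables[] = 'page';
		// Assumption: this yields "INNER JOIN `page` ON (...)" in the final
		// SQL, so no comma operator is mixed with the LEFT JOINs.
		$join_conds['page'] = array( 'INNER JOIN', 'page_latest=rc_this_oldid' );
	} else {
		// 'page' is already part of the query; only add the condition.
		$conds[] = 'page_latest=rc_this_oldid';
	}
	return true;
}

// Stand-alone check with plain arrays (no MediaWiki needed).
$conds = array();
$tables = array( 'recentchanges', 'watchlist', 'tag_summary' );
$join_conds = array();
onSpecialRecentChangesQueryUsingJoin( $conds, $tables, $join_conds, null );
print_r( $tables );
print_r( $join_conds );
```

Unlike the array_unshift() version, this does not depend on the order of the table names, but it does depend on the assumed handling of $join_conds.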
Comment 11 T. Gries 2013-08-09 22:44:39 UTC
The problem for the extension http://www.mediawiki.org/wiki/Extension:OnlyRecentRecentChanges#Code is solved, so I am closing this bug report, even though the general statement still applies.
