Last modified: 2014-09-23 23:09:22 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T28273, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 26273 - Database layer should automagically add GROUP BY columns on backends that need them (postgres)
Database layer should automagically add GROUP BY columns on backends that nee...
Status: NEW
Product: MediaWiki
Classification: Unclassified
Database (Other open bugs)
1.17.x
All All
: Normal enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks: 28193
  Show dependency treegraph
 
Reported: 2010-12-07 13:32 UTC by Roan Kattouw
Modified: 2014-09-23 23:09 UTC (History)
4 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Roan Kattouw 2010-12-07 13:32:32 UTC
Postgres is pedantic when it comes to GROUP BY queries and demands that all selected columns that aren't used with an aggregate function be in the GROUP BY clause, even if the GROUP BY already covers a unique index (meaning grouping by more columns would have no effect).

MySQL is more flexible, causing developers to write queries that fail on Postgres for this reason. Instead of developers having to go through a lot of trouble to construct GROUP BY clauses with the right columns with the ORDER BY columns at the beginning (r68100), the database layer should do this for them.

Because the semantics of GROUP BY described here are actually in the SQL standard IIRC, there's likely to be more DB backends, so I think this should be implemented in DatabaseBase::makeSelectOptions() with DB backends needing this functionality overriding needsAllColumnsInGroupBy() or whatever to return true.
Comment 1 Mark A. Hershberger 2011-01-31 20:54:38 UTC
You know what would be really nice for this bug?  Some examples!

If you could point to a query that needs to be rewritten for Pg or another, more-strict-than-MySQL, database or a query that was re-written that would help a lot.
Comment 2 Greg Sabino Mullane 2011-03-21 15:23:25 UTC
Mark: The r68100 shows an example. I agree this would be a nice feature, but I will point out that MySQL is *too* flexible, and allows creation of queries without unique coverage. :)

There is also a implicitGroupby() attribute used by the database classes: see specials/SpecialListfiles.php for an example.
Comment 3 Aaron Schulz 2011-03-21 15:30:34 UTC
(In reply to comment #2)
> Mark: The r68100 shows an example. I agree this would be a nice feature, but I
> will point out that MySQL is *too* flexible, and allows creation of queries
> without unique coverage. :)
>

In that case there isn't a unique index on oldimage, which sucks terribly. I'd agree that GROUP BY shouldn't automatically be *that* loose.
Comment 4 Sumana Harihareswara 2011-11-02 23:11:19 UTC
We discussed this bug today in IRC and did not come to any particular conclusions on how to design this.  This is a large project, one that developers who want a substantial MediaWiki improvement task should take on.


    <G_SabinoMullane> There is a related bug to 26273 but I don't know where it is offhand. Tim and I made some hand-waving solutions for this.
    <G_SabinoMullane> Basically, we need to have MW gather the list of columns from the tables and create the GROUP BY on the fly.
    <blobaugh> G_SabinoMullane: why not have a var that lists, per table, the correct group by?
    <G_SabinoMullane> blobaugh: Because schema changes (new columns) would break it
    <djbauch> This bug (26273) used to rear its ugly head a lot. Not so much any more. It was just one special page the last time I checked. Staying away from GROUP BY 1,2,3, etc. and sticking to the more stringent rules enforced by SQL SERVER (e.g., making sure to name the fields by name rather than by alias) has helped a lot.
    <blobaugh> G_SabinoMullane: could this be a global that gets updated on schema changes manually?
    <^demon> blobaugh: We can barely keep people updating all the proper places when they do a schema change anyway...a global they'd have to keep in check too would be impossible.
    <blobaugh> ^demon: fair enough
    <G_SabinoMullane> blobaugh: That seems like a lot of work compared to simply reading the db cols directly
    <blobaugh> G_SabinoMullane: maybe, but it would make the system much faster than needing another db call
    <sumanah> djbauch: do you think it would be fairly easy to generate a list of the places in MediaWiki that name fields by alias?
    <djbauch> Sumanah: Yes, there were only two places that caused a problem that I found. One special page and one obscure place.
    <sumanah> is this a longterm project for some interested developer, then?like, GSoC level?  or intractable? or what?
    <Nikerabbit> not sure if it is worth full GSoC, but looks like non-trivial amount of work indeed

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links