Last modified: 2013-10-31 11:25:00 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T30488, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 28488 - Implement revisionized properties table
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 28476
Reported: 2011-04-11 13:18 UTC by Krinkle
Modified: 2013-10-31 11:25 UTC (History)


Description Krinkle 2011-04-11 13:18:40 UTC
Right now there is no way to attach information to an article without losing it over time (as happens with the "page_props" table).

By having a revisionized / versioned properties table (like page_props), many (if not all) of the following will be possible:

1) Store page protection settings with the revision, so that undoing or rolling back a revision brings back the protection info, as does deletion/undeletion.

2) Move categories out of wikitext. This has been proposed before (i.e. store them only in categorylinks and report changes in a null-revision edit summary, as is currently done for protection). However, that is prone to abuse, since undoing a revision would mean manually copying categories from the edit summaries on the history page.

3) Maybe move langlinks out of wikitext?

4) File properties [2]

5) Custom data for extensions (the prop_type column can be used by extensions to store other information that would otherwise have to be stored in a new table. Some extensions appear to be doing this currently, which could lead to many tables for the same purpose on a single wiki).

Using a versioned properties table will solve these problems. Each revision is connected to a set of properties, and undoing the revision will re-use the previous set of properties (just like a rollback re-uses the same mw_text.old_id via mw_revision.rev_text_id).

It also saves storage in the database as the set is only re-saved when something has actually changed.

In other words, if the user only changed the article text, the same old propid is used. If only props are modified, the same textid/oldid stays in use.
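
The reuse rule described above can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not MediaWiki code: the Revision class, the save_revision and rollback functions, and the three id counters are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, Optional

# Hypothetical in-memory stand-ins for the text table and the proposed
# properties table; every name here is an illustrative assumption.
texts: Dict[int, str] = {}
prop_sets: Dict[int, Dict[str, str]] = {}
_rev_ids, _text_ids, _prop_ids = count(1), count(1), count(1)


@dataclass(frozen=True)
class Revision:
    rev_id: int
    text_id: int
    props_id: int


def save_revision(prev: Optional[Revision], text: str, props: Dict[str, str]) -> Revision:
    """Create a revision, storing text/props only if they differ from prev."""
    if prev is not None and texts[prev.text_id] == text:
        text_id = prev.text_id            # text unchanged: reuse the old text id
    else:
        text_id = next(_text_ids)
        texts[text_id] = text
    if prev is not None and prop_sets[prev.props_id] == props:
        props_id = prev.props_id          # props unchanged: reuse the old set
    else:
        props_id = next(_prop_ids)
        prop_sets[props_id] = dict(props)
    return Revision(next(_rev_ids), text_id, props_id)


def rollback(target: Revision) -> Revision:
    """Null-revision that points at the target's existing text and props."""
    return Revision(next(_rev_ids), target.text_id, target.props_id)


r1 = save_revision(None, "Hello", {"protection": "none"})
r2 = save_revision(r1, "Hello world", {"protection": "none"})   # text-only edit
r3 = save_revision(r2, "Hello world", {"protection": "sysop"})  # props-only edit
r4 = rollback(r2)                                               # undoes r3
```

Here r2 shares its property set with r1, r3 shares its text with r2, and the rollback r4 stores nothing new at all: it only points back at the text and property set of r2.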

The properties table would have its sets identified by a unique id, stored in a column in the mw_revision table [1]. That id would either be its own incrementing integer or reuse the revision id. Comparison:
* properties-id
** Since multiple rows belong together, the id spans multiple rows. An incremental ID that spans multiple rows is not supported in MySQL, and the only solutions I can think of are keeping track of the id elsewhere or getting the last row and using the next number. Neither is clean.
* revision-id
** Using the revision-id is a lot easier. The revision data is saved, the revision-id is known and used to store the properties. This also makes it easy to track which revision last modified the properties (since the id matches the revision-id that created the set of properties).

I think using the revision-id is probably the better choice. The only downside is that it may cause confusion, since the properties id would look like a revision-id; not sure if that's an issue.
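
The revision-id variant could look roughly like the sketch below, written in Python against an in-memory SQLite database. The schema is a simplified assumption (rev_props, rev_text and the save helper are hypothetical names; only rev_id and the proposed rev_props_id column come from this report), and it is not meant as a proposal for the real table layout.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Hypothetical, simplified schema; column names are assumptions.
    CREATE TABLE revision (
        rev_id       INTEGER PRIMARY KEY AUTOINCREMENT,
        rev_text     TEXT NOT NULL,
        rev_props_id INTEGER        -- rev_id of the revision that created the set
    );
    CREATE TABLE rev_props (
        prop_id   INTEGER NOT NULL, -- shared by every row of one set
        prop_type TEXT NOT NULL,
        prop_val  TEXT NOT NULL
    );
""")


def load_props(props_id):
    """Return one property set as a sorted list of (type, value) tuples."""
    rows = db.execute(
        "SELECT prop_type, prop_val FROM rev_props WHERE prop_id = ?", (props_id,)
    )
    return sorted(rows.fetchall())


def save(text, props, prev_rev_id=None):
    """Insert a revision; write a new property set only if props changed."""
    prev_props_id = None
    if prev_rev_id is not None:
        prev_props_id = db.execute(
            "SELECT rev_props_id FROM revision WHERE rev_id = ?", (prev_rev_id,)
        ).fetchone()[0]
    rev_id = db.execute("INSERT INTO revision (rev_text) VALUES (?)", (text,)).lastrowid
    if prev_props_id is not None and load_props(prev_props_id) == sorted(props):
        props_id = prev_props_id    # unchanged: point at the older set
    else:
        props_id = rev_id           # changed: the set is named after this revision
        db.executemany(
            "INSERT INTO rev_props VALUES (?, ?, ?)",
            [(props_id, t, v) for t, v in props],
        )
    db.execute("UPDATE revision SET rev_props_id = ? WHERE rev_id = ?", (props_id, rev_id))
    return rev_id


a = save("v1", [("categorylink", "Lorem"), ("categorylink", "Foo")])
b = save("v2", [("categorylink", "Lorem"), ("categorylink", "Foo")], prev_rev_id=a)
c = save("v2", [("categorylink", "Lorem"), ("categorylink", "Bar")], prev_rev_id=b)
props_ids = [row[0] for row in db.execute("SELECT rev_props_id FROM revision ORDER BY rev_id")]
```

No separate counter is needed, because the set id is simply the rev_id that is already known after the revision row is inserted. The set ids end up with gaps (no set 2 in this run), and a revision's rev_props_id directly names the revision that last changed the properties.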


Highlights:
* Store data in a properties table, versioned, where each set has its own id. If only props change, the same text.old_id is used; if only text changes, the same propid is used.
* Connect each revision to a set of properties, like the text id (i.e. a rollback re-uses the text id of the revision being restored; the same would apply to properties). Rolling back an edit creates a null-revision with the same old text id and properties id.


--
Krinkle


[1] Adding a column to mw_revision is expensive, to say the least, but I'm not sure there's a clean and long-term efficient way around it.
[2] bug 25624 and http://www.mediawiki.org/wiki/License_integration

See also:
* (bug 167) Use a dedicated interface for adding interwiki/category links, not wikitext
* (bug 25624) Making license and author information api accessible
* (bug 835) Syntax to transclude a page without categories and langlinks
* (bug 22293) Show previous protection level in protection log
* more...
Comment 1 Krinkle 2011-04-11 14:14:56 UTC
Also, maybe tackleable with this:
* (bug 28476) Rejecting a page move does not undo the change made to the title.
* (bug 4433) rollback link for a page move should revert the move
Comment 2 Bawolff (Brian Wolff) 2011-04-11 15:37:17 UTC
Hopefully this isn't too stupid a question ;).

So in this scheme we have a table that would have an entry for categorylinks something (roughly) like:

revision id: 123
prop_type: categorylink
cl_to: some category
cl_from: some page_id

And say you wanted to grab everything in category foo. How would you do that, since it's now hard to distinguish between current entries and historical entries?

---
As an aside, a versioned links table would also help with bug 7148 (show category additions/removals on watchlist)
Comment 3 Krinkle 2011-04-12 22:06:44 UTC
Fyi:

This request (at least the way I intended it) does not suggest deprecating any tables (including categorylinks) at all.
A central, efficient, clean categorylinks table is perfect (aggregated to contain only the current status, which it does now).


However, if you would want to go that route (I didn't mean to suggest that, but it's an interesting thought nonetheless), it doesn't have to be a problem:


-- example start --

Article [[Page]] was categorized in Lorem and Foo. In Foo it is sorted under "Mysort".

mw_revision:
* example row of an edit that changed categories
rev_id: 123
rev_text_id: 120
rev_comment: "Re-categorized [removed: [[Category:Foo|Foo]]; added: [[Category:Bar|Bar]] ]"
// comment is like the nulledits for changing protection settings
rev_props_id: 8

mw_magicpropsthingtable:
prop_id: 7 |  prop_type: categorylink | prop_val: 'Lorem'
prop_id: 7 |  prop_type: categorylink | prop_val: 'Foo'
prop_id: 8 |  prop_type: categorylink | prop_val: 'Lorem'
prop_id: 8 |  prop_type: categorylink | prop_val: 'Bar'


-- example end --
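
To illustrate how current entries could be told apart from historical ones in the example above, here is a rough Python/SQLite sketch. It assumes the page's latest revision is found through page.page_latest; the props table stands in for mw_magicpropsthingtable, and revision 122 (the edit before 123, pointing at set 7) is made up for the illustration.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Simplified, assumed schema; 'props' stands in for mw_magicpropsthingtable.
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
    CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_props_id INTEGER);
    CREATE TABLE props (prop_id INTEGER, prop_type TEXT, prop_val TEXT);

    INSERT INTO page VALUES (1, 'Page', 123);
    -- Revision 122 is a hypothetical earlier edit that used property set 7.
    INSERT INTO revision VALUES (122, 1, 7), (123, 1, 8);
    INSERT INTO props VALUES
        (7, 'categorylink', 'Lorem'), (7, 'categorylink', 'Foo'),
        (8, 'categorylink', 'Lorem'), (8, 'categorylink', 'Bar');
""")


def current_members(category):
    """Titles whose latest revision points at a set containing the category."""
    rows = db.execute(
        """
        SELECT page_title FROM page
        JOIN revision ON rev_id = page_latest
        JOIN props ON prop_id = rev_props_id
        WHERE prop_type = 'categorylink' AND prop_val = ?
        ORDER BY page_title
        """,
        (category,),
    )
    return [title for (title,) in rows]


in_bar = current_members("Bar")      # set 8 is current
in_foo = current_members("Foo")      # set 7 is only historical
in_lorem = current_members("Lorem")  # present in both sets
```

Only sets reachable from page_latest count as current, so Foo no longer lists the page. This needs two joins per lookup, which is one reason an aggregated categorylinks table remains attractive for reads.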

While writing this I just realized the sortkey would have to be stored as well, and also that this value doesn't have to be indexed, only retrieved when needed. So it may be better to use a single row [1] and serialize it into a blob:

mw_magicpropsthingtable:
prop_id: 7 |  prop_blob: serialize(array(
 'categorylinks' => array(
   array( 'Lorem', '' ),
   array( 'Foo', 'Mysort' ),
 )))
prop_id: 8 |  prop_blob: serialize(array(
 'categorylinks' => array(
   array( 'Lorem', '' ),
   array( 'Bar', '' ),
 )))


The prop_blob would be multi-line text (like log_params) or serialized PHP (like old_flags, as in the example above).
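
As a rough sketch of the single-row variant, the Python snippet below stores each set as one blob and derives the removed/added categories for the edit summary by comparing two blobs. JSON is used here only as a stand-in for PHP's serialize(), the function names are hypothetical, and the values follow the [[Page]] example (Foo sorted under "Mysort", then replaced by Bar).

```python
import json


def encode_props(categorylinks):
    """Serialize one property set; JSON stands in for PHP serialize()."""
    return json.dumps({"categorylinks": sorted(categorylinks)})


def decode_categories(blob):
    """Return {category: sortkey} from a stored blob."""
    return {name: sortkey for name, sortkey in json.loads(blob)["categorylinks"]}


def summarize_change(old_blob, new_blob):
    """Return (removed, added) category names between two property sets."""
    old, new = decode_categories(old_blob), decode_categories(new_blob)
    return sorted(set(old) - set(new)), sorted(set(new) - set(old))


# One row per set, so prop_id can be a plain auto-increment value.
blobs = {
    7: encode_props([("Lorem", ""), ("Foo", "Mysort")]),
    8: encode_props([("Lorem", ""), ("Bar", "")]),
}
removed, added = summarize_change(blobs[7], blobs[8])
```

The trade-off is the one noted above: the blob cannot be indexed or queried by category, so it only suits data that is read back per revision.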


--
Krinkle

[1]: This would also solve the problem with getting an id for prop_id, it can be an auto-increment now.
Comment 4 Aaron Schulz 2011-05-29 23:31:46 UTC
(In reply to comment #0)
> 2) Move categories out of of wikitext. It has been proposed to do this before
> (ie. store only in categorylinks and report changes in a null-revision edit
> summary, like with protection currently) - however that is prone to abuse since
> undoing a revision would mean having to manually copy/paste categories from the
> history page edit summaries.

I'd be skeptical of this one. It would take up a lot of space and many categories are only included via templates. Not sure how this would work out.
Comment 5 Krinkle 2011-07-06 10:13:47 UTC
(In reply to comment #4)
> (In reply to comment #0)
> > 2) Move categories out of of wikitext. It has been proposed to do this before
> > (ie. store only in categorylinks and report changes in a null-revision edit
> > summary, like with protection currently) - however that is prone to abuse since
> > undoing a revision would mean having to manually copy/paste categories from the
> > history page edit summaries.
> 
> I'd be skeptical of this one. It would take up a lot of space and many
> categories are only included via templates. Not sure how this would work out.

See bug 167.


