Last modified: 2007-03-06 17:29:51 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T8277, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 6277 - Add revision.rev_length column to track revision sizes
Status: RESOLVED DUPLICATE of bug 1723
Product: MediaWiki
Classification: Unclassified
Component: Database
Importance: Normal enhancement with 1 vote
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: 9116
Reported: 2006-06-11 20:24 UTC by Yuri Astrakhan
Modified: 2007-03-06 17:29 UTC (History)


Description Yuri Astrakhan 2006-06-11 20:24:13 UTC
As discussed on IRC, I would like to propose two extra fields for the database:

== table revision: rev_len unsigned integer ==
This field would contain the length of the revision's raw text, same as page_len
in page table. Having this field would tremendously help vandal-fighting bots,
as it will allow simple queries for page blanking and bulk imports (fairly
common forms of vandalism). It will also reduce the load on the server from such
tools, because the raw text will not be needed in many cases. The length could
also allow much more sophisticated analysis than the rc_change field proposed
next.
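
To illustrate the kind of "simple query" this field is meant to enable, here is a minimal sketch. It uses an in-memory SQLite table rather than MediaWiki's real schema; the table layout, the helper names find_blanked_revisions and build_demo_db, and the sample rows are all assumptions made up for the example. Only rev_len (as proposed above) comes from this report.

```python
import sqlite3


def find_blanked_revisions(conn, max_len=0):
    """Return (rev_id, rev_page, rev_len) rows whose stored length is <= max_len.

    Rows with a NULL rev_len (length not recorded) are skipped, so the
    revision text never has to be fetched to spot a blanked page.
    """
    cur = conn.execute(
        "SELECT rev_id, rev_page, rev_len FROM revision "
        "WHERE rev_len IS NOT NULL AND rev_len <= ? "
        "ORDER BY rev_id",
        (max_len,),
    )
    return cur.fetchall()


def build_demo_db():
    """Create a tiny, hypothetical revision table for the demonstration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        " rev_id INTEGER PRIMARY KEY,"
        " rev_page INTEGER NOT NULL,"
        " rev_len INTEGER)"  # nullable: length may be unknown
    )
    conn.executemany(
        "INSERT INTO revision (rev_id, rev_page, rev_len) VALUES (?, ?, ?)",
        [(1, 10, 5200), (2, 10, 0), (3, 11, 830), (4, 11, None)],
    )
    return conn


if __name__ == "__main__":
    demo = build_demo_db()
    for row in find_blanked_revisions(demo):
        print(row)
```

A bot could run a query of this shape against recent revisions and only fetch the raw text for the few rows it flags.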

== table recentchanges: rc_change signed integer ==
This field would contain the size of the change (delta) between two revisions
(either positive or negative). This change would also allow for quick vandalism
Comment 1 Rob Church 2006-06-12 10:36:07 UTC
Adding revision.rev_len is going to require us to run a script to update the
field for all revisions...
Comment 2 Yuri Astrakhan 2006-06-25 21:24:44 UTC
rc_change is a fairly easy change, and probably should have been filed as a
separate request. It would immediately help detection of any
blanking/dumping vandalism.
Comment 3 Brion Vibber 2006-06-25 21:29:50 UTC
If we do this, it'll have to be after 1.7 branch and we'll 
need to schedule downtime to upgrade the tables.
Comment 4 Yuri Astrakhan 2006-06-26 04:22:34 UTC
(In reply to comment #3)
> If we do this, it'll have to be after 1.7 branch and we'll 
> need to schedule downtime to upgrade the tables.

Is there a meta page that puts together all such requests so that when an update
is scheduled, all changes can be done at once?
Comment 5 Yuri Astrakhan 2006-08-23 23:50:14 UTC
All proposed DB changes are now at
Comment 6 Rob Church 2006-08-23 23:52:50 UTC
Let's keep discussion and stuff about MediaWiki on, eh?
Comment 7 Rob Church 2006-12-24 08:11:33 UTC
recentchanges.rc_old_len and recentchanges.rc_new_len have been added.
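
With both lengths stored, the delta originally proposed as rc_change can be derived on the client side. The following is a rough sketch only: the function names and the thresholds (min_old_len, max_remaining_fraction) are hypothetical values chosen for illustration, not anything MediaWiki defines.

```python
def size_delta(rc_old_len, rc_new_len):
    """Return the signed size change, or None if either length is unknown."""
    if rc_old_len is None or rc_new_len is None:
        return None
    return rc_new_len - rc_old_len


def looks_like_blanking(rc_old_len, rc_new_len,
                        min_old_len=500, max_remaining_fraction=0.1):
    """Heuristic: a reasonably large page shrank to a small fraction of its size.

    Both thresholds are arbitrary illustration values.
    """
    if size_delta(rc_old_len, rc_new_len) is None:
        return False
    return (rc_old_len >= min_old_len
            and rc_new_len <= rc_old_len * max_remaining_fraction)


if __name__ == "__main__":
    print(size_delta(5200, 0))
    print(looks_like_blanking(5200, 0))
```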
Comment 8 Titoxd 2006-12-24 08:18:18 UTC
I recommend against this, as just in the English Wikipedia, it would require
updating 96 million revisions. That is some major processing time. Besides,
judging by the replies to adding a similar visible feature to
[[Special:Watchlist]], it's going to annoy some people anyways.
Comment 9 Brion Vibber 2006-12-24 08:38:36 UTC
Note that NULL values could be left on old rows to minimize conversion
requirements. It's still a table change, but we have a good handle on how to do
that now.

Note also that keeping the data is a separate issue from displaying googly
colored thingies on history lists.
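
The NULL-on-old-rows idea above could look roughly like the sketch below. It is not MediaWiki's actual updater: it uses SQLite, assumes for simplicity that the revision text lives in a rev_text column of the same table (the real schema stores text separately), and assumes the length is counted in UTF-8 bytes. All function names are hypothetical.

```python
import sqlite3


def add_length_column(conn):
    """Add a nullable rev_len column; existing rows are left as NULL."""
    conn.execute("ALTER TABLE revision ADD COLUMN rev_len INTEGER")


def insert_revision(conn, rev_id, text):
    """New revisions record their length at save time."""
    conn.execute(
        "INSERT INTO revision (rev_id, rev_text, rev_len) VALUES (?, ?, ?)",
        (rev_id, text, len(text.encode("utf-8"))),
    )


def backfill_batch(conn, batch_size=1000):
    """Optionally fill in up to batch_size old rows; returns how many were updated."""
    rows = conn.execute(
        "SELECT rev_id, rev_text FROM revision "
        "WHERE rev_len IS NULL ORDER BY rev_id LIMIT ?",
        (batch_size,),
    ).fetchall()
    conn.executemany(
        "UPDATE revision SET rev_len = ? WHERE rev_id = ?",
        [(len(text.encode("utf-8")), rev_id) for rev_id, text in rows],
    )
    return len(rows)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO revision (rev_id, rev_text) VALUES (?, ?)",
        [(1, "old one"), (2, "old two"), (3, "old three")],
    )
    add_length_column(conn)           # no rewrite of existing rows needed
    insert_revision(conn, 4, "new")   # new rows get a length immediately
    while backfill_batch(conn, batch_size=2):
        pass                          # old rows can be filled in small batches later
    print(conn.execute("SELECT rev_id, rev_len FROM revision").fetchall())
```

Readers of the column then have to treat NULL as "length unknown" rather than zero, which is why any blanking query should exclude NULL rows.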
Comment 10 Rob Church 2007-03-06 17:29:51 UTC

*** This bug has been marked as a duplicate of 1723 ***
