Last modified: 2007-03-06 17:29:51 UTC
As discussed on IRC, I would like to propose two extra fields for the database schema:
== table revision: rev_len unsigned integer ==
This field would contain the length of the revision's raw text, same as page_len in the page table. Having this field would greatly help vandal-fighting bots, as it would allow simple queries for page blanking and bulk imports (fairly common forms of vandalism). It would also reduce the load on the server from such tools, because the raw text would not be needed in many cases. The length could also allow much more sophisticated analysis than the rc_change field proposed next.
== table recentchanges: rc_change signed integer ==
This field would contain the size of the change (delta) between two revisions (either positive or negative). This field would also allow for quick vandalism lookups.
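To illustrate the kind of query a bot could run against rev_len alone, here is a minimal sketch using an in-memory SQLite database. The table layout is deliberately simplified and is not the real MediaWiki schema; the function names and both thresholds (`blank_max`, `dump_min`) are hypothetical values chosen only for the example.

```python
import sqlite3


def demo_connection():
    """Build a toy revision table (simplified, hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_len INTEGER)"
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?, ?)",
        [(1, 10, 5200), (2, 10, 0), (3, 11, 250000), (4, 11, 4800)],
    )
    return conn


def find_suspicious_revisions(conn, blank_max=10, dump_min=100_000):
    """Flag revisions by stored length only; the raw text is never read.

    blank_max and dump_min are illustrative thresholds, not proposed values.
    """
    rows = conn.execute(
        "SELECT rev_id, rev_len FROM revision "
        "WHERE rev_len IS NOT NULL AND (rev_len <= ? OR rev_len >= ?) "
        "ORDER BY rev_id",
        (blank_max, dump_min),
    ).fetchall()
    return [
        (rev_id, rev_len, "blanking" if rev_len <= blank_max else "dump")
        for rev_id, rev_len in rows
    ]
```

A real tool would probably compare each revision against the previous one for the same page rather than use fixed thresholds, but even this crude filter avoids fetching any revision text.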
Adding revision.rev_len is going to require us to run a script to update the field for all revisions...
rc_change is a fairly easy change, and probably should have been filed as a separate request. It would immediately help detect blanking and dumping vandalism.
If we do this, it'll have to be after 1.7 branch and we'll need to schedule downtime to upgrade the tables.
(In reply to comment #3)
> If we do this, it'll have to be after 1.7 branch and we'll
> need to schedule downtime to upgrade the tables.
>
Is there a meta page that puts together all such requests so that when an update is scheduled, all changes can be done at once?
All proposed DB changes are now at http://meta.wikimedia.org/wiki/Proposed_Database_Schema_Changes
Let's keep discussion and stuff about MediaWiki on MediaWiki.org, eh?
recentchanges.rc_old_len and recentchanges.rc_new_len have been added.
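With both lengths stored, the delta originally proposed as rc_change can be derived as rc_new_len minus rc_old_len. The sketch below shows one way a bot might use that; the function names and the cut-offs (`blank_ratio`, `dump_bytes`) are hypothetical, and it assumes either length may be missing (None) on some rows.

```python
from typing import Optional


def change_size(rc_old_len: Optional[int], rc_new_len: Optional[int]) -> Optional[int]:
    """Signed size of the change, or None if either length is unknown."""
    if rc_old_len is None or rc_new_len is None:
        return None
    return rc_new_len - rc_old_len


def classify_change(rc_old_len, rc_new_len, blank_ratio=0.9, dump_bytes=50_000):
    """Rough triage by size only; both cut-offs are illustrative assumptions."""
    delta = change_size(rc_old_len, rc_new_len)
    if delta is None:
        return "unknown"
    # Most of the old text was removed in one edit.
    if rc_old_len > 0 and -delta >= blank_ratio * rc_old_len:
        return "possible blanking"
    # A very large amount of text was added in one edit.
    if delta >= dump_bytes:
        return "possible dump"
    return "ordinary"
```

For example, `classify_change(5200, 0)` falls into the blanking bucket because the whole page was removed, while a 100-byte edit to a 1000-byte page is treated as ordinary.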
I recommend against this, as just in the English Wikipedia, it would require updating 96 million revisions. That is some major processing time. Besides, judging by the replies to adding a similar visible feature to [[Special:Watchlist]], it's going to annoy some people anyways.
Note that NULL values could be left on old rows to minimize conversion requirements. It's still a table change, but we have a good handle on how to do that now. Note also that keeping the data is a separate issue from displaying googly colored thingies on history lists.
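A rough sketch of the NULL approach, again on a toy SQLite table rather than the real schema: the column is added as nullable, existing rows simply read as NULL, and an optional backfill can fill them in small batches later. The `rev_text` column, the helper names, and the batch size are all assumptions made for the example (MediaWiki stores revision text elsewhere), and length is assumed to be measured in bytes of UTF-8.

```python
import sqlite3


def make_legacy_db():
    """Toy table without rev_len, standing in for pre-upgrade data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO revision VALUES (?, ?)",
        [(1, "hello"), (2, ""), (3, "caf\u00e9")],
    )
    return conn


def add_rev_len_column(conn):
    """Add a nullable column; existing rows are left as NULL, not rewritten."""
    conn.execute("ALTER TABLE revision ADD COLUMN rev_len INTEGER")


def backfill_batch(conn, batch_size=2):
    """Fill rev_len for up to batch_size old rows; return how many were updated."""
    rows = conn.execute(
        "SELECT rev_id, rev_text FROM revision "
        "WHERE rev_len IS NULL ORDER BY rev_id LIMIT ?",
        (batch_size,),
    ).fetchall()
    for rev_id, text in rows:
        conn.execute(
            "UPDATE revision SET rev_len = ? WHERE rev_id = ?",
            (len(text.encode("utf-8")), rev_id),
        )
    conn.commit()
    return len(rows)
```

Calling `backfill_batch` repeatedly until it returns 0 would convert old rows gradually, and any query that filters on `rev_len IS NOT NULL` keeps working in the meantime.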
*** This bug has been marked as a duplicate of 1723 ***