Last modified: 2014-07-07 12:09:23 UTC
SUMMARY: I propose that each page have an "attention score" describing how often the page has been edited. This feature is intended to address the "Siegenthaler problem" by indicating whether a page has been lightly or heavily edited. The attention score would be calculated from the data displayed in the page history, and would ideally have the following features: (1) Attention scores would be higher if more total edits were made; (2) Attention scores would be higher if more unique authors made edits; (3) Attention scores would be higher if the time between edits were lower. Item (3) is the key point, as a page that has gone through fierce editing (or an "edit war") should be distinguished from one that has not been gone over carefully. Since (1) and (3) are counting tasks, the algorithm should be O(n) with respect to the number of edits. Including (2) might cause the algorithm to be O(n^2) if authors had to be compared pairwise; the exact impact on Wikipedia server load would depend on the number of unique authors, and might need to be established empirically. Another possible feature of an attention score would be the following: (4) Attention scores would be higher if the page was edited recently. However, this might be dropped for reasons of computational expense. If only factors (1)-(3) were involved, the attention score would need to be calculated only once per edit; adding (4) would require a dynamic calculation on each page view. The advantages of such a feature are twofold: (1) it allows any user to make a snap judgement as to whether a page is rarely edited (and therefore potentially questionable) or heavily edited (and therefore, if not necessarily trustworthy, at least well examined); (2) it allows editors to target either lightly or heavily edited pages, as necessary, using one convenient statistic.
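As a rough sketch of how factors (1)-(3) might combine, here is one possible scoring function. The proposal does not specify a formula, so the weighting below is purely illustrative; it assumes the edit history is available as a chronological list of (author, timestamp) pairs. Note that with a hash set, counting unique authors is O(n) rather than O(n^2).

```python
from datetime import datetime

def attention_score(edits):
    """Illustrative attention score from factors (1)-(3).

    edits: chronological list of (author, timestamp) tuples,
    timestamps as datetime objects.
    """
    if not edits:
        return 0.0
    total_edits = len(edits)                        # factor (1)
    unique_authors = len({a for a, _ in edits})     # factor (2), O(n) via a set
    # Factor (3): mean gap between consecutive edits, in days.
    gaps = [(t2 - t1).total_seconds() / 86400
            for (_, t1), (_, t2) in zip(edits, edits[1:])]
    mean_gap_days = (sum(gaps) / len(gaps)) if gaps else float("inf")
    # Illustrative weighting: the score grows with edit and author counts,
    # and shrinks as the average time between edits grows.
    return (total_edits + unique_authors) / (1.0 + mean_gap_days)

history = [("Alice", datetime(2014, 7, 1)),
           ("192.168.0.10", datetime(2014, 7, 2)),
           ("Bob", datetime(2014, 7, 3))]
print(attention_score(history))  # 3 edits, 3 authors, 1-day mean gap -> 3.0
```

Because the score depends only on the edit history, it could be recomputed incrementally once per edit, as the proposal notes; only factor (4) would force a per-view calculation.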
I made a similar proposal at http://de.wikipedia.org/wiki/Wikipedia:Verbesserungsvorschl%C3%A4ge/Feature-Requests In my opinion you don't even need a calculated "attention score". It would already be helpful to indicate 1. the number of registered users who edited a page, 2. the number of anonymous users (IPs) who edited it, 3. the total number of edits. This data could be presented quite simply, like (34, 65, 340). This offers more transparency than a calculated attention score. Nevertheless, I agree that a short summary of the page history should be presented.
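The three numbers suggested above are straightforward to compute. A minimal sketch, assuming the history is available as a list of author names (one per edit) and that unregistered editors appear as IP addresses, as they do in MediaWiki page histories:

```python
import ipaddress

def edit_triple(authors):
    """Return (registered, anonymous, total_edits) for a page history.

    authors: list of author names, one entry per edit. A name counts
    as anonymous if it parses as an IP address.
    """
    registered, anonymous = set(), set()
    for name in authors:
        try:
            ipaddress.ip_address(name)
            anonymous.add(name)
        except ValueError:
            registered.add(name)
    return (len(registered), len(anonymous), len(authors))

print(edit_triple(["Alice", "Alice", "192.168.0.10", "Bob"]))  # (2, 1, 4)
```

Each count is a single O(n) pass, so presenting the raw triple avoids both the weighting question and the server-load concern raised above.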
Please compare this study: Andrew Lih, "Wikipedia as Participatory Journalism: Reliable Sources? Metrics for evaluating collaborative media as a news resource" (http://jmsc.hku.hk/faculty/alih/publications/utaustin-2004-wikipedia-rc2.pdf). Lih writes:

For the purposes of the study, two metrics are used as a simple measure for the reputation of the article within the Wikipedia:

• Rigor (total number of edits for an article) – The assumption is that more editing cycles on an article provides for a deeper treatment of the subject or more scrutiny of the content. In Wikipedia, edits can be marked as major or minor, with the latter used for indicating something that can largely be ignored by others and inconsequential to the overall editorial position, such as fixing a typo or reformatting the page. Since this is a voluntary flag, and the use of the minor edit flag is inconsistent, at this time the study considers all edits, major or minor, as equal. In the future, a more intelligent decision could be used with minor edits in combination with the edit comments.

• Diversity (total number of unique users) – With more editors, there are more voices and different points of view for a given subject. Users come in the form of registered users (ie. User:Bob) or anonymous users, who do not register but show up as Internet addresses (ie. 192.168.0.10). The study tracks the number of unique users who have edited the article in question, regardless of whether they are registered or anonymous.
The problem with this is that articles with low attention scores would become targets, since a visible low score advertises which pages receive little scrutiny.