Last modified: 2013-01-13 19:03:05 UTC
I ran a query on the Toolserver to determine the number of active users:

SELECT rc_user_text FROM recentchanges;

I then ran the output through a de-dupe script and then filtered out IPv4 addresses. The result is a 264,911-line file.

http://en.wikipedia.org/wiki/Special:Statistics says that there are 153,015 active users.

Perhaps there's a flaw in my methodology? But more likely it seems we have yet another stats reporter that's broken. :-/
How about filtering out new accounts? From SiteStats:

# Get non-bot users than did some recent action other than making accounts.
# If account creation is included, the number gets inflated ~20+ fold on enwiki.
(In reply to comment #0)
> I ran a query on the Toolserver to determine the number of active users:
>
> SELECT rc_user_text FROM recentchanges;
>
> I then ran the output through a de-dupe script and then filtered out IPv4
> addresses.

A more useful query would probably be

SELECT DISTINCT rc_user_text FROM recentchanges WHERE rc_user != 0 AND rc_type != 3;

This filters out duplicates, IPs and log entries (I can't think of a better mechanism to exclude account creations this quickly).
Received by pastebin from OverlordQ:

SELECT COUNT(DISTINCT rc_user_text) FROM recentchanges WHERE rc_user != 0 AND rc_bot = 0 AND (rc_log_type != 'newusers' OR rc_log_type IS NULL);
+------------------------------+
| COUNT(DISTINCT rc_user_text) |
+------------------------------+
|                       153024 |
+------------------------------+
1 row in set (1 min 57.97 sec)

mysql> SELECT COUNT(DISTINCT rc_user_text) FROM recentchanges WHERE rc_user != 0;
+------------------------------+
| COUNT(DISTINCT rc_user_text) |
+------------------------------+
|                       264739 |
+------------------------------+
1 row in set (14.11 sec)

So, the number isn't wrong, per se. Just rather confusing... I've adjusted the bug summary accordingly.

Possible options to add clarification: link to a MW.org page describing how the statistic is calculated (and what each piece means, how often it's refreshed, etc.). I believe the Job queue includes a link by default now in MediaWiki core.
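For anyone who wants to see why the two counts diverge without waiting two minutes on the Toolserver, here is a minimal sketch that runs the same two WHERE clauses against a toy table in SQLite. The table layout is cut down to the four columns the queries touch, and the rows, the helper names (make_toy_recentchanges, count_active_users) and the user names are made up for illustration; this is not MediaWiki's code.

```python
import sqlite3


def make_toy_recentchanges():
    """Build an in-memory table with only the columns the queries use (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recentchanges ("
        "rc_user INTEGER, rc_user_text TEXT, rc_bot INTEGER, rc_log_type TEXT)"
    )
    rows = [
        (1, "Alice", 0, None),         # ordinary edit
        (1, "Alice", 0, None),         # second edit by the same user
        (2, "Bob", 0, "newusers"),     # only action is creating the account
        (3, "ExampleBot", 1, None),    # bot edit
        (0, "192.0.2.1", 0, None),     # anonymous edit (rc_user = 0)
        (4, "Carol", 0, "move"),       # log action that is not a signup
    ]
    conn.executemany("INSERT INTO recentchanges VALUES (?, ?, ?, ?)", rows)
    return conn


def count_active_users(conn, exclude_bots_and_signups):
    """Count distinct registered users, optionally dropping bots and account creations."""
    sql = "SELECT COUNT(DISTINCT rc_user_text) FROM recentchanges WHERE rc_user != 0"
    if exclude_bots_and_signups:
        # "OR rc_log_type IS NULL" keeps plain edits, whose log type is NULL
        # and would otherwise be discarded by the != comparison.
        sql += " AND rc_bot = 0 AND (rc_log_type != 'newusers' OR rc_log_type IS NULL)"
    return conn.execute(sql).fetchone()[0]


if __name__ == "__main__":
    conn = make_toy_recentchanges()
    print("filtered:", count_active_users(conn, True))
    print("unfiltered:", count_active_users(conn, False))
```

On the toy data the filtered query counts only Alice and Carol, while the unfiltered one also counts Bob and ExampleBot, which is the same kind of gap as 153024 versus 264739 above.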
It should also be possible to set the 'active user period' separately from $wgRCMaxAge. I do not want 2 years of users marked as active, but I do want two years of RC available.
Besides, the number is pretty useless if it cannot be compared between two wikis. It shouldn't be hard to cap it to the current default length of recent changes, for example.
Since r69495 there is $wgActiveUserDays
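To illustrate what a separate active-user window buys you, here is a rough sketch, again against a toy SQLite table. It assumes the window is applied as a cutoff on rc_timestamp in the 14-digit YYYYMMDDHHMMSS form; the active_user_days parameter, the helper names and the sample rows are hypothetical, and this does not claim to be how r69495 implements the setting.

```python
import sqlite3
from datetime import datetime, timedelta


def make_toy_table():
    """In-memory recentchanges with made-up rows spanning more than a year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE recentchanges ("
        "rc_user INTEGER, rc_user_text TEXT, rc_timestamp TEXT)"
    )
    rows = [
        (1, "Alice", "20130110120000"),      # a few days old
        (2, "Bob", "20120101120000"),        # about a year old
        (0, "192.0.2.1", "20130112120000"),  # anonymous, never counted
    ]
    conn.executemany("INSERT INTO recentchanges VALUES (?, ?, ?)", rows)
    return conn


def count_recently_active(conn, now, active_user_days):
    """Count distinct registered users with a change inside the last active_user_days days."""
    cutoff = (now - timedelta(days=active_user_days)).strftime("%Y%m%d%H%M%S")
    # Fixed-width digit strings compare correctly as text, so >= works on the raw column.
    row = conn.execute(
        "SELECT COUNT(DISTINCT rc_user_text) FROM recentchanges "
        "WHERE rc_user != 0 AND rc_timestamp >= ?",
        (cutoff,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = make_toy_table()
    now = datetime(2013, 1, 13)
    print("30-day window:", count_recently_active(conn, now, 30))
    print("730-day window:", count_recently_active(conn, now, 730))
```

With a short window only Alice counts as active even though Bob's year-old change is still in the table, so the RC retention period and the active-user period can differ without skewing the statistic.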