Last modified: 2012-12-03 17:44:28 UTC
Raw access to the Apache/Squid HTTP access log files would be nice. This would allow
individual enthusiasts (including me) to run various statistics on those log files
for the purpose of boosting communal participation.
For example, I am interested in knowing which articles in English are most accessed
from Bulgaria and which of them are missing so that I can spend some effort improving
them. To the best of my understanding, this information is not available in any of
the reports automatically generated by Wikipedia.
I also believe that many other legitimate uses of the raw log files would be found,
including academic ones, which could regard Wikipedia as a mini-Internet of sorts,
for which the full contents (the SQL article dump), the change history, and the
access logs are all known. None of this is available for the real Internet, which may make
Wikipedia a valuable playground for the evaluation of PageRank-like relevancy metrics.
Finally, I believe that downloading compressed logs should not place undue burden on
Thank you in advance for considering this suggestion and keep up the good work.
*** Bug 3029 has been marked as a duplicate of this bug. ***
*** Bug 3030 has been marked as a duplicate of this bug. ***
Isn't that what logwood is for? Probably innocence can help you there.
Sorry, forgot the link:
Data are in a database, so most probably more reports can be made.
this has been requested several times in the past and refused each time.
Thanks a lot for the quick reply. Can I find any links to past discussions on
this? Is that a decision on principle, or are there technical limitations? In other
words, can people sponsor Wikipedia to get the logs?
the most recent one was here:
wikitech-l, although i think there were previous discussions as well (i don't
have links handy, but Google might be able to find them...)
[mass-moving wikistats reports from Wikimedia→Statistics to Analytics→Wikistats to have stats issues under one Bugzilla product (see bug 42088) - sorry for the bugspam!]