Last modified: 2014-01-31 03:43:39 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T17434, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 15434 - Periodical run of currently disabled special pages
Status: RESOLVED FIXED
Product: Wikimedia
Classification: Unclassified
Component: Site requests
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 12 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: shell
Duplicates: 1861 9265 11435 13852 15714 15755 16759 16871 16898 20786 25098 25162 26501 30439 47470 48678
Depends on: 46094 39667 43668
Blocks: 29782 39661 42179
 
Reported: 2008-09-01 22:07 UTC by aliter
Modified: 2014-01-31 03:43 UTC (History)
CC List: 49 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Current log from update special pages log (1.40 MB, text/x-log)
2013-01-26 20:12 UTC, Sam Reed (reedy)
Update of all reports, also disabled, on de.wiki (1.88 KB, text/x-log)
2013-01-30 16:38 UTC, Nemo
crontabs on the maintenance server terbium, including the new ones for this bug (11.48 KB, text/plain)
2013-07-31 11:38 UTC, Nemo

Description aliter 2008-09-01 22:07:32 UTC
We're now running on Wanted Pages information that is one year old. If that's unavoidable it might be better to remove the option from the Special Pages list. I would prefer to see an occasional update, however. If daily updates are too heavy a burden, maybe monthly or quarterly?
Comment 1 Filip Maljkovic [Dungodung] 2008-09-01 22:52:06 UTC
Hello. I made a list with my bot here: http://fy.wiktionary.org/wiki/Meidogger:FelixBot/wanted
Even though this is not a fix, it might help you with your current needs :) 
Comment 2 Aaron Schulz 2008-09-01 23:11:17 UTC
Let's not make Domas (a DBA here) cry ;) From what I gather, this query is very slow and intensive, even for a background query. AFAIK, most slow special pages are updated periodically with updateSpecialPages.php, but this one was disabled.
Comment 3 Filip Maljkovic [Dungodung] 2008-09-01 23:14:26 UTC
For small projects like this one, generating such a page doesn't take more than a few seconds (1.69 sec in this case). Maybe special pages like these should be enabled in the updateSpecialPages.php, but only for a selected number of wikis (actually, most of them would fit here).
Comment 4 Raimond Spekking 2008-12-22 17:38:29 UTC
*** Bug 16759 has been marked as a duplicate of this bug. ***
Comment 5 Raimond Spekking 2008-12-22 17:41:51 UTC
*** Bug 11435 has been marked as a duplicate of this bug. ***
Comment 6 Raimond Spekking 2008-12-22 17:47:12 UTC
*** Bug 13852 has been marked as a duplicate of this bug. ***
Comment 7 Raimond Spekking 2009-01-04 10:19:09 UTC
*** Bug 16871 has been marked as a duplicate of this bug. ***
Comment 8 p858snake 2009-01-04 11:00:21 UTC
Not a major issue, changing severity to trivial.
Comment 9 Danny B. 2009-01-04 18:47:20 UTC
Not only Special:WantedPages needs to be updated because of being disabled -> changing the summary.

I guess a monthly update of all of them on all projects, spreading the work through the entire month (say, each day run the update of currently disabled special pages on 1/30 of the wikis, or some other division), could work.

It wouldn't be too expensive at any one time (server-side need) and it's good enough for users, better than the current nothing (client-side need). Both satisfied.

Less expensive pages currently disabled could be run weekly on the same principle...

Also, small wikis, as has been said in comments above, have these pages very inexpensive, so they could be run daily as they used to be. Maybe some weighting according to the # of pages could work as well, say (e.g.) sites with 1-10k pages daily, 10k-100k weekly, 100k-300k bi-weekly, 300k+ monthly or so...

Any other solution than permanent disabling is better.
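
The tiering idea in this comment can be sketched roughly as follows. This is only an illustration: the function names are hypothetical, the thresholds are the example figures quoted above (not a deployed policy), "bi-weekly" is assumed to mean every two weeks, and the boundary values (exactly 10k, 100k, 300k pages) are assigned to the slower tier by assumption.

```python
# Hypothetical sketch of the size-based tiers floated in this comment.
# Thresholds are the comment's example figures, not an actual configuration.

def update_interval_days(page_count: int) -> int:
    """Return how many days to wait between runs for a wiki of this size."""
    if page_count < 10_000:
        return 1    # daily
    if page_count < 100_000:
        return 7    # weekly
    if page_count < 300_000:
        return 14   # bi-weekly, assumed to mean every two weeks
    return 30       # monthly


def wikis_for_day(wikis: list[str], day_of_month: int, slots: int = 30) -> list[str]:
    """Spread a monthly run: each day handles roughly 1/slots of the wikis."""
    slot = (day_of_month - 1) % slots
    # Sorting keeps the assignment stable from one month to the next.
    return [w for i, w in enumerate(sorted(wikis)) if i % slots == slot]
```

With `slots=30`, each wiki lands on exactly one day in a 30-day cycle, which is the "1/30 of wikis per day" division suggested above.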
Comment 10 Melancholie 2009-01-06 03:18:23 UTC
Q: Is this bug dependent on bug 16112 ("Special:Wanted* - purge link tables")?

If yes, the basic problem seems to get fixed, see
https://bugzilla.wikimedia.org/show_bug.cgi?id=16112#c5
Comment 11 p858snake 2009-01-06 03:27:33 UTC
*** Bug 16898 has been marked as a duplicate of this bug. ***
Comment 12 Danny B. 2009-01-06 04:17:50 UTC
*** Bug 16898 has been marked as a duplicate of this bug. ***
Comment 13 Danny B. 2009-09-20 23:26:34 UTC
*** Bug 15714 has been marked as a duplicate of this bug. ***
Comment 14 Danny B. 2009-09-23 22:19:49 UTC
*** Bug 20786 has been marked as a duplicate of this bug. ***
Comment 15 aliter 2009-11-19 00:32:29 UTC
Considering the number of separate reports, this is not a trivial issue, changing severity to minor.
Comment 16 Raimond Spekking 2010-09-13 11:53:35 UTC
*** Bug 25162 has been marked as a duplicate of this bug. ***
Comment 17 Raimond Spekking 2010-09-13 11:54:49 UTC
*** Bug 25098 has been marked as a duplicate of this bug. ***
Comment 18 Antoine "hashar" Musso (WMF) 2011-01-18 17:59:37 UTC
Special pages disabled for all wikis :

Ancientpages
CrossNamespaceLinks
Deadendpages
Fewestrevisions
Mostlinked
Mostrevisions
Wantedpages

We might as well hide them / disable them.
Comment 19 Mark A. Hershberger 2011-01-19 19:48:31 UTC
Ashar, can you make this change?

Looks like http://en.wikipedia.org/wiki/Special:AncientPages is already disabled ...
Comment 20 Mark A. Hershberger 2011-01-29 01:38:19 UTC
*** Bug 26501 has been marked as a duplicate of this bug. ***
Comment 21 Trần Nguyễn Minh Huy 2011-02-01 09:14:23 UTC
http://vi.wikipedia.org/wiki/%C4%90%E1%BA%B7c_bi%E1%BB%87t:Trang_%C4%91%C6%B0%E1%BB%9Dng_c%C3%B9ng

The cache hasn't been updated.
Comment 22 Antoine "hashar" Musso (WMF) 2011-02-01 18:56:36 UTC
Reopening this since the root cause is not fixed.

Either:
1) WMF sysadmins set up the periodical refresh (once per week?)
2) Developers code something to hide the disabled special pages and stop puzzling users with never-updated caches.
Comment 23 Danny B. 2011-03-14 13:04:52 UTC
*** Bug 16871 has been marked as a duplicate of this bug. ***
Comment 24 Jarry1250 2011-03-22 23:05:41 UTC
It was mentioned on bug #14786 that WantedPages might not be all that resource-intensive. Might periodic updates not still remain an option, then, even on the larger wikis (including en.wp)?
Comment 25 Trần Nguyễn Minh Huy 2011-03-23 05:09:31 UTC
Yeah, including vi.wp, too.
Comment 26 p858snake 2011-05-04 06:27:58 UTC
*** Bug 28710 has been marked as a duplicate of this bug. ***
Comment 27 Roan Kattouw 2011-05-15 09:51:02 UTC
*** Bug 9265 has been marked as a duplicate of this bug. ***
Comment 28 Mark A. Hershberger 2011-06-17 23:45:22 UTC
pdhanda is supposed to figure out a schedule to get these updates to run more often .... we also plan on updating the page to say the NEXT time it will run.
Comment 29 Mark A. Hershberger 2011-06-30 15:57:25 UTC
Tim has said some queries should never be run.  I've asked him to add a list here, but I suspect they are the same ones that are listed in Comment #18.

Also, need to find someone in Ops to schedule runs of the other queries since pdhanda probably won't have a chance to do it.
Comment 30 Sam Reed (reedy) 2011-08-18 13:06:48 UTC
*** Bug 30439 has been marked as a duplicate of this bug. ***
Comment 31 bennylin 2011-09-26 12:37:18 UTC
(In reply to comment #9)
> Not only Special:WantedPages needs to be updated because of being disabled ->
> changing the summary.
> 
> I guess a monthly update of all of them on all projects, spreading the work
> through the entire month (say, each day run the update of currently disabled
> special pages on 1/30 of the wikis, or some other division), could work.
> 
> It wouldn't be too expensive at any one time (server-side need) and it's good
> enough for users, better than the current nothing (client-side need). Both
> satisfied.
> 
> Less expensive pages currently disabled could be run weekly on the same
> principle...
> 
> Also, small wikis, as has been said in comments above, have these pages very
> inexpensive, so they could be run daily as they used to be. Maybe some
> weighting according to the # of pages could work as well, say (e.g.) sites
> with 1-10k pages daily, 10k-100k weekly, 100k-300k bi-weekly, 300k+ monthly
> or so...
> 
> Any other solution than permanent disabling is better.

Agree with Danny. Should define "small wikis" first.
Comment 32 Krinkle 2011-10-28 21:48:17 UTC
Even without special "more often" treatment, having all wikis treated as big wikis is good enough too. Anything is better than the current situation.

Last update October 2009...
Comment 33 Thehelpfulone 2012-05-12 21:03:56 UTC
Any update?
Comment 34 matanya 2012-07-23 07:25:05 UTC
I think we can just remove those pages. The benefit is too low compared to what it costs.
Comment 35 Jarry1250 2012-07-23 08:40:38 UTC
(In reply to comment #34)
> I think we can just remove those pages. The benefit is too low compared to
> what it costs.

I can't agree with that at the moment. We are yet to have a comment here about which queries are do-able for smaller wikis; yet more might be optimisable enough to run on larger wikis. There's no way I can advocate sweeping these pages under the rug until that issue is looked into.
Comment 36 bennylin 2012-07-23 09:54:40 UTC
For those who are active in smaller wikis and wish to have updated stats: you might as well download the dump, run the reports locally, and publish the results to your community. I did it that way on id.wp. Granted, it's not gonna be weekly, and might not be feasible for larger ones, but I see that nobody shared this before, so it's an option.
Comment 37 DavidL 2012-08-26 13:16:18 UTC
Blocker bug cannot be minor.
Comment 38 Krinkle 2012-08-26 14:40:02 UTC
*** Bug 1861 has been marked as a duplicate of this bug. ***
Comment 39 Krinkle 2012-08-26 14:46:17 UTC
Currently disabled (last updated October 2009):

* DeadendPages
* AncientPages
* LonelyPages
* UncategorizedCategories
* WantedPages
* WantedTemplates

Currently disabled (last updated 2007):

* FewestRevisions
Comment 40 MZMcBride 2012-08-26 15:23:38 UTC
(In reply to comment #35)
> I can't agree with that at the moment. We are yet to have a comment here about
> which queries are do-able for smaller wikis; yet more might be optimisable
> enough to run on larger wikis. There's no way I can advocate sweeping these
> pages under the rug until that issue is looked into.

I split this issue out to bug 39667 ("Divide wikis into database lists by approximate size for performance engineering"). Punishing small wikis due to their larger brethren has never made sense. This needs to be fixed.
Comment 41 Servien 2012-08-26 19:53:36 UTC
If anyone can answer me, that would be great. If it can be turned on, that
would be even better.
----

Special:Wantedpages hasn't been updated since 2009 on the Low Saxon Wikipedia
(nds-nl), what is the reason for that, is it possible to turn this feature on?
Someone has compiled a list in the past for the Low Saxon Wikipedia using a
bot, unfortunately this user isn't active anymore. Is there someone who can
help me with this?
Comment 42 Nemo 2012-08-26 20:07:09 UTC
(In reply to comment #41)
> Special:Wantedpages hasn't been updated since 2009 on the Low Saxon Wikipedia
> (nds-nl), what is the reason for that, is it possible to turn this feature on?

Nobody really knows.

> Someone has compiled a list in the past for the Low Saxon Wikipedia using a
> bot, unfortunately this user isn't active anymore. Is there someone who can
> help me with this?

https://wiki.toolserver.org/view/DBQ
Comment 43 Dennis C. During 2012-10-11 17:08:05 UTC
For English Wiktionary I would be very happy if this were run monthly or quarterly limited to pages in principal namespace, wanted from pages in principal namespace. An annual run to and from all spaces might be sufficient for other maintenance, IMO.
Comment 44 John Mark Vandenberg 2012-10-12 03:20:39 UTC
Once a year would be good for large projects. They could then build wikiprojects to manage the task of creating all important missing articles.
Comment 45 Amir E. Aharoni 2012-10-17 10:36:23 UTC
So, here's another specific request from the Kyrgyz Wikipedia:
They find https://ky.wikipedia.org/wiki/Special:DeadendPages useful for improving their content and encouraging participation. Unfortunately, it was last updated in 2009.

Is there really a significant performance problem with updating these pages? I'm going through the comments here, and unless I'm missing something, no actual examples of performance issues were given.

Bennylin suggested to "define small wikis". Maybe instead of defining them we could just put some time/CPU/RAM limit on the query?
Comment 46 Nemo 2012-11-18 23:23:43 UTC
Bug 39667 is progressing, but in the meanwhile I've tried to suggest an alternative approach at Gerrit change #33713.
The "idea" would be to update only one page per cluster at a time, (one or) two times per year: but for all wikis.
Comment 47 Nemo 2012-12-07 13:14:16 UTC
Further (non-)updates: I sent an e-mail to wikitech-l on this about a week ago but got no feedback so far (I never have luck with wikitech-l emails ;) ).
http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/65463/
Comment 48 Andre Klapper 2013-01-04 15:09:38 UTC
(In reply to comment #47)
> Further (non-)updates: I sent an e-mail to wikitech-l on this about a week
> ago
> but got no feedback so far (I never have luck with wikitech-l emails ;) ).

Try again & make clear why it's an important issue & what you want to know? :)
Comment 49 Jarry1250 2013-01-04 15:16:22 UTC
Andre, can you chase this up with Ops? Open a RT ticket or something? 

https://gerrit.wikimedia.org/r/#/c/33713/ needs review and/or merging. Only Ops can flick that switch, which is the bottleneck here.
Comment 50 Dennis C. During 2013-01-04 16:32:41 UTC
Would it be possible to do at least annual runs of WantedPages for principal namespace on en.wiktionary.org? 

All the redlinks are annoying to look at. Some are valid terms that we would want to include. Others just need to be unwikilinked because they would not meet our standards for inclusion. I can understand why the smaller wikis get more benefit for the resource cost, but the large wikis benefit as well. Even if some dump processing could, in principle, generate the same content, inclusive MW runs are good ways of catching idiosyncratic uses of the wiki, some of which are undesirable.

Would you need a vote from en.wiktionary to support this?
Comment 51 JackPotte 2013-01-04 16:33:54 UTC
and fr.wikt please ;)
Comment 52 Bawolff (Brian Wolff) 2013-01-04 16:57:56 UTC
(In reply to comment #50)
> Would you need a vote from en.wiktionary to support this?

That's unnecessary. We know everyone wants the special pages being run and I'm sure that the moment we're sure they can be safely run, they will be.
Comment 53 Sam Reed (reedy) 2013-01-17 18:21:56 UTC
*** Bug 15755 has been marked as a duplicate of this bug. ***
Comment 54 Malafaya 2013-01-25 11:30:30 UTC
The special pages haven't updated in 5 days at pt.wikt. Is it related to the moving of servers?
Comment 55 Malafaya 2013-01-25 16:53:52 UTC
Check this link, for example: https://pt.wiktionary.org/wiki/Especial:Categorias_pedidas

This one is 6 days old: https://nl.wikipedia.org/wiki/Speciaal:GevraagdeCategorie%C3%ABn
Comment 56 Sam Reed (reedy) 2013-01-25 16:59:12 UTC
(In reply to comment #54)
> The special pages haven't updated in 5 days at pt.wikt. Is it related to the
> moving of servers?

Seems quite likely.

(In reply to comment #55)
> Check this link, for example:
> https://pt.wiktionary.org/wiki/Especial:Categorias_pedidas
> 
> This one is 6 days old:
> https://nl.wikipedia.org/wiki/Speciaal:GevraagdeCategorie%C3%ABn

Those aren't disabled special pages. This is not the correct bug.
Comment 57 Andre Klapper 2013-01-25 17:13:54 UTC
Malafaya: As this is not about "Currently disabled special pages", I've copied comment 54 and comment 55 to bug 44348.
Comment 58 Sam Reed (reedy) 2013-01-26 20:12:06 UTC
Created attachment 11692 [details]
Current log from update special pages log

It's missing the temporarily disabled query pages for frwiki, but should be fairly indicative for the time being
Comment 59 Sam Reed (reedy) 2013-01-30 16:30:15 UTC
reedy@fenari:~$ time mwscript updateSpecialPages.php enwiki --override | tee ~/public_html/enwikispecialpages.log
Statistics                     1m completed in 21.33s
ValidationStatistics           completed in 7.72s
Ancientpages                   got 1000 rows in 3h 33m 10.58s
BrokenRedirects                got 31 rows in 9m 12.57s
Deadendpages                   got 617 rows in 2h 26m 22.34s
Disambiguations                got 1000 rows in 13m 49.71s
DoubleRedirects                got 223 rows in 4m 45.21s
FileDuplicateSearch            cheap, skipped
LinkSearch                     cheap, skipped
Listredirects                  got 1000 rows in 0.26s
Lonelypages                    got 1000 rows in 4m 16.68s
Longpages                      cheap, skipped
MIMEsearch                     got 0 rows in 0.00s
Mostcategories                 got 1000 rows in 1h 12m 31.90s
Mostimages                     got 1000 rows in 11m 15.93s
Mostinterwikis                 got 1000 rows in 10m 9.26s
Mostlinkedcategories           cheap, skipped
Mostlinkedtemplates            got 1000 rows in 4h 5m 54.14s
Mostlinked                     got 1000 rows in 18h 24m 55.82s
Mostrevisions                  got 1000 rows in 14h 48m 27.91s
Fewestrevisions                got 1000 rows in 13h 58m 12.41s
Shortpages                     cheap, skipped
Uncategorizedcategories        got 216 rows in 31m 28.03s
Uncategorizedpages             got 125 rows in 29m 38.84s
Uncategorizedimages            got 57 rows in 1m 13.81s
Uncategorizedtemplates         got 1000 rows in 1.84s
Unusedcategories               got 1000 rows in 18.40s
Unusedimages                   got 1000 rows in 18.55s
Wantedcategories               got 1000 rows in 17m 49.86s
Wantedfiles                    got 1000 rows in 17m 57.45s
Wantedpages                    got 1000 rows in 7h 17m 20.02s
Wantedtemplates                got 1000 rows in 1h 41m 22.78s
Unwatchedpages                 got 1000 rows in 1m 49.07s
Unusedtemplates                got 1000 rows in 39.74s
Withoutinterwiki               got 1000 rows in 5m 41.06s

real    4210m17.172s
user    0m0.920s
sys     0m0.112s



^ Nearly 3 days to run
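
For anyone wanting to work with these timings, the per-page durations can be pulled out of the log with a small parser. The sketch below is an assumption-laden illustration, not an existing tool: `duration_seconds` and `slowest` are hypothetical names, and the regular expression only targets the "... in 18h 24m 55.82s" shape seen in the lines above ("cheap, skipped" lines and the `time` summary are ignored).

```python
import re
from typing import Optional

# Matches the trailing duration of a log line, e.g. "in 18h 24m 55.82s" or "in 0.26s".
DURATION_RE = re.compile(r"\bin\s+(?:(\d+)h\s+)?(?:(\d+)m\s+)?(\d+(?:\.\d+)?)s\s*$")


def duration_seconds(line: str) -> Optional[float]:
    """Return the run time in seconds, or None if the line carries no duration."""
    match = DURATION_RE.search(line)
    if match is None:
        return None
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds)


def slowest(lines, top=3):
    """Return (page name, seconds) pairs for the longest-running pages."""
    timed = []
    for line in lines:
        secs = duration_seconds(line)
        if secs is not None:
            # The page name is the first whitespace-separated field of the line.
            timed.append((line.split()[0], secs))
    return sorted(timed, key=lambda pair: pair[1], reverse=True)[:top]
```

Feeding the saved log file to `slowest()` line by line would list the pages that dominate the total run time.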
Comment 60 Nemo 2013-01-30 16:38:26 UTC
Created attachment 11715 [details]
Update of all reports, also disabled, on de.wiki

Reedy also did de.wiki.
I commented on Gerrit change #33713: The worst case is mostlinked, 18h on en.wiki. It ran to completion without problems and the slave it hit (in pmtpa, now idle after the eqiad migration) had only a few seconds of lag for a few minutes every now and then, and no significant load. This seems safe enough to merge, even significantly more cautious than needed.
Comment 61 Dennis C. During 2013-01-30 17:47:07 UTC
1. Can I take WP's experience as a reasonable indication of the relative cost of various special pages in en.wikt, which probably has higher link density and has many widely transcluded templates, but has mostly short pages?

2. We have been having some discussions, which seem to suggest that these runs are not so useful as to warrant high frequency.

3. Furthermore, we seem to need more discrimination in, for example, WantedPages, which we seem to be able to provide by working the dump. Even such reports don't seem to be needed very frequently, as we don't yet have the capability to break the list down by language, which would make it much more useful.
Comment 62 DavidL 2013-02-02 17:01:13 UTC
From the log about times taken to update page on en.wiki, tasks can be scheduled smartly, by using previous time read from log files.

For a total period of 2 months:
  - if it can run in less than 10m -> once per 2 days
  - if it can run in less than 1h -> once per 2 weeks
  - if it can run in less than 5h -> once per month
  - longer -> once per 2 months

I computed a time charge of about 7%.
I can attach the spreadsheet file I used (ODS format) if you wish.

Also, it would be good to review algorithms used to compute these pages, and maybe databases need refactoring to optimize such computations.
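
The run-time-based schedule in this comment could be expressed along these lines. It is a sketch under stated assumptions: the function names are hypothetical, "once per month" is taken as 30 days and "once per 2 months" as 60 days, and the load figure is simply total run time divided by wall-clock time over the period, which may not be exactly how the spreadsheet mentioned above computes it.

```python
# Hypothetical sketch of the thresholds proposed in this comment.

def interval_days(last_run_seconds: float) -> int:
    """Pick a rerun interval from how long the previous run took."""
    if last_run_seconds < 10 * 60:
        return 2      # under 10 minutes: once per 2 days
    if last_run_seconds < 60 * 60:
        return 14     # under 1 hour: once per 2 weeks
    if last_run_seconds < 5 * 60 * 60:
        return 30     # under 5 hours: once per month (assumed 30 days)
    return 60         # longer: once per 2 months (assumed 60 days)


def load_fraction(run_times_seconds, period_days: int = 60) -> float:
    """Share of the period spent running updates, given one run time per page."""
    busy = sum(
        seconds * (period_days / interval_days(seconds))  # runs per period
        for seconds in run_times_seconds
    )
    return busy / (period_days * 86_400)
```

Passing the per-page run times from the en.wiki log to `load_fraction()` gives an estimate that can be compared with the figure quoted above.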
Comment 63 Sam Reed (reedy) 2013-02-02 22:25:24 UTC
(In reply to comment #62)
> Also, it would be good to review algorithms used to compute these pages, and
> maybe databases need refactoring to optimize such computations.

In most cases, it's probably not worth the overhead/cost of doing the refactorings.
Comment 64 MZMcBride 2013-03-14 04:32:25 UTC
(In reply to comment #3)
> For small projects like this one, generating such a page doesn't take more
> than a few seconds (1.69 sec in this case). Maybe special pages like these
> should be enabled in the updateSpecialPages.php, but only for a selected
> number of wikis (actually, most of them would fit here).

Related:

* bug 43668: Re-enable disabled Special pages on small wikis (wikis in small.dblist)

* bug 46094: Re-enable disabled Special pages on medium wikis (wikis in medium.dblist)
Comment 65 Nemo 2013-04-23 07:16:36 UTC
*** Bug 47470 has been marked as a duplicate of this bug. ***
Comment 66 Alex Monk 2013-05-21 19:06:22 UTC
*** Bug 48678 has been marked as a duplicate of this bug. ***
Comment 67 Liangent 2013-06-16 09:02:11 UTC
Given this bug can't be resolved in short time, I made a version on Tool-Lab:

http://tools.wmflabs.org/liangent-php/index.php/zhwiki~~wgUseDatabaseMessages=0?title=Special:%E6%96%AD%E9%93%BE%E9%A1%B5%E9%9D%A2&uselang=en

Let me know if your wiki wants this too.
Comment 68 Nemo 2013-06-16 09:04:55 UTC
(In reply to comment #67)
> Given this bug can't be resolved in short time,

Actually, Asher gave green light for the patch, Reedy amended it and it could be merged any time soon. :)
Comment 69 Gerrit Notification Bot 2013-07-30 20:50:05 UTC
Change 33713 merged by Dzahn:
(bug 15434) Periodical run of currently disabled special pages

https://gerrit.wikimedia.org/r/33713
Comment 70 Nemo 2013-07-31 11:38:39 UTC
Created attachment 13028 [details]
crontabs on the maintenance server terbium, including the new ones for this bug

So, this bug can now hopefully be considered (mostly) fixed, with its current summary, after mutante approved and fixed the change above.
In detail, the special pages 1) AncientPages, 2) DeadendPages, 3) MostLinked, 4) MostRevisions, 5) WantedPages, 6) FewestRevisions will be updated twice a year on every wiki as follows:
a) page 1) in 1st and 7th month of the year, page 2) in 2nd and 8th month etc.,
b) starting at 1 UTC on each of the days from the 11th to the 17th of the month, where the wikis on database s1 run on the 11th, those on s2 on the 12th, etc., as listed on https://noc.wikimedia.org/dbtree/ .
(You can see the crontabs attached, as provided by mutante.) In short we should first see DeadendPages (which is among the slowest) updated for en.wiki on August 11, on it.wiki, pl.wiki etc. the next day and so on.
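
The rule in a) and b) can be sketched as a small calculation. This is illustrative only and is not taken from the attached crontabs: `schedule` and `cron_fields` are hypothetical helpers, the page order is the numbering given in this comment, shards are assumed to be named s1 to s7, and the five-field cron string assumes the server clock is in UTC.

```python
# Hypothetical sketch of the half-yearly schedule described in this comment.
PAGES = [
    "AncientPages", "DeadendPages", "MostLinked",
    "MostRevisions", "WantedPages", "FewestRevisions",
]
FIRST_DAY = 11  # s1 runs on the 11th, s2 on the 12th, ... s7 on the 17th


def schedule(page: str, shard: str):
    """Return ((first_month, second_month), day_of_month) for a page on a shard."""
    number = PAGES.index(page) + 1          # page 1) .. page 6)
    day = FIRST_DAY + int(shard[1:]) - 1    # "s3" -> 13
    if not 11 <= day <= 17:
        raise ValueError(f"unexpected shard: {shard}")
    return (number, number + 6), day


def cron_fields(page: str, shard: str) -> str:
    """Render minute, hour, day-of-month, month, day-of-week for 1:00."""
    (first, second), day = schedule(page, shard)
    return f"0 1 {day} {first},{second} *"
```

For example, `schedule("DeadendPages", "s1")` gives months 2 and 8 on day 11, which agrees with the "en.wiki on August 11" expectation stated above.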

The next steps, in order, are:
1) keep an eye on the first updates to see whether they are successful and if they overload the servers too much, in which case they may be disabled;
2) if all goes well, make the frequency higher or much higher, e.g. monthly (as Tim put it, "If they don't break the site, then why not run them every week?"), or decide that this is enough;
3) try and add updates for the pages disabled only on en.wiki, fr.wiki and perhaps wikidata (see <https://git.wikimedia.org/blob/operations%2Fmediawiki-config.git/351a084f4a26fc7daeeccadedba48706f251664a/wmf-config%2FInitialiseSettings.php#L9308>).
This bug should be kept open at least till (1), so for a couple weeks more; 2-3 may be split to other bugs, but ideally we'll be more confident with testing, they'll follow in short order and we'll be able to close this bug to our complete satisfaction.


The queries will happen against the new databases in Ashburn, if I understand correctly, so let's thank (and wish in) the power of the new datacentre. Kudos to all the WMF people who helped transform my rough proposal into something real, including mutante, Reedy, Asher Feldman, Peter Youngmeister, Tim Starling, Ariel Glenn.
Comment 71 Andre Klapper 2013-08-15 15:36:39 UTC
(In reply to comment #70 by Nemo)
> So, this bug can now hopefully be considered (mostly) fixed, with its current
> summary, after mutante approved and fixed the change above.

Nemo: Let's close as RESOLVED FIXED then?
Comment 72 Nemo 2013-08-15 16:35:59 UTC
(In reply to comment #71)
> (In reply to comment #70 by Nemo)
> > So, this bug can now hopefully be considered (mostly) fixed, with its current
> > summary, after mutante approved and fixed the change above.
> 
> Nemo: Let's close as RESOLVED FIXED then?

I'm planning to check the results of the crontab later today.
Comment 74 Nemo 2013-08-15 19:59:25 UTC
(In reply to comment #73)
> Aren't the pages
> * https://pt.wikipedia.org/wiki/Special:DeadendPages
> * https://pt.wikipedia.org/wiki/Special:MostLinked
> * https://pt.wikipedia.org/wiki/Special:MostRevisions
> * https://pt.wikipedia.org/wiki/Special:WantedPages
> * https://pt.wikipedia.org/wiki/Special:FewestRevisions
> * https://pt.wikipedia.org/wiki/Special:AncientPages
> supposed to be updated? (the most recent update was in 2009)

No, only DeadendPages, on s1-5 as of today. However the update didn't work on any of the wikis I checked on those DBs, whether big or small. :[
Can a shell user please check the logs on /home/mwdeploy/updateSpecialPages/ ?
Comment 75 Gerrit Notification Bot 2013-08-15 21:43:04 UTC
Change 79279 had a related patch set uploaded by Nemo bis:
Make SpecialPages Titlecase in misc::maintenance::updatequerypages

https://gerrit.wikimedia.org/r/79279
Comment 76 Gerrit Notification Bot 2013-08-16 11:18:08 UTC
Change 79279 merged by ArielGlenn:
Make SpecialPages Titlecase in misc::maintenance::updatequerypages

https://gerrit.wikimedia.org/r/79279
Comment 77 Nemo 2013-08-17 06:54:53 UTC
(In reply to comment #76)
> Change 79279 merged by ArielGlenn:
> Make SpecialPages Titlecase in misc::maintenance::updatequerypages
> 
> https://gerrit.wikimedia.org/r/79279

The fix has been approved but didn't go live on the server (puppet had been disabled), so we have to wait till next month (September 11-17) for updates to Special:MostLinked, to know how all this works.
Comment 78 Danny B. 2013-09-05 23:11:35 UTC
Coming from https://meta.wikimedia.org/wiki/Tech/News/2013/34

Half a year on all wikis? That's nonsense. :-/

Small wikis with hundreds or thousands of articles can simply be updated much more often.

Also, a half-yearly update of very frequently changing special pages such as Uncategorized*, Double/Broken redirects etc. doesn't make sense.

Better to disable and hide such special pages completely than to provide obsolete results for half a year, which will only confuse people as they do now.

Why wasn't it scaled as suggested in the proposal in comment #9?
Comment 79 Nemo 2013-09-14 23:36:13 UTC
(In reply to comment #77)
> The fix has been approved but didn't go live on the server (puppet had been
> disabled), so we have to wait till next month (September 11-17) for updates
> to
> Special:MostLinked, to know how all this works.

It seems it's working, but we need to be cautious: I checked en.wiki for s1, it.wiki for s2, fr.quote for s3, commons for s4 and they have been updated.
The ganglia graphs show that slave lag was not very significantly affected (worst case probably s2 with some 1.5 s lag on one server; then Commons with an average 1.4 s over two hours on one server), while other metrics were more. In order:
<https://ganglia.wikimedia.org/latest/?r=custom&cs=09%2F11%2F2013+00%3A00&ce=9%2F12%2F2013+12%3A00&tab=ch&vn=&hreg[]=db10%2843|49|50|51|52%29>
<https://ganglia.wikimedia.org/latest/?r=custom&cs=09%2F12%2F2013+00%3A00&ce=9%2F13%2F2013+12%3A00&tab=ch&vn=&hreg[]=db10%2802|09|18%29>
<https://ganglia.wikimedia.org/latest/?r=custom&cs=09%2F13%2F2013+00%3A00&ce=9%2F14%2F2013+12%3A00&tab=ch&vn=&hreg[]=db10%2803|10|35%29>
<https://ganglia.wikimedia.org/latest/?r=custom&cs=09%2F14%2F2013+00%3A00&ce=9%2F15%2F2013+12%3A00&tab=ch&vn=&hreg[]=db10%2804|11|20%29>
(more eyeballs and conclusions/interpretations appreciated).

If, as it seems, the current setup is not going to kill the cluster :) , I'd proceed with some patches for step 2 or 3 as per comment 70 in a few days.
Comment 80 Gerrit Notification Bot 2013-09-17 20:48:42 UTC
Change 84632 had a related patch set uploaded by Nemo bis:
Periodical run of remaining currently disabled special pages on en.wiki

https://gerrit.wikimedia.org/r/84632
Comment 81 Gerrit Notification Bot 2013-09-17 20:51:44 UTC
Change 84635 had a related patch set uploaded by Nemo bis:
Periodical run of disabled special pages: make updates monthly

https://gerrit.wikimedia.org/r/84635
Comment 82 Nemo 2013-09-17 20:56:58 UTC
(In reply to comment #79)
> If, as it seems, the current setup is not going to kill the cluster :) , I'd
> proceed with some patches for step 2 or 3 as per comment 70 in a few days.

The other databases look even more bored, so I submitted two more patches to make updates for the 6 reports in comment 70 monthly and to add updates for the 6 reports disabled on en.wiki only (with the current frequency i.e. every 6 months).
They will probably sit in gerrit for a while... or perhaps not, we'll see. I'm told the new database guru is Sean Pringle, adding to cc. :)
Comment 83 Sean Pringle 2013-09-17 21:50:09 UTC
I saw when these queries ran but didn't know what they were at the time. Thanks for cc'ing me.

Don't make the mistake of thinking the databases are bored ;-) That's a slippery slope. However, I'm ok with these jobs going ahead providing the slave lag doesn't suffer unduly, and the innodb purge activity has enough chance to keep-up/catch-up between jobs. The latter may mean using non-contiguous day-of-month on cron jobs hitting the same shard.

But let's see how it goes. I'll merge it.
Comment 84 Gerrit Notification Bot 2013-09-17 21:52:08 UTC
Change 84632 merged by Springle:
Periodical run of remaining currently disabled special pages on en.wiki

https://gerrit.wikimedia.org/r/84632
Comment 85 Gerrit Notification Bot 2013-09-17 21:54:12 UTC
Change 84635 merged by Springle:
Periodical run of disabled special pages: make updates monthly

https://gerrit.wikimedia.org/r/84635
Comment 86 Malafaya 2013-09-18 09:13:03 UTC
Does this somehow fix the problem of special pages (all of them) currently not being automatically refreshed?
Comment 87 Nemo 2013-09-18 10:04:14 UTC
(In reply to comment #86)
> Does this somehow fix the problem of special pages (all of them) currently
> not
> being automatically refreshed?

That's a separate issue with the separate cronjob on non-disabled special pages.
Comment 88 William915 2013-10-05 02:41:17 UTC
(In reply to comment #78)
> Coming from https://meta.wikimedia.org/wiki/Tech/News/2013/34
> 
> Half a year on all wikis? That's nonsense. :-/
> 
> Small wikis with hundreds or thousands of articles can simply be updated much
> more often.
> 
> Also, a half-yearly update of very frequently changing special pages such as
> Uncategorized*, Double/Broken redirects etc. doesn't make sense.
> 
> Better to disable and hide such special pages completely than to provide
> obsolete results for half a year, which will only confuse people as they do
> now.
> 
> Why wasn't it scaled as suggested in the proposal in comment #9?

I totally agree. Please make updates more frequent on small wikis.
Comment 89 Ata 2013-10-05 20:13:57 UTC
I came here from Wikisource. Special:WantedPages in enwikisource was last updated 04:20, 16 October 2009. 
https://en.wikisource.org/w/index.php?title=Special:WantedPages
Unbelievable.
Comment 90 Nemo 2013-10-05 23:01:52 UTC
As said above, the update is now set to be monthly. In 5 days from now we should see the stream of updates.
Comment 91 MZMcBride 2013-10-06 02:09:53 UTC
(In reply to comment #89)
> I came here from Wikisource. Special:WantedPages in enwikisource was last
> updated 04:20, 16 October 2009. 
> https://en.wikisource.org/w/index.php?title=Special:WantedPages
> Unbelievable.

I wanted to refute this incredulousness with stats about how large the English Wikisource is, but it turns out it's not very large. ;-)

MariaDB [enwikisource_p]> select count(*) from pagelinks\G
*************************** 1. row ***************************
count(*): 8390968
1 row in set (8.53 sec)

MariaDB [enwikisource_p]> select count(*) from page\G
*************************** 1. row ***************************
count(*): 1457066
1 row in set (0.41 sec)

I know it's difficult to believe, but the maintenance Special pages situation _is_ improving, just very slowly. Unfortunately, enwikisource is considered a large database (cf. <https://noc.wikimedia.org/conf/large.dblist>), so even bug 46094 won't help here. Nemo's efforts should, though.
Comment 92 Andyrom75 2013-10-06 07:27:53 UTC
The special page that shows the most requested pages on https://it.wikivoyage.org (https://it.wikivoyage.org/wiki/Speciale:PagineRichieste) hasn't been updated since November 2012; that's almost 1 year!!!

Can someone help us to update it? The situation is becoming ridiculous....

PS It's not the only one...
Comment 93 Nemo 2013-10-20 10:18:53 UTC
Update on the plan as per comment 70 and comment 82: the monthly update of all 6 disabled pages on all wikis worked, while for the update of the 6 additional en.wiki disabled special pages we have to wait for tomorrow ([[Special:MostLinkedTemplates]]).

To verify the above, I checked all 12 pages on en.wiki and one of the 6 pages on a wiki of each cluster; I also took a quick look at graphs like those linked in comment 79 and there wasn't anything worth noting, though I'm thinking of some improvements to the crontabs.
Comment 94 Danny B. 2013-10-20 16:23:23 UTC
(In reply to comment #90)
> As said above, the update is now set to be monthly. In 5 days from now we
> should see the stream of updates.

It is 15 days from "now" and many pages are still not updated.

Broken Redirects, Double Redirects, Uncategorized Pages, Uncategorized Templates, Uncategorized Categories, Wanted Categories, Wanted Templates, Wanted Files, Most Linked Templates @ cs wikis: 10 Sep

Most Linked Pages says 13 Oct & that update is OFF which seems weird to me.

These are just some, not necessarily all, because I was not checking all of them.
Comment 95 Nemo 2013-10-20 16:31:44 UTC
(In reply to comment #94)
> It is 15 days from "now" and many pages are still not updated.
> 
> Broken Redirects, Double Redirects, Uncategorized Pages, Uncategorized
> Templates, Uncategorized Categories, Wanted Categories, Wanted Templates,
> Wanted Files, Most Linked Templates,  @ cs wikis: 10 Sep

For the Nth time: that's bug 53227.

> 
> Most Linked Pages says 13 Oct & that update is OFF which seems weird to me.

It's weird only for those not reading the summary of this bug, which is about DISABLED special pages.
Comment 96 Danny B. 2013-10-20 16:41:59 UTC
(In reply to comment #95)
> (In reply to comment #94)
> > It is 15 days from "now" and many pages are still not updated.
> > 
> > Broken Redirects, Double Redirects, Uncategorized Pages, Uncategorized
> > Templates, Uncategorized Categories, Wanted Categories, Wanted Templates,
> > Wanted Files, Most Linked Templates,  @ cs wikis: 10 Sep
> 
> For the Nth time: that's bug 53227.

nth? It is the very first time mentioned on this page! It was neither in blocking, depending nor see-also bugs, nor in any comment.



> > Most Linked Pages says 13 Oct & that update is OFF which seems weird to me.
> 
> It's weird only for those not reading the summary of this bug, which is about
> DISABLED special pages.

*I* made the summary of this bug 4.5 years ago :-P

If something is being updated even twice a year, it is not DISABLED, thus it must not be written there. (Actually creating new bug for this issue.)
Comment 97 MZMcBride 2013-11-17 02:54:33 UTC
I believe this is relevant (from <https://wikitech.wikimedia.org/w/index.php?title=Server_Admin_Log&oldid=89461>):

---

== November 17 ==
01:35 Reedy: Killed updateSpecialPages and related processes on terbium
01:18 MaxSem: Killed a few long queries on db1007
01:08 MaxSem: db1007 is having tough times due to special page updates

---

Both seemed to agree that staggering the updates would sufficiently help.
Comment 98 Sean Pringle 2013-11-17 05:35:49 UTC
Staggering things more would be fantastic. Batching even.

Note that as per db-eqiad.php Query::recache (which I think these count as) stuff has been pointed to the snapshot slaves, of which db1007 is one, and they are LB=1 for normal traffic.

So technically if those slaves get thrashed it's not a show stopper and we could simply dial back icinga noise for a while. Still...
Comment 99 Nemo 2013-11-17 08:01:48 UTC
Yep, as anticipated in comment 93, the monthly updates will need to be re-sorted so that they don't all happen on the same day. I'll submit a patch later today if nobody beats me to it.
Comment 100 Max Semenik 2013-11-17 19:32:31 UTC
https://gerrit.wikimedia.org/r/95876
Comment 101 Gerrit Notification Bot 2013-11-17 20:01:40 UTC
Change 95889 had a related patch set uploaded by Nemo bis:
Make the monthly querypages updates not hit each cluster on the same day

https://gerrit.wikimedia.org/r/95889
Comment 102 Gerrit Notification Bot 2013-11-21 10:28:35 UTC
Change 95889 merged by Springle:
Make the monthly querypages updates not hit each cluster on the same day

https://gerrit.wikimedia.org/r/95889
Comment 104 Nemo 2014-01-09 14:16:31 UTC
The plan as per comment 70 and comment 82 has been implemented. If someone can just have a crosswiki look at the special pages to ensure they're being updated correctly and at the ganglia graphs to check nothing is going to explode in our face soon, we can confirm it's all done.
Comment 105 Sean Pringle 2014-01-09 21:38:05 UTC
These queries were fine in December after Nemo's last patch, plus Tim fixed a related load balancing bug allowing me to properly segregate them. So from DB perspective, seems ok.
Comment 106 Nemo 2014-01-12 15:36:15 UTC
I've checked all the pages listed at [[m:Special:PermaLink/7056706]] and they look fine (updated in the last month), except for 5 out of the 6 reports disabled on en.wiki only... I'll check later what's happening with those and file that separately; I call this fixed.
Comment 107 Dennis C. During 2014-01-12 20:28:44 UTC
Thanks. I hope it stays fixed. There is some room for reduction in frequency or elimination of certain reports. If wikis had some kind of resource budget for maintenance reports, it would be possible for them to decide which reports were worth it for them.

OTOH, there are some items like Special:Unwatched Pages for which the maximum of 5,000 pages makes the report silly for a wiki like English Wiktionary. That is connected to the larger question of watchlist editing.
Comment 108 Nemo 2014-01-13 07:59:27 UTC
I asked something like that around comment 47 but it's very hard: currently not even WMF and its own changes have anything like a "database stress" ""budget"". (Too long a discussion for this bug.)

(In reply to comment #107)
> OTOH, there are some items like Special:Unwatched Pages for which the maximum
> of 5,000 pages makes the report silly for a wiki like English Wiktionary.

Then let's try to make such pages useful. :) You have a few options:
* extend "unwatchedpages" permission and let people use action=info on individual pages (simple config change),
* file a consensual config change request to increase the number of results shown for that page (it's probably not too expensive) + a core bug to add such a configuration option,
* propose some way to make that special page more useful for all wikis.

You don't need a budget to reason about what's important for your wiki and why, and clearly expose your use case/proposal in new bug reports. MediaWiki has so many features that it's often extremely hard for devs to understand on their own what's really important / has a true impact on any given wiki/community (it is even for wiki regulars on wikis they don't know). If you don't document, describe and argue for the needs of your wiki, nobody will do it for you. ;-)
Comment 109 Dennis C. During 2014-01-13 13:50:39 UTC
Before embarking on one of your recommended courses of action:

Would it be possible for us to process the dump to get a list of unwatched pages?

At Wiktionary we usually focus only on "lemma" entries, ie, not on inflected forms like simple English plural nouns (much more common for languages like Latin), so the actual list of what we care most about is much shorter than the list of ALL unwatched pages. I could, with help from others at Wiktionary, run some Perl scripts to create the desired listing of relevant entries if "unwatched" or "watched" is an attribute on some XML dump file.
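
The filtering step described above could be sketched roughly as follows, in Python instead of Perl. This is only an illustration under assumptions: it assumes a MediaWiki XML export file whose `<page>` elements carry `<title>` and `<ns>` children, and it only streams main-namespace titles. Whether any dump exposes a "watched"/"unwatched" attribute is exactly the open question here, so the sketch does not rely on one. The function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def iter_main_namespace_titles(source):
    """Yield titles of pages in namespace 0 from a MediaWiki XML export.

    `source` is a filename or a binary file object. The file is parsed
    as a stream, so a large dump never has to fit in memory at once.
    """
    for _event, elem in ET.iterparse(source, events=("end",)):
        # Drop the XML namespace prefix, e.g. "{http://...}page" -> "page".
        if elem.tag.rsplit("}", 1)[-1] != "page":
            continue
        title = None
        ns = None
        for child in elem:
            child_tag = child.tag.rsplit("}", 1)[-1]
            if child_tag == "title":
                title = child.text
            elif child_tag == "ns":
                ns = child.text
        if ns == "0" and title is not None:
            yield title
        # Free the memory used by the page that was just processed.
        elem.clear()
```

A further filter (for example, keeping only lemma entries) would then be applied to the yielded titles; what counts as a lemma is wiki-specific and is left out of this sketch.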
