Last modified: 2014-10-29 19:15:06 UTC
MaxSem wrote a small shell script, triggered by cron and maintained in puppet: https://gerrit.wikimedia.org/r/#/c/68309/ Given a list of wikis and articles, the script copies them to a beta wiki each day. The point of this bug is to migrate that script to Jenkins, which will let us tweak the job easily without relying on ops to merge our changes, and will also let us trigger the sync manually by rerunning the job.
The Parsoid team has a list of about 160,000 pages from various language wikis (the largest portion being from English WP) that they test their round-tripping on. It would probably be a great list to auto-pull into beta for general-purpose testing. See http://parsoid.wmflabs.org:8001/ for Parsoid's use of it.
Alternatively, a small script might spider two or three deep from Main_Page. That might give a good set of "likely to be high traffic" pages.
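A rough sketch of that spidering idea, in Python. It does a breadth-first walk from a start page and stops following links after a fixed number of hops. Everything here is an assumption for illustration: the `spider` function, the `get_links` callback, and the `FAKE_LINKS` graph are hypothetical names, and the link lookup is stubbed with an in-memory dict rather than a real wiki. In practice `get_links` would have to query the wiki for a page's outgoing links.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def spider(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    max_depth: int = 2,
) -> List[str]:
    """Breadth-first walk from `start`, following links at most `max_depth` hops.

    Returns page titles in the order they were first discovered.
    """
    # Maps each discovered title to its distance (in hops) from `start`.
    seen: Dict[str, int] = {start: 0}
    queue = deque([start])
    while queue:
        title = queue.popleft()
        depth = seen[title]
        if depth >= max_depth:
            # Pages at the depth limit are kept, but their links are not followed.
            continue
        for linked in get_links(title):
            if linked not in seen:
                seen[linked] = depth + 1
                queue.append(linked)
    return list(seen)


# Hypothetical link graph standing in for a real wiki (assumption, not real data).
FAKE_LINKS = {
    "Main_Page": ["A", "B"],
    "A": ["B", "C"],
    "B": ["Main_Page"],
    "C": ["D"],
    "D": [],
}


def fake_get_links(title: str) -> List[str]:
    """Stub link lookup; a real version would ask the wiki for outgoing links."""
    return FAKE_LINKS.get(title, [])


if __name__ == "__main__":
    for page in spider("Main_Page", fake_get_links, max_depth=2):
        print(page)
```

The resulting title list could then be fed to the existing copy script in place of a hand-maintained list. Whether link depth is actually a good proxy for traffic is untested.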
Assigning Ariel since we've been talking about how to do this recently and they're working on it.
For a list of high-priority languages from Asaf (which I'd just trust blindly, since I have no real domain knowledge), see this PDF: https://commons.wikimedia.org/wiki/File:WMF%27s_New_Global_South_Strategy.pdf (specifically, page 18)