Last modified: 2013-08-01 18:08:07 UTC
It did this:

Indexed 50 pages ending at 29137 at 24/second
Indexed 50 pages ending at 29213 at 24/second
Indexed 50 pages ending at 29278 at 24/second
Indexed 50 pages ending at 29368 at 24/second
Indexed 50 pages ending at 29467 at 24/second
Indexed 50 pages ending at 29542 at 24/second
Indexed 50 pages ending at 29612 at 24/second
Indexed 50 pages ending at 29687 at 24/second
Indexed 50 pages ending at 29764 at 24/second
Indexed 50 pages ending at 29835 at 24/second
Indexed 50 pages ending at 29907 at 24/second
Indexed 50 pages ending at 30016 at 24/second
Indexed 50 pages ending at 29261 at 24/second
Indexed 50 pages ending at 29347 at 24/second
Indexed 50 pages ending at 29449 at 24/second
Indexed 50 pages ending at 29512 at 24/second
Indexed 50 pages ending at 29592 at 24/second
Indexed 50 pages ending at 29660 at 24/second
Indexed 50 pages ending at 29743 at 24/second
Indexed 50 pages ending at 29815 at 24/second
Indexed 50 pages ending at 29888 at 24/second
Indexed 50 pages ending at 29969 at 24/second
Indexed 50 pages ending at 30070 at 24/second
Indexed 50 pages ending at 30143 at 24/second
Indexed 50 pages ending at 30214 at 24/second
Indexed 50 pages ending at 30294 at 24/second
Indexed 50 pages ending at 30355 at 24/second
Indexed 50 pages ending at 30433 at 24/second
Indexed 50 pages ending at 16386 at 24/second
Indexed 50 pages ending at 16464 at 24/second
Indexed 50 pages ending at 16525 at 24/second
Indexed 50 pages ending at 16605 at 24/second
Indexed 50 pages ending at 16694 at 24/second
Indexed 50 pages ending at 16755 at 24/second
Indexed 50 pages ending at 16822 at 24/second
Indexed 50 pages ending at 16889 at 24/second
Indexed 50 pages ending at 16989 at 24/second
Indexed 50 pages ending at 17095 at 24/second
Indexed 50 pages ending at 17157 at 24/second
Indexed 50 pages ending at 17242 at 24/second
Indexed 50 pages ending at 17320 at 24/second
Indexed 50 pages ending at 17422 at 24/second
Indexed 50 pages ending at 17493 at 24/second
Indexed 50 pages ending at 17547 at 24/second
Indexed 50 pages ending at 17654 at 24/second
Indexed 50 pages ending at 17727 at 24/second
Indexed 50 pages ending at 17859 at 24/second
Indexed 50 pages ending at 17940 at 24/second

After I killed it it only had a total of 17349 live documents:

manybubbles@deployment-bastion:~$ curl deployment-es0:9200/simplewiki/page/_count?pretty
{
  "count" : 17349,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "failed" : 0
  }
}

Looks like the documents were just overwriting themselves over and over again:

manybubbles@deployment-bastion:~$ curl -s deployment-es0:9200/simplewiki/_status?pretty | grep deleted | head -n1
"deleted_docs" : 272

Note that the reason there aren't a ton of deleted docs sitting around is because elasticsearch cleans them up.
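The backwards jumps in the progress log above (30016 to 29261, then 30433 to 16386) can be picked out mechanically. Here is a minimal Python sketch that assumes only the "Indexed N pages ending at ID at R/second" line format shown in the log; the function name `find_regressions` is made up for this illustration and is not part of any existing tool.

```python
import re

# Matches progress lines of the form shown in the log above.
PROGRESS = re.compile(r"Indexed (\d+) pages ending at (\d+) at (\d+)/second")


def find_regressions(log_text):
    """Return (previous_end, current_end) pairs where the end id went backwards."""
    ends = [int(m.group(2)) for m in PROGRESS.finditer(log_text)]
    return [(a, b) for a, b in zip(ends, ends[1:]) if b < a]


# A short excerpt of the log from this report.
sample = (
    "Indexed 50 pages ending at 29907 at 24/second "
    "Indexed 50 pages ending at 30016 at 24/second "
    "Indexed 50 pages ending at 29261 at 24/second "
    "Indexed 50 pages ending at 30433 at 24/second "
    "Indexed 50 pages ending at 16386 at 24/second"
)

print(find_regressions(sample))  # [(30016, 29261), (30433, 16386)]
```

Any non-empty result means the indexer moved its position backwards at least once, which is consistent with documents being re-indexed rather than new ones being added.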
This seems to be caused by forceSearchIndex.php hitting a redirect. It is supposed to filter out redirects (by the page_is_redirect column), but that doesn't seem to work 100% of the time. In any case, the code that tries to index the redirect target confuses the code that finds the place to keep indexing. I have a solution I'm testing locally now.
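To illustrate how this kind of interaction might produce the loop seen in the log, here is a toy Python simulation. It is not the forceSearchIndex.php code and not necessarily what the patch does. It assumes a batch indexer that resumes from a page-id cursor, a redirect that slipped past the filter, and a cursor that is (wrongly) taken from the id of the document that was indexed rather than the row that was read. All names and the page table are hypothetical.

```python
def make_pages():
    """Hypothetical page table: page_id -> (is_redirect, redirect_target_id)."""
    pages = {i: (False, None) for i in range(1, 21)}
    pages[15] = (True, 5)  # page 15 redirects to page 5, which has a lower id
    return pages


def run_indexer(pages, batch_size, buggy, max_batches=20):
    """Simulate batch indexing and return the cursor after each batch."""
    cursor = 0
    history = []
    for _ in range(max_batches):
        # Next batch: the lowest page ids above the current cursor.
        batch = sorted(pid for pid in pages if pid > cursor)[:batch_size]
        if not batch:
            break
        last_row_id = batch[-1]
        is_redirect, target = pages[last_row_id]
        # A redirect gets indexed as its target page.
        last_indexed_id = target if is_redirect else last_row_id
        # Buggy: resume from the id of the document that was indexed.
        # Fixed: resume from the id of the row that was read.
        cursor = last_indexed_id if buggy else last_row_id
        history.append(cursor)
    return history


print(run_indexer(make_pages(), 5, buggy=False))                # [5, 10, 15, 20]
print(run_indexer(make_pages(), 5, buggy=True, max_batches=6))  # [5, 10, 5, 10, 5, 10]
```

In the buggy run the cursor keeps falling back to the redirect target, so the same pages are indexed again and again and the run only ends because of the `max_batches` cap. That matches the symptom above of a steady indexing rate with a live document count that stops growing.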
Patch here: https://gerrit.wikimedia.org/r/#/c/77127
This has been merged, and I'm rebuilding the search indices now to make sure nothing was skipped.