Last modified: 2013-01-31 16:26:23 UTC
On the main Wikimedia app cluster we log occasional read and write failures in CDB, causing bug 31576 and other issues. Analysis of the CDB code and limited testing indicates that a disk full condition may cause a truncated CDB file to be moved into the destination. Most of our app servers have /tmp mounted on a tiny (2GB) partition which is commonly 60-70% full. A spike in disk space usage could cause a transient disk-full condition. More defensive handling of CDB writes might fix this issue. The return values of the underlying write() and close() syscalls should be checked, and if an error occurs, the temporary file should be deleted without the rename being done, and an exception thrown. There are problems in many places: * PHP's dba_close() does not check whether the write of the hashtable or the close() succeeded. This is not easy to fix, since dba_close() just calls the resource destructor, and the resource destructor interface (rsrc_dtor_func_t) does not provide a return value. * Using CdbWriter_PHP would at least allow us to check for failures during the hashtable write. However fclose() suffers from the same problem as dba_close(): it calls the resource destructor and so cannot provide a meaningful error return. It doesn't even check for failures during the final flush of the internal write buffer, see https://bugs.php.net/bug.php?id=60110 * CdbWriter_DBA::set() just passes through the return value of dba_insert(), it does not throw an exception. Thus most callers will continue despite a disk-full error. * CdbWriter_PHP::set() throws an exception on a write error, however this will lead to __destruct() being called and thus a truncated CDB file being moved into the destination. Also, the fact that CdbWriter objects are finalised and closed on __destruct() may lead to an incomplete file being moved into place if some unrelated exception is thrown in the caller during rebuild. Fixing this will mean requiring that callers call CdbWriter::close() before letting the object go out of scope, breaking backwards compatibility for callers like dumpInterwiki.php and extensions/Babel/txt2cdb.php My recommendations are: * Add a configuration variable to configure which CdbWriter subclass is used, and make it CdbWriter_PHP by default. It looks like it makes sense to use the C reader but the PHP writer. * Fix the exception and error handling in both the DBA and PHP versions of CdbWriter. * Encourage the PHP devs to do something about resource destructor error propagation. * Increase the size of the /tmp partition on the app servers.
A couple more thoughts: * Calling fflush() before fclose() would be useful since fflush() calls write() and returns false on error. Then fclose() will only call close() which is somewhat less likely to fail on disk-full. * Maybe a cache stampede after l10n cache invalidation causes /tmp to fill up. Switching to manualRecache mode would fix this.
Filed PHP bug about dba_close(): https://bugs.php.net/bug.php?id=60621 Probably it will be closed "bogus", but at least it's a starting point.
We will want to increase the /tmp file partitions anyway, to handle the temp file for file concatenation in the future anyway. Maybe the use of TempFSFile will assure that a bit more stuff gets deleted timely as well.
*** Bug 32404 has been marked as a duplicate of this bug. ***
*** Bug 32170 has been marked as a duplicate of this bug. ***
Repartitioning is tracked in bug 36488.
Regarding Bug 32404 (marked as duplicate of this one): New occurrences of non-zero namespace pages link appearing as zero namespace in ptwiktionary dump of 24th Jan 2013: SELECT CONCAT('# [[', pl_title, ']] ([[Especial:Whatlinkshere/', pl_title, '|', COUNT(*), ' lig.]])') as linha FROM pagelinks LEFT JOIN page AS pg1 ON pl_namespace = pg1.page_namespace AND pl_title = pg1.page_title LEFT JOIN page AS pg2 ON pl_from = pg2.page_id WHERE pg1.page_namespace IS NULL AND pl_namespace = 0 AND pg2.page_namespace <> 8 GROUP BY pl_namespace, pl_title ORDER BY COUNT(*) DESC LIMIT 0, 3; (note that pl_namespace = 0) yields: '# [[Usuário:ValJor]] ([[Especial:Whatlinkshere/Usuário:ValJor|4860 lig.]])' '# [[Usuário_Discussão:ValJor]] ([[Especial:Whatlinkshere/Usuário_Discussão:ValJor|4833 lig.]])' '# [[Imagem:Flag_of_Canada.svg]] ([[Especial:Whatlinkshere/Imagem:Flag_of_Canada.svg|1604 lig.]])' First is in namespace User, second in User_Discussion and third in File namespace.