Last modified: 2014-07-03 11:11:47 UTC
Toolserver has the local table toolserver.wiki on all databases that provides metadata about the wikis including the server the wiki's database is kept on:

| mysql> SELECT * FROM toolserver.wiki LIMIT 5;
| +----------------+------+------------+------------------+------+---------+-----------+--------------+--------------+---------------+--------+-------------+
| | dbname         | lang | family     | domain           | size | is_meta | is_closed | is_multilang | is_sensitive | root_category | server | script_path |
| +----------------+------+------------+------------------+------+---------+-----------+--------------+--------------+---------------+--------+-------------+
| | aawikibooks_p  | aa   | wikibooks  | NULL             |    3 |       0 |         1 |            0 |            0 | NULL          |      3 | /w/         |
| | aawiki_p       | aa   | wikipedia  | NULL             |    6 |       0 |         1 |            0 |            0 | NULL          |      3 | /w/         |
| | aawiktionary_p | aa   | wiktionary | NULL             |    1 |       0 |         1 |            0 |            1 | NULL          |      3 | /w/         |
| | abwiki_p       | ab   | wikipedia  | ab.wikipedia.org |  807 |       0 |         0 |            0 |            0 | NULL          |      3 | /w/         |
| | abwiktionary_p | ab   | wiktionary | NULL             |    0 |       0 |         1 |            0 |            1 | NULL          |      3 | /w/         |
| +----------------+------+------------+------------------+------+---------+-----------+--------------+--------------+---------------+--------+-------------+
| 5 rows in set (0.00 sec)
| mysql>

Most of the information can probably be extracted from operations/mediawiki-config, but I don't know which sources there are authoritative.
Played around with:

| include ($MediaWikiRepoPath . "/includes/Defines.php");
| include ($WmfConfigRepoPath . "/wmf-config/InitialiseSettings.php");
| var_dump ($wgConf->settings);

but it doesn't yield for example information about de.wikipedia.org.
(In reply to comment #1)
> Played around with:
>
> | include ($MediaWikiRepoPath . "/includes/Defines.php");
> | include ($WmfConfigRepoPath . "/wmf-config/InitialiseSettings.php");
> | var_dump ($wgConf->settings);
>
> but it doesn't yield for example information about de.wikipedia.org.

Some experiments:

$ php maintenance/eval.php
> $wgDBname='zhwiki';
> $wmfRealm='production';
> $mwConfigDir="$IP/../operations/mediawiki-config";
> $wmfConfigDir="$mwConfigDir/wmf-config";
> function getRealmSpecificFilename($p){global $IP,$wmfConfigDir;return str_replace($p,$IP,$wmfConfigDir);}
> function wmfLoadInitialiseSettings($c){global $wmfConfigDir;require("$wmfConfigDir/InitialiseSettings.php");}
> require("$wmfConfigDir/wgConf.php");
> list($site,$lang)=$wgConf->siteFromDB($wgDBname);
> $wikiTags=array();
> $mwConfigDirHandle=opendir($mwConfigDir);
> while(($f=readdir($mwConfigDirHandle))!==false){if(pathinfo($f,PATHINFO_EXTENSION)==='dblist'&&in_array($wgDBname,array_map('trim',file("$mwConfigDir/$f")))){$wikiTags[]=pathinfo($f,PATHINFO_FILENAME);}}
> $dbSuffix = ( $site === 'wikipedia' ) ? 'wiki' : $site;
> $wgConf->loadFullData();
> $globals = $wgConf->getAll( $wgDBname, $dbSuffix,array('lang' => $lang,'site' => $site,'stdlogo' => "//upload.wikimedia.org/$site/$lang/b/bc/Wiki.png"), $wikiTags );
> print_r($globals);
Array
(
    [wgLegacyEncoding] =>
    [wgCapitalLinks] => 1
    ...
)
>
Do we want a database table consisting of three columns: wiki, config_variable_name, and config_variable_value (as a serialized blob)?
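To make the proposal concrete, here is a minimal sketch of such a three-column table. It uses Python's sqlite3 as a stand-in for the real database and JSON as the serialization format; both are assumptions, as is the table name wiki_config. The only value taken from this thread is wgCapitalLinks being set for zhwiki.

```python
import json
import sqlite3


def create_config_table(conn):
    """Create the hypothetical three-column table (name wiki_config is assumed)."""
    conn.execute(
        """
        CREATE TABLE wiki_config (
            wiki                  TEXT NOT NULL,
            config_variable_name  TEXT NOT NULL,
            config_variable_value BLOB NOT NULL,
            PRIMARY KEY (wiki, config_variable_name)
        )
        """
    )


def set_config(conn, wiki, name, value):
    """Store one setting, serialized as a JSON blob (JSON is an assumed format)."""
    blob = json.dumps(value).encode("utf-8")
    conn.execute(
        "INSERT OR REPLACE INTO wiki_config VALUES (?, ?, ?)",
        (wiki, name, blob),
    )


def get_config(conn, wiki, name):
    """Return the deserialized value, or None if the setting is not stored."""
    row = conn.execute(
        "SELECT config_variable_value FROM wiki_config"
        " WHERE wiki = ? AND config_variable_name = ?",
        (wiki, name),
    ).fetchone()
    return None if row is None else json.loads(row[0].decode("utf-8"))


conn = sqlite3.connect(":memory:")
create_config_table(conn)
set_config(conn, "zhwiki", "wgCapitalLinks", True)
print(get_config(conn, "zhwiki", "wgCapitalLinks"))
```

The composite primary key keeps one row per (wiki, variable) pair, so refreshing a value is a plain replace. The cost of the blob column is that values cannot be filtered in SQL without deserializing them first.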
I think we should have a discussion about what the current "toolserver" database is, what we want in the future, and whether we care about breaking backward compatibility. Some of the design decisions in some of the database tables could probably be re-thought, but only if we're willing to break the current interfaces. In addition, I think we should only rely on MediaWiki's API for this information (with user authentication, as necessary). This is the cleanest and sanest way to accurately get this information, as far as I know.
(In reply to comment #4)
> In addition, I think we should only rely on MediaWiki's API for this
> information (with user authentication, as necessary).

This is particularly important in that some extensions may have hard-to-evaluate effects on some configuration values (namespaces and usergroups being the more obvious cases).

I should say that any necessary configuration value that cannot be fetched through the API should be /added/ to the API rather than fetched through an alternative scheme.

-- Marc
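As an illustration of the API route, the sketch below builds a siteinfo query for a single wiki and extracts the namespaces and user groups from a parsed JSON response. The helper names siteinfo_url and summarize are made up for this example, the default script path /w/ is taken from the script_path column shown earlier, and the sample response is hand-written to mimic the general shape of a siteinfo result rather than copied from a live wiki.

```python
from urllib.parse import urlencode


def siteinfo_url(domain, script_path="/w/"):
    """Build an api.php URL asking for general info, namespaces and user groups."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general|namespaces|usergroups",
        "format": "json",
    }
    return "https://%s%sapi.php?%s" % (domain, script_path, urlencode(params))


def summarize(response):
    """Reduce a parsed siteinfo response to the fields discussed in this thread."""
    query = response["query"]
    return {
        "sitename": query["general"]["sitename"],
        "namespace_ids": sorted(int(ns_id) for ns_id in query["namespaces"]),
        "usergroups": [group["name"] for group in query["usergroups"]],
    }


# Hand-written sample (not real data) with the same nesting as a siteinfo reply.
sample = {
    "query": {
        "general": {"sitename": "Example"},
        "namespaces": {"-1": {"id": -1}, "0": {"id": 0}, "1": {"id": 1}},
        "usergroups": [{"name": "*"}, {"name": "sysop"}],
    }
}

print(siteinfo_url("ab.wikipedia.org"))
print(summarize(sample))
```

Because the request is answered by the wiki itself, the result already reflects whatever extensions have done to namespaces and user groups. It still has to be issued once per wiki.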
API is per wiki. toolserver.wiki is a meta table.
Yes, but you need to populate that table from /somewhere/. :-)
I've added a table with automatically maintained meta information about the replicated databases: meta_p.wiki (which is available on every shard).

+------------------+--------------+------+-----+---------+-------+
| Field            | Type         | Null | Key | Default | Extra |
+------------------+--------------+------+-----+---------+-------+
| dbname           | varchar(32)  | NO   | PRI | NULL    |       |
| lang             | varchar(12)  | NO   |     | en      |       |
| name             | text         | YES  |     | NULL    |       |
| family           | text         | YES  |     | NULL    |       |
| url              | text         | YES  |     | NULL    |       |
| size             | decimal(1,0) | NO   |     | 1       |       |
| slice            | text         | NO   |     | NULL    |       |
| is_closed        | decimal(1,0) | NO   |     | 0       |       |
| has_echo         | decimal(1,0) | NO   |     | 0       |       |
| has_flaggedrevs  | decimal(1,0) | NO   |     | 0       |       |
| has_visualeditor | decimal(1,0) | NO   |     | 0       |       |
| has_wikidata     | decimal(1,0) | NO   |     | 0       |       |
+------------------+--------------+------+-----+---------+-------+

There is a lingering issue with the 'name' column, which seems to encode the wiki name improperly when non-ASCII characters are involved; that will get fixed once I manage to beat some sense into mysql.

Most columns are self-explanatory, and I can add a few more depending on demand. In the meantime, (dbname, slice) provides the much-requested mapping between databases and slices.
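For tool authors, the (dbname, slice) lookup could be sketched as follows. The query text is what would run against meta_p.wiki; everything around it is a stand-in: an in-memory SQLite database attached under the name meta_p, only three of the columns, and invented rows whose host names (host-a.example, host-b.example) are placeholders rather than real slice names.

```python
import sqlite3


def open_stand_in():
    """Build an in-memory stand-in for meta_p.wiki (sample rows are invented)."""
    conn = sqlite3.connect(":memory:")
    # Attaching a second in-memory database lets the query use the meta_p prefix.
    conn.execute("ATTACH DATABASE ':memory:' AS meta_p")
    conn.execute(
        "CREATE TABLE meta_p.wiki ("
        " dbname TEXT PRIMARY KEY,"
        " slice TEXT NOT NULL,"
        " is_closed INTEGER NOT NULL DEFAULT 0)"
    )
    conn.executemany(
        "INSERT INTO meta_p.wiki (dbname, slice, is_closed) VALUES (?, ?, ?)",
        [
            ("aawiki", "host-a.example", 1),  # placeholder host, closed wiki
            ("abwiki", "host-b.example", 0),  # placeholder host, open wiki
        ],
    )
    return conn


def slice_map(conn):
    """Return {dbname: slice} for all wikis that are not closed."""
    rows = conn.execute(
        "SELECT dbname, slice FROM meta_p.wiki WHERE is_closed = 0 ORDER BY dbname"
    )
    return dict(rows.fetchall())


conn = open_stand_in()
print(slice_map(conn))
```

Against the real replicas, the same SELECT would presumably be issued through a MySQL client library instead of sqlite3; only the connection setup should need to change.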
decimal(1,0)? This seems strange. Shouldn't those is_* and has_* columns be BOOL, a.k.a. TINYINT(1)?
I did not want to rely on the existence of bool, which isn't ANSI; mysql "helpfully" translated my numeric(1) to decimal(1,0).
Would it be a problem to rename slice to server, in order to match the column name in toolserver.wiki? The name column looks good to me from a quick look, btw.
It would be possible, but probably unhelpful: from what I understand, the server column is numeric whereas I provide actual host names. Keeping the column named the same with changed semantics seems to be asking for trouble IMO (i.e., better that a select fails than that it returns a string which is misinterpreted as an integer by code with poor error checking).
Added a meta_p.legacy view that has the same column names and order as toolserver.wiki for legacy purposes. Please note that the semantics of the 'server' column differ and there may be other subtle differences from the toolserver's table that are not immediately evident. Unless the same code base has to run on both labs and the toolserver for the interval while it still has replication, transitioning to meta_p.wiki is preferable.