Last modified: 2010-05-15 15:37:53 UTC

Bug 2674 - namespace numbers missing in XML dumps
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: Internationalization
Version: 1.5.x
Hardware: All All
Importance: High major
Target Milestone: ---
Assigned To: Nobody
Duplicates: 3143
Depends on:
Blocks: 1002
Reported: 2005-07-02 21:30 UTC by El
Modified: 2010-05-15 15:37 UTC (History)

Description El 2005-07-02 21:30:12 UTC
In the XML dumps, namespaces are stored as prefixes in page titles.
What happens if new namespaces are defined and someone imports a dump
to a MediaWiki installation that doesn't know them yet? The pages
will land in the article namespace. The same applies if the languages
of the two MediaWiki installations (i.e. where the dump was exported
and where it is imported) don't match, or if a dump is made before
new namespace names are translated (or after they are translated, if
the MediaWiki installation into which the dump is imported doesn't have
the translated name yet).

The situation isn't better for people who want to write scripts that
analyse dumps of various wikis. E.g. if I'd like to know how many user
pages exist in different Wikipedias, I must tell the script all
translations of "user".
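
To illustrate the scripting problem, here is a minimal sketch of what such a script has to do when titles only carry a text prefix. The table `USER_PREFIXES` and the helper names are hypothetical, and the table is deliberately incomplete: every wiki language not listed in it is silently miscounted.

```python
# Hypothetical, hand-maintained table of localized "User" namespace prefixes.
# It has to be extended for every wiki language the script should handle.
USER_PREFIXES = {"en": "User", "de": "Benutzer", "fr": "Utilisateur"}

def is_user_page(title, lang):
    """Return True if the title starts with the 'User' prefix known for lang."""
    prefix, sep, _rest = title.partition(":")
    # Without a colon there is no namespace prefix at all.
    return bool(sep) and prefix == USER_PREFIXES.get(lang)

def count_user_pages(titles, lang):
    """Count titles that look like user pages in a dump of the given language."""
    return sum(is_user_page(t, lang) for t in titles)
```

For a language missing from the table, `count_user_pages` returns 0 no matter how many user pages the dump contains, which is exactly the maintenance burden described above.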

All these problems can be avoided if the namespaces are given as numbers.
And to avoid redundancy, the prefixes should then be omitted, IMO.

So e.g. instead of

  <title>Talk:Wikipedia</title>

this would be better:

  <title namespace="1">Wikipedia</title>

As an alternative (or maybe even in addition), a translation
of namespace names can be put at the beginning of the XML dump.
That would look like this:

<namespaces>
  <namespace id="0" />
  <namespace id="1">Talk</namespace>
  ...
</namespaces>

*Please* do something before the first dumps in XML format
are made, if it isn't too late, because people (including me)
will hate it if the dump format changes all the time.
Comment 1 Brion Vibber 2005-07-02 22:01:00 UTC
Using namespace text is deliberate, so that custom namespaces don't just fall into 
neverneverland, but will be imported with names intact.

It would, however, be very good to include a list of namespace definitions and some 
other config info at the top of the dump. This would make it quite easy to split 
titles by namespace and get the symbolic namespaces during the import.
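
A rough sketch of how an import or analysis script could use such a list to split titles by namespace. It assumes the `<namespaces>` layout proposed in the description (with an `id` attribute); the element and attribute names in the actual export schema may differ, and `load_namespaces` and `split_title` are hypothetical helper names.

```python
import xml.etree.ElementTree as ET

# Sample header in the form proposed in the description (assumed layout).
HEADER = """<namespaces>
  <namespace id="0" />
  <namespace id="1">Talk</namespace>
  <namespace id="2">User</namespace>
</namespaces>"""

def load_namespaces(xml_text):
    """Map each namespace name to its number; the main namespace has name ''."""
    root = ET.fromstring(xml_text)
    return {(ns.text or ""): int(ns.get("id")) for ns in root.iter("namespace")}

def split_title(title, namespaces):
    """Split 'Prefix:Rest' into (namespace number, bare title).

    Titles whose prefix is not a known namespace stay in the main
    namespace with the full title intact.
    """
    prefix, sep, rest = title.partition(":")
    if sep and prefix and prefix in namespaces:
        return namespaces[prefix], rest
    return namespaces.get("", 0), title
```

With the sample header, `split_title("Talk:Wikipedia", load_namespaces(HEADER))` yields the namespace number 1 and the bare title `Wikipedia`, so a script no longer needs to know the localized names in advance.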
Comment 2 Brion Vibber 2005-07-05 00:37:04 UTC
Done w/ version 0.3 of export schema.
Comment 3 Zigger 2005-08-15 23:09:26 UTC
See also bug 3143.
Comment 4 Brion Vibber 2005-08-16 23:48:29 UTC
*** Bug 3143 has been marked as a duplicate of this bug. ***
