Last modified: 2007-11-11 20:17:57 UTC

This is a read-only archive of a Wikimedia Bugzilla report, kept for historical purposes. See T13918, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 11918 - Normalized page titles in the web API contain no underscores
Status: RESOLVED INVALID
Product: MediaWiki
Classification: Unclassified
Component: API
Version: unspecified
Hardware: All All
Importance: Normal enhancement
Target Milestone: ---
Assigned To: Nobody
URL: http://en.wikipedia.org/w/api.php?act...
Depends on:
Blocks:
Reported: 2007-11-10 11:38 UTC by Oskar Liljeblad
Modified: 2007-11-11 20:17 UTC (History)

Description Oskar Liljeblad 2007-11-10 11:38:56 UTC
SleepResearch_Facility is a valid page on Wikipedia (its title actually contains the underscore).
When I try to fetch info on this page through the web API using this URL:

http://en.wikipedia.org/w/api.php?action=query&prop=info&format=xml&titles=SleepResearch_Facility

it will automatically normalize the page title "SleepResearch_Facility" to "SleepResearch Facility", like this:

<api>
  <query>
    <normalized>
      <n from="SleepResearch_Facility" to="SleepResearch Facility"/>
    </normalized>
    <pages>
      <page pageid="8149769" ns="0" title="SleepResearch Facility" touched="2007-11-02T00:32:23Z" lastrevid="168245210" counter="0" length="10923"/>
    </pages>
  </query>
</api>
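
For a client that needs to map the title it requested back to the title the API returned, the `<normalized>` element in the response above already carries that mapping. The following is a minimal sketch in Python, assuming the response has exactly the shape shown; the function names `normalized_titles` and `page_for_request` are hypothetical helpers, not part of the MediaWiki API.

```python
import xml.etree.ElementTree as ET

# The response quoted in this report, copied verbatim.
RESPONSE = """<api>
  <query>
    <normalized>
      <n from="SleepResearch_Facility" to="SleepResearch Facility"/>
    </normalized>
    <pages>
      <page pageid="8149769" ns="0" title="SleepResearch Facility" touched="2007-11-02T00:32:23Z" lastrevid="168245210" counter="0" length="10923"/>
    </pages>
  </query>
</api>"""


def normalized_titles(xml_text):
    """Return {requested title: normalized title} from a query response."""
    root = ET.fromstring(xml_text)  # root is the <api> element
    return {n.get("from"): n.get("to")
            for n in root.findall("./query/normalized/n")}


def page_for_request(xml_text, requested):
    """Return the attributes of the <page> that answers a requested title.

    Falls back to the requested title itself when the API reported no
    normalization for it. Returns None if no matching page is present.
    """
    root = ET.fromstring(xml_text)
    title = normalized_titles(xml_text).get(requested, requested)
    for page in root.findall("./query/pages/page"):
        if page.get("title") == title:
            return dict(page.attrib)
    return None


if __name__ == "__main__":
    print(normalized_titles(RESPONSE))
    print(page_for_request(RESPONSE, "SleepResearch_Facility")["pageid"])
```

With this, a caller can keep using the underscore form as its own key while still locating the right `<page>` entry.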

Of course this page title works as well, but it should really contain the underscore.
So two solutions:
1) Do not normalize if the actual page contains underscores.
2) Add an API option "normalize=0" to disable normalization altogether.
Comment 1 Roan Kattouw 2007-11-11 20:17:57 UTC
(In reply to comment #0)
> Of course this page title works as well, but it should really contain the
> underscore.
> So two solutions:
> 1) Do not normalize if the actual page contains underscores.
> 2) Add an API option "normalize=0" to disable normalization altogether.

This is technically impossible. All page titles are stored in the database with spaces changed to underscores. So [[United States]] is stored as "United_States" in the database, and there's no way to figure out whether it was created with a space or an underscore. There's also no way to have two different pages called [[United States]] and [[United_States]]; they're just aliases for the same title.
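
To illustrate why the two spellings cannot be told apart, here is a deliberately simplified Python sketch of the space/underscore mapping described above. It is not MediaWiki's actual implementation, which performs further normalization that is ignored here; `to_db_key` and `to_display_title` are hypothetical names used only for illustration.

```python
def to_db_key(title):
    """Map a title to its stored form: spaces become underscores.

    Simplified assumption: only the space/underscore rule is modelled.
    """
    return title.replace(" ", "_")


def to_display_title(db_key):
    """Map a stored key back to the displayed form: underscores become spaces."""
    return db_key.replace("_", " ")


if __name__ == "__main__":
    # Both spellings collapse to the same stored key, so the original
    # spelling cannot be recovered from the database.
    print(to_db_key("United States") == to_db_key("United_States"))
    print(to_display_title(to_db_key("SleepResearch_Facility")))
```

Because `to_db_key` maps both inputs to the same key, any function that reads the key back has to pick one display form for every title, and in this sketch, as in the API output, that form uses spaces.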

Note that if you go to [[SleepResearch_Facility]], the big title on top of the page also has a space instead of an underscore (i.e. is also normalized).
