Last modified: 2007-11-11 20:17:57 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T13918, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 11918 - Normalized page titles in the web API contain no underscores


Summary:	Normalized page titles in the web API contain no underscores

Status:	RESOLVED INVALID

Product:	MediaWiki
Classification:	Unclassified
Component:	API
Version:	unspecified
Hardware:	All All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:	http://en.wikipedia.org/w/api.php?act...
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2007-11-10 11:38 UTC by Oskar Liljeblad
Modified:	2007-11-11 20:17 UTC
CC List:	1 user

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Oskar Liljeblad 2007-11-10 11:38:56 UTC

SleepResearch_Facility is a valid page on Wikipedia (its title actually contains the underscore).
When I try to fetch info on this page through the web API using this URL:

http://en.wikipedia.org/w/api.php?action=query&prop=info&format=xml&titles=SleepResearch_Facility

it will automatically normalize the page title "SleepResearch_Facility" to "SleepResearch Facility", like this:

<api>
  <query>
    <normalized>
      <n from="SleepResearch_Facility" to="SleepResearch Facility"/>
    </normalized>
    <pages>
      <page pageid="8149769" ns="0" title="SleepResearch Facility" touched="2007-11-02T00:32:23Z" lastrevid="168245210" counter="0" length="10923"/>
    </pages>
  </query>
</api>

Of course this page title works as well, but it should really contain the underscore.
So two solutions:
1) Do not normalize if the actual page contains underscores.
2) Add an API option "normalize=0" to disable normalization altogether.
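
A client that needs to match the titles it requested against the titles the API returns can read the <normalized> element quoted above as a from/to map. The following is only a minimal Python sketch: the response is the one from this report, pasted in as a string with the page attributes abbreviated, and the function name normalized_map is hypothetical rather than part of any MediaWiki client library.

```python
import xml.etree.ElementTree as ET

# Response copied from this report; page attributes abbreviated for brevity.
RESPONSE = """<api>
  <query>
    <normalized>
      <n from="SleepResearch_Facility" to="SleepResearch Facility"/>
    </normalized>
    <pages>
      <page pageid="8149769" ns="0" title="SleepResearch Facility"/>
    </pages>
  </query>
</api>"""


def normalized_map(xml_text):
    """Return {requested title: normalized title} from a query response.

    Hypothetical helper: returns an empty dict when the response has no
    <normalized> element (nothing needed normalizing).
    """
    root = ET.fromstring(xml_text)  # root is the <api> element
    return {n.get("from"): n.get("to")
            for n in root.findall("./query/normalized/n")}


mapping = normalized_map(RESPONSE)
print(mapping)
```

With this map the caller can keep using the title exactly as it was requested, underscore included, while still looking up the page under the title the API reports.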

Comment 1 Roan Kattouw 2007-11-11 20:17:57 UTC

(In reply to comment #0)
> Of course this page title works as well, but it should really contain the
> underscore.
> So two solutions:
> 1) Do not normalize if the actual page contains underscores.
> 2) Add an API option "normalize=0" to disable normalization altogether.

This is technically impossible. All page titles are stored in the database with spaces changed to underscores. So [[United States]] is stored as "United_States" in the database, and there's no way to figure out whether it was created with a space or an underscore. There's also no way to have two different pages called [[United States]] and [[United_States]]; they're just aliases for the same title.

Note that if you go to [[SleepResearch_Facility]], the big title on top of the page also has a space instead of an underscore (i.e. is also normalized).
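
The point above can be illustrated with a deliberately simplified Python model of the space/underscore handling. This is a sketch under assumptions, not MediaWiki's actual code: the function names to_db_key and to_display are hypothetical, and real title normalization applies further rules (namespaces, first-letter capitalization and so on) that are ignored here.

```python
def to_db_key(title):
    """Simplified model of the stored form: spaces become underscores."""
    return title.replace(" ", "_")


def to_display(title):
    """Simplified model of the displayed form: underscores become spaces."""
    return title.replace("_", " ")


# Both spellings collapse to the same stored key, so once a title is stored
# there is nothing left to tell which spelling was originally typed.
key_from_space = to_db_key("United States")
key_from_underscore = to_db_key("United_States")
print(key_from_space == key_from_underscore)

# The displayed form always uses spaces, as in the API's "to" attribute.
print(to_display("SleepResearch_Facility"))
```

Because both inputs map to one key, the information requested in comment #0 (whether the page was created with an underscore) is not recoverable in this model.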
