Last modified: 2010-04-27 23:23:32 UTC

Bug 13750 - $wgCapitalLinks should be a per-namespace setting
Status: RESOLVED FIXED
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: unspecified
Hardware: All All
Importance: Normal enhancement with 1 vote
Target Milestone: ---
Assigned To: Chad H.
Keywords: patch, patch-reviewed
Duplicates: 9254
Depends on:
Blocks: 3904 3950 5134 5601
Reported: 2008-04-15 11:51 UTC by Chad H.
Modified: 2010-04-27 23:23 UTC
CC List: 9 users

Attachments
Patch against 33344 (8.18 KB, patch), 2008-04-15 13:09 UTC, Chad H.
Update against 33356 (8.72 KB, patch), 2008-04-15 15:13 UTC, Chad H.
Updated patch (10.20 KB, patch), 2008-08-09 22:03 UTC, Chad H.
Updated patch (11.51 KB, patch), 2008-08-24 09:27 UTC, Chad H.
Updated patch (12.69 KB, patch), 2008-10-31 22:42 UTC, Bryan Tong Minh
Updated patch (19.26 KB, patch), 2009-02-24 16:06 UTC, Chad H.
Update again (22.71 KB, patch), 2009-02-25 00:07 UTC, Chad H.

Description Chad H. 2008-04-15 11:51:54 UTC
While talking on IRC, it was suggested that being able to customize the link capitalization per namespace might be helpful. To implement this, I wrote a new method (Namespace::isUpperCaseNS), but I ran into several issues:

1) At a few points, $wgCapitalLinks is invoked before a namespace has been selected, as a ucfirst/lcfirst is often applied before passing it to Title::newFromText(). In some cases (such as Special:Upload and FileRepo), it is easy to guess it. However, a few places make it hard to figure it out.

2) I haven't encountered an issue with this yet, but I didn't know if having it /per/ namespace might have an issue. If one made their content namespace upper, and their associated talk namespace free-form, would this cause any issues in seemingly unrelated areas?

I'm not sure what to do about those places where no namespace has been initialized. As for the second issue, to prevent mismatched cases between content and talk, I was thinking of requiring that each content/talk pair of namespaces be set together, either both upper or both free-form.

Hopefully I can submit a patch for this later today.
Comment 1 Chad H. 2008-04-15 13:09:56 UTC
Created attachment 4817 [details]
Patch against 33344

This takes care of everything except for AjaxFunctions; there is still some debate over whether that's needed.
Comment 2 Chad H. 2008-04-15 15:13:56 UTC
Created attachment 4818 [details]
Update against 33356

This also includes the fixes to AjaxFunctions.php (it seems $wgCapitalLinks isn't needed there after all).
Comment 3 Brion Vibber 2008-04-15 17:40:04 UTC
Note also bug 5134
Comment 4 Daniel Friesen 2008-04-16 00:28:17 UTC
Adding this to the list of things which will be solved with the Title Rewrite.

Under the current title setup, trying to alter case-sensitivity behavior like this is going to result in either an extremely hacky method which cannot be relied on, or complete failure.

The title rewrite allows for extending of the Normalizing system, so when that is all offloaded into a single extensible and modifiable system it becomes possible to define things which should only be done in certain namespaces by testing the data being passed around.

I'll also tag bug 5134.
Comment 5 Chad H. 2008-04-16 00:41:37 UTC
I'll mark this as LATER for us to come back to. If the Title rewrite will fix a lot of this, then we can wait for that.

Thanks.
Comment 6 Brion Vibber 2008-04-16 19:01:51 UTC
I see no reason to believe this requires a massive rewrite to implement.

Most likely, all it needs is a tweak to Title::secureAndSplit(), and some cleanup of a few bits that flip titles around manually.
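
As a rough illustration of the kind of tweak described here, the sketch below resolves the namespace prefix first and only then decides whether to upper-case the first letter of the page name. It is not MediaWiki's `Title::secureAndSplit()`; the function names `sketchUcfirst` and `sketchNormalizeTitle`, the namespace table, and the per-namespace flags are all hypothetical.

```php
<?php
// Multibyte-safe ucfirst(); assumes UTF-8 input and the mbstring extension.
function sketchUcfirst( string $text ): string {
	return mb_strtoupper( mb_substr( $text, 0, 1, 'UTF-8' ), 'UTF-8' )
		. mb_substr( $text, 1, null, 'UTF-8' );
}

// $namespaces: hypothetical map of lower-case prefix => namespace id.
// $capitalized: hypothetical map of namespace id => bool; ids missing
// from the map are treated as capitalized.
function sketchNormalizeTitle( string $text, array $namespaces, array $capitalized ): array {
	$ns = 0; // main namespace unless a known prefix is found
	$name = $text;
	$pos = strpos( $text, ':' );
	if ( $pos !== false ) {
		$prefix = strtolower( substr( $text, 0, $pos ) );
		if ( array_key_exists( $prefix, $namespaces ) ) {
			$ns = $namespaces[$prefix];
			$name = substr( $text, $pos + 1 );
		}
	}
	// Capitalization is decided only after the namespace is known.
	if ( $capitalized[$ns] ?? true ) {
		$name = sketchUcfirst( $name );
	}
	return [ $ns, $name ];
}
```

The point of the ordering is that any caller applying ucfirst() before the namespace is known would need to be changed to go through a function like this instead.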
Comment 7 Chad H. 2008-08-09 17:28:50 UTC
Adding some more dependencies. Should be fixable pretty soon, cleaning up the work on a patch right now.
Comment 8 Chad H. 2008-08-09 22:03:48 UTC
Created attachment 5156 [details]
Updated patch

This updated patch should take care of all remaining uses of $wgCapitalLinks and now make it a per-namespace setting. Setting to bool retains old behavior of keeping it as a site-wide setting.

The only breakage I see relative to current behavior is that using the new namespace-based settings changes API and export behavior. The siteinfo "case" attribute is no longer shown on sites not using the global bool setting. However, this information is now output in the namespace information, alongside the namespace ID and subpage support.

Also checked parser tests to make sure it didn't mess up linking, and it all appears to be fine (16 failed tests before and after patching).
Comment 9 Brion Vibber 2008-08-13 02:51:47 UTC
Requires XML schema update to export data:
-				'id' => $ns
+				'id' => $ns,
+				'case' => MWNamespace :: isCapitalizedNamespace( $ns ) ? 'first-letter' : 'case-sensitive',


Patch would change current behavior and links -- would make any custom namespaces etc change to the broken case-sensitive behavior. Default needs to retain backwards-compatibility.
-$wgCapitalLinks = true;
+$wgCapitalLinks[ NS_MAIN ]     = true;
+$wgCapitalLinks[ NS_PROJECT ]  = true;
+$wgCapitalLinks[ NS_IMAGE ]    = true;
+$wgCapitalLinks[ NS_TEMPLATE ] = true;
+$wgCapitalLinks[ NS_HELP ]     = true;
+$wgCapitalLinks[ NS_CATEGORY ] = true;
 

PHP 5.3 breakage w/ 'Namespace' references:
+		if ( $this->initialCapital != Namespace::isCapitalizedNamespace( NS_IMAGE ) ) {
+			if( Namespace::isCapitalizedNamespace( NS_IMAGE ) ) {
+	if( Namespace::isCapitalizedNamespace( NS_MAIN ) ) { // Only searching the mainspace anyway
+		if( Namespace::isCapitalizedNamespace( $this->mNamespace ) && $this->mInterwiki == '') {
etc

This doesn't make sense to me; part of the point of making it configurable might presumably be to allow non-caps usernames on offsite wikis:
 	/**
+	 * These namespaces should always be first-letter capitalized, now and 
+	 * forevermore. Historically, they could've probably been lowercased too, 
+	 * but some things are just too ingrained now. :)
+	 */
+	private static $alwaysCapitalizedNamespaces = array( NS_SPECIAL, NS_USER, NS_MEDIAWIKI );

I don't like this function name; it's long and redundant (we know it's a namespace, since there's a big fat "MWNamespace::" right before it every time we call)
+	public static function isCapitalizedNamespace( $index ) {

This is incompatible with the array/bool dichotomy, and would also spew errors in cases where there's an array but no specific entry for NS_IMAGE:
-		'initialCapital' => $wgCapitalLinks,
+		'initialCapital' => $wgCapitalLinks[ NS_IMAGE ], // No namespace class yet :(

Comment 10 Chad H. 2008-08-24 09:27:51 UTC
Created attachment 5212 [details]
Updated patch

Tweaked the previous patch to incorporate feedback from Brion.

(In reply to comment #9)
> Requires XML schema update to export data:
> -                               'id' => $ns
> +                               'id' => $ns,
> +                               'case' => MWNamespace :: isCapitalizedNamespace( $ns ) ? 'first-letter' : 'case-sensitive',
> 
> 
> Patch would change current behavior and links -- would make any custom
> namespaces etc change to the broken case-sensitive behavior. Default needs to
> retain backwards-compatibility.
> -$wgCapitalLinks = true;
> +$wgCapitalLinks[ NS_MAIN ]     = true;
> +$wgCapitalLinks[ NS_PROJECT ]  = true;
> +$wgCapitalLinks[ NS_IMAGE ]    = true;
> +$wgCapitalLinks[ NS_TEMPLATE ] = true;
> +$wgCapitalLinks[ NS_HELP ]     = true;
> +$wgCapitalLinks[ NS_CATEGORY ] = true;
> 

I wouldn't think so. If a particular namespace is undefined in $wgCapitalLinks, it ends up returning true. Also, with the is_bool() check, current true/false settings for people will remain as they currently are.
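
A minimal sketch of the behavior described in this reply, assuming the setting is either a site-wide boolean or an array keyed by namespace id. The function name `sketchIsCapitalized` is hypothetical and this is not the code from the attachment.

```php
<?php
// $setting is either a site-wide boolean or an array of namespace id => bool.
function sketchIsCapitalized( $setting, int $ns ): bool {
	// Old-style configuration: one boolean applies to every namespace.
	if ( is_bool( $setting ) ) {
		return $setting;
	}
	// Array configuration: a namespace with no entry falls back to true,
	// and the ?? operator avoids an undefined-index notice.
	return (bool)( $setting[$ns] ?? true );
}
```

With this shape, an existing `true` or `false` setting keeps its meaning, and a custom namespace that is missing from the array stays first-letter capitalized.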

> PHP 5.3 breakage w/ 'Namespace' references:
> +               if ( $this->initialCapital != Namespace::isCapitalizedNamespace( NS_IMAGE ) ) {
> +                       if( Namespace::isCapitalizedNamespace( NS_IMAGE ) ) {
> +       if( Namespace::isCapitalizedNamespace( NS_MAIN ) ) { // Only searching the mainspace anyway
> +               if( Namespace::isCapitalizedNamespace( $this->mNamespace ) && $this->mInterwiki == '') {
> etc
> 

Fixed, oops.

> This doesn't make sense to me; part of the point of making it configurable
> might presumably be to allow non-caps usernames on offsite wikis:
>         /**
> +        * These namespaces should always be first-letter capitalized, now and
> +        * forevermore. Historically, they could've probably been lowercased too,
> +        * but some things are just too ingrained now. :)
> +        */
> +       private static $alwaysCapitalizedNamespaces = array( NS_SPECIAL, NS_USER, NS_MEDIAWIKI );
> 

Made two tweaks to User where names are force-capitalized, so that the capitalization of that namespace is checked first. Thus, setting $wgCapitalLinks[ NS_USER ] to false now allows for lowercase usernames.

> I don't like this function name; it's long and redundant (we know it's a
> namespace, since there's a big fat "MWNamespace::" right before it every time
> we call)
> +       public static function isCapitalizedNamespace( $index ) {
> 

Fixed. Now called MWNamespace::isCapitalized().

> This is incompatible with the array/bool dichotomy, and would also spew errors
> in cases where there's an array but no specific entry for NS_IMAGE:
> -               'initialCapital' => $wgCapitalLinks,
> +               'initialCapital' => $wgCapitalLinks[ NS_IMAGE ], // No namespace class yet :(
> 

$wgLocalFileRepo is now configured in Setup without establishing its initialCapital setting. Instead, FileRepo defaults to NS_IMAGE's capitalization during setup rather than defaulting to true (which was broken; it should've defaulted to $wgCapitalLinks).
Comment 11 Bryan Tong Minh 2008-10-29 21:33:57 UTC
Still applies cleanly and works :) 
Comment 12 Brion Vibber 2008-10-29 23:17:02 UTC
There are several calls to Namespace::isCapitalized -- these will fail on PHP 5.3 and later. Make sure all calls are to the MWNamespace class.

It may be worth adding a standard function for normalizing a title prefix -- either on MWNamespace or Title -- since I see a lot of these are checks around a ucfirst() call for a term to be used in a prefix search.

The change to the export format requires an update to the export schema -- new version and updated schema file.

DefaultSettings.php fails to establish an initial value for $wgCapitalLinks -- this is a register_globals vulnerability and may show an E_NOTICE warning.

Setting the $wgCapitalLinks default only for specific namespaces means that all custom namespaces will end up being fully case-sensitive (not enforcing the initial caps), which is an unacceptable change in behavior.

No longer returning the site-wide case setting in API siteinfo and Special:Export siteinfo could lead to compatibility problems with bot tools.
Comment 13 Daniel Friesen 2008-10-30 05:49:49 UTC
(In reply to comment #12)
> No longer returning the site-wide case setting in API siteinfo and
> Special:Export siteinfo could lead to compatibility problems with bot tools.
> 

Perhaps at some point we should add a title normalization or other type of title handling module to the API. And recommend bots start to make use of the module rather than trying to imitate MW's Title class and normalize everything on their own for comparison.

Besides per-namespace case sensitivity, there are a number of other title normalization change requests floating around: complete case insensitivity, changing how underscores and spaces are treated (in some cases wikis want - rather than _), and so on.
Comment 14 Roan Kattouw 2008-10-30 12:24:17 UTC
(In reply to comment #13)
> (In reply to comment #12)
> > No longer returning the site-wide case setting in API siteinfo and
> > Special:Export siteinfo could lead to compatibility problems with bot tools.
> > 
> 
> Perhaps at some point we should add a title normalization or other type of
> title handling module to the API. And recommend bots start to make use of the
> module rather than trying to imitate MW's Title class and normalize everything
> on their own for comparison.

We already have that. http://en.wikipedia.org/w/api.php?action=query&titles=uSeR_tAlK:catrope|main_page will normalize your titles just fine. And with &redirects it'll even resolve redirects.
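
For bot authors, a minimal sketch of building the query URL quoted above. The helper name `sketchNormalizeUrl` is hypothetical, and `format=json` is an added assumption that is not part of the URL in the comment; reading the response is left out.

```php
<?php
// Build an action=query URL for a list of titles.
// http_build_query() percent-encodes ':' and '|' in the titles value.
function sketchNormalizeUrl( string $endpoint, array $titles, bool $redirects = false ): string {
	$params = [
		'action' => 'query',
		'titles' => implode( '|', $titles ),
		'format' => 'json', // assumption: JSON output is wanted
	];
	if ( $redirects ) {
		// Empty value; the comment above passes it as a bare "&redirects".
		$params['redirects'] = '';
	}
	return $endpoint . '?' . http_build_query( $params, '', '&' );
}

$url = sketchNormalizeUrl(
	'http://en.wikipedia.org/w/api.php',
	[ 'uSeR_tAlK:catrope', 'main_page' ]
);
```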
Comment 15 Bryan Tong Minh 2008-10-31 22:42:04 UTC
Created attachment 5497 [details]
Updated patch

* Added Title::capitalize function
* Return at least something for the case setting in ApiQuerySiteInfo and Export
* If nothing set for a namespace it will default to capitalize 

I have no clue about XML schemas so somebody else'd better do that.
Comment 16 Chad H. 2009-02-24 16:06:56 UTC
Created attachment 5863 [details]
Updated patch

Handles everything identically to Bryan's patch, plus a few things:
* Updated code to current standards (and applies cleanly to head)
* Update XML schema from 0.3 to 0.4 (updated Export to indicate this).
Comment 17 Brion Vibber 2009-02-24 23:23:27 UTC
+<schema xmlns="http://www.w3.org/2001/XMLSchema"
+        xmlns:mw="http://www.mediawiki.org/xml/export-0.3/"
+        targetNamespace="http://www.mediawiki.org/xml/export-0.3/"
+        elementFormDefault="qualified">

^ Either the version number should be bumped here or we should go ahead and clean up the schema/namespace.... Probably the namespace URL should *not* include the version, and the version should be in a separate 'version' element. It *should* be in the XML Schema URL, of course.

+			// $wgCapitalLinks is a per namespace setting
+			// Return something sensible so that bots don't choke
+			$data['case'] = 'per-namespace';

If the default site behavior is going to be unchanged from the old default, we probably shouldn't change our output here. Either the default should still be a blanket 'true', or we should go ahead and output a firm answer here if everything's set to true.

+ * @since 1.14 - This can now be set per-namespace. Some special namespaces (such
^ needs updating to 1.15 :)

-$wgCapitalLinks = true;
+$wgCapitalLinks = array();
+$wgCapitalLinks[ NS_MAIN ]     = true;
+$wgCapitalLinks[ NS_USER ]     = true;
+$wgCapitalLinks[ NS_PROJECT ]  = true;
+$wgCapitalLinks[ NS_FILE ]    = true;
+$wgCapitalLinks[ NS_TEMPLATE ] = true;
+$wgCapitalLinks[ NS_HELP ]     = true;
+$wgCapitalLinks[ NS_CATEGORY ] = true;
^ If the default is 'true' for all unset namespaces, there's no need to list any explicitly.

It may make more sense to keep $wgCapitalLinks as a way to set the default, and have a second config variable with per-namespace overrides. That would also allow a site which is mostly case-sensitive to define a single forced-capital namespace without explicitly setting every other standard and custom namespace.
Comment 18 Chad H. 2009-02-25 00:07:44 UTC
Created attachment 5866 [details]
Update again

* $wgCapitalLinks is now a boolean again, setting the default across all namespaces ('per-namespace' has been dropped from XSD and API/Export output)
* $wgCapitalLinkOverrides is the array that allows per-namespace values. This info is still given in the case attribute of namespace XML.
* XSD updated to 0.4. Tweaked URLs and added version attribute.
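
A minimal sketch of the "default plus overrides" lookup described in this list. The function names are hypothetical, and the real check may apply additional rules (for example, namespaces that are always capitalized); only the two "case" strings are taken from the diff quoted in comment 9.

```php
<?php
// $default plays the role of $wgCapitalLinks (site-wide boolean);
// $overrides plays the role of $wgCapitalLinkOverrides (namespace id => bool).
function sketchIsCapitalizedWithOverrides( bool $default, array $overrides, int $ns ): bool {
	return array_key_exists( $ns, $overrides ) ? (bool)$overrides[$ns] : $default;
}

// Value for the per-namespace "case" attribute in the export/API output.
function sketchCaseAttribute( bool $capitalized ): string {
	return $capitalized ? 'first-letter' : 'case-sensitive';
}

// A mostly case-sensitive wiki that forces initial capitals in one namespace.
$default = false;
$overrides = [ 6 => true ]; // 6 is the numeric id behind NS_FILE
```

This is the case raised in comment 17: a single forced-capital namespace can be declared without listing every other standard and custom namespace.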
Comment 19 Siebrand Mazeland 2009-06-04 10:53:18 UTC
No more feedback? Should ^demon go ahead and apply?
Comment 20 Tomasz Finc 2009-08-05 18:45:17 UTC
If this is ready then feel free to add it to the export-0.4.xsd newly created in r54472. I'd like to release it this week to make the snapshots fully documented by the new schema definition.
Comment 21 Chad H. 2009-08-05 18:46:39 UTC
Need Brion to give this a once-over one last time...never got any feedback after the February 25 patch.
Comment 22 Chad H. 2009-10-09 12:53:21 UTC
Done in r57558. Merged into 0.4 XSD rather than bumping to 0.5
Comment 23 lɛʁi לערי ריינהארט 2009-10-29 19:28:10 UTC
Thanks for fixing this!
Comment 24 Danny B. 2010-04-27 23:23:32 UTC
*** Bug 9254 has been marked as a duplicate of this bug. ***
