Last modified: 2014-04-09 00:29:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T14703, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 12703 - Page titles could cause problems for some HTML editors that add a trailing slash to URLs
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: 1.12.x
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on: 28602
Blocks:
Reported: 2008-01-20 13:16 UTC by Igor berger
Modified: 2014-04-09 00:29 UTC (History)

Description Igor berger 2008-01-20 13:16:46 UTC
As a developer and WordPress user, I consider this a potential bug: some WYSIWYG editors change "dir" to "dir/", so the URL no longer resolves when it is requested.

A URL ending in a backslash "/" should redirect to the directory!
So "dir" and "dir/" should be equivalent, and the server response should be 200 OK.

So a request to "dir/" should return "dir" with 200 OK!

http://en.wikipedia.org/wiki/User:Durova/ '''bug'''
does not redirect to
http://en.wikipedia.org/wiki/User:Durova


Now using a redirect hack 

http://en.wikipedia.org/wiki/User:Igorberger/
redirects to
http://en.wikipedia.org/wiki/User:Igorberger

This can easily be fixed with a URL rewrite in .htaccess or the Apache config file.

Thank you,
Igor Berger
Comment 1 Voyagerfan5761 / dgw 2008-01-20 14:42:32 UTC
I know exactly what Igor's talking about here; MediaWiki's page title handling (couldn't find a component for that, BTW) has always treated titles like [[User:Voyagerfan5761]] and [[User:Voyagerfan5761/]] as different pages. Some WYSIWYG editors (for blogs, etc.) will add a trailing slash to URLs added to the text, which breaks any links to a MediaWiki site using pretty URLs like Wikipedia, Wikiquote, Meta, etc.

The proposal here is to have trailing slashes stripped from the page title, so things like this don't matter. I know some users, including myself, also will habitually add a slash at the end of a URL that looks like a directory (as most Wikipedia addresses do, .js and .css notwithstanding). If this is implemented, linking to http://en.wikipedia.org/wiki/User:Voyagerfan5761 and http://en.wikipedia.org/wiki/User:Voyagerfan5761/ would be functionally equivalent. The latter address would send a 301 redirect (ideally) to the former page, to tell search engines, browsers, etc. that the page's real address has no slash.

This should be fixable in Title.php, I think. I'm not that familiar with MediaWiki's core modules, but... It's the ''logical'' location.
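
A minimal sketch of the behaviour proposed above, written in Python rather than MediaWiki's PHP purely for illustration. The function name `canonical_redirect` and the hard-coded `/wiki/` prefix are assumptions made for the example and are not taken from Title.php:

```python
def canonical_redirect(title: str):
    """Return (status, location) for a requested page title.

    A title ending in "/" gets a 301 to the same title without the
    trailing slash; any other title is served directly with a 200.
    """
    stripped = title.rstrip("/")
    # Only redirect when something was stripped and a title remains.
    if stripped and stripped != title:
        return 301, "/wiki/" + stripped
    return 200, "/wiki/" + title


print(canonical_redirect("User:Voyagerfan5761/"))
print(canonical_redirect("User:Voyagerfan5761"))
```

The sketch strips unconditionally, so it does not yet address pages whose real title ends in a slash.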
Comment 2 Igor berger 2008-01-20 23:49:21 UTC
"backslash" should be slash or forward slash! My error in original bug reporting.
Comment 3 Carl Fürstenberg 2008-01-21 03:30:06 UTC
It can't be done in the Apache configuration, as it's only relevant for namespaces that have the subpage option enabled.
Comment 4 Igor berger 2008-01-21 06:35:32 UTC
This is the solution!
I have not tested it yet, but it should work.

# If requested URL does not resolve to an existing directory
RewriteCond %{REQUEST_FILENAME} !-d
# Externally redirect to remove trailing slash
RewriteRule ^(.+)/$ http://www.example.co.uk/$1 [R=301,L]
Comment 5 Voyagerfan5761 / dgw 2008-01-21 06:47:44 UTC
Igor, there are no directories inside /wiki/. Actually, there is no /wiki/ directory. All of that is an Apache rewrite changing /wiki/Title to /w/index.php?title=Title. It will never resolve to an existing directory, even if the URL does not end in a slash.

This is something MediaWiki should handle, I think. If no page with a slash exists, but the same title minus the ending slash is in the database, redirect to the title without a slash.
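
The fallback described above could look roughly like the following Python sketch. The `existing_pages` set is a hypothetical stand-in for a database lookup, and the 404 for a missing page is only a placeholder for whatever the wiki normally does with a nonexistent title:

```python
def resolve(title: str, existing_pages: set):
    """Return (status, title) for a request.

    An exact match always wins. Only when the requested title is
    missing do we try the same title without its trailing slash.
    """
    if title in existing_pages:
        return 200, title
    stripped = title.rstrip("/")
    if stripped != title and stripped in existing_pages:
        return 301, stripped
    return 404, title


pages = {"Foo", "Bar/"}
print(resolve("Foo/", pages))  # slash page missing, "Foo" exists
print(resolve("Bar/", pages))  # a real page whose title ends in a slash
```

Because the exact title is checked first, an existing page such as "Bar/" stays reachable and is never redirected.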
Comment 6 Igor berger 2008-01-21 08:38:58 UTC
Yes, it can be done with a database query, but checking the database every time a page is requested adds much more load.

So just implement the rewrite rule for all sites; that should fix the bug.
# Externally redirect to remove trailing slash
RewriteRule ^(.+)/$ http://www.example.co.uk/$1 [R=301,L]
Comment 7 Brion Vibber 2008-01-21 08:47:46 UTC
A rewrite rule would be inadequate, as it would require custom manual set up for every MediaWiki site the world over, and would not be feasible on all of them. Much easier on everyone for the software to process things correctly, if that's what we want it to do.

Issues to consider:

1) Existing pages with '/' suffix (can be killed if necessary, we make half-broken titles illegal all the time)

2) Special page parameters where you *want* the / as it's partial input, eg:
http://en.wikipedia.org/wiki/Special:Prefixindex/User:Brion_VIBBER/
Comment 8 Voyagerfan5761 / dgw 2008-01-21 08:51:00 UTC
Regarding #2, I think we can just ignore slashes in special page titles. We should trust the user to know what they're putting in there. A prefixindex of User:Foo/ shouldn't be turned into a prefixindex of User:Foo, just because it would be the software modifying user input for no apparent reason.

Brion, would there be an issue with making MW redirect to the slash-less page only if one with a slash doesn't exist?
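
A rough Python sketch of the special-page exemption discussed in this comment. It assumes the namespace is simply the text before the first colon and that it is literally named "Special", which ignores localized namespace names and aliases. The function name `strip_unless_special` is hypothetical:

```python
def strip_unless_special(title: str) -> str:
    """Strip trailing slashes, except in the Special: namespace,
    where a trailing "/" may be meaningful parameter input."""
    namespace, sep, _ = title.partition(":")
    if sep and namespace.lower() == "special":
        return title
    # Fall back to the original title if stripping would leave nothing.
    return title.rstrip("/") or title


print(strip_unless_special("Special:Prefixindex/User:Brion_VIBBER/"))
print(strip_unless_special("User:Foo/"))
```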
Comment 9 Igor berger 2008-01-21 08:56:17 UTC
I would do it across the board because it is a canonical domain issue and causes a loss of page rank, as here:
http://en.wikipedia.org/wiki/Main_Page/
http://en.wikipedia.org/wiki/Main_Page

Both point to the same page, but it uses a 302 redirect rather than a 301 redirect, which is the wrong way to preserve Google page rank.
Comment 10 Igor berger 2008-01-21 08:58:30 UTC
Brion, you can do both. Write a script for the rest of the world and put a rewrite rule in place for Wikipedia to save database resources.
Comment 11 Igor berger 2008-01-21 12:51:12 UTC
This is a good lesson on what canonical duplication is and how it hurts Wikipedia:
http://www.seomoz.org/blog/rewriting-the-beginners-guide-part-iv-continued-canonical-and-duplicate-versions-of-content
Comment 12 Daniel Friesen 2008-01-22 09:02:52 UTC
The use of /wiki/ is only a common tradition. There is nothing statically defined in MediaWiki that says that links will be in the traditional format.

$wgArticlePath, which is used to create this tradition, is a string with a $1 substitution and is thus commonly set to "/wiki/$1", but that doesn't mean that is the tactic that is going to be followed. For all we know, the user could set it to "/wiki/$1/otherstuff" and then set up their rewrites according to that, and it would be perfectly valid.
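
To illustrate why the position of $1 matters, here is a small Python sketch that recovers the title from a request path for a given article path pattern. The helper `title_from_path` is hypothetical and is not how MediaWiki itself parses requests:

```python
import re


def title_from_path(article_path: str, request_path: str):
    """Extract the title from request_path using a pattern such as
    "/wiki/$1". Returns None when the path does not fit the pattern."""
    prefix, _, suffix = article_path.partition("$1")
    pattern = re.escape(prefix) + "(.+)" + re.escape(suffix) + r"\Z"
    match = re.match(pattern, request_path)
    return match.group(1) if match else None


print(title_from_path("/wiki/$1", "/wiki/Foo/"))
print(title_from_path("/wiki/$1/otherstuff", "/wiki/Foo/otherstuff"))
```

With the second pattern the title no longer sits at the end of the URL, so a rule that strips a trailing slash from the URL would not be operating on the title at all.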

I've also seen wikis that use a trailing slash in some titles as part of the title. No links off the top of my head, but it's possible that a wiki may already be using a page Foo, and putting a list of subpages at Foo/.

Then there is the real default URL form to consider, which in truth is /index.php?title=Pagetitle&action=view, in which a / is perfectly valid and doesn't run into the conflict. So this kind of thing primarily happens only when the person setting up the wiki has already done the work of setting up rewrites, because wikis without this kind of configuration will be using the long format exclusively.

So I'd argue that while the option to do this would be a good addition (and perhaps the default when short URLs are enabled), it should not be forced on older wikis just because they upgraded.

There's also another note I should make. Trailing /'s are also used by a number of spambots. On Wikia we've in general blacklisted talk pages and Forum pages with a trailing / because it blocks a number of spambots. If the redirection were added automatically, the piles of spambot attacks aimed at [[Forum:Index/]] would suddenly hit [[Forum:Index]] instead. So there is a little harm to consider in enabling it as well. It should definitely be optional, and titles with a trailing / should not be deemed illegal.
