Last modified: 2011-01-02 20:19:31 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T27578, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 25578 - We should support XML-valid delimiters for query string parameters
We should support XML-valid delimiters for query string parameters
Status: RESOLVED INVALID
Product: Wikimedia
Classification: Unclassified
General/Unknown (Other open bugs)
unspecified
All All
: Normal enhancement (vote)
: ---
Assigned To: Nobody - You can work on this!
:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-10-19 06:00 UTC by Ryan Kaldari
Modified: 2011-01-02 20:19 UTC (History)
3 users (show)

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments

Description Ryan Kaldari 2010-10-19 06:00:18 UTC
I some situations, especially related to forms constructed in wikitext, or in strict XHTML markup, it is necessary to use escaped ampersands or semicolons to delimit query string parameters. (See http://www.w3.org/TR/REC-html40/appendix/notes.html#h-B.2.2 for more info on this.) Currently the Wikimedia servers only accept non-escaped ampersands as delimiters for query string parameters. 

Two possible solutions to this are:

1. In our php.ini files, set arg_separator.input = ";&". This will allow people to use either ampersands or semicolons as delimiters for query string parameters (as recommended by the w3c).

2. Replace line 50 of WebRequest.php with:

$requestPairs = $_POST + $_GET;
// Accomodate escaped ampersands as query string delimiters
foreach ( $requestPairs as $key => $val) {
	if ( strncmp( $key, 'amp;', 4 ) === 0 ) {
		$goodKey = substr( $key, 4 );
	} else {
		$goodKey = $key;
	}
	$this->data[$goodKey] = $val;
}

This will allow use of query string parameters separated by escaped ampersands.

The impetus for this bug is a problem in the DonationInterface extension related to forms that live in wikitext (thus forcing ampersands to be escaped). There are several workarounds available, but I thought I would propose a deeper solution to see what people thought.
Comment 1 Arthur Richards 2010-10-19 07:59:00 UTC
As a workaround, using the hex values seem to work (at least in Javascript inside of wikitext).  eg:
var url = 'http://foo.com/index.php?bar=baz&banana=sandwich';

can be written as:
var url = 'http://foo.com/index.php?bar=baz\x26banana=sandwich';

But still, the user-friendly solution would be to find a way to allow users to explicitly use '&' in query strings written in wikitext.
Comment 2 Niklas Laxström 2010-10-19 08:48:52 UTC
I don't understand what the problem is here. Can you give examples?
Comment 3 Arthur Richards 2010-10-19 17:29:58 UTC
(In reply to comment #2)
> I don't understand what the problem is here. Can you give examples?

For instance, on one of the WMF's landing pages for donations, we send users who wish to make a donation to a credit card form using a URL that has a query string with the ampersand character in it, something like:
https://payments.wikimedia.org/index.php/Special:PayflowProGateway?uselang=en&test=1

Because the landing page is written in a Wiki page, the '&' gets escaped to its htmlentity when the page is rendered, which makes the URL look like:
https://payments.wikimedia.org/index.php/Special:PayflowProGateway?uselang=en&test=1

This throws off the processing/interpreting of items in the query string because suddenly the parameter 'test' gets interpreted as 'amp;test'.
Comment 4 Brion Vibber 2010-10-19 17:34:19 UTC
Sounds like you just gotta fix the link here to make sure you're doing it correctly. Find and fix the wiki page generating the link?


Changing query string processing to use ; as a separator could open up whole new exciting cans of worms such as breaking things that try to use, well, semicolons. :)

I recommend either resolving this bug INVALID or renaming it to "A particular wiki page has a double-escaped link to PayflowProGateway".
Comment 5 Ryan Kaldari 2010-10-19 18:26:25 UTC
The URL isn't actually a link in this case. It's a javascript string that gets conditionally used as the form action. Obviously we could move this out of wikitext, but since these URLs get changed frequently, it's convenient to be able to edit it without a code deploy.

I agree the semicolon solution is potentially dangerous. Any thoughts on the 2nd proposed solution? Handling escaped ampersands in query strings is pretty common these days. This site (Wikimedia Bugzilla), for example, handles them fine, as do most other large websites I've worked for.

Since we have workarounds, I'm fine with WONTFIX or INVALID, but I'm wondering if there's any downside to the adding the proposed fix.
Comment 6 Roan Kattouw 2010-10-20 13:11:41 UTC
(In reply to comment #5)
> The URL isn't actually a link in this case. It's a javascript string that gets
> conditionally used as the form action. Obviously we could move this out of
> wikitext, but since these URLs get changed frequently, it's convenient to be
> able to edit it without a code deploy.
> 
I'm still not entirely clear on why the URL needs to contain & , could you clarify that?
Comment 7 Arthur Richards 2010-10-20 16:59:54 UTC
The URL should NOT contain & - rather it needs to have just a '&' and NOT have it escaped (which turns it into '&').
Comment 8 Ryan Kaldari 2010-10-20 17:26:00 UTC
We have solved it by using '\x26' in the wikitext, which gets output as & rather than & in the source. Anyway, the point of this bug was not fixing the URL, but "Do we want to support escaped ampersands in query strings?" A lot of web applications do, a lot don't. If we don't, I'll close the bug. If we do, I'll check in the code above.
Comment 9 Brion Vibber 2011-01-02 20:19:31 UTC
I'm gonna take the initiative and resolve INVALID (I really want a 'WONTIMPLEMENT' to supplement 'WONTFIX', which implies it's a known problem we won't fix, rather than a valid, but not desired in this case, option).

Messing with the options to accept semicolons as separators is possible, but has a bunch of problems:
* would likely break things that submit freeform text in forms, since text can contain semicolons, so mixed-mode parsing gets confusing
* the settings in php.ini aren't always editable
* even when they are adjustable, PHP code can't change it as it's too late
* it would be possible to reimplement the GET parameter extraction from scratch in WebRequest, but that doesn't seem worth the effort
* more code paths and more URL variants just sounds icky

The sort of case where this originally came up -- hacky wikitext/HTML/JS mixture pages that use several layers of processing meant for other layers of processing to try to whip stuff together -- are good cases for more specialized systems of template generation that would reduce the need for manual intervention on the markup level.

Note You need to log in before you can comment on or make changes to this bug.


Navigation
Links