Last modified: 2010-05-15 14:36:13 UTC
Migrated from sf.net bugtracker
Originally opened 2002-07-18 08:48

When a URL contains another full and unescaped URL within its query string, it is correctly parsed as a single big URL when placed directly into the text. However, if put in brackets as [URL] or [URL description], the URL-in-a-bracket parsing breaks. The brackets, URL, and description appear as plain text, and the sub-URL gets reparsed as a standalone hyperlink.

Example:

http://www.unausa.org/newindex.asp?place=http://www.unausa.org/programs/mun.asp

appears correctly as a big link to the correct, full URL.

[http://www.unausa.org/newindex.asp?place=http://www.unausa.org/programs/mun.asp]

should display as a linked "[1]", but instead appears as plain text with the brackets and full URL intact, except that the portion "http://www.unausa.org/programs/mun.asp" is a linked URL.

Workaround: Replacing the : in the sub-URL with %3A fixes the parsing problem. (Of course, in this particular case only the shorter URL is actually needed, as it will be dynamically redirected to the longer URL.)
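For anyone applying the workaround to many links, it can be sketched in Python roughly as follows. This is only an illustration of the %3A substitution described above: the helper name `escape_nested_scheme` and the list of schemes it looks for (http, https, ftp) are assumptions, not anything from the wiki software.

```python
import re

# Hypothetical helper (not part of the wiki software): percent-encode the
# colon of any "scheme://" that appears after the outer URL's own scheme,
# so a link parser no longer sees a second URL inside the first one.
# The scheme list (http, https, ftp) is an assumption for this sketch.
NESTED_SCHEME = re.compile(r"\b(https?|ftp):(?=//)")


def escape_nested_scheme(url: str) -> str:
    # Split off the outer scheme so it is left untouched.
    head, sep, rest = url.partition("://")
    if not sep:
        # No scheme at all: nothing to protect, nothing to escape.
        return url
    # Replace "http:" / "ftp:" with "http%3A" / "ftp%3A" in the remainder.
    return head + sep + NESTED_SCHEME.sub(r"\1%3A", rest)


if __name__ == "__main__":
    print(escape_nested_scheme(
        "http://www.unausa.org/newindex.asp"
        "?place=http://www.unausa.org/programs/mun.asp"
    ))
```

Whether the target server decodes %3A back to a colon in the query string is up to that server, so the escaped link should be checked by hand.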
Date: 2002-07-19 21:06 Sender: lcrocker Logged In: YES user_id=3076

I'm lowering the priority on this since there's an easy workaround, and a fix would require messing with some pretty stable and pretty important code. It would be nice to have, though, so I'll leave it open.
Date: 2002-07-30 20:44 Sender: vibber Logged In: YES user_id=446709

I'm raising the priority because I've come across a case the workaround doesn't work for. (See http://www.wikipedia.com/wiki/Wikipedia%3AVillage_pump )

If the main URL is http and the sub-URL is *ftp*, the %3A fix doesn't work: all ftp URLs are parsed /after/ all http URLs, and somehow the %3A gets transformed back into a : in the 'title' field of the link... this triggers the ftp URL-checker, so:

[http://promo.net/cgi-promo/pg/t9.cgi?entry=120&full=yes&ftpsite=ftp%3A//ibiblio.org/pub/docs/books/gutenberg/ Gutenberg text]

is parsed into the horrific:

<a href='http://promo.net/cgi-promo/pg/t9.cgi?entry=120&full=yes&ftpsite=ftp%3A//ibiblio.org/pub/docs/books/gutenberg/' class='external' title="http://promo.net/cgi-promo/pg/t9.cgi?entry=120&amp;full=yes&amp;ftpsite=<a href="ftp://ibiblio.org/pub/docs/books/gutenberg/ class='external' title="ftp://ibiblio.org/pub/docs/books/gutenberg/">ftp://ibiblio.org/pub/docs/books/gutenberg/</a>">Gutenberg text</a>

Simple partial fix would be to *not* unescape URL-encoded bytes when producing the 'title' attribute for the link, so it remains %3A and doesn't trigger the link converter. Alternatively, find a way to not check for URLs inside HTML tags.
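The second suggestion (not checking for URLs inside HTML tags) could look something like the Python sketch below. It is not the wiki's actual code: the function name, the regular expressions, and the output markup are all assumptions made for illustration. The idea is that a later ftp pass splits the already-generated HTML into tags and plain text, and only rewrites the plain text.

```python
import re

# Assumed patterns for this sketch only. TAG matches one HTML tag; it does
# not cope with a literal ">" inside an attribute value.
TAG = re.compile(r"(<[^>]*>)")
FTP_URL = re.compile(r"ftp://[^\s<>\"']+")


def link_ftp_outside_tags(html: str) -> str:
    """Turn bare ftp:// URLs into links, leaving existing tags alone."""
    out = []
    # re.split with a capturing group returns text and tags alternately:
    # even indexes are text between tags, odd indexes are the tags.
    for i, part in enumerate(TAG.split(html)):
        if i % 2 == 1:
            out.append(part)  # a tag: copy through, attributes untouched
        else:
            out.append(FTP_URL.sub(
                lambda m: '<a href="{0}" class="external">{0}</a>'.format(m.group(0)),
                part,
            ))
    return "".join(out)


if __name__ == "__main__":
    already_linked = (
        '<a href="http://promo.net/t9.cgi?ftpsite=ftp://ibiblio.org/pub/">'
        "Gutenberg text</a> see ftp://ibiblio.org/pub/ too"
    )
    print(link_ftp_outside_tags(already_linked))
```

This only protects attribute values such as href and title. Link text between <a> and </a> is still scanned, so a complete fix would also have to skip the contents of existing links.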
Date: 2003-01-23 00:02 Sender: nichtich Logged In: YES user_id=534251

URLs inside all kinds of links should be treated as text. For instance:

[[Sandbox|http://de.wikipedia.org]]

produces a link to http://de.wikipedia.org and not to [[Sandbox]]!
Date: 2003-03-17 07:44 Sender: nobody Logged In: NO

The wiki has other problems dealing with URLs that contain certain characters such as '*', or that contain part of another URL. Examples:

http://example.com/*/foo/bar
[http://example.com/redir/http://www.prwatch.org some link]

As the original bug report indicated, URL escaping can be used as a workaround:

http://example.com/%2A/foo/bar
[http://example.com/redir/%68ttp://www.prwatch.org some link]

--Sheldon Rampton (sheldon.rampton@verizon.net)
Date: 2004-08-07 20:30 Sender: timstarling Logged In: YES user_id=758207

All of these problems are now fixed except nichtich's. I also added URL-encoding, since it seemed more user-friendly to allow users to paste URLs in directly.

Brion's example:

[http://promo.net/cgi-promo/pg/t9.cgi? entry=120&full=yes& ftpsite=ftp%3A//ibiblio.org/pub/docs/books/gutenberg/ Gutenberg text]

This needs to become:

[http://promo.net/cgi-promo/pg/t9.cgi? entry=120&full=yes&ftpsite=ftp% 3A//ibiblio.org/pub/docs/books/gutenberg/ Gutenberg text]

This is not backwards-compatible, so it may require automated conversion.
Updating the summary accordingly.
Fixed in HEAD; I hope the change didn't introduce any new problems.
Created attachment 3470
Created attachment 3471
Created attachment 3472
Created attachment 3473