Last modified: 2011-03-13 18:06:47 UTC
Although differentiated opening and closing quotes look better, most people can't type them easily, or at all. In keeping with the aim of fast editing, it would be best to allow editors to type standard quotes but display distinctive quotes. The same goes for ellipses…
Created attachment 284: New parser function to convert various strings to UTF-8 entities. This patch uses regular expressions to determine the correct quote or ellipsis to use. It leaves everything in preformatted sections alone. I've tested it in all the quote situations I can come up with and tweaked the regexp until it worked. There may be more cases; it would be best to have this pointed at a live backup for a while. NOTE: I moved the em dash code into this function, so if a fix for bug #1485 is checked into HEAD it will need to be undone. (I'd be happy to update this patch to delete the other if that happens.)
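For readers without the attachment, the approach described in the comment above can be sketched roughly as follows. This is not the patch itself, which targets the MediaWiki parser; it is a minimal Python illustration, and the names `PRE_BLOCK`, `convert_segment`, and `smarten` are hypothetical. It assumes preformatted text is delimited by literal `<pre>` tags and handles only double quotes and three-period ellipses; apostrophes and single quotes are deliberately left out.

```python
import re

# Hypothetical names for illustration; none of this is taken from the attached patch.
# The capturing group makes re.split() keep the <pre> blocks in its result.
PRE_BLOCK = re.compile(r"(<pre>.*?</pre>)", re.DOTALL | re.IGNORECASE)


def convert_segment(text):
    """Convert ASCII double quotes and ellipses in text outside <pre> blocks."""
    # Three ASCII periods become a single ellipsis character.
    text = text.replace("...", "\u2026")
    # A quote at the start of the text, or right after whitespace or an
    # opening bracket, is assumed to be an opening quote.
    text = re.sub(r'(?:^|(?<=[\s(\[{]))"', "\u201c", text)
    # Every remaining ASCII double quote is assumed to be a closing quote.
    return text.replace('"', "\u201d")


def smarten(text):
    """Apply convert_segment() everywhere except inside <pre>...</pre>."""
    parts = PRE_BLOCK.split(text)
    # With one capturing group, the matched <pre> blocks sit at odd indices.
    return "".join(
        part if index % 2 else convert_segment(part)
        for index, part in enumerate(parts)
    )
```

Calling `smarten('He said "hi"... <pre>"raw"...</pre>')` converts the quotes and ellipsis in the prose and leaves the `<pre>` block byte-for-byte unchanged. A heuristic this simple will misjudge some cases, such as a quote that directly follows a dash.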
Please see http://en.wikipedia.org/wiki/Quotation_mark (section "Quote signs in several languages"). Also, regarding dashes: Russian, for example, has no en dash; only the em dash is used.
(In reply to comment #2) Thanks for that link, Alexander. I would never have guessed that ”this format” was standard in Swedish. sv.wikipedia would have a legitimate gripe if their ASCII quotes were converted to English-style opening and closing quotes. I also checked Romanian, whose Wikipedia seems to use "these quotes" even though they resemble none of the standard or alternative quotes in the table. In that case, it's hard to call the quote conversion incorrect when the original was also incorrect. Ideally, languages whose quotation marks are very different from the ASCII ones would not use the ASCII marks at all. Russian (I glanced over the Russian-language page on the Russian language) seems to use the proper Unicode characters directly. In that case, there is no issue; the conversion routine will not touch them. Of course, there's still the problem with Swedish and other languages whose quoting schemes are close to, but not exactly like, English. For them it would be necessary to disable the conversion or, if someone wants to do the work, provide an alternate conversion. Would that be satisfactory? The principle I'm pitching is that we don't have to provide the convenience function for every language, but we do have to avoid making things worse for them.
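The "disable it, or provide an alternate conversion" idea from the comment above could look something like this. It is only a sketch under stated assumptions: the table `QUOTE_STYLES` and the function `convert_quotes` are hypothetical, the language codes are illustrative, and the opening-quote heuristic is deliberately simple. Any language without an entry is left untouched, which follows the principle of not making things worse.

```python
import re

# Illustrative table only; the entries are assumptions, not project configuration.
# Each value is an (opening, closing) pair of quotation marks.
QUOTE_STYLES = {
    "en": ("\u201c", "\u201d"),  # English: different opening and closing marks
    "sv": ("\u201d", "\u201d"),  # Swedish: the same mark on both sides
}


def convert_quotes(text, lang):
    """Convert ASCII double quotes for languages that have a configured style."""
    style = QUOTE_STYLES.get(lang)
    if style is None:
        # No entry means conversion is disabled: return the text unchanged.
        return text
    opening, closing = style
    # A quote at the start of the text, or after whitespace or an opening
    # bracket, is assumed to open a quotation; all others are assumed to close one.
    text = re.sub(r'(?:^|(?<=[\s(\[{]))"', opening, text)
    return text.replace('"', closing)
```

With this table, `convert_quotes('"a"', "sv")` uses the same right-hand mark on both sides, while a language code that is not listed gets its text back unmodified.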
I'm inclined to close this as WONTFIX. It's not possible to get it right automatically in all cases, and wrong "smart" quotes are much more annoying than straight quotes (which are always "right" even if they're not as pretty as you might sometimes like).
(In reply to comment #4) I think that would be premature. This is only a proposed enhancement for a future version of the software, so why not let it be? Besides, it's not for you or me to say what is typographically "correct"; it's up to everyone using Wikimedia. I suggest we see how things go with bug 1485. If it's a success, I'll ask the users if they want something similar (but only 95% accurate) for quote marks and ellipses.