Last modified: 2010-04-16 14:57:17 UTC
Recently, vandals have been creating many articles that appear to have exactly
the same title. They do this by inserting invisible UTF-8 characters into the
title, so we get many different articles whose titles all look like, say,
"Karin Stoiber". The software should refuse to create an article whose title
includes an invisible character; the ordinary space must of course be an exception.
The vandalism occurred on de-WP. My nickname is tsor; I am an administrator.
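The check requested above could be sketched roughly as follows. This is only an illustration in Python, not MediaWiki code: the function names and the choice of Unicode general categories are assumptions, and some scripts use zero-width joiners legitimately, so a real filter would need exceptions.

```python
import unicodedata

# General categories treated as "invisible" here (an assumption for this sketch):
# Cf = format characters (zero-width space, joiners, directional marks),
# Cc = control characters, Zs/Zl/Zp = space and line/paragraph separators.
INVISIBLE_CATEGORIES = {"Cf", "Cc", "Zs", "Zl", "Zp"}


def invisible_chars(title):
    """Return (index, character name) for every invisible character except U+0020."""
    found = []
    for i, ch in enumerate(title):
        if ch == " ":
            continue  # the ordinary space is the one allowed exception
        if unicodedata.category(ch) in INVISIBLE_CATEGORIES:
            # Control characters have no name, so fall back to the code point.
            found.append((i, unicodedata.name(ch, f"U+{ord(ch):04X}")))
    return found


def is_title_allowed(title):
    """Hypothetical policy: reject any title containing an invisible character."""
    return not invisible_chars(title)
```

With this sketch, "Karin Stoiber" is accepted, while the same title with a zero-width space (U+200B) after "Karin" is rejected.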
*** This bug has been marked as a duplicate of 1414 ***
When you said "unvisible UTF8 characters" I thought you were talking about
whitespace UTF-8 characters, but (as you explained in bug 1414) you are talking
about characters that look like Latin characters (e.g. 'greek kappa' like
'K', or 'cyrillic small dze' like 's', etc. -> [[w:en:Homoglyph]]).
Well, using non-Latin UTF-8 characters in titles is not a bug; it's a feature.
Some wikis, like fr:, use many non-Latin characters in titles (usually these
redirect to a romanized, normalized title). Moreover, the homoglyph problem
already existed with l (L) and I (i); look at [[w:de:Ill (Elsass)]]. Some
vandals can create a page "Johannes Paul ll" (Johannes Paul II) most users won't
As this is somewhat related to the punycode/IDN problem in Firefox 1.0.1, look at mozilla
We could try the suggested approaches:
- "Measurements of lexical proximity" against older article titles (helped by a
list of UTF-8 homograph pairs)
- "Domain letter colouring": highlighting, or tooltips above characters showing which
Unicode block they belong to. Or we could highlight/warn only on unusual UTF-8
characters, but this would require defining a list of frequently used characters.
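The first suggestion, comparing a new title against older ones with the help of a homograph pair list, might look something like this in Python. The table below is a tiny hand-picked assumption for illustration only (the Unicode consortium publishes a much larger confusables list), and the function names are hypothetical.

```python
# Tiny illustrative homoglyph table (an assumption, far from complete).
# Each character maps to the look-alike used as its canonical form.
HOMOGLYPHS = {
    "\u039a": "K",  # GREEK CAPITAL LETTER KAPPA
    "\u0405": "S",  # CYRILLIC CAPITAL LETTER DZE
    "\u0455": "s",  # CYRILLIC SMALL LETTER DZE
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "l": "I",       # Latin small l looks like capital I in many fonts
}


def skeleton(title):
    """Replace every known look-alike character with its canonical form."""
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in title)


def confusable_with(new_title, existing_titles):
    """Return existing titles that differ from new_title but share its skeleton."""
    target = skeleton(new_title)
    return [t for t in existing_titles
            if t != new_title and skeleton(t) == target]


existing = ["Karin Stoiber", "Johannes Paul II", "Ill (Elsass)"]
# Greek kappa and Cyrillic dze in place of Latin K and S:
print(confusable_with("\u039aarin \u0405toiber", existing))
# Two small Latin l in place of capital I:
print(confusable_with("Johannes Paul ll", existing))
```

A match would only be grounds for a warning or a review, not an automatic rejection, since legitimate titles such as "Ill (Elsass)" also contain these characters.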
I changed the summary of the bug to "utf8 Homoglyph in titles"
Moved to the general/Unknown component and changed the severity from major to
trivial; there is an easy workaround available.
(In reply to comment #2)
> Well, using non-Latin UTF-8 characters in titles is not a bug; it's a feature.
Yes, and on those grounds I would originally have suggested WONTFIX.
(In reply to comment #3)
> Moved to the general/Unknown component and changed the severity from major to
> trivial; there is an easy workaround available.
Yes, you can use AbuseFilter to prevent these sorts of things if vandalism is indeed an issue for your wiki (and I believe en.wikipedia already does some things to this effect). For that reason, I'm going to resolve this FIXED.