Last modified: 2006-08-18 16:03:34 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T3225, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 1225 - False unicode-encoding in Database creates un-reachable image-description-page
Product: Wikimedia
Classification: Unclassified
Component: General/Unknown
Hardware: All All
Importance: Normal major with 2 votes
Target Milestone: ---
Assigned To: Brion Vibber
Depends on: 215
Blocks: unicode 3985
Reported: 2004-12-29 03:41 UTC by Christian Thiele
Modified: 2006-08-18 16:03 UTC (History)

Description Christian Thiele 2004-12-29 03:41:38 UTC
In the German Wikipedia there is an image with the filename "Kohlenstoffnanoröhre-Animation.gif" ('cur' DB entry
335479). I don't know why, but the "ö" is not encoded the usual way as %C3%B6 but as %6F%CC%88, which is an
"o" followed by a combining diaeresis that modifies the preceding letter. Both byte sequences are valid encodings of "ö".

But MediaWiki only serves pages with "ö" in the %C3%B6 encoding. Even when I request the %6F%CC%88 encoding
directly, I get a header redirect to the %C3%B6 page (Location:
Kohlenstoffnanor%C3%B6hre-Animation.gif). It is now impossible to view or delete that page.
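
For illustration only, the two encodings described above can be compared with Python's standard library. This is a minimal sketch, not MediaWiki code; the variable names are made up for the example, and NFC is assumed to be the normalization form that produces the %C3%B6 spelling.

```python
import unicodedata
from urllib.parse import quote, unquote

# The two URL-encoded spellings from the report.
decomposed = unquote("Kohlenstoffnanor%6F%CC%88hre-Animation.gif")   # "o" + U+0308
precomposed = unquote("Kohlenstoffnanor%C3%B6hre-Animation.gif")     # U+00F6

# They render identically but are different code point sequences.
same_raw = decomposed == precomposed

# NFC composes "o" + COMBINING DIAERESIS into the single code point U+00F6.
nfc = unicodedata.normalize("NFC", decomposed)
same_after_nfc = nfc == precomposed

# Re-encoding the normalized title yields the %C3%B6 form.
reencoded = quote(nfc)

print(same_raw, same_after_nfc, reencoded)
```

A title stored in the decomposed form and a request normalized to the composed form therefore never compare equal as raw strings, which would be consistent with the redirect behaviour described in this report.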
Comment 1 bdk 2004-12-29 04:48:59 UTC
The image has already been replaced by a well-named version
(Kohlenstoffnanoroehre_Animation.gif), so the misspelled one has to be deleted. Thanks.
Comment 2 Tomer Chachamu 2005-04-23 13:44:20 UTC
Upping priority
Comment 3 Brion Vibber 2005-10-29 03:11:35 UTC
This should have been corrected some time ago. Confirmed?
Comment 4 Christian Thiele 2005-10-29 12:25:03 UTC
Please look at and click 
on the only image on this page. Don't know if the creation of such images is fixed but this one still 
exists ;)
Comment 5 lɛʁi לערי ריינהארט 2006-01-08 13:19:27 UTC

If you select "view page source" in your browser for
you will find

This is the image that should be deleted which does not show up at

re: comment 1 Today the image is available at
[[de:image:Kohlenstoffnanoroehre-Animation.gif]] because of

re old url:

a) Please note that [[Special:Allpages]] is not suitable for identifying all media
files, either because {{ns:Media}} is not supported at Allpages or because the page
that is the subject of this bug is not shown at Special:Allpages?namespace=6 due to
*another*, subsequent bug.

b) Please note that [[Special:Imagelist]] cannot be used either, because no
filter functionality is available any more.

c) The image can neither be found in the upload log with
nor with

The reason might relate to earlier versions of [[Special:Upload]] / [[Special:Log]].

Please note that %CC%88 stands for U+0308 COMBINING DIAERESIS:
Entity (decimal) &#776; (hex) &#x0308;
UTF-8 (hex) 0xCC 0x88 (cc88)

If this file "survived" the database's Unicode normalisation, please verify whether
other files using COMBINING DIAERESIS or other characters from the "Unicode
Block 'Combining Diacritical Marks'"
still exist in the database.

best regards reinhardt [[user:gangleri]]

P.S. Besides "blocks Bug 3969: unicode compatibility (tracking)",
I added "blocks Bug 3985: character conversion (tracking)".
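
The check requested in comment 5 could be sketched roughly as follows. This is an assumption-laden illustration in Python, not an existing MediaWiki maintenance script: the function names and the sample titles are hypothetical, and a real check would read titles from the database instead of a list.

```python
import unicodedata

def has_combining_diacritical_mark(title: str) -> bool:
    """True if the title contains a code point from the Unicode block
    'Combining Diacritical Marks' (U+0300 to U+036F)."""
    return any(0x0300 <= ord(ch) <= 0x036F for ch in title)

def find_unnormalized(titles):
    """Return the titles that would change under NFC normalization."""
    return [t for t in titles if unicodedata.normalize("NFC", t) != t]

# Hypothetical sample data standing in for image titles from the database.
sample_titles = [
    "Kohlenstoffnanoro\u0308hre-Animation.gif",  # decomposed: "o" + U+0308
    "Kohlenstoffnanor\u00f6hre-Animation.gif",   # precomposed: U+00F6
    "Kohlenstoffnanoroehre_Animation.gif",       # plain ASCII
]

suspects = [t for t in sample_titles if has_combining_diacritical_mark(t)]
unnormalized = find_unnormalized(sample_titles)

for title in unnormalized:
    print(title.encode("utf-8"))
```

The NFC comparison is the broader of the two checks, since it also catches unnormalized titles whose combining characters lie outside that one block.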
Comment 6 lɛʁi לערי ריינהארט 2006-01-08 13:57:00 UTC

This report reminds me of
Bug 3860: links generated with precombined characters show red despite the fact
that the normalised links exist
which is a duplicate of
Bug 1527: *first* perform Unicode normalisation and check for existence of pages
*after* the normalisation

The reason why the file from the url is not "recognized" at
might be the same.
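
The ordering asked for in Bug 1527 (normalise first, check for existence afterwards) can be illustrated with a small Python sketch. The names `normalize_title`, `stored_raw` and `stored_normalized` are hypothetical and do not reflect MediaWiki's actual lookup code; NFC is assumed as the normalisation form.

```python
import unicodedata

def normalize_title(title: str) -> str:
    """Assumed normalisation step: Unicode NFC."""
    return unicodedata.normalize("NFC", title)

# Hypothetical stored title that was saved without normalisation.
stored_raw = {"Kohlenstoffnanoro\u0308hre-Animation.gif"}

# The incoming request is normalised before the lookup.
request = normalize_title("Kohlenstoffnanoro\u0308hre-Animation.gif")

# Normalised request against unnormalised storage: no match.
unreachable = request not in stored_raw

# Once the stored titles are normalised too, the lookup succeeds.
stored_normalized = {normalize_title(t) for t in stored_raw}
reachable = request in stored_normalized

print(unreachable, reachable)
```

The existence check only works when both sides of the comparison pass through the same normalisation, which is why stored titles that skipped it stay invisible.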

Because the file from the url "survived" the database's Unicode normalisation, it
is possible that other files using precombined Unicode characters still exist in
the database. Many page titles at [[yi:]] used BiDi punctuation characters and
different spellings ("tsvey-vovn" versus "vov+vov" etc.). It is very
likely that the contributors also used / are still using precombined Unicode
characters for the (file) titles they uploaded / are uploading.

Other wikis / languages where Unicode normalisation is involved might be
affected as well. Identifying these is a general issue that relates to all wikis.

Bug 830: Commons rejects upload of filenames in Hindi
might be "historical". Helpful / useful links from the original reporter are
missing there.

best regards reinhardt [[user:gangleri]]
Comment 7 Brion Vibber 2006-07-13 17:47:09 UTC
Running a final check for remaining bad image names
Comment 8 Brion Vibber 2006-07-14 18:06:56 UTC
Normalized remaining filenames last night.
