Last modified: 2013-12-13 16:18:47 UTC
I added the property "Commons category" (P373) to a lot of items; see for example https://www.wikidata.org/wiki/Q1008 . It contains a string with the name of the category on Commons. On the Dutch Wikipedia I added some parser functions to find problems with the data; see https://nl.wikipedia.org/w/index.php?title=Sjabloon%3ACommonscat&diff=37414305&oldid=36545480 . I noticed a large number of pages with "funny" characters in https://nl.wikipedia.org/wiki/Categorie:Wikipedia:Commonscat_met_lokaal_andere_link_dan_op_Wikidata . This category is for pages that have a local value set which differs from the one set on Wikidata. Some pages that ended up in there shouldn't be there. Take for example https://nl.wikipedia.org/wiki/Ivoorkust : it links to exactly the same category on the Dutch Wikipedia as its item on Wikidata does. My assumption is that the encoding got mangled somewhere.
It seems that #property encodes some characters the same way PAGENAME does. Is that necessary? It makes #ifeq comparisons confusing, whether they are against PAGENAME or #property. In the example above, the template argument in "Commonscat|Côte d'Ivoire" does not match the output of #property:P373, which displays as "Côte d'Ivoire" (presumably with the apostrophe encoded as an HTML entity such as &#39;, as PAGENAME does). If the encoding is necessary, we will need a function or a Lua-based template for encoding/decoding page names and Wikidata property values.
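To illustrate the suspected mismatch outside of wikitext, here is a minimal Python sketch. It assumes the #property output carries the apostrophe as the entity &#39; (an assumption, not confirmed in this report), and the helper name same_title is hypothetical. It decodes HTML entities on both sides before comparing, which is roughly what a decoding function or Lua module would have to do before #ifeq.

```python
import html


def same_title(local_value: str, wikidata_value: str) -> bool:
    """Compare two category names after undoing HTML-entity encoding.

    Hypothetical helper: it only illustrates the comparison problem and
    is not part of MediaWiki or Wikibase.
    """
    def normalise(value: str) -> str:
        # Decode entities such as &#39; and treat underscores as spaces,
        # since page titles use the two interchangeably.
        return html.unescape(value).replace("_", " ").strip()

    return normalise(local_value) == normalise(wikidata_value)


local = "Côte d'Ivoire"          # value passed to the local template
encoded = "Côte d&#39;Ivoire"    # assumed form of the encoded output

print(local == encoded)           # naive string comparison
print(same_title(local, encoded)) # comparison after decoding
```

The naive comparison fails even though both strings render identically, which would explain why pages such as Ivoorkust end up in the tracking category.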
Related links:
* https://www.mediawiki.org/wiki/Manual:PAGENAMEE_encoding#PAGENAME
* https://en.wikipedia.org/wiki/Wikipedia:Lua_requests#Pagename_and_three_special_characters
This seems to have been fixed since.