Last modified: 2014-03-03 20:47:40 UTC
By using {{#time:}} (as in this example), or possibly other parser functions, one can circumvent the spam blacklist. I'm not sure how to get this sort of thing blocked, but it should be worked out. Also in the URL field: http://nl.wikipedia.org/w/index.php?title=Gebruiker:Emil76&diff=prev&oldid=13856552
I guess we're not using tracking bugs for this any longer.
The basic problem here is that we have to determine not whether a URL exists in the page now, but whether under any circumstances of possible input data it _could_. That quickly becomes difficult or impossible as you move from this case to slightly less trivial ones, including already-known-possible cases involving transclusion of multiple pieces edited at different times. The only real solution to that I can think of is to apply the blacklist at rendering time for *views* as well as at *edit* time -- matching links could be de-linked or removed and the page marked on a queue for review. This probably wouldn't perform terribly well, but could perhaps be optimized. Don't know if it's worth the effort.
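For illustration, a render-time filter along the lines suggested above could look roughly like this. This is a minimal sketch in Python, not MediaWiki's actual (PHP) implementation; the `BLACKLIST` patterns, `delink_blacklisted` function, and the placeholder text are all hypothetical, and matching against the host part of each URL only loosely mirrors the real blacklist format:

```python
import re

# Hypothetical blacklist: regexes matched against the host part of each URL,
# loosely in the spirit of the on-wiki spam blacklist format.
BLACKLIST = [re.compile(p) for p in [r"(^|\.)spam\.org$", r"(^|\.)evil\.example$"]]

URL_RE = re.compile(r"https?://([^/\s]+)\S*")

def delink_blacklisted(rendered_text: str):
    """Replace any blacklisted URL in rendered output with a placeholder
    and collect the hits so the page could be queued for review.
    Because matching happens on the *rendered* output, URLs assembled by
    templates or parser functions are caught too."""
    hits = []

    def replace(match):
        host = match.group(1).lower()
        if any(rx.search(host) for rx in BLACKLIST):
            hits.append(match.group(0))
            return "[blacklisted link removed]"
        return match.group(0)

    cleaned = URL_RE.sub(replace, rendered_text)
    return cleaned, hits

text = "See http://www.spam.org/buy and http://en.wikipedia.org/wiki/Spam"
cleaned, hits = delink_blacklisted(text)
```

The performance concern raised above is real: this scan would run on every parse rather than only on save, so it would rely on the parser cache to keep reparses rare.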
(In reply to comment #2)
> The basic problem here is that we have to determine not whether a URL exists
> in the page now, but whether under any circumstances of possible input data
> it _could_. That quickly becomes difficult or impossible if you move from
> this case to slightly less trivial ones, including already-known-possible
> cases involving transclusion of multiple pieces edited at different times.

Sure, but can't this slightly-less-exotic case be covered will less trouble?

> The only real solution to that I can think of is to apply the blacklist at
> rendering time for *views* as well as at *edit* -- matching links could be
> de-linked or removed and the page marked on a queue for review.

I think filtering on view might be worth doing, perhaps with a notice "This page has spam that we've automatically hidden from your sensitive eyes, please help clean it up. You're looking for the domain spam.org -> [edit]" - especially useful now that saving isn't blocked when the domain already existed in the page (bug 1505). (However, a queue seems like overkill.)
(In reply to comment #3)
> Sure, but can't this slightly-less-exotic case be covered will less trouble?

*WITH less trouble
*** Bug 16354 has been marked as a duplicate of this bug. ***
(In reply to comment #5)
> *** Bug 16354 has been marked as a duplicate of this bug. ***

Thanks.
A minimal fix, to stop this from being attractive to vandals, would be to simply silently ignore any blacklisted URLs unexpectedly encountered during a parse. I wouldn't (naively, perhaps) expect this to cause too much load; surely our page caching should be good enough that pages don't get needlessly reparsed very often?
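The minimal fix described above amounts to making the parser's link renderer fall back to plain text when the target is blacklisted. A rough sketch, again in Python rather than MediaWiki's PHP, where `make_external_link` is a hypothetical stand-in for the parser's external-link rendering step:

```python
import re

# Hypothetical blacklist entries, matched against the URL's host.
BLACKLIST = [re.compile(r"(^|\.)spam\.org$")]

def make_external_link(url: str, label: str) -> str:
    """Stand-in for the parser's external-link renderer: emit an <a> tag
    normally, but silently fall back to the bare label when the target is
    blacklisted, so the spam link never appears in rendered output."""
    host = re.match(r"https?://([^/\s]+)", url)
    if host and any(rx.search(host.group(1).lower()) for rx in BLACKLIST):
        return label  # silently ignore the URL: no link, no error
    return '<a href="%s">%s</a>' % (url, label)
```

The appeal of this variant is that it needs no review queue and no user-visible notice: the vandal's payload simply never renders as a link, which removes the incentive without adding much machinery.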
*** Bug 16610 has been marked as a duplicate of this bug. ***