Last modified: 2014-07-28 19:02:23 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T34858, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 32858 - Do not parse .js/.css pages and save their parsed content in database tables
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Database
Version: 1.18.x
Hardware: All All
Importance: Normal normal with 3 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Keywords: need-parsertest, parser
Duplicates: 17525 32450
Depends on: 10410
Blocks: 16660
Reported: 2011-12-07 11:37 UTC by Danny B.
Modified: 2014-07-28 19:02 UTC (History)
Description Danny B. 2011-12-07 11:37:12 UTC
Do not parse .js/.css pages and save their parsed content in database tables.

It creates entries in following tables:
* categorylinks
* externallinks
* imagelinks
* iwlinks
* pagelinks
* templatelinks
(maybe some others as well)

There is no reason to parse .js/.css pages and therefore create false records in database which then influence the results of various queries such as Wanted* etc.

Database cleanup will be needed afterwards, so adding this as a blocker to bug 16660 as well.
Comment 1 Phillip Patriakeas 2011-12-07 16:46:29 UTC
Just to point out, many script authors use the fact that links on script pages get parsed to track usage of their script - e.g. in the instructions on using the script, many authors show something like the following:

>// FooBar script by [[User:Example]] ([[User:Example/foobar.js]])
>importScript( 'User:Example/foobar.js' );

Personally, I agree that parsing .css/.js pages in this way is wasted effort, but it would be nice having some built-in way to track script usage in place of abusing the current behavior.
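
To illustrate the tracking behaviour described in this comment, here is a minimal sketch of link extraction over a whole script page. It is not MediaWiki's parser: `extractWikiLinks` and its regular expression are hypothetical simplifications that only recognise `[[Target]]` and `[[Target|label]]`.

```javascript
// Simplified stand-in for link registration: collects the target of every
// [[Target]] or [[Target|label]] found anywhere in the page text.
// This is NOT MediaWiki's parser, only an illustration (assumption).
function extractWikiLinks(text) {
  const pattern = /\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]/g;
  const targets = [];
  let match;
  while ((match = pattern.exec(text)) !== null) {
    targets.push(match[1].trim());
  }
  return targets;
}

// The installation snippet quoted in the comment above.
const userScriptPage = [
  "// FooBar script by [[User:Example]] ([[User:Example/foobar.js]])",
  "importScript( 'User:Example/foobar.js' );",
].join("\n");

// Both links in the comment line are picked up, which is what makes
// "What links here" usable for tracking.
console.log(extractWikiLinks(userScriptPage));

// The same scan also registers links that are really just string data,
// which is the source of the false table entries described in the report.
console.log(extractWikiLinks("var s = '[[Category:Foo]]';"));
```

Because the scan does not distinguish comments from code, the behaviour script authors rely on for tracking and the false entries come from the same mechanism.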
Comment 2 LordAndrew 2011-12-08 04:35:51 UTC
Better tracking would be good, yes. But another use that we should consider is that people sometimes slap deletion templates on their .css and .js pages, and it works. Without that, people would be forced to make such deletion requests on some other page.
Comment 3 Danny B. 2011-12-08 07:53:59 UTC
(In reply to comment #2)
> Better tracking would be good, yes. But another use that we should consider is
> that people sometimes slap deletion templates on their .css and .js pages, and
> it works. Without that, people would be forced to make such deletion requests
> on some other page.

There is always the talk page or the administrators' noticeboard, admins have their talk pages, etc. There are plenty of places to make such a request.

----

I ran a scan on small cs wikis, and there are already hundreds of false entries in the database. I wonder how many there will be on cswiki, not to mention enwiki.

Things like tracking (which does not work crosswiki anyway) or deletion requests could/should be done some other way and should not have a significant influence on the content of the mentioned database tables.
Comment 4 Phillip Patriakeas 2011-12-08 17:28:17 UTC
The only way such alternate tracking methods could work is if they are automatic; the only reason the current system works anywhere near as well as it does is because it requires zero additional effort beyond simply copy-pasting the needed JS. Add an additional step to the script installation for tracking purposes, and in the best case, most people will ignore it (and in the worst case, people won't use your script because they can't be bothered or they find another script with a simpler installation).
Comment 5 Platonides 2011-12-08 17:32:49 UTC
*** Bug 17525 has been marked as a duplicate of this bug. ***
Comment 6 Platonides 2011-12-08 17:33:50 UTC
*** Bug 32450 has been marked as a duplicate of this bug. ***
Comment 7 Platonides 2011-12-08 17:40:23 UTC
Duping here several related bugs, since I think this is the most clear case. Either everything should be parsed, or nothing.
This means r103476 should be reverted (if we decide we go for the nothing case, it would be done somewhere else).

subst: could continue working in both cases (mentioned in bug 32450).

Note that javascript authors can make the links in such way that they don't get registered (eg. '[' + '[Category').

As a third alternative, we could do a different kind of parsing for js pages, so that wikitext would only be parsed inside js comments (bug 10410).
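
A rough sketch of the third alternative, registering links only where they occur inside JS comments. This is an illustration under stated assumptions, not the design from bug 10410: `extractComments` and `linksInComments` are hypothetical names, the scanner ignores regex literals and template literals, and the link pattern only covers `[[Target]]` and `[[Target|label]]`.

```javascript
// Collect the text of every // and /* */ comment in a JavaScript source,
// skipping over quoted strings so that "//" inside a string is not mistaken
// for a comment. Regex literals and template literals are not handled;
// this is a deliberate simplification (assumption).
function extractComments(source) {
  const comments = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    const next = source[i + 1];
    if (ch === '"' || ch === "'") {
      // Skip a quoted string, honouring backslash escapes.
      i++;
      while (i < source.length && source[i] !== ch) {
        if (source[i] === "\\") i++;
        i++;
      }
      i++;
    } else if (ch === "/" && next === "/") {
      let end = source.indexOf("\n", i);
      if (end === -1) end = source.length;
      comments.push(source.slice(i + 2, end));
      i = end;
    } else if (ch === "/" && next === "*") {
      let end = source.indexOf("*/", i + 2);
      if (end === -1) end = source.length;
      comments.push(source.slice(i + 2, end));
      i = end + 2;
    } else {
      i++;
    }
  }
  return comments;
}

// Register [[links]] only when they occur inside comments.
function linksInComments(source) {
  const pattern = /\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]/g;
  const targets = [];
  for (const comment of extractComments(source)) {
    for (const match of comment.matchAll(pattern)) {
      targets.push(match[1].trim());
    }
  }
  return targets;
}

const script = [
  "// FooBar script by [[User:Example]]",
  "var text = '[[Category:Not a real category]]'; /* see [[Help:Scripts|help]] */",
  "var url = 'http://example.org/'; // [[User:Example/foobar.js]]",
].join("\n");

// The category inside the string literal is ignored; the three links in
// comments are kept.
console.log(linksInComments(script));
```

With this approach the tracking comment keeps working, while link-like text in string literals no longer needs the '[' + '[Category' workaround.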
Comment 8 DavidL 2011-12-08 18:09:28 UTC
Parsing JS/CSS seems useless:
* {{subst:}} / {{template}} -> useless
* [[Category:]] -> useless or in comments only
* [[link]] / [link] -> useless or in comments only

Tracking script usage does not work for gadgets. If script usage should be tracked, it should be done as a new feature of MediaWiki instead.

If some parsing should be done, it should be only for [[Category:]] and [[link]] / [link], in comments only.
Comment 9 Phillip Patriakeas 2011-12-08 18:32:04 UTC
{{subst:}} is definitely *not* useless on JS pages; have a look at [[Template:Deletion sorting]] as one example (I'm not arguing that this method of script installation/use should be *encouraged*, merely pointing out that it exists and has been used historically).

I do agree that script usage needs to be tracked as a MediaWiki feature; this is what I was getting at above, though I never actually said as much.

I'd be fine with normal wikitext rendering only happening in comments (and I'd be far from the only person to wholeheartedly welcome functioning links in comments again!).
Comment 10 Phillip Patriakeas 2011-12-08 18:34:44 UTC
I meant to link to [[Template:Deltab]] in comment 9; I didn't check closely enough before submitting. =P
Comment 11 DavidL 2011-12-08 18:54:23 UTC
A problem occurs when 'subst:' appears in scripts that add buttons for inserting templates, for example. See https://fr.wikibooks.org/w/index.php?title=MediaWiki:Gadget-Barre_de_luxe.js&diff=prev&oldid=343887

subst: seems to be used only on the English Wikipedia; there are other projects and languages where this feature is not used.
Instead of using a script as a template, parameters in user scripts should be used, so subst: seems useless.

For the example you give, instead of... :
  Put in monobook (replace X with your area of interest) :
    {{subst:Deltab|X}}
  Javascript source :
    document.editform.wpTextbox1.value += '\{\{subst:deletion sorting|{{{1}}}| -- \~\~\~\~\}\}\n';
...we should have :
  Put in monobook :
    DeltabInterest = 'X'; // replace X with your area of interest.
    importScript('Deltab');
  Javascript source :
    document.editform.wpTextbox1.value += '\{\{subst:deletion sorting|'+DeltabInterest+'| -- \~\~\~\~\}\}\n';
Comment 12 DavidL 2011-12-08 19:03:00 UTC
Also, it would be safer to use JavaScript parameters in user scripts rather than template parameters. Other languages use more characters like ' é à ... which can cause script errors, for example:
  {{subst:Deltab|L'arbre}}

Also, subst does not let users get the latest script version, because it substitutes a copy of the script at save time.
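
The quoting problem can be reproduced with a minimal sketch. `substituteIntoScript` is a hypothetical stand-in for pasting a template parameter, unescaped, into a single-quoted JavaScript string, as the Deltab source in comment 11 does; it is not MediaWiki code.

```javascript
// Hypothetical model of {{subst:Deltab|X}}: the parameter is pasted, unescaped,
// into a single-quoted string in the generated script text (assumption).
function substituteIntoScript(param) {
  return "var interest = '" + param + "';";
}

// True if the text is syntactically valid JavaScript.
// new Function() compiles the text without running it.
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (e) {
    return false;
  }
}

// A plain parameter yields a valid script.
console.log(parses(substituteIntoScript("Trees")));

// The apostrophe ends the string early, so the generated script no longer parses.
console.log(parses(substituteIntoScript("L'arbre")));

// A parameter set in the user's own script is an ordinary string literal,
// so the user can choose a quoting style that fits the value.
var DeltabInterest = "L'arbre";
console.log(parses("var interest = " + JSON.stringify(DeltabInterest) + ";"));
```

The last line uses `JSON.stringify` only to show that a properly quoted literal parses; a user typing `DeltabInterest = "L'arbre";` by hand gets the same effect.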
Comment 13 DavidL 2011-12-08 19:04:36 UTC
So it would be better to stop parsing subst in scripts, and modify the wikipedia scripts like Deltab.
Comment 14 Tim Starling 2011-12-09 09:52:36 UTC
(In reply to comment #7)
> Duping here several related bugs, since I think this is the most clear case.
> Either everything should be parsed, or nothing.
> This means r103476 should be reverted (if we decide we go for the nothing case,
> it would be done somewhere else).

I don't like environment-dependent special cases in the parser like r103476, I think the proper place to decide what to do with a JS/CSS page is Article/WikiPage.

> subst: could continue working in both cases (mentioned in bug 32450).
> 
> Note that javascript authors can make the links in such way that they don't get
> registered (eg. '[' + '[Category').
> 
> As a third alternative, we could do a different kind of parsing for js pages,
> so that wikitext would only be parsed inside js comments (bug 10410).

I think a separate parser class along the lines of bug 10410 would be a nice way to go, it could implement syntax highlighting, linking and subst. But it's a bit late to develop that for 1.19. For now I am going to revert Hashar's changes and hack WikiPage somehow.
Comment 15 Daniel Kinzler 2011-12-09 10:03:57 UTC
(In reply to comment #14)
> I think a separate parser class along the lines of bug 10410 would be a nice
> way to go, it could implement syntax highlighting, linking and subst. But it's
> a bit late to develop that for 1.19. For now I am going to revert Hashar's
> changes and hack WikiPage somehow.

@Tim: for WikiData, we (WMDE) plan to introduce explicit types for page content, and handlers for each type, similar to what we use for different types of media files. So, there could be a special renderer (and optionally also a special editor and a special diff engine) for text/css, application/js, etc. 

Just wanted to note this here to avoid duplicate effort. I'll detail the plans on mediawiki-l soon, and we can discuss it at the SF hackathon.
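
A very rough sketch of what dispatching on page content type might look like. Everything here is hypothetical: the `handlers` table, `guessContentType`, the `text/x-wiki` label and the idea of choosing the type from the title suffix are assumptions made for illustration, not the planned MediaWiki design.

```javascript
// Hypothetical handler table keyed by content type. Only text/css and
// application/js are named in the comment above; text/x-wiki and the
// handler fields are assumptions for this sketch.
const handlers = {
  "text/x-wiki": { parseAsWikitext: true, highlight: null },
  "text/css": { parseAsWikitext: false, highlight: "css" },
  "application/js": { parseAsWikitext: false, highlight: "javascript" },
};

// Assumption: the content type is derived from the page title suffix.
function guessContentType(title) {
  if (/\.js$/.test(title)) return "application/js";
  if (/\.css$/.test(title)) return "text/css";
  return "text/x-wiki";
}

// Look up the handler that decides how a page is rendered.
function handlerFor(title) {
  return handlers[guessContentType(title)];
}

console.log(handlerFor("User:Example/foobar.js"));
console.log(handlerFor("User:Example/common.css"));
console.log(handlerFor("User:Example"));
```

The point of such a table is that the decision about parsing lives with the page type rather than as a special case inside the wikitext parser.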
Comment 16 Tim Starling 2011-12-09 10:33:17 UTC
WikiPage hack done in r105664.
Comment 17 Michael M. 2011-12-23 10:44:06 UTC
As I already mentioned in CR, as an author of user scripts I depend in some way on the fact that the JavaScript code is also parsed as wikitext.
Just this week I updated one of my scripts in a not-100-%-backwardscompatible way, so I had to inform the users of my script about the changes. With the "What links here" function this was no problem. Removing this means that an author of user scripts has no simple way to find out who is using his scripts. By looking at random .js user pages you will find that I'm not the only one who makes use of this feature; without looking around much I found [[User:Ale jrb/Scripts/igloo]], which asks users to include the backlink in a comment.

There is one of my scripts that I don't want to show up in Google, so I put __NOINDEX__ in it; with this change, that no longer works.

When a script causes links that shouldn't be there, you just have to put a <nowiki> inside a comment at the beginning.

And not parsing the script as wikitext makes it possible to circumvent Extension:SpamBlacklist and possibly other anti-spam-extensions, as I pointed out in CR.
Comment 18 DavidL 2011-12-23 13:30:03 UTC
JavaScript must be JavaScript. How many scripts are broken because they are parsed as wikitext?

If you want wiki-parsing in scripts, you should make a feature request for special comment or tag. We should not have to modify numerous scripts to put a <nowiki> tag.

Also tracking script users like this is not good:
1 - Users who know JavaScript syntax won't include the link comment, so not all users are listed.
2 - It doesn't work for gadgets.
3 - Tracking of users seems only useful for the total usage count only, not for the user names.
You may request a new feature of MediaWiki to get total script usage count, instead of using links.

Not parsing the script as wikitext DOES NOT make it possible to circumvent Extension:SpamBlacklist because links are not links.
Comment 19 Liangent 2011-12-23 14:23:42 UTC
(In reply to comment #18)
> How many scripts are broken because they are
> parsed as wikitext?

You only need to add backslashes in the few cases that trigger PST, to prevent the script being broken:

* var str = "{\{subst:template}}";
* var str = "~~\~~";

And what more?
Comment 20 DavidL 2011-12-23 14:33:12 UTC
No, putting backslash is not javascript ! Do you have a bot to make the replacement for all scripts, gadgets, on all wiki projects and all langs ?

The simplest way to have parsed scripts is to use another extension like .jsw instead of .js (.cssw instead of .css) because they aren't true javascript (stylesheet).
Comment 21 Phillip Patriakeas 2011-12-23 17:03:14 UTC
(In reply to comment #20)
> No, putting backslash is not javascript !

Are you listening to yourself? Using a backslash in strings to escape particular characters so they aren't incorrectly parsed is done by just about everybody who writes JS, including those not writing it on MediaWiki installations - it's the simplest and most obvious way to prevent unwanted parsing by the JS engine running the script. Using the backslash to prevent unwanted parsing by MediaWiki is a natural and obvious extension of this built-in syntax without any downsides; every JS programmer immediately recognises what's going on when they see the backslash, even if they don't understand the reasoning for its being there. How exactly, then, is "putting backslash ... not javascript"?

(In reply to comment #18)
> 3 - Tracking of users seems only useful for the total usage count only, not for
> the user names.

Michael M. gave an explicit counterexample to this in comment 17: 'Just this week I updated one of my scripts in a not-100-%-backwardscompatible way, so I had to inform the users of my script about the changes. With the "What links here" function this was no problem.' If the script tracking only listed the number of users with the script installed, that notification would have been impossible.
Comment 22 DavidL 2011-12-23 19:21:34 UTC
(In reply to comment #21)
> (In reply to comment #20)
> > No, putting backslash is not javascript !
> 
> Are you listening to yourself? Using a backslash in strings to escape
> particular characters so they aren't incorrectly parsed is done by just about
> everybody who writes JS, including those not writing it on MediaWiki
> installations - it's the simplest and most obvious way to prevent unwanted
> parsing by the JS engine running the script. Using the backslash to prevent
> unwanted parsing by MediaWiki is a natural and obvious extension of this
> built-in syntax without any downsides; every JS programmer immediately
> recognises what's going on when they see the backslash, even if they don't
> understand the reasoning for its being there. How exactly, then, is "putting
> backslash ... not javascript"?

I mean that the following is valid JavaScript:
  var str = "{{subst:template}}";
Why should this code need escaping at all when JavaScript is expected?

> 
> (In reply to comment #18)
> > 3 - Tracking of users seems only useful for the total usage count only, not for
> > the user names.
> 
> Michael M. gave an explicit counterexample to this in comment 17: 'Just this
> week I updated one of my scripts in a not-100-%-backwardscompatible way, so I
> had to inform the users of my script about the changes. With the "What links
> here" function this was no problem.' If the script tracking only listed the
> number of users with the script installed, that notification would have been
> impossible.

Then use a dedicated MediaWiki feature instead of a special wiki+JavaScript syntax.
Wiki pages are wiki, JavaScript pages should be JavaScript.
Comment 23 Liangent 2011-12-23 19:41:50 UTC
Is there someone using {{subst: to do strange things such as version counting?

If no, I guess it's safe to stop doing PST for .css/.js.
Comment 24 Liangent 2011-12-23 19:42:49 UTC
Hmm there're various pages telling people to add {{subst:something}} to their <skinname-or-common>.js to install a user script.
Comment 25 Helder 2011-12-23 20:03:31 UTC
(In reply to comment #24)
> Hmm there're various pages telling people to add {{subst:something}} to their
> <skinname-or-common>.js to install a user script.

I think this could be replaced by a link in the script documentation page:
http://en.wikipedia.org/w/index.php?title=Special:MyPage/common.js&action=edit&withJS=MediaWiki:Example.js
where [[MediaWiki:Example.js]] would have a simple code which adds the script to the end of the user's common.js page.

This would be a script-only solution for the installation of scripts.

There is also the following for helping users to install scripts:
* [[User:Gary King/script installer source.js]]
Comment 26 Helder 2011-12-23 20:27:17 UTC
(In reply to comment #2)
> Better tracking would be good, yes. But another use that we should consider is
> that people sometimes slap deletion templates on their .css and .js pages, and
> it works. Without that, people would be forced to make such deletion requests
> on some other page.

Well, it doesn't work as expected: the deletion category is not shown in the script page so the users are already forced to find some admin to delete their scripts, because the template doesn't seem to work (and also causes JS errors if it is not put /* inside of a comment */ )
Comment 27 Helder 2011-12-23 20:37:28 UTC
(In reply to comment #19)
> You only need to add backslashes in the few cases that trigger PST, to prevent
> the script being broken:
> 
> * var str = "{\{subst:template}}";
> * var str = "~~\~~";

Users who try to use JSHint[1] to validate a code which uses this will get a "Bad escapement" error.

Although they can just change it to
    var str = "{" + "{subst:template}}";
    var str = "~~" + "~~";
to avoid the errors, having to fix these sequences of characters every time because they have special meaning in wiki markup is a PITA. Valid JavaScript code should not have unexpected results when inside of a wiki page.

[1] The tool http://www.jshint.com/ is recommended by the developers, at [[mw:Manual:Coding_conventions/JavaScript#Performance_and_best_practices]]
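
A small sketch comparing the two workarounds at runtime. It only demonstrates standard JavaScript string behaviour; the variable names are made up for the example.

```javascript
// Backslash form from comment 19: "\{" and "\~" are not special escape
// sequences in JavaScript, so the backslash is simply dropped at runtime.
var escapedSubst = "{\{subst:template}}";
var escapedSignature = "~~\~~";

// Concatenation form from this comment: no backslashes, so a linter has no
// unnecessary escapes to report.
var splitSubst = "{" + "{subst:template}}";
var splitSignature = "~~" + "~~";

// Both forms evaluate to the same strings; they differ only in how the
// source text looks to the wiki's pre-save transform and to linters.
console.log(escapedSubst === splitSubst);
console.log(escapedSignature === splitSignature);
console.log(splitSubst, splitSignature);
```

Either way the script author has to remember to break up the sequence in the source, which is the maintenance burden described above.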
Comment 28 Phillip Patriakeas 2011-12-23 20:39:38 UTC
(In reply to comment #26)
> (In reply to comment #2)
> > Better tracking would be good, yes. But another use that we should consider is
> > that people sometimes slap deletion templates on their .css and .js pages, and
> > it works. Without that, people would be forced to make such deletion requests
> > on some other page.
> 
> Well, it doesn't work as expected: the deletion category is not shown in the
> script page so the users are already forced to find some admin to delete their
> scripts, because the template doesn't seem to work (and also causes JS errors
> if it is not put /* inside of a comment */ )

(Keeping in mind that I have almost no technical understanding of how the parsing infrastructure currently works) The simplest (and most naive) method of fixing that would be to just parse CSS/JS to find comments, and send the content of each comment on to the parser proper. Of course, that opens up one massive can of worms as to how we'd want to deal with markup that does just about anything except displaying text with simple formatting and links (what, for example, would be the preferred method of handling lists? tables? images?), and it would also break the ability to copy CSS/JS directly while viewing the relevant page (since you wouldn't get the parsed markup in comments)... But then, I *did* say it would be a naive method.
Comment 29 Helder 2011-12-23 21:03:52 UTC
One more thing which may break when fixing this bug is the hack used on [[Template:Selfsubst/now string]] to produce an auto updating string (e.g. for version of scripts).
Comment 30 Michael M. 2011-12-27 08:53:27 UTC
(In reply to comment #18)

> Not parsing the script as wikitext DOES NOT make it possible to circumvent
> Extension:SpamBlacklist because links are not links.

As I pointed out in r105664#c27321, it DOES make it possible: When you transclude a .js/.css-page in a normal page the links are links again, even if they aren't rendered as links in the .js-page.
Comment 31 Liangent 2011-12-27 15:01:18 UTC
Just now a user reported an issue on the zhwiki village pump: an edit to a user JS page triggered AbuseFilter because of "buggy" template usage.
Comment 32 DavidL 2011-12-27 16:11:36 UTC
Transcluding a .js/.css page in a normal page (which I would never do; just putting a link to the .js/.css page is better) should not work like any other template, but should use JavaScript syntax highlighting without any link parsing (like viewing the .js/.css page directly).
Comment 33 Tim Starling 2011-12-27 23:02:54 UTC
I reverted the fix from r105664 and I'm marking this bug WONTFIX because:

* Concerns raised on this bug report indicate a lack of consensus and the potential for disruption on deployment.
* Precaution leads me to favour the existing behaviour over the new proposal in cases where the best behaviour is unclear.
* A better solution exists, which would satisfy people on both sides of this debate: registering links only where they occur in comments, along the lines of bug 10410.

(In reply to comment #29)
> One more thing which may break when fixing this bug is the hack used on
> [[Template:Selfsubst/now string]] to produce an auto updating string (e.g. for
> version of scripts).

Removing the pre-save transform was not requested here or implemented in either of the proposed fixes, so the selfsubst templates would have still worked.
Comment 34 DavidL 2011-12-28 16:20:34 UTC
Revert to MW 1.17 instead.
MW 1.18 caused multiple problems.

What about the {{subst:}} problem ?
Comment 35 Tim Starling 2011-12-29 00:11:19 UTC
(In reply to comment #34)
> Revert to MW 1.17 instead.
> MW 1.18 caused multiple problems.

I'm not aware of any difference between MW 1.17 and MW 1.18 in the way it handles CSS/JS page parsing. Please file a separate bug.

> What about the {{subst:}} problem ?

You can file a separate bug for that. But the discussion here indicates that it would probably be a WONTFIX also since subst is desired by some.
Comment 36 DavidL 2012-01-05 17:26:35 UTC
(In reply to comment #35)
> (In reply to comment #34)
> > Revert to MW 1.17 instead.
> > MW 1.18 caused multiple problems.
> 
> I'm not aware of any difference between MW 1.17 and MW 1.18 in the way it
> handles CSS/JS page parsing. Please file a separate bug.
> 
> > What about the {{subst:}} problem ?
> 
> You can file a separate bug for that. But the discussion here indicates that it
> would probably be a WONTFIX also since subst is desired by some.

See previous comment: I already opened Bug 32450 but it has been marked as duplicate of this.

The only conclusion I see is that problems won't be fixed, so I doubt that bug reports are useful here...
Comment 37 Helder 2014-07-28 19:02:23 UTC
(In reply to LordAndrew from comment #2)
...and {{delete}} and [[Category:]] do not work on JS pages anymore, per bug 68757.
