Last modified: 2012-06-13 18:22:24 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T30919, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 28919 - Add message parsing mode stripping markup


Summary:	Add message parsing mode stripping markup

Status:	RESOLVED WONTFIX

Product:	MediaWiki
Classification:	Unclassified
Component:	Interface (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Normal enhancement with 1 vote (vote)
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:	i18n

Depends on:
Blocks:

Reported:	2011-05-10 22:14 UTC by Purodha Blissenbach
Modified:	2012-06-13 18:22 UTC (History)
CC List:	2 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Purodha Blissenbach 2011-05-10 22:14:51 UTC

Some user interface messages are used in different contexts - see for instance bug 14107, but there are many more - where one use requires markup, or could or should have markup, while the other use forbids markup.

For the second use, we should have a message parsing or message text retrieval mode that strips markup. Most likely the needed code already exists, since the TOC of pages is built from section headings with all kinds of markup stripped.

It only needs to be applied at the right places. 

Some routines should generally apply this kind of stripping to specific data, e.g. all HTML attribute values, or option values and labels in <select> inputs, etc.

Comment 1 Brion Vibber 2011-05-10 23:08:01 UTC

I don't think any new message parsing modes would help here.

These aren't necessarily good 1:1 matches for message parsing; a general HTML chunk might include explanatory links etc that would look horribly broken when flattened to plaintext.

Bug 14107 sounds like a specific case where what's desired is:
* load a message that lists deletion reasons and split it into sections and substrings based on app-specific rules, as is already done

for each section's substring:
* pass it through comment formatter to produce HTML fragment
* run through Sanitizer::stripAllTags() to get a plaintext version of the HTML fragment
* put the plaintext copy into the dropdown list
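
The per-line steps above can be sketched roughly as follows. This is an illustration in Python, not MediaWiki code: `format_comment` is a hypothetical stand-in for the comment formatter (it only handles `[[Target|label]]` links), `strip_all_tags` only mimics the role of Sanitizer::stripAllTags(), and the `*` / `**` message layout and the `/wiki/` link prefix are assumptions made for the example.

```python
import html
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects only the text nodes of an HTML fragment."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_all_tags(fragment: str) -> str:
    """Return the plaintext of an HTML fragment (stand-in for a tag stripper)."""
    parser = _TextExtractor()
    parser.feed(fragment)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.parts)


def format_comment(wikitext: str) -> str:
    """Hypothetical comment formatter: escapes HTML, renders [[Target|label]] links."""
    escaped = html.escape(wikitext, quote=False)

    def link(match):
        target = match.group(1)
        label = match.group(2) or target
        return f'<a href="/wiki/{target}">{label}</a>'

    return re.sub(r"\[\[([^\]|]+)(?:\|([^\]]+))?\]\]", link, escaped)


def dropdown_entries(message: str):
    """Split the message first, then format and strip each line separately."""
    entries = []
    group = None
    for line in message.splitlines():
        line = line.strip()
        if line.startswith("**"):      # assumed: '**' marks a selectable reason
            text = strip_all_tags(format_comment(line[2:].strip()))
            entries.append((group, text))
        elif line.startswith("*"):     # assumed: '*' marks a group heading
            group = strip_all_tags(format_comment(line[1:].strip()))
    return entries


message = (
    "* Common reasons\n"
    "** [[Project:Copyright|Copyright]] violation\n"
    "** Author request"
)
entries = dropdown_entries(message)
```

Because the message is split before any formatting happens, each dropdown entry keeps its group, and the link markup in the second line is reduced to its label text.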

Comment 2 Purodha Blissenbach 2011-05-11 00:16:46 UTC

My somewhat lazy ad-hoc idea for programming this mode was to parse the message to HTML and then delete each < and > and whatever is between them. This is easier and safer than dealing with both wikitext and HTML. The overhead, if any, is imho acceptable since these cases are pretty rare.

Comment 3 Brion Vibber 2011-05-11 19:47:48 UTC

Well, this isn't a case where you want to parse the message -- the message is structured text: a list containing a bunch of different lines of text. Running it through parsing and then stripping tags would destroy that structure, and probably make it impossible to extract the individual lines.
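
A small Python sketch of that concern, under stated assumptions: the `rendered` string is a hypothetical compact HTML rendering of a nested reason list (real parser output, including its whitespace, may differ), and the regex implements the "delete each < and > and what's between them" idea from comment 2.

```python
import re

# Naive stripping: remove every '<', '>' and whatever sits between them.
TAG_RE = re.compile(r"<[^>]*>")


def strip_tags_naively(rendered_html: str) -> str:
    return TAG_RE.sub("", rendered_html)


# Hypothetical rendering of a whole list message parsed in one go.
rendered = (
    "<ul><li>Common reasons"
    "<ul><li>Copyright violation</li><li>Author request</li></ul>"
    "</li></ul>"
)

flattened = strip_tags_naively(rendered)
print(flattened)  # prints: Common reasonsCopyright violationAuthor request
```

Once the whole message has been rendered and flattened, nothing in the result marks where one entry ends and the next begins, or which entry was a group heading.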

Comment 4 Siebrand Mazeland 2012-06-13 18:22:24 UTC

WONTFIX per Brion.
