Last modified: 2014-04-16 21:47:10 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T31126, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 29126 - Multiline not well detected
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: TimedMediaHandler
Version: unspecified
Hardware: All All
Importance: Lowest minor
Target Milestone: ---
Assigned To: Michael Dale
Depends on:
Blocks: subtitle
Reported: 2011-05-24 20:34 UTC by Derk-Jan Hartman
Modified: 2014-04-16 21:47 UTC (History)
CC List: 4 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
Screenshot: Still valid after TMH deployment (112.87 KB, image/png)
2012-12-07 12:56 UTC, Andre Klapper

Description Derk-Jan Hartman 2011-05-24 20:34:02 UTC
It seems that multiline SRT elements are concatenated instead of being presented on two lines.

Example in http://commons.wikimedia.org/wiki/File:Elephants%20Dream.ogg at 00:05:44

Presented is:
Why? - Now!

Instead of:
Why?
- Now!
Comment 1 Derk-Jan Hartman 2011-05-24 21:31:07 UTC
This is caused by the usage of http://commons.wikimedia.org/w/api.php?action=parse&page=TimedText%3AElephants_Dream.ogg.en.srt&smaxage=300&maxage=300&format=json

which of course squashes single line breaks into oblivion. Alternative output parsing is required for this. Perhaps just get the raw wikitext and interpret it with the normal parseSrt() function instead?
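
A minimal sketch of the raw-text approach suggested above, assuming the subtitle page holds plain SRT. The `parse_srt` helper here is hypothetical and is not TimedMediaHandler's own parseSrt(); only the 00:05:44 start time and the two text lines come from this report, while the cue number and end time are made up for illustration.

```python
import re

# Hypothetical sample cue: the index and end time are assumptions.
RAW_SRT = """\
1
00:05:44,000 --> 00:05:46,000
Why?
- Now!
"""

TIMING = re.compile(r"(\d{2}:\d{2}:\d{2},\d{3}) --> (\d{2}:\d{2}:\d{2},\d{3})")


def parse_srt(text):
    """Split raw SRT into cues, keeping each text line separate."""
    cues = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.match(lines[1])
        if not match:
            continue
        cues.append({
            "start": match.group(1),
            "end": match.group(2),
            "lines": lines[2:],  # one entry per subtitle line
        })
    return cues


cues = parse_srt(RAW_SRT)
# Joining with a space mimics what the reader sees once the line break
# has been treated as ordinary whitespace in the rendered paragraph.
collapsed = " ".join(cues[0]["lines"])
```

Because the lines stay in a list, the caller decides how to render the break, rather than losing it during wikitext rendering.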
Comment 2 Michael Dale 2011-05-24 21:43:10 UTC
We use the HTML output so that we can support wikitext conversion of things like [[links]] in the subtitle text. I would recommend putting a <br> on the line where you want the line break. Normal SRT parsers should strip unknown HTML tags and we can retain the flexibility of having our timed text read from HTML.
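
As a rough illustration of the <br> suggestion, the sketch below shows how a client that receives HTML-flavoured cue text might recover plain lines. The `to_plain_lines` helper is hypothetical and the regex-based tag stripping is an assumption for illustration, not the behaviour of any particular SRT parser.

```python
import re

BR_TAG = re.compile(r"<br\s*/?>", re.IGNORECASE)
ANY_TAG = re.compile(r"<[^>]+>")


def to_plain_lines(cue_html):
    """Turn <br> into line breaks, then drop any remaining tags."""
    with_breaks = BR_TAG.sub("\n", cue_html)
    stripped = ANY_TAG.sub("", with_breaks)
    # Discard empty lines left over from a <br> followed by a newline.
    return [line.strip() for line in stripped.split("\n") if line.strip()]


plain = to_plain_lines("Why?<br>- Now!")
```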
Comment 3 Derk-Jan Hartman 2011-05-24 22:07:48 UTC
"I would recommend putting a <br> on the line where you want the line break."

Then it's not SRT. SRT requires no such things.
Comment 4 Derk-Jan Hartman 2011-05-24 22:22:18 UTC
To further clarify that last point....

There are already 10 thousand subtitle formats out there, and I'm very wary of defining our 'own'. I'd rather see a special 'parser' that converts into a new 'internal' format than that we make changes in the storage format.

Perhaps we can create a new API module that outputs this, instead of using 'parse'. This new API can provide all the time information, language, direction, etc. in a format that is easily readable for the JavaScript, doing away with the JS-side regex parsing, and can have an array of 'lines' that have been MediaWiki-parsed individually. The JS client can then concatenate these lines with <br>.

This seems like a much more future-proof system.
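
The proposal above could look roughly like the following sketch. Everything in it is an assumption: the field names, the `build_response` and `client_html` helpers, and `render_line`, which only escapes HTML as a stand-in for per-line MediaWiki parsing. The start time of 344 seconds corresponds to 00:05:44; the end time is invented.

```python
import html
import json


def render_line(wikitext_line):
    """Stand-in for MediaWiki parsing of one line; it only escapes HTML."""
    return html.escape(wikitext_line)


def build_response(language, direction, cues):
    """Package cues as structured data with individually rendered lines."""
    return {
        "language": language,
        "direction": direction,
        "cues": [
            {
                "start": cue["start"],
                "end": cue["end"],
                "lines": [render_line(line) for line in cue["lines"]],
            }
            for cue in cues
        ],
    }


def client_html(cue):
    """What the client side would do: join the rendered lines with <br>."""
    return "<br>".join(cue["lines"])


response = build_response(
    "en", "ltr",
    [{"start": 344.0, "end": 346.0, "lines": ["Why?", "- Now!"]}],
)
payload = json.dumps(response)  # what the hypothetical module would return
```

Keeping timing as numbers and text as a list means the client needs no regex parsing and can choose its own line separator.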
Comment 5 Michael Dale 2011-05-24 23:12:35 UTC
Yes, an API module would be best. Maybe a new feature request bug?

With an API module we could output either "clean SRT" for SRT clients, or "HTML-like" SRT for our HTML-based player, with any MediaWiki-based markup transformations applied, delivered as per-subtitle-segment JSON with packaged HTML bits.

I have some TODO notes scattered in the code base to do exactly this ;)
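
For the "clean SRT" side of that idea, a sketch of serializing structured cues back to plain SRT might look like this. The cue dictionary layout and both helper names are assumptions carried over for illustration, not an existing TimedMediaHandler interface.

```python
def format_timestamp(seconds):
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_clean_srt(cues):
    """Serialize cues to SRT, writing each subtitle line on its own row."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        timing = f"{format_timestamp(cue['start'])} --> {format_timestamp(cue['end'])}"
        blocks.append("\n".join([str(index), timing, *cue["lines"]]))
    # Blank line between cues, trailing newline at the end of the file.
    return "\n\n".join(blocks) + "\n"


srt_text = to_clean_srt(
    [{"start": 344.0, "end": 346.0, "lines": ["Why?", "- Now!"]}]
)
```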
Comment 6 Andre Klapper 2012-12-07 12:56:57 UTC
Created attachment 11481
Screenshot: Still valid after TMH deployment
