Last modified: 2014-08-13 13:56:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T44790, the corresponding Phabricator task for complete and up-to-date bug report information.
Bug 42790 - Use language code subpages for subtitles to allow Translate extension usage
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: TimedMediaHandler
Version: master
Hardware: All All
Importance: Lowest normal
Target Milestone: ---
Assigned To: Michael Dale
URL: https://commons.wikimedia.org/wiki/Ti...
Keywords: i18n
Depends on: 42495
Blocks: commons subtitle
Reported: 2012-12-06 17:26 UTC by Nemo
Modified: 2014-08-13 13:56 UTC (History)
CC List: 7 users

Description Nemo 2012-12-06 17:26:42 UTC
Is there any reason to be forced to use a fake .srt extension? (Seems a bug.)
If we had language code subpages, we could use the translate extension right now: subpages are always clean enough if there's no <language/> or other similar stuff, see the text on <https://meta.wikimedia.org/w/index.php?title=Fundraising_2012/Translation/Poongothai_video_%28captions%29/de&action=edit>

Note that people do expect subtitles to work with our translation tools, see e.g. https://lwn.net/Articles/527081/
Comment 1 Michael Dale 2012-12-06 18:39:11 UTC
We use .srt because that is the format of the timed text. We could imagine future timed text formats like popcorn, or vtt or any other timed text stuff we may want to use in the future. Also the languageCode.srt naming convention matches local srt files, so if you download the timed text by page title name into a folder applications like vlc would know what to do with it.  

Would it be possible to make some small adjustments to the translate extension to support the timedText namespace?

--michael
Comment 2 Nemo 2012-12-06 19:19:56 UTC
(In reply to comment #1)
> We use .srt because that is the format of the timed text. We could imagine
> future timed text formats like popcorn, or vtt or any other timed text stuff
> we may want to use in the future.

I'm not challenging the use of the srt format. :)

> Also the languageCode.srt naming convention
> matches local srt files, so if you download the timed text by page title name
> into a folder applications like vlc would know what to do with it.  

One can't just "save with name" or "save as" and keep the extension, and ?action=raw doesn't keep the extension or page title at all; Firefox, for instance, saves it as index.php.
So this is not a feature we would be removing, as it doesn't exist and would rely on something else anyway.
Comment 3 Michael Dale 2012-12-06 19:27:29 UTC
The basic point is that it's useful to distinguish timed text types. We are using .srt today, but may use .vtt in the future. If the timed text namespace did not have an extension, how would we distinguish types?
Comment 4 Nemo 2012-12-06 19:34:56 UTC
(In reply to comment #3)
> The basic point is that it's useful to distinguish timed text types. We are
> using .srt today, but may use .vtt in the future. If the timed text namespace
> did not have an extension, how would we distinguish types?

The namespace doesn't contain the extension. The title does, but does it matter if it's at the end of the fullpagename? Can as well be at the end of basepagename.
By the way, I guess ContentHandler doesn't require the extension in order to know the format, if needed?
Comment 5 Michael Dale 2012-12-06 19:51:13 UTC
yes, I mean the page title within the timed text namespace.

ContentHandler may be a way to go, would need to look into it in more detail. But things like <languages/> would not inherently be compatible if timed text had a different type and was not "wikitext"  

I suppose it could work that way, i.e. TimedText:FileName.webm.srt/en instead of TimedText:FileName.webm.en.srt

Who is the author of the Translate extension? If it's not hard for the Translate extension to special-case the timed text pages, that might be easier than moving all the pages and changing all the templates and code in TimedText. We have been using .{languageCode}.srt for a few years.
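
As a rough illustration of the rename being discussed, the sketch below maps a title in the current .{languageCode}.srt convention to the language code subpage form. The function name toSubpageTitle and the language-code pattern are assumptions made for this example; nothing here is taken from TimedMediaHandler or Translate.

```javascript
// Hypothetical helper, not part of TimedMediaHandler or Translate.
// Converts "TimedText:<file>.<lang>.srt" to "TimedText:<file>.srt/<lang>".
function toSubpageTitle(title) {
  // <file> is matched greedily, so it may itself contain dots
  // (e.g. "FileName.webm"). <lang> is assumed to look like "en" or "pt-br".
  const pattern = /^(TimedText:.+)\.([A-Za-z]+(?:-[A-Za-z0-9]+)*)\.(srt)$/;
  const match = pattern.exec(title);
  if (match === null) {
    return null; // not a title in the current convention
  }
  const [, base, lang, ext] = match;
  return `${base}.${ext}/${lang}`;
}

console.log(toSubpageTitle("TimedText:FileName.webm.en.srt"));
```

A title without a language code (for example TimedText:FileName.webm.srt) would be misread by this pattern, since "webm" also looks like a code; a real migration would need to validate against the list of known language codes.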
Comment 6 Nemo 2012-12-06 21:47:45 UTC
(In reply to comment #5)
> I suppose it could work that way, i.e. TimedText:FileName.webm.srt/en
> instead of TimedText:FileName.webm.en.srt

Yes.
The Translate extension has existed for several years too, and language code subpages are the standard in MediaWiki (system messages) and also Commons (thousands of templates, hundreds of pages...).
The author is Niklas, in cc. 

Again, it's not up to me to say what the best technical approach is. Subpages seem less confusing for users, but if that's a huge problem, maybe another solution should be found; I don't know.

> ContentHandler may be a way to go, would need to look into it in more detail.
> But things like <languages/> would not inherently be compatible if timed text
> had a different type and was not "wikitext"  

Yes, <languages/> must be avoided, but this may be something to be left for the translation administrators to check.
Daniel can perhaps give some suggestion if ContentHandler is actually required or preferable?
Comment 7 Jean-Fred 2013-11-28 15:24:45 UTC
So, what is the current status of this?

What’s the way forward: changing Translate to have it work with .{languageCode}.srt, or changing TMH (I guess) to use .srt/{languageCode}?

If you ask me, I’d be more inclined to the second solution: as Nemo said, the /en way has been around forever and the rest of Commons works this way. But I guess that’s not my call.

(Heck, I’m so desperate about this that I considered today enabling Translate on a TimedText and use crazy redirects in the hope it would work :-þ)
Comment 8 Nemo 2013-11-28 15:31:24 UTC
(In reply to comment #7)
> (Heck, I’m so desperate about this that I considered today enabling Translate
> on a TimedText and use crazy redirects in the hope it would work :-þ)

Redirects? How about transclusion, did you try it? You call it crazy but there's probably nothing else to do, I don't think TMH is receiving any substantial feature development as of now. Commons could set up some bots to handle the sync of the .{languageCode}.srt pages.
Comment 9 Bawolff (Brian Wolff) 2013-11-28 16:06:04 UTC
Is it just changing the names of the pages? I could try to look at that next week.
Comment 10 Michael Dale 2013-11-28 16:10:27 UTC
It's a relatively simple change, but {languageCode}.srt is a more standard way to represent file names of subtitles, i.e. if you wanted to download the subtitle file, we would have to remap things for the name to make sense on your file system.

How much work would it be to special case the TimedText namespace in the translate extension?
Comment 11 Michael Dale 2013-11-28 16:14:59 UTC
Sorry, I realize my comment is sort of a loop of what I previously mentioned in this thread. If consensus is .srt/{languageCode}, let's just do that.

Bawolff, in reviewing / implementing, consider download links such as these:
https://commons.wikimedia.org/w/index.php?title=TimedText:Fra_Mauro%27s_Map_of_the_World.ogv.en.srt&action=raw&ctype=text%2Fx-srt

The /{languageCode} should be mapped to before the .srt so that it's a valid local srt file name, if possible.
Comment 12 Bawolff (Brian Wolff) 2013-11-28 16:28:30 UTC
We could just send Content-Disposition headers if it really matters.

As it stands, that sort of URL would lead to an index.php filename, I believe.
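
For reference, a Content-Disposition value carrying a subtitle file name could be assembled roughly as follows. This is only a JavaScript sketch of the header format (RFC 6266, with an RFC 5987 encoded filename* parameter); the helper name contentDisposition is hypothetical, and this is not how MediaWiki builds its responses.

```javascript
// Hypothetical helper: builds a Content-Disposition header value
// for a subtitle download.
function contentDisposition(fileName) {
  // Plain ASCII fallback for older clients: replace anything outside
  // a conservative safe set with "_".
  const fallback = fileName.replace(/[^A-Za-z0-9._-]/g, "_");
  // encodeURIComponent leaves ' ( ) * unescaped, but RFC 5987 does not
  // allow them in an extended value, so percent-encode them as well.
  const encoded = encodeURIComponent(fileName).replace(
    /['()*]/g,
    (c) => "%" + c.charCodeAt(0).toString(16).toUpperCase()
  );
  return `attachment; filename="${fallback}"; filename*=UTF-8''${encoded}`;
}

console.log(contentDisposition("Fra_Mauro's_Map_of_the_World.ogv.en.srt"));
```

With a header like this the browser takes the saved file name from the response rather than from the index.php URL.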
Comment 13 Michael Dale 2013-11-28 18:13:55 UTC
In the context of the player we set: 

.attr( {
   'href': source.getSrc(),
   'download': fileName
})

Browsers use this to trigger a download with the given file name. It would be a small change to parse the title, check for the TimedText namespace, and rearrange things so it has a .srt extension.

But just something to keep in mind.
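
A minimal sketch of that title rearrangement, assuming subpage-style titles of the form TimedText:<file>.srt/<languageCode>. The helper downloadFileName is hypothetical, not existing player code.

```javascript
// Hypothetical helper for the `download` attribute value.
// "TimedText:FileName.webm.srt/en" becomes "FileName.webm.en.srt".
function downloadFileName(title) {
  const prefix = "TimedText:";
  if (!title.startsWith(prefix)) {
    return null; // only TimedText pages are rearranged
  }
  const rest = title.slice(prefix.length);
  const slash = rest.lastIndexOf("/");
  if (slash === -1) {
    return null; // no language subpage
  }
  const page = rest.slice(0, slash); // e.g. "FileName.webm.srt"
  const lang = rest.slice(slash + 1); // e.g. "en"
  const dot = page.lastIndexOf(".");
  if (dot === -1 || lang === "") {
    return null; // no format extension or empty language code
  }
  // Put the language code in front of the format extension.
  return `${page.slice(0, dot)}.${lang}${page.slice(dot)}`;
}

console.log(downloadFileName("TimedText:FileName.webm.srt/en"));
```

Because the extension is taken from the page name rather than hard-coded, the same rearrangement would also cover another format such as .vtt.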
Comment 14 Bawolff (Brian Wolff) 2013-11-28 18:20:47 UTC
(In reply to comment #13)
> In the context of the player we set: 
> 
> .attr( {
>    'href': source.getSrc(),
>    'download': fileName
> })
> 
> Browsers use this to trigger a download with the given file name. It would be
> a small change to parse the title, check for the TimedText namespace, and
> rearrange things so it has a .srt extension.
> 
> But just something to keep in mind.

Actually, we can't directly make the URL have an extension other than .php due to security bugs in Safari and some versions of IE, but that's kind of a separate problem.
Comment 15 Jean-Fred 2014-08-13 13:56:25 UTC
(In reply to Nemo from comment #8)
> (In reply to comment #7)
> > (Heck, I’m so desperate about this that I considered today enabling Translate
> > on a TimedText and use crazy redirects in the hope it would work :-þ)
> 
> Redirects? How about transclusion, did you try it?

I felt crazy enough today to try this:
* Translate cannot be enabled on the TimedText NS
* I created a fake page <https://commons.wikimedia.org/wiki/File:Wikimedia_Chapters_Dialogue.webmhd.webm/srt>
* Transclusion does not work <https://commons.wikimedia.org/w/index.php?title=TimedText:Wikimedia_Chapters_Dialogue.webmhd.webm.fr.srt&oldid=131443712>
* Substing the transclusion does work. I’ll do that.

This workflow is driving me crazy: it seems like all the pieces are here, and yet we jump through hoops. This is deeply frustrating :-(
