Last modified: 2014-07-17 12:52:03 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T68162, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 66162 - Simplistic alignment based on h2 headers
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Translate
Version: master
Hardware: All All
Importance: Unprioritized normal
Target Milestone: ---
Assigned To: Pratik Lahoti
Depends on:
Blocks: 65740
Reported: 2014-06-05 06:44 UTC by Nemo
Modified: 2014-07-17 12:52 UTC (History)
Description Nemo 2014-06-05 06:44:33 UTC
As a translation admin, I want the initial alignment offered by Special:PageMigration to be balanced enough for me to orient myself in the task of fixing it manually.

First opportunity: align == in the order they appear. Irrespective of (and without changing) splitting, we can align the first source unit having a ^==[^=] to the first "target" unit having a ^==[^=], adding blanks before if needed.
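A minimal sketch of the alignment described above, in Python rather than the extension's own code. The function name `align_on_h2` and the representation of units as plain strings are assumptions made for illustration; only the `^==[^=]` pattern and the "add blanks before if needed" rule come from the description.

```python
import re

# Pattern from the description: a unit starting with "==" but not "===".
H2 = re.compile(r"^==[^=]")


def align_on_h2(source_units, target_units):
    """Pad target_units with empty units so that the n-th h2 unit of the
    target sits at the same index as the n-th h2 unit of the source.

    Hypothetical helper: units are plain strings, splitting is untouched.
    """
    # Indexes of the source units that start with an h2 header.
    source_positions = [i for i, unit in enumerate(source_units) if H2.match(unit)]

    aligned = []
    headers_seen = 0
    for unit in target_units:
        if H2.match(unit) and headers_seen < len(source_positions):
            wanted = source_positions[headers_seen]
            # "Adding blanks before if needed": insert empty units until
            # this header reaches the index of its source counterpart.
            while len(aligned) < wanted:
                aligned.append("")
            headers_seen += 1
        aligned.append(unit)
    return aligned


source = ["Intro", "More intro", "==A==\ntext", "==B=="]
target = ["Intro-t", "==A-t==\ntext", "==B-t=="]
print(align_on_h2(source, target))
```

This sketch only ever adds blanks on the target side, so a target with more units than the source before a given header stays misaligned and is left for manual fixing.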
Comment 1 Pratik Lahoti 2014-06-05 12:36:46 UTC
The left-hand-side source units are not under the control of Special:PageMigration, are they? So if the source unit looks like this:

==Section==
Text text text

The corresponding target unit also needs to contain the section header as well as the text, as would have been the case at Special:Translate.

I personally feel that introducing new lines (if not present) after section headers while preparing the page for translation (step 1) would improve the alignment. In the examples I tested, this was the main reason for the mismatch. I think we could finish step 1 first, see how the alignment looks, and then work on getting the best alignment once both steps are ready :)
Comment 2 Nemo 2014-06-06 19:04:26 UTC
The two concerns are separate. You're always going to have past translations which don't follow the source text (or your assumptions) in their structure, including whitespace around headers.

If the success of one step of the process depends completely on the perfect success of another step, it will be hard to make any progress. This is just pass 1 of the alignment improvement process you wrote down at https://www.mediawiki.org/w/index.php?title=Extension:Translate/Mass_migration_tools/Design&oldid=988113 , «In the first pass, section headers can be covered. The flow would be to simply check for section headers present as translation units and get the corresponding section from the translation, assuming that all the sections are in the same order in both the text. [1]»
Comment 3 Nemo 2014-06-07 06:32:09 UTC
BPositive> what do you mean by "adding blanks before if needed"? :)

Adding an empty textarea/"unit".
Comment 4 Pratik Lahoti 2014-06-07 11:04:00 UTC
Alright, I am thinking about the approach you mentioned. But it would be great if https://gerrit.wikimedia.org/r/#/c/136334/ got merged. That would give me an array of translationUnits in which no single unit mixes section headers with other text. Once I have such an array, I could scan the sourceUnits and translationUnits and match up section headers in the order they appear. That way, there won't be a need to add an extra unit before/after.
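To illustrate the kind of array described above, here is a rough Python sketch that splits old translation text so that each h2 header becomes a unit of its own. What the linked Gerrit change actually does is not shown in this report; the function name `split_headers` and the line-based approach are assumptions.

```python
import re

# A whole line that is an h2 header, e.g. "==Section==" (but not "===Sub===").
H2_LINE = re.compile(r"^==[^=].*==$")


def split_headers(text):
    """Split wikitext into units so that no unit mixes an h2 header with
    other text. Hypothetical helper, not the extension's implementation."""
    units = []
    buffer = []

    def flush():
        # Emit the accumulated non-header lines as one unit, if any.
        body = "\n".join(buffer).strip()
        if body:
            units.append(body)
        buffer.clear()

    for line in text.splitlines():
        stripped = line.strip()
        if H2_LINE.match(stripped):
            flush()
            units.append(stripped)
        else:
            buffer.append(line)
    flush()
    return units


print(split_headers("==Section==\nText text text"))
```

With headers isolated like this, matching them to source headers in order of appearance becomes a simple scan over both arrays.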
Comment 5 Gerrit Notification Bot 2014-06-08 07:43:19 UTC
Change 138220 had a related patch set uploaded by BPositive:
Simplistic alignment based on h2 headers for Special:PageMigration

https://gerrit.wikimedia.org/r/138220
Comment 6 Nemo 2014-06-10 14:12:14 UTC
Note that in theory [[mw:API:Parse]] can be used to get a list of headers, e.g. for a full page https://www.mediawiki.org/w/api.php?action=parse&oldid=629558&prop=sections
As long as we stay simple that's probably not needed, but it would be if we need to do more, or more complex, things in a sane way.
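A small Python sketch of how the API:Parse route could look. The query parameters mirror the URL above, with `format=json` added; the helper names and the sample response are made up for illustration, and the response shape (a `parse.sections` list whose entries carry `level` and `line`) is assumed from the API's usual JSON output. No request is sent here.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.mediawiki.org/w/api.php"


def sections_url(oldid):
    """Build the action=parse URL that lists the sections of a revision."""
    query = {
        "action": "parse",
        "oldid": oldid,
        "prop": "sections",
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(query)


def h2_titles(response):
    """Return the titles of the level-2 sections from a decoded response.

    The level is compared as a string because the API is assumed to
    report it that way; str() also tolerates an integer.
    """
    sections = response.get("parse", {}).get("sections", [])
    return [s["line"] for s in sections if str(s.get("level")) == "2"]


# Hypothetical, hand-written response used only to exercise h2_titles().
sample = {
    "parse": {
        "sections": [
            {"level": "2", "line": "Installation", "index": "1"},
            {"level": "3", "line": "Requirements", "index": "2"},
            {"level": "2", "line": "Usage", "index": "3"},
        ]
    }
}
print(sections_url(629558))
print(h2_titles(sample))
```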
Comment 7 Gerrit Notification Bot 2014-07-17 12:39:37 UTC
Change 138220 merged by jenkins-bot:
Simplistic alignment based on h2 headers for Special:PageMigration

https://gerrit.wikimedia.org/r/138220
