Last modified: 2014-07-17 12:51:40 UTC
As of now, since splitting is done on '\n\n', some units contain a mix of headers and text when a newline is not present after the header in the old translations. For example:

Observed translation unit:
== Some header == Dummy content dummy content dummy content

Expected output:
Unit 1: == Some header ==
Unit 2: Dummy content dummy content dummy content

Thus, headers need to be split from the content in such cases to improve alignment.
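The splitting described above could be sketched roughly as follows. This is only an illustration in Python, not the code from the Translate extension or from the patch set; the names `HEADER_LINE` and `split_units` are hypothetical, and it assumes the header sits on its own line inside the unit, separated from the content by a single newline rather than a blank line.

```python
import re

# Hypothetical pattern for a wiki section header occupying a whole line,
# e.g. "== Some header ==" (the same number of "=" on both sides).
HEADER_LINE = re.compile(r"^(={1,6})[^=].*?\1[ \t]*$")


def split_units(text):
    """Split text on blank lines, then separate header lines from content."""
    units = []
    for chunk in text.split("\n\n"):
        buffer = []  # consecutive non-header lines of the current chunk
        for line in chunk.split("\n"):
            if HEADER_LINE.match(line):
                # Flush any content collected before the header.
                if buffer:
                    units.append("\n".join(buffer))
                    buffer = []
                # The header becomes a unit of its own.
                units.append(line.strip())
            else:
                buffer.append(line)
        if buffer:
            units.append("\n".join(buffer))
    # Drop units that are empty or whitespace-only.
    return [u for u in units if u.strip()]
```

With this sketch, a unit made of a header line followed directly by a content line comes out as two units, while text that already has a blank line after the header is unaffected.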
Change 136326 had a related patch set uploaded by BPositive: Split headers from other wiki text in translation units at Special:PageMigration https://gerrit.wikimedia.org/r/136326
Change 136326 abandoned by BPositive: Split headers from other wiki text in translation units at Special:PageMigration Reason: The patch set should actually depend on https://gerrit.wikimedia.org/r/#/c/135750/ https://gerrit.wikimedia.org/r/136326
Change 136334 had a related patch set uploaded by BPositive: Split headers from other wiki text in translation units at Special:PageMigration https://gerrit.wikimedia.org/r/136334
The bug summary here may not be in sync with comment 0 and the associated patch. Anyway, lowering priority given the challenges encountered.
Changing the bug summary. Sorry for using the phrase "aligned correctly". I didn't mean it that way, since alignment goes hand in hand with the correctness of the source units at the time they were marked for translation. That is something I will be taking care of in Step 1 anyway: ensuring that a newline is present after the section header :)
(In reply to Pratik Lahoti from comment #5)
> Changing the bug summary. Sorry for using the phrase "aligned correctly".

OK, I think that was the summary of the bug I had asked you to file, but we had not understood each other on the meaning; I filed bug 66162 with a stub description of what I thought this bug was about.
Change 136334 merged by jenkins-bot: Split headers from other wiki text in translation units at Special:PageMigration https://gerrit.wikimedia.org/r/136334