Last modified: 2014-05-16 19:26:48 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T52316, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 50316 - Generate selser change assignments dynamically


Summary:	Generate selser change assignments dynamically

Status:	NEW

Product:	Parsoid
Classification:	Unclassified
Component:	General (Other open bugs)
Version:	unspecified
Hardware:	All All

Importance:	Normal enhancement
Target Milestone:	---
Assigned To:	Gabriel Wicke

URL:
Whiteboard:
Keywords:

Duplicates:	49222
Depends on:
Blocks:

Reported:	2013-06-27 19:24 UTC by Gabriel Wicke
Modified:	2014-05-16 19:26 UTC
CC List:	4 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Gabriel Wicke 2013-06-27 19:24:39 UTC

* Create selser change assignments dynamically. Currently we rely on an external file that needs to be updated manually. The generation is actually already deterministic with a seeded PRNG (seed is the test title), and the overhead of dynamic generation was around 1 second for a full 60-second test run IIRC. So drop the external file and always generate assignments on the fly.

* Speed up selser change assignments. Currently we generate & test for duplicates. We are really generating permutations, which can be done much more quickly.

* Remember the output of failing (blacklisted) tests and fail if that output changes. We have many tests where our output is actually correct, but due to limitations in the test setup the test is still failing. This can be a difference from the PHP parser output, or something like comparing to wt2wt output in selser testing, which expects normalization of attribute quoting etc. By failing when blacklisted test output changes we can still catch regressions in our behavior for these tests. We'll also see improvements that are not quite enough to make the tests pass yet. Rewriting the blacklist is easy enough and documents the changes in failing test output along with the commit.
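
The first bullet relies on assignments being reproducible from the test title alone. A minimal sketch of that idea follows. It is not Parsoid's actual test-runner code: `hashTitle`, `mulberry32` and `generateAssignment` are hypothetical names, the hash and PRNG are stand-ins chosen for brevity, and the assignment is a flat list rather than the nested per-DOM-node structure that appears in the blacklist.

```javascript
// Hypothetical sketch: none of these names come from Parsoid's test runner.

// Hash a title to a 32-bit unsigned seed (FNV-1a style, over UTF-16 code units).
function hashTitle(title) {
  let h = 0x811c9dc5;
  for (let i = 0; i < title.length; i++) {
    h ^= title.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// mulberry32: a small deterministic PRNG returning floats in [0, 1).
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Pick one change code per node. The same title always yields the same list,
// so nothing needs to be stored in an external file.
function generateAssignment(title, nodeCount, codes) {
  const rand = mulberry32(hashTitle(title));
  const assignment = [];
  for (let i = 0; i < nodeCount; i++) {
    assignment.push(codes[Math.floor(rand() * codes.length)]);
  }
  return assignment;
}
```

Calling `generateAssignment(title, 11, [2, 3, 4])` twice with the same title returns identical lists, which is the property that makes the external file redundant.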

Comment 1 C. Scott Ananian 2013-07-29 17:10:36 UTC

It would probably be worth fixing bug 50982 first, while you can easily see the empty selser changes in the output file. Subbu thinks these might be the tests "without wt2wt parsoid option ... would be good to at least verify/confirm that hypothesis."

Comment 2 Arlo Breault 2013-07-29 18:52:09 UTC

Re: comment 1. Unfortunately, that doesn't appear to be the case. A counterexample is "Parsoid only: Quote balancing context should be ..." which has the options "parsoid=wt2html,wt2wt".

Comment 3 Gerrit Notification Bot 2013-07-31 01:21:45 UTC

Change 76870 had a related patch set uploaded by Arlolra:
Generate selser change assignments dynamically.

https://gerrit.wikimedia.org/r/76870

Comment 4 Gerrit Notification Bot 2013-08-01 16:15:51 UTC

Change 76870 merged by jenkins-bot:
Generate selser change assignments dynamically.

https://gerrit.wikimedia.org/r/76870

Comment 5 ssastry 2013-08-02 21:54:42 UTC

The 3rd bullet point in the bug description is actually bug 51718. We still need to work out the best approach for it: the technique outlined in that bullet, or something else.
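
For reference, the technique from that bullet could look roughly like the sketch below. It assumes the blacklist is a `Map` from test title to the last recorded failing output; that representation, the function name `checkTest` and the result strings are all hypothetical and not taken from Parsoid.

```javascript
// Hypothetical sketch: the Map-based blacklist and all names are assumptions.
// Classifies one test result against the expected output and the blacklist.
function checkTest(blacklist, title, actual, expected) {
  if (actual === expected) {
    // A blacklisted test that now passes should be removed from the blacklist.
    return blacklist.has(title) ? "unexpected-pass" : "pass";
  }
  if (!blacklist.has(title)) {
    return "unexpected-fail";
  }
  // Known failure: only accept it if the failing output is unchanged.
  return blacklist.get(title) === actual ? "expected-fail" : "changed-fail";
}
```

Under this scheme a run would be treated as failed on `unexpected-fail` and `changed-fail`, so both regressions and partial improvements in blacklisted tests become visible.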

Comment 6 Gabriel Wicke 2013-08-16 18:11:47 UTC

With generate & test our assignments are not guaranteed to be exhaustive, which might be relevant for bug 52139. It might be worth moving to direct permutation generation instead, as that should also make change generation faster.

Comment 7 Arlo Breault 2013-09-21 21:04:22 UTC

gwicke: In what way are these permutations? From the blacklist,

add("selser", "Non-word characters don't terminate tag names (bug 17663, 40670, 52022) [[3],3,[3],3,4,3,4,4,4,2,3]");

this just looks like combinations with replacements. Given that there are 11 numbers between 2 and 4 inclusive, you'd have to generate 3^11 changes, rather than 20 random ones. That doesn't seem faster.
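
The arithmetic behind that estimate: each of the 11 slots independently takes one of 3 values, so the number of ordered assignments is 3^11. A small sketch to check the figure (the function name is illustrative only):

```javascript
// Number of ways to fill `slots` positions when each position
// independently takes one of `codeCount` values.
function countAssignments(codeCount, slots) {
  let total = 1;
  for (let i = 0; i < slots; i++) {
    total *= codeCount;
  }
  return total;
}

console.log(countAssignments(3, 11)); // 177147
```

That is 177147 candidate assignments for this one test, compared with the 20 random ones mentioned above.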

Comment 8 Gabriel Wicke 2013-09-21 21:40:12 UTC

Deterministic generation will be faster than random generate & test, as the latter often produces duplicates which are then filtered out. Keep in mind that we try to generate a random assignment up to 1000 times, even if there are only a handful of possible permutations in a small test. Once those few permutations have been found, the extra attempts just generate more duplicates.

I agree that we'll need to limit the number of permutations we generate for large test cases. That means that generating all permutations with the current assignments won't be possible. On the bright side, there is a chance that we can get away with fewer permutations without really losing test coverage. As an example, case 2 (node insertion before current node) and case 4 (child node insertion) can result in the same actual change, so should probably be collapsed when that happens. Similarly, new node insertion is very similar to attribute changes for selser processing: the full 'outerwikitext' needs to be serialized in both cases. Let's discuss the possible cases and think about which cases need to be handled.
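
As a starting point for that discussion, direct generation with a cap could be sketched as below. This is an illustration under simplifying assumptions, not a proposal for the actual implementation: assignments are flat lists, every slot draws from the same set of codes, and `enumerateAssignments` is a hypothetical name.

```javascript
// Hypothetical sketch; flat assignments only, names are assumptions.
// Yields distinct assignments in counting order, at most `limit` of them.
function* enumerateAssignments(codes, slots, limit) {
  const digits = new Array(slots).fill(0); // index into `codes` per slot
  let produced = 0;
  while (produced < limit) {
    yield digits.map((d) => codes[d]);
    produced++;
    // Increment the counter, treating the last slot as least significant.
    let i = slots - 1;
    while (i >= 0 && ++digits[i] === codes.length) {
      digits[i] = 0;
      i--;
    }
    if (i < 0) {
      return; // counter wrapped around: every assignment has been produced
    }
  }
}
```

For a small test, `Array.from(enumerateAssignments([2, 3, 4], 2, 1000))` stops after the 9 possible assignments instead of spending further attempts on duplicates. For a large test the cap applies, but plain counting order only varies the last few slots, so a capped run would still need some way of spreading its picks across the whole space.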

Comment 9 Gerrit Notification Bot 2013-09-25 04:00:18 UTC

Change 85952 had a related patch set uploaded by Arlolra:
WIP: Remember the output of failing (blacklisted) tests

https://gerrit.wikimedia.org/r/85952

Comment 10 Arlo Breault 2013-09-27 20:01:56 UTC

*** Bug 49222 has been marked as a duplicate of this bug. ***

Comment 11 Gerrit Notification Bot 2013-10-01 22:34:01 UTC

Change 85952 merged by jenkins-bot:
Remember the output of failing (blacklisted) tests

https://gerrit.wikimedia.org/r/85952

Comment 12 Andre Klapper 2014-02-12 15:58:57 UTC

Gabriel: All patches merged months ago - is there more work left here, or can you close this ticket as RESOLVED FIXED?

Comment 13 ssastry 2014-02-12 18:41:41 UTC

We have a good workable solution for now, but I think Gabriel had the enhancement idea of generating selser tests by going through permutations. Gabriel: do you want to create a different enhancement ticket for it and close this one?

Comment 14 Andre Klapper 2014-03-13 11:39:42 UTC

(In reply to ssastry from comment #13)
> We have a good workable solution for now, but I think Gabriel had the
> enhancement idea of generating selser tests by going through permutations.
> Gabriel: do you want to create a different enhancement ticket for it and
> close this one?

Gabriel: ping?

Comment 15 Gabriel Wicke 2014-05-16 19:26:48 UTC

Let's keep using this bug, but reclassify it as an enhancement.

Our selser test coverage can be improved further. Generating permutations systematically still seems to be a promising candidate solution for doing so.
