Last modified: 2014-02-08 17:06:04 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the bug reports and their history, links might be broken. See T18719, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 16719 - Math hashes should include versioning to allow sensible updates
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Math
Version: unspecified
Hardware: All All
Importance: Low enhancement with 1 vote
Target Milestone: ---
Assigned To: physikerwelt
Keywords: patch, patch-need-review, schema-change
Depends on:
Blocks: 1347 6722 10434 11663 14825 15057 16573 24445
Reported: 2008-12-20 00:12 UTC by Brion Vibber
Modified: 2014-02-08 17:06 UTC (History)
CC List: 10 users

See Also:
Web browser: ---
Mobile Platform: ---
Assignee Huggle Beta Tester: ---


Attachments
get per-command hash changes with texvc (2.40 KB, patch)
2010-04-03 19:18 UTC, Conrad Irwin

Description Brion Vibber 2008-12-20 00:12:07 UTC
Currently, if there's been a behavior change in texvc which affects rendering, there's basically no way to re-render the image / HTML for some given input that's been previously rendered.

This makes it very difficult to clean up after bugs. :(

A couple of possibilities:

1) Embed a version number into the input and output hashes; bump the version number on any breaking change. Old entries will just not get used anymore... but with no garbage collection we'll end up doubling our disk usage for each version. :P

2) Embed a version number into the input hash, but *not* the output hash. Update files and purge from squids when they change. May require users to do a force-reload sometimes to see the new file. [Also may have problems with our current caching system for math.]
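To make the difference between the two options concrete, here is a minimal sketch. It assumes MD5 hex digests and a hypothetical `TEXVC_VERSION` counter; the function names are invented for illustration and are not taken from texvc or the Math extension.

```python
import hashlib

# Hypothetical counter, bumped on any breaking rendering change.
TEXVC_VERSION = 2


def input_hash(tex: str, version: int = TEXVC_VERSION) -> str:
    """Hex digest of version plus TeX source (used by both options)."""
    return hashlib.md5(f"{version}:{tex}".encode("utf-8")).hexdigest()


def output_hash_versioned(rendered_tex: str, version: int = TEXVC_VERSION) -> str:
    """Option 1: the output name changes on every version bump."""
    return hashlib.md5(f"{version}:{rendered_tex}".encode("utf-8")).hexdigest()


def output_hash_unversioned(rendered_tex: str) -> str:
    """Option 2: the output name stays stable, so the file is overwritten."""
    return hashlib.md5(rendered_tex.encode("utf-8")).hexdigest()
```

Under option 1 every bump yields a new file name for every formula, which is where the disk growth comes from. Under option 2 the file name is unchanged, which is why caches would need purging.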

While we're at it, it wouldn't hurt to change the hash fields from raw binary to hex, which is much easier to work with. :P

[Also consider plotting garbage collection, though...]
Comment 1 Conrad Irwin 2010-04-03 15:39:59 UTC
I suggest a middle way:

a) add a field to the math table for texvc version (can be done with the BIN -> HEX change) but don't change the input hash at all. The version only needs to be updated when behaviour of a command changes, as error messages aren't cached.

b) change the output hash only if the output PNG may have changed (i.e. add a helper function changed_on() to texvc, like the tex_use_ams() stuff).

This avoids filling the disk with lots of duplicate images, and some easy analysis of the maths table will allow for further garbage collection when necessary.

It may be necessary to insert some retro-active changed_on()s, or to just invalidate all images once, to fix bugs currently there. (Or provide users with a method they can use to purge broken math images)
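A rough sketch of the changed_on() idea in (b), written in Python rather than texvc's own language. The `CHANGED_ON` table, its entries, and the function names are hypothetical and are not taken from the attached patch; the point is only that the output hash moves when an affected command appears in the input.

```python
import hashlib
import re

# Hypothetical table: command -> texvc version in which its rendering last changed.
CHANGED_ON = {r"\sqrt": 3, r"\overline": 2}


def last_change(tex: str) -> int:
    """Highest version at which any command used in `tex` changed (0 if none)."""
    commands = re.findall(r"\\[A-Za-z]+", tex)
    return max((CHANGED_ON.get(c, 0) for c in commands), default=0)


def output_hash(tex: str) -> str:
    """Output hash that moves only when an affected command is present."""
    return hashlib.md5(f"{last_change(tex)}:{tex}".encode("utf-8")).hexdigest()
```

Formulas that use no affected command keep their old output hash, so their images are overwritten in place instead of duplicated.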
Comment 2 Brion Vibber 2010-04-03 18:14:22 UTC
(In reply to comment #1)
> I suggest a middle way:
> 
> a) add a field to the math table for texvc version (can be done with the BIN ->
> HEX change) but don't change the input hash at all. The version only needs to
> be updated when behaviour of a command changes, as error messages aren't
> cached.
> 
> b) change the output hash only if the output PNG may have changed (i.e. add a
> helper function changed_on() to texvc, like the tex_use_ams() stuff).
> 
> This avoids filling the disk with lots of duplicate images, and some easy
> analysis of the maths table will allow for further garbage collection when
> necessary.

Hmm... so the logic on parsing <math> would go roughly:

* calculate the input hash
* fetch 'math' table record
 - if no record, run texvc and save the new info into record
 - if record lists old version, run texvc and save the new info into record
 - if record lists current version, do nothing
* return the HTML/MathML/img from the record, depending on output format
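The steps above can be sketched as follows. This is only an illustration under stated assumptions: the 'math' table is modelled as a plain dict, `run_texvc` is a stand-in that returns fake HTML, and `CURRENT_VERSION` and `MathRecord` are hypothetical names.

```python
import hashlib
from dataclasses import dataclass

CURRENT_VERSION = 2  # hypothetical texvc version counter


@dataclass
class MathRecord:
    version: int
    html: str


def run_texvc(tex: str) -> str:
    """Stand-in for invoking texvc; returns fake HTML."""
    return f"<span class='texhtml'>{tex}</span>"


def render(tex: str, table: dict) -> str:
    # Calculate the input hash (left unversioned, as in comment 1).
    key = hashlib.md5(tex.encode("utf-8")).hexdigest()
    # Fetch the 'math' table record.
    record = table.get(key)
    # No record, or a record from an old version: run texvc and save the result.
    if record is None or record.version < CURRENT_VERSION:
        record = MathRecord(CURRENT_VERSION, run_texvc(tex))
        table[key] = record
    # A record at the current version is returned untouched.
    return record.html
```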

After each texvc upgrade, this would cause a re-run of texvc for each unique <math>...</math> contents as they're encountered in wiki page parsing.

If the tweak for output hash is clever enough, this would save new versions of actually affected math bits -- cache-safe due to the new filename -- while non-affected math bits would save over the old file but not look any different, so no caching issues there.

Sounds pretty good to me!

Do we know how to adjust the output hashing only when particular commands are in use?
Comment 3 Conrad Irwin 2010-04-03 19:18:24 UTC
Created attachment 7261
get per-command hash changes with texvc

(In reply to comment #2)

> * calculate the input hash
> * fetch 'math' table record
>  - if no record, run texvc and save the new info into record
>  - if record lists old version, run texvc and save the new info into record
>  - if record lists current version, do nothing
> * return the HTML/MathML/img from the record, depending on output format
> 

I was originally planning to leave the old rows in the table, so that a maintenance script would be able to pick up old versions of re-rendered files and delete them when they are superseded. It may not be worth the cost of doubling the size of the math table - I'll leave that as Wikimedia's call.

> 
> Sounds pretty good to me!
> 
> Do we know how to adjust the output hashing only when particular commands are
> in use?

Patch attached :).
Comment 4 Sumana Harihareswara 2012-02-17 19:03:39 UTC
I'm sorry for the delay in response, Conrad.  We're working on reducing our backlog of unreviewed commits and patches, since there's been such a wait.  :-(  Thanks for the patch.  If you have time in the next couple of weeks, it would be great if you could check to make sure your patch still cleanly applies to MediaWiki as it is in our Subversion trunk.  I'll try to get a reviewer soon!

Thanks.
Comment 5 physikerwelt 2014-02-08 17:06:04 UTC
Rerendering can be forced with ?action=purge&mathpurge=true
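
As a small illustration, the purge URL for a page could be built like this. The `index.php` location and the page title are placeholders; only the `action=purge` and `mathpurge=true` parameters come from the comment above.

```python
from urllib.parse import urlencode


def math_purge_url(index_php: str, title: str) -> str:
    """Build a URL that purges a page and forces its math to be re-rendered."""
    query = urlencode({"title": title, "action": "purge", "mathpurge": "true"})
    return f"{index_php}?{query}"


# Placeholder wiki location and page title.
print(math_purge_url("https://wiki.example.org/w/index.php", "Pi"))
```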
