Last modified: 2014-09-23 19:54:28 UTC

Bug 8648 - DidYouMean extension submitted for comment and testing
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: DidYouMean
Hardware: All All
Importance: Low enhancement with 2 votes
Target Milestone: ---
Assigned To: Andrew Dunbar
Keywords: patch, patch-reviewed
Depends on:
Blocks: 12329
Reported: 2007-01-16 02:45 UTC by Andrew Dunbar
Modified: 2014-09-23 19:54 UTC
CC List: 2 users


source for DidYouMean extension (13.14 KB, application/x-bzip2)
2007-01-16 02:52 UTC, Andrew Dunbar
DidYouMean extension diff for mainline code (1.00 KB, patch)
2007-01-16 03:50 UTC, Andrew Dunbar
DidYouMean diff for the extension itself (46.61 KB, patch)
2007-01-16 03:51 UTC, Andrew Dunbar
DidYouMean extension diff for mainline code (593 bytes, patch)
2007-01-27 09:25 UTC, Andrew Dunbar
DidYouMean diff for the extension itself (46.78 KB, patch)
2007-01-27 09:26 UTC, Andrew Dunbar
extension diff with changes suggested by Brion (46.95 KB, patch)
2007-02-02 02:10 UTC, Andrew Dunbar
extension diff with changes suggested by Tim Starling (29.87 KB, patch)
2007-02-09 00:56 UTC, Andrew Dunbar
Fixed extension diff (30.11 KB, patch)
2007-02-09 04:49 UTC, Andrew Dunbar

Description Andrew Dunbar 2007-01-16 02:45:14 UTC
DidYouMean is designed for the English Wiktionary to automate the use of the
{{see}} template there which links articles whose titles differ only by
capitalisation, use of diacritics, spaces, hyphenation, apostrophes, etc.

It adds two metadata tables which are maintained by hooks in all places where
articles can be created, renamed, or deleted. Metadata is kept only for
non-redirects in the main namespace.

A list of links to "similar" articles is added to all article pages in view
mode and also to the 'nogomatch' and 'noarticletext' pages.
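
The kind of title matching described above can be sketched as a normalisation key: two titles count as "similar" when their keys are equal. This is only an illustration in Python, not the extension's PHP code; the function names normalize_title and group_similar are hypothetical, and the exact set of characters the extension ignores is an assumption.

```python
import unicodedata


def normalize_title(title: str) -> str:
    """Hypothetical normalisation key; the extension's real rules may differ."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case, then drop spaces, hyphens and apostrophes (assumed set).
    folded = stripped.casefold()
    return "".join(ch for ch in folded if ch not in " -'\u2019")


def group_similar(titles):
    """Map each normalised key to the list of titles that share it."""
    groups = {}
    for title in titles:
        groups.setdefault(normalize_title(title), []).append(title)
    return groups


if __name__ == "__main__":
    sample = ["Polish", "polish", "résumé", "resume", "rock-'n'-roll", "rock n roll"]
    for key, members in group_similar(sample).items():
        print(key, members)
```

Storing the key next to each title in a metadata table would let the "similar" list for a page be fetched with a single equality lookup on the key.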
Comment 1 Andrew Dunbar 2007-01-16 02:52:12 UTC
Created attachment 3075
source for DidYouMean extension

source for DidYouMean extension
Comment 2 Andrew Dunbar 2007-01-16 03:50:14 UTC
Created attachment 3076
DidYouMean extension diff for mainline code

Hooks for 'noarticletext' and SpecialUndelete
Comment 3 Andrew Dunbar 2007-01-16 03:51:14 UTC
Created attachment 3077
DidYouMean diff for the extension itself

The code for the extension and its installer
Comment 4 Connel MacKenzie 2007-01-16 05:20:08 UTC
Since (and presumably others) have [Appendix:Names] and all
those name entries, were you planning on adding any other name-oriented
normalizing to this?  Or is SOUNDEX the next phase?
Comment 5 Andrew Dunbar 2007-01-16 07:11:18 UTC
Handling appendices would require parsing whole pages which is more complex than
just parsing the {{see}} template.

Soundex turned out to be a lot more promiscuous than I expected. It seemed to
only take into account the first part of each word, resulting in enormous lists
of matches for each word, many of them not as alike as you'd expect.
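
For illustration, here is a sketch of classic four-character American Soundex in Python. Which Soundex implementation was actually tested is not stated in this report, so treat this as an assumption; it does show why long words that share a beginning collapse onto one code.

```python
# Letter groups of classic American Soundex.
SOUNDEX_CODES = {}
for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                       ("L", "4"), ("MN", "5"), ("R", "6")):
    for letter in letters:
        SOUNDEX_CODES[letter] = digit


def soundex(word: str) -> str:
    """Return the first letter plus up to three digits, padded with zeros."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    previous = SOUNDEX_CODES.get(letters[0], "")
    for letter in letters[1:]:
        digit = SOUNDEX_CODES.get(letter, "")
        if digit and digit != previous:
            code += digit
            if len(code) == 4:
                break  # everything after this point is ignored
        if letter not in "HW":
            # Vowels reset the previous code; H and W do not.
            previous = digit
    return code.ljust(4, "0")


if __name__ == "__main__":
    for word in ("international", "interpretation", "intervention", "internet"):
        print(word, soundex(word))
```

All four sample words encode to I536, because the code is complete after "inter" and the rest of each word never contributes.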

Metaphone should be better but I couldn't get the library to work in the account
you gave me.

I'd been thinking about anagrams and textonyms next but a) they are
language-dependent, and b) they require parsing and replacing whole sections of
articles which as often as not are not in any well-defined format.
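
As a rough sketch of what such keys could look like (in Python, with hypothetical function names, and assuming the basic Latin alphabet and the usual English phone keypad, which is exactly where the language dependence comes in):

```python
# Standard English phone keypad layout (an assumption; other languages differ).
KEYPAD = {}
for digit, letters in (("2", "abc"), ("3", "def"), ("4", "ghi"), ("5", "jkl"),
                       ("6", "mno"), ("7", "pqrs"), ("8", "tuv"), ("9", "wxyz")):
    for letter in letters:
        KEYPAD[letter] = digit


def anagram_key(word: str) -> str:
    """Words are anagrams of each other when their sorted letters match."""
    return "".join(sorted(ch for ch in word.casefold() if ch.isalpha()))


def textonym_key(word: str) -> str:
    """Words are textonyms when they map to the same keypad digit sequence."""
    return "".join(KEYPAD.get(ch, "") for ch in word.casefold())


if __name__ == "__main__":
    print(anagram_key("listen"), anagram_key("silent"))
    print(textonym_key("home"), textonym_key("good"), textonym_key("gone"))
```

Computing the keys is the easy part; the difficulty raised here is writing the results back into article sections that have no well-defined format.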

Another idea is to scan all redlinks and possibly blue links except that they
won't have canonical casing and there is no easy way to sort the wheat from the
chaff akin to ignoring redirects in article space.
Comment 6 Connel MacKenzie 2007-01-16 07:15:30 UTC
Well, I meant for the resulting main namespace entries, not taking apart the
Appendices themselves.
Comment 7 Rotem Liss 2007-01-16 07:36:27 UTC
Please add to the CC list when you assign the bugs.
Comment 8 Rob Church 2007-01-18 08:21:33 UTC
First impressions are that this is quite a neat little extension and could have
great potential use. The "did you mean" message itself needs to be more
obtrusive - think coloured boxes - it's almost invisible on a search results page.
Comment 9 Andrew Dunbar 2007-01-18 11:35:23 UTC
Thanks Rob. The idea was that on the English Wiktionary it will just look like
what we've already been doing for ages without all the manual labour. Once it's
out there people should modify it to do something bigger on the search page, and
maybe not ignore redirects for Wikipedia like it does for Wiktionary.
Comment 10 Andrew Dunbar 2007-01-27 09:25:17 UTC
Created attachment 3144
DidYouMean extension diff for mainline code

* Fixed return value at 'noarticletext'
* Use new hook in SpecialUndelete instead of my own
Comment 11 Andrew Dunbar 2007-01-27 09:26:25 UTC
Created attachment 3145
DidYouMean diff for the extension itself

* Fix broken installer
* Use new SpecialDelete hook instead of my own
Comment 12 Andrew Dunbar 2007-02-02 02:10:18 UTC
Created attachment 3168
extension diff with changes suggested by Brion

Added table prefix in .sql file
Added addQuotes and tableName calls to constructed queries
Comment 13 Andrew Dunbar 2007-02-09 00:56:00 UTC
Created attachment 3197
extension diff with changes suggested by Tim Starling

* All functions and variables are now prefixed with wfDym-
* The database lookup is now done inside the parser hook
Comment 14 Andrew Dunbar 2007-02-09 04:49:30 UTC
Created attachment 3198
Fixed extension diff

Fixed a regression that slipped in.
Comment 15 Brion Vibber 2007-02-09 06:04:19 UTC
Committed the current version to extensions in r19837 to make it a little easier
to work with updates while testing.
Comment 16 Brion Vibber 2008-03-28 23:16:15 UTC
A few notes on current state of the extension...

* Should use update hooks so the table can get installed by standard update.php
* install.php should be replaced with a script that simply allows rebuilding the normalization entries
* 'see also' bits embedded into pages won't be automatically updated when the page is already cached. For cache-correctness, it'll need to look up affected pages on addition/removal of normalization entries and schedule them for purges (and, possibly, link refresh)
* It's hardcoded for particular English templates, which seems a bit icky.

In general I'm not too comfortable with the way it messes about with the text of pages as they're parsed. A totally separate 'similar pages' UI component might be cleaner. *shrug*
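
The cache-correctness point in the notes above can be sketched as follows: whenever a normalization entry is added or removed, every other page sharing its key has a stale "see also" list and needs a purge. This is a Python illustration only; the class SimilarIndex and its methods are hypothetical, and actually scheduling the purges (a MediaWiki job in practice) is left to the caller.

```python
from collections import defaultdict


class SimilarIndex:
    """Hypothetical in-memory stand-in for the normalization table."""

    def __init__(self, key_func):
        self._key = key_func              # maps a title to its normalised key
        self._by_key = defaultdict(set)   # key -> titles sharing that key

    def add(self, title):
        """Record a title; return existing pages whose cached output is now stale."""
        key = self._key(title)
        stale = set(self._by_key[key])
        stale.discard(title)
        self._by_key[key].add(title)
        return stale

    def remove(self, title):
        """Forget a title; return remaining pages whose cached output is now stale."""
        key = self._key(title)
        self._by_key[key].discard(title)
        stale = set(self._by_key[key])
        if not self._by_key[key]:
            del self._by_key[key]
        return stale


if __name__ == "__main__":
    index = SimilarIndex(str.casefold)  # case folding as a stand-in key function
    index.add("Polish")
    print("purge after add:", index.add("polish"))
    print("purge after remove:", index.remove("Polish"))
```

Returning the affected titles rather than purging directly keeps the lookup separate from whatever purge or link-refresh mechanism is used.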
Comment 17 p858snake 2011-04-30 00:10:12 UTC
*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*
Comment 18 Sumana Harihareswara 2011-12-23 18:00:55 UTC
Marking "reviewed" as the extension has been reviewed by Brion in comment 16.
Comment 19 Sumana Harihareswara 2012-11-16 22:02:11 UTC
I've removed DidYouMean from until the author responds to comment 16.
Comment 20 Andre Klapper 2014-02-28 15:00:54 UTC
Andrew Dunbar: Resetting the assignee and status of this issue because there has been no progress in the last years. Feel free to take it again when you are actually planning to fix this. Thanks.
