Last modified: 2012-04-26 03:02:14 UTC

Bug 9900 - Duplicate rows in externallinks table
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: Database
Version: 1.20.x
Hardware/OS: All All
Importance: Low normal with 1 vote
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://it.wikipedia.org/w/index.php?t...
Depends on:
Blocks: 16660
Reported: 2007-05-13 18:21 UTC by Broken Arrow
Modified: 2012-04-26 03:02 UTC (History)
CC List: 3 users


Description Broken Arrow 2007-05-13 18:21:38 UTC
The externallinks table may contain duplicate rows, even if the link is present
only once in the page text. Editing the page does not remove the stale entries
on the live site; running refreshLinks.php on a local copy does.

The page above is only one of several examples. Some of the affected pages on
it.wp include Acanthocalycium, Fegato, Elezione_incondizionata, etc.
Comment 1 MZMcBride 2012-04-26 02:27:34 UTC
Is this still a problem?
Comment 2 Liangent 2012-04-26 02:39:25 UTC
(In reply to comment #1)
> Is this still a problem?

It seems there are still a bunch.

$ echo 'select el_from, el_to, count(*) c from externallinks group by el_from, el_to having c > 1;' | sql itwiki_p > bug9900

http://toolserver.org/~liangent/-/dbq/bug9900
Comment 3 Liangent 2012-04-26 02:40:22 UTC
545907 rows in set (2 min 45.47 sec)
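
The query in comment 2 can be tried on a toy table. The following is only a minimal sketch: it uses Python's built-in sqlite3 module instead of the itwiki_p replica, models just the two columns the query groups on, and the rows are invented for illustration.

```python
import sqlite3

# Toy stand-in for the externallinks table (assumption: only el_from and
# el_to are modelled; the real table has more columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE externallinks (el_from INTEGER, el_to TEXT)")
conn.executemany(
    "INSERT INTO externallinks (el_from, el_to) VALUES (?, ?)",
    [
        (1, "http://example.org/a"),
        (1, "http://example.org/a"),  # duplicate of the row above
        (1, "http://example.org/b"),
        (2, "http://example.org/a"),  # same URL on another page: not a duplicate
    ],
)


def find_duplicates(connection):
    """Return (el_from, el_to, count) for every pair stored more than once."""
    return connection.execute(
        "SELECT el_from, el_to, COUNT(*) AS c "
        "FROM externallinks "
        "GROUP BY el_from, el_to "
        "HAVING COUNT(*) > 1"  # same condition as 'having c > 1' in comment 2
    ).fetchall()


duplicates = find_duplicates(conn)
print(duplicates)
```

Only the (el_from, el_to) pair that occurs twice is reported; the same URL linked from a different page is a separate, legitimate row.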
Comment 4 MZMcBride 2012-04-26 02:54:03 UTC
Looking at tables.sql on Gerrit (<https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=maintenance/tables.sql;h=a848bf5eb469ce63b2693b4a392241c5eab76dd1;hb=HEAD>), we can see that the pagelinks, templatelinks, categorylinks, imagelinks, langlinks, and iwlinks tables all have a unique index. externallinks, however, has only the following indices, none of which is unique:

---
CREATE INDEX /*i*/el_from ON /*_*/externallinks (el_from, el_to(40));
CREATE INDEX /*i*/el_to ON /*_*/externallinks (el_to(60), el_from);
CREATE INDEX /*i*/el_index ON /*_*/externallinks (el_index(60));
---
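
To illustrate why a non-unique index does not prevent duplicate rows, here is a minimal sketch using Python's built-in sqlite3 module. It makes several assumptions: SQLite stands in for MySQL, the prefix lengths such as el_to(40) are dropped because SQLite has no prefix indices, and the names idx_el_from, externallinks_unique and idx_el_from_to_unique are hypothetical. It is not a proposed schema change; in MySQL, a unique index over a column prefix would constrain only that prefix.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
row = (1, "http://example.org/a")  # invented example row

# Non-unique index, loosely analogous to el_from in tables.sql
# (assumption: no prefix length, since SQLite does not support one).
conn.execute("CREATE TABLE externallinks (el_from INTEGER, el_to TEXT)")
conn.execute("CREATE INDEX idx_el_from ON externallinks (el_from, el_to)")
conn.execute("INSERT INTO externallinks VALUES (?, ?)", row)
conn.execute("INSERT INTO externallinks VALUES (?, ?)", row)  # accepted silently
count_plain = conn.execute("SELECT COUNT(*) FROM externallinks").fetchone()[0]

# Hypothetical variant with a unique index on the same column pair.
conn.execute("CREATE TABLE externallinks_unique (el_from INTEGER, el_to TEXT)")
conn.execute(
    "CREATE UNIQUE INDEX idx_el_from_to_unique "
    "ON externallinks_unique (el_from, el_to)"
)
conn.execute("INSERT INTO externallinks_unique VALUES (?, ?)", row)
try:
    conn.execute("INSERT INTO externallinks_unique VALUES (?, ?)", row)
    rejected = False
except sqlite3.IntegrityError:
    # The unique index refuses the second identical (el_from, el_to) pair.
    rejected = True
count_unique = conn.execute(
    "SELECT COUNT(*) FROM externallinks_unique"
).fetchone()[0]

print(count_plain, rejected, count_unique)
```

With the plain index both inserts succeed and the table holds two identical rows; with the unique index the second insert raises an integrity error and only one row is stored.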
