Last modified: 2012-04-19 21:42:59 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T23228, the corresponding Phabricator task, for complete and up-to-date bug report information.

Bug 21228 - Search and Replace is replacing an extra character for some words - Sinhala wiki


Summary:	Search and Replace is replacing an extra character for some words - Sinhala wiki

Status:	CLOSED FIXED

Product:	MediaWiki extensions
Classification:	Unclassified
Component:	UsabilityInitiative
Version:	unspecified
Hardware:	PC All

Importance:	Normal major
Target Milestone:	---
Assigned To:	Trevor Parscal

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:	36111

Reported:	2009-10-22 06:43 UTC by Calcey QA
Modified:	2012-04-19 21:42 UTC
CC List:	3 users

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---

Attachments
Screen print of the error (69.74 KB, application/pdf) 2009-10-22 06:43 UTC, Calcey QA

Description Calcey QA 2009-10-22 06:43:09 UTC

Created attachment 6699
Screen print of the error

Reporting against Babaco Release : r57957

Steps to Reproduce::
Link : http://prototype.wikimedia.org/si.wikipedia.org/%E0%B6%B8%E0%B7%94%E0%B6%BD%E0%B7%8A_%E0%B6%B4%E0%B7%92%E0%B6%A7%E0%B7%94%E0%B7%80

1) Select a random page
2) Edit a section
3) Select a word and enter a replacement word
4) Replace
<<An extra character is added>>

Expected Outcome::
No extra character should be added

Test Environment::
Browser (User-Agent):	Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US) AppleWebKit/532.0 (KHTML, like Gecko) Chrome/3.0.195.27 Safari/532.0

Browser (User-Agent):	Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)

Browser (User-Agent):	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3) Gecko/20090824 Firefox/3.5.3

Comment 1 Roan Kattouw 2009-10-22 13:18:30 UTC

My gut says this is probably due to a bad interaction between regexes and multibyte strings; if that's the case, we can't do much about it.

Basically what I think is happening is that the [^ ] part of the regex is selecting one byte, but the character at that position is really two (or more) bytes long. That one byte will be matched and replaced, but the second (and any subsequent) bytes will stick around and be interpreted as a different character. I'll try to confirm this suspicion later.
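
A minimal Python sketch of the failure mode suspected here. It is an illustration only: it assumes the text is handled as UTF-8 bytes, which this report does not confirm, and the sample letter and variable names are hypothetical.

```python
import re

# Hypothetical sample: the Sinhala letter U+0DB8, three bytes in UTF-8.
text = "\u0db8"
data = text.encode("utf-8")  # b'\xe0\xb6\xb8'

# A byte-oriented pattern: [^ ] consumes exactly one byte, not one character.
first_unit = re.compile(rb"[^ ]")

# Replace only the first match, as a single "replace" action would.
damaged = first_unit.sub(b"X", data, count=1)

# The leftover continuation bytes are not valid UTF-8 on their own, so a
# lenient decoder substitutes replacement characters for them.
decoded = damaged.decode("utf-8", errors="replace")
print(len(data), damaged, repr(decoded))
```

Only the lead byte is replaced; the remaining bytes of the character stay in the buffer, and how they are then displayed depends on the decoder.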

Comment 2 Roan Kattouw 2009-11-02 12:10:04 UTC

The suspicion in comment #1 doesn't seem to be right, so now I think this may have something to do with compound characters. Could you paste all texts from the PDF (textarea contents before, search regex, replace string, textarea contents after) in a bug comment?
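
One way compound characters could produce this symptom can be sketched in Python. This is an assumption for illustration, since the thread never states the root cause. The sample word and the helper name `clusters` are hypothetical, and the grouping is a simplification that ignores conjuncts joined with a zero-width joiner.

```python
import re
import unicodedata

# Hypothetical sample: base letter U+0DB8 followed by vowel sign U+0DD4,
# which together render as one visible character.
word = "\u0db8\u0dd4"

# [^ ] matches a single code point, so only the base letter is replaced
# and the vowel sign is left behind.
naive = re.sub(r"[^ ]", "X", word, count=1)

def clusters(s):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in s:
        # Unicode general categories Mn, Mc and Me are combining marks.
        if out and unicodedata.category(ch).startswith("M"):
            out[-1] += ch
        else:
            out.append(ch)
    return out

# Replacing a whole cluster leaves no orphaned mark.
parts = clusters(word)
safe = "".join(["X"] + parts[1:])
print(repr(naive), repr(safe))
```

The naive result still contains the vowel sign, which would then attach to whatever precedes it and show up as an unexpected extra character.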

Comment 3 Trevor Parscal 2010-01-26 00:34:42 UTC

The underlying search and replace code is completely different now that we are using an iframe rather than a textarea.

Comment 4 Roan Kattouw 2010-01-26 13:44:09 UTC

(In reply to comment #3)
> The underlying search and replace code is completely different now that we are
> using an iframe rather than a textarea.

That doesn't necessarily mean that multibyte character handling is magically fixed. Reopening and asking Calcey to try and reproduce again; please close as FIXED or WORKSFORME if this can't be reproduced any more.

Comment 5 Trevor Parscal 2010-01-26 20:20:06 UTC

I've tested this with double-byte characters quite a bit now, and am sure it's fixed.

Comment 6 Platonides 2010-01-26 20:50:46 UTC

Note that Sinhala seems to be using three-byte characters.
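
The byte counts can be checked directly. A short Python sketch, using a hypothetical sample letter; the Sinhala block spans U+0D80 to U+0DFF, a range that UTF-8 encodes in three bytes per code point.

```python
# Hypothetical sample: the Sinhala letter U+0DB8.
letter = "\u0db8"

utf8_len = len(letter.encode("utf-8"))       # size in bytes as UTF-8
utf16_len = len(letter.encode("utf-16-le"))  # size in bytes as UTF-16

print(utf8_len, utf16_len)
```

If the code under test operates on UTF-8 bytes, a test that uses only two-byte text may not exercise the three-byte case.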

Comment 7 Calcey QA 2010-01-27 09:03:46 UTC

Verified and closed
