Last modified: 2014-07-11 20:11:06 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the pages that display bug reports and their history, links might be broken. See T59669, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 57669 - Better sentence handling needed
Status: NEW
Product: MediaWiki extensions
Classification: Unclassified
Component: TextExtracts
Version: master
Hardware: All All
Importance: Unprioritized normal with 4 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Duplicates: 67841
Reported: 2013-11-27 16:45 UTC by Max Semenik
Modified: 2014-07-11 20:11 UTC

Comment 1 Bingle 2013-11-27 16:47:21 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/mobile/cards/1460
Comment 2 Luigi Assom 2013-12-01 21:48:54 UTC
H. P. Lovecraft: Against the World, Against Life
http://en.wikipedia.org/w/api.php?format=jsonfm&action=query&pageids=17545993&prop=extracts&exsentences=2&exintro&explaintext

Here I got a truncated sentence, because the dots in the name cause the text to be cut off early.
Comment 3 Luigi Assom 2013-12-02 11:21:08 UTC
I have more comments on this bug.
This is just a guess, not yet tested on my local machine. Sorry, I am new to wikimedia-dev, and the process for reproducing the bug is still not clear to me.

My guess is that the API truncates the text by simply counting the dots ('.').

If so, a quick improvement might be this check:
if the character before the dot is a capital letter (or a word formed by a capital letter), do not cut there and truncate at the next dot instead.
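
To make the guess above concrete, here is a minimal Python sketch. It is not the TextExtracts code, which has not been inspected here. The function names `cut_naive` and `cut_skipping_initials` and the sample text are made up for illustration, and "a word formed by a capital letter" is read as a single capital letter standing alone, such as an initial.

```python
def cut_naive(text: str, count: int) -> str:
    """Cut after the count-th dot. This is the behaviour the comment guesses at."""
    end = 0
    for _ in range(count):
        pos = text.find(".", end)
        if pos == -1:
            return text  # fewer dots than requested: return everything
        end = pos + 1
    return text[:end]


def cut_skipping_initials(text: str, count: int) -> str:
    """Same as cut_naive, but a dot right after a lone capital letter is not counted."""
    found = 0
    for i, ch in enumerate(text):
        if ch != ".":
            continue
        prev_is_capital = i >= 1 and text[i - 1].isupper()
        # "Lone" means the capital is not the last letter of a longer word.
        stands_alone = i < 2 or not text[i - 2].isalpha()
        if prev_is_capital and stands_alone:
            continue  # treat as an initial such as "H." and keep scanning
        found += 1
        if found == count:
            return text[: i + 1]
    return text


# Illustrative sample text, not the actual API output.
sample = "H. P. Lovecraft wrote horror fiction. He lived in Providence. He died in 1937."
print(cut_naive(sample, 2))              # H. P.
print(cut_skipping_initials(sample, 2))  # H. P. Lovecraft wrote horror fiction. He lived in Providence.
```

A known weakness of this heuristic is that a sentence that really does end in a single capital letter (for example "It was I.") would no longer be counted as a sentence end.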
Comment 4 Bingle 2013-12-04 18:49:29 UTC
Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/mobile/cards/1478
Comment 5 Dan Michael Heggø 2014-05-29 00:37:07 UTC
Abbreviations can also cause problems. Here is an example with "ca." for "circa" on the Norwegian Bokmål Wikipedia:

https://no.wikipedia.org/w/api.php?action=query&format=jsonfm&prop=extracts&exintro=true&exsentences=2&explaintext=true&titles=Leikanger%20kirke%20(Leikanger)
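
One possible way to handle cases like "ca." is to skip dots that follow a known abbreviation. The Python sketch below only illustrates that idea. The `ABBREVIATIONS` set, the function name `cut_skipping_abbreviations` and the sample text are all hypothetical, and a real list would have to be maintained per language.

```python
# Hypothetical and deliberately tiny list; a real one would be language-specific.
ABBREVIATIONS = frozenset({"ca", "etc", "dr", "mr"})


def cut_skipping_abbreviations(text: str, count: int,
                               abbreviations: frozenset = ABBREVIATIONS) -> str:
    """Cut after the count-th dot, ignoring dots that end a listed abbreviation."""
    found = 0
    for i, ch in enumerate(text):
        if ch != ".":
            continue
        # Walk back over the run of letters directly before the dot.
        start = i
        while start > 0 and text[start - 1].isalpha():
            start -= 1
        if text[start:i].lower() in abbreviations:
            continue  # abbreviation dot, not a sentence end
        found += 1
        if found == count:
            return text[: i + 1]
    return text


# Made-up sample text, not taken from the linked article.
sample = "The church was built ca. 1250. It is made of stone. It seats 200."
print(cut_skipping_abbreviations(sample, 2))
# The church was built ca. 1250. It is made of stone.
```

Passing an empty set as `abbreviations` gives plain dot counting again, which makes it easy to compare the two behaviours on the same text.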
Comment 6 Max Semenik 2014-07-11 16:55:23 UTC
*** Bug 67841 has been marked as a duplicate of this bug. ***
Comment 7 Laurence 'GreenReaper' Parry 2014-07-11 20:11:06 UTC
Per 67841, blanking out instances of the title before searching for a cutoff point would improve many of these cases.
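
The suggestion above could look roughly like the following Python sketch. It assumes the cutoff is found by counting dots, which is only a guess made earlier in this report, and the function name `cut_masking_title` and the sample text are made up for illustration.

```python
def cut_masking_title(text: str, title: str, count: int) -> str:
    """Find the cutoff on a copy of the text in which the title is blanked out."""
    # Replace each occurrence of the title with a placeholder of the same length,
    # so positions in the masked copy line up with positions in the original.
    masked = text.replace(title, "_" * len(title)) if title else text
    end = 0
    for _ in range(count):
        pos = masked.find(".", end)
        if pos == -1:
            return text  # fewer dots than requested: return everything
        end = pos + 1
    # Slice the original text, not the masked copy.
    return text[:end]


# Made-up sample text, not the actual API output.
title = "H. P. Lovecraft: Against the World, Against Life"
sample = title + " is the title used in this sample. A second sentence follows. A third one is dropped."
print(cut_masking_title(sample, title, 2))
# H. P. Lovecraft: Against the World, Against Life is the title used in this sample. A second sentence follows.
```

This only helps when the extract repeats the title verbatim. Dots in other names or abbreviations in the text would still be counted.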
