Last modified: 2014-05-12 02:49:46 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and for historical purposes. It is not possible to log in and except for displaying bug reports and their history, links might be broken. See T61641, the corresponding Phabricator task for complete and up-to-date bug report information.

Bug 59641 - Implement a reasonably elegant and non-labor-intensive means of describing/summarizing pages


Summary:	Implement a reasonably elegant and non-labor-intensive means of describing/su...

Status:	NEW

Product:	MediaWiki extensions
Classification:	Unclassified
Component:	Extensions requests (Other open bugs)
Version:	master
Hardware:	All All

Importance:	Lowest enhancement (vote)
Target Milestone:	---
Assigned To:	Nobody - You can work on this!

URL:
Whiteboard:
Keywords:

Duplicates:	5335 (view as bug list)
Depends on:
Blocks:	56604

Reported:	2014-01-04 08:39 UTC by Nathan Larson
Modified:	2014-05-12 02:49 UTC (History)
CC List:	4 users (show)

See Also:
Web browser:	---
Mobile Platform:	---
Assignee Huggle Beta Tester:	---


Description Nathan Larson 2014-01-04 08:39:54 UTC

It's handy to have a means of summarizing or describing page contents, so as to generate meta description tags, blurbs for inclusion in feeds, etc. Several approaches have been tried:
1) Grabbing the first x characters of an article without regard to where sentences cut off (e.g. [[mw:Extension:Blurb]] or [[mw:Extension:TextExtracts]])
2) Using a template, e.g. {{PageSummary|'''[[Humility]]''' is a psychological state, that is the opposite of [[dominance]]. |Humility allows one to see the intrinsic value of others (as opposed to only extrinsic value), and is therefore the largest factor of [[empathy]]. A person with humility therefore sees minors as having intrinsic value, as contrasted with being objects of domination, which they are mostly regarded as being by the laws and practices of the status quo. Like dominance, humility is an innate psychological trait.}} See docs at http://childwiki.net/wiki/Template:PageSummary . This is implemented by [[mw:Extension:BedellPenDragon]]. Note that there are two parameters here: parameter #1 for the first sentence of the lead and parameter #2 for the remainder of the lead.
3) Adding/modifying the description by means of a separate text box ([[mw:Extension:Advanced_Meta]]) or separate page ([[mw:Extension:ExplicitDescription]]) from the article text or Wikidata.
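
As a rough illustration of approach 1 above, the sketch below cuts a lead at a fixed character count, plus a small variation that backs up to the last space so no word is split. The function names and the sample text are made up for this example; this is not the code of any of the extensions mentioned.

```python
def naive_blurb(text: str, limit: int) -> str:
    """Keep the first `limit` characters, ignoring word and sentence ends."""
    return text[:limit]


def word_boundary_blurb(text: str, limit: int) -> str:
    """Like naive_blurb, but back up to the last space so no word is split."""
    if len(text) <= limit:
        return text
    cut = text[:limit]
    space = cut.rfind(" ")
    # If there is no usable space, fall back to the hard cut.
    return cut[:space] if space > 0 else cut


# Hypothetical sample lead, for illustration only.
lead = "Humility is a psychological state that is the opposite of dominance."
print(naive_blurb(lead, 20))
print(word_boundary_blurb(lead, 20))
```

Either variant can still stop in the middle of a sentence, which is the drawback noted for this approach.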

Ideally, we could implement a feature to automatically grab the first sentence of the lead; however, it's hard for software to detect the ends of sentences, since punctuation marks such as the period can appear in the middle of sentences ("Afterward, Mr. Brown went to the U.S. District Courthouse . . . and when he came back, everyone was gone.")
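
To make the difficulty concrete, here is a minimal heuristic sketch, not a proposed implementation. It treats a period, exclamation mark or question mark as a sentence end only if it is followed by whitespace and an uppercase letter or digit (or by the end of the text), and only if the preceding word is not in a small abbreviation list. The list and the function name are assumptions made up for this example; a real solution would need per-language data.

```python
import re

# Hypothetical, deliberately tiny abbreviation list (lowercase, without the
# trailing period). A real list would be far longer and language-specific.
ABBREVIATIONS = {"mr", "mrs", "ms", "dr", "prof", "st", "u.s", "e.g", "i.e", "etc"}


def first_sentence(text: str) -> str:
    """Return the text up to the first mark that looks like a sentence end."""
    for match in re.finditer(r"[.!?]", text):
        end = match.end()
        rest = text[end:]
        # Mid-text marks must be followed by whitespace and then an
        # uppercase letter or digit (optionally after a quote or bracket).
        if rest and not re.match(r"\s+[\"'(\[]?[A-Z0-9]", rest):
            continue
        if match.group() == ".":
            words = text[:match.start()].split()
            last = words[-1].lower().lstrip("\"'([") if words else ""
            # Skip periods that belong to a known abbreviation.
            if last in ABBREVIATIONS:
                continue
        return text[:end].strip()
    # No plausible sentence end found: return everything.
    return text.strip()


sample = ("Afterward, Mr. Brown went to the U.S. District Courthouse . . . "
          "and when he came back, everyone was gone. Then it rained.")
print(first_sentence(sample))
```

This handles the example sentence, but it still fails on inputs such as a sentence that genuinely ends with "U.S." or an abbreviation missing from the list, which is why the problem is hard in general.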

If you have any ideas on the best way to do this, feel free to post them. Thanks.

Comment 1 Bartosz Dziewoński 2014-01-04 17:35:35 UTC

MobileFrontend implemented that for one of their APIs:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts&format=json&exlimit=1&exintro=&explaintext=&titles=Barack_Obama

It has supposedly since been migrated to https://www.mediawiki.org/wiki/Extension:TextExtracts.
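
For reference, the sandbox link above corresponds to an ordinary API query string. The sketch below only builds that URL and makes no network request. The parameters are copied from the link; the `api.php` endpoint and the function name are assumptions for illustration.

```python
from urllib.parse import urlencode

# Assumption: the standard English Wikipedia API endpoint.
API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_extract_url(title: str) -> str:
    """Build a query URL asking for the plain-text intro of one page."""
    params = {
        "action": "query",
        "prop": "extracts",
        "format": "json",
        "exlimit": 1,
        "exintro": "",      # flag parameter: only the lead section
        "explaintext": "",  # flag parameter: plain text instead of HTML
        "titles": title,
    }
    return API_ENDPOINT + "?" + urlencode(params)


print(build_extract_url("Barack_Obama"))
```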

Comment 2 Andre Klapper 2014-02-24 21:19:40 UTC

*** Bug 5335 has been marked as a duplicate of this bug. ***

Comment 3 Max Semenik 2014-05-12 02:49:46 UTC

(In reply to Nathan Larson from comment #0)
> Ideally, we could implement a feature to automatically grab the first
> sentence of the lead; however, it's hard for software to detect the ends of
> sentences, since punctuation marks such as the period can appear in the
> middle of sentences ("Afterward, Mr. Brown went to the U.S. District
> Courthouse . . . and when he came back, everyone was gone.")
> 
> If you have any ideas on the best way to do this, feel free to post them.
> Thanks.

For TextExtracts, that would be bug 57669. And yeah, any insights on better sentence handling would be highly appreciated:)
