Last modified: 2013-11-11 16:18:20 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from displaying bug reports and their history, links might be broken. See T55566, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 53566 - Pages should have metadata about their correct title (capitalisation, special characters, etc)
Status: NEW
Product: MediaWiki
Classification: Unclassified
Component: General/Unknown
Version: 1.22.0
Hardware: All All
Importance: Low enhancement
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks:
Reported: 2013-08-29 21:31 UTC by Chris McKenna
Modified: 2013-11-11 16:18 UTC (History)


Description Chris McKenna 2013-08-29 21:31:04 UTC
As discussed at bug 50452 comment 9, it would be very useful for a page to have associated metadata about its title.

For most pages, and by default*, this would be identical to the all-lowercase rendering of the stored title. Where pages correctly start with a lowercase letter (on wikis where the first character is case-insensitive), contain special characters, conflict with interwikis/namespaces, etc., the metadata would record the correct page title. It could also be used for titles that should be italicised.
e.g. on en.wp:
Page Title         - Correct Title
[[IPad]]           - iPad
[[Benzo(a)pyrene]] - Benzo[a]pyrene
[[Pilot No. 5]]    - Pilot #5
[[D Ream]]         - D:Ream
[[Computer]]       - computer [indicating a common noun]
[[Amy Studt]]      - Amy Studt [indicating the title is a proper noun]
[[Animal Farm]]    - <i>Animal Farm</i> [italicised proper noun, the tags are not literal]

The displayed title would be taken from this metadata and as such would effectively supersede the DISPLAYTITLE magic word and the [[template:correct title]] family of templates (and the equivalents on other wikis).

VisualEditor and any other tools that aid linking would be able to read the metadata and use it to determine what the default displayed name should be when linking to that article.

*Actually, the default should be configurable: I guess languages like German that capitalise common nouns may want the default to be capitalised, and the upper/lower case distinction probably doesn't make sense in all scripts.
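
A minimal sketch of the lookup described above, using the en.wp examples from the table. The names `CORRECT_TITLES`, `lowercase_default`, `keep_stored_default` and `correct_title` are hypothetical and do not exist in MediaWiki. The sketch only illustrates "explicit metadata where present, otherwise a per-wiki configurable default"; italics are left out.

```python
from typing import Callable, Dict

# Hypothetical per-page metadata, keyed by stored page title.
CORRECT_TITLES: Dict[str, str] = {
    "IPad": "iPad",
    "Benzo(a)pyrene": "Benzo[a]pyrene",
    "Pilot No. 5": "Pilot #5",
    "D Ream": "D:Ream",
    "Amy Studt": "Amy Studt",
}


def lowercase_default(stored_title: str) -> str:
    """Default suggested in the description: the all-lowercase rendering."""
    return stored_title.lower()


def keep_stored_default(stored_title: str) -> str:
    """Alternative default a wiki might configure: keep the stored capitalisation."""
    return stored_title


def correct_title(
    stored_title: str,
    metadata: Dict[str, str] = CORRECT_TITLES,
    default_rule: Callable[[str], str] = lowercase_default,
) -> str:
    """Return the recorded title if there is one, else apply the wiki's default rule."""
    if stored_title in metadata:
        return metadata[stored_title]
    return default_rule(stored_title)
```

With these assumptions, `correct_title("Computer")` falls through to the default rule, while `correct_title("IPad")` uses the recorded value. A wiki that wants capitalised defaults could pass `default_rule=keep_stored_default`.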
Comment 1 Bawolff (Brian Wolff) 2013-08-30 20:02:19 UTC
I'm confused. Is this asking to move {{DISPLAYTITLE:...}} out of wikitext?

Displaytitle is a form of metadata, and it gets stored in the db separately, and can be queried separately, etc.
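
As an illustration of "can be queried separately", here is a sketch that builds an action API request for the `displaytitle` page property and reads it out of a response. It assumes the `prop=pageprops` query module with `ppprop=displaytitle` and the default JSON layout keyed by page ID. The helper names are hypothetical, and the sketch performs no network access.

```python
from typing import Optional
from urllib.parse import urlencode


def displaytitle_query_url(api_base: str, page_title: str) -> str:
    """Build a query URL asking only for the displaytitle page property."""
    params = {
        "action": "query",
        "prop": "pageprops",
        "ppprop": "displaytitle",
        "titles": page_title,
        "format": "json",
    }
    return f"{api_base}?{urlencode(params)}"


def extract_displaytitle(response: dict) -> Optional[str]:
    """Return the first displaytitle found in a decoded response, or None."""
    pages = response.get("query", {}).get("pages", {})
    for page in pages.values():
        value = page.get("pageprops", {}).get("displaytitle")
        if value is not None:
            return value
    return None
```

A page without an explicit {{DISPLAYTITLE:...}} simply has no such property, which is the gap the later comments discuss: the stored value is an override, not a statement about every title.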
Comment 2 Chris McKenna 2013-09-04 08:33:34 UTC
According to James F at bug 50452 comment 9, "for this to be used in VisualEditor it would need to be a proper feature and not one hacked into a template".

The idea is not just to store what the title should be displayed as when it doesn't match the default, but to record what the title actually is in all cases. Presently there is afaik no way to tell that for example [[Parish ale]] is a common noun, [[Parish Walk]] is a proper noun and [[Parish Bar]] is a proper noun that should be italicised.
Comment 3 James Forrester 2013-09-09 22:43:40 UTC
(In reply to comment #1)
> I'm confused. Is this asking to move {{DISPLAYTITLE:...}} out of wikitext?
> 
> Displaytitle is a form of metadata, and it gets stored in the db separately,
> and can be queried separately, etc.

Chris has it right - it's effectively asking to implement DISPLAYTITLE properly, rather than DISPLAYTITLEIFNEEDEDTOBEANOVERRIDE.
Comment 4 Bawolff (Brian Wolff) 2013-09-09 23:22:10 UTC
> 
> The idea is not just to store what the title should be displayed as when it
> doesn't match the default, but to record what the title actually is in all
> cases. Presently there is afaik no way to tell that for example
> [[Parish ale]] is a common noun, [[Parish Walk]] is a proper noun and
> [[Parish Bar]] is a proper noun that should be italicised.

I don't think I properly understand the problem this is trying to solve. To be honest it sounds like a solution looking for a problem.

Basically what I'm asking:
*Does it really make sense to store what type of word the title is, instead of just how to display it? Mapping types of words -> how to display them sounds like something that would vary a lot by culture (or even between English wikis that have different traditions).
*What problem (Other than perhaps ideological) does moving the data out of templates actually solve?
Comment 5 Chris McKenna 2013-09-10 00:19:36 UTC
(In reply to comment #4)
> > 
> > The idea is not just to store what the title should be displayed as when it
> > doesn't match the default, but to record what the title actually is in all
> > cases. Presently there is afaik no way to tell that for example
> > [[Parish ale]] is a common noun, [[Parish Walk]] is a proper noun and
> > [[Parish Bar]] is a proper noun that should be italicised.
> 
> *Does it really make sense to store what type of word the title is, instead
> of just how to display it. 

Sorry, that was my poor explanation. For tools (such as VisualEditor) to offer a sensible default for the display of a link, the tool needs to know how the title should be displayed. Mid-sentence:
*The vicar drank some [[parish ale]] and declared it "rather good"
*The vicar competed in the 2013 [[Parish Walk]], raising money for the church roof.
*The vicar enjoyed listening to the ''[[Parish Bar]]'' album while driving.

At present only the last of these has need of a {{DISPLAYTITLE}} because the unitalicised format with an initial capital letter is correct when it appears as a page title. This tells us nothing about how it should be used in other contexts, and the DISPLAYTITLE for the third example tells us nothing about capitalisation mid-sentence. In other words we want to store the information about how the title should be used in all cases, not just when it doesn't match the default.
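
The three mid-sentence examples can be sketched as a small function that a linking tool might use to pick a default link. `TitleInfo` and `default_link` are hypothetical names, and the sketch assumes a wiki where the first character of a title is case-insensitive, as on en.wp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TitleInfo:
    page: str             # stored page title, e.g. "Parish ale"
    display: str          # form to use mid-sentence, e.g. "parish ale"
    italic: bool = False  # whether the title is italicised in running text


def default_link(info: TitleInfo) -> str:
    """Return the wikitext link a tool could offer by default mid-sentence."""
    if info.display == info.page:
        link = f"[[{info.page}]]"
    elif info.display[:1].upper() + info.display[1:] == info.page:
        # Only the first letter differs, so the display form links directly.
        link = f"[[{info.display}]]"
    else:
        # Anything else needs a piped link.
        link = f"[[{info.page}|{info.display}]]"
    return f"''{link}''" if info.italic else link
```

Under these assumptions the three sentences above get `[[parish ale]]`, `[[Parish Walk]]` and `''[[Parish Bar]]''` respectively, which is the information DISPLAYTITLE alone does not provide.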

It seems to me that there are three options for how to structure this metadata. 
The first is to simply store the format, e.g. "title: ''Parish Bar''" or "title: parish ale".
The second is to define classes of titles and how they are displayed and assign each article title to one such class. e.g. for Parish Walk "title class: proper noun" and for Parish Bar: "title class: musical album title"
The third option is to store a class (without that defining the display) and the display: "title: ''Parish Bar''; class: musical album title".

The classes and (associated) displays would need to be configurable per wiki for either the second or the third option to work.
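
To make the three options concrete, here is a sketch of one record type that can express all of them. `TitleMetadata`, `CLASS_DISPLAY` and `resolve_display` are hypothetical, and the class names and display rules are only the examples used in this comment, standing in for per-wiki configuration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical per-wiki configuration: title class -> wikitext display rule.
CLASS_DISPLAY: Dict[str, Callable[[str], str]] = {
    "common noun": lambda t: t[:1].lower() + t[1:],
    "proper noun": lambda t: t,
    "musical album title": lambda t: f"''{t}''",
}


@dataclass(frozen=True)
class TitleMetadata:
    display: Optional[str] = None      # set for options 1 and 3
    title_class: Optional[str] = None  # set for options 2 and 3


def resolve_display(
    stored_title: str,
    meta: TitleMetadata,
    class_display: Dict[str, Callable[[str], str]] = CLASS_DISPLAY,
) -> str:
    """Prefer an explicit display, then a class rule, then the stored title."""
    if meta.display is not None:
        return meta.display
    if meta.title_class is not None:
        return class_display[meta.title_class](stored_title)
    return stored_title
```

Option 1 sets only `display`, option 2 sets only `title_class` and relies on the configured rule, and option 3 sets both, so the class carries meaning without dictating the display.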

I've also realised that separate fields would be good for "title" and "disambiguator", e.g. for the article at [[Mercury (element)]]: "title: mercury"; "disambiguator: element"; "class: common noun" or for [[Wellington, Somerset]]: "title: Wellington"; "disambiguator: Somerset"; "class: place name"
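
For comparison, here is a sketch of what a tool has to do today without such fields: guess the split from the page title. `split_title` is a hypothetical helper covering only the two patterns in the examples above (trailing parentheses and a comma). Capitalisation is not handled, so [[Mercury (element)]] yields "Mercury" rather than "mercury".

```python
import re
from typing import NamedTuple, Optional


class SplitTitle(NamedTuple):
    title: str
    disambiguator: Optional[str]


# "Name (qualifier)" with the qualifier at the very end of the title.
_PAREN = re.compile(r"^(?P<title>.+?) \((?P<dis>[^()]+)\)$")


def split_title(page_title: str) -> SplitTitle:
    """Heuristically split a page title into title and disambiguator."""
    match = _PAREN.match(page_title)
    if match:
        return SplitTitle(match.group("title"), match.group("dis"))
    head, sep, tail = page_title.partition(", ")
    if sep:
        return SplitTitle(head, tail)
    return SplitTitle(page_title, None)
```

A heuristic like this misfires on titles that legitimately contain a comma or end in parentheses, which is an argument for storing the two fields explicitly.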

The advantage of storing this metadata is that it allows for a large amount of semantic information about the title which can be used not only for linking but potentially for customised display options and doubtless more that I haven't thought of.

> *What problem (Other than perhaps ideological) does moving the data out of
> templates actually solve?
I'm told that it needs to be moved out of templates for VE to support it. I don't know why.
More philosophically, I was under the impression that the long-term goal was to separate metadata from content?
Comment 6 Bawolff (Brian Wolff) 2013-09-10 00:44:36 UTC
>The advantage of storing this metadata is that it allows for a large amount of
>semantic information about the title which can be used not only for linking but
>potentially for customised display options and doubtless more that I haven't
>thought of.

That's certainly true. 
-----

My initial reaction is that the user is probably in a better position to decide how to capitalize/italicize the title in a given context than software would be, but if such formatting of titles is consistent, I suppose it could make sense to automatically do it.

>I'm told that it needs to be moved out of templates for VE to support.

It should not need that to *use* the data. To effectively edit the displaytitle data, it may need that.

>I don't
>know why.
>More philosophical, but I was under the impression that the long term goal was
>to separate metadata from content?

Some folks have that goal. Personally, I think that doing so would reduce the power of MediaWiki significantly, although my opinion may be a minority one. (And probably off-topic for this bug report.)
