Last modified: 2014-11-17 10:36:04 UTC

Bug 1911 - Request: properties in articles (structured data)
Status: RESOLVED FIXED
Product: MediaWiki extensions
Classification: Unclassified
Component: Extensions requests
Version: unspecified
Hardware: All All
Importance: Low enhancement with 5 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
URL: http://en.wikipedia.org/wiki/Wikipedi...
Duplicates: 10295
Depends on: 30345
Blocks: 2980
Reported: 2005-04-17 15:19 UTC by john
Modified: 2014-11-17 10:36 UTC (History)

Description john 2005-04-17 15:19:40 UTC
I propose that articles should be allowed to have named properties.  
For example, in an article on an actor, the actor's first and last 
names, date of birth, films, roles etc. could all be marked up.

A possible syntax could be:
{{Prop:FirstName=Marlon}} {{Prop:Surname=Brando}}
which would render simply as
Marlon Brando
(if extensively used, a more compact syntax should be introduced).
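
A minimal sketch of how such markup might be processed, assuming the {{Prop:Name=Value}} syntax proposed above. The names PROP_RE and render_and_extract are hypothetical and exist only for illustration; nothing here is an existing MediaWiki API.

```python
import re

# Hypothetical pattern for the proposed {{Prop:Name=Value}} markup.
PROP_RE = re.compile(r"\{\{Prop:(\w+)=([^{}|]*)\}\}")


def render_and_extract(wikitext):
    """Return (rendered_text, properties) for simple, visible properties."""
    properties = {}

    def replace(match):
        name, value = match.group(1), match.group(2)
        properties[name] = value
        return value  # a visible property renders as its bare value

    rendered = PROP_RE.sub(replace, wikitext)
    return rendered, properties


text, props = render_and_extract(
    "{{Prop:FirstName=Marlon}} {{Prop:Surname=Brando}}"
)
print(text)   # Marlon Brando
print(props)  # {'FirstName': 'Marlon', 'Surname': 'Brando'}
```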

Providing such machine-readable properties would greatly enhance the 
ability to create lists automatically and to search, sort and 
cross-reference entries. For example, one could search on
"Category:Actor Prop:FirstName=John"
which would return all actors called John.  The Prop: namespace 
would contain articles describing exactly what each property means 
and what its valid values are.
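
To illustrate the kind of search this would enable, here is a rough sketch over an in-memory index. The index layout, the sample pages and the search function are all assumptions made for the example; a real implementation would query the wiki database instead.

```python
# Hypothetical in-memory index: page title -> categories and properties.
# The sample entries are illustrative only.
pages = {
    "John Wayne": {"categories": {"Actor"}, "props": {"FirstName": "John"}},
    "Marlon Brando": {"categories": {"Actor"}, "props": {"FirstName": "Marlon"}},
    "John Grisham": {"categories": {"Novelist"}, "props": {"FirstName": "John"}},
}


def search(pages, category, **wanted):
    """Return sorted titles in `category` whose properties match `wanted`."""
    return sorted(
        title
        for title, page in pages.items()
        if category in page["categories"]
        and all(page["props"].get(k) == v for k, v in wanted.items())
    )


# Equivalent of the query "Category:Actor Prop:FirstName=John".
result = search(pages, "Actor", FirstName="John")
print(result)  # ['John Wayne']
```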

This request is similar to the request at bug #1775, but it is more 
maintainable because the existing information in the article is 
marked up and so there is no need to keep it in more than one place.  
Additionally, it allows for properties and their values to be 
documented.

It should also be possible to create hidden properties, e.g.
{{Prop:ShortDescription=Marlon Brando, Jr. was an American actor who 
is widely regarded as the greatest film actor of the twentieth 
century|}}.  
Note the pipe at the end of the text, indicating that the value of 
the property should not be rendered.  
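
A small sketch of how the trailing pipe could be honoured when rendering, assuming the marker is a single "|" directly before the closing braces. The pattern, the function name and the SortKey property are hypothetical.

```python
import re

# Hypothetical pattern: an optional "|" before "}}" marks the property hidden.
HIDDEN_AWARE_RE = re.compile(r"\{\{Prop:(\w+)=([^{}|]*)(\|?)\}\}")


def render_with_hidden(wikitext):
    """Render visible properties as their value; drop hidden ones from output."""
    properties = {}

    def replace(match):
        name, value, hidden = match.groups()
        properties[name] = value          # hidden values are still recorded
        return "" if hidden else value    # but they produce no visible text

    return HIDDEN_AWARE_RE.sub(replace, wikitext), properties


rendered, found = render_with_hidden(
    "{{Prop:FirstName=Marlon}} {{Prop:Surname=Brando}}"
    "{{Prop:SortKey=Brando, Marlon|}}"
)
print(rendered)          # Marlon Brando
print(found["SortKey"])  # Brando, Marlon
```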

It would be necessary for properties to be containable in links so 
that the property value could itself be a link or part of one.

Finally, it might make sense to allow compound properties, e.g.
{{Prop:Role|film=The Godfather|part=Vito Corleone|year=1972}}
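
As a sketch, a compound property of this shape could be split into a name and a set of named fields. This assumes the invocation is closed with two braces and that field values contain no "|" characters; parse_compound is a hypothetical helper.

```python
def parse_compound(markup):
    """Parse '{{Prop:Name|key=value|...}}' into (name, fields)."""
    prefix, suffix = "{{Prop:", "}}"
    if not (markup.startswith(prefix) and markup.endswith(suffix)):
        raise ValueError("not a property invocation")
    inner = markup[len(prefix):-len(suffix)]
    name, *parts = inner.split("|")
    # Keep only well-formed key=value parts; split on the first "=" only.
    fields = dict(part.split("=", 1) for part in parts if "=" in part)
    return name, fields


name, fields = parse_compound(
    "{{Prop:Role|film=The Godfather|part=Vito Corleone|year=1972}}"
)
print(name)    # Role
print(fields)  # {'film': 'The Godfather', 'part': 'Vito Corleone', 'year': '1972'}
```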
Comment 1 Scot Wilcoxon 2005-07-07 16:58:38 UTC
Templates can be created which would allow entry of such information.  Searching
and data standards with template implementation are other issues.
Comment 2 Boris Povazay 2005-09-03 10:24:20 UTC
Don't you think that properties should rather be placed within 
categories?
This way the number of database lookups could be reduced, and properties 
inside an article could be treated as variables, dynamically looked up 
from the category.
The editor could feature a variable list and a property editor to edit 
properties in a centralized manner.
Templates would allow for sortable and limited tables, navigation 
bars, theme rings, or whatever else in a structured way, while 
keeping the text readable and maintainable. (There could be standard 
templates and user-implemented ones, with an option to turn them on or off 
on demand.)
Comment 3 Neville C. Dempsey 2005-09-05 13:35:00 UTC
There have been various discussions on Wikipedia about the use of such structured
data in a table enhancement:
http://en.wikipedia.org/wiki/Wikipedia:Categories_for_deletion/Log/2005_September_1#Enhancement_table_example_and_toolbar

NevilleDNZ
Comment 4 Boris Povazay 2005-09-07 20:27:06 UTC
There is another point in favour of concentrating properties in a category: not every typo in an article would 
trigger a completely new set of properties, and the properties within a group would converge. This would also resolve the 
problem of having extremely large sets of global properties and allow for multiple "namespaces". I do not want to 
link all the discussions that could be resolved with this feature, combined with calling these properties as variables 
within the article and other articles of this group (e.g. dynamic data lookups).
One could create self-updating tables, navigation bars (looking up the entries before and after the current one), 
overviews (with limited listings, sorted automatically by a property, with the name of the article also being one of 
these properties), timelines, theme rings, etc.
To summarize, we need: properties within categories, variables to access these properties (i.e. single-entry lookup) and 
SQL-like table lookups with templates.
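
A rough sketch of the navigation-bar idea (looking up the entries before and after the current one), assuming the members of a category and their properties are already available as a dictionary. The titles, the Year property and the neighbours function are made up for the example.

```python
# Hypothetical category members, each with a "Year" property.
members = {
    "Film A": {"Year": 1972},
    "Film B": {"Year": 1954},
    "Film C": {"Year": 1979},
}


def neighbours(members, title, sort_prop):
    """Return (previous, next) titles when members are sorted by sort_prop."""
    ordered = sorted(members, key=lambda t: members[t][sort_prop])
    i = ordered.index(title)
    previous = ordered[i - 1] if i > 0 else None
    following = ordered[i + 1] if i + 1 < len(ordered) else None
    return previous, following


print(neighbours(members, "Film A", "Year"))  # ('Film B', 'Film C')
print(neighbours(members, "Film B", "Year"))  # (None, 'Film A')
```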
Comment 5 Antoine "hashar" Musso (WMF) 2005-09-15 20:00:49 UTC
You might want to search for "wikidata" in the mailing list archives,
there was some discussion about keywords some time ago.

http://www.google.com/search?hl=en&q=wikidata+site%3Amail.wikimedia.org
http://www.google.com/search?hl=en&q=wikidata+site%3Amail.wikipedia.org
Comment 6 john 2005-09-15 21:43:20 UTC
There are some good reasons not to have properties in categories:
 1) Having the properties as marked-up text means that the information appears in only one place.
    e.g. {{Prop:FirstName=John}} {{Prop:LastName=Grisham}} 
         (born {{Prop:Birthday=[[February 8]], [[1955]]}})
    If the birthday is incorrect, it is only necessary to correct it in one place, rather than several.
 2) Suppose an article is in multiple categories, e.g. American novelists, Thriller writers, People from Arkansas.
    If we associate properties with categories, many of the same properties will have
    to be repeated across these categories e.g. firstname.  This would be messy and annoying to maintain.
    Using global properties there is no need to repeat this information.
 3) It leaves no standard place to document the properties - documenting them on the category page would be messy and 
would also lead to duplication of a lot of documentation.

However, to address some of the problems raised above:
  1) You could associate a list of properties with a category (using a very similar syntax)
    and print a warning if a page in that category does not have all the required properties, e.g. "Warning: this 
article is in the 'American novelists' category, it should have a 'FirstName' property."
  2) An attempt to give a value to an unknown property (e.g. due to a typo) should also give a warning or error when 
the page is saved/previewed.
  3) Whilst there is a danger in having a global set of properties, there are already means for disambiguating 
overloaded article names.  The page for a property would provide sufficient description and examples to make the 
meaning of each property quite clear.
  4) If it was necessary to treat a set of properties as a group for some reason (e.g. to disambiguate them all at 
once) then a compound property could be used.  
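
Points 1 and 2 above could be sketched roughly as follows. The REQUIRED and KNOWN tables are hypothetical stand-ins for whatever the category and property pages would declare, and check_page is an illustrative name only.

```python
# Hypothetical data: properties each category expects, and all known properties.
REQUIRED = {"American novelists": {"FirstName", "LastName"}}
KNOWN = {"FirstName", "LastName", "Birthday"}


def check_page(categories, props):
    """Return warnings for missing required and unknown properties."""
    warnings = []
    for category in sorted(categories):
        missing = REQUIRED.get(category, set()) - set(props)
        for name in sorted(missing):
            warnings.append(
                f"Warning: this article is in the '{category}' category, "
                f"it should have a '{name}' property."
            )
    for name in sorted(set(props) - KNOWN):
        warnings.append(f"Warning: unknown property '{name}'.")
    return warnings


# 'Birtday' is a deliberate typo to trigger the unknown-property warning.
for line in check_page({"American novelists"}, {"LastName": "Grisham", "Birtday": "1955"}):
    print(line)
```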

Additional idea - it should be possible to call a template but pass in the name of a page whose properties should be 
set as parameters to the template.  In the model/view programming model, the article would be the model and the 
template would be the view.  So to create an infobox in a page, pass the current page properties to the infobox 
template.
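
The model/view idea might look something like this sketch, where Python's string.Template stands in for a wiki template and a plain dictionary stands in for the page's properties. The infobox layout is invented for the example.

```python
from string import Template

# Hypothetical infobox "view"; $-placeholders stand in for template parameters.
INFOBOX = Template("Name: $FirstName $LastName\nBorn: $Birthday")

# The "model": properties extracted from the article itself.
page_props = {
    "FirstName": "John",
    "LastName": "Grisham",
    "Birthday": "February 8, 1955",
}

# safe_substitute leaves a placeholder untouched if the page lacks a property.
infobox_text = INFOBOX.safe_substitute(page_props)
print(infobox_text)
```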

Finally, if adopted, a more compact syntax than the one I suggested previously should be used e.g.
{{#FirstName=Marlon}} {{#LastName=Brando}}
Comment 7 john 2005-09-15 22:08:29 UTC
One more thing - the page for a property could itself have properties
indicating the type of the property, valid values for the property, etc. so that
properties could be validated on entry.  

For example in the English page for the 'Birthday' property you could have:
{{#Type=date}}.

or in the page for the 'Country' property, you could have

Valid values are:
*{{#Value=Afghanistan}}
*{{#Value=Albania}}
*{{#Value=Algeria}}
etc.

Then if you attempt to save a page with an invalid property e.g.

Born in {{#Country=Aghanistan}}, this great man...

You would get a warning (not an error):
'Aghanistan' is not a valid value for the 'Country' property.

The validation must be weakly enforced - so it would generate warnings rather than errors.
Weak enforcement allows correction of invalid properties to be carried out as a separate editing activity if 
necessary (as can currently be done for tidying up grammar or spelling).
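
A minimal sketch of weak enforcement, assuming the valid values have already been read from the property's own page into a dictionary. DEFINITIONS, validate and save_page are hypothetical names; the point is only that the save succeeds and the warnings travel with it.

```python
# Hypothetical property definitions, as if read from each property's page.
DEFINITIONS = {
    "Country": {"values": {"Afghanistan", "Albania", "Algeria"}},
}


def validate(props):
    """Return one warning per property whose value is not in its valid set."""
    warnings = []
    for name, value in props.items():
        allowed = DEFINITIONS.get(name, {}).get("values")
        if allowed is not None and value not in allowed:
            warnings.append(
                f"'{value}' is not a valid value for the '{name}' property."
            )
    return warnings


def save_page(props):
    """Weak enforcement: the save always succeeds; warnings ride along."""
    return {"saved": True, "warnings": validate(props)}


outcome = save_page({"Country": "Aghanistan"})
print(outcome["saved"])     # True
print(outcome["warnings"])  # ["'Aghanistan' is not a valid value for the 'Country' property."]
```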
Comment 8 Boris Povazay 2005-09-16 17:17:10 UTC
I am still wondering if local storage of property information is an efficient
way to resolve the problems.
In the example of the author, the main category would be "people", and
people have some very different properties from "mountains". The "American
novelists" would be a lookup of business="Novelist" and citizenship="United
States of America", both inheriting the properties of their parent. That's what
a semantic structure is good for.
Thereby no duplication takes place and no artificial categories are generated,
like category:"American novelists, originally immigrated from Poland, now living
in cities 200m above sea-level".
If it is implementable and not too hard to keep these lists (=categories,
indices) updated, local storage would be possible too. However, it seems that
the category (group) + variables (properties) creates a natural namespace for
articles that can be maintained more easily...
Comment 9 john 2005-09-16 20:25:06 UTC
We need to distinguish between the proposed logical design (syntax, functionality) and physical design 
(implementation, database structure).  Given the logical design I propose, we can choose the physical design to be 
efficient for whatever sort/search we want to offer i.e. we can extract the properties into database tables however 
we like.  If a common task is category specific sort/search then we can create category-specific tables, provided 
properties are associated with categories in the manner I suggested above.
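
One possible physical design, sketched with Python's built-in sqlite3 module: properties extracted from articles go into a single indexed table. The table name, columns and sample rows are assumptions for illustration, not a proposal for the actual MediaWiki schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table holding one row per (page, property) pair.
conn.execute("CREATE TABLE page_props (page TEXT, name TEXT, value TEXT)")
conn.execute("CREATE INDEX idx_name_value ON page_props (name, value)")

# Rows as they might be extracted from marked-up article text.
extracted = [
    ("John Grisham", "FirstName", "John"),
    ("John Grisham", "LastName", "Grisham"),
    ("Marlon Brando", "FirstName", "Marlon"),
]
conn.executemany("INSERT INTO page_props VALUES (?, ?, ?)", extracted)

rows = conn.execute(
    "SELECT page FROM page_props WHERE name = ? AND value = ? ORDER BY page",
    ("FirstName", "John"),
).fetchall()
print(rows)  # [('John Grisham',)]
```

Because the markup stays the logical source of truth, the table can be dropped and rebuilt in a different shape (for example one table per category) without touching any article.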

The idea of merging properties from different categories relies on these having no properties in common (what if 
someone is both an author and a politician - how do we merge conflicting 'name' properties?).  To resolve this, we 
would still need a global property set.  

Having global properties loosely associated with categories seems to give the best of both worlds: flexibility 
combined with the possibility of efficient implementation.
Comment 10 Boris Povazay 2005-09-17 18:17:16 UTC
I agree concerning the design; however, if properties can be inherited there
shouldn't be such a problem. Even the allowed countries could be defined within
the category the property is part of (i.e. the definition of the variable 'country
of birth' in the category:'famous people' looks like 'country of
birth'={category:country, lookup:name, planet=earth, sortby:name}).
There is no need for global properties that are lost in semantic space;
disambiguation and association are performed directly by choosing the category.
Thereby anyone who wants to look up a famous person immediately uses the correct
definition, and the new article can be sorted just like every other element.
Sorting it in might be just a job of selecting the right properties out of a list.
That's usually what librarians do, but here it could be a much more detailed
structure. One might for instance ask "Which U.S. presidents were born
in Michigan?" or "famous people born in Cardiff, Wales, U.K.?" in just one line.
Do you have an idea how one can do this without sacrificing the way
Wikipedia works? IMHO a dropdown list with the categories mentioned in the article
and their associated variables is better than a list of warnings.
Comment 11 denny vrandecic 2005-09-29 17:42:03 UTC
Please see the Semantic MediaWiki project (http://meta.wikimedia.org/wiki/Semantic_MediaWiki)
and compare it to your proposal.
Comment 12 Boris Povazay 2005-09-30 10:24:04 UTC
I would say it touches on some very interesting aspects; however, the implementation
of lists by template lookups of the ontology information has not even been
discussed yet.
Comment 13 Max Völkel 2006-05-19 09:34:24 UTC
Semantic MediaWiki 0.4 has built-in support to <ask> for lists. 
Comment 14 Gero Scholz 2007-03-06 22:04:44 UTC
Even without adding new "syntactic sugar" we can get more out of MediaWiki wikis if we use
a tool like DPL (DynamicPageList). With DPL you can generate lists of articles which
match certain criteria (e.g. belong to a category, use a certain template, contain a link
to a certain page, match a name pattern, reside in a certain namespace) and you can 
extract part of the contents, like chapters with a special heading, 
marked sections or template arguments (replacing the original template invocation with a
different template that you define).

I know it is not an answer to all questions and it doesn't compete with "real semantic wikis",
but it can be quite useful ...

see http://semeb.com/dpldemo
Comment 15 Aryeh Gregor (not reading bugmail, please e-mail directly) 2007-06-17 19:47:06 UTC
*** Bug 10295 has been marked as a duplicate of this bug. ***
Comment 16 Pavel Selitskas [wizardist] 2013-02-27 22:33:55 UTC
Can we transit this to Wikidata?
Comment 17 Ricordisamoa 2014-03-09 13:01:05 UTC
(In reply to Pavel Selitskas [wizardist] from comment #16)
> Can we transit this to Wikidata?

Yes, while I think that some details pertain to Semantic MediaWiki.
Comment 18 Kunal Mehta (Legoktm) 2014-09-16 21:38:44 UTC
Now that we have Wikidata, this is basically fixed. :)
