Last modified: 2014-11-17 10:36:04 UTC
I propose that articles should be allowed to have named properties. For example, in an article on an actor, the actor's first and last names, date of birth, films, roles etc. could all be marked up. A possible syntax could be: {{Prop:FirstName=Marlon}} {{Prop:Surname=Brando}} which would render simply as Marlon Brando (if extensively used, a more compact syntax should be introduced). Providing such machine-readable properties would greatly enhance the ability to create lists automatically and to search, sort and cross-reference entries. For example, one could search on "Category:Actor Prop:FirstName=John", which would return all actors called John. The Prop: namespace would contain articles describing exactly what each property means and what its valid values are. This request is similar to the request at bug #1775, but it is more maintainable because the existing information in the article is marked up and so there is no need to keep it in more than one place. Additionally, it allows for properties and their values to be documented. It should also be possible to create hidden properties, e.g. {{Prop:ShortDescription=Marlon Brando, Jr. was an American actor who is widely regarded as the greatest film actor of the twentieth century|}}. Note the pipe at the end of the text, indicating that the value of the property should not be rendered. It would be necessary for properties to be containable in links so that the property value could itself be a link or part of one. Finally, it might make sense to allow compound properties, e.g. {{Prop:Role|film=The Godfather|part=Vito Corleone|year=1972}}
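To make the proposed markup concrete, here is a minimal Python sketch of how simple properties might be extracted and rendered. It is only an illustration: the regular expression, the function name `extract_properties` and the shortened description value are my own assumptions, and it deliberately ignores nested links and compound properties.

```python
import re

# Hypothetical pattern for the proposed {{Prop:Name=Value}} markup.
# A trailing "|" before the closing braces marks the property as hidden.
PROP_RE = re.compile(r"\{\{Prop:(\w+)=([^{}|]*)(\|?)\}\}")


def extract_properties(wikitext):
    """Return (rendered_text, properties) for simple, non-nested properties."""
    properties = {}

    def replace(match):
        name, value, hidden = match.groups()
        properties[name] = value
        # Hidden properties are recorded but contribute nothing to the output.
        return "" if hidden else value

    rendered = PROP_RE.sub(replace, wikitext)
    return rendered, properties


text = (
    "{{Prop:FirstName=Marlon}} {{Prop:Surname=Brando}}"
    "{{Prop:ShortDescription=American actor|}}"
)
rendered, props = extract_properties(text)
print(rendered)  # Marlon Brando
print(props)
```

The visible properties render as plain text, while the hidden one is only available in the extracted dictionary.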
Templates can be created which would allow entry of such information. Searching and data standards with template implementation are other issues.
Don't you think that properties should rather be placed within categories? This way the number of database lookups could be reduced and properties inside an article could be treated as variables, dynamically looked up from the category. The editor could feature a variable list and a property editor to edit properties in a centralized manner. Templates would allow for sortable and limited tables, navigation bars, theme rings, or whatever else in a structured way, while keeping the text readable and maintainable. (There could be standard templates and user-implemented ones, with an option to turn them on or off on demand.)
There have been various discussions on Wikipedia on the use of such structured data in a table enhancement: http://en.wikipedia.org/wiki/Wikipedia:Categories_for_deletion/Log/2005_September_1#Enhancement_table_example_and_toolbar NevilleDNZ
There is one more point in favour of concentrating properties in a category: not every typo in an article would trigger a completely new set of properties, and properties would converge within a group. This would also resolve the problem of having extremely large sets of global properties and allow for multiple "namespaces". I do not want to link all the discussions that could be resolved with this feature, combined with calling these properties as variables within the article and other articles of this group (e.g. dynamic data lookups). One could create self-updating tables, navigation bars (looking up the entries before and after the current one), overviews (with limited listings and automatic sorting by a property, with the name of the article also being one of these properties), timelines, theme rings, etc. To summarize, we need: properties within categories, variables to access these properties (single-entry lookup), and SQL-like table lookups with templates.
You might want to search for "wikidata" in the mailing list archives; there was some discussion about keywords some time ago. http://www.google.com/search?hl=en&q=wikidata+site%3Amail.wikimedia.org http://www.google.com/search?hl=en&q=wikidata+site%3Amail.wikipedia.org
There are some good reasons not to have properties in categories: 1) Having the properties as marked-up text means that the information appears in only one place. e.g. {{Prop:FirstName=John}} {{Prop:LastName=Grisham}} (born {{Prop:Birthday=[[February 8]], [[1955]]}}) If the birthday is incorrect, it is only necessary to correct it in one place, rather than several. 2) Suppose an article is in multiple categories, e.g. American novelists, Thriller writers, People from Arkansas. If we associate properties with categories, many of the same properties will have to be repeated across these categories e.g. firstname. This would be messy and annoying to maintain. Using global properties there is no need to repeat this information. 3) It leaves no standard place to document the properties - documenting them on the category page would be messy and would also lead to duplication of a lot of documentation. However, to address some of the problems raised above: 1) You could associate a list of properties with a category (using a very similar syntax) and print a warning if a page in that category does not have all the required properties e.g. Warning: this article is in the 'American novelists' category, it should have 'FirstName' property. 2) An attempt to give a value to an unknown property (e.g. due to a typo) should also give a warning or error when the page is saved/previewed. 3) Whilst there is a danger in having a global set of properties, there are already means for disambiguating overloaded article names. The page for a property would provide sufficient description and examples to make the meaning of each property quite clear. 4) If it was necessary to treat a set of properties as a group for some reason (e.g. to disambiguate them all at once) then a compound property could be used. Additional idea - it should be possible to call a template but pass in the name of a page whose properties should be set as parameters to the template. 
In the model/view programming model, the article would be the model and the template would be the view. So to create an infobox in a page, pass the current page properties to the infobox template. Finally, if adopted, a more compact syntax than the one I suggested previously should be used e.g. {{#FirstName=Marlon}} {{#LastName=Brando}}
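As a rough illustration of the model/view idea, the following Python sketch treats a page's properties as the model and a template as the view. The `INFOBOX` template, the `render_infobox` helper and the use of Python's `string.Template` are assumptions made purely for the example, not a proposal for the actual template engine.

```python
from string import Template

# Hypothetical infobox "view": placeholders are filled from page properties.
INFOBOX = Template("Name: $FirstName $LastName\nBorn: $Birthday")


def render_infobox(template, page_properties):
    """Render a template (the view) from a page's properties (the model)."""
    # safe_substitute leaves unknown placeholders untouched instead of
    # raising, so a page with a missing property still renders.
    return template.safe_substitute(page_properties)


page = {
    "FirstName": "John",
    "LastName": "Grisham",
    "Birthday": "February 8, 1955",
}
print(render_infobox(INFOBOX, page))
```

The same template could be reused for any page that defines these properties, which is the point of keeping the data in the article and the layout in the template.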
One more thing - the page for a property could itself have properties indicating the type of the property, valid values for the property, etc., so that properties could be validated on entry. For example, in the English page for the 'Birthday' property you could have {{#Type=date}}, or in the page for the 'Country' property you could have: Valid values are: *{{#Value=Afghanistan}} *{{#Value=Albania}} *{{#Value=Algeria}} etc. Then if you attempt to save a page with an invalid property, e.g. Born in {{#Country=Aghanistan}}, this great man... you would get a warning (not an error): 'Aghanistan' is not a valid value for the 'Country' property. The validation must be weakly enforced, so it would generate warnings rather than errors. Weak enforcement allows correction of invalid properties to be carried out as a separate editing activity if necessary (as can currently be done for tidying up grammar or spelling).
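The weakly enforced validation described in the comments above could be sketched roughly as follows in Python. The dictionaries `PROPERTY_DEFS` and `CATEGORY_REQUIRED` and the function `validate` are hypothetical stand-ins for the property pages and category associations; the point is only that every check yields a warning and never blocks the save.

```python
# Hypothetical property definitions, standing in for the pages in the
# proposed property namespace. None means "any value is accepted".
PROPERTY_DEFS = {
    "FirstName": None,
    "Country": {"Afghanistan", "Albania", "Algeria"},
}

# Hypothetical per-category lists of required properties.
CATEGORY_REQUIRED = {"American novelists": ["FirstName"]}


def validate(page_properties, categories):
    """Return a list of warnings; saving is never blocked (weak enforcement)."""
    warnings = []
    for name, value in page_properties.items():
        if name not in PROPERTY_DEFS:
            # Catches typos in property names.
            warnings.append(f"'{name}' is not a known property.")
            continue
        allowed = PROPERTY_DEFS[name]
        if allowed is not None and value not in allowed:
            warnings.append(
                f"'{value}' is not a valid value for the '{name}' property."
            )
    for category in categories:
        for required in CATEGORY_REQUIRED.get(category, []):
            if required not in page_properties:
                warnings.append(
                    f"this article is in the '{category}' category, "
                    f"it should have '{required}' property."
                )
    return warnings


for warning in validate({"Country": "Aghanistan"}, ["American novelists"]):
    print("Warning:", warning)
```

Because the function only returns messages, a later clean-up edit can fix the flagged values without the original save having been rejected.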
I am still wondering whether local storage of property information is an efficient way to resolve the problems. In the author example, the main category would be "people", and people have very different properties from "mountains". "American novelists" would be a lookup of business="Novelist" and citizenship="United States of America", both inheriting the properties of their parent. That's what a semantic structure is good for. Thereby no duplication takes place and no artificial categories are generated, like category:"American novelists, originally immigrated from Poland, now living in cities 200m above sea-level". If it is implementable and not too hard to keep these lists (=categories, indices) updated, local storage would be possible too. However, it seems that the category (group) + variables (properties) creates a natural namespace for articles that can be maintained more easily...
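The idea that a category such as "American novelists" is really just a lookup over properties could be sketched like this in Python. The `PAGES` dictionary and the `lookup` helper are made up for illustration, and the property values are taken from the examples already used in this thread.

```python
# Hypothetical pages with their properties.
PAGES = {
    "John Grisham": {
        "business": "Novelist",
        "citizenship": "United States of America",
    },
    "Marlon Brando": {
        "business": "Actor",
        "citizenship": "United States of America",
    },
}


def lookup(pages, **criteria):
    """Return the sorted titles of pages whose properties match all criteria."""
    return sorted(
        title
        for title, props in pages.items()
        if all(props.get(name) == value for name, value in criteria.items())
    )


# "American novelists" expressed as a lookup instead of a stored category.
american_novelists = lookup(
    PAGES, business="Novelist", citizenship="United States of America"
)
print(american_novelists)  # ['John Grisham']
```

With this approach, an arbitrarily specific grouping is just another set of criteria rather than a new category that has to be maintained by hand.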
We need to distinguish between the proposed logical design (syntax, functionality) and physical design (implementation, database structure). Given the logical design I propose, we can choose the physical design to be efficient for whatever sort/search we want to offer, i.e. we can extract the properties into database tables however we like. If a common task is category-specific sort/search then we can create category-specific tables, provided properties are associated with categories in the manner I suggested above. The idea of merging properties from different categories relies on these having no properties in common (what if someone is both an author and a politician - how do we merge conflicting 'name' properties?). To resolve this, we would still need a global property set. Having global properties loosely associated with categories seems to give the best of both worlds: flexibility combined with the possibility of efficient implementation.
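To illustrate the separation of logical and physical design, here is one possible physical layout sketched in Python with SQLite. The table names `page_category` and `page_property`, the index and the `search` helper are assumptions for this example only; a real implementation could choose an entirely different schema.

```python
import sqlite3

# One possible physical design: properties extracted from article text
# into generic tables that can be indexed for search and sort.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE page_category (page TEXT, category TEXT);
    CREATE TABLE page_property (page TEXT, name TEXT, value TEXT);
    CREATE INDEX idx_property ON page_property (name, value);
    """
)
conn.executemany(
    "INSERT INTO page_category VALUES (?, ?)",
    [("Marlon Brando", "Actor"), ("John Grisham", "American novelists")],
)
conn.executemany(
    "INSERT INTO page_property VALUES (?, ?, ?)",
    [
        ("Marlon Brando", "FirstName", "Marlon"),
        ("John Grisham", "FirstName", "John"),
    ],
)


def search(category, name, value):
    """Find pages in a category whose property `name` equals `value`."""
    rows = conn.execute(
        """
        SELECT c.page
        FROM page_category AS c
        JOIN page_property AS p ON p.page = c.page
        WHERE c.category = ? AND p.name = ? AND p.value = ?
        ORDER BY c.page
        """,
        (category, name, value),
    )
    return [row[0] for row in rows]


# Roughly equivalent to searching on "Category:Actor Prop:FirstName=Marlon".
print(search("Actor", "FirstName", "Marlon"))  # ['Marlon Brando']
```

The article markup stays the same whichever tables are chosen, so the schema can be tuned later for the searches that turn out to be common.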
I agree concerning the design; however, if properties can be inherited there shouldn't be such a problem. Even the allowed countries could be defined within the category the property is part of (i.e. the definition of the variable 'country of birth' in the category:'famous people' looks like 'country of birth'={category:country, lookup:name, planet=earth, sortby:name}). There is no need for global properties that are lost in semantic space; disambiguation and association are performed directly by choosing the category. Thereby anyone who wants to look up a famous person immediately uses the correct definition, and the new article can be sorted just like every other element. Sorting an article in might be just a job of selecting the right properties out of a list. That's usually what librarians do, but here it could be a much more detailed structure. One might for instance ask "Which U.S. presidents were born in Michigan?" or "famous people born in Cardiff, Wales, U.K.?" in just one line. Do you have an idea how one can do this without sacrificing the way Wikipedia works? IMHO a dropdown list of the categories mentioned in the article and their associated variables is better than a list of warnings.
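Inheritance of property definitions along the category tree might look roughly like the Python sketch below. The `CATEGORIES` structure, the type strings and `resolve_properties` are hypothetical; only the 'people' and 'famous people' categories and the 'country of birth' variable come from the discussion.

```python
# Hypothetical category tree: each category names its parent and the
# property definitions it introduces itself.
CATEGORIES = {
    "people": {
        "parent": None,
        "properties": {"name": "text"},
    },
    "famous people": {
        "parent": "people",
        "properties": {"country of birth": "category:country"},
    },
}


def resolve_properties(category):
    """Collect property definitions from a category and all its ancestors."""
    chain = []
    while category is not None:
        chain.append(category)
        category = CATEGORIES[category]["parent"]
    resolved = {}
    # Apply ancestors first so that a child category can override a
    # definition inherited from its parent.
    for name in reversed(chain):
        resolved.update(CATEGORIES[name]["properties"])
    return resolved


print(resolve_properties("famous people"))
```

An editor could use the resolved dictionary to offer a dropdown of the variables that apply to an article's categories.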
Please see [http://meta.wikimedia.org/wiki/Semantic_MediaWiki this MediaWiki project] and compare to your proposal.
I would say it touches some very interesting aspects, however the implementation of lists by template lookups of the ontology information has not even been discussed yet.
Semantic MediaWiki 0.4 has built-in support to <ask> for lists.
Even without adding new "syntactic sugar" we can get more out of MediaWiki if we use a tool like DPL (DynamicPageList). With DPL you can generate lists of articles which match certain criteria (i.e. belong to a category, use a certain template, contain a link to a certain page, match a name pattern, reside in a certain namespace) and you can extract part of the contents, like chapters with a special heading, marked sections or template arguments (replacing the original template invocation with a different template that you define). I know it is not an answer to all questions and it doesn't compete with "real semantic wikis", but it can be quite useful ... see http://semeb.com/dpldemo
*** Bug 10295 has been marked as a duplicate of this bug. ***
Can we transit this to Wikidata?
(In reply to Pavel Selitskas [wizardist] from comment #16) > Can we transit this to Wikidata? Yes, while I think that some details pertain to Semantic MediaWiki.
Now that we have Wikidata, this is basically fixed. :)