Last modified: 2014-11-17 10:35:25 UTC

Wikimedia Bugzilla is closed!

Wikimedia migrated from Bugzilla to Phabricator. Bug reports are handled in Wikimedia Phabricator.
This static website is read-only and kept for historical purposes. It is not possible to log in, and apart from the display of bug reports and their history, links might be broken. See T2007, the corresponding Phabricator task, for complete and up-to-date bug report information.
Bug 7 - Explain the wiki syntax in detailed EBNF
Status: RESOLVED WONTFIX
Product: MediaWiki
Classification: Unclassified
Component: Documentation
Version: unspecified
Hardware: All All
Importance: Lowest enhancement with 6 votes
Target Milestone: ---
Assigned To: Nobody - You can work on this!
Depends on:
Blocks: documentation
Reported: 2004-08-10 16:36 UTC by xmlizer
Modified: 2014-11-17 10:35 UTC (History)

Description xmlizer 2004-08-10 16:36:23 UTC
It is important to start a project to give the exact EBNF syntax, which contains all
the subtleties of the wiki syntax
Comment 1 Guttorm Flatabø 2004-08-23 21:23:43 UTC
(In reply to comment #0)
> It is important to start a project to give the exact EBNF syntax, which contains all
> the subtleties of the wiki syntax

Why don't you start a meta page with the basic framework?
Comment 2 Aaron Peterson 2004-08-31 13:34:17 UTC
[[meta:EBNF]]

http://www.garshol.priv.no/download/text/bnf.html

http://www.cl.cam.ac.uk/~mgk25/iso-ebnf.html

(I didn't know what ebnf stood for...)
Comment 3 Timwi 2004-09-01 00:24:06 UTC
I boggled my mind over this recently. What exactly would the [E]BNF for Wiki
Syntax describe?

In theoretical computer science, formal grammars are used to generate a language
(a set of strings). Some grammars can be turned into a characteristic algorithm,
i.e. one that determines if a given string is in the language. The algorithm is
said to "accept" or "reject" input strings. However, MediaWiki is supposed to
accept *ALL* strings: all strings are valid inputs and are turned into some
valid XHTML.

In practice, grammars are used to write parsers such as the one I'm currently
working on. Here, the grammar tells the parser what to do - or more precisely,
the production rules do, and as such, they sort of set out the semantics of the
mark-up. But how do you clarify semantics without the production rules?

Makes you wonder about stuff :)
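
To make the accept/reject distinction concrete, here is a minimal sketch in Python. It is only an illustration: the toy "language" (bold markers must be paired) and the function names accepts and render are assumptions made up for this example, not MediaWiki behaviour or code. The first function is a recognizer that can reject input; the second is a total function that turns every string into some output, which is the situation described above.

```python
import html


def accepts(text: str) -> bool:
    """Recognizer for a toy language: every ''' marker must be paired."""
    return text.count("'''") % 2 == 0


def render(text: str) -> str:
    """Total converter: never rejects, always returns some markup.

    An unclosed ''' is simply closed at the end of the input.
    """
    # quote=False leaves apostrophes alone so the markers survive escaping.
    parts = html.escape(text, quote=False).split("'''")
    out = []
    for i, part in enumerate(parts):
        # Odd-numbered pieces sit between an opening and a closing marker.
        out.append("<b>" + part + "</b>" if i % 2 == 1 else part)
    return "".join(out)


print(accepts("a '''b"))   # the recognizer rejects this input
print(render("a '''b"))    # the converter still produces output
```

A grammar in the recognizer sense only tells you which inputs belong to the first category; it says nothing about what render should do with the rest.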
Comment 4 Timwi 2004-09-01 00:25:58 UTC
Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
only. The MediaWiki syntax for lists is not context-free however. I am
circumventing this in my parser by using a post-processing step, but if you're
only writing BNF, you can't do that...
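
The comment does not show the post-processing step. One way such a step could look is sketched below in Python: a first pass only needs to recognise the marker prefix of each line, and a second pass compares each prefix with the currently open lists to decide what to open and close. The function name nest_lists and the exact HTML emitted are assumptions for illustration; this is neither the parser mentioned above nor MediaWiki's own algorithm.

```python
import html

LIST_TAGS = {"*": "ul", "#": "ol"}  # marker character -> HTML list element


def nest_lists(lines):
    """Convert lines such as '* a', '** b', '# c' into nested HTML lists."""
    stack = []  # markers of the currently open lists, outermost first
    out = []

    def close_down_to(depth):
        # Close open lists until only `depth` of them remain.
        while len(stack) > depth:
            out.append("</li></%s>" % LIST_TAGS[stack.pop()])

    for line in lines:
        prefix = line[: len(line) - len(line.lstrip("*#"))]
        text = html.escape(line[len(prefix):].strip())

        # How many open lists does this line's prefix share?
        common = 0
        while (common < min(len(stack), len(prefix))
               and stack[common] == prefix[common]):
            common += 1

        close_down_to(common)
        if prefix and common == len(prefix):
            out.append("</li><li>")  # same list, next item
        for marker in prefix[common:]:
            out.append("<%s><li>" % LIST_TAGS[marker])
            stack.append(marker)
        out.append(text)

    close_down_to(0)
    return "".join(out)


print(nest_lists(["* a", "** b", "* c"]))
```

The point of the sketch is that the decision for each line depends on the stack left behind by the previous lines, which is state kept outside any production rule.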
Comment 5 Wil Mahan 2004-10-04 04:22:49 UTC
(In reply to comment #4)
> Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
> only. The MediaWiki syntax for lists is not context-free however. I am
> circumventing this in my parser by using a post-processing step, but if you're
> only writing BNF, you can't do that...

In light of that, is this bug WONTFIX? Or is it possible to describe wiki syntax
in some sort of pseudo-BNF, short of duplicating your flex/bison parser?
Comment 6 Rob Church 2005-12-15 23:10:57 UTC
The fix for this bug is "go write it on Meta". ;-)
Comment 7 Brion Vibber 2005-12-15 23:13:52 UTC
Not sure I understand why this was closed.
A formal grammar is something we really need (and it may require
fixes to the grammar as well ;)
Comment 8 Mark Clements (HappyDog) 2006-07-15 13:15:40 UTC
Some work has been going on at mediawiki.org
(http://www.mediawiki.org/wiki/Markup_spec and
http://www.mediawiki.org/wiki/Markup_spec/BNF/).  It's early days and any input
would be appreciated.
Comment 9 Antoine Musso 2007-01-22 14:35:46 UTC
Another effort, on Meta:
http://meta.wikimedia.org/wiki/Wikitext_Metasyntax
Comment 10 Tim Starling 2008-02-17 07:24:08 UTC
A hopefully complete representation of the MW 1.12 preprocessor in ABNF is at:

http://www.mediawiki.org/wiki/Preprocessor_ABNF
Comment 11 Tim Starling 2008-02-17 11:55:00 UTC
Please note that the set of production rules alone does not allow you to derive the correct parse tree from a given input text. Wikitext is ambiguous in lots of complex and interesting ways. The disambiguation rules need to be specified along with the grammar. 

I found the preprocessor ABNF project an enlightening exercise. You can say a lot about the syntax in a short space. And while I attempted to explain the disambiguation process, I know of no way to do this rigorously, without resorting to writing algorithms.
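
A small illustration of why production rules alone are not enough, as a Python sketch. It assumes only that "{{" opens a template and "{{{" opens a template argument; the helper name splits is hypothetical. The function lists every way a run of n opening braces can be cut into those two tokens. Each result is a reading the rules allow, and nothing in the rules themselves picks one.

```python
def splits(n, parts=(2, 3)):
    """Return every ordered way to cut a run of n braces into 2s and 3s."""
    if n == 0:
        return [()]  # exactly one way to split nothing: the empty split
    result = []
    for p in parts:
        if p <= n:
            # Take a token of length p first, then split what is left.
            result.extend((p,) + rest for rest in splits(n - p, parts))
    return result


for run_length in (4, 5, 6):
    print(run_length, splits(run_length))
```

For a run of five braces the function reports two readings, (2, 3) and (3, 2). Which one applies has to be settled by a separate rule or by the parsing algorithm.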
Comment 12 Carl Fürstenberg 2008-03-19 18:44:44 UTC
It seems that with http://www.mediawiki.org/wiki/Preprocessor_ABNF this bug is fixed
Comment 13 Tim Starling 2008-03-20 00:28:11 UTC
No, it is not fixed. That page only describes a tiny portion of parser behaviour.
Comment 14 Gabriel Wicke 2012-08-09 17:40:41 UTC
We have a fairly complete PEG tokenizer grammar in Parsoid (http://www.mediawiki.org/wiki/Parsoid), which describes the context-free portions of wikitext. Context-sensitive portions are handled in token stream transformers. The PEG parse tree is flattened to a token stream so that we can support unbalanced template expansions, and finally converted into a DOM using a tree builder library according to the error recovery algorithms described in the HTML5 spec.

The grammar is interspersed with actions and uses syntactic scope flags to compress the grammar productions a bit, so it is not the most readable grammar ever. Unrolling productions for all scope permutations might not help that much either, as this would increase the size of the grammar a lot.
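
The value of a flat token stream for unbalanced template expansions can be sketched in a few lines of Python. Everything here is a simplified assumption for illustration: the token shapes, the names build_tree and serialize, and the recovery rules (ignore stray end tags, close whatever is still open at the end) are not Parsoid's code and are far cruder than the HTML5 tree construction algorithm.

```python
def build_tree(tokens):
    """Build a nested tree from a flat list of (kind, value) tokens."""
    root = {"tag": "root", "children": []}
    stack = [root]  # open elements, innermost last
    for kind, value in tokens:
        if kind == "text":
            stack[-1]["children"].append(value)
        elif kind == "start":
            node = {"tag": value, "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif kind == "end":
            # Close back to the nearest matching open element;
            # an end tag with no match is silently dropped.
            for i in range(len(stack) - 1, 0, -1):
                if stack[i]["tag"] == value:
                    del stack[i:]
                    break
    return root  # anything still open is implicitly closed


def serialize(node):
    """Turn the tree back into markup."""
    if isinstance(node, str):
        return node
    inner = "".join(serialize(child) for child in node["children"])
    if node["tag"] == "root":
        return inner
    return "<%s>%s</%s>" % (node["tag"], inner, node["tag"])


# Two hypothetical template expansions that are unbalanced on their own:
opens_bold = [("start", "b")]
closes_bold = [("end", "b")]
stream = opens_bold + [("text", "x")] + closes_bold
print(serialize(build_tree(stream)))
```

Neither expansion is a well-formed tree by itself, so they could not be spliced into a parse tree directly, but their tokens concatenate without trouble and the tree builder sorts out the nesting afterwards.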
Comment 15 Gabriel Wicke 2013-05-10 23:49:39 UTC
Describing all of WikiText in EBNF is simply impossible, as parts of it are context-sensitive. Closing as wontfix for that reason.
