Last modified: 2014-11-17 10:35:25 UTC
It is important to make a project to give the exact EBNF syntax which contains all
the subtleties of the wiki syntax
(In reply to comment #0)
> It is important to make a project to give the exact EBNF syntax which contains all
> the subtleties of the wiki syntax
Why don't you start a meta page with the basic framework?
(I didn't know what ebnf stood for...)
I puzzled over this recently. What exactly would the [E]BNF for wiki markup describe?
In theoretical computer science, formal grammars are used to generate a language
(a set of strings). Some grammars can be turned into a characteristic algorithm,
i.e. one that determines if a given string is in the language. The algorithm is
said to "accept" or "reject" input strings. However, MediaWiki is supposed to
accept *ALL* strings: all strings are valid inputs and are turned into some output.
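To make the distinction concrete, here is a toy sketch (all names invented, not MediaWiki code) contrasting a recognizer, which can reject input, with a MediaWiki-style transformer, which must map every string to some output:

```python
import re

def recognize_bold(s):
    # A classic recognizer: accept only strings that are exactly
    # '''text''' with balanced markers; reject everything else.
    return re.fullmatch(r"'''[^']+'''", s) is not None

def render_bold(s):
    # A MediaWiki-style transformer: never reject. Balanced markers
    # become <b>...</b>; stray markers fall through as literal text.
    m = re.fullmatch(r"'''([^']+)'''", s)
    if m:
        return "<b>%s</b>" % m.group(1)
    return s  # unbalanced or plain input is still valid output

print(recognize_bold("'''hi'''"))  # True
print(recognize_bold("'''hi"))     # False: rejected
print(render_bold("'''hi'''"))     # <b>hi</b>
print(render_bold("'''hi"))        # '''hi  (accepted anyway)
```

The grammar describes what `recognize_bold` accepts; it says nothing about what `render_bold` should emit for the strings outside that language, which is exactly the gap the comment above is pointing at.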
In practice, grammars are used to write parsers such as the one I'm currently
working on. Here, the grammar tells the parser what to do - or more precisely,
the production rules do, and as such, they sort of set out the semantics of the
mark-up. But how do you clarify semantics without the production rules?
Makes you wonder about stuff :)
Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
only. The MediaWiki syntax for lists is not context-free however. I am
circumventing this in my parser by using a post-processing step, but if you're
only writing BNF, you can't do that...
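A post-processing step of the kind mentioned above might look roughly like this (a simplified sketch, not the actual parser code; real HTML would nest each `<ul>` inside an `<li>`). A line-level grammar can only tag each line with its `*` depth; the nesting is context-sensitive because how many levels to open or close depends on the previous line:

```python
def render_list(lines):
    out, depth = [], 0
    for line in lines:
        stripped = line.lstrip('*')
        d = len(line) - len(stripped)   # number of leading stars
        text = stripped.strip()
        while depth < d:                # open deeper levels
            out.append('<ul>')
            depth += 1
        while depth > d:                # close levels we have left
            out.append('</ul>')
            depth -= 1
        if d:
            out.append('<li>%s</li>' % text)
        else:
            out.append(text)
    out.extend('</ul>' for _ in range(depth))
    return '\n'.join(out)

print(render_list(['* a', '** b', '* c']))
```

The `while` loops are what a context-free production cannot express: they compare the current line's depth against state carried over from earlier input.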
(In reply to comment #4)
> Oh, and I forgot to mention this. EBNF seems to be for context-free grammars
> only. The MediaWiki syntax for lists is not context-free however. I am
> circumventing this in my parser by using a post-processing step, but if you're
> only writing BNF, you can't do that...
In light of that, is this bug WONTFIX? Or is it possible to describe wikitext
in some sort of pseudo-BNF, short of duplicating your flex/bison parser?
This bug is a "go write it on Meta" fix. ;-)
Not sure I understand why this was closed.
A formal grammar is something we really need (and it may require
fixes to the grammar as well ;)
Some work has been going on at mediawiki.org
(http://www.mediawiki.org/wiki/Markup_spec/BNF/). It's early days and any input
would be appreciated.
Another related effort on Meta:
A hopefully complete representation of the MW 1.12 preprocessor in ABNF is at:
Please note that the set of production rules alone does not allow you to derive the correct parse tree from a given input text. Wikitext is ambiguous in lots of complex and interesting ways. The disambiguation rules need to be specified along with the grammar.
I found the preprocessor ABNF project an enlightening exercise. You can say a lot about the syntax in a short space. And while I attempted to explain the disambiguation process, I know of no way to do this rigorously, without resorting to writing algorithms.
It seems that with http://www.mediawiki.org/wiki/Preprocessor_ABNF this bug is fixed.
No it is not fixed. That page only describes a tiny portion of parser behaviour.
We have a fairly complete PEG tokenizer grammar in Parsoid (http://www.mediawiki.org/wiki/Parsoid), which describes the context-free portions of wikitext. Context-sensitive portions are handled in token stream transformers. The PEG parse tree is flattened to a token stream so that we can support unbalanced template expansions, and finally converted into a DOM using a tree builder library according to the error recovery algorithms described in the HTML5 spec.
The grammar is interspersed with actions and uses syntactic scope flags to compress the grammar productions a bit, so it is not the most readable grammar ever. Unrolling productions for all scope permutations might not help that much either, as this would increase the size of the grammar a lot.
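The pipeline described above can be caricatured in a few lines (all names invented, not Parsoid's real API): a context-free tokenizer emits a flat token stream, a stream transformer handles the context-sensitive pairing, and a final stage assembles the output, with error recovery instead of rejection:

```python
import re

def tokenize(text):
    # Stage 1: "context-free" tokenization into a flat stream.
    return [t for t in re.split(r"(''')", text) if t]

def transform(tokens):
    # Stage 2: context-sensitive pass; pair up ''' markers.
    out, is_open = [], False
    for t in tokens:
        if t == "'''":
            out.append('</b>' if is_open else '<b>')
            is_open = not is_open
        else:
            out.append(t)
    if is_open:
        out.append('</b>')  # error recovery: auto-close, never reject
    return out

def build(tokens):
    # Stage 3: assemble the stream into output (a string here,
    # where Parsoid builds a real DOM via an HTML5 tree builder).
    return ''.join(tokens)

print(build(transform(tokenize("a'''b"))))  # a<b>b</b>
```

The point of the staging is that the tokenizer never needs to know whether a `'''` opens or closes anything; that decision, like the unbalanced-template handling mentioned above, lives in the stream transformer.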
Describing all of WikiText in EBNF is simply impossible, as parts of it are context-sensitive. Closing as wontfix for that reason.