
This question is about tree theory and how Vim's syntax highlighting handles the tree concept in a typical syntax/mylanguage.vim.

Is it true that Vim supports only one path between any two vertices of a (syntax) tree graph? That is, does Vim syntax require at most one pathway between any two vertices?

And that a graph with more than one syntax path between two vertices (i.e. one that is not a tree) cannot be reliably processed?

I ask because, after highlighting a very long series of mathematical operations, I (the syntax developer) am struggling to join all of the related syntax items back together and use nextgroup to lead into a specific type of syntax item (i.e. a variable type).

Trying again but in Vim-speak, the EBNF is simplified as:

Grammar  ::= Clause*
Clause ::= ClauseA | ClauseB
ClauseA ::= 'KeywordA' Expression IntegerVarName
ClauseB ::= 'KeywordB' Expression StringVarName
Expression ::= ComplexExpression | SimpleExpression
IntegerVarName ::= ( '0x' )? [0-9]+
StringVarName ::= [a-zA-Z0-9]+
SimpleExpression ::= '=' | '\='
ComplexExpression ::= ( ( ModExpr | XorExpr | OrExpr | AndExpr )  ShiftExpr ( ModExpr | XorExpr | OrExpr  | AndExpr )* )+
ModExpr ::= '%'
XorExpr ::= '^'
OrExpr ::= '|'
AndExpr ::= '&'

[Railroad diagrams: Clause, ClauseA, ClauseB, Expression, SimpleExpression, ComplexExpression]

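" Leaf groups: the variable names that end each clause.
syntax match my_IntegerVarName "\v(0x)?[0-9]+" skipwhite contained
syntax match my_StringVarName "\v[a-zA-Z0-9]+" skipwhite contained
" nextgroup is fixed per item, so this shared group leads to one leaf only.
syntax match my_SimpleExpression "=" skipwhite contained
\ nextgroup=my_IntegerVarName
" A literal ampersand is "&"; "\&" is a pattern operator in Vim regexes.
syntax match my_ComplexExpression_AndExpr "&" skipwhite contained
syntax cluster my_ComplexExpression
\ contains=my_ComplexExpression_AndExpr
" A cluster is referenced with a leading "@".
syntax cluster my_grammar
\ contains=
\    my_SimpleExpression,
\    @my_ComplexExpression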

The expression fails to arrive at StringVarName: the chain breaks at ComplexExpression because SimpleExpression takes over.

The my_StringVarName group is never reached.

What should have happened after going through the "shared" Expression is:

  • ClauseA's KeywordA should have led to IntegerVarName and
  • ClauseB's KeywordB should have led to StringVarName.
  • Your question seems rather abstract to me. I suppose that if you could illustrate it with an example, it would help users come up with some suggestions. Commented Feb 3 at 16:16
  • I second Vivian's comment—in particular, the commands used by :help syntax don't seem to match traditional notions of parse trees, so it may be hard to map Vim's notion of syntax highlighting onto your description. (Reading the relevant code or running it in a debugger might be the best way to find out.) Commented Feb 3 at 17:25
  • Vim uses regular expressions to assign highlight groups to strings. There is neither parsing nor tree whatsoever, here. Commented Feb 3 at 18:12
  • @VivianDeSmedt, added VimL, railroad diagram, and EBNF. Commented Feb 3 at 20:57
  • @Romainl answered it: regexes are rooted, non-directed tree graphs. This means extra effort is required on the syntax designer's part: a shared graph pathway (Expression) must be duplicated into separate Vim syntax items (with unique group names, cloned from Expression) so that an end vertex (the final group names, IntegerVarName and StringVarName) can have multiple inbound paths (from various points inside the ComplexExpression). Tree theory end. Commented Feb 3 at 22:03

1 Answer

Vim's syntax highlighting mechanism uses regular expressions to "tokenize" text so there is no parser involved and thus no tree.

If you are writing a syntax script, I would suggest you forget about ASTs and all. You will blow a few fuses if you try to map concepts from parsers to regular expressions.
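For the grammar in the question, this usually means cloning the shared Expression groups once per clause, because each syntax item carries a single fixed nextgroup list. The following is only a minimal sketch under assumptions: the myA_* and myB_* group names are hypothetical, the keywords are taken to be the literal words KeywordA and KeywordB, and ComplexExpression is reduced to a run of single-character operators (ShiftExpr is left out because the question does not define it).

```vim
" Sketch only: myA_*/myB_* are hypothetical names, not from the question.
let s:cpo_save = &cpo
set cpo&vim

" ClauseA path: KeywordA -> expression -> integer variable name
syntax keyword myA_Keyword KeywordA skipwhite
      \ nextgroup=myA_SimpleExpr,myA_ComplexExpr
" Matches '=' or '\=' (magic mode: \\ is a backslash, \? makes it optional)
syntax match myA_SimpleExpr "\\\?=" contained skipwhite
      \ nextgroup=myA_IntegerVarName
" One operator at a time; loops on itself until the variable name follows
syntax match myA_ComplexExpr "[%^|&]" contained skipwhite
      \ nextgroup=myA_ComplexExpr,myA_IntegerVarName
syntax match myA_IntegerVarName "\v(0x)?[0-9]+" contained

" ClauseB path: the same patterns, cloned so they end at the string name
syntax keyword myB_Keyword KeywordB skipwhite
      \ nextgroup=myB_SimpleExpr,myB_ComplexExpr
syntax match myB_SimpleExpr "\\\?=" contained skipwhite
      \ nextgroup=myB_StringVarName
syntax match myB_ComplexExpr "[%^|&]" contained skipwhite
      \ nextgroup=myB_ComplexExpr,myB_StringVarName
syntax match myB_StringVarName "\v[a-zA-Z0-9]+" contained

" Both clones share highlighting even though the group names differ
highlight default link myA_Keyword Keyword
highlight default link myB_Keyword Keyword
highlight default link myA_SimpleExpr Operator
highlight default link myB_SimpleExpr Operator
highlight default link myA_ComplexExpr Operator
highlight default link myB_ComplexExpr Operator
highlight default link myA_IntegerVarName Number
highlight default link myB_StringVarName Identifier

let &cpo = s:cpo_save
unlet s:cpo_save
```

The duplication is the price of the design: the patterns are identical in both paths, and only the group names and the final nextgroup target differ. Linking the clones to the same highlight groups keeps the result visually uniform.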

Alternatively, you could try your hand at the new-ish :help text-properties if there already is a parser for the language you are working with (emphasis mine):

The main use for text properties is to highlight text. This can be seen as a replacement for syntax highlighting. Instead of defining patterns to match the text, the highlighting is set by a script, possibly using the output of an external parser. This only needs to be done once, not every time when redrawing the screen, thus can be much faster, after the initial cost of attaching the text properties.
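As a rough illustration of that workflow, the sketch below attaches text properties from a token list. It assumes a Vim built with +textprop and a hypothetical external parser whose output has already been converted into [line, column, length, type] entries; the property type names and the sample positions are made up for illustration, and the buffer is assumed to contain text at those positions.

```vim
" Sketch only: property type names and token positions are hypothetical.
if has('textprop')
  " Register each property type once, linked to a built-in highlight group
  for [s:name, s:group] in [['my_IntegerVarName', 'Number'],
        \ ['my_StringVarName', 'Identifier']]
    if empty(prop_type_get(s:name))
      call prop_type_add(s:name, {'highlight': s:group})
    endif
  endfor
endif

" Attach one text property per token reported by the (assumed) parser
function! s:ApplyTokens(tokens) abort
  for [lnum, column, length, kind] in a:tokens
    call prop_add(lnum, column, {'length': length, 'type': kind})
  endfor
endfunction

" Assumed parser output for a two-line buffer
" call s:ApplyTokens([
"       \ [1, 12, 4, 'my_IntegerVarName'],
"       \ [2, 12, 5, 'my_StringVarName'],
"       \ ])
```

Because the parser decides which token is an integer name and which is a string name, the shared Expression rule no longer has to be duplicated in the highlighting layer.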

The ecosystem at large hasn't switched to text properties and it is very unlikely to ever happen… but it might be interesting for greenfield projects.

  • Thank you for your patience and detailed answer. I learned more. Commented Feb 4 at 20:53
  • Meanwhile, a regular expression is, academically, a form of tree. It is just that Vim syntax cannot reconnect the two group names (nodes) at the statement level, and relies on its stack of upper statements to know its origin. Vim can, however, stretch its regex to do this complex reconnection, but only within a single Vim syntax statement containing a very, very long regex. I work with nlp.stanford.edu/software/tregex.shtml Commented Feb 4 at 21:01
