Re: GtkEtext final design [OT?]



Emmanuel DELOGET wrote:
> 
> From: Tim janik <timk@gtk.org>
> > the main reason i didn't make those configurable in the first place in
> > GScanner, was that there are issues on whether and how you allow nesting
> > of multi line comments. so for the time being i made them like C comments.
> 
>     I don't think it's up to the scanner to say whether some lines
>     are a multi-line comment or not - it's more of a parser task
>     [so you should not have to deal with this issue in gscanner]
> 
[snip]
> 
>     A scanner is just a tokenizer - i.e. it reads input from
>     a buffer (possibly the keyboard buffer if you do interactive
>     scanning), breaks the input into words and tries to match each
>     word to a larger group of tokens.
> 
[large snip]

Erm, what counts as a scanner is relative.  Things that theoretically
belong in a parser are often put into the lexer for convenience.  Since a
pure lexer only recognizes regular languages, you have to add things to it
to recognize non-regular constructs, such as support for nested comments,
which takes just a single integer (a depth counter) in the scanner.
Symbols are often entered at lex time rather than parse time for
convenience as well, though symbols apply more to a compiler than to
generic scanning.  The distinction you draw isn't that sharp in practice,
and things that are theoretically supposed to go into a parser are often
much better suited to the lexer, both for speed and for convenience.
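
To make the "single integer" point concrete, here is a minimal sketch in
Python (not GScanner code). The function name skip_nested_comment is made
up for illustration, and it assumes C-style /* ... */ delimiters that are
allowed to nest, which standard C does not permit.

```python
def skip_nested_comment(text, pos=0):
    """Return the index just past the nested comment that starts at pos.

    Hypothetical helper: assumes /* and */ delimiters that may nest.
    """
    if not text.startswith("/*", pos):
        raise ValueError("no comment opener at pos")
    depth = 0  # the single integer added to the scanner state
    i = pos
    while i < len(text):
        if text.startswith("/*", i):
            depth += 1          # one level deeper
            i += 2
        elif text.startswith("*/", i):
            depth -= 1          # one level back out
            i += 2
            if depth == 0:      # outermost comment is closed
                return i
        else:
            i += 1
    raise ValueError("unterminated comment")
```

For example, on the input "/* a /* b */ c */ x" the function skips both
closers and returns the index of the text after the outer comment. The
counter is the only state beyond what a plain regular-expression matcher
would keep.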

And as for some of your examples of multi-line tokens... there's no
reason you couldn't identify a multi-line #define in the scanner (it
still matches a regular expression).
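
As a rough illustration of the #define case, the sketch below uses
Python's re module rather than anything from GScanner. The names
DEFINE_RE and match_define are hypothetical, and the pattern assumes the
usual backslash-newline line continuation; it uses no backreferences or
counting, so it stays within what a regular expression can describe.

```python
import re

# "#define" followed by any run of either a backslash-newline pair
# (a continued line) or any single character other than a newline.
DEFINE_RE = re.compile(r"[ \t]*#[ \t]*define\b(?:\\\n|[^\n])*")

def match_define(text, pos=0):
    """Return the full (possibly multi-line) #define at pos, or None."""
    m = DEFINE_RE.match(text, pos)
    return m.group(0) if m else None
```

Given a macro whose first line ends in a backslash, match_define returns
both physical lines as one token and stops at the first newline that is
not preceded by a backslash.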

I think you're being a bit too formal about what goes into a scanner, and
if you do that, it becomes significantly less powerful.

Christopher


