Re: GtkEtext final design [really OT]
- From: Tim Janik <timj gtk org>
- To: gtk-devel-list redhat com
- Subject: Re: GtkEtext final design [really OT]
- Date: Sun, 12 Mar 2000 02:45:01 +0100 (CET)
On Sat, 11 Mar 2000, Emmanuel DELOGET wrote:
> I did fairly interesting stuff in the past for a school
> project (reimplementing a yacc-like tool without any use
> of global and/or static vars - not very difficult but very
> interesting task) and therefore I know what a scanner can
> do. But I chose to describe it in other terms: what a
> scanner should actually do :)
>
> Moreover, if you want fully reusable scanner code, I think
> it's better not to deal with the parser's 'reserved'
> areas. Actually, a scanner does not need to know what
> nested comments are. It should know the words '/*' and
> '*/', but it really doesn't care about their meaning.
> Lexical analysis is not syntactic analysis. The lexical
> pass just deals with the 'word A is correctly spelled'
> issue, while the syntactic pass deals with 'word A is at
> the correct place in the sentence'.
>
> That's how I learned English (well, my personal scanner
> and parser still have some bugs and memory leaks :)
>
> Now, a word about the gscanner comments feature:
> it is clear that a scanner which knows what a comment
> is makes the implementation easier from the parser's point
> of view. But there is another way to do it: using a
> tool like cpp does the trick, and if the only goal of
> such a tool is to get rid of comments in the
> source, it's a trivial routine to write (even if
> you want to allow multi-line nested comments).
good observation, basically you and Christopher both present valid points.
while a really clean line can be drawn between parser and lexer in
theory, for practical purposes you often make specific tradeoffs.
gscanner in particular was meant as a powerful lexer for things like gtk's
or gimp's rc files, as well as for parsing lisp syntax.
as such, gscanner also provides convenience features for keywords, that
is, it allows an <identifier> (whatever characters that may consist of
is configurable, e.g. [_A-Za-z][_A-Za-z0-9]* for C) to be automatically
translated into a predefined token through a hashtable provided by the
scanner.
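to make the keyword translation concrete, here is a minimal sketch in plain C.
it is not gscanner's api: the token codes, the keyword table and next_token()
are all made up for illustration, and a small linear table stands in for the
hashtable. the identifier pattern is the C one quoted above.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes for this sketch (not GScanner's own). */
enum token { TOK_EOF, TOK_IDENTIFIER, TOK_IF, TOK_WHILE, TOK_RETURN, TOK_CHAR };

struct keyword { const char *name; enum token tok; };

/* A linear table stands in for the scanner's hashtable. */
static const struct keyword keywords[] = {
    { "if", TOK_IF }, { "while", TOK_WHILE }, { "return", TOK_RETURN },
};

/* Character classes for [_A-Za-z][_A-Za-z0-9]* */
static int is_ident_start(int c) { return c == '_' || isalpha(c); }
static int is_ident_char(int c)  { return c == '_' || isalnum(c); }

/* Scan one token starting at *pos and advance *pos past it. */
static enum token next_token(const char **pos)
{
    const char *p, *start;
    size_t len, i;

    assert(pos != NULL && *pos != NULL);
    p = *pos;
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *pos = p;
        return TOK_EOF;
    }
    if (!is_ident_start((unsigned char)*p)) {
        *pos = p + 1;               /* any other character is a token by itself */
        return TOK_CHAR;
    }
    start = p;
    while (is_ident_char((unsigned char)*p))
        p++;
    len = (size_t)(p - start);
    *pos = p;

    /* Translate the identifier into a predefined token if it is a keyword. */
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strlen(keywords[i].name) == len &&
            memcmp(keywords[i].name, start, len) == 0)
            return keywords[i].tok;
    return TOK_IDENTIFIER;
}
```

the parser then only ever sees TOK_IF and friends and never has to compare
strings itself; an identifier like "iffy" still comes back as TOK_IDENTIFIER
because the whole word is looked up, not a prefix.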
that i had to parse C comments as well was actually the reason that some
preprocessor magic "sneaked" in, i.e. the /* .... */ stuff ;)
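for comparison, the separate comment-stripping pass suggested in the quoted
text could look roughly like the sketch below. strip_comments() is a made-up
name, and two simplifications are assumed: comments are allowed to nest (which
real C comments do not), and string literals get no special treatment.

```c
#include <assert.h>
#include <stddef.h>

/* Copy src into dst (capacity cap, including the terminator) with every
 * comment removed. A depth counter lets comments nest, which is an
 * assumption of this sketch and not how C itself behaves.
 * Returns the number of characters written, not counting the terminator. */
static size_t strip_comments(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    int depth = 0;

    assert(src != NULL && dst != NULL && cap > 0);
    while (*src != '\0') {
        if (src[0] == '/' && src[1] == '*') {
            depth++;                        /* comment opens */
            src += 2;
        } else if (depth > 0 && src[0] == '*' && src[1] == '/') {
            depth--;                        /* comment closes */
            src += 2;
        } else {
            if (depth == 0 && n + 1 < cap)  /* keep text outside comments */
                dst[n++] = *src;
            src++;
        }
    }
    dst[n] = '\0';
    return n;
}
```

newlines inside a comment are dropped together with it, so a real tool would
probably want to keep them to preserve line numbers for error messages.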
for instance, an often requested feature for number parsing, i.e. automatic
evaluation of '-' as a unary prefix of a number, was intentionally
*not* added, since the ways in which a minus can be interpreted are definitely
a parser issue (is it a unary operator, a binary operator, or part of
e.g. a C reference '->', ...)
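a small sketch of that division of labour, again in plain C and with made-up
names (lex(), parse_operand(), parse_expr() and the token kinds are not part
of gscanner): the lexer only ever reports unsigned numbers plus a bare '-' or
'->' token, and the parser decides from its position in the grammar whether a
minus is unary or binary.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token kinds for this sketch. */
enum tok_kind { T_END, T_NUMBER, T_MINUS, T_ARROW, T_OTHER };

struct tok { enum tok_kind kind; long value; };

/* Lexer: numbers are always unsigned; a '-' is never folded into them. */
static struct tok lex(const char **pos)
{
    struct tok t = { T_END, 0 };
    const char *p;

    assert(pos != NULL && *pos != NULL);
    p = *pos;
    while (isspace((unsigned char)*p))
        p++;
    if (*p == '\0') {
        t.kind = T_END;
    } else if (isdigit((unsigned char)*p)) {
        t.kind = T_NUMBER;
        while (isdigit((unsigned char)*p))
            t.value = t.value * 10 + (*p++ - '0');
    } else if (p[0] == '-' && p[1] == '>') {
        t.kind = T_ARROW;
        p += 2;
    } else if (*p == '-') {
        t.kind = T_MINUS;
        p++;
    } else {
        t.kind = T_OTHER;
        p++;
    }
    *pos = p;
    return t;
}

/* Parser rule: operand := '-' operand | NUMBER   (minus is unary here) */
static long parse_operand(const char **pos)
{
    struct tok t = lex(pos);

    if (t.kind == T_MINUS)
        return -parse_operand(pos);
    return t.kind == T_NUMBER ? t.value : 0;  /* errors collapse to 0 in this sketch */
}

/* Parser rule: expr := operand { '-' operand }   (minus is binary here) */
static long parse_expr(const char *src)
{
    const char *pos = src;
    long acc = parse_operand(&pos);

    for (;;) {
        struct tok t = lex(&pos);
        if (t.kind != T_MINUS)
            break;                  /* anything else ends the expression */
        acc -= parse_operand(&pos);
    }
    return acc;
}
```

with these rules "3 - -2" is read as 3 minus (unary minus 2), and "a->b" never
produces a T_MINUS at all, without the lexer knowing anything about either case.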
so far, one interesting thing has been brought up that should definitely
make its way into gscanner: a character pair to be scanned as
line-continuation, i.e. "\\\n". i think that would also be incredibly
easy to implement ;)
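roughly what i have in mind, as a standalone sketch and not as gscanner code
(next_char() and splice_lines() are hypothetical helpers): the routine that
hands out characters simply swallows every backslash-newline pair, so the rest
of the scanner never sees it.

```c
#include <assert.h>
#include <stddef.h>

/* Return the next input character, transparently skipping every
 * backslash-newline pair. Returns '\0' at the end of the string. */
static int next_char(const char **pos)
{
    const char *p;

    assert(pos != NULL && *pos != NULL);
    p = *pos;
    while (p[0] == '\\' && p[1] == '\n')
        p += 2;                     /* line continuation: drop both characters */
    if (*p == '\0') {
        *pos = p;
        return '\0';
    }
    *pos = p + 1;
    return (unsigned char)*p;
}

/* Copy src into dst (capacity cap, including the terminator) with all
 * continued lines joined. Returns the number of characters written. */
static size_t splice_lines(const char *src, char *dst, size_t cap)
{
    size_t n = 0;
    int c;

    assert(src != NULL && dst != NULL && cap > 0);
    while (n + 1 < cap && (c = next_char(&src)) != '\0')
        dst[n++] = (char)c;
    dst[n] = '\0';
    return n;
}
```

a backslash that is not directly followed by a newline is passed through
untouched, so ordinary escapes inside strings are not affected.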
>
> Yours,
>
> Emmanuel
>
---
ciaoTJ