
Re: Strange lexical syntax



Simon Marlow wrote:

> Quick quiz:  how many Haskell lexemes are represented by the following
> sequences of characters?
> 
> 	1)	M.x
> 	2)   	M.let
> 	3)    M.as
> 	4) 	M..
> 	5) 	M...
> 	6)	M.!
> 
> answers:
> 	
> 	1)	1.  This is a qualified identifier.

We all know what M.x means, but recently I wondered how the report
guarantees this. I'm afraid it doesn't.

Of course, there is section "5.5.1 Qualified names" saying:

A qualified name is written as modid.name. Since qualifier names are
part of the lexical syntax, no spaces are allowed between the
qualifier and the name. Sample parses are shown below.

[I guess "qualifier names" should be "qualified names".]
 
But this seems to be an explanation, not additional information.
The second sentence seems to say that M.x is a lexeme, since lexemes
are the fundamental items of lexical analysis.
(Section "2.2 Lexical Program Structure": At each point, the longest
 possible lexeme satisfying the lexeme production is read, using a
 context-independent deterministic lexical analysis ...)

And if it weren't a lexeme, we would really be in trouble, because:
Any kind of whitespace is also a proper delimiter for lexemes.

Still, it isn't. It surely is a qvarid, but lexeme is defined like
this: 

lexeme  -> varid | conid | varsym | consym
         | literal | special | reservedop | reservedid 

A varid is unqualified, so M.x is not a varid, and it is none of the
other alternatives either.

So maybe this should be:
lexeme  -> qvarid | qconid | qvarsym | qconsym
         | literal | special | reservedop | reservedid 

And then I guess we should have   qtyc{on,ls} -> qconid .
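
To make the gap concrete, here is a minimal Haskell sketch (not taken
from the report) that tries to match a whole string against the
name-like alternatives of the lexeme production, first as written and
then with the proposed qualified forms. The names Class, unqualified
and proposed are made up for this illustration, and the reservedid and
reservedop exclusions as well as literals and specials are left out.

```haskell
import Data.Char (isAlphaNum, isLower, isUpper)

-- Name-like alternatives of the lexeme production (illustrative only).
data Class = VarId | ConId | VarSym | ConSym
  deriving (Eq, Show)

isIdChar, isSymChar :: Char -> Bool
isIdChar c  = isAlphaNum c || c == '_' || c == '\''
isSymChar c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- Match a whole string against varid | conid | varsym | consym.
-- Assumption: reserved words and operators are NOT excluded here.
unqualified :: String -> Maybe Class
unqualified s@(c:_)
  | all isIdChar s, isLower c || c == '_' = Just VarId
  | all isIdChar s, isUpper c             = Just ConId
  | all isSymChar s                       = Just (if c == ':' then ConSym else VarSym)
unqualified _ = Nothing

-- The proposed production: additionally accept  conid '.' name.
proposed :: String -> Maybe Class
proposed s = case unqualified s of
  Just cls -> Just cls
  Nothing  -> case break (== '.') s of
    (m, '.':name) | unqualified m == Just ConId -> unqualified name
    _                                           -> Nothing
```

With these definitions, unqualified "M.x" yields Nothing, because the
dot is not an identifier character and the M is not a symbol, while
proposed "M.x" accepts it as a (qualified) variable name.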

Am I terribly missing something?

> 	2)	3. 'let' is a keyword, which excludes this string
> 		   from being a qualified identifier.

That's really ugly. I had never thought about such things.
Good that you finally uncovered it.

> 	3)	1. 'as' is a "specialid", not a "reservedid".
> 
> 	4)	1. This is a qualified symbol.
> 
> 	5)	2. '..' is a reserved operator, so a qualified symbol
> 		   is out.  The sequence '...' is a valid operator and
> 		   according to the maximal munch rule this must be
> 		   the second lexeme.
> 
> 	6)    1. '!' is a "specialop", not a "reservedop".
> 
> 
> I especially like case 5 :-)

Yes, it's amazing! Why didn't you go on? M.... is a qualified symbol?
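
The quiz answers can be reproduced with a small Haskell sketch. It
follows the reading used in the answers above: after "M." take the
longest run of identifier or symbol characters, and if that run is
reserved, fall back to the bare conid. This is an assumption about how
to read the report, not something the report states, and the names
lexemes and qualified are made up here. Comments and literals are not
handled, so it says nothing about M.-- .

```haskell
import Data.Char (isAlphaNum, isLower, isUpper)

-- Reserved words and operators of Haskell 98.
reservedIds :: [String]
reservedIds =
  [ "case", "class", "data", "default", "deriving", "do", "else", "if"
  , "import", "in", "infix", "infixl", "infixr", "instance", "let"
  , "module", "newtype", "of", "then", "type", "where", "_" ]

reservedOps :: [String]
reservedOps = ["..", ":", "::", "=", "\\", "|", "<-", "->", "@", "~", "=>"]

isIdChar, isSymChar :: Char -> Bool
isIdChar c  = isAlphaNum c || c == '_' || c == '\''
isSymChar c = c `elem` "!#$%&*+./<=>?@\\^|-~:"

-- Try to read ".name" after a module name. Fails if the longest run of
-- identifier or symbol characters after the dot is reserved.
qualified :: String -> Maybe (String, String)
qualified ('.':r@(c:_))
  | isLower c || c == '_', (w, rest) <- span isIdChar r
  , w `notElem` reservedIds = Just (w, rest)
  | isUpper c, (w, rest) <- span isIdChar r = Just (w, rest)
  | isSymChar c, (w, rest) <- span isSymChar r
  , w `notElem` reservedOps = Just (w, rest)
qualified _ = Nothing

-- Split the input into lexemes; whitespace is skipped.
lexemes :: String -> [String]
lexemes [] = []
lexemes s@(c:cs)
  | c `elem` " \t\n" = lexemes cs
  | isUpper c =
      let (con, rest) = span isIdChar s
      in case qualified rest of
           Just (name, rest') -> (con ++ "." ++ name) : lexemes rest'
           Nothing            -> con : lexemes rest
  | isLower c || c == '_' = let (w, rest) = span isIdChar s in w : lexemes rest
  | isSymChar c           = let (w, rest) = span isSymChar s in w : lexemes rest
  | otherwise             = [c] : lexemes cs
```

Under this reading, lexemes "M.let" gives ["M", ".", "let"], lexemes
"M..." gives ["M", "..."], and lexemes "M...." gives the single lexeme
["M...."], because "..." is not a reserved operator.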

> This is pretty bogus.  I suggest as a fix for Haskell 2 that all of the
> above be treated as 1 lexeme, i.e. qualified identifiers/symbols.

But what would M.let mean? Module M can't define let, neither this way
  M.let = ...  -- qualified name not allowed
nor that:
  let = ...    -- let is reserved 

However, 'let' does mean something in module M, so a strange option is
to let 'M.let' mean 'let'.

Should we just disallow it?


There is still another problem in the report.
Section "2.3 Comments" says:
A nested comment begins with the lexeme "{-" ...

There is no such lexeme.
We'd need  lexeme -> ... | opencom


What does M.-- mean?


All the best,
Christian Sievers
-- 
Freeing Software is a good beginning. Now how about people?


