Recoverable lexer errors and backtracking #205

@mikmart

Description

I'm working on making compilation continue through most lexer errors (#204). I've encountered a puzzling roadblock and was hoping for some feedback before committing to an approach.

The basic idea is to make lexer::get_token() return a Result rather than an Option, telling the compiler whether it hit a Fatal error that requires terminating compilation immediately, or a recoverable Error that it can continue past.
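
Roughly, the signature would look something like this (a minimal sketch; the type names and the exact shape of the error enum are placeholders, not the compiler's real definitions):

```rust
// Sketch only: Lexer/Token are stand-in types, not the project's actual ones.
pub struct Token;
pub struct Lexer { /* parse_point, input, ... */ }

pub enum LexError {
    Fatal,       // unrecoverable: terminate compilation immediately
    Recoverable, // diagnostic already reported: the caller may keep going
}

pub fn get_token(l: &mut Lexer) -> Result<Option<Token>, LexError> {
    // Ok(Some(token)) -> next token
    // Ok(None)        -> end of input
    // Err(severity)   -> lexer error; severity tells the caller what to do
    let _ = l;
    Ok(None)
}
```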

However, there's a bad interaction between recoverable lexer errors and our prolific use of the (*l).parse_point = saved_point; pattern. Because we keep backtracking, we re-tokenize the same text a lot, and if a token contains a recoverable error, that error gets reported over and over again.
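
For context, the backtracking pattern looks roughly like this (names are approximations of what the compiler does, not copied from it):

```rust
// Illustration only: a hypothetical save/restore parse attempt.
struct Lexer { parse_point: usize }

fn try_parse_first(_l: *mut Lexer) -> bool { false }
fn try_parse_second(_l: *mut Lexer) -> bool { true }

unsafe fn parse_alternatives(l: *mut Lexer) {
    let saved_point = (*l).parse_point;
    if !try_parse_first(l) {
        // Backtrack: the second attempt re-lexes the same text, so any
        // recoverable error inside it would be reported a second time.
        (*l).parse_point = saved_point;
        try_parse_second(l);
    }
}
```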

I'm not sure how to best resolve this. Some ideas, which are either kinda hacky or require a lot of refactoring:

  1. Implement peek_token() and use that instead of the backtracking pattern. I'm drafting this but it's a major change.
  2. Lex the full input up front into a dynamic array of tokens. Probably an even bigger change.
  3. Keep track of the furthest location we've returned a token from in the lexer, and only report diagnostics when we're lexing a new token, i.e. text past that furthest point (sketched below, after this list). Wouldn't require changes in the compiler, but feels a bit hacky and brittle?
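
A rough sketch of idea 3, assuming a hypothetical furthest_reported offset on the lexer (names are illustrative only):

```rust
struct Lexer {
    parse_point: usize,
    // End offset of the furthest token we've already diagnosed. After a
    // backtrack, re-lexing text at or before this point stays silent.
    furthest_reported: usize,
}

fn report_if_new(l: &mut Lexer, token_end: usize, message: &str) {
    if token_end > l.furthest_reported {
        l.furthest_reported = token_end;
        eprintln!("lexer error: {message}");
    }
}
```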

Any thoughts would be much appreciated.
