Skip to content

Conversation

@leana8959
Copy link

@leana8959 leana8959 commented Oct 10, 2025

This is the first part of the exact print parser. In this PR I changed the lexer so instead of dropping the comments it emits them to the parser which is further stored in GenericPackageDescription.

Please let me know your thoughts!


Checklist below:

This PR modifies behaviour or interface

Include the following checklist in your PR:

@leana8959
Copy link
Author

Thank you Bodigrim and Jappie for your kind words! That means a lot to me, I'm glad to be on the right track.

I have started (and completed) to rewrite my PR using Andrea's approach.
Right now the behaviour is identical to my old approach, while comments are tracked in the annotation ann.

Are they valid Cabal files if there is nothing but comments? I don't think so.

Good to know. Currently the top level parser drops the comments consumed if there are no fields to attach them to.

Let me know what you think about the change :)

@leana8959
Copy link
Author

Here are the benchmark results. The baseline has been rerun because I did these ones on a VPS machine, and they are not comparable to the last ones I ran on my machine.

# baseline
leana@Ubuntu-2404-noble-amd64-base:~/cabal$ hyperfine './validate.sh --partial-hackage-tests'
Benchmark 1: ./validate.sh --partial-hackage-tests
  Time (mean ± σ):     253.353 s ±  9.520 s    [User: 196.642 s, System: 60.881 s]
  Range (min … max):   241.649 s … 271.207 s    10 runs
  
# this PR
leana@Ubuntu-2404-noble-amd64-base:~/cabal$ hyperfine --setup "./validate.sh --partial-hackage-tests" "./validate.sh --partial-hackage-tests"
Benchmark 1: ./validate.sh --partial-hackage-tests
  Time (mean ± σ):     253.163 s ±  7.451 s    [User: 196.432 s, System: 58.507 s]
  Range (min … max):   239.373 s … 266.656 s    10 runs

@andreabedini
Copy link
Collaborator

@leana8959

That looks very interesting, but how would I deal with files that are just comments?

In my prototype I have replaced [Field ann] with something like

data File ann = File [Field ann] ann

Where the extra annotation is for anything coming after the last field.

Could you elaborate which packages are these? I would love to have more insight on how people solve similar problems.

In addition to @Bodigrim's cabal-add, @phadej's cabal-fields rewrites parser entirely (but dropping support for braces) but at the AST level does the same thing. I am sure he also left comments in some of the "exact printing" mega-threads.

After chatting with her, I think she wants to go with Andrea's design for the comment field parser. As you can see, she's deeply in the weeds about many of the details of the parser; she even corrected some of the field grammar comments!

I am available to discuss and support her effort. @leana8959 I'll reach out privately.

Cabal is one of the toughest code bases I ever worked on, so I'm quite amazed by Leana making progress so quickly!

I warmly second this!

@Mikolaj
Copy link
Member

Mikolaj commented Nov 6, 2025

@leana8959, @andreabedini: how is the private communication going? We are interested too! Could we help somehow?

@jappeace
Copy link
Collaborator

jappeace commented Nov 6, 2025

for clarity: These parser changes are ready for review as far as leana and me are concerned, meanwhile we've moved over to import stanza retention in GenericPackageDescription (they currently get merged into the stanza's). This is an independent change of the parser changes here.

After that we can start on exact print propper.

@phadej
Copy link
Collaborator

phadej commented Nov 7, 2025

import stanza retention in GenericPackageDescription (

Please don't. A lot of code wants an elaborated (= stripped down of syntactic convenience) representation of package description, and GPD serves that now.

Your parsing changes leak down the pipeline where they shouldn't. E.g. things like solver works with GPD, and it really shouldn't care about whether import stanzas were used to declare a package or not.

@jappeace
Copy link
Collaborator

jappeace commented Nov 7, 2025

This PR isn't about that,
and we're intending to keep everything working. 🙂

@phadej
Copy link
Collaborator

phadej commented Nov 10, 2025

This PR does add

  , exactComments :: ExactComments Position

field to GPD. I don't see a point of having comments in GPD. (As noted in tests, it "breaks" equality)

@leana8959
Copy link
Author

@phadej Indeed, I have added a newtype around GenericPackageDescription that doesn't have an Eq instance. This ensures GenericPackageDescription stays the same :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants