CPU Bound Issue in Parser with Complex Grammar (Possible Error with handling of Zero-Width/empty Strings)  #863

@bytemouse

The bug
With a complex grammar, generation becomes CPU-bound and never terminates. My guess is that the parser does not handle empty (zero-width) strings properly. Line profiling shows that all of the time is spent in these lines:

start_state_set = self.state_sets[item.start]
for start_item in start_state_set:
    if (
        start_item.pos < len(start_item.values)
        and start_item.values[start_item.pos] == item.node
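
For anyone not familiar with this part of the parser: the loop above has the shape of the completion step of an Earley parser, where a finished item looks back at the state set it started in and advances every item that was waiting for it. Below is a minimal toy sketch of that step, not the guidance implementation. `Item`, `nullable_symbols`, `recognize` and the example grammar are all hypothetical names made up for illustration. It assumes one common way of treating empty rules (advancing over a nullable symbol at prediction time), and it counts how often the completion check runs so the cost of the scan can be observed.

```python
from collections import namedtuple

# Hypothetical toy item: rule head, rule body (a tuple), dot position, start index.
Item = namedtuple("Item", "head body pos start")


def nullable_symbols(grammar):
    """Return the nonterminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in nullable:
                continue
            # An empty body is trivially nullable (all() over nothing is True).
            if any(all(sym in nullable for sym in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable


def recognize(grammar, start, tokens):
    """Toy Earley recognizer. Returns (accepted, number of completion checks)."""
    nullable = nullable_symbols(grammar)
    state_sets = [[] for _ in range(len(tokens) + 1)]
    seen = [set() for _ in range(len(tokens) + 1)]
    completer_checks = 0

    def add(i, item):
        # De-duplicate so that empty rules cannot re-add the same item forever.
        if item not in seen[i]:
            seen[i].add(item)
            state_sets[i].append(item)

    for body in grammar[start]:
        add(0, Item(start, body, 0, 0))

    for i in range(len(tokens) + 1):
        k = 0
        while k < len(state_sets[i]):  # the set may grow while we walk it
            item = state_sets[i][k]
            k += 1
            if item.pos < len(item.body):
                sym = item.body[item.pos]
                if sym in grammar:  # predict a nonterminal
                    for body in grammar[sym]:
                        add(i, Item(sym, body, 0, i))
                    if sym in nullable:  # step over a zero-width symbol
                        add(i, item._replace(pos=item.pos + 1))
                elif i < len(tokens) and tokens[i] == sym:  # scan a terminal
                    add(i + 1, item._replace(pos=item.pos + 1))
            else:  # complete: same shape as the profiled loop
                for start_item in state_sets[item.start]:
                    completer_checks += 1
                    if (
                        start_item.pos < len(start_item.body)
                        and start_item.body[start_item.pos] == item.head
                    ):
                        add(i, start_item._replace(pos=start_item.pos + 1))

    accepted = any(
        it.head == start and it.pos == len(it.body) and it.start == 0
        for it in state_sets[len(tokens)]
    )
    return accepted, completer_checks


# Hypothetical grammar with an empty alternative: S -> A A "x", A -> "" | "a"
GRAMMAR = {"S": [("A", "A", "x")], "A": [(), ("a",)]}

for text in ["x", "ax", "aax", "aaax"]:
    accepted, checks = recognize(GRAMMAR, "S", list(text))
    print(text, accepted, checks)
```

Every completed item scans the whole state set it started in, so the number of checks grows with both the number of completions and the size of those sets. If zero-width items were re-added or re-completed without de-duplication, this scan would be one plausible place for the time to go, which would be consistent with the profile above but is not confirmed here.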

To Reproduce
I use the following model and code:
Model: https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b
Code: https://gist.github.com/bytemouse/6b8eaa647840c3793d5a4f23516b2a5f

System info
OS: Fedora 40
Guidance Version: 0.1.15
