Alternations optimization #141

nitely · 2024-04-11T19:38:55Z

In theory this speeds up regexes containing hundreds of alternations (see #138). How? The first state contains one state per alternation; the optimization reduces the number of alternations (at the very least the outer one), so the first state is capped at one state per possible leading character, i.e. [a-zA-Z0-9] (+ symbols) in the case of ASCII, and 1K initial states get reduced to ~50. For find all this matters a lot, because it tries to match the initial states against every input character.
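To illustrate the idea (this is not the PR's code, just a minimal Python sketch under my own assumptions): if the alternatives are plain, non-empty literals, they can be merged into a trie keyed by character, so the number of branches at the start is bounded by the number of distinct first characters rather than by the number of alternatives. The names `build_trie` and `trie_to_regex` are hypothetical helpers, and Python's `re` is used only to check that the rewritten pattern matches the same words.

```python
import re

def build_trie(literals):
    """Merge literal alternatives into a nested dict keyed by character."""
    trie = {}
    for lit in literals:
        node = trie
        for ch in lit:
            node = node.setdefault(ch, {})
        node[""] = {}  # empty key marks "a literal ends here"
    return trie

def trie_to_regex(node):
    """Emit a pattern with one branch per distinct next character."""
    branches = []
    for ch, child in node.items():
        if ch == "":
            branches.append("")  # end of a literal: nothing more to match
        else:
            branches.append(re.escape(ch) + trie_to_regex(child))
    if len(branches) == 1:
        return branches[0]
    return "(?:" + "|".join(branches) + ")"

words = ["cat", "car", "cow", "dog", "dig", "ant"]
trie = build_trie(words)
pattern = trie_to_regex(trie)

# Six alternatives, but only three distinct first characters (c, d, a).
print(len(words), "alternatives ->", len(trie), "initial branches")
print(pattern)
```

Grouping by first character keeps the relative order of alternatives that share a prefix, which is what matters for leftmost-first matching; alternatives with different first characters can never both match at the same position.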

The description is in this gist https://gist.github.com/nitely/745c8cabdf06ba2d37f8cf5cda3aea5f

This is just a PoC, though. I'm only optimizing the simplest case, since that should be enough to speed up #138.

It's also a (broken) WIP that cannot even be tested yet.

nitely · 2024-05-23T11:14:48Z

We can go further than just literals. Consider (^abc|^acb) -> ^a(bc|cb). The same can be applied to suffixes: consider (car|bar) -> (b|c)ar. This would be useful for the literals optimization to kick in.
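A rough sketch of the prefix/suffix factoring for the literal part only, in Python and under my own assumptions (the helper names `factor_affixes` and `to_regex` are made up for illustration, and anchors such as `^` are left out; they would have to be hoisted separately):

```python
import os
import re

def factor_affixes(literals):
    """Split literals into (common prefix, differing middles, common suffix)."""
    prefix = os.path.commonprefix(literals)
    rest = [lit[len(prefix):] for lit in literals]
    # The common suffix is the reversed common prefix of the reversed strings.
    # It is computed on `rest` so prefix and suffix can never overlap.
    suffix = os.path.commonprefix([r[::-1] for r in rest])[::-1]
    middles = [r[:len(r) - len(suffix)] for r in rest]
    return prefix, middles, suffix

def to_regex(literals):
    """Rebuild the alternation with the shared prefix/suffix pulled out."""
    prefix, middles, suffix = factor_affixes(literals)
    alternation = "(?:" + "|".join(re.escape(m) for m in middles) + ")"
    return re.escape(prefix) + alternation + re.escape(suffix)

print(factor_affixes(["abc", "acb"]))  # shared prefix "a"
print(factor_affixes(["car", "bar"]))  # shared suffix "ar"
print(to_regex(["car", "bar"]))
```

The alternatives keep their original order inside the group, so the rewrite should not change which alternative wins; the pulled-out prefix or suffix is then a plain literal that a literals optimization could pick up.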

alternations optimization

01c2744
