-
Notifications
You must be signed in to change notification settings - Fork 56
Automated prefix state sharing optimization #110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
augustepoiroux
wants to merge
17
commits into
leanprover-community:master
Choose a base branch
from
augustepoiroux:auto-prefix-optimization
base: master
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Automated prefix state sharing optimization #110
augustepoiroux
wants to merge
17
commits into
leanprover-community:master
from
augustepoiroux:auto-prefix-optimization
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
b382af6
to
481094b
Compare
04a2642
to
fd0236e
Compare
fd0236e
to
6b421e8
Compare
053bd46
to
cdfaccc
Compare
d4d965c
to
2fa4a97
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Goal: This PR aims at reducing redundant computation automatically.
In essence, this PR revives the repl++ idea by @kim-em, and adds the trie idea to share state computation across all commands in constant time instead of only with the previous command. It mirrors VS Code experience (orange bar on the left), but it takes into account the whole editing history, and it can share computation across files and commands (given that they share a common prefix).
Feature: For a given command, this PR finds the longest string prefix shared among all previously submitted commands and reuses the corresponding incremental state to share the computation.
It additionally automatically caches and shares header processing (which is not automatically handled by the incremental states above). For instance, if multiple commands use "import Mathlib" as a header, it will be only loaded once. This also fixes #77 where empty headers are reprocessed for each command, introducing large unexpected overhead.
This introduces a new way of using the REPL: instead of manually building a chain of REPL environments, one can now send full file content while adding/editing content, and the REPL will automatically find the best environment state shared with previous commands.
Overhead: The longest prefix matching is done using a trie. It has a theoretical complexity linear in the number of characters of the input, and independent of the number of previous commands.
In my testing, I didn't observe any noticeable overhead introduced by this new code and #109, in terms of both computation time and memory usage. But this should be tested more thoroughly in real-world use-cases.
Example: An example might be more explanatory. Here is the content
file1.lean
:and here is the content of
file2.lean
:When sending
{"path": "file1.lean"}
and then{"path": "file2.lean"}
to the REPL in this PR, the following now happens:file1.lean
is processed as usual. In this example, it takes a few seconds to run.file2.lean
, everything up to#eval fib 35
is detected as the longest prefix match and computation is skipped by reusing header state + incremental state, and onlytheorem foo2
has to be fully processed. So running this afterfile1.lean
is now almost instant. This particular example can be found in testtest/Mathlib/test/incrementality.in
.Limits:
import Lean\nimport Mathlib
in a command and thenimport Mathlib
in a second command, computation will not be shared and Mathlib will be loaded twice.Note: This PR depends on #109